Forum Discussion

Max_Mustermann's avatar
Max_Mustermann
Copper Contributor
Aug 20, 2024

Power Query losing Columns

Hello there,

 

 

I have a problem importing some folders into power query. The interesting part is the different files in said folder seem to have a varying amount of columns - and power query only loads the columns that are in the example file.

 

 

I tried this twice, one set of data is a folder of PDFs. I choose the right page to extract and find out power query has only loaded the columns as in the example file - but i can load the others with some steps (picture):

 

 

The other set is a folder with CSVs. I try replicating the exact steps here, but Power Query does not load the additional column - if I refuse to choose an example file that has all columns already. If I want to automate said steps I can't choose a fitting file every time, my only option is to accept Excel losing a column of data. Is there any way of preventing this? Thanks in advance.

  • Max_Mustermann 

    In case of csv check Transform Sample File query, first step looks like

    = Csv.Document(Parameter1,[Delimiter=";", Columns=10, Encoding=65001, QuoteStyle=QuoteStyle.None])

    Change Columns=10 on Columns=20 or like, i.e. max possible number of columns.

    When in combined file remove all columns with no data. As variant that could be function which removes all empty rows and columns as

    (Source as table) as table =>
    
    let
        RemoveEmptyColumns = Table.SelectColumns(
            Source,
            List.Select(
                Table.ColumnNames(Source),
                each List.NonNullCount(Table.Column(Source,_)) <> 0
            )
        ),
        RemoveEmptyRows = Table.SelectRows(
            RemoveEmptyColumns,
            each not List.IsEmpty(List.RemoveMatchingItems( Record.FieldValues(_), {"", null} ) )
        )
    in
        RemoveEmptyRows

    Not sure about PDF, check the connector options if something similar.

  • Max_Mustermann 

    In case of csv check Transform Sample File query, first step looks like

    = Csv.Document(Parameter1,[Delimiter=";", Columns=10, Encoding=65001, QuoteStyle=QuoteStyle.None])

    Change Columns=10 on Columns=20 or like, i.e. max possible number of columns.

    When in combined file remove all columns with no data. As variant that could be function which removes all empty rows and columns as

    (Source as table) as table =>
    
    let
        RemoveEmptyColumns = Table.SelectColumns(
            Source,
            List.Select(
                Table.ColumnNames(Source),
                each List.NonNullCount(Table.Column(Source,_)) <> 0
            )
        ),
        RemoveEmptyRows = Table.SelectRows(
            RemoveEmptyColumns,
            each not List.IsEmpty(List.RemoveMatchingItems( Record.FieldValues(_), {"", null} ) )
        )
    in
        RemoveEmptyRows

    Not sure about PDF, check the connector options if something similar.

    • Max_Mustermann's avatar
      Max_Mustermann
      Copper Contributor

      Thank youSergeiBaklan for the reply, i will need some time until i am able to test around with it.

       

      I just wanted to clarify that the PDF-case is working quite fine. I only encounter said problem on the CSV-case. Maybe I will encounter more problems in future, so i can specify when and how often it occurs.

      • SergeiBaklan's avatar
        SergeiBaklan
        MVP

        Max_Mustermann 

        That could be one more option with csv. You may use Columns=null

        = Csv.Document(Parameter1,[Delimiter=",", Columns=null, Encoding=65001, QuoteStyle=QuoteStyle.None])

        If null it will will be taken actual number of columns for each file. However, you shall to adjust combining of the table in the resulting table

        let
        
            // These are automatically generated steps 
            Source = Folder.Files("C:\Test temp"),
            #"Removed Other Columns" = Table.SelectColumns(Source,{"Content"}),
            #"Filtered Hidden Files1" = Table.SelectRows(#"Removed Other Columns", each [Attributes]?[Hidden]? <> true),
            #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
            #"Removed Other Columns1" = Table.SelectColumns(#"Invoke Custom Function1", {"Transform File"}),
        
            // add this step to generate list of column names in all files
            names = List.Union(
                List.Transform(
                    #"Removed Other Columns1"[Transform File], (q) => Table.ColumnNames(q)
                )
            ),
        
            // on this step instead of column names for the sample file
            // use generated list of names as below
            #"Expanded Table Column1" = Table.ExpandTableColumn(
                #"Removed Other Columns1",
                "Transform File",
                //Table.ColumnNames(#"Transform File"(#"Sample File"))
                names
                )
        in
            #"Expanded Table Column1"

Resources