Forum Discussion
Power Query losing Columns
Hello there,
I have a problem importing some folders into Power Query. The interesting part is that the different files in said folder seem to have a varying number of columns, and Power Query only loads the columns that are in the example file.
I tried this twice. One set of data is a folder of PDFs. I choose the right page to extract and find that Power Query has only loaded the columns from the example file, but I can load the others with some extra steps (picture):
The other set is a folder of CSVs. I try replicating the exact same steps here, but Power Query does not load the additional column unless I pick an example file that already has all the columns. If I want to automate these steps I can't choose a fitting file every time, so my only option is to accept Excel losing a column of data. Is there any way of preventing this? Thanks in advance.
In the case of CSV, check the Transform Sample File query; its first step looks like
= Csv.Document(Parameter1,[Delimiter=";", Columns=10, Encoding=65001, QuoteStyle=QuoteStyle.None])
Change Columns=10 to Columns=20 or similar, i.e. the maximum possible number of columns.
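For example, a minimal sketch of the adjusted step, assuming 20 columns is a safe upper bound for your files (keep your own delimiter and encoding):
= Csv.Document(Parameter1,[Delimiter=";", Columns=20, Encoding=65001, QuoteStyle=QuoteStyle.None])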
Then, in the combined file, remove all columns with no data. As a variant, that could be a function which removes all empty rows and columns, such as:
(Source as table) as table =>
let
    RemoveEmptyColumns = Table.SelectColumns(
        Source,
        List.Select(
            Table.ColumnNames(Source),
            each List.NonNullCount(Table.Column(Source, _)) <> 0
        )
    ),
    RemoveEmptyRows = Table.SelectRows(
        RemoveEmptyColumns,
        each not List.IsEmpty(
            List.RemoveMatchingItems(
                Record.FieldValues(_),
                {"", null}
            )
        )
    )
in
    RemoveEmptyRows
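As a usage sketch, assuming you save the function above as a separate query named fnRemoveEmpty (the name is hypothetical) and the last step of your combined query is #"Expanded Table Column1", you could append one more step:
// hypothetical final step; adjust the query and step names to your own
#"Removed Empty" = fnRemoveEmpty(#"Expanded Table Column1")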
Not sure about PDF; check whether the connector options offer something similar.
- Max_Mustermann (Copper Contributor)
Thank you SergeiBaklan for the reply, I will need some time until I am able to test around with it.
I just wanted to clarify that the PDF case is working quite fine; I only encounter said problem in the CSV case. Maybe I will encounter more problems in the future, so I can specify when and how often they occur.
That could be one more option with CSV: you may use Columns=null
= Csv.Document(Parameter1,[Delimiter=",", Columns=null, Encoding=65001, QuoteStyle=QuoteStyle.None])
If it is null, the actual number of columns is taken for each file. However, you will have to adjust how the tables are combined in the resulting query:
let
    // These are automatically generated steps
    Source = Folder.Files("C:\Test temp"),
    #"Removed Other Columns" = Table.SelectColumns(Source, {"Content"}),
    #"Filtered Hidden Files1" = Table.SelectRows(#"Removed Other Columns", each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
    #"Removed Other Columns1" = Table.SelectColumns(#"Invoke Custom Function1", {"Transform File"}),
    // add this step to generate the list of column names across all files
    names = List.Union(
        List.Transform(
            #"Removed Other Columns1"[Transform File],
            (q) => Table.ColumnNames(q)
        )
    ),
    // on this step, instead of the column names from the sample file,
    // use the generated list of names as below
    #"Expanded Table Column1" = Table.ExpandTableColumn(
        #"Removed Other Columns1",
        "Transform File",
        //Table.ColumnNames(#"Transform File"(#"Sample File"))
        names
    )
in
    #"Expanded Table Column1"
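With the names list in place, Table.ExpandTableColumn expands every column that appears in at least one file; files that are missing a given column simply return null in those cells of the combined table, so no data is dropped.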