Forum Discussion
Splitting a file with get and transform
I am a Get & Transform (G&T) novice and am wondering whether it is possible to use this feature to split a file into two separate datasets.
I have a .txt file with a set of headers at the top, followed by rows of data. Then, about halfway through the file, there is a new set of headers followed by rows of different data.
In Excel I manually split these into two new worksheets, but I am wondering if Get & Transform can help automate the process.
Thanks in advance for any feedback.
4 Replies
- SergeiBaklan (Diamond Contributor)
Hi Hugh,
Yes, the idea is to identify the row containing your second header within the source table. You may add an Index column, select the row with the second header, find its position from the Index, and skip all rows before it to get the second part of the table.
Then duplicate the query and do exactly the same, but keep only the rows before the second header.
Both queries then land as tables in separate sheets.
Like in this script:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    AddIndex = Table.AddIndexColumn(Source, "Index", 0, 1),
    // Select the row with the second header
    SecondHeader = Table.SelectRows(AddIndex, each ([Column1] = "x")),
    // and find its position
    RowToSplit = SecondHeader{0}[Index],
    // Based on it, skip the first part
    SecondPart = Table.Skip(Source, RowToSplit),
    // and cosmetics
    PromotedHeaders = Table.PromoteHeaders(SecondPart, [PromoteAllScalars=true]),
    ToNumber = Table.TransformColumnTypes(PromotedHeaders, {{"x", Int64.Type}, {"y", Int64.Type}})
in
    ToNumber

and in the attached file.
Details depend on how your initial data is structured.
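For readers who want to see the split logic outside Excel, the same idea can be sketched in plain Python. This is a minimal illustration only, assuming (as in the M script above) that the second header row is identified by its first cell equalling "x"; the function name and sample rows are made up for the example.

```python
def split_at_second_header(rows, marker="x"):
    """Return (first_part, second_part), split at the row whose
    first cell equals `marker` (the second header row)."""
    for i, row in enumerate(rows):
        if row and row[0] == marker:
            # Analogue of keeping only the first rows (Table.FirstN)
            first_part = rows[:i]
            # Analogue of Table.Skip: second_part starts with the second header
            second_part = rows[i:]
            return first_part, second_part
    # No second header found: everything belongs to the first part
    return rows, []

rows = [
    ["a", "b"],   # first header
    ["1", "2"],
    ["3", "4"],
    ["x", "y"],   # second header
    ["5", "6"],
]
first, second = split_at_second_header(rows)
# first  -> [["a", "b"], ["1", "2"], ["3", "4"]]
# second -> [["x", "y"], ["5", "6"]]
```

In Power Query the same split is expressed as two queries over one source, but the underlying operation is identical: find the index of the marker row, then take rows before it or from it onward.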
- Hugh Barry (Copper Contributor)
Sergei,
On a related point, as I mentioned I am a novice with G&T and am still getting used to the idea of "connections" to data. Once I perfect the query that transforms the data and splits the file, I would like to be able to apply it to new output.
Right now I am creating a new query that connects to and transforms e.g. "January.txt", but next month, when I have "February.txt", it seems that I will have to connect to that file and re-create the query to apply to it.
I am sure I am missing something, as otherwise the approach I might take is to keep a connection to a "Generic.txt" that never changes, delete all the data within it, replace it with the new data, and then refresh.
Hope I am clear and thanks for the feedback once again.
Hugh
- SergeiBaklan (Diamond Contributor)
Hi Hugh,
If I understood correctly, you'd like to apply the same transformation to data imported from January.txt, February.txt, etc.
If so, based on your initial query you may create a function whose input parameter is the name of the text file (for how to create one, see e.g. https://blog.learningtree.com/creating-functions-m-power-query-formula-language/, the first result I googled). After that, your query will look like:
let
    Source = fnMyTransformation("January.txt")
in
    Source

To avoid hardcoding, the name of the file could be defined as a parameter or picked up from an Excel cell.
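To make the parameterisation idea concrete outside of M: the pattern is simply to wrap the whole transformation in a function of the file name, then call that function with each month's file. Below is a rough Python analogue; the function name mirrors fnMyTransformation, and the tab-separated layout and column names are assumptions for illustration, not taken from the original files.

```python
import csv
import io

def fn_my_transformation(text):
    """Parse tab-separated text and promote the first row to headers,
    returning a list of dicts, one per data row (analogue of
    Table.PromoteHeaders on an imported text file)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    rows = list(reader)
    headers = rows[0]
    return [dict(zip(headers, row)) for row in rows[1:]]

# Simulated "January.txt" content; with a real file you would pass
# open("January.txt").read() instead, and only this argument changes
# from month to month.
january = "x\ty\n1\t2\n3\t4"
result = fn_my_transformation(january)
# result -> [{"x": "1", "y": "2"}, {"x": "3", "y": "4"}]
```

The design point is the same as in Power Query: because the file name is the only input, the transformation itself never needs to be re-created when a new month's file arrives.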
- Hugh Barry (Copper Contributor)
Thanks Sergei, I really appreciate the feedback.
Hugh