Forum Discussion
Headers split from notepad++ to Excel
Dear Experts,
Greetings!
I have a data from attached notepad ++ and want headers like in below format, using text delimiter as " | " didn't helped.
possible to use some legacy formulas?
This is the kind of key info./requirement you should always provide upfront & when talking about legacy formulas you should also say which version of Excel is in use. Not knowing I assumed 365/Web
Below is a 365 LAMBDA option where no string is hard-coded
#1 When opening the text file from within Excel make sure you select options "Delimited" then "Comma" in the Import Wizard, this to ensure you get the data as a single column
#2 With a file like the one you shared data is then imported in sheet 'headers', range A1:A6. The full reference is therefore: headers!A1:A6
#3 In the attached workbook I stored the LAMBDA in the Name Manager and named it RESTRUCT_HEADERS
You have 2 options to run the LAMBDA. Easiest one is in B8 above:
=RESTRUCT_HEADERS( headers!A1:A6 )or, as in B2 above (formula EDITED):
=LAMBDA(input_range, LET( data, DROP( input_range, 2 ), delim, "|", CLEANSTR, LAMBDA(string, TRIM( CLEAN( string ) ) ), MAKEBOTTOMROW, LAMBDA(string, TAKE( DROP( TEXTSPLIT( string, delim ),, 25 ),, 22 ) ), seqOutCols, SEQUENCE(, 22 ), dataRow1, CLEANSTR( TAKE( DROP( TEXTSPLIT( INDEX( data, 1, 1 ), delim ),, 22 ),, 2 ) ), outRow1, XLOOKUP( seqOutCols, {6,17}, dataRow1, "" ), dataRow2, CLEANSTR( TAKE( DROP( TEXTSPLIT( INDEX( data, 2, 1 ), delim ),, 24 ),, 2 ) ), dataRow2x, HSTACK( dataRow2, dataRow2 ), outRow2, XLOOKUP( seqOutCols, {4,9,15,20}, dataRow2x, "" ), outRow3, MAKEBOTTOMROW( INDEX( data, 3, 1 ) ), outRow4, MAKEBOTTOMROW( INDEX( data, 4, 1 ) ), VSTACK( outRow1, outRow2, outRow3, outRow4 ) ) )( headers!A1:A6 )PS Revised the Power Query approach to simplify it and remove dependency on previously hard coded strings. Feel free to re-download
4 Replies
- LorenzoSilver Contributor
Variant + Conditional Formatting
// RESTRUCT_HEADERS =LAMBDA(input_range, LET( data, DROP( input_range, 2 ), delim, "|", outColumns, 22, HdrStartAt, LET( inRow2, INDEX( input_range, 2, 1 ), splitted, TRIM( TEXTSPLIT( inRow2, "|") ), firstStr, TAKE( TOROW( IF( LEN(splitted), splitted, NA() ), 2), 1 ), FIND( firstStr, inRow2 ) ), shiftedData, MID( data, HdrStartAt, LEN(data) -HdrStartAt ), MAKE_ROW, LAMBDA(string, TRIM( TAKE( DROP( TEXTSPLIT( string, delim ),, 2 ),, outColumns ) ) ), COMPACT_ROW, LAMBDA(array, TOROW( IF( LEN(array), array, NA() ), 2 ) ), GET_VALUES, LAMBDA(array, row_num, take_cols, TAKE( COMPACT_ROW( DROP( TRIM( TEXTSPLIT( INDEX( array, row_num, 1 ), delim ) ),, 2 ) ),, take_cols ) ), seqOutCols, SEQUENCE(, outColumns ), outRow4, MAKE_ROW( INDEX( shiftedData, 4, 1 ) ), outRow3, MAKE_ROW( INDEX( shiftedData, 3, 1 ) ), stringRow3, TAKE( COMPACT_ROW( outRow3 ), 1, 1 ), posStrRow3, TOROW( (XMATCH(outRow3, stringRow3) * seqOutCols) -1, 2 ), shiftPosRow3, posStrRow3 +2, outRow2, XLOOKUP( seqOutCols, posStrRow3, GET_VALUES( shiftedData, 2, 4 ), "" ), outRow1, XLOOKUP( seqOutCols, TOROW( IF( MOD(shiftPosRow3, outColumns / 2), shiftPosRow3, NA() ), 2 ), GET_VALUES( shiftedData, 1, 2 ), "" ), VSTACK( outRow1, outRow2, outRow3, outRow4 ) ) ) - LorenzoSilver Contributor
EDIT: Revised the query to simplify it and remove dependency on previously hard coded strings
// CsvSource let Source = Csv.Document( // Don't forget to update the below file path with yours File.Contents("C:\Lz\Downloads\headers.txt"), [Delimiter=",", Columns=1, Encoding=1252] ) in Source // RestructuredHeaders let Source = CsvSource, RemovedTopRows = Table.Skip( Source, 2 ), delim = "|", TextSplitToList = Table.AddColumn( RemovedTopRows, "Splitted", each List.Skip( Text.Split( [Column1], delim ), 22 ), type list ), RemovedColumn = Table.RemoveColumns( TextSplitToList, {"Column1"} ), AddedRowNum = Table.AddIndexColumn( RemovedColumn, "Row", 1, 1 ), reptDelim2 = Text.Repeat( delim, 2 ), reptDelim3 = reptDelim2 & delim, reptDelim4 = reptDelim2 & reptDelim2, reptDelim5 = reptDelim2 & reptDelim3, AddDelimitersRow2 = (ListSource as list ) as list => let Source = List.Repeat( List.FirstN( List.Skip( ListSource, 2 ), 2 ), 2 ), ConvertedToTable = Table.FromList( Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error), AddedIndex = Table.AddIndexColumn(ConvertedToTable, "Index", 1, 1 ), AddedModulo = Table.AddColumn(AddedIndex, "Modulo", each Number.Mod( [Index], 2 ), Int64.Type ), RemovedIndex = Table.RemoveColumns(AddedModulo,{"Index"}), AddedDelimiters = Table.AddColumn(RemovedIndex, "DelimitedValue", each if [Modulo] = 1 then Text.Combine( {reptDelim3, Text.Trim([Column1]), reptDelim4} ) else Text.Combine( {Text.Trim([Column1]), reptDelim2} ), type text ) in AddedDelimiters[DelimitedValue], TransformedList = Table.AddColumn( AddedRowNum, "CustomList", each if [Row] = 1 then List.Transform( List.FirstN( [Splitted], 2 ), (text) => Text.Combine( {reptDelim5, Text.Trim(text), reptDelim5} ) ) else if [Row] = 2 then AddDelimitersRow2( [Splitted] ) else List.Transform( List.FirstN( List.Skip( [Splitted], 3 ), 22 ), Text.Trim ), type list ), RemovedOtherColumns = Table.SelectColumns( TransformedList, {"CustomList"} ), ListToText = Table.AddColumn( RemovedOtherColumns, "TextToSplit", each Text.Combine( [CustomList], delim ), type text ), RemovedList = Table.SelectColumns(ListToText, {"TextToSplit"} ), SplitByDelimiter = Table.SplitColumn( RemovedList, "TextToSplit", Splitter.SplitTextByDelimiter( delim, QuoteStyle.Csv), List.Transform( {1..22}, (num) => "Col" & Text.From(num) ) ) in SplitByDelimiter- anupambit1797Steel Contributor
Thanks Lorenzo , possible to use some legacy formulas? , what I tried was to check the Num of "|" in each row, and then replace all the rows with the max num of "|", then use the textsplit.. though it didn't worked..
Anything similar if we can think of for dynamic( rather than what you mentioned to hardcode some txt), playing with the "|" and " ", if possible?
Thanks & Regards
Anupam Shrivastava
- LorenzoSilver Contributor
possible to use some legacy formulas?
This is the kind of key info./requirement you should always provide upfront & when talking about legacy formulas you should also say which version of Excel is in use. Not knowing I assumed 365/Web
Below is a 365 LAMBDA option where no string is hard-coded
#1 When opening the text file from within Excel make sure you select options "Delimited" then "Comma" in the Import Wizard, this to ensure you get the data as a single column
#2 With a file like the one you shared data is then imported in sheet 'headers', range A1:A6. The full reference is therefore: headers!A1:A6
#3 In the attached workbook I stored the LAMBDA in the Name Manager and named it RESTRUCT_HEADERS
You have 2 options to run the LAMBDA. Easiest one is in B8 above:
=RESTRUCT_HEADERS( headers!A1:A6 )or, as in B2 above (formula EDITED):
=LAMBDA(input_range, LET( data, DROP( input_range, 2 ), delim, "|", CLEANSTR, LAMBDA(string, TRIM( CLEAN( string ) ) ), MAKEBOTTOMROW, LAMBDA(string, TAKE( DROP( TEXTSPLIT( string, delim ),, 25 ),, 22 ) ), seqOutCols, SEQUENCE(, 22 ), dataRow1, CLEANSTR( TAKE( DROP( TEXTSPLIT( INDEX( data, 1, 1 ), delim ),, 22 ),, 2 ) ), outRow1, XLOOKUP( seqOutCols, {6,17}, dataRow1, "" ), dataRow2, CLEANSTR( TAKE( DROP( TEXTSPLIT( INDEX( data, 2, 1 ), delim ),, 24 ),, 2 ) ), dataRow2x, HSTACK( dataRow2, dataRow2 ), outRow2, XLOOKUP( seqOutCols, {4,9,15,20}, dataRow2x, "" ), outRow3, MAKEBOTTOMROW( INDEX( data, 3, 1 ) ), outRow4, MAKEBOTTOMROW( INDEX( data, 4, 1 ) ), VSTACK( outRow1, outRow2, outRow3, outRow4 ) ) )( headers!A1:A6 )PS Revised the Power Query approach to simplify it and remove dependency on previously hard coded strings. Feel free to re-download