Home

Convert External Web / PDF Table to Full Raw Excel Data Table

%3CLINGO-SUB%20id%3D%22lingo-sub-917670%22%20slang%3D%22en-US%22%3EConvert%20External%20Web%20%2F%20PDF%20Table%20to%20Full%20Raw%20Excel%20Data%20Table%3C%2FLINGO-SUB%3E%3CLINGO-BODY%20id%3D%22lingo-body-917670%22%20slang%3D%22en-US%22%3E%3CP%3EI%20need%20some%20help%20with%20making%20a%20process%20much%20simpler.%20I'd%20like%20to%20take%20a%20table%20from%20an%20external%20source%20and%20turn%20it%20into%20full%2C%20raw%20data.%20The%20issue%20is%20the%20formatting%20from%20the%20source.%20Below%20is%20how%20it%20looks.%3C%2FP%3E%3CP%3E%3CSPAN%20class%3D%22lia-inline-image-display-wrapper%20lia-image-align-inline%22%20style%3D%22width%3A%20199px%3B%22%3E%3CIMG%20src%3D%22https%3A%2F%2Fgxcuf89792.i.lithium.com%2Ft5%2Fimage%2Fserverpage%2Fimage-id%2F138189i19EEC1E947D899B3%2Fimage-size%2Flarge%3Fv%3D1.0%26amp%3Bpx%3D999%22%20alt%3D%2210.17.19%20Excel%20Table.PNG%22%20title%3D%2210.17.19%20Excel%20Table.PNG%22%20%2F%3E%3C%2FSPAN%3E%3C%2FP%3E%3CP%3EI%20need%20raw%20data%20with%20all%20the%20combinations%20per%20item%20in%20%3CU%3E%3CSTRONG%3EHeader1%3C%2FSTRONG%3E%20%3C%2FU%3Ethat%20would%20look%20like%20this.%3C%2FP%3E%3CP%3E%3CSPAN%20class%3D%22lia-inline-image-display-wrapper%20lia-image-align-left%22%20style%3D%22width%3A%20199px%3B%22%3E%3CIMG%20src%3D%22https%3A%2F%2Fgxcuf89792.i.lithium.com%2Ft5%2Fimage%2Fserverpage%2Fimage-id%2F138190iF13EFDA16DF2BA10%2Fimage-dimensions%2F199x398%3Fv%3D1.0%22%20width%3D%22199%22%20height%3D%22398%22%20alt%3D%2210.17.19%20Excel%20Table%202.PNG%22%20title%3D%2210.17.19%20Excel%20Table%202.PNG%22%20%2F%3E%3C%2FSPAN%3E%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3EIs%20there%20an%20efficient%20way%20to%20do%20this%20process%20with%20a%20much%20larger%20table%3F%20One%20that%20could%20be%20run%20as%20a%20macro%20would%20be%20amazing.%20Thanks!%3C%2FP%3E%3C%2FLINGO-BODY%3E%3CLINGO-LABS%20id%3D%22lingo-labs-917670%22%20slang%3D%22en-US%22%3E%3CLINGO-LABEL%3EExcel%3C%2FLINGO-LABEL%3E%3CLINGO-LABEL%3EFormulas%20and%20Functions%3C%2FLINGO-LABEL%3E%3CLINGO-LABEL%3EMacros%20and%20VBA%3C%2FLINGO-LABEL%3E%3C%2FLINGO-LABS%3E%3CLINGO-SUB%20id%3D%22lingo-sub-1038772%22%20slang%3D%22en-US%22%3ERe%3A%20Convert%20External%20Web%20%2F%20PDF%20Table%20to%20Full%20Raw%20Excel%20Data%20Table%3C%2FLINGO-SUB%3E%3CLINGO-BODY%20id%3D%22lingo-body-1038772%22%20slang%3D%22en-US%22%3E%3CP%3E%3CA%20href%3D%22https%3A%2F%2Ftechcommunity.microsoft.com%2Ft5%2Fuser%2Fviewprofilepage%2Fuser-id%2F427596%22%20target%3D%22_blank%22%3E%40rlussky%3C%2FA%3E%26nbsp%3B%3C%2FP%3E%3CP%3EHello%2C%3C%2FP%3E%3CP%3EThis%20is%20a%20simple%20job%20with%20power%20query%3A%20see%20enclosed%20file%20with%20solution%3A%3C%2FP%3E%3CP%3E%3CU%3E%3CSTRONG%3EStep1.%3C%2FSTRONG%3E%3C%2FU%3E%3C%2FP%3E%3CP%3E-%20Load%20the%20starting%20table%20into%20power%20query%20as%20Table1%2C%20fill%20down%20the%20Header1%20column%2C%26nbsp%3B%3C%2FP%3E%3CP%3E-%20remove%20the%20column%20Header3%3C%2FP%3E%3CP%3E-%20remove%20the%20rows%20with%20null%20values%20in%20Header2%20column%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CSTRONG%3EPower%20Query%20steps%3A%3C%2FSTRONG%3E%3C%2FP%3E%3CP%3Elet%3CBR%20%2F%3ESource%20%3D%20Excel.CurrentWorkbook()%7B%5BName%3D%22Table1%22%5D%7D%5BContent%5D%2C%3CBR%20%2F%3E%23%22Changed%20Type%22%20%3D%20Table.TransformColumnTypes(Source%2C%7B%7B%22Header1%22%2C%20type%20text%7D%2C%20%7B%22Header2%22%2C%20type%20text%7D%2C%20%7B%22Header3%22%2C%20type%20text%7D%7D)%2C%3CBR%20%2F%3E%23%22Filled%20Down%22%20%3D%20Table.FillDown(%23%22Changed%20Type%22%2C%7B%22Header1%22%7D)%2C%3CBR%20%2F%3E%23%22Removed%20Columns%22%20%3D%20Table.RemoveColumns(%23%22Filled%20Down%22%2C%7B%22Header3%22%7D)%2C%3CBR%20%2F%3E%23%22Filtered%20Rows%22%20%3D%20Table.SelectRows(%23%22Removed%20Columns%22%2C%20each%20(%5BHeader2%5D%20%26lt%3B%26gt%3B%20null))%3CBR%20%2F%3Ein%3CBR%20%2F%3E%23%22Filtered%20Rows%22%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CU%3E%3CSTRONG%3EStep2.%3C%2FSTRONG%3E%3C%2FU%3E%3C%2FP%3E%3CP%3E-%20Load%20the%20starting%20table%20into%20power%20query%20as%20Table2%2C%20fill%20down%20the%20Header1%20column%2C%26nbsp%3B%3C%2FP%3E%3CP%3E-%20remove%20the%20column%20Header2%3C%2FP%3E%3CP%3E-%20remove%20the%20rows%20with%20null%20values%20in%20Header3%20column%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CSTRONG%3EPower%20Query%20steps%3A%3C%2FSTRONG%3E%3C%2FP%3E%3CP%3Elet%3CBR%20%2F%3ESource%20%3D%20Excel.CurrentWorkbook()%7B%5BName%3D%22Table1%22%5D%7D%5BContent%5D%2C%3CBR%20%2F%3E%23%22Changed%20Type%22%20%3D%20Table.TransformColumnTypes(Source%2C%7B%7B%22Header1%22%2C%20type%20text%7D%2C%20%7B%22Header2%22%2C%20type%20text%7D%2C%20%7B%22Header3%22%2C%20type%20text%7D%7D)%2C%3CBR%20%2F%3E%23%22Filled%20Down%22%20%3D%20Table.FillDown(%23%22Changed%20Type%22%2C%7B%22Header1%22%7D)%2C%3CBR%20%2F%3E%23%22Removed%20Columns%22%20%3D%20Table.RemoveColumns(%23%22Filled%20Down%22%2C%7B%22Header2%22%7D)%2C%3CBR%20%2F%3E%23%22Filtered%20Rows%22%20%3D%20Table.SelectRows(%23%22Removed%20Columns%22%2C%20each%20(%5BHeader3%5D%20%26lt%3B%26gt%3B%20null))%3CBR%20%2F%3Ein%3CBR%20%2F%3E%23%22Filtered%20Rows%22%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CU%3E%3CSTRONG%3EStep3.%3C%2FSTRONG%3E%3C%2FU%3E%3C%2FP%3E%3CP%3E-%20Merge%20Table1%20with%20Table2%20based%20on%20Header1%3C%2FP%3E%3CP%3E-%20Add%20Header3%20from%20merged%20table%3C%2FP%3E%3CP%3E-%20Sort%20the%20columns%20to%20obtain%20the%20requested%20order%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3CSTRONG%3EPower%20Query%20steps%3A%3C%2FSTRONG%3E%3C%2FP%3E%3CP%3Elet%3CBR%20%2F%3ESource%20%3D%20Table.NestedJoin(Table1%2C%7B%22Header1%22%7D%2CTable2%2C%7B%22Header1%22%7D%2C%22Table2%22%2CJoinKind.LeftOuter)%2C%3CBR%20%2F%3E%23%22Expanded%20Table2%22%20%3D%20Table.ExpandTableColumn(Source%2C%20%22Table2%22%2C%20%7B%22Header3%22%7D%2C%20%7B%22Table2.Header3%22%7D)%2C%3CBR%20%2F%3E%23%22Sorted%20Rows%22%20%3D%20Table.Sort(%23%22Expanded%20Table2%22%2C%7B%7B%22Header1%22%2C%20Order.Ascending%7D%2C%20%7B%22Table2.Header3%22%2C%20Order.Ascending%7D%2C%20%7B%22Header2%22%2C%20Order.Ascending%7D%7D)%3CBR%20%2F%3Ein%3CBR%20%2F%3E%23%22Sorted%20Rows%22%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3EFor%20more%20information%2C%20please%20do%20not%20hesitate%20to%20contact%20me.%3C%2FP%3E%3CP%3ERgds%2C%3C%2FP%3E%3CP%3EDirk%20Verliefden%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3C%2FLINGO-BODY%3E
rlussky
Regular Visitor

I need some help with making a process much simpler. I'd like to take a table from an external source and turn it into full, raw data. The issue is the formatting from the source. Below is how it looks.

10.17.19 Excel Table.PNG

I need raw data with all the combinations per item in Header1 that would look like this.

10.17.19 Excel Table 2.PNG

 

Is there an efficient way to do this process with a much larger table? One that could be run as a macro would be amazing. Thanks!

1 Reply

@rlussky 

Hello,

This is a simple job with power query: see enclosed file with solution:

Step1.

- Load the starting table into power query as Table1, fill down the Header1 column, 

- remove the column Header3

- remove the rows with null values in Header2 column

 

Power Query steps:

let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Header1", type text}, {"Header2", type text}, {"Header3", type text}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"Header1"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Header3"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Header2] <> null))
in
#"Filtered Rows"

 

Step2.

- Load the starting table into power query as Table2, fill down the Header1 column, 

- remove the column Header2

- remove the rows with null values in Header3 column

 

Power Query steps:

let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Header1", type text}, {"Header2", type text}, {"Header3", type text}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"Header1"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Header2"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Header3] <> null))
in
#"Filtered Rows"

 

Step3.

- Merge Table1 with Table2 based on Header1

- Add Header3 from merged table

- Sort the columns to obtain the requested order

 

Power Query steps:

let
Source = Table.NestedJoin(Table1,{"Header1"},Table2,{"Header1"},"Table2",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(Source, "Table2", {"Header3"}, {"Table2.Header3"}),
#"Sorted Rows" = Table.Sort(#"Expanded Table2",{{"Header1", Order.Ascending}, {"Table2.Header3", Order.Ascending}, {"Header2", Order.Ascending}})
in
#"Sorted Rows"

 

For more information, please do not hesitate to contact me.

Rgds,

Dirk Verliefden

 

Related Conversations
Tabs and Dark Mode
cjc2112 in Discussions on
46 Replies
Extentions Synchronization
Deleted in Discussions on
3 Replies
Stable version of Edge insider browser
HotCakeX in Discussions on
35 Replies
How to Prevent Teams from Auto-Launch
chenrylee in Microsoft Teams on
30 Replies
flashing a white screen while open new tab
Deleted in Discussions on
14 Replies
Security Community Webinars
Valon_Kolica in Security, Privacy & Compliance on
13 Replies