Forum Discussion
Convert External Web / PDF Table to Full Raw Excel Data Table
- Nov 28, 2019
Hello,
This is a simple job with power query: see enclosed file with solution:
Step1.
- Load the starting table into power query as Table1, fill down the Header1 column,
- remove the column Header3
- remove the rows with null values in Header2 column
Power Query steps:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Header1", type text}, {"Header2", type text}, {"Header3", type text}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"Header1"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Header3"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Header2] <> null))
in
#"Filtered Rows"Step2.
- Load the starting table into power query as Table2, fill down the Header1 column,
- remove the column Header2
- remove the rows with null values in Header3 column
Power Query steps:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Header1", type text}, {"Header2", type text}, {"Header3", type text}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"Header1"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Header2"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Header3] <> null))
in
#"Filtered Rows"Step3.
- Merge Table1 with Table2 based on Header1
- Add Header3 from merged table
- Sort the columns to obtain the requested order
Power Query steps:
let
Source = Table.NestedJoin(Table1,{"Header1"},Table2,{"Header1"},"Table2",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(Source, "Table2", {"Header3"}, {"Table2.Header3"}),
#"Sorted Rows" = Table.Sort(#"Expanded Table2",{{"Header1", Order.Ascending}, {"Table2.Header3", Order.Ascending}, {"Header2", Order.Ascending}})
in
#"Sorted Rows"For more information, please do not hesitate to contact me.
Rgds,
Dirk Verliefden
Hello,
This is a simple job with power query: see enclosed file with solution:
Step1.
- Load the starting table into power query as Table1, fill down the Header1 column,
- remove the column Header3
- remove the rows with null values in Header2 column
Power Query steps:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Header1", type text}, {"Header2", type text}, {"Header3", type text}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"Header1"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Header3"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Header2] <> null))
in
#"Filtered Rows"
Step2.
- Load the starting table into power query as Table2, fill down the Header1 column,
- remove the column Header2
- remove the rows with null values in Header3 column
Power Query steps:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Header1", type text}, {"Header2", type text}, {"Header3", type text}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"Header1"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Header2"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Header3] <> null))
in
#"Filtered Rows"
Step3.
- Merge Table1 with Table2 based on Header1
- Add Header3 from merged table
- Sort the columns to obtain the requested order
Power Query steps:
let
Source = Table.NestedJoin(Table1,{"Header1"},Table2,{"Header1"},"Table2",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(Source, "Table2", {"Header3"}, {"Table2.Header3"}),
#"Sorted Rows" = Table.Sort(#"Expanded Table2",{{"Header1", Order.Ascending}, {"Table2.Header3", Order.Ascending}, {"Header2", Order.Ascending}})
in
#"Sorted Rows"
For more information, please do not hesitate to contact me.
Rgds,
Dirk Verliefden
- rlusskyDec 17, 2019Copper Contributor
Dirk VERLIEFDEN - Thank you! This was really helpful in understanding the logic.