Help request. Power query manipulation of messy data

New Contributor

New to Power Query (mind is kinda blown I never new this existed, been using the heck out of Excel and array formulas/nested if's to do similar things, in a harder way and repeatedly, for years).


My "starting point" is right after pdf import of a table. The pdf is diabolical. 300+ pg's, each of which has a table. When I import, PowerQuery does not recognize the table spanning multiple pg's, and does not recognize the table on every pg. So I need to set up query steps for the pg's, BUT the each pg does not import as the same number of columns!


End goal is to tease out those chunks of data that are 4 rows tall but with the qty is stuck to the side, in a nearby column. Eventually having one set per row.


Two questions:

1. Anyone see an elegant way to do a single pg? I am working on replacing, [images] and null's with blank; Collapsing all columns down to one, comma delimited; Replacing multiple commas with one comma; then splitting things back out. But no success yet.

2. The pdf has ~300pgs, one table on each page. I import and and select all the pg's via check box, which loads each pg as a query. How do I automatically apply my "applied steps" to all queries? It seems I can copy the "advanced editor" lines in to each pg, but I don't want to do that 300 times, every week.

I could maybe append all the pg's first, if I can get my solution to be nimble enough.

Since all tables are from the same pdf, the query editor doesn't pick a sample table, like it does when you load multiple pdfs at once.


I have a sample excel file. Page002 and Page003 are examples of how the pages import. Goal is a manually created version.


Thank you for any ideas!

2 Replies

I am trying to figure out how to post my sample file.

Edit:  Added google docs link. If there is a better way, Please let me know. 

Hi @pau11y 


With the sample you shared this seems doable. To confirm, could you share a PDF file?