May 12 2023 02:08 AM
Hi
I receive financial accounts information, Nominal Ledger Activity Report by individual account code, from clients in PDF format, which can run to 100's of pages, example below:
I use Power Query to load elements into Excel for analysis - which works great when the block of information being loaded is on a single sheet on the PDF file, but when the block of information I need runs onto several pages, I end up with multiple worksheets (as PQ is treating each page as a separate query) in Excel when I've loaded it.
I've tried to transform the data, by appending the data together - issues with column headings missing after the first page. Also, after appending, when I close and load, I cannot just load the appended query, it loads everything into multiple Excel worksheets - all the individual pages of the PDF.
Any help would be gratefully received.
Thanks.
May 12 2023 05:09 AM
Solution...when the block of information I need runs onto several pages, I end up with multiple worksheets (as PQ is treating each page as a separate query) in Excel when I've loaded it
If you get n queries it's probably because you select multiple Tables/Pages in the Navigator, ex.:
Instead, you should uncheck Select multiple items > Click on the file name (what I squared in green), then at the bottom of the Navigator click Transform Data. You then get a single query with all the Tables/Pages found in the PDF, ex.:
From there Combine & Transform as you expect
Re. issues with column headings missing after the first page
This is +/- expected, this mainly depends on how the PDF is structured. In principal that's manageable...
Difficult to say more without an actual PDF sample. Hope this helps a bit
May 12 2023 07:40 AM