SOLVED

PDF to Excel

Copper Contributor

My client emailed me his bank statements in PDF format, and I've been trying to use Power Query to convert them to Excel format without success. Please help me in this matter. A Sample file is included.

6 Replies

@g_avinash 

In some cases, you can use Power Query (prerequisite, of course, that your excel version offers this possibility) to import data from a PDF file. Power Query allows you to connect to various data sources, including PDF files. Open Excel, go to the "Data" tab, click on "From File" and select "From PDF." Follow the prompts to import the data from the PDF file.

If there isn't that much data, I would recommend converting it to Excel beforehand (see inserted file in Excel) and then using the data through the Excel sheet. Otherwise, when there is many data, you should pay attention to a few things when importing data from PDF.

  1. PDFs with structured tables: If the PDF file contains tables with clear column and row structures, Power Query can recognize and extract the tabular data efficiently. You can select the table in the PDF and import it using Power Query, which will create a table in Excel with the extracted data.
  2. PDFs with text-based content: Power Query can also handle PDF files that contain text-based content, such as reports, invoices, or receipts. It can extract the text and convert it into a structured format in Excel. You may need to define the delimiters or use specific transformations to split the text into columns and rows.
  3. PDFs with consistent formatting: If the PDF files you're working with have consistent formatting across different pages, Power Query can be set up to apply the same transformations to each page automatically. This is useful when dealing with multipage PDF documents or recurring reports.
  4. PDFs with selectable text: Power Query requires selectable text to extract data accurately. If the PDF file contains scanned images or non-selectable text, the extraction process might not work properly. In such cases, you may need to use optical character recognition (OCR) tools to convert the scanned text into selectable text before using Power Query.
  5. PDFs with multiple layouts: Power Query allows you to define and apply different extraction rules for PDF files with varying layouts. You can create custom queries and transformations to handle different page structures and extract the required data accordingly.

The success of using Power Query to import data from PDF files depends on the quality and structure of the PDF file itself. All this is roughly described because my knowledge of PQ is modest.

Try table2xl.com
best response confirmed by g_avinash (Copper Contributor)
Solution

@g_avinash 

Afraid that's nothing to do with the file created such way.

@g_avinash 

 

You could try receiptsai.com

 

Upload a PDF bank statement and you can download all the transactions into CSV or Excel.

 

Best of luck!

 

-Ronald

 

 

I'm having the same problem. I have watched a bunch of videos that shows how easy it is to this, but it's not. All of my PDF's (883) are the same layouts with the same content in rows, Only the content has changed and not the layout. I personally think the PDF to Excel does not work with 365. I shouldn't have to go find an online source to import a "Microsoft Print to PDF" into Microsoft Excel. If the online sites can do the conversion, Microsoft Excel should be able to do it, too!

I was having the same problem with importing data from PDF files into Excel, and I found that using Power Query can be quite effective. If your Excel version supports it, you can utilize Power Query to connect to PDF files and extract the data directly. You can access this feature by navigating to the "Data" tab in Excel, selecting "From File," and then choosing "From PDF." Power Query can handle various types of PDF content, including structured tables and text-based content. For instance, if your PDF contains tables, Power Query can recognize and extract the tabular data efficiently. Additionally, it can handle PDFs with consistent formatting across multiple pages and adapt to different layouts. So, if you're looking to embed Excel charts with data imported from PDF files, using Power Query could be the solution you need.

 

1 best response

Accepted Solutions
best response confirmed by g_avinash (Copper Contributor)
Solution

@g_avinash 

Afraid that's nothing to do with the file created such way.

View solution in original post