Forum Discussion
PDF to Excel
My client emailed me his bank statements in PDF format, and I've been trying to use Power Query to convert them to Excel format without success. Please help me in this matter. A Sample file is included.
Afraid that's nothing to do with the file created such way.
- NikolinoDEGold Contributor
In some cases, you can use Power Query (prerequisite, of course, that your excel version offers this possibility) to import data from a PDF file. Power Query allows you to connect to various data sources, including PDF files. Open Excel, go to the "Data" tab, click on "From File" and select "From PDF." Follow the prompts to import the data from the PDF file.
If there isn't that much data, I would recommend converting it to Excel beforehand (see inserted file in Excel) and then using the data through the Excel sheet. Otherwise, when there is many data, you should pay attention to a few things when importing data from PDF.
- PDFs with structured tables: If the PDF file contains tables with clear column and row structures, Power Query can recognize and extract the tabular data efficiently. You can select the table in the PDF and import it using Power Query, which will create a table in Excel with the extracted data.
- PDFs with text-based content: Power Query can also handle PDF files that contain text-based content, such as reports, invoices, or receipts. It can extract the text and convert it into a structured format in Excel. You may need to define the delimiters or use specific transformations to split the text into columns and rows.
- PDFs with consistent formatting: If the PDF files you're working with have consistent formatting across different pages, Power Query can be set up to apply the same transformations to each page automatically. This is useful when dealing with multipage PDF documents or recurring reports.
- PDFs with selectable text: Power Query requires selectable text to extract data accurately. If the PDF file contains scanned images or non-selectable text, the extraction process might not work properly. In such cases, you may need to use optical character recognition (OCR) tools to convert the scanned text into selectable text before using Power Query.
- PDFs with multiple layouts: Power Query allows you to define and apply different extraction rules for PDF files with varying layouts. You can create custom queries and transformations to handle different page structures and extract the required data accordingly.
The success of using Power Query to import data from PDF files depends on the quality and structure of the PDF file itself. All this is roughly described because my knowledge of PQ is modest.
- CK_UniqueCopper Contributor
Good day,
I have been trying to use PDF to excel by using Data> New Query> From... but I have all the options except from PDF. Please can you assist me. I have bank statements in PDF that I need to convert into excel. Your help would be much appreciated.
Thank you and regards,
- NikolinoDEGold Contributor
To assist you in converting your bank statements from PDF to Excel using Power Query in Office 365, let’s go through a detailed step-by-step guide. If your Excel version doesn't show the "From PDF" option, we will also cover alternative solutions.
Step-by-Step Guide for Power Query
- Ensure you have the correct Excel version:
- Power Query's "From PDF" feature is available in Microsoft 365 subscription versions and Excel 2016 or later. Ensure your Excel is updated to the latest version.
- Using Power Query to Import PDF Data:
- Open Excel.
- Go to the "Data" tab.
- Select "Get Data" (or "New Query").
- Choose "From File" and then "From PDF".
If you don’t see the "From PDF" option, it means your Excel version doesn’t support this feature. In such cases, consider these alternatives:
Alternative Solutions if "From PDF" Option is Unavailable
Method 1: Using Online PDF to Excel Converters
- Online PDF to Excel Conversion Tools:
- Table2XL: A straightforward tool for converting PDF tables to Excel.
- ReceiptsAI: Specifically designed for bank statements and similar documents.
- Adobe Acrobat Online Tools: Adobe’s online tool for converting PDFs to Excel.
- Steps:
- Upload the PDF file to one of these online tools.
- Download the converted Excel file.
- Import the Excel file into your Power Query for further processing.
Method 2: Using Adobe Acrobat
- Open the PDF in Adobe Acrobat.
- Export the PDF:
- Go to File > Export To > Spreadsheet > Microsoft Excel Workbook.
- Save the file.
- Import the Excel file into Power Query for any additional data transformation.
Method 3: Using a Third-Party Software
- Invest in a third-party software tool designed for PDF to Excel conversion, such as:
- Able2Extract Professional
- PDF2XL
- Steps:
- Install the software.
- Open the PDF in the software.
- Convert the PDF to Excel.
- Import the Excel file into Power Query.
Addressing Specific User Queries
User 7: Missing "From PDF" Option
If you do not see the "From PDF" option under Data > New Query > From File, you can:
- Verify Excel Version: Ensure you are using an Office 365 subscription or Excel 2016 or later.
- Update Excel: Sometimes, updating to the latest version can resolve missing features.
- Use Alternatives: Use one of the online converters mentioned above to convert your PDF bank statements to Excel, then import the converted Excel file into your Power Query for further processing.
Summary
While Power Query is a powerful tool for importing and transforming data directly from PDFs, its availability depends on the version of Excel you are using. If the feature is not available, using online conversion tools or third-party software can be effective alternatives. For users facing difficulties, ensuring the correct version of Excel and considering other conversion tools are practical steps to achieve the desired results. The text was created with the help of AI.
My answers are voluntary and without guarantee!
Hope this will help you.
- mmx400Copper ContributorTry table2xl.com
Afraid that's nothing to do with the file created such way.
- ronaldwichhartCopper Contributor
You could try receiptsai.com
Upload a PDF bank statement and you can download all the transactions into CSV or Excel.
Best of luck!
-Ronald
- OG_partsaceCopper ContributorI'm having the same problem. I have watched a bunch of videos that shows how easy it is to this, but it's not. All of my PDF's (883) are the same layouts with the same content in rows, Only the content has changed and not the layout. I personally think the PDF to Excel does not work with 365. I shouldn't have to go find an online source to import a "Microsoft Print to PDF" into Microsoft Excel. If the online sites can do the conversion, Microsoft Excel should be able to do it, too!
- JohnnwhiteCopper Contributor
I was having the same problem with importing data from PDF files into Excel, and I found that using Power Query can be quite effective. If your Excel version supports it, you can utilize Power Query to connect to PDF files and extract the data directly. You can access this feature by navigating to the "Data" tab in Excel, selecting "From File," and then choosing "From PDF." Power Query can handle various types of PDF content, including structured tables and text-based content. For instance, if your PDF contains tables, Power Query can recognize and extract the tabular data efficiently. Additionally, it can handle PDFs with consistent formatting across multiple pages and adapt to different layouts. So, if you're looking to embed Excel charts with data imported from PDF files, using Power Query could be the solution you need.
- JethriaCopper ContributorIf your Excel file doesn't contain important or sensitive data then you can try some online PDF to Excel converter tool!