SOLVED

PDF to Excel

My client emailed me his bank statements in PDF format, and I've been trying to use Power Query to convert them to Excel format, without success. Please help me with this. A sample file is attached.

12 Replies

@g_avinash 

In some cases, you can use Power Query to import data from a PDF file (provided, of course, that your Excel version offers this option). Power Query can connect to various data sources, including PDF files. Open Excel, go to the "Data" tab, click "Get Data", choose "From File", and select "From PDF". Follow the prompts to import the data from the PDF file.

If there isn't much data, I would recommend converting it to Excel beforehand (see the attached Excel file) and then working with the data in the worksheet. Otherwise, when there is a lot of data, you should pay attention to a few things when importing from PDF.

  1. PDFs with structured tables: If the PDF file contains tables with clear column and row structures, Power Query can recognize and extract the tabular data efficiently. You can select the table in the PDF and import it using Power Query, which will create a table in Excel with the extracted data.
  2. PDFs with text-based content: Power Query can also handle PDF files that contain text-based content, such as reports, invoices, or receipts. It can extract the text and convert it into a structured format in Excel. You may need to define the delimiters or use specific transformations to split the text into columns and rows.
  3. PDFs with consistent formatting: If the PDF files you're working with have consistent formatting across different pages, Power Query can be set up to apply the same transformations to each page automatically. This is useful when dealing with multipage PDF documents or recurring reports.
  4. PDFs with selectable text: Power Query requires selectable text to extract data accurately. If the PDF file contains scanned images or non-selectable text, the extraction process might not work properly. In such cases, you may need to use optical character recognition (OCR) tools to convert the scanned text into selectable text before using Power Query.
  5. PDFs with multiple layouts: Power Query allows you to define and apply different extraction rules for PDF files with varying layouts. You can create custom queries and transformations to handle different page structures and extract the required data accordingly.

The success of using Power Query to import data from PDF files depends on the quality and structure of the PDF file itself. All this is roughly described because my knowledge of PQ is modest.
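For text-based PDFs, the "split the text into columns" step from point 2 can also be done outside Power Query. The following is a minimal Python sketch, not a definitive implementation: it assumes the statement text has already been extracted to plain lines like those in the sample posted later in this thread, and that every transaction line has the shape date, particulars, optional cheque number, amount, balance. The names `ROW`, `to_decimal`, and `parse_row` are illustrative only.

```python
import re
from decimal import Decimal

# Assumed shape of one transaction line: date, particulars, optional cheque
# number, amount, running balance (optionally suffixed with Cr/Dr).
ROW = re.compile(
    r"^(?P<date>\d{2}-\d{2}-\d{2})\s+"
    r"(?P<particulars>.+?)"
    r"(?:\s+(?P<chq_no>\d+))?"
    r"\s+(?P<amount>[\d,]+\.\d{2})"
    r"\s+(?P<balance>[\d,]+\.\d{2})(?:Cr|Dr)?$"
)


def to_decimal(text):
    """Convert '3,57,283.33' (Indian digit grouping) to Decimal('357283.33')."""
    return Decimal(text.replace(",", ""))


def parse_row(line):
    """Return a dict for a transaction line, or None for any other line."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return {
        "date": match["date"],
        "particulars": match["particulars"],
        "chq_no": match["chq_no"] or "",
        "amount": to_decimal(match["amount"]),
        "balance": to_decimal(match["balance"]),
    }


# Lines copied from the sample statement text in this thread.
sample = [
    "IMPS/P2A/109813238784/GOOGLEINDIADIGI/-",
    "08-04-21 FORTAP-MUMBAI/ 11,200.00 3,70,483.33",
    "09-04-21 SERBAN-BANGALO 794001 3,40,000.00 42,733.33",
    "Page Total: 0 56,250.00 3,57,283.33Cr",
]
rows = [row for row in map(parse_row, sample) if row is not None]
for row in rows:
    print(row)
```

Narration lines, page totals, and headers do not start with a date, so they are skipped. A different bank's layout would need a different pattern.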

Try table2xl.com
best response confirmed by g_avinash (Copper Contributor)
Solution

@g_avinash 

I'm afraid nothing can be done with a file created that way.

@g_avinash

You could try receiptsai.com

Upload a PDF bank statement and you can download all the transactions into CSV or Excel.

Best of luck!

-Ronald

I'm having the same problem. I have watched a bunch of videos that show how easy it is to do this, but it's not. All of my PDFs (883) have the same layout with the same content in rows; only the content changes, not the layout. I personally think PDF to Excel does not work with 365. I shouldn't have to go find an online source to import a "Microsoft Print to PDF" file into Microsoft Excel. If the online sites can do the conversion, Microsoft Excel should be able to do it, too!

I was having the same problem with importing data from PDF files into Excel, and I found that using Power Query can be quite effective. If your Excel version supports it, you can use Power Query to connect to PDF files and extract the data directly. You can access this feature by navigating to the "Data" tab in Excel, selecting "Get Data", then "From File", and then choosing "From PDF". Power Query can handle various types of PDF content, including structured tables and text-based content. For instance, if your PDF contains tables, Power Query can recognize and extract the tabular data efficiently. It can also handle PDFs with consistent formatting across multiple pages and adapt to different layouts. So, if you're looking to build Excel charts from data imported from PDF files, Power Query could be the solution you need.

If your PDF doesn't contain important or sensitive data, you can try an online PDF to Excel converter.

Good day,

@NikolinoDE

I have been trying to convert PDF to Excel using Data > New Query > From..., but I have all the options except From PDF. Could you please assist me? I have bank statements in PDF that I need to convert into Excel. Your help would be much appreciated.

Thank you and regards,

@CK_Unique 

To assist you in converting your bank statements from PDF to Excel using Power Query in Office 365, let’s go through a detailed step-by-step guide. If your Excel version doesn't show the "From PDF" option, we will also cover alternative solutions.

Step-by-Step Guide for Power Query

  1. Ensure you have the correct Excel version:
    • Power Query's "From PDF" connector is available in Excel for Microsoft 365 on Windows and, as far as I know, in Excel 2021 and later. Excel 2016 and 2019 do not appear to include it, so update Excel to the latest version available to you.
  2. Using Power Query to Import PDF Data:
    1. Open Excel.
    2. Go to the "Data" tab.
    3. Select "Get Data" (or "New Query").
    4. Choose "From File" and then "From PDF".

If you don’t see the "From PDF" option, it means your Excel version doesn’t support this feature. In such cases, consider these alternatives:

Alternative Solutions if "From PDF" Option is Unavailable

Method 1: Using Online PDF to Excel Converters

  1. Online PDF to Excel Conversion Tools:
    • Table2XL: A straightforward tool for converting PDF tables to Excel.
    • ReceiptsAI: Specifically designed for bank statements and similar documents.
    • Adobe Acrobat Online Tools: Adobe’s online tool for converting PDFs to Excel.
  2. Steps:
    1. Upload the PDF file to one of these online tools.
    2. Download the converted Excel file.
    3. Import the Excel file into Power Query for further processing.

Method 2: Using Adobe Acrobat

  1. Open the PDF in Adobe Acrobat.
  2. Export the PDF:
    1. Go to File > Export To > Spreadsheet > Microsoft Excel Workbook.
    2. Save the file.
  3. Import the Excel file into Power Query for any additional data transformation.

Method 3: Using a Third-Party Software

  1. Invest in a third-party software tool designed for PDF to Excel conversion, such as:
    • Able2Extract Professional
    • PDF2XL
  2. Steps:
    1. Install the software.
    2. Open the PDF in the software.
    3. Convert the PDF to Excel.
    4. Import the Excel file into Power Query.

Addressing Specific User Queries

User 7: Missing "From PDF" Option

If you do not see the "From PDF" option under Data > New Query > From File, you can:

  1. Verify Excel Version: Ensure you are using Excel for Microsoft 365 on Windows (or, as far as I know, Excel 2021 or later); Excel 2016 and 2019 do not appear to include the PDF connector.
  2. Update Excel: Sometimes, updating to the latest version can resolve missing features.
  3. Use Alternatives: Use one of the online converters mentioned above to convert your PDF bank statements to Excel, then import the converted Excel file into your Power Query for further processing.

Summary

While Power Query is a powerful tool for importing and transforming data directly from PDFs, its availability depends on the version of Excel you are using. If the feature is not available, using online conversion tools or third-party software can be effective alternatives. For users facing difficulties, ensuring the correct version of Excel and considering other conversion tools are practical steps to achieve the desired results. The text was created with the help of AI.

 

My answers are voluntary and without guarantee!

Hope this will help you.

Apparently, importing data from a PDF file with Power Query does not work on a Mac. Is there any third-party software that works on a Mac? @NikolinoDE

@rprieto55 

If you are using a Mac and the "From PDF" option in Power Query is not available, you can use third-party software specifically designed for macOS to convert PDF files to Excel. Here are some reliable third-party solutions that work on a Mac:

Third-Party Software for PDF to Excel Conversion on Mac

  1. Adobe Acrobat Pro DC
    • Features: Adobe Acrobat Pro DC is a powerful tool that allows you to convert PDF files to various formats, including Excel.
    • Steps:
      1. Open the PDF file in Adobe Acrobat Pro DC.
      2. Go to File > Export To > Spreadsheet > Microsoft Excel Workbook.
      3. Save the converted file.
  2. Able2Extract Professional
    • Features: Able2Extract is a versatile PDF conversion tool that supports converting PDFs to Excel on a Mac.
    • Steps:
      1. Download and install Able2Extract Professional for Mac.
      2. Open the PDF file in Able2Extract.
      3. Select the content you want to convert.
      4. Click on Excel to convert the selected PDF content to an Excel file.
  3. PDF2XL
    • Features: PDF2XL is another dedicated tool for converting PDF files to Excel, offering a user-friendly interface and accurate conversions.
    • Steps:
      1. Download and install PDF2XL on your Mac.
      2. Open the PDF file in PDF2XL.
      3. Configure the conversion settings as needed.
      4. Export the PDF to an Excel file.
  4. Online PDF to Excel Converters
    • Table2XL:
      • Features: A straightforward online tool for converting PDF tables to Excel.
      • Steps:
        1. Visit the Table2XL website.
        2. Upload your PDF file.
        3. Download the converted Excel file.
    • ReceiptsAI:
      • Features: Specifically designed for converting bank statements and similar documents.
      • Steps:
        1. Visit the ReceiptsAI website.
        2. Upload your PDF file.
        3. Download the converted Excel file.
    • Adobe Acrobat Online Tools:
      • Features: Adobe’s online tool for converting PDFs to Excel.
      • Steps:
        1. Visit Adobe Acrobat Online Tools.
        2. Upload your PDF file.
        3. Download the converted Excel file.

Steps to Import the Converted Excel File into Power Query on Mac

  1. Open Excel:
    • Launch Excel on your Mac.
  2. Import the Excel File:
    • Go to the Data tab.
    • Select Get Data.
    • Choose From Workbook and locate the Excel file you converted.
  3. Transform Data Using Power Query:
    • Once the file is imported, you can use Power Query to further process and transform the data as needed.
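One clean-up that may be worth doing on the converted file, whether before import or in Power Query itself, is the date column. The sample statement in this thread uses DD-MM-YY dates (for example 08-04-21), which Excel may read as month-first depending on locale. Below is a small Python sketch that rewrites such dates in ISO form. It assumes dates appear only as DD-MM-YY or DD-MM-YYYY, and the function name `to_iso_date` is hypothetical.

```python
from datetime import datetime


def to_iso_date(text):
    """Convert '08-04-21' or '30-05-2023' (day first) to 'YYYY-MM-DD'."""
    # Try the four-digit year first; a two-digit year does not match %Y.
    for fmt in ("%d-%m-%Y", "%d-%m-%y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


print(to_iso_date("08-04-21"))
print(to_iso_date("30-05-2023"))
```

ISO dates are unambiguous, so Excel and Power Query should read them the same way regardless of regional settings.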

Summary

While macOS does not support the "From PDF" option in Power Query, you can use third-party software like Adobe Acrobat Pro DC, Able2Extract Professional, PDF2XL, or online converters to convert your PDF files to Excel. Once converted, you can easily import and manipulate the data in Excel using Power Query. These tools provide reliable alternatives to achieve the desired results on a Mac. The text and steps were edited with the help of AI.

@rprieto55

Online tool:

http://e.anyoupin.cn/bsbm/stu60/a83/spire/upload_pdf.html

Transaction Details Page 3 of 349

IMPS/P2A/109712895857/GOOGLEINDIADIGI/-

08-04-21 FORTAP-MUMBAI/ 2,200.00 3,57,283.33

--------------------------------------------------------------------------------

Page Total: 0 56,250.00 3,57,283.33Cr

--------------------------------------------------------------------------------

Note: Cheques received in inward clearing will be considered for debiting/

returning on the basis opening balance in account

Unless the constituent notifies the bank of any discrepancy in this statement

BANK OF BARODA Date :30-05-2023

VIDYARANYAPURA Time : 16:23:52

ADDRESS:

HELPLINE NO. : 1800223344/18001024455/18002584455

BRANCH PHONE NO. :

MICR CODE: 560012100 IFSC CODE: BARB0VJVIPU Page No: 3

IMPS/P2A/109810816712/GOOGLEINDIADIGI/-

--------------------------------------------------------------------------------

A/C Number : 74390200000002 Account Open Date :24-06-2019

Statement of account for the period of 01-04-2021 to 30-05-2023

--------------------------------------------------------------------------------

DATE PARTICULARS CHQ.NO. WITHDRAWALS DEPOSITS BALANCE

--------------------------------------------------------------------------------

08-04-21 FORTAP-MUMBAI/ 2,000.00 3,59,283.33

IMPS/P2A/109813238784/GOOGLEINDIADIGI/-

08-04-21 FORTAP-MUMBAI/ 11,200.00 3,70,483.33

IMPS/P2A/109815708175/GOOGLEINDIADIGI/-

08-04-21 FORTAP-MUMBAI/ 1,250.00 3,71,733.33

IMPS/P2A/109816926918/GOOGLEINDIADIGI/-

08-04-21 FORTAP-MUMBAI/ 1,200.00 3,72,933.33

IMPS/P2A/109817016121/GOOGLEINDIADIGI/-

08-04-21 FORTAP-MUMBAI/ 900.00 3,73,833.33

IMPS/P2A/109817107872/GOOGLEINDIADIGI/-

09-04-21 FORTAP-MUMBAI/ 8,900.00 3,82,733.33

IMPS/P2A/109912583798/GOOGLEINDIADIGI/-

09-04-21 SERBAN-BANGALO 794001 3,40,000.00 42,733.33

SRI SAI MANJUNATHA ENTER

09-04-21 FORTAP-MUMBAI/ 900.00 43,633.33

IMPS/P2A/109913806340/GOOGLEINDIADIGI/-

12-04-21 FORTAP-MUMBAI/ 3,200.00 46,833.33

IMPS/P2A/110009540324/GOOGLEINDIADIGI/-

12-04-21 FORTAP-MUMBAI/ 400.00 47,233.33

IMPS/P2A/110013221244/GOOGLEINDIADIGI/-

12-04-21 FORTAP-MUMBAI/ 3,900.00 51,133.33

IMPS/P2A/110014412304/GOOGLEINDIADIGI/-

12-04-21 DIGITB-VADODAR 28412957 3,150.00 54,283.33

UPI/110028412957/16:00:22/UPI/9886689112@ybl/Paym
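In the extracted text above, each transaction line carries a single amount followed by the balance, so the WITHDRAWALS and DEPOSITS columns of the original layout are lost. One way to recover them is to compare consecutive balances. The Python sketch below shows the idea on four consecutive rows taken from the text above. It assumes the rows are in statement order with none missing, and `split_amounts` is a hypothetical helper name.

```python
import csv
import io
from decimal import Decimal


def to_decimal(text):
    """Convert '3,40,000.00' to Decimal('340000.00')."""
    return Decimal(text.replace(",", ""))


def split_amounts(rows):
    """Put each amount in a withdrawal or deposit column using the balance change."""
    result = []
    previous_balance = None
    for date, particulars, amount_text, balance_text in rows:
        amount = to_decimal(amount_text)
        balance = to_decimal(balance_text)
        withdrawal = deposit = ""
        if previous_balance is not None:
            if previous_balance + amount == balance:
                deposit = str(amount)
            elif previous_balance - amount == balance:
                withdrawal = str(amount)
        # With no earlier balance, or an inconsistent one, both columns stay blank.
        result.append([date, particulars, withdrawal, deposit, str(balance)])
        previous_balance = balance
    return result


# Four consecutive transactions from the extracted statement text.
rows = [
    ("08-04-21", "FORTAP-MUMBAI/", "900.00", "3,73,833.33"),
    ("09-04-21", "FORTAP-MUMBAI/", "8,900.00", "3,82,733.33"),
    ("09-04-21", "SERBAN-BANGALO", "3,40,000.00", "42,733.33"),
    ("09-04-21", "FORTAP-MUMBAI/", "900.00", "43,633.33"),
]
table = split_amounts(rows)

# Write the result as CSV text that Excel can open directly.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["DATE", "PARTICULARS", "WITHDRAWALS", "DEPOSITS", "BALANCE"])
writer.writerows(table)
csv_text = buffer.getvalue()
print(csv_text)
```

The first row cannot be classified because there is no earlier balance to compare against; in practice the opening balance of the statement would fill that gap.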
