Unable to get Data from one particular Indian bank statement pdf file
The issue you're experiencing when importing data from a Union Bank of India PDF statement could be due to how the data is formatted or structured in the PDF. Excel's "Get Data" feature may struggle with PDFs that have complex layouts, scanned (image-only) pages, or unusual text encoding, all of which make it hard to detect and extract tables.
Here are several approaches you can try to solve the problem:
1. Use a PDF Converter Tool
If Excel's built-in PDF import feature doesn't work, you can try converting the PDF file into another format first, such as Word or Excel, or re-saving it as a new PDF. After that, try the import again.
You can use online or desktop tools like:
- Adobe Acrobat Pro: It has an "Export to Excel" option, which might better interpret the table structure.
- Smallpdf or PDF2Excel: These are free tools that can convert the PDF into Excel-friendly formats.
2. Copy and Paste as Text
Sometimes, the data isn't properly organized into tables within the PDF. In such cases:
1. Open the PDF file in a PDF viewer.
2. Select and copy the text containing the data.
3. Paste the text into Excel.
4. Use Excel's Text to Columns feature (on the "Data" tab) to split the text into columns.
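If the pasted text is too messy for Text to Columns, the same splitting can be done with a small script outside Excel. The following is only a minimal Python sketch, assuming the columns in the copied text are separated by tabs or by two or more spaces; the function name split_columns and the sample lines are made up for illustration and are not taken from a real statement.

```python
import re

def split_columns(line: str) -> list[str]:
    """Split one pasted line on tabs or on runs of two or more spaces."""
    cells = re.split(r"\t| {2,}", line.strip())
    # Drop empty cells caused by repeated separators.
    return [cell.strip() for cell in cells if cell.strip()]

# Hypothetical pasted text (illustrative only, not real bank data).
pasted = (
    "01/04/2024  UPI payment to shop   500.00   12,345.67\n"
    "02/04/2024  NEFT salary credit   25,000.00   37,345.67\n"
)

rows = [split_columns(line) for line in pasted.splitlines() if line.strip()]
for row in rows:
    print(row)
```

Single spaces inside a description are kept together, which is why the split is on two or more spaces. Note that this simple approach also drops empty cells, so rows with a blank debit or credit column may need extra handling.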
3. Inspect the PDF Formatting
It’s possible that the "Get Data" feature cannot detect any tables because the data in the PDF is not structured as actual tables (e.g., the data might be presented as an image or be improperly formatted). You can:
1. Open the PDF and inspect how the data is displayed.
2. If the data looks like an image or the formatting is complex, you can try an OCR (Optical Character Recognition) tool to extract the text. Tools like Adobe Acrobat or Google Drive can perform OCR on PDFs to convert images of text into actual text.
4. Use Power Query to Manually Extract Data
If Excel’s built-in PDF table detection fails, you can manually adjust the import using Power Query:
- Go to Data > Get Data > From File > From PDF.
- In the Navigator window, even if no usable tables are listed, try selecting the individual page entries (for example Page001) instead of the table entries.
- Look at different sections of the PDF to see if Power Query can grab any part of the data, and combine them later in Excel if needed.
5. Check for Hidden Layers in the PDF
Some PDFs have hidden layers or objects that are preventing Excel from detecting the data correctly. To resolve this, try:
- Opening the PDF in a different PDF reader or editor and saving it as a new PDF file with all layers flattened.
- Alternatively, you can try exporting the PDF as plain text (.txt) first, then re-importing that text into Excel.
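Once you have a plain text export, a short script can pick out the transaction lines and write them as CSV, which Excel opens directly. This is a rough Python sketch and not a ready-made solution: it assumes each transaction line starts with a date (DD/MM/YYYY or DD-MM-YYYY) and ends with an amount and a balance. The real column layout of your statement may differ, so the pattern ROW and the function name text_to_csv are illustrative assumptions that you would need to adapt.

```python
import csv
import io
import re

# Assumed line layout: date, free-text description, amount, balance.
ROW = re.compile(
    r"^(?P<date>\d{2}[/-]\d{2}[/-]\d{4})\s+"
    r"(?P<description>.+?)\s+"
    r"(?P<amount>[\d,]+\.\d{2})\s+"
    r"(?P<balance>[\d,]+\.\d{2})\s*$"
)

def text_to_csv(text: str) -> str:
    """Keep only lines that look like transactions and return them as CSV."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", "description", "amount", "balance"])
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # headers, footers and page numbers are skipped
            writer.writerow([
                match["date"],
                match["description"],
                match["amount"].replace(",", ""),   # remove thousands separators
                match["balance"].replace(",", ""),
            ])
    return out.getvalue()

# Hypothetical exported text (illustrative only).
sample = (
    "Statement of Account\n"
    "01/04/2024 UPI payment to shop 500.00 12,345.67\n"
    "Page 1 of 3\n"
)
print(text_to_csv(sample))
```

Removing the thousands separators lets Excel treat the amounts as numbers rather than text.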
6. Check for Corruption in the PDF
Sometimes, the issue could stem from the PDF file being corrupt or improperly generated. Try:
- Opening the file in a different PDF reader and exporting it as a new PDF, or
- Re-downloading the file from the bank's website or requesting a new version.
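Before re-downloading, you can do a quick sanity check on the file itself. A complete PDF normally begins with a "%PDF-" header and has an "%%EOF" marker near the end, so a file that lacks either was probably truncated or is not really a PDF (for example, a saved error page). The Python sketch below checks only these two markers; the function name looks_like_complete_pdf and the file name in the comment are hypothetical, and passing the check does not prove the file is free of other damage.

```python
def looks_like_complete_pdf(data: bytes) -> bool:
    """Rough check: '%PDF-' near the start and '%%EOF' near the end."""
    has_header = b"%PDF-" in data[:1024]
    has_end_marker = b"%%EOF" in data[-1024:]
    return has_header and has_end_marker

# Usage with a real file (the file name is a placeholder):
#     with open("statement.pdf", "rb") as f:
#         print(looks_like_complete_pdf(f.read()))

# Small in-memory examples:
print(looks_like_complete_pdf(b"%PDF-1.7\n(body)\n%%EOF\n"))  # complete
print(looks_like_complete_pdf(b"%PDF-1.7\n(body cut off"))    # truncated
```

If the check fails, downloading the statement again is likely the quickest fix.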
By following these steps, you should be able to extract the data from the PDF in one form or another and import it into Excel. If none of these options work, manually copying and pasting the data may be your last resort, followed by formatting it properly within Excel. The text was created with the help of AI.
My answers are voluntary and without guarantee!
Hope this will help you.