Forum Discussion

Talvo's avatar
Talvo
Copper Contributor
Oct 25, 2025

Import data from a Microsoft Forms PDF into Excel

Hi all. I have a number of PDFs for clients which contain questionnaires, score sheets, etc. Some are completely external, and some are from Microsoft Forms.

A while back, I created a Macro/some VBA code to read data from the external PDFs, import it into Excel and display the figures; it works really well.

Today, I've tried to update it to load data from the PDFs created by "printing" a Microsoft Forms complete questionnaire and saving as a PDF - unfortunately, when trying to import this into Excel (using "Get Data -> From File -> PDF"), the "Likert" questions are appearing but showing "null" for all the columns, not identifying which contains the "selected" checkbox.

Possibly a slim hope, but I was wondering if anyone has experience doing something similar, or can recommend a way to get this to work?

I know there are potentially other ways to approach this (I tried connecting it directly to the linked Sharepoint Excel file, but the URL is entirely powered by URL arguments ?foo=bar&baz=boing etc, and Excel demands those all be deleted when giving it a filename to connect to to get data from the web, helpfully); connecting it to a "local" version of the Excel file on OneDrive is a possibility, but ideally I'd a) want the tool to be accessible to people who have access to the PDFs but not the spreadsheet, and b) have to save the print-out PDFs in every client folder to store their data together, and the spreadsheet may occasionally be purged, so the PDF is the "safe" way to access the data.)

PDFs can't be run through any online tools due to containing sensitive data.

Many thanks.

1 Reply

  • How about this:

     

    1. Use OCR with positional logic
    If your printed PDFs are consistent in layout:
    •    Use Adobe Acrobat Pro or Tesseract OCR to extract text and checkbox symbols.
    •    Write a VBA macro or Python script to parse the OCR output and infer selections based on position or symbols (e.g., “☒” vs “☐”).
    •    This works best if the layout is fixed and you can map each Likert option to a column.
    2. Switch to fillable PDFs
    Instead of printing the Forms:
    •    Use Microsoft Word or Adobe Acrobat to create a fillable PDF version of the questionnaire.
    •    Users fill it out digitally, and you can extract checkbox states using Excel’s “Get Data → From PDF” or VBA.
    3. Use Microsoft Forms API or Excel sync
    If you control the original Forms:
    •    Responses are stored in a linked Excel file or SharePoint list.
    •    You can use Power Automate or Forms API to extract responses directly, bypassing the PDF entirely.

Resources