Import data from a Microsoft Forms PDF into Excel
How about this:
1. Use OCR with positional logic
If your printed PDFs are consistent in layout:
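• Use Adobe Acrobat Pro or Tesseract OCR to extract the text. OCR engines often misread checkbox glyphs, so run a few sample pages first and check what actually comes out for ticked and empty boxes.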
• Write a VBA macro or Python script to parse the OCR output and infer selections based on position or symbols (e.g., “☒” vs “☐”).
• This works best if the layout is fixed and you can map each Likert option to a column.
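As a rough sketch of the parsing step in Python: it assumes the OCR text has one question per line, that boxes come out as “☒”/“☑” (ticked) and “☐” (empty), and that there are five Likert options in a fixed left-to-right order. The glyph sets, the option labels, and the function names are all my own placeholders, so adjust them to what your OCR really produces.

```python
import csv
import io

# Assumed glyphs; replace with whatever your OCR engine actually emits.
CHECKED = {"☒", "☑"}
UNCHECKED = {"☐"}
BOXES = CHECKED | UNCHECKED

# Assumed option order, left to right on the printed form.
LIKERT = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]


def parse_likert_line(line, options=LIKERT):
    """Return (question_label, selected_option) for one OCR'd line.

    selected_option is None when the number of boxes does not match the
    options, or when zero or several boxes are ticked (needs manual review).
    """
    boxes = [ch for ch in line if ch in BOXES]
    # Everything before the first box is treated as the question label.
    first_box = next((i for i, ch in enumerate(line) if ch in BOXES), len(line))
    label = line[:first_box].strip()
    ticked = [i for i, ch in enumerate(boxes) if ch in CHECKED]
    if len(boxes) != len(options) or len(ticked) != 1:
        return label, None
    return label, options[ticked[0]]


def to_csv(ocr_text):
    """Turn OCR text into CSV (question, answer) that Excel can open."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["question", "answer"])
    for line in ocr_text.splitlines():
        if any(ch in BOXES for ch in line):  # skip lines without boxes
            writer.writerow(parse_likert_line(line))
    return out.getvalue()
```

Rows where the answer comes back empty are the ones to check by hand, since the OCR either dropped a box or found more than one tick.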
2. Switch to fillable PDFs
Instead of printing the Forms:
• Use Microsoft Word or Adobe Acrobat to create a fillable PDF version of the questionnaire.
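• Users fill it out digitally, and you can then read the checkbox states from the form fields, for example with a script or Acrobat's form-data export. Be aware that Excel’s “Get Data → From PDF” reads tables and text from the page content, so don't count on it to pick up form-field values.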
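If you go the script route, here is a minimal Python sketch using the third-party pypdf package. It assumes each Likert option is its own checkbox field named like "Q1_Agree" (a naming scheme I made up for illustration) and that an unticked box has the value "/Off" or no value at all. The "on" value differs between PDFs, so the code treats anything else as ticked.

```python
def is_checked(value):
    """Treat any value other than missing, empty or 'Off' as ticked."""
    return value is not None and str(value) not in ("/Off", "Off", "")


def selected_option(raw_values, question, options):
    """Pick the single ticked option for one question.

    raw_values maps field names to raw field values. Field names are
    assumed to look like '<question>_<option>' (hypothetical scheme).
    Returns None if zero or several options are ticked.
    """
    chosen = [
        opt for opt in options
        if is_checked(raw_values.get(f"{question}_{opt}"))
    ]
    return chosen[0] if len(chosen) == 1 else None


def read_raw_values(pdf_path):
    """Read every form field's raw value from a filled-in PDF."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    fields = PdfReader(pdf_path).get_fields() or {}
    return {name: field.get("/V") for name, field in fields.items()}
```

Usage would be something like selected_option(read_raw_values("response.pdf"), "Q1", ["Disagree", "Neutral", "Agree"]), where "response.pdf" is a placeholder file name. If the questionnaire uses radio-button groups instead of separate checkboxes, the group's value is the chosen option itself and the lookup gets simpler.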
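3. Use Power Automate or the Forms Excel export
If you control the original form:
• Responses are kept with the form and can be opened in a linked Excel workbook from the Responses tab.
• You can also use Power Automate's Microsoft Forms connector to write each new response straight into Excel, bypassing the PDF entirely. As far as I know there is no officially documented public Forms API, so these two routes are the safer bet.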
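Once the responses are in a workbook, you may want the Likert answers as numbers. A small Python sketch of that step, assuming the first row is the header and the answers are stored as text labels. The 1 to 5 mapping, the column names, and the function names are illustrative assumptions, and load_responses needs the third-party openpyxl package.

```python
# Assumed label-to-score mapping; edit to match the wording on your form.
SCORES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def score_rows(header, rows, likert_columns):
    """Replace Likert labels with numbers in the named columns.

    Other columns are left untouched. Blank or unexpected answers
    become None so they are easy to spot in Excel.
    """
    indexes = [header.index(col) for col in likert_columns]
    scored = []
    for row in rows:
        row = list(row)
        for i in indexes:
            label = "" if row[i] is None else str(row[i]).strip().lower()
            row[i] = SCORES.get(label)
        scored.append(row)
    return scored


def load_responses(xlsx_path):
    """Return (header, data_rows) from the first sheet of a workbook."""
    from openpyxl import load_workbook  # third-party: pip install openpyxl

    sheet = load_workbook(xlsx_path, read_only=True).active
    rows = list(sheet.iter_rows(values_only=True))
    return list(rows[0]), rows[1:]
```

For example, score_rows(["Id", "Q1"], [(1, "Agree")], ["Q1"]) gives [[1, 4]].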