Forum Discussion
How do I extract data from a PDF file using Power Query in Excel?
The following works with a PDF stored on a Personal OneDrive:
- From a Web Browser, download the PDF file
- Open the Downloads page of the Web Browser (Microsoft Edge in my case)
- Copy the Download link
In Excel:
- Data (tab) > From Web > Paste the Download link > OK > ...
If you want to try with the PDF I shared, a Download link is:
https://public.am.files.1drv.com/y4mp7lEIREyPqh2W_UW3tzsSdWp6Kpim9Hpj2b2Sp-OcoTenAZ-33ASspv_OnvEkzD3RpLWfT4ftsglbRAfyHuRz5AIZq3FNL8HRX8-n6-0eV7bV1nTQkSrT77dOVHFIwewQYfCE0hNshQGrbt1JW4dVZuYOMEJy0yN4R7_3CweweFjm6wPeZWFYJxC8QSrloB5Fk9lsiU4a2RjqmGAMdxThaztoypmCrUsT8j7kO3H35E?AVOverride=1
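For reference, the Data > From Web steps above usually end up as a query along the following lines in the Advanced Editor. This is only a sketch: `DownloadLink` is a placeholder name for the link you copied, and the `Implementation = "1.3"` option is an assumption about what the PDF connector in your Excel build expects (the options record is optional and can be left out).

```powerquery
let
    // Placeholder: paste the Download link copied from the browser's Downloads page
    DownloadLink = "<paste the Download link here>",
    // Fetch the file over HTTP; on first use Excel asks which credentials to use
    PdfBinary = Web.Contents(DownloadLink),
    // List the tables and pages Power Query detects in the PDF
    Source = Pdf.Tables(PdfBinary, [Implementation = "1.3"])
in
    Source
```

The result is a navigation table with one row per detected table or page; drill into the one you need from its Data column. If the credential prompt appears, Anonymous is the option that matches a public download link.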
Going to Data > From Web and pasting in the download link is the process I originally used. However, Power Query treats the OneDrive address as something that requires authentication.
It comes back with the error message "We could not authenticate with the credentials provided. Please try again." No authentication is required to view and download the file, but when you put the URL in Power Query it doesn't seem to understand that.
- Lorenzo, Mar 30, 2023, Silver Contributor
Providing no feedback, good or bad, doesn't help people who Search (you did - thanks)
- Russ_MS_Commmunity, Mar 30, 2023, Copper Contributor
Lorenzo, I have not been able to get the solution proposed above to work. In every attempt, on multiple computers and multiple networks, it still asks for a login.
- Lorenzo, Feb 25, 2023, Silver Contributor
When OneDrive is responsive, this works without problems here. The attached sample query contains a download link and the expected dataset is returned to Excel.
After putting the query in place, if I clear the Permissions for OneDrive (the entry that matches the download link, https://public.am.files.1drv.com...), I am, of course, asked to authenticate (even as Anonymous) the next time I refresh the query.
IMHO you should have a look at your Data Source Settings and, if that doesn't help, provide more information about your context, i.e. where the PDF is actually stored (e.g. OneDrive Business, SharePoint...)