Forum Discussion
samuel_kodjoe
Jun 28, 2021 Copper Contributor
How do I extract data from a PDF file using Power Query in Excel?
Hi team, I would like to find out whether there is a way to extract data from a PDF file using Power Query in Excel. Regards!
Lorenzo
Jun 29, 2021 Silver Contributor
If the Excel user interface doesn't offer you the corresponding wizard, you have to write the query yourself. Start with a new blank query and enter something like this in the formula bar:
= Pdf.Tables(File.Contents("FolderPath\Example.pdf"), [Implementation="1.2"])
Then, assuming the function finds a table in your PDF, click on it in the [Data] column.
NB: regarding [Implementation=x.y], the documentation at https://docs.microsoft.com/en-us/powerquery-m/pdf-tables says:
- The newest version should always give the best results.
  I've seen a couple of cases where this wasn't true, so you have to test which version gives the best result for your PDF.
- Valid values are "1.3", "1.2", "1.1", or null.
  "1.3" isn't available yet. Please refer to DanielPerelman's reply at https://docs.microsoft.com/en-us/answers/questions/382594/pdf-file-transformation.html
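For reference, the steps above could be written as one query in the Advanced Editor. This is only a minimal sketch: the file path is the same placeholder as above, the step names (Source, TablesOnly, FirstTable) are made up for illustration, and it assumes the result of Pdf.Tables exposes [Kind] and [Data] columns, with "Table" as the Kind of detected tables.

```
let
    // Placeholder path from the example above; replace with the full path to your PDF
    Source = Pdf.Tables(File.Contents("FolderPath\Example.pdf"), [Implementation = "1.2"]),
    // Assumption: rows detected as tables have Kind = "Table" (other rows represent whole pages)
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Same effect as clicking the first table in the [Data] column; errors if no table was found
    FirstTable = TablesOnly{0}[Data]
in
    FirstTable
```

To compare versions, change the Implementation value to "1.1" (or another value your build supports) and check which result matches your PDF best.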
SergeiBaklan
Jun 29, 2021 Diamond Contributor
A small comment: "1.3" is available now. I'm not sure about the exact PQ version.
Lorenzo
Jun 30, 2021 Silver Contributor
On Current Channel v2106 b14131.20278 (latest Click-to-Run), "1.3" is still not available.
(The same goes for PBI Desktop v2.94.921.0 - June 2021.)
If you have it on the Insider Channel, it should land here in a few weeks...
SergeiBaklan
Jun 30, 2021 Diamond Contributor
Yes, that's PQ 2.95.223.0 on the Insiders channel.