Forum Discussion
samuel_kodjoe
Jun 28, 2021 Copper Contributor
How do I extract data from a PDF file using Power Query in Excel?
Hi team, I would like to find out whether there is a way to extract data from a PDF file using Power Query in Excel. Regards!
Lorenzo
Jun 29, 2021 Silver Contributor
Given that you don't have the corresponding wizard in the Excel user interface, you have to code it yourself. Start with a new blank query and enter something like this in the formula bar:
= Pdf.Tables(File.Contents("FolderPath\Example.pdf"), [Implementation="1.2"])
Then, assuming the function finds a table in your PDF, click on it in the [Data] column.
NB: regarding [Implementation=x.y], the documentation at https://docs.microsoft.com/en-us/powerquery-m/pdf-tables says:
- The newest version should always give the best results. (I've seen a couple of cases where this wasn't true, so you have to test which version gives the best result for your PDF.)
- Valid values are "1.3", "1.2", "1.1", or null. (1.3 isn't available yet. Please refer to DanielPerelman's reply on https://docs.microsoft.com/en-us/answers/questions/382594/pdf-file-transformation.html)
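For reference, the formula-bar step and the click on the [Data] column can also be written as a single query in the Advanced Editor. This is only a sketch: the path is the same placeholder as above (File.Contents needs the full path to your own file), the step names are made up for illustration, and it assumes the result of Pdf.Tables has a [Kind] column that marks detected tables as "Table", which is how it usually appears in the preview.

```powerquery
let
    // Placeholder path: replace with the full path to your own PDF
    Source = Pdf.Tables(File.Contents("FolderPath\Example.pdf"), [Implementation="1.2"]),
    // Assumption: rows with Kind = "Table" are detected tables, other rows are whole pages
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Same effect as clicking the first table in the [Data] column;
    // this step raises an error if no table was found in the PDF
    FirstTable = TablesOnly{0}[Data]
in
    FirstTable
```

If a different Implementation value gives better results for your PDF, only the Source step needs to change.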
samuel_kodjoe
Jun 29, 2021 Copper Contributor
This is what I get when trying it with 1.2. The other values give me invalid answers.
Lorenzo
Jun 29, 2021 Silver Contributor
Carefully read the error message you get; it clearly indicates what the problem is. I checked the formula you entered and it's OK (no syntax error or anything like that).