Forum Discussion
samuel_kodjoe
Jun 28, 2021Copper Contributor
how do i extract data from a pdf file using power query in excel?
hi team, Please i would like to find out if there is a way to extract data from a pdf file using power query in excel? regards!!
samuel_kodjoe
Jun 29, 2021Copper Contributor
wow.....so there is no ad ins i can install to access this with 2019 version?
Lorenzo
Jun 29, 2021Silver Contributor
Copy/paste the following code in a new blank query using the Advanced Editor
let
Source = Record.ToTable(
Record.RemoveFields(#shared,
Record.FieldNames(#sections[Section1])
)
),
SelectedTypeFunction = Table.SelectRows(Source, each [Value] is function),
RemovedValue = Table.SelectColumns(SelectedTypeFunction, {"Name"}),
SortedByName = Table.Sort(RemovedValue, {{"Name", Order.Ascending}})
in
SortedByName
then load the result of that query on a sheet. If you find Pdf.Tables in column [Name] the function is available in your version of Excel/PQ. Per its name the function is looking for table(s) in the PDF...
- samuel_kodjoeJun 29, 2021Copper Contributorhi Lz!
I saw Pdf Tables in column name after loading the code, how then do i locate it.
Thanks for the feedback.- LorenzoJun 29, 2021Silver Contributor
Given you don't have the corresponding wizard in Excel user interface, you have to code it yourself. You start with a new blank query and in the formula bar, something like:
= Pdf.Tables(File.Contents("FolderPath\Example.pdf"), [Implementation="1.2"])
Then, assuming the function finds a Table in your PDF, click on it in the [Data] column:
NB: re. [Implementation=x.y] the Pdf.Tables doc. says:
- The newest version should always give the best results
I've seen a couple of cases where this wasn't true so you have to test what gives you the best result based on your PDF - Valid values are "1.3", "1.2", "1.1", or null
1.3 isn't avail. yet. Please refer to DanielPerelman's reply on this thread
- SergeiBaklanJun 29, 2021MVP
Small comment, 1.3 is available now. Not sure about exact PQ version.
- The newest version should always give the best results