Announcing data import from PDF documents
Published Aug 05 2020 02:10 PM 119K Views
Microsoft

We are excited to announce a new addition to the Get & Transform Data capabilities in Excel: the PDF data connector! This has been one of your top requests, and we heard you. With the new From PDF connector, you can connect to PDF files and use their data just like any other data source in Excel.

 

The new From PDF connector is available as part of an Office 365 subscription. If you are an Office 365 subscriber, find out how to get the latest updates.

 

The following sections describe how to connect to a PDF file, select data, and bring that data into Excel.

 

Connect to a PDF file

To connect to a PDF file, open the Get Data menu from the Data tab on the ribbon. Select From File and click From PDF.


You are prompted to provide the location of the PDF file you want to use. Once you provide the file location and the PDF file loads, a Navigator window appears and displays the list of tables and pages in the document that you can import the data from.


You can browse through the PDF document data and select one or multiple elements to import into Excel. When you are ready to import, select the Load button to bring the data into Excel, or Transform Data to clean your data and prepare it for analysis with Power Query Editor.

 

Advanced scenarios

In some cases, you may want to import a range of pages from a PDF document at once. To do this, specify StartPage and EndPage as optional parameters of the PDF connection in the underlying M formula in the Power Query Editor:

Pdf.Tables(File.Contents("C:\Sample.pdf"), [StartPage=5, EndPage=10])

For more information, refer to the Pdf.Tables M function documentation.
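As a minimal sketch of how the page-range options might be used in a complete query, the following M expression loads pages 5 to 10 and appends every detected table into one. It assumes that the navigation table returned by Pdf.Tables exposes Kind and Data columns and that the tables share a compatible column layout; the file path and the step names (Source, TablesOnly, Combined) are placeholders, not part of the original post.

```powerquery
let
    // Placeholder path; replace with the location of your own PDF file.
    Source = Pdf.Tables(File.Contents("C:\Sample.pdf"), [StartPage = 5, EndPage = 10]),
    // Assumption: the Kind column distinguishes detected tables from whole-page entries.
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Append the remaining tables; this is only meaningful if their columns line up.
    Combined = Table.Combine(TablesOnly[Data])
in
    Combined
```

You could paste a query like this into the Advanced Editor of the Power Query Editor and adjust the page range. If the tables in your document have different layouts, selecting them individually in the Navigator is likely the safer choice.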

 

We hope you will like this new addition to Excel and we’d love to hear what you think about it. Please click File > Feedback and let us know. We’re excited to hear from you!

 

Guy Hunkin

– Excel Team

61 Comments

@Guy Hunkin , thank you. Is it for all channels?

Microsoft

@Sergei Baklan , it is for the Current Channel as we do with other new features in Excel. It will go to other channels according to the update cadence of a particular channel.

Steel Contributor

Nice. Sadly some of the data we need only comes to us via PDF. This will avoid a lot of manual keying or copying and reformatting text out of PDF files.

Good development

Awesome!

Steel Contributor

Great!

Copper Contributor

This is really super. I'll be the king of accounting inside 3 months :)

Copper Contributor

Great! Excel is still getting features that keep it relevant to companies' innovation!

 

Am I dreaming to hope that OCR (optical character recognition) will come to Excel in the future? VBA automating a collection of images / pictures / scanned images to retrieve data?

 OCR?

Optical handwritten recognition?

Nice!

Microsoft

@EduardoGS ,

 

I suggest you submit these ideas via Excel User Voice. It helps us to prioritize - https://excel.uservoice.com/.

 

Guy

- Excel Team

 

Brass Contributor

From File in Excel now = From File in Power BI Desktop - Kudos to the Excel Team - Just 5 More to go 

 

Cheers Sam

Gold Contributor

Great!!!

A long-awaited opportunity; I hope it will be developed further. For example, converting JavaScript into Excel formulas or VBA could be a great opportunity for Excel (formulas and/or VBA) over Java. Just fantasizing a bit; the solution tempted me :)

Brass Contributor

Love it! 

Copper Contributor

I tested this new functionality today on some PDFs at work. For some PDFs it had no problem identifying pages that contained data in table formats. For other documents, where the data layout looks like it was originally a spreadsheet, Excel thought the pages were blank. Is there a reason this new functionality would not recognize data on a page when there is some?

Microsoft

@Mark_Kusek Can you please share the problematic PDFs with me (just make sure that you are not sharing any private or sensitive information) and point me to the problematic tables/pages? This will help me to reproduce the problem locally and investigate.

 

Guy

- Excel Team

Brass Contributor

Keen to use this - built some amazing solutions with it for Power BI already.  But it's not showing up in my menu?

I'm running Excel "Subscription Product, Microsoft 365 Apps for enterprise". Current Channel. Just updated. Build 16.0.13127.20502.

Copper Contributor

@Guy Hunkin I can't share the problematic PDF because it contains personally identifiable information (PII).

Microsoft

@Mark_Kusek, the only way for me to check this is by testing it with the problematic PDF locally in our lab. If you are able to reproduce it on a PDF document that doesn't contain any sensitive data - please let me know.

 

Guy

- Excel Team

Microsoft

@Mike_Honey , the PDF connector should be available for you under the Data tab on the ribbon > Get Data > From File > From PDF. Please restart Excel and let me know if the connector appears for you as needed.

 

Guy

- Excel Team

Brass Contributor

@Guy Hunkin - yes that worked - thanks!

Copper Contributor

@Guy Hunkin thank you for your post! I have Microsoft Office 365 ProPlus and the "From PDF" function is not available. Is it only available on the Beta channel? If yes, when will it be available on all Office 365 channels? Many thanks!

Copper Contributor

@Guy Hunkin  I just discovered this feature, and I love it. It is working as it should. I use it in Power BI as well in Excel. It helps me a lot to organize my marketing data sources, from which, unfortunately, some are published like PDFs. I am also using many third-party data sources, and those connectors are starting to work better and better, fortunately.

Microsoft

@karakhan ,

 

PDF connector is available to all customers running Excel for Office 365 on the Current Update Channel. What's your Excel version please? Go to File > Account > and you will see your Excel version next to the About Excel button. It should look something like this:

1.png

 

Guy

- Excel Team

 

Copper Contributor

@Guy Hunkin thank you for quick reply. Untitled.jpg

I see your Excel version is 2011. How can I get the latest version of Excel?

Thanks.

Microsoft

@karakhan ,

 

The PDF connector is currently available for Current Channel customers only. It will be coming to the Semi-Annual channel soon. In the meantime, you can try switching to Current Channel as described at the following link:

https://docs.microsoft.com/en-us/deployoffice/change-update-channels

 

My version is much newer than yours because I am a Microsoft employee and effectively a beta tester of new Excel versions before they are released to customers.

 

Guy

- Excel Team

Copper Contributor

Unfortunately, we are on an older version of Excel (2008), so I cannot test this new feature. Before I submit a request to our IT dept to upgrade, can someone confirm whether the PDF connector will read data from form fields? We conduct inspections in the field on mobile devices using PDF editable forms, but the data becomes trapped in the form. I'm hoping this connector will allow us to start mining data from key fields on the form. Please advise.

 

Copper Contributor

@Fred_BIRD I just tried to convert from PDF fields, but they are not visible. But I can not confirm for sure that it is not working.

Copper Contributor

Hi, @Guy Hunkin!

 

I have two PCs:

 

Excel PDF Data Connector = Enable = PC - A = License = Microsoft 365 ( Subscription ) 

O365-ExcelEnablePDFDataConnector-Show.PNG

 

Excel PDF Data Connector = Disable = PC - B = License = Microsoft Office 2019 Pro Plus ( VL )

O2019-ExcelDisablePDFDataConnector-NotShow.PNG

 

Office Pro Plus 2019 newest version of Microsoft 365.

a. Why is the PDF data connector not shown (enabled) in Office Pro Plus 2019?

b. When will the PDF data connector be shown (enabled) in Office Pro Plus 2019?

 

Thanks,

Samir Morimoto

@SamirMMBr , as mentioned in the post, the connector is available only for Office 365 subscribers. I believe it will never be available for 2019, only perhaps for Excel 2022 or whatever it will be called. One-time purchase versions do not receive feature updates.

Microsoft

Hi @djordje_m!

 

PDF fields are currently unsupported. Maybe https://azure.microsoft.com/en-us/services/cognitive-services/form-recognizer/ can solve your problem.

 

Guy

- Excel Team

Microsoft

@SamirMMBr , currently the PDF connector is available only in Office 365. We also plan to enable it in the next perpetual Office version.

 

Hope it helps.

 

Guy

- Excel team

Copper Contributor

Hi, I am running Microsoft 365 Apps for enterprise, Version 2008 (Build 16127.21064 Click-to-Run) which, if I read the above correctly, should have the PDF connector for Power Query.

 

However, it is not showing.    Is "Microsoft 365 Apps for enterprise" not O365?

Steel Contributor

Hi @Teessider66 - given that build number you seem to be on a deferred channel. The release notes for deferred don't yet show the excel PDF connector. You might need to wait another few months. Release notes for Semi-Annual Enterprise Channel releases in 2020 - Office release notes | Microsoft...

Gold Contributor

Will it be available for the previous versions like 2016 as well, in time, or is it just for the 365 versions and up?

@NikolinoDE , as @Guy Hunkin clarified in a previous comment, it is now only for subscribers. It is planned for the next perpetual Office version, which means the one that comes after 2019. Thus not for 2016 & 2019.

Gold Contributor

Then we mortals just wait for the next perpetual version of Office.

Thank you

Copper Contributor

This doesn't work for me. I have a work 365 account but I can't see the PDF option. 

Copper Contributor

gg.jpg

@Seany68 , it's better to share a screenshot like this

 

since without knowing which channel you are on, it's hard to discuss the available functionality.

Copper Contributor

This is a great feature to come to Excel that I will be able to make use of!

Microsoft

@Seany68 , can you please go to the Data tab on the ribbon, open the Get Data dropdown, go to From File, take a screenshot and share it with me here.

 

Guy

- Excel Team

Copper Contributor

Hi @Guy Hunkin can you please tell me how I can find out when this will be released in the Semi-Annual Enterprise Channel?

I'm on Version 2008 (Build 13127.21348) which from what I can tell was only released 2 days ago, but I still don't see the PDF option

Thanks

Copper Contributor

Thank you @Guy Hunkin - very helpful feature! Curious if this is available in Mac-based systems running Office365? I have a colleague running MS Excel for Mac 16.45, and the option did not appear to be available at first glance.

Microsoft

@SCSorenson , PDF connector is not available in Excel for Mac yet.

 

Guy

- Excel Team

Copper Contributor

@Guy Hunkin Thank you for your quick reply! We look forward to this feature.

Microsoft

@MattNaylor, please restart Excel and then check again if you can see the From PDF connector under Data tab of the ribbon > Get Data > From File. If you still can't see the PDF connector, please, kindly perform the following steps:

  1. Send us a frown via File > Feedback > I Don't Like Something. Make sure to attach logs and provide your email so we'll be able to reach out to you on this case.
  2. Let me know what your Session ID is. You can find it via File > Info > About Excel > copy the Session ID value as specified on the third row on top > and paste it in your reply.

It will help us to investigate.

 

Guy

- Excel Team

Copper Contributor

@Guy Hunkin 

1 - done

2 - Session ID: DF59CD3D-A5F2-4636-91C0-0616C31E8338

thanks

Copper Contributor

Hi,

I installed Microsoft Office 2019 Professional Plus edition on my Windows Server 2016. In Excel, I don't see the Get Data > From File > From PDF option. How can I get the PDF option in my Excel?

Please guide me as I recently purchased the new version to get the PDF option in Power Query.

 

Regards

Vishal

Gold Contributor

Hello Mr Vishal,

 

With the permission of everyone involved ...

Please read the announcement from Mr. Guy Hunkin:

The new From PDF connector is available as part of an Office 365 subscription. If you are an Office 365 subscriber, find out how to get the latest updates.

 

...and answer from

 

@Nikolino , as @Guy Hunkin clarified in a previous comment, it is now only for subscribers. It is planned for the next perpetual Office version, which means the one that comes after 2019. Thus not for 2016 & 2019.

 

Hadn't seen it either :))

 

Microsoft

Thanks, @MattNaylor! We will look into this.

 

Guy

- Excel Team

Microsoft

@MattNaylor , thank you for submitting all the information! My team has looked into the issue you reported. You will have to wait for the Semi-Annual release scheduled for January 2021 (currently in preview) to get the PDF connector. Hope it helps.

 

Guy

- Excel Team

Last update: Aug 05 2020 02:12 PM