Forum Discussion
Venugs
Mar 16, 2023, Copper Contributor
Populating SharePoint document library metadata from a PDF file
I am looking for a way to populate metadata directly from the last page of a PDF document, without any user input, when the document is uploaded to a SharePoint document library. ...
Paul de Jong
Mar 16, 2023, Iron Contributor
"metadata needs to be populated directly from the last page of the document without any user input."
Is the data you are interested in stored in the last page of the PDF document or is it stored in PDF properties (like Title, Author, Subject, Keywords, ...)?
In the latter case there are not many alternatives. The property promotion mechanism that allows bi-directional transfer of metadata between Office files and SharePoint columns does not work for PDF files. There are tools (https://collab365.com/best-document-management-solutions-for-sharepoint/#t-1677591384118) that support one-way sync (from PDF to SharePoint columns) during upload. I am not aware of any tools that support bi-directional sync.
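For the PDF-properties case, the one-way direction (PDF to columns) could be sketched roughly as below. This is only an illustration, not how any of the listed tools work: it assumes the third-party `pypdf` package for reading the file, and the SharePoint column names (`DocAuthor`, `DocSubject`, `DocKeywords`) are hypothetical placeholders. Writing the values to SharePoint is left out.

```python
# Standard PDF document-information keys mapped to SharePoint column names.
# The column names on the right are hypothetical placeholders.
PDF_KEY_TO_COLUMN = {
    "/Title": "Title",
    "/Author": "DocAuthor",
    "/Subject": "DocSubject",
    "/Keywords": "DocKeywords",
}


def read_pdf_properties(pdf_path):
    """Return the PDF's document-information dictionary as a plain dict."""
    from pypdf import PdfReader  # third-party (assumed): pip install pypdf

    info = PdfReader(pdf_path).metadata  # None when the PDF has no info dict
    return {str(key): str(value) for key, value in (info or {}).items()}


def properties_to_columns(properties):
    """One-way mapping: keep only known keys that have a non-empty value."""
    columns = {}
    for pdf_key, column in PDF_KEY_TO_COLUMN.items():
        value = properties.get(pdf_key, "").strip()
        if value:
            columns[column] = value
    return columns


# Usage with an in-memory sample instead of a real file:
sample = {"/Title": "Quarterly report", "/Author": " P. Example ", "/Producer": "x"}
print(properties_to_columns(sample))
# {'Title': 'Quarterly report', 'DocAuthor': 'P. Example'}
```

Keys that are not in the mapping (such as `/Producer`) are ignored, so the result only ever contains columns that were explicitly listed.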
- Venugs
  Mar 16, 2023, Copper Contributor
  So, the data is extracted from the last page of the PDF file using Python packages into an Excel file stored in OneDrive.
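The thread does not say which Python packages are used or how the last page is laid out, so the following is only a minimal sketch of the extraction step under two assumptions: the third-party `pypdf` package reads the file, and the last page holds one `Label: value` pair per line. The function names are made up for illustration, and writing the result to Excel or to SharePoint columns is not shown.

```python
import re

# Assumed layout: one "Label: value" pair per line on the last page.
FIELD_PATTERN = re.compile(r"^\s*([^:\n]+?)\s*:\s*(.+?)\s*$")


def last_page_text(pdf_path):
    """Extract the text of the last page of a PDF."""
    from pypdf import PdfReader  # third-party (assumed): pip install pypdf

    reader = PdfReader(pdf_path)
    return reader.pages[-1].extract_text() or ""


def parse_fields(page_text):
    """Turn "Label: value" lines into a dict; other lines are skipped."""
    fields = {}
    for line in page_text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            label, value = match.groups()
            fields[label] = value
    return fields


# Usage with sample text standing in for last_page_text("some.pdf"):
sample_text = "Document summary\nProject Code: PX-104\nApproved By:  J. Smith \n"
print(parse_fields(sample_text))
# {'Project Code': 'PX-104', 'Approved By': 'J. Smith'}
```

Splitting on the first colon keeps values such as timestamps intact, and lines without a colon or without a value are ignored rather than raising an error.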