I have several completed forms in Word format. Each one is basically a table that has been filled in. Like so:
To get the information from this form into SharePoint as metadata in a library, I have added content controls to the template. That works well for new forms. The challenge is getting the old forms into SharePoint.
Is there a good way to merge the old docs with the new template to populate the content controls?
Alternatively, is there a way to OCR the fields and populate the SharePoint metadata?
It is complicated. The content controls, as you call them, are added to documents by the SharePoint parser on upload to the library. This stamps the content type into the DOCX and makes the content controls available in the DOCX, either through Quick Parts or the custom XML part.
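If you want to see what is actually inside one of your documents, here is a rough Python sketch (standard library only) that lists the content controls in the main document part and shows whether each one is bound to a custom XML part. The function name `list_content_controls` is made up for illustration. It assumes the controls live in `word/document.xml` and use the standard `w:dataBinding` element; controls in headers or footers are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's "{uri}name" form
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def list_content_controls(docx):
    """Return one dict per content control (w:sdt) in the main document part.

    `docx` may be a file path or a binary file-like object.
    Hypothetical helper for inspection only; it does not modify the file.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    controls = []
    for sdt in root.iter(W + "sdt"):
        props = sdt.find(W + "sdtPr")
        if props is None:
            continue
        alias = props.find(W + "alias")
        tag = props.find(W + "tag")
        binding = props.find(W + "dataBinding")
        controls.append({
            "alias": alias.get(W + "val") if alias is not None else None,
            "tag": tag.get(W + "val") if tag is not None else None,
            # A bound control points at a node in a custom XML part
            "xpath": binding.get(W + "xpath") if binding is not None else None,
            "store_item_id": (
                binding.get(W + "storeItemID") if binding is not None else None
            ),
        })
    return controls
```

Running this against a document downloaded from the library and against one of the old forms should show the difference: a control with no `xpath` is a plain content control that is not wired to any library column.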
I don't think there is an easy way of adding the content type structure to your old docs (without uploading to SharePoint or building some custom code) - but I'm happy to be wrong on this.
The trick is mapping any old content controls to the actual quick parts that represent the content type fields. This can be done programmatically but you need a dev that understands the OpenXML API.
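For the old forms, the first step of any such code would be pulling the values out of the filled-in table. A minimal Python sketch of that step is below, using only the standard library rather than the OpenXML SDK itself. It assumes each table row holds a label in the first cell and the value in the second, which may not match your layout; `extract_form_fields` and `cell_text` are hypothetical names.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's "{uri}name" form
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def cell_text(cell):
    """Join the text runs of each paragraph in a cell, one line per paragraph."""
    paragraphs = []
    for para in cell.iter(W + "p"):
        paragraphs.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(paragraphs).strip()


def extract_form_fields(docx):
    """Read label/value pairs from table rows in the main document part.

    Assumption: first cell = field label, second cell = field value.
    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    fields = {}
    for row in root.iter(W + "tr"):
        cells = row.findall(W + "tc")
        if len(cells) < 2:
            continue  # skip heading rows or merged single-cell rows
        label = cell_text(cells[0]).rstrip(":")
        if label:
            fields[label] = cell_text(cells[1])
    return fields
```

The resulting dictionary could then be used either to fill the matching content controls in a copy of the new template or to set the library columns after upload. Both of those steps depend on your content type and are not shown here.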
It can be more complicated again if you are using Managed Metadata (MMD) fields and the Term Store, as there is a hidden list involved at the site level and the GUIDs need to match inside the DOCX.
We've solved similar problems in the past and unless you're looking to fix hundreds or thousands of docs, it might not be worth the effort to develop.