Have ChatGPT or AI Read PDF Contracts

Brass Contributor

I know throughout AI builder you can teach it to grab specific fields from unstructured documents. But is there a way for ChatGPT read the PDF documents, these would be employment contracts, and have me ask it questions regarding anomalies, compensation that is different from other standard contracts, what is the compensation, FTE rate.  I want to have some sort of a dialogue. Maybe there is a way to do it through PVA.  Anyone ever attempt this please provide help or guidance or point me somewhere 🙂

2 Replies

Hi @FlowTime1990,

first of all, I like your idea.

but AI models like ChatGPT don't have the built-in capability to directly read PDF documents or perform detailed analysis on them. However, you can integrate AI technologies to work with PDF documents and extract relevant information from them. Here's a potential solution using a combination of Optical Character Recognition OCR and ChatGPT:

1. Optical Character Recognition (OCR):
Use an OCR tool to convert the text from the PDF contracts into machine-readable format. There are various OCR libraries and APIs available, such as Tesseract OCR, Google Cloud Vision API, and Microsoft Azure Cognitive Services OCR.

OCR - Optical Character Recognition - Azure AI services | Microsoft Learn

2. Text Extraction and Preprocessing:
Once you have extracted the text from the PDFs using OCR, you'll need to preprocess the data to clean it up, remove irrelevant information, and structure it appropriately for analysis.

What is Azure AI Document Intelligence? - Azure AI services | Microsoft Learn

3. Named Entity Recognition (NER):
Employ Named Entity Recognition techniques to identify specific entities within the text, such as compensation amounts, FTE rates, anomalies, etc. NER helps extract structured information from unstructured text.

What is the Named Entity Recognition (NER) feature in Azure AI Language? - Azure AI services | Micro...

How to perform Named Entity Recognition (NER) - Azure AI services | Microsoft Learn

4. Create a Chat Interface:
Set up a chat interface where you can interact with the AI model (e.g., ChatGPT) by inputting questions or queries related to the employment contracts. This can be a custom web application, a chatbot platform, or even using Microsoft's Power Virtual Agents (PVA).

Intelligente virtuelle Agenten und Bots | Microsoft Power Virtual Agents

5. Train ChatGPT on the Task:
Train ChatGPT on the annotated data that includes examples of questions and corresponding answers related to the employment contracts. You can use supervised learning techniques for this task.

6. ChatGPT Inference:
Use the trained ChatGPT model to answer questions and provide insights about the contracts based on the extracted information from the PDFs.

7. Implement User Feedback Loop:
Provide a mechanism for users to provide feedback on the accuracy and helpfulness of ChatGPT's responses. This feedback can be used to further improve the model's performance over time.



Additionally, working with sensitive documents like employment contracts requires adherence to data security and privacy protocols. Ensure that you comply with any relevant regulations and take necessary precautions to handle the documents securely.

 



Please click Mark as Best Response & Like if my post helped you to solve your issue.
This will help others to find the correct solution easily. It also closes the item.


If the post was useful in other ways, please consider giving it Like.


Kindest regards,


Leon Pavesic

I was hoping that there was some kind of a plugin. I used code interpreter to upload a csv file and that was pretty amazing....so I figured if there is a way to read PDFs in a SP document library that would be awesome!