Accelerate labelling with GPT models in Language Cognitive Services
Published Mar 16 2023 04:25 AM
Microsoft

In May 2022, custom named entity recognition and custom text classification became generally available as features of Language Cognitive Services. These features let you create your own custom models to classify your documents and extract domain-specific information from them. A key component of building a quality model is the data you provide.

 

Today we're excited to announce the ability to automatically label your documents with GPT models using Azure OpenAI. Labelling, the most time-consuming part of the process, can now be reduced significantly with this new feature.

 

Let’s walk through how to auto-label your documents using custom named entity recognition. First, follow the quickstart until you have created a project with your data in it. In this example, the entities extract relevant information from loan contracts, such as Loan Amount, Interest Rate, Borrower Name, Borrower Address, Lender Name, and Lender Address. It’s crucial to use meaningful and descriptive names for your classes or entities when auto-labelling with GPT, because these names are the descriptions given to the GPT model so that it can accurately predict your class or entity labels.
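
To make the point about naming concrete, here is a minimal sketch of how entity names could end up as the only guidance a model receives. The actual prompt the service sends is not shown in this post, so the `build_extraction_prompt` helper and its wording are purely hypothetical illustrations.

```python
# Hypothetical illustration only: the real prompt used by the service
# is not described in this post.
ENTITIES = [
    "Loan Amount",
    "Interest Rate",
    "Borrower Name",
    "Borrower Address",
    "Lender Name",
    "Lender Address",
]


def build_extraction_prompt(entity_names, document_text):
    """Assemble a plain-text prompt whose only guidance is the entity names."""
    entity_list = "\n".join(f"- {name}" for name in entity_names)
    return (
        "Extract the following entities from the document.\n"
        f"{entity_list}\n\n"
        f"Document:\n{document_text}"
    )


if __name__ == "__main__":
    sample = "The Borrower agrees to repay the Lender the sum of 25,000 USD."
    print(build_extraction_prompt(ENTITIES, sample))
```

With a name like "Entity1" in place of "Loan Amount", the model in this sketch would have nothing to go on, which is why descriptive names matter.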

 

GenericStarter.png

 

After opening a document, you can click the Auto-label button in the Activity Pane. This shows two options for auto-labelling: using a model that you’ve previously trained, or the new feature, auto-label with GPT. The auto-label with GPT option is best when you want to accelerate your labelling process and rely on the power of large language models to do the bulk of the work on your behalf.

 

JobType.png

 

 

Clicking Next takes us to the step where we select our Azure OpenAI resource. Azure OpenAI is currently a gated service, so you need to have been granted access to it before you can use the Azure OpenAI models. You can apply for access here.

 

Once you’ve connected your Azure OpenAI resource, you will need to select the Azure OpenAI (AOAI) deployment that will be used to access the GPT models that perform auto-labelling. You can follow the steps to deploy a model in AOAI here.

 

ConfigureResource.png

 

 

In the next two steps, you can select which documents you’re interested in auto-labelling and with which entities. Finally, click Start Job to begin the process. The job takes a few minutes or longer, depending on the number of documents you selected. Note that you will be charged according to Azure OpenAI’s pricing, based on an estimate of the number of tokens in each document you auto-label.
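
Because the charge depends on token counts, it can help to get a rough idea of cost before starting a job. The sketch below is only a back-of-the-envelope estimate under two assumptions that do not come from this post: roughly 4 characters per token (a common rule of thumb for English text) and a placeholder price per 1,000 tokens. The service's own estimate and the current Azure OpenAI price list take precedence.

```python
import math

# Assumptions (not from this post): ~4 characters per token, and a
# placeholder price. Check the current Azure OpenAI pricing page.
CHARS_PER_TOKEN = 4
PRICE_PER_1K_TOKENS = 0.02  # hypothetical value, in your billing currency


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a document from its length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost(documents, price_per_1k_tokens=PRICE_PER_1K_TOKENS):
    """Return (total_tokens, estimated_cost) for a list of document strings."""
    total_tokens = sum(estimate_tokens(doc) for doc in documents)
    return total_tokens, total_tokens / 1000 * price_per_1k_tokens


if __name__ == "__main__":
    docs = ["This Loan Agreement is made between ..." * 50, "Short contract."]
    tokens, cost = estimate_cost(docs)
    print(f"~{tokens} tokens, estimated cost ~{cost:.4f}")
```

Selecting fewer documents for a first run keeps both the waiting time and this estimate small.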

 

Once the job has completed, you can review the labels suggested by GPT. You can now Accept or Reject these suggested labels as part of your project, using the buttons at the top right of the document page, or by clicking the accept or reject button next to any suggested label (shown with a dotted line). This process keeps you in full control of what ends up in your final model.
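
The review step can be pictured as a simple state change on each suggestion. The following is a minimal sketch of that idea, not the service's data model: the `SuggestedLabel` class, its fields, and the status values are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SuggestedLabel:
    # Hypothetical structure; the service's actual label format may differ.
    entity: str
    text: str
    offset: int
    status: str = "suggested"  # "suggested", "accepted", or "rejected"


def review(labels, decisions):
    """Apply decisions keyed by label index; True accepts, False rejects.

    Labels without a decision stay in the "suggested" state.
    """
    for index, accept in decisions.items():
        labels[index].status = "accepted" if accept else "rejected"
    return labels


def final_labels(labels):
    """Only accepted labels become part of the training data."""
    return [label for label in labels if label.status == "accepted"]


if __name__ == "__main__":
    suggestions = [
        SuggestedLabel("Loan Amount", "25,000 USD", 52),
        SuggestedLabel("Borrower Name", "the Lender", 31),
        SuggestedLabel("Interest Rate", "4.5%", 88),
    ]
    review(suggestions, {0: True, 1: False})
    for label in final_labels(suggestions):
        print(label.entity, "->", label.text)
```

In this sketch, a suggestion that is neither accepted nor rejected never reaches the final set, mirroring the idea that nothing enters your model without your approval.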

 

FinalAutoLabel.png

 

We’re excited to begin our journey of infusing our language services with large language models!
