Announcing general availability of the Conversational PII detection service in Azure AI Language

Microsoft

Jun 26, 2024

We are ecstatic to share the general availability (GA) release of our Conversational PII redaction service in English-language contexts. With GA, the service gains better Azure SLA support, support for production environments, and enterprise-grade security.

Conversational PII (Personally Identifiable Information) redaction is one of the many high-quality, cost-effective, task-optimized language AI capabilities offered by Azure AI Language. This collection of machine learning and AI algorithms in the cloud has helped many customers and enterprises across the globe develop intelligent applications, and it includes models for summarization, sentiment analysis, health text analytics, opinion mining, and much more.

The PII detection service supports a rich set of features with fine-tuned models for various use cases, including text-based PII redaction, conversation-based redaction with Conversational PII, and Native Document PII redaction, where the input and output are structured document files in .pdf, .docx, and .txt format. These services can help detect sensitive information and protect an individual’s identity and privacy in both generative and non-generative AI applications. This is critical for highly regulated industries such as financial services, healthcare, and government, and it enables our customers to adhere to the highest standards of data privacy, security, and compliance.

Since the service's initial release in private preview and then public preview, we have been proud to iterate regularly on it in collaboration with, and based on feedback from, a variety of satisfied customers. We are pleased to now announce the general availability of Conversational PII.

The Conversational PII redaction service expands upon the Text PII redaction service, supporting customers looking to identify, categorize, and redact sensitive information such as phone numbers and email addresses in unstructured text. This Conversational PII language model is specialized for conversational-style inputs, particularly those found in speech transcriptions from meetings and calls. It offers improved performance on inputs with complexities such as:

Text with filler words common when transcribing spoken text (like “um” and “uh”).
Multiple speakers in the text, such as a customer service agent and a customer troubleshooting an issue. Notably, the service should be able to handle sensitive information like a phone number even when one speaker is interrupted by the second when relaying the information.
Incomplete sentences, as is common in transcripts of natural speech conversations.
Sensitive information, like a name, being spelled out letter-by-letter instead of as a full word (“A as in apple, B as in boy, H as in house, and I as in igloo.” instead of “Abhi”).
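
To make the "identify, categorize, and redact" flow concrete, here is a minimal sketch of the redaction step only. It assumes a detector has already returned a character offset, length, and category for each sensitive span. The Entity type, the redact function, and the category labels are hypothetical names chosen for illustration. They are not the service's API, and the sample utterance is invented.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical detected span: character offset, length, and category."""
    offset: int
    length: int
    category: str


def redact(text: str, entities: list[Entity], mask: str = "*") -> str:
    """Replace each detected span with mask characters of the same length.

    Masking in place keeps the text length unchanged, so the offsets of
    the remaining entities stay valid while we iterate.
    """
    chars = list(text)
    for entity in entities:
        end = entity.offset + entity.length
        chars[entity.offset:end] = mask * entity.length
    return "".join(chars)


# Invented transcript line with filler words, as in the list above.
utterance = "Um, my number is 555 0100, uh, and the name is Abhi."

# Spans are located by hand here; a real detector would supply them.
entities = [
    Entity(utterance.index("555 0100"), len("555 0100"), "PhoneNumber"),
    Entity(utterance.index("Abhi"), len("Abhi"), "Name"),
]

redacted = redact(utterance, entities)
print(redacted)
```

Same-length masking is one design choice among several. Replacing each span with its category label is another common option, but it changes the text length, so the offsets would need to be adjusted as spans are replaced.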

Image: Two examples of complex conversational input being identified and redacted. For more details on how to call our API for conversational PII redaction, check out the quick-starts in our public documentation.

We’re thrilled to see and hear feedback on these new features in use and are excited to continue enabling our customers through delivering new solutions down the line. To learn more about additional Azure AI Language releases, see our blog post on our Build 2024 conference announcements.

For more details and resources, please explore the following links: