What’s new Azure AI Language | BUILD 2023

Former Employee

May 23, 2023

Introduction

At Azure AI Language (aka. Azure Cognitive Service for Language), we believe that language is at the core of human intelligence. In line with Microsoft’s mission to empower every person and every organization on the planet to achieve more, we are dedicated to providing natural language processing services that break down language barriers. Azure AI Language products are powered by state-of-the-art language models, including Z-Code++ and Azure OpenAI Service, empowering you to retrieve and analyze meaningful insights from a full range of text content. Join us as we explore the latest developments in our language AI products and how they can transform the way you work with text data.

Today we're thrilled to announce more new features and capabilities designed to make your workflow more seamless and efficient than ever before at this year’s Microsoft Build: General Availability (GA) of document and conversation summarization, and public preview of the following features: interactive custom summarization powered by Z-Code++ and Azure OpenAI Service, custom sentiment analysis, enhanced Text Analytics for health, Named Entity Recognition (NER) and Personal Identifiable Information (PII) Detection, increased language coverage, enhanced integration capability, and expanded container support.

Here are some of the highlights.

General Availability (GA) of document and conversation summarization

Since the launch of document and conversation summarization in public preview at Build and Ignite 2022, customers across industries have been utilizing these features to effectively extract and condense essential information from their documents and conversations. With this GA announcement, customers can now confidently implement use-cases in their production environments, with Azure SLA and enterprise-grade security in mind.

This GA encompasses various summarization capabilities and will go live on June 19th:

Extractive and abstractive document summarizations – These features provide different styles of summarization that users can choose from, allowing them either to preserve the original text by extracting key sentences directly from the input (extractive) or to receive a concise (TL;DR) summary based on an understanding of the input (abstractive).

Conversation summarization – This feature supports four aspects (issue, resolution, chapter title, and narrative) which are specially tailored for conversational content. These features provide useful conversational artifacts in addition to textual summarization that help users locate relevant contexts.

The issue and resolution summarizations are designed to meet the needs of, for instance, customer support and call centers, and aim to summarize customer-agent conversations by identifying key customer issue and relevant resolutions.
The chapter title and narrative summarizations are more general-purpose and can be used for conversations with multiple participants. The chapter title feature segments a conversation into sections with each section headed by a title, while the narrative generates a summary for each segment. For each summary/title, there is a "context" field that displays the range of the input conversation from which the respective element was generated.

Public Preview of interactive custom summarization powered by Z-Code++ and Azure OpenAI Service

At Azure AI Language, we understand that every customer has unique needs when it comes to language processing. That is why we have developed a powerful customization capability that allows you to tailor our products to your specific requirements. With custom features, you can create custom models that extract, analyze, and retrieve meaningful insights for business cases.

To make the customization process even easier and more accessible, we introduced auto labelling to significantly reduce the complexity and time-consumption of data labelling efforts. In March, we introduced an accelerated auto labelling experience powered by Azure OpenAI Service for custom named entity recognition (NER), custom text classification, and conversational language understanding (CLU). You can find more details in the blog posts published at the time, custom NER and custom text classification's post and CLU's post.

Today, we are excited to announce this accelerated auto labeling experience is expanding to support custom summarization, which will go live on June 1^st. With this new addition, you have even more power to fine-tune summarization models.

Public Preview of interactive custom sentiment analysis

Today, sentiment analysis helps you understand a person’s sentiment within text using a collection of machine learning and AI algorithms. However, identifying user sentiment can be unique depending on the specific domain and context. To address this, we’re excited to announce that our new, interactive custom sentiment analysis feature will be available in public preview on June 1^st. With this new feature, you can train your own AI model tailored to your users and your business needs. Using our new auto-labelling experience in the Language Studio, you can upload your datasets and label them for training and testing your custom model. The new auto-labeling experience is powered by our prebuilt sentiment analysis model. Within our prebuilt model, we are pleased to introduce a suite of AI quality updates, enhanced with model training on both conversation and document data for English and Italian.

Public Preview of enhanced Text Analytics for Health

Text Analytics for health helps you extract and label relevant medical information from unstructured texts such as doctor's notes, discharge summaries, clinical documents, and electronic health records.

In this release, we are previewing several new capabilities:

New entity categories for social determinants of health
Support for relationship extraction and assertion detection for the new social determinants of health entity categories
New assertion type for temporal context indicating whether an entity is from past, present, or future.

Custom Text Analytics for health is the custom version of Text Analytics for health, which allows you to take advantage of state-of-the-art machine learning models to develop a custom entity extraction model tailored to your unique healthcare domain data. Your training and testing data uploaded for this purpose will be stored in Azure Blob Storage linked to your Custom Text Analytics for health project. The new feature uses all the prebuilt entity categories from Text Analytics for Health and allows you to extend these categories with your own custom vocabulary. To learn more about this exciting new offering, be sure to check out our custom Text Analytics for Health post.

Public Preview of enhancements to Named Entity Recognition and Personally Identifiable Information Detection

Named Entity Recognition (NER) and Personally Identifiable Information (PII) Detection are getting the following of improvements in terms of model quality, extraction granularity, and API response customization.

NER will allow you to control the response returning from the service by letting you specify a list of entities to selectively include or exclude from the service’s response. Additionally, you can now decide the service’s overlap policy which determines how overlapping entities (entities occurring in the same span) should be handled. For example, the address “143 Rodeo Drive” has 2 entities: the number “143” and the address “143 Rodeo Drive” overlapping each other. You can decide if you allow overlap or should match the longest possible span of entities. The former policy will return the Number entity in “143” along with the full Address “143 Rodeo Drive” entity, predicting all the possible entities in a response; the latter will return the entity with the longest span. The default behavior is the longest span.

Predicted entities now come with metadata and tags. Metadata includes any additional information for an entity, such as an ISO date resolution for “Jan 3rd" as “2023-01-03". Tags provide more granular identifiers to an entity. These include identifying if a location is a City, State, Country, Continent, or Airport. To learn more, please read our NER documentation

The set of languages is now 94 in NER.

The new PII Detection preview model offers a better way to identify financial entities in text, such as SWIFT code, IBAN, ABA Routing number and Credit Card Number. Unlike the previous rule-based approach, the new model uses AI technology to detect these entities using their context rather than reliance on regular expressions. Other enhancements exclusion list, and customizable redaction character are possible on PII.

The Conversational PII has also been updated with 4 new languages, an exclusion list support, and customizable redaction character.

Public Preview of improvements in integration capability

We’re always looking to expand the interoperability of capabilities within Azure Cognitive Services for Language with other Microsoft products, to deliver more seamless experiences to customers. With this in mind, we are excited to announce preview features that significantly improve our integration capabilities:

Language Studio's data labelling is now fully compatible with Azure Machine Learning, making it easier for you to incorporate data labelling into your workflow. Please read our Language Studio and Azure Machine Learning blog post to learn more. This compatibility improvement allows you to use the labeling experience in Azure Machine Learning Studio for easier collaboration with more flexibility and enables you to outsource labeling tasks to external labeling vendors through the Azure Marketplace.
Additionally, we've integrated conversational language understanding (CLU) with Power Virtual Agent (PVA), simplifying the process of creating intelligent chatbots and virtual agents. Please read our PVA blog post for more details. This improvement enables you to use Conversational Language Understanding (CLU) to get higher quality of intent triggering and entity extraction with multilingual support, then build your bot with dialog flows in low-code/no-code by connecting your Power Virtual Agents (PVA) with CLU.

These enhancements provide more seamless experiences for our customers, allowing you to create even more powerful solutions with our products.

Public preview of expanded language support

We understand the importance of catering to a global audience. Regardless of whether you are a multinational corporation or a startup, understanding your customers and effectively delivering your solutions requires you to process and comprehend customer inputs in a variety of languages. This is why we have taken steps to expand our language support, providing you with tools to connect users around the world. Our expanded language support encompasses multiple language skills.

Document summarization, both extractive and abstractive, supports 9 languages
Named entity recognition supports 94 (with 55 new) languages
Personally Identifiable Information (PII) Detection supports 94 languages for names and addresses, with significantly improved AI quality across multiple entities (including financial)
Conversation PII Detection supports 4 ( with 3 new) languages
Language Detection supports 120 (with 6 new) languages and 12 Romanized Indic languages

Public preview of connected container for custom Named Entity Recognition

Finally, we're excited to announce the expansion of connected containers to now support custom Named Entity Recognition (NER)! With this feature, you can manage and deploy your custom named entity recognition model in a containerized environment, making it simpler and more efficient to maintain your infrastructure. To learn more, please read our custom NER connected container post.

We look forward to seeing customers use these capabilities to enhance productivity and decision-making, and we remain committed to delivering innovative solutions that enable our customers to achieve their goals. Thank you for your continued trust in our products, and we welcome your feedback as we strive to continuously improve our services.

Updated May 23, 2023

Version 4.0

Former Employee

Joined May 06, 2019

View Profile

Microsoft Foundry Blog

Follow this blog board to get notified when there's new activity