Azure Cognitive Service for Language supports 96 languages for custom features
Published Mar 20 2022 04:25 PM
Microsoft

Last year, Azure Cognitive Services announced the release of Azure Cognitive Service for Language. With that release, several NLP services were brought together into a single, state-of-the-art NLP service available through one resource and one user experience. Three new custom features were introduced at the time: custom text classification, custom named entity recognition, and conversational language understanding.

 

Today, the Azure Cognitive Service for Language team announces support for 96 languages for these new custom features! The expanded language support makes it easier for customers of the Language service to reach global markets.

 

The full list of supported languages is:

 

 

Afrikaans | Dutch | Italian | Norwegian (Bokmål) | Swahili
Albanian | English | Japanese | Oriya | Swedish
Amharic | English (UK) | Javanese | Pashto | Tamil
Arabic | English (US) | Kannada | Persian (Farsi) | Telugu
Armenian | Esperanto | Kazakh | Polish | Thai
Assamese | Estonian | Khmer | Portuguese (Brazil) | Turkish
Azerbaijani | Filipino | Korean | Portuguese (Portugal) | Ukrainian
Basque | Finnish | Kurdish (Kurmanji) | Punjabi | Urdu
Belarusian | French | Kyrgyz | Romanian | Uyghur
Bengali | Galician | Lao | Russian | Uzbek
Bosnian | Georgian | Latin | Sanskrit | Vietnamese
Breton | German | Latvian | Scottish Gaelic | Welsh
Bulgarian | Greek | Lithuanian | Serbian | Western Frisian
Burmese | Gujarati | Macedonian | Sindhi | Xhosa
Catalan | Hausa | Malagasy | Sinhala | Yiddish
Chinese (Simplified) | Hebrew | Malay | Slovak | Zulu (Conversational language understanding only)
Chinese (Traditional) | Hindi | Malayalam | Slovenian |
Croatian | Hungarian | Marathi | Somali |
Czech | Indonesian | Mongolian | Spanish |
Danish | Irish | Nepali | Sundanese |

 

The most exciting capability of these services when they were announced last year was multilingual model training. You can build a project and tag data in one language, then deploy and query it in any of the other supported languages: train in English, then predict in French, German, Spanish, Japanese, and many others. This eases the burden on customers who previously relied on machine translation, or on costly replication of their data in other languages, to make their AI solutions multilingual. With this language expansion, enterprises can reach a worldwide audience within minutes.

 

The custom language services also let you add data in any supported language to a single project. So even with the built-in multilingual capabilities, there is always a path to add data for a specific language and further improve quality for that language.

 

Let’s walk through a customer complaint classification scenario using conversational language understanding. After signing in to Language Studio with a Language resource and navigating to conversational language understanding, you can create a new project. During creation, you are asked whether you’d like to enable multiple languages in the project.

 

blog1.png

 

 

Once the project is created, we can add three intents, one for each type of complaint customers may have:

  • Refund Status
  • Delivery Delay
  • Broken Product

We can then add a few utterances for each intent, such as:

  • “I haven’t gotten my refund and it’s been 4 days!”
  • “My delivery was supposed to be here 3 days ago and it never showed up”
  • “My new mugs that arrived yesterday were broken”

You’ll notice the Language column in the utterances table lets you select a different language for an utterance, in case you want to add complaints in any of the other supported languages. In practice, you should add more than just a few examples for each intent to get better-quality models.
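To make the idea of per-utterance languages concrete, here is a minimal Python sketch of how the labelled utterances could be represented and checked outside the studio. The field names, the language codes, the added French example, and the `MIN_EXAMPLES_PER_INTENT` threshold are all assumptions made for illustration; they are not the service's schema or a documented limit.

```python
from collections import Counter

# Hypothetical in-memory representation of labelled utterances.
# Field names and language codes are illustrative assumptions.
UTTERANCES = [
    {"text": "I haven’t gotten my refund and it’s been 4 days!",
     "language": "en-us", "intent": "Refund Status"},
    {"text": "My delivery was supposed to be here 3 days ago and it never showed up",
     "language": "en-us", "intent": "Delivery Delay"},
    {"text": "My new mugs that arrived yesterday were broken",
     "language": "en-us", "intent": "Broken Product"},
    # An extra, made-up French example to show a second language in one project.
    {"text": "Je n'ai toujours pas reçu mon remboursement",
     "language": "fr", "intent": "Refund Status"},
]

# Assumed threshold for "more than just a few examples"; tune it to your data.
MIN_EXAMPLES_PER_INTENT = 10


def examples_per_intent(utterances):
    """Count how many labelled utterances each intent has."""
    return Counter(u["intent"] for u in utterances)


def underrepresented(utterances, minimum=MIN_EXAMPLES_PER_INTENT):
    """Return the intents with fewer than `minimum` examples, sorted by name."""
    counts = examples_per_intent(utterances)
    return sorted(intent for intent, n in counts.items() if n < minimum)


if __name__ == "__main__":
    for intent in underrepresented(UTTERANCES):
        print(f"Add more examples for: {intent}")
```

A check like this is only a convenience for spotting thin intents before training; the studio remains the place where utterances are actually added and labelled.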

 

blog2.PNG

 

After saving your changes, you can train a model. Click Train model, then Start a training job, provide a model name such as “v1”, and press Train when you’re ready. Because this project has very few examples, you should disable evaluation.

 

blog3.PNG

 

Once training is complete, which may take a few minutes, you’re ready to go to Deploy model. Click Add deployment, provide a deployment name such as “Test”, select the model you just trained, and then click Submit.

 

blog4.PNG

 

Now comes the best part: testing all of this in Test model. We can try out a few test queries like:

  • “Where is my delivery?” --> predicted as Delivery Delay
  • “The money for my refund never showed up” --> predicted as Refund Status
  • “My new phone arrived with a broken screen!” --> predicted as Broken Product
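The same queries can also be sent to the deployment programmatically. The sketch below only builds a request body and reads the top intent from a response; it makes no network call. The JSON field names reflect my understanding of the service's conversation analysis request and response shapes and should be checked against the current API reference before use. The project name `ComplaintsDemo` is hypothetical; `Test` is the deployment name used above.

```python
import json


def build_query(text, project_name, deployment_name):
    """Build a request body for a single utterance.

    The structure is an assumption based on the conversation analysis
    request format; verify it against the official reference.
    """
    return {
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {"id": "1", "participantId": "1", "text": text}
        },
        "parameters": {
            "projectName": project_name,
            "deploymentName": deployment_name,
        },
    }


def top_intent(response):
    """Extract the predicted intent from a response (assumed shape)."""
    return response["result"]["prediction"]["topIntent"]


if __name__ == "__main__":
    # "ComplaintsDemo" is a hypothetical project name.
    body = build_query("Where is my delivery?", "ComplaintsDemo", "Test")
    print(json.dumps(body, indent=2, ensure_ascii=False))
```

Keeping the body builder and the response parser as separate pure functions makes them easy to unit test without a live Language resource.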

 

blog5.PNG

 

When we try those same queries in languages such as French, Spanish, and Chinese, we still get the right predictions! Even though we trained using only English utterances, the multilingual model handles those languages as well.

  • “Où est ma livraison?” which is “Where is my delivery?” in French --> predicted as Delivery Delay
  • “El dinero de mi reembolso nunca apareció” which is “The money for my refund never showed up” in Spanish --> predicted as Refund Status
  • “我的新手机到货时屏幕坏了!” which is “My new phone arrived with a broken screen!” in Chinese --> predicted as Broken Product
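A spot check like the one above can be turned into a small repeatable test. The following is a minimal sketch, assuming you supply a `predict` function that sends a query to your deployment and returns the top intent; the `stub_predict` function here is a stand-in that simply looks up the expected answer so the example runs offline. The language codes are illustrative assumptions.

```python
from typing import Callable, List, Tuple

# (query text, language code, expected intent), taken from the queries above.
CASES = [
    ("Où est ma livraison?", "fr", "Delivery Delay"),
    ("El dinero de mi reembolso nunca apareció", "es", "Refund Status"),
    ("我的新手机到货时屏幕坏了!", "zh", "Broken Product"),
]


def mismatches(predict: Callable[[str], str],
               cases=CASES) -> List[Tuple[str, str, str, str]]:
    """Return (text, language, expected, predicted) for each wrong prediction."""
    failures = []
    for text, language, expected in cases:
        predicted = predict(text)
        if predicted != expected:
            failures.append((text, language, expected, predicted))
    return failures


def stub_predict(text: str) -> str:
    """Offline stand-in for a real call to the deployed model."""
    lookup = {case_text: intent for case_text, _lang, intent in CASES}
    return lookup[text]


if __name__ == "__main__":
    failures = mismatches(stub_predict)
    print("all predictions matched" if not failures else failures)
```

Replacing `stub_predict` with a function that queries the real deployment lets you rerun the same multilingual checks after every retraining.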

 

blog6.PNG

 

This is just a simple demonstration of how quickly you can make use of the multilingual capabilities provided by Azure Cognitive Service for Language. The same multilingual support applies to both custom text classification and custom named entity recognition, which are better suited to classifying categories or extracting information from longer documents such as call transcripts or legal contracts.

 

We’re excited to see your businesses benefit globally from these features.

 

Get started with the Language services today.

 

Version history
Last update:
‎Mar 20 2022 04:46 PM