language understanding (luis)
18 Topics
Now in Foundry: Qwen3-Coder-Next, Qwen3-ASR-1.7B, Z-Image
This week's spotlight features three models that demonstrate enterprise-grade AI across the full scope of modalities. From low-latency coding agents to state-of-the-art multilingual speech recognition and foundation-quality image generation, these models showcase the breadth of innovation happening in open-source AI. The models balance performance with practical deployment considerations, making them viable for production systems while pushing the boundaries of what's possible in their respective domains.

This week's Model Mondays edition highlights Qwen3-Coder-Next, an 80B MoE model that activates only 3B parameters while delivering coding agent capabilities with 256k context; Qwen3-ASR-1.7B, which achieves state-of-the-art accuracy across 52 languages and dialects; and Z-Image from Tongyi-MAI, an undistilled text-to-image foundation model with full Classifier-Free Guidance support for professional creative workflows.

Models of the week

Qwen: Qwen3-Coder-Next

Model Specs
Parameters / size: 80B total (3B activated)
Context length: 262,144 tokens
Primary task: Text generation (coding agents, tool use)

Why it's interesting
Extreme efficiency: Activates only 3B of 80B parameters while delivering performance comparable to models with 10-20x more active parameters, making advanced coding agents viable for local deployment on consumer hardware
Built for agentic workflows: Excels at long-horizon reasoning, complex tool usage, and recovering from execution failures, a critical capability for autonomous development that goes beyond simple code completion
Benchmarks: Competitive performance with significantly larger models on SWE-bench and coding benchmarks (Technical Report)

Try it
Use Case | Prompt Pattern
Code generation with tool use | Provide task context, available tools, and execution environment details
Long-context refactoring | Include full codebase context within 256k window with specific refactoring goals
Autonomous debugging | Present error logs, stack traces, and relevant code with failure recovery instructions
Multi-file code synthesis | Describe architecture requirements and file structure expectations

Financial services sample prompt: You are a coding agent for a fintech platform. Implement a transaction reconciliation service that processes batches of transactions, detects discrepancies between internal records and bank statements, and generates audit reports. Use the provided database connection tool, logging utility, and alert system. Handle edge cases including partial matches, timing differences, and duplicate transactions. Include unit tests with 90%+ coverage.

Qwen: Qwen3-ASR-1.7B

Model Specs
Parameters / size: 1.7B
Context length: 256 tokens (default), configurable up to 4096
Primary task: Automatic speech recognition (multilingual)

Why it's interesting
All-in-one multilingual capability: Single 1.7B model handles language identification plus speech recognition for 30 languages, 22 Chinese dialects, and English accents from multiple regions, eliminating the need to manage separate models per language
Specialized audio versatility: Transcribes not just clean speech but singing voice, songs with background music, and extended audio files, expanding use cases beyond traditional ASR to entertainment and media workflows
State-of-the-art accuracy: Outperforms GPT-4o, Gemini-2.5, and Whisper-large-v3 across multiple benchmarks.
English: Tedlium 4.50 WER vs 7.69/6.15/6.84; Chinese: WenetSpeech 4.97/5.88 WER vs 15.30/14.43/9.86 (Technical Paper)
Language ID included: 97.9% average accuracy across benchmark datasets for automatic language identification, eliminating the need for separate language detection pipelines

Try it
Use Case | Prompt Pattern
Multilingual transcription | Send audio files via API with automatic language detection
Call center analytics | Process customer service recordings to extract transcripts and identify languages
Content moderation | Transcribe user-generated audio content across multiple languages
Meeting transcription | Convert multilingual meeting recordings to text for documentation

Customer support sample prompt: Deploy Qwen3-ASR-1.7B to a Microsoft Foundry endpoint and transcribe multilingual customer service calls. Send audio files via API to automatically detect the language (from 52 supported options including 30 languages and 22 Chinese dialects) and generate accurate transcripts. Process calls from customers speaking English, Spanish, Mandarin, Cantonese, Arabic, French, and other languages without managing separate models per language. Use transcripts for quality assurance, compliance monitoring, and customer sentiment analysis.
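
The accuracy comparison for Qwen3-ASR-1.7B is stated in WER (word error rate), where lower is better. As a rough illustration of what that metric measures, here is a minimal sketch assuming the standard definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` is illustrative and not part of any Qwen or Foundry API, and the sketch skips the text normalization (casing, punctuation, number formatting) that published benchmarks typically apply before scoring. Benchmark figures such as those quoted above are conventionally this ratio expressed as a percentage.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words (rolling one-row dynamic program).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("please hold the line", "please fold the line"))
```

Because insertions count as errors, WER can exceed 1.0 when a transcript contains many more words than the reference, which is worth keeping in mind when comparing scores across audio of very different lengths.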

Tongyi-MAI: Z-Image

Model Specs
Parameters / size: 6B
Context length: N/A (text-to-image)
Primary task: Text-to-image generation

Why it's interesting
Undistilled foundation model: Full-capacity base without distillation preserves complete training signal with Classifier-Free Guidance support (a technique that improves prompt adherence and output quality), enabling complex prompt engineering and negative prompting that distilled models cannot achieve
High output diversity: Generates distinct character identities in multi-person scenes with varied compositions, facial features, and lighting, critical for creative applications requiring visual variety rather than consistency
Aesthetic versatility: Handles diverse visual styles from hyper-realistic photography to anime and stylized illustrations within a single model, supporting resolutions from 512×512 to 2048×2048 at any aspect ratio with 28-50 inference steps (Technical Paper)

Try it
E-commerce sample prompt: Professional product photography of a modern ergonomic office chair in a bright Scandinavian-style home office. Natural window lighting from left, clean white desk with laptop and succulent plant, light oak hardwood floor. Chair positioned at 45-degree angle showing design details. Photorealistic, commercial photography, sharp focus, 85mm lens, f/2.8, soft shadows.

Getting started
You can deploy open-source Hugging Face models directly in Microsoft Foundry by browsing the Hugging Face collection in the Foundry model catalog and deploying to managed endpoints in just a few clicks. You can also start from the Hugging Face Hub.
First, select any supported model and then choose "Deploy on Microsoft Foundry", which brings you straight into Azure with secure, scalable inference already configured. Learn how to discover models and deploy them using Microsoft Foundry documentation.

Follow along with the Model Mondays series and access the GitHub to stay up to date on the latest
Read Hugging Face on Azure docs
Learn about one-click deployments from the Hugging Face Hub on Microsoft Foundry
Explore models in Microsoft Foundry

Introducing Azure Cognitive Service for Language
Azure Cognitive Services has historically had three distinct NLP services that solve different but related customer problems: Text Analytics, LUIS, and QnA Maker. As these services have matured and customers now depend on them for business-critical workloads, we wanted to take a step back and evaluate the most effective path forward over the next several years for delivering our roadmap of a world-class, state-of-the-art NLP platform-as-a-service.

Each service was initially focused on a set of capabilities supporting distinct customer scenarios: LUIS for custom language models, most often supporting bots; Text Analytics for general-purpose pre-built language services; and QnA Maker for knowledge-based question answering. As AI accuracy has improved, the cost of offering more sophisticated models has decreased, and customers have increased their adoption of NLP for business workloads, we are seeing more and more overlapping scenarios where the lines are blurred between the three distinct services. As such, the most effective path forward is a single unified NLP service in Azure Cognitive Services.

Today we are pleased to announce the availability of Azure Cognitive Service for Language. It unifies the capabilities in Text Analytics, LUIS, and the legacy QnA Maker service into a single service. The key benefits include:
Easier to discover and adopt features.
Seamlessness between pre-built and custom-trained NLP.
Easier to build NLP capabilities once and reuse them across application scenarios.
Access to multilingual state-of-the-art NLP models.
Simpler to get started through consistency in APIs, documentation, and samples across all features.
More billing predictability.

The unified Language Service will not affect any existing applications. All existing capabilities of the three services will continue to be supported until we announce a deprecation timeline of the existing services (which would be no less than one year).
However, new features and innovation will start happening only on the unified Language service. For example, question answering and conversational language understanding (CLU) are only available in the unified service (more details on these features later). As such, customers are encouraged to start making plans to leverage the unified service. More details on migration including links to resources are provided below.

Here is everything we are announcing today in the unified Language service:

Text Analytics is now Language Service: All existing features of Text Analytics are included in the Language Service. Specifically, Sentiment Analysis and Opinion Mining, Named Entity Recognition (NER), Entity Linking, Key Phrase Extraction, Language Detection, Text Analytics for health, and Text Summarization are all part of the Language Service as they exist today. Text Analytics customers don’t need to do any migrations or updates to their in-production or in-development apps. The unified service is backward compatible with all existing Text Analytics features. The key difference is that when creating a new resource in the Azure portal UI, you will now see the resource labeled as “Language” rather than “Text Analytics”.

Introducing conversational language understanding (preview) - the next generation of LUIS: Language Understanding (LUIS) has been one of our fastest growing Cognitive Services, with customers deploying custom language models to production for various scenarios, from command-and-control IoT devices and chat bots to contact center agent assist scenarios. The next phase in the evolution of LUIS is conversational language understanding (CLU), which we are announcing today as a preview feature of the new Language Service. CLU introduces multilingual transformer-based models as the underlying model architecture and results in significant accuracy improvements over LUIS.
Also new as part of CLU is the ability to create orchestration projects, which allow you to configure a project to route to multiple customizable language services, like question answering knowledge bases, other CLU projects, and even classic LUIS applications. Visit here to learn more. If you are an existing LUIS customer, we are not requiring you to migrate your application to CLU today. However, as CLU represents the evolution of LUIS, we encourage you to start experimenting with CLU in preview and provide us feedback on your experience. You can import a LUIS JSON application directly into CLU to get started.

GA of question answering: In May 2021, we launched the preview of custom question answering. Today we are announcing the General Availability (GA) of question answering as part of the new Language Service. If you are just getting started with building knowledge bases that are query-able with natural language, visit here to get started. If you want to know more about migrating legacy QnA Maker knowledge bases to the Language Service, see here. Your existing QnA Maker knowledge bases will continue to work. We are not requiring customers to migrate from QnA Maker at this time. However, question answering represents the evolution of QnA Maker, and new features will only be developed for the unified service. As such, we encourage you to plan for a migration from legacy QnA Maker if this applies to you.

Introducing custom named entity recognition (preview): Documents include an abundant amount of valuable information. Enterprises rely on pulling out that information to easily filter and search through those documents. Using the standard Text Analytics NER, they could extract known types like person names, geographical locations, datetimes, and organizations. However, lots of information of interest is more specific than the standard types. To unlock these scenarios, we’re happy to announce custom NER as a preview capability of the new Language Service.
The capability allows you to build your own custom entity extractors by providing labelled examples of text to train models. Securely upload your data in your own storage accounts and label your data in the Language Studio. Deploy and query the custom models to obtain entity predictions on new text. Visit here to learn more.

Introducing custom text classification (preview): While many pieces of information can exist in any given document, the whole piece of text can belong to one or more categories. Organizing and categorizing documents is key to data-reliant enterprises. We’re excited to announce custom text classification, a preview feature under the Language service, where you can create custom classification models with your defined classes. Securely upload your data in your own storage accounts and label your data in the Language Studio. Choose between single-label classification, where you label and predict one class for every document, or multi-label classification, which allows you to assign or predict several classes per document. This service enables automated handling of incoming pieces of text such as support tickets, customer email complaints, or organizational reports. Visit here to learn more.

Language Studio: This is the single destination for experimentation, evaluation, and training of Language AI / NLP in Cognitive Services. With the Language Studio you can now try any of our capabilities with a few button clicks. For example, you can upload medical documents and get back all the entities and relations extracted instantly, and you can easily integrate the API into your solution using the Language SDK. You can take it further by training your own custom NER model and deploying it through the easy-to-use interface. Try it out now yourself here.

Several customers are already using Azure Cognitive Service for Language to transform their businesses.
Here's what two of them had to say:

“We used Azure Cognitive Services and Bot Service to deliver an instantly responsive, personal expert into our customers’ pockets. Providing this constant access to help is key to our customer care strategy.”
-Paul Jacobs, Group Head of Operations Transformation, Vodafone

“Sellers might have up to 100,000 documents associated with a deal, so the time savings can be absolutely massive. Now that we’ve added Azure Cognitive Service for Language to our tool, customers can potentially compress weeks of work into days.”
-Thomas Fredell, Chief Product Officer, Datasite

To learn more directly from customers, see the following customer stories:
Vodafone transforms its customer care strategy with digital assistant built on Azure Cognitive Services
Progressive Insurance levels up its chatbot journey and boosts customer experience with Azure AI
Kepro improves healthcare outcomes with fast and accurate insights from Text Analytics for health

On behalf of the entire Cognitive Services Language team at Microsoft, we can't wait to see how Azure Cognitive Service for Language benefits your business!

Learn about Bot Framework Composer’s new authoring experience and deploy your bot to a telephone
The new telephony channel, combined with our Bot Framework developer platform, makes it easy to rapidly build always-available virtual assistants, or IVR assistants, that provide natural language intent-based call handling and can handle advanced conversation flows, such as context switching and responding to follow-up questions, while still meeting the goal of reducing operational costs for enterprises.