Azure AI Language
32 Topics

Mastering Model Context Protocol (MCP): Building Multi Server MCP with Azure OpenAI
Create complex multi-server MCP agentic AI applications. Deep dive into a multi-server MCP implementation, connecting both local custom and ready-made MCP servers in a single client session through a custom chatbot interface.

Announcing an accelerator for GenAI-powered assistants using Azure AI Language and Azure OpenAI
We're thrilled to introduce a new accelerator solution in the GitHub Azure-Samples library, designed specifically for creating and enhancing GenAI-based conversational assistants with robust, human-controllable workflows. This accelerator uses key services from Azure AI Language alongside Azure OpenAI: PII detection to protect sensitive information, Conversational Language Understanding (CLU) to predict users' top intents, and Custom Question Answering (CQA) to respond to top questions with deterministic answers. Together with Azure OpenAI and Large Language Models (LLMs), the solution orchestrates and delivers a smooth, human-guided, controllable and deterministic conversational experience. The integration with LLMs is coming soon. It's perfect for developers and organizations looking to build assistants that can handle complex queries, route tasks, and provide reliable answers, all with a controlled, scalable architecture.

Why This Accelerator
Many customers appreciate LLMs for building conversational assistants with natural, engaging, and context-aware interactions, but challenges remain: the significant effort required for prompt engineering and document chunking, and the work of reducing hallucinations to improve the quality of Retrieval-Augmented Generation (RAG) solutions. If an AI quality issue is discovered in production, customers need an effective way to address it promptly. This solution helps customers use offerings across the Azure AI portfolio to address key challenges when building Generative AI (GenAI) assistants. Designed for flexibility and reliability, the accelerator enables human-controllable workflows that meet real-world customer needs. It minimizes the need for extensive prompt engineering by using a structured workflow that prioritizes business-critical custom intents and top questions with exact answers, while an LLM handles lower-priority topics in the conversation. This architecture not only enhances answer quality and control but also ensures that complex queries are handled efficiently. If you want to quickly fix an incorrect answer in a chatbot built with RAG, you can also attach this accelerator to your existing RAG solution and add a QA pair with the correct response in CQA to resolve the issue for your users.

What This Accelerator Delivers
This accelerator provides and demonstrates an end-to-end orchestration of capabilities in Azure AI Language and Azure OpenAI for conversational assistants. It can be applied in scenarios where control over assistant behavior and response quality is essential, such as call centers, help desks, and other customer support applications. Below is a reference architecture of the solution:

[Reference architecture diagram of the solution]

Key components of this solution include (components shown in dashed boxes in the diagram are coming soon):

Client-Side User Interface for Demonstration (coming soon)
A web-based client interface is included to showcase the solution in an interactive, user-friendly format. This web UI lets you quickly explore and test the solution, including its orchestration routing behavior and functionality.

Workflow Orchestration for Human-Controllable Conversations
By combining services like CLU, CQA, and LLMs, the accelerator allows for a dynamic, adaptable workflow (a minimal routing sketch follows this component list). CLU can recognize and route customer-defined intents, while CQA provides exact answers from predefined QA pairs.
If a question falls outside the predefined scope, the workflow can seamlessly fall back to an LLM enhanced with RAG for contextually relevant, accurate responses. This workflow provides human-like adaptability while maintaining control over assistant responses.

Conversational Language Understanding (CLU) for Intent Routing
The CLU service lets you define the top intents you want the assistant to handle, whether those critical to your business or those users ask about most. This component plays a central role in directing conversations by interpreting user intents and routing them to the right action or AI agent. Whether completing a task or addressing a specific customer need, CLU provides the mechanism to ensure the assistant accurately understands and executes custom-defined intents.

Custom Question Answering (CQA) for Exact Answers with No Hallucinations
CQA lets you create and manage predefined QA pairs to deliver precise responses, reducing ambiguity and ensuring that the assistant aligns closely with approved answers. This controlled response approach keeps interactions consistent and reliable, particularly for high-stakes or regulatory-sensitive conversations. You can also attach CQA to your existing RAG solution to quickly fix incorrect answers.

PII Detection and Redaction for Privacy Protection (coming soon)
Protecting user privacy is a top priority, especially in conversational AI. This accelerator showcases an optional integration with Azure AI Language's Personally Identifiable Information (PII) service to automatically identify and redact sensitive information where compliance with privacy standards and regulations is required.

LLM with RAG to Handle Everything Else (coming soon)
The accelerator uses a RAG solution to handle missed intents and user queries on lower-priority topics; you can replace it with your existing RAG solution. The predefined intents and question-answer pairs can be appended and updated over time based on evolving business needs and DSATs (dissatisfactions) discovered in the RAG responses. This approach ensures controlled, deterministic experiences for high-value or high-priority topics while maintaining flexibility and extensibility for lower-priority interactions.

Components Configuration for "Plug-and-Play"
One of the standout features of this accelerator is its flexibility through a "plug-and-play" component configuration. The architecture is designed so you can easily swap, add, or remove components to tailor the solution to your specific needs. Whether you want to add custom intents, adjust fallback mechanisms, or incorporate additional data sources, the modular nature of the accelerator makes it simple to configure.
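The routing logic described above can be pictured with a minimal, hypothetical C# sketch. This is not the accelerator's actual code: the three delegates stand in for calls to CLU, CQA, and an LLM-with-RAG pipeline, and the 0.8 confidence threshold is an assumed value.

using System;
using System.Threading.Tasks;

// Hypothetical orchestration sketch, not the accelerator's actual code.
// The three delegates stand in for calls to CLU, CQA and an LLM-with-RAG pipeline.
public static class AssistantRouter
{
    public static async Task<string> RouteAsync(
        string userQuery,
        Func<string, Task<(string Intent, double Confidence)>> getTopIntentAsync, // CLU
        Func<string, Task<string>> getCqaAnswerAsync,                             // CQA (null/empty if no match)
        Func<string, Task<string>> askLlmWithRagAsync,                            // LLM + RAG fallback
        double intentThreshold = 0.8)                                             // assumed threshold
    {
        // 1. Ask CLU for the top intent; route high-confidence business intents deterministically.
        var (intent, confidence) = await getTopIntentAsync(userQuery);
        if (intent != "None" && confidence >= intentThreshold)
            return $"[handled by the workflow registered for intent '{intent}']";

        // 2. Try CQA for an exact, predefined answer.
        string exactAnswer = await getCqaAnswerAsync(userQuery);
        if (!string.IsNullOrEmpty(exactAnswer))
            return exactAnswer;

        // 3. Everything else falls back to the LLM enhanced with RAG.
        return await askLlmWithRagAsync(userQuery);
    }
}

Swapping any stage, for example pointing the fallback delegate at your existing RAG endpoint, mirrors the "plug-and-play" configuration described above.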
Get Started Building Your GenAI-Powered Assistant Today
Our new accelerator is available on GitHub, ready for developers to deploy, customize, and use as a foundation for your own needs. Join us as we move towards a future where GenAI empowers organizations to meet business needs with intelligent, adaptable, and human-controllable assistants.

What's more: Other New Azure AI Language Releases This Ignite
Beyond the accelerator, Azure AI Language provides additional capabilities to support GenAI customers in more scenarios, ensuring quality, privacy, and flexible deployment in any type of environment, whether cloud or on-premises. We are also excited to announce the following new features launching at Ignite.

Azure AI Language in Azure AI Studio: Azure AI Language is moving to AI Studio. Extract PII from text, Extract PII from conversation, Summarize text, Summarize conversation, Summarize for call center, and Text Analytics for health are now available in the AI Studio playground, with more skills to follow.

Conversational Language Understanding (CLU): Today, customers use CLU to build custom natural language understanding models, hosted by Azure, that predict the overall intention of an incoming utterance and extract important information from it. Some customers, however, have specific needs that require an on-premises connection, so we are excited to announce runtime containers for CLU for these use cases.

PII Detection and Redaction: Azure AI Language offers Text PII and Conversational PII services to extract personally identifiable information from input text and conversations to enhance privacy and security, often before data is sent to the cloud or to an LLM. We are excited to announce new improvements to these services: the preview API (version 2024-11-15-preview) now supports masking detected sensitive entities with an entity label, so that "John Doe received a call from 424-878-9192" can be returned as "[PERSON_1] received a call from [PHONENUMBER_1]". More on how to specify the redaction policy style for your outputs can be found in our documentation. (A short sketch of the PII service follows this list.)

Native document support: The gating control is removed with the latest API version, 2024-11-15-preview, allowing customers to access native document support for PII redaction and summarization. Key updates in this version include:
- Increased maximum file size limits (from 1 MB to 10 MB).
- Enhanced PII redaction customization: customers can now specify whether they want only the redacted document, or both the redacted document and a JSON file containing the detected entities.

Language detection: Language detection is a preconfigured feature that detects the language a document is written in and returns a language code for a wide range of languages, variants, dialects, and some regional/cultural languages. We are happy to announce the general availability of script detection, along with support for 16 more languages, bringing the total to 139 supported languages.

Named entity recognition (NER): The Named Entity Recognition (NER) service supports customer scenarios for identifying and analyzing entities such as addresses, names, and phone numbers in input text. NER's Generally Available API (version 2024-11-01) now supports several optional input parameters (inclusionList, exclusionList, inferenceOptions, and overlapPolicy) as well as an updated output structure (with new fields tags, type, and metadata) for richer customization and deeper analysis. More on how to use these parameters can be found in our documentation.

Text analytics for health: Text analytics for health (TA4H) is a preconfigured feature that extracts and labels relevant medical information from unstructured texts such as doctor's notes, discharge summaries, clinical documents, and electronic health records. Today, we released support for Fast Healthcare Interoperability Resources (FHIR) structuring and temporal assertion detection in the Generally Available API.
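To make the PII capability above concrete, here is a minimal sketch using the stable Azure.AI.TextAnalytics .NET SDK. It applies the default character masking; the endpoint and key are placeholders, and the new entity-label masking style described above is configured through the 2024-11-15-preview REST API rather than this SDK call.

using System;
using Azure;
using Azure.AI.TextAnalytics;

// Minimal sketch of Text PII detection and redaction with the stable .NET SDK.
// Endpoint and key are placeholders for your own Language resource.
class PiiRedactionSample
{
    static void Main()
    {
        var client = new TextAnalyticsClient(
            new Uri("https://<your-language-resource>.cognitiveservices.azure.com/"),
            new AzureKeyCredential("<your-key>"));

        string document = "John Doe received a call from 424-878-9192.";
        PiiEntityCollection entities = client.RecognizePiiEntities(document).Value;

        // Default behavior masks detected entities with characters;
        // label-style masking (e.g. [PERSON_1]) requires the preview REST API.
        Console.WriteLine(entities.RedactedText);
        foreach (PiiEntity entity in entities)
            Console.WriteLine($"{entity.Category}: {entity.Text} (score {entity.ConfidenceScore:0.00})");
    }
}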
Power Up Your Open WebUI with Azure AI Speech: Quick STT & TTS Integration

Introduction
Ever found yourself wishing your web interface could really talk and listen back to you? With a few clicks (and a bit of code), you can turn plain Open WebUI into a full-on voice assistant. In this post, you'll see how to spin up an Azure Speech resource, hook it into your frontend, and watch as user speech is transformed into text while your app's responses leap off the screen in a human-like voice. By the end of this guide, you'll have a voice-enabled web UI that actually converses with users, opening the door to hands-free controls, better accessibility, and a genuinely richer user experience. Ready to make your web app speak? Let's dive in.

Why Azure AI Speech?
We use the Azure AI Speech service in Open WebUI to enable voice interactions directly within web applications. This allows users to:
- Speak commands or input instead of typing, making the interface more accessible and user-friendly.
- Hear responses or information read aloud, which improves usability for people with visual impairments or those who prefer audio.
- Enjoy a more natural, hands-free experience, especially on devices like smartphones or tablets.
In short, integrating the Azure AI Speech service into Open WebUI makes web apps smarter, more interactive, and easier to use by adding speech recognition and voice output features. If you haven't hosted Open WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure; proceed to the next step if you have Open WebUI deployed already. Learn more about Open WebUI here.

Deploy the Azure AI Speech service in Azure
Navigate to the Azure portal and search for "Azure AI Speech" in the portal search bar. Create a new Speech service by filling in the fields on the resource creation page, then click "Create" to finalize the setup. After the resource has been deployed, click the "View resource" button and you should be redirected to the Azure AI Speech service page. The page displays the API keys and endpoints for the Azure AI Speech service, which you can use in Open WebUI.

Setting things up in Open WebUI

Speech-to-Text settings (STT)
Head to the Open WebUI Admin page > Settings > Audio. Paste the API key obtained from the Azure AI Speech service page into the API key field. Unless you use a different Azure region or want to change the default STT configuration, leave the remaining settings blank.

Text-to-Speech settings (TTS)
Next, configure the TTS settings in Open WebUI by switching the TTS Engine to the Azure AI Speech option. Again, paste the API key obtained from the Azure AI Speech service page and leave the other settings blank. You can change the TTS voice from the dropdown selection in the TTS settings. Click Save to apply the changes.

Expected Result
Now, let's test that everything works. Open a new chat (or a temporary chat) in Open WebUI and click the Call / Record button. The STT engine (Azure AI Speech) should transcribe your voice and the model should respond based on the voice input. To test the TTS feature, click Read Aloud (the speaker icon) under any response from Open WebUI; the audio should now come from the Azure AI Speech service.

Conclusion
And that's a wrap! You've just given your Open WebUI the gift of capturing user speech, turning it into text, and then talking right back with Azure's neural voices. Along the way you saw how easy it is to spin up a Speech resource in the Azure portal, wire up real-time transcription in the browser, and pipe responses through the TTS engine.
From here, it's all about experimentation. Try swapping in different neural voices or dialing in new languages. Tweak how you start and stop listening, play with silence detection, or add custom pronunciation tweaks for those tricky product names. Before you know it, your interface will feel less like a web page and more like a conversation partner.
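If you want to see what the Speech service is doing under the hood, or script it outside the browser, the following is a minimal console sketch using the Microsoft.CognitiveServices.Speech .NET SDK. The key, region, and voice name are placeholder assumptions; swap in the values from your own resource.

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

// Minimal sketch: one-shot speech-to-text followed by text-to-speech,
// using placeholder key/region values for an Azure AI Speech resource.
class SpeechRoundTrip
{
    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription("<your-speech-key>", "<your-region>");
        config.SpeechSynthesisVoiceName = "en-US-JennyNeural"; // any neural voice works here

        // Speech-to-text: capture one utterance from the default microphone.
        using var recognizer = new SpeechRecognizer(config);
        SpeechRecognitionResult sttResult = await recognizer.RecognizeOnceAsync();
        Console.WriteLine($"You said: {sttResult.Text}");

        // Text-to-speech: read a response back through the default speaker.
        using var synthesizer = new SpeechSynthesizer(config);
        await synthesizer.SpeakTextAsync($"You said: {sttResult.Text}");
    }
}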
Using Semantic Kernel to control a BBC Microbit

Background
The BBC Micro:bit has proved very popular in UK schools as a cheap and simple device for demonstrating basic coding, either using a building-block approach or by writing Python code in an editor. There are browser-based tools for editing the code in either mode. The BBC Micro:bit is connected via USB and the browser can then upload the code to it, making for a very interactive experience. What if the power of Azure OpenAI prompts could be combined with the simplicity of the BBC Micro:bit, so that prompts directly program the device? That is what this blog explains.

Semantic Kernel
This blog assumes the reader has a basic understanding of Azure OpenAI and knows there is an API for sending requests to, and getting responses from, a model provisioned in Azure OpenAI. If not, please look here.

A little background on Semantic Kernel may be useful. It is an open-source framework for building AI agents. Semantic Kernel makes use of plugins as a mechanism to perform specific tasks. A plugin is a piece of code tagged with metadata so that Semantic Kernel knows whether to use the plugin and, if so, how to call it. For those conversant with OpenAI function calling, Semantic Kernel automates the process of registering a function with Azure OpenAI and calling that function when the model decides the user's query is best answered by calling it. This mechanism is extremely powerful, as it opens up huge possibilities for an Azure OpenAI model to integrate with any number of things. In this example, the BBC Micro:bit will be a plugin to Semantic Kernel.

The Demo

Semantic Kernel app
To keep things really simple, this demonstration is a console application that runs a Semantic Kernel planner with a plugin for the BBC Micro:bit. The code is straightforward. In the first step below, create the kernel with your model, endpoint and key. You will need an instance of Azure OpenAI with a model like gpt-4o provisioned.

#pragma warning disable SKEXP0050
#pragma warning disable SKEXP0060
using System.ComponentModel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.DependencyInjection;
using System;

// Create the kernel
var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(
    "YOUR_MODEL",
    "YOUR_ENDPOINT",
    "YOUR_KEY");

Next, register the plugins:

builder.Plugins.AddFromType<LightsPlugin>();
builder.Plugins.AddFromType<MicrobitPlugin>();

In the sample there are two plugins: one for the BBC Micro:bit and another that emulates its grid of lights, so you can have Semantic Kernel control the emulated grid if you don't have a BBC Micro:bit connected. We will, however, concentrate on the BBC Micro:bit in this blog. Next we enable automatic function calling and create a chat history object that we initialise with instructions explaining to the model what it can do:

Kernel kernel = builder.Build();

// Retrieve the chat completion service from the kernel
IChatCompletionService chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
// 2. Enable automatic function calling
OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new()
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

// Create the chat history
ChatHistory history = new ChatHistory("""
    You have a matrix of 5 rows and columns each of which can have a brightness of 0 to 9.
    Rows make up the horizontal axis. The top row has coordinates 0,0 to 4,0.
    Columns represent the vertical axis. The left most column is 0,0 to 0,4 and the right column is 4,0 to 4,4.
    """);

This is all the initialisation needed, so the final step is to run the prompt loop where you can prompt the application and it will respond.

string? userInput;
do
{
    // Collect user input
    Console.Write("User > ");
    userInput = Console.ReadLine();

    if (userInput is not null)
    {
        // Add user input
        history.AddUserMessage(userInput);

        // Get the response from the AI with automatic function calling
        var result = await chatCompletionService.GetChatMessageContentAsync(
            history,
            executionSettings: openAIPromptExecutionSettings,
            kernel: kernel);

        // Print the results
        Console.WriteLine("Assistant > " + result);

        // Add the message from the agent to the chat history
        history.AddMessage(result.Role, result.Content ?? string.Empty);
    }
} while (userInput is not null);

As can be seen above, this is just a loop that requests input, passes it to the chatCompletionService, gets the response, displays it and adds it to the history. In this manner the agent is conversational and will take previous prompts into account when answering the current one. In this case, the conversation history is not persisted.

BBC Micro:bit plugin
The BBC Micro:bit has a USB port and, when plugged into a PC, presents itself as a serial port device. The port number can be found in Windows Device Manager, but it is often COM3. The plugin uses a feature of the BBC Micro:bit where commands can be sent interactively; this is referred to as the Read-Evaluate-Print Loop (REPL). The plugin opens a connection to a COM port on initialisation, so this will error if the BBC Micro:bit is not plugged in or presents itself on a different COM port. The code that may need amending is in MicrobitPlugin.cs:

SerialPort serialPort;

public MicrobitPlugin()
{
    serialPort = new SerialPort("COM3", 115200);
    serialPort.Open();

    // Send Ctrl-C to stop any running program
    serialPort.Write(new byte[] { 0x03 }, 0, 1);
}

The above opens the serial port for later use so that commands may be sent. As explained in the Semantic Kernel description, metadata is attached to a plugin function to indicate to Semantic Kernel, and in turn to Azure OpenAI, what that function does.
The simplest function sets a specific pixel on the BBC Micro:bit display:

[KernelFunction("set_microbit_light_brightness")]
[Description("sets the brightness of a microbit pixel by its row and column ID.")]
public async Task<int> SetLightBrightness(
    Kernel kernel,
    int rowid,
    int columnid,
    int brightness)
{
    SendCommand(serialPort, $"display.set_pixel({rowid},{columnid},{brightness})");
    return brightness;
}

This in turn sends the command to the BBC Micro:bit over the serialPort object that was previously opened:

private void SendCommand(SerialPort serialPort, string command)
{
    // Send Ctrl-C to stop any running program
    serialPort.Write(new byte[] { 0x03 }, 0, 1);

    // Wait for the micro:bit to stop any running program
    System.Threading.Thread.Sleep(50); // Adjust the delay as needed

    // Send the command to the micro:bit
    serialPort.Write(command);
    serialPort.WriteLine(" ");

    // Send a carriage return to execute the command
    serialPort.Write(new byte[] { 0x0d }, 0, 1);

    // Wait for the micro:bit to finish executing the command
    System.Threading.Thread.Sleep(50); // Adjust the delay as needed

    // Read the response from the micro:bit
    var response = serialPort.ReadExisting();

    // Print the response to the console
    Console.WriteLine(response);
}

As can be seen above, there is metadata describing the purpose of the function, and its parameters are inspected too: for the light, these are the row, column and brightness. The function then sends a specific display.set_pixel command to the BBC Micro:bit. The number of rows and columns is not fixed here, nor are the brightness levels; these are explained to the model in the original prompt (chat history) before the main Semantic Kernel planner loop is run.

Some sample prompts
Once the program is built and running, you can try any number of prompts:

- set the top left light to brightness 4
- set the last light to a lower brightness
- turn off all lights
- draw a circle
- make the circle bigger
- draw the letter "W"

Using the application, you can see that even with this basic plugin function, the planner can do more complicated things than expected. It has history, so it can refer to a previous prompt, and it can do much more interesting things like drawing shapes or letters that require multiple calls to the plugin function!

Extending the plugin
The REPL interface to the BBC Micro:bit can accept all sorts of commands besides display.set_pixel, so it makes sense to expand the plugin to give it broader capability:

[KernelFunction("set_microbit_command")]
[Description("send a command to the microbit using REPL.")]
public void SendGenericCommand(
    Kernel kernel,
    string command)
{
    SendCommand(serialPort, command);
}

This plugin function allows any arbitrary command to be sent to the BBC Micro:bit, enabling a wider set of use cases and more compact commands.

Some more prompts
Since this extra plugin function is not limited to the lights, you can ask it to:

- make a sound
- report the current temperature
- say whether one of the buttons is pressed

For the lights too, there are extra possibilities:

- display a word with each letter in sequence
- display a scrolling word or sentence
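As an illustration of how the plugin could grow beyond the generic command, here is a hypothetical extra function (not in the original sample) that could be added to the MicrobitPlugin class to surface the Micro:bit's temperature sensor to the planner. It follows the same serial-port pattern as SendCommand above; the naive response parsing and the delays are assumptions you would tune.

[KernelFunction("get_microbit_temperature")]
[Description("reads the current temperature in degrees Celsius from the microbit's onboard sensor.")]
public async Task<string> GetTemperature(Kernel kernel)
{
    // Send Ctrl-C to stop any running program, as SendCommand does
    serialPort.Write(new byte[] { 0x03 }, 0, 1);
    await Task.Delay(50);

    // temperature() is a built-in MicroPython function on the micro:bit
    serialPort.Write("print(temperature())");
    serialPort.Write(new byte[] { 0x0d }, 0, 1); // carriage return to execute

    await Task.Delay(100); // give the REPL time to respond

    // Naive parsing: the REPL echoes the command and prompt alongside the value
    return serialPort.ReadExisting();
}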
The main limiting factor observed is the ability of the REPL interface to accept complex commands from the planner. This is largely to do with Python as a language and how control flow is expressed through indentation (rather than some form of brackets, as in C#).

Further reading
There are some labs that can be used to explore this further. In addition, for those who do not have access to a BBC Micro:bit, there are earlier labs that are essentially the same but use an array of virtual lights representing the BBC Micro:bit's light matrix. Have a play 🙂

Share Your Experience with Azure AI and Support a Charity
AI is transforming how leaders tackle problem-solving and creativity across industries. From creating realistic images to generating human-like text, the potential of applications powered by large and small language models is vast. Our goal at Microsoft is to continuously enhance our offerings and provide the best safe, secure, and private AI services and machine learning platform for the developers, IT professionals and decision-makers who are paving the way for AI transformations. Are you using Azure AI to build your generative AI apps? We're excited to invite our valued Azure AI customers to share their experiences and insights on Gartner Peer Insights. Your firsthand review not only helps fellow developers and decision-makers navigate their choices but also influences the evolution of our AI products. Write a review: Microsoft Gartner Peer Insights (https://gtnr.io/JK8DWRoL0).

The Azure Multimodal AI & LLM Processing Solution Accelerator
The Azure Multimodal AI & LLM Processing Accelerator is your one-stop shop for all backend AI + LLM processing use cases like content summarization, extraction, classification and enrichment. This single accelerator supports all types of input data (text, documents, audio, image, video, etc.) and combines the best of Azure AI Services and LLMs to achieve reliable, consistent and scalable automation of tasks.

Deploying GPT-4o AI Chat app on Azure via Azure AI Services – a step-by-step guide
Are you ready to revolutionize your business with cutting-edge AI technology? Dive into our comprehensive step-by-step guide on deploying a GPT-4o AI Chat app using Azure AI Services. Discover how to harness the power of advanced natural language processing to create interactive, human-like chat experiences. From setting up your Azure account to deploying your AI model and customizing your chat app, this guide covers it all. Unleash the potential of AI in your business and stay ahead of the curve with the latest advancements from Microsoft Azure. Don't miss out on this opportunity to transform your workflows and elevate customer interactions to new heights!
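As a taste of what the guide walks through, the following is a minimal sketch of calling a GPT-4o deployment from C#, assuming the Azure.AI.OpenAI 2.x SDK; the endpoint, key and deployment name are placeholders for your own resource.

using System;
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;

// Minimal sketch of a chat completion against an Azure OpenAI GPT-4o deployment.
// Endpoint, key and deployment name are placeholders.
class Gpt4oChatSample
{
    static void Main()
    {
        var azureClient = new AzureOpenAIClient(
            new Uri("https://<your-resource>.openai.azure.com/"),
            new AzureKeyCredential("<your-key>"));

        ChatClient chatClient = azureClient.GetChatClient("<your-gpt-4o-deployment>");

        ChatCompletion completion = chatClient.CompleteChat(
            new SystemChatMessage("You are a helpful assistant."),
            new UserChatMessage("Say hello to the readers of this guide."));

        Console.WriteLine(completion.Content[0].Text);
    }
}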
Conversational Bots 2.0 – Setting a new paradigm

The evolution of AI chatbots is transforming user interactions. Powered by advanced Azure AI, these multi-modal bots can process and respond to various inputs like text, images, and voice. They offer enhanced support and seamless navigation, making them invaluable for improving user experiences.