Azure AI Studio
The Future of AI: Evaluating and optimizing custom RAG agents using Azure AI Foundry
This blog post explores best practices for evaluating and optimizing Retrieval-Augmented Generation (RAG) agents using Azure AI Foundry. It introduces the RAG triad metrics—Retrieval, Groundedness, and Relevance—and demonstrates how to apply them using Azure AI Search and agentic retrieval for custom agents. Readers will learn how to fine-tune search parameters, use end-to-end evaluation metrics and golden retrieval metrics like XDCG and Max Relevance, and leverage Azure AI Foundry tools to build trustworthy, high-performing AI agents.

Deploy Your First Azure AI Agent Service-Powered App on Azure App Service
1. Introduction

Azure AI Agent Service is a fully managed service designed to empower developers to securely build, deploy, and scale high-quality, extensible AI agents without needing to manage the underlying compute and storage resources. These AI agents act as "smart" microservices that can answer questions, perform actions, or automate workflows by combining generative AI models with tools that allow them to interact with real-world data sources.

Deploying Azure AI Agent Service on Azure App Service offers several benefits:
- Scalability: Azure App Service provides automatic scaling options to handle varying loads.
- Security: Built-in security features ensure that your AI agents are protected.
- Ease of Deployment: Simplified deployment processes allow developers to focus on building and improving their AI agents rather than managing infrastructure.

2. Prerequisites

Before you begin deploying Azure AI Agent Service on Azure App Service, ensure you have the following prerequisites in place:
- Azure Subscription: You need an active Azure subscription. If you don't have one, you can create a free account on the Azure portal.
- Azure AI Foundry Access: Azure AI Foundry is the platform where you create and manage your AI agents. Ensure you have access to Azure AI Foundry and have the necessary permissions to create hubs and projects.
- Basic Knowledge of Azure App Service: Familiarity with Azure App Service is essential for configuring and deploying your AI agent. Understanding the basics of resource groups, app services, and hosting plans will be beneficial.
- Development Environment: Set up your development environment with the required tools and SDKs. This includes:
  - Azure CLI: For managing Azure resources from the command line.
  - Azure AI Foundry SDK: For creating and managing AI agents.
  - Code Editor: Such as Visual Studio Code, for writing and editing your deployment scripts.

3. Setting Up Azure AI Agent Service

To harness the capabilities of Azure AI Agent Service, follow these steps to set up the environment:

a. Create an Azure AI Hub and Project

Begin by establishing an AI Hub and initiating a new project within Azure AI Foundry:
- Access Azure Portal: Log in to the Azure Portal using your Azure credentials.
- Create AI Hub: Navigate to the search bar and search for "AI Foundry". Select "AI Foundry", click "Create", and select "Hub". Provide the necessary details such as subscription, resource group, region, and name, and connect AI services. Review and create the AI Hub.
- Create a Project: Within the newly created AI Hub, click "Launch Azure AI Foundry". Under your new AI Hub, click "New project" and then "Create".

b. Deploy an Azure OpenAI Model

With the project in place, deploy a suitable AI model:
- Model Deployment: On the left-hand side of the project panel, select "Models + Endpoints" and click "Deploy model". Select "Deploy base model", choose "gpt-4o", and click "Confirm". Leave the default settings and click "Deploy".

Detailed guidance is available in the Quickstart documentation.

4. Create and Configure the AI Agent

After setting up the environment and deploying the model, proceed to create the AI agent. On the left-hand side of the project panel, select "Agents". Click "New agent", and a default agent will be created that is already connected to your Azure OpenAI model.

1. Define Instructions: Craft clear and concise instructions that guide the agent's interactions. For example:

instructions = "You are a helpful assistant capable of answering queries and performing tasks."
2. Integrate Tools: Incorporate tools to enhance the agent's capabilities, such as:
- Code Interpreter: Allows the agent to execute code for data analysis.
- OpenAPI Tools: Enable the agent to interact with external APIs.

Enable the Code Interpreter tool: Still in the agent settings, in the "Actions" section, click "Add", select "Code interpreter", and click "Save". On the same agent settings panel, at the top, click "Try in playground". Do a quick test by entering "Hi" to the agent.

5. Develop a Chat Application

Utilize the Azure AI Foundry SDK to instantiate and integrate the agent. In this tutorial we will be using Chainlit - an open-source Python package to quickly build conversational AI applications.

1. Set up your local development environment: Follow the steps below, from cloning the repository to running the Chainlit application. You can find the "Project connection string" inside your project's "Overview" section in AI Foundry. Still in AI Foundry, the "Agent ID" can be found inside your "Agents" section.

git clone -b Deploy-AI-Agent-App-Service https://github.com/robrita/tech-blogs
copy sample.env to .env and update
python -m venv venv
.\venv\Scripts\activate
python -m pip install -r requirements.txt
chainlit run app.py

2. Full code for reference:

import os
import chainlit as cl
import logging
from dotenv import load_dotenv
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential
from azure.ai.projects.models import (
    MessageRole,
)

# Load environment variables
load_dotenv()

# Disable verbose connection logs
logger = logging.getLogger("azure.core.pipeline.policies.http_logging_policy")
logger.setLevel(logging.WARNING)

AIPROJECT_CONNECTION_STRING = os.getenv("AIPROJECT_CONNECTION_STRING")
AGENT_ID = os.getenv("AGENT_ID")

# Create an instance of the AIProjectClient using DefaultAzureCredential
project_client = AIProjectClient.from_connection_string(
    conn_str=AIPROJECT_CONNECTION_STRING, credential=DefaultAzureCredential()
)

# Chainlit setup
@cl.on_chat_start
async def on_chat_start():
    # Create a thread for the agent
    if not cl.user_session.get("thread_id"):
        thread = project_client.agents.create_thread()
        cl.user_session.set("thread_id", thread.id)
        print(f"New Thread ID: {thread.id}")

@cl.on_message
async def on_message(message: cl.Message):
    thread_id = cl.user_session.get("thread_id")
    try:
        # Show thinking message to user
        msg = await cl.Message("thinking...", author="agent").send()

        project_client.agents.create_message(
            thread_id=thread_id,
            role="user",
            content=message.content,
        )

        # Run the agent to process the message in the thread
        run = project_client.agents.create_and_process_run(thread_id=thread_id, agent_id=AGENT_ID)
        print(f"Run finished with status: {run.status}")

        # Check if you got "Rate limit is exceeded.", then you want to increase the token limit
        if run.status == "failed":
            raise Exception(run.last_error)

        # Get all messages from the thread
        messages = project_client.agents.list_messages(thread_id)

        # Get the last message from the agent
        last_msg = messages.get_last_text_message_by_role(MessageRole.AGENT)
        if not last_msg:
            raise Exception("No response from the model.")

        msg.content = last_msg.text.value
        await msg.update()
    except Exception as e:
        await cl.Message(content=f"Error: {str(e)}").send()

if __name__ == "__main__":
    # Chainlit will automatically run the application
    pass

3. Test Agent Functionality: Ensure the agent operates as intended, as in the quick check below.
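If you want to confirm the agent responds before wiring it into the UI, a small script like the sketch below can help. It is not part of the original repository; it simply reuses the same azure-ai-projects calls shown in app.py above and assumes the same .env values (AIPROJECT_CONNECTION_STRING and AGENT_ID).

# quick_test.py - minimal sketch (not from the original repo) that reuses the SDK calls from app.py
import os
from dotenv import load_dotenv
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import MessageRole
from azure.identity import DefaultAzureCredential

load_dotenv()  # expects AIPROJECT_CONNECTION_STRING and AGENT_ID, as in app.py

project_client = AIProjectClient.from_connection_string(
    conn_str=os.getenv("AIPROJECT_CONNECTION_STRING"),
    credential=DefaultAzureCredential(),
)

# Create a throwaway thread, post a single user message, and run the agent once
thread = project_client.agents.create_thread()
project_client.agents.create_message(thread_id=thread.id, role="user", content="Hi")
run = project_client.agents.create_and_process_run(thread_id=thread.id, agent_id=os.getenv("AGENT_ID"))
print(f"Run status: {run.status}")

# Print the agent's last reply, mirroring the retrieval logic used in app.py
messages = project_client.agents.list_messages(thread.id)
reply = messages.get_last_text_message_by_role(MessageRole.AGENT)
print(reply.text.value if reply else "No response from the agent.")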
6. Deploying on Azure App Service

Deploying a Chainlit application on Azure App Service involves creating an App Service instance, configuring your application for deployment, and ensuring it runs correctly in the Azure environment. Here's a step-by-step guide:

1. Create an Azure App Service Instance:
- Log in to the Azure Portal: Access the Azure Portal and sign in with your Azure account.
- Create a New Web App: Navigate to "App Services" and select "Create". Fill in the necessary details:
  - Subscription: Choose your Azure subscription.
  - Resource Group: Select an existing resource group or create a new one.
  - Name: Enter a unique name for your web app.
  - Publish: Choose "Code".
  - Runtime Stack: Select "Python 3.12" or higher.
  - Region: Choose the region closest to your users.
- Review and Create: After filling in the details, click "Review + Create" and then "Create" to provision the App Service.

2. Update Azure App Service Settings:
- Environment Variables: Add both "AIPROJECT_CONNECTION_STRING" and "AGENT_ID".
- Configuration: Set the Startup Command to "startup.sh". Turn "On" the "SCM Basic Auth Publishing Credentials" setting. Turn "On" the "Session affinity" setting. Finally, click "Save".
- Identity: Turn the status "On" under the "System assigned" tab and click "Save".

3. Assign a Role to Your AI Foundry Project:
- In the Azure Portal, navigate to "AI Foundry" and select the Azure AI Project where the agent was created.
- Select "Access Control (IAM)" and click "Add" to add a role assignment.
- In the search bar, enter "AzureML Data Scientist" > "Next" > "Managed identity" > "Select members" > "App Service" > (your app name) > "Review + Assign".

4. Deploy Your Application to Azure App Service:
- Deployment Methods: Azure App Service supports various deployment methods, including GitHub Actions, Azure DevOps, and direct ZIP uploads. Choose the method that best fits your workflow.
- Using an External Public GitHub Repository: In the Azure Portal, navigate to your App Service. Go to the "Deployment Center" and select the "External Git" deployment option. Enter the "Repository" (https://github.com/robrita/tech-blogs) and "Branch" (Deploy-AI-Agent-App-Service). Keep "Public" and hit "Save".
- Check Your Deployment: Still under "Deployment Center", click the "Logs" tab to view the deployment status. Once successful, head over to the "Overview" section of your App Service to test the "Default domain".
- Redeploy Your Application: To redeploy your app, under "Deployment Center", click "Sync".

By following these steps, you can successfully deploy your Chainlit application on Azure App Service with first-class Azure AI Agent Service integration, making it accessible to users globally.

Resources

Implementation can be found at Deploy-AI-Agent-App-Service

References: https://learn.microsoft.com/en-us/azure/ai-services/agents/overview

~Cheers!
Robert Rita
AI Cloud Solution Architect, ASEAN
https://www.linkedin.com/in/robertrita/
#r0bai

Data Storage in Azure OpenAI Service
Data Stored at Rest by Default

Azure OpenAI does store certain data at rest by default when you use specific features (continue reading). In general, the base models are stateless and do not retain your prompts or completions from standard API calls (they aren't used to train or improve the base models). However, some optional service features will persist data in your Azure OpenAI resource. For example, if you upload files for fine-tuning, use the vector store, or enable stateful features like Assistants API Threads or Stored Completions, that data will be stored at rest by the service. This means content such as training datasets, embeddings, conversation history, or output logs from those features are saved within your Azure environment. Importantly, this storage is within your own Azure tenant (in the Azure OpenAI resource you created) and remains in the same geographic region as your resource. In summary, yes – data can be stored at rest by default when using these features, and it stays isolated to your Azure resource in your tenant. If you only use basic completions without these features, then your prompts and outputs are not persisted in the resource by default (aside from transient processing).

Location and Deletion of Stored Data

Location: All data stored by Azure OpenAI features resides in your Azure OpenAI resource's storage, within your Azure subscription/tenant and in the same region (geography) that your resource is deployed. Microsoft ensures this data is secured — it is automatically encrypted at rest using AES-256 encryption, and you have the option to add a customer-managed key for double encryption (except in certain preview features that may not support CMK). No other Azure OpenAI customers or OpenAI (the company) can access this data; it remains isolated to your environment.

Deletion: You retain full control over any data stored by these features. The official documentation states that stored data can be deleted by the customer at any time. For instance, if you fine-tune a model, the resulting custom model and any training files you uploaded are exclusively available to you and you can delete them whenever you wish. Similarly, any stored conversation threads or batch processing data can be removed by you through the Azure portal or API. In short, data persisted for Azure OpenAI features is user-managed: it lives in your tenant and you can delete it on demand once it's no longer needed.

Comparison to Abuse Monitoring and Content Filtering

It's important to distinguish the above data storage from Azure OpenAI's content safety system (content filtering and abuse monitoring), which operates differently:

Content Filtering: Azure OpenAI automatically checks prompts and generations for policy violations. These filters run in real-time and do not store your prompts or outputs in the filter models, nor are your prompts/outputs used to improve the filters without consent. In other words, the content filtering process itself is ephemeral – it analyzes the content on the fly and doesn't permanently retain that data.

Abuse Monitoring: By default (if enabled), Azure OpenAI has an abuse detection system that might log certain data when misuse is detected. If the system's algorithms flag potential violations, a sample of your prompts and completions may be captured for review. Any such data selected for human review is stored in a secure, isolated data store tied to your resource and region (within the Azure OpenAI service boundaries in your geography).
This is used strictly for moderation purposes – e.g. a Microsoft reviewer could examine a flagged request to ensure compliance with the Azure OpenAI Code of Conduct.

When Abuse Monitoring is Disabled: this applies if you have disabled content logging/abuse monitoring (via an approved Microsoft process to turn it off). According to Microsoft's documentation, when a customer has this modified abuse monitoring in place, Microsoft does not store any prompts or completions for that subscription's Azure OpenAI usage. The human review process is completely bypassed (because there's no stored data to review). Only the AI-based checks might still occur, but they happen in-memory at request time and do not persist your data at rest. Essentially, with abuse monitoring turned off, no usage data is being saved for moderation purposes; the system will check content policy compliance on the fly and then immediately discard those prompts/outputs without logging them.

Data Storage and Deletion in Azure OpenAI "Chat on Your Data"

Azure OpenAI's "Chat on your data" (also called Azure OpenAI on your data, part of the Assistants preview) lets you ground the model's answers on your own documents. It stores some of your data to enable this functionality. Below, we explain where and how your data is stored, how to delete it, and important considerations (based on official Microsoft documentation).

How Azure OpenAI on your data stores your data

Data Ingestion and Storage: When you add your own data (for example by uploading files or providing a URL) through Azure OpenAI's "Add your data" feature, the service ingests that content into an Azure Cognitive Search index (Azure AI Search). The data is first stored in Azure Blob Storage (for processing) and then indexed for retrieval:

- Files Upload (Preview): Files you upload are stored in an Azure Blob Storage account and then ingested (indexed) into an Azure AI Search index. This means the text from your documents is chunked and saved in a search index so the model can retrieve it during chat.
- Web URLs (Preview): If you add a website URL as a data source, the page content is fetched and saved to a Blob Storage container (webpage-<index name>), then indexed into Azure Cognitive Search. Each URL you add creates a separate container in Blob storage with the page content, which is then added to the search index.
- Existing Azure Data Stores: You also have the option to connect an existing Azure Cognitive Search index or other vector databases (like Cosmos DB or Elasticsearch) instead of uploading new files. In those cases, the data remains in that source (for example, your existing search index or database), and Azure OpenAI will use it for retrieval rather than copying it elsewhere.

Chat Sessions and Threads: Azure OpenAI's Assistants feature (which underpins "Chat on your data") is stateful. This means it retains conversation history and any file attachments you use during the chat. Specifically, it stores: (1) threads, messages, and runs from your chat sessions, and (2) any files you uploaded as part of an Assistant's setup or messages. All this data is stored in a secure, Microsoft-managed storage account, isolated for your Azure OpenAI resource. In other words, Azure manages the storage for conversation history and uploaded content, and keeps it logically separated per customer/resource.

Location and Retention: The stored data (index content, files, chat threads) resides within the same Azure region/tenant as your Azure OpenAI resource.
It will persist indefinitely – Azure OpenAI will not automatically purge or delete your data – until you take action to remove it. Even if you close your browser or end a session, the ingested data (search index, stored files, thread history) remains saved on the Azure side. For example, if you created a Cognitive Search index or attached a storage account for "Chat on your data," that index and the files stay in place; the system does not delete them in the background.

How to Delete Stored Data

Removing data that was stored by the "Chat on your data" feature involves a manual deletion step. You have a few options depending on what data you want to delete:

Delete Chat Threads (Assistants API): If you used the Assistants feature and have saved conversation threads that you want to remove (including their history and any associated uploaded files), you can call the Assistants API to delete those threads. Azure OpenAI provides a DELETE endpoint for threads. Using the thread's ID, you can issue a delete request to wipe that thread's messages and any data tied to it. In practice, this means using the Azure OpenAI REST API or SDK with the thread ID. For example: DELETE https://<your-resource-name>.openai.azure.com/openai/threads/{thread_id}?api-version=2024-08-01-preview. This "delete thread" operation will remove the conversation and its stored content from the Azure OpenAI Assistants storage. (Simply clearing or resetting the chat in the Studio UI does not delete the underlying thread data – you must call the delete operation explicitly.) A short code sketch of this cleanup appears at the end of this article.

Delete Your Search Index or Data Source: If you connected an Azure Cognitive Search index or the system created one for you during data ingestion, you should delete the index (or wipe its documents) to remove your content. You can do this via the Azure portal or Azure Cognitive Search APIs: go to your Azure Cognitive Search resource, find the index that was created to store your data, and delete that index. Deleting the index ensures all chunks of your documents are removed from search. Similarly, if you had set up an external vector database (Cosmos DB, Elasticsearch, etc.) as the data source, you should delete any entries or indexes there to purge the data. Tip: The index name you created is shown in the Azure AI Studio and can be found in your search resource's overview. Removing that index or the entire search resource will delete the ingested data.

Delete Stored Files in Blob Storage: If your usage involved uploading files or crawling URLs (thereby storing files in a Blob Storage container), you'll want to delete those blobs as well. Navigate to the Azure Blob Storage account/container that was used for "Chat on your data" and delete the uploaded files or containers containing your data. For example, if you used the "Upload files (preview)" option, the files were stored in a container in the Azure Storage account you provided – you can delete those directly from the storage account. Likewise, for any web pages saved under webpage-<index name> containers, delete those containers or blobs via the Storage account in the Azure Portal or using Azure Storage Explorer.

Full Resource Deletion (optional): As an alternative cleanup method, you can delete the Azure resources or resource group that contain the data. For instance, if you created a dedicated Azure Cognitive Search service or storage account just for this feature, deleting those resources (or the whole resource group they reside in) will remove all stored data and associated indices in one go.
Note: Only use this approach if you're sure those resources aren't needed for anything else, as it is a broad action. Otherwise, stick to deleting the specific index or files as described above.

Verification: Once you have deleted the above, the model will no longer have access to your data. The next time you use "Chat on your data," it will not find any of the deleted content in the index, and thus cannot include it in answers. (Each query fetches data fresh from the connected index or vector store, so if the data is gone, nothing will be retrieved from it.)

Considerations and Limitations

No Automatic Deletion: Remember that Azure OpenAI will not auto-delete any data you've ingested. All data persists until you remove it. For example, if you remove a data source from the Studio UI or end your session, the configuration UI might forget it, but the actual index and files remain stored in your Azure resources. Always explicitly delete indexes, files, or threads to truly remove the data.

Preview Feature Caveats: "Chat on your data" (Azure OpenAI on your data) is currently a preview feature. Some management capabilities are still evolving. A known limitation was that the Azure AI Studio UI did not persist the data source connection between sessions – you'd have to reattach your index each time, even though the index itself continued to exist. This is being worked on, but it underscores that the UI might not show you all lingering data. Deleting via API/portal is the reliable way to ensure data is removed. Also, preview features might not support certain options like customer-managed keys for encryption of the stored data (the data is still encrypted at rest by Microsoft, but you may not be able to bring your own key in preview).

Data Location & Isolation: All data stored by this feature stays within your Azure OpenAI resource's region/geo and is isolated to your tenant. It is not shared with other customers or OpenAI – it remains private to your resource. So, deleting it is solely your responsibility and under your control. Microsoft confirms that the Assistants data storage adheres to compliance requirements like GDPR and CCPA, meaning you have the ability to delete personal data to meet those obligations.

Costs: There is no extra charge specifically for the Assistant "on your data" storage itself. The data being stored in a cognitive search index or blob storage will simply incur the normal Azure charges for those services (for example, Azure Cognitive Search indexing queries, or storage capacity usage). Deleting unused resources when you're done is wise to avoid ongoing charges. If you only delete the data (index/documents) but keep the search service running, you may still incur minimal costs for the service being available – consider deleting the whole search resource if you no longer need it.

Residual References: After deletion, any chat sessions or assistants that were using that data source will no longer find it. If you had an Assistant configured with a now-deleted vector store or index, you might need to update or recreate the assistant if you plan to use it again, as the old data source won't resolve. Clearing out the data ensures it's gone from future responses. (Each new question to the model will only retrieve from whatever data sources currently exist/are connected.)

In summary, the data you intentionally provide for Azure OpenAI's features (fine-tuning files, vector data, chat histories, etc.)
is stored at rest by design in your Azure OpenAI resource (within your tenant and region), and you can delete it at any time. This is separate from the content safety mechanisms. Content filtering doesn't retain data, and abuse monitoring would ordinarily store some flagged data for review – but since you have that disabled, no prompt or completion data is being stored for abuse monitoring now. All of these details are based on Microsoft's official documentation, ensuring your understanding is aligned with Azure OpenAI's data privacy guarantees and settings.

Azure OpenAI "Chat on your data" stores your content in Azure Search indexes and blob storage (within your own Azure environment or a managed store tied to your resource). This data remains until you take action to delete it. To remove your data, delete the chat threads (via API) and remove any associated indexes or files in Azure. There are no hidden copies once you do this – the system will not retain context from deleted data on the next chat run. Always double-check the relevant Azure resources (search and storage) to ensure all parts of your data are cleaned up. Following these steps, you can confidently use the feature while maintaining control over your data lifecycle.
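As a concrete illustration of the cleanup steps referenced above, the sketch below issues the thread-deletion call and removes an ingestion container. It is a minimal example, not official sample code: the resource name, key, thread ID, and container name are placeholders you would replace with your own values, and the endpoint shape follows the DELETE example quoted earlier (the API version may differ for your resource).

import requests
from azure.storage.blob import BlobServiceClient

# Placeholder values - substitute your own resource name, key, and IDs.
AOAI_RESOURCE = "<your-resource-name>"
AOAI_API_KEY = "<your-azure-openai-key>"
THREAD_ID = "<thread_id-to-delete>"

# 1) Delete an Assistants thread (endpoint format as shown above).
delete_url = (
    f"https://{AOAI_RESOURCE}.openai.azure.com/openai/threads/{THREAD_ID}"
    "?api-version=2024-08-01-preview"
)
resp = requests.delete(delete_url, headers={"api-key": AOAI_API_KEY})
print("Thread delete status:", resp.status_code)

# 2) Delete the Blob Storage container that held ingested web-page content.
storage_conn_str = "<your-storage-connection-string>"
blob_service = BlobServiceClient.from_connection_string(storage_conn_str)
blob_service.delete_container("webpage-<index name>")  # container name created during ingestion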
Beyond Prompts: How Agentic AI is Redefining Human-AI Collaboration

The Shift from Reactive to Proactive AI

As a passionate innovator in AI education, I'm on a mission to reimagine how we learn and build with AI—looking to craft intelligent agents that move beyond simple prompts to think, plan, and collaborate dynamically. Traditional AI systems rely heavily on prompt-based interactions—you ask a question, and the model responds. These systems are reactive, limited to single-turn tasks, and lack the ability to plan or adapt. This becomes a bottleneck in dynamic environments where tasks require multi-step reasoning, memory, and autonomy.

Agentic AI changes the game. An agent is a structured system that uses a looped process to:
- Think – analyze inputs, reason about tasks, and plan actions.
- Act – choose and execute tools to complete tasks.
- Learn – optionally adapt based on feedback or outcomes.

Unlike static workflows, agentic systems can:
- Make autonomous decisions
- Adapt to changing environments
- Collaborate with humans or other agents

This shift enables AI to move from being a passive assistant to an active collaborator—capable of solving complex problems with minimal human intervention.

What Is Agentic AI?

Agentic AI refers to AI systems that go beyond static responses—they can reason, plan, act, and adapt autonomously. These agents operate in dynamic environments, making decisions and invoking tools to achieve goals with minimal human intervention. Some of the frameworks that can be used for agentic AI include LangChain, Semantic Kernel, AutoGen, Crew AI, MetaGPT, etc. These frameworks can use Azure OpenAI, Anthropic Claude, Google Gemini, Mistral AI, Hugging Face Transformers, etc.

Key Traits of Agentic AI

Autonomy: Agents can independently decide what actions to take based on context and goals. Unlike assistants, which support users, agents complete tasks and drive outcomes.

Memory: Agents can retain both long-term and short-term context. This enables personalized and context-aware interactions across sessions.

Planning: Semantic Kernel agents use function calling to plan multi-step tasks. The AI can iteratively invoke functions, analyze results, and adjust its strategy—automating complex workflows.

Adaptability: Agents dynamically adjust their behavior based on user input, environmental changes, or feedback. This makes them suitable for real-world applications like task management, learning assistants, or research copilots.

Frameworks That Enable Agentic AI

- Semantic Kernel: A flexible framework for building agents with skills, memory, and orchestration. Supports plugins, planning, and multi-agent collaboration. More information here: Semantic Kernel Agent Architecture.
- Azure AI Foundry: A managed platform for deploying secure, scalable agents with built-in governance and tool integration. More information here: Exploring the Semantic Kernel Azure AI Agent.
- LangGraph: A JavaScript-compatible SDK for building agentic apps with memory and tool-calling capabilities, ideal for web-based applications. More information here: Agentic app with LangGraph or Azure AI Foundry (Node.js) - Azure App Service.
- Copilot Studio: A low-code platform to build custom copilots and agentic workflows using generative AI, plugins, and orchestration. Ideal for enterprise-grade conversational agents. More information here: Building your own copilot with Copilot Studio.
- Microsoft 365 Copilot: Embeds agentic capabilities directly into productivity apps like Word, Excel, and Teams—enabling contextual, multi-step assistance across workflows.
  More information here: What is Microsoft 365 Copilot?

Why It Matters: Real-World Impact

Traditional Generative AI is like a calculator—you input a question, and it gives you an answer. It's reactive, single-turn, and lacks context. While useful for quick tasks, it struggles with complexity, personalization, and continuity. Agentic AI, on the other hand, is like a smart teammate. It can:
- Understand goals
- Plan multi-step actions
- Remember past interactions
- Adapt to changing needs

Generative AI vs. Agentic Systems

Feature            | Generative AI        | Agentic AI
Interaction Style  | One-shot responses   | Multi-turn, goal-driven
Context Awareness  | Limited              | Persistent memory
Task Execution     | Static               | Dynamic and autonomous
Adaptability       | Low                  | High (based on feedback/input)

How Agentic AI Works — Agentic AI for Students Example

Imagine a student named Alice preparing for her final exams. She uses a Smart Study Assistant powered by Agentic AI. Here's how the agent works behind the scenes:

Skills / Functions: These are the actions, or callable units of logic, that the agent can invoke to perform tasks. The assistant has functions like:
- Summarize lecture notes
- Generate quiz questions
- Search academic papers
- Schedule study sessions
Think of these as plug-and-play capabilities the agent can call when needed.

Memory: The agent remembers Alice's:
- Past quiz scores
- Topics she struggled with
- Preferred study times
This helps the assistant personalize recommendations and avoid repeating content she already knows.

Planner: Instead of doing everything at once, the agent:
- Breaks down Alice's goal ("prepare for exams") into steps
- Plans a week-by-week study schedule
- Decides which skills/functions to use at each stage
It's like having a tutor who builds a custom roadmap.

Orchestrator: This is the brain that coordinates everything. It decides when to use memory, which function to call, and how to adjust the plan if Alice misses a study session or scores low on a quiz. It ensures the agent behaves intelligently and adapts in real time.

Conclusion

Agentic AI marks a pivotal shift in how we interact with intelligent systems—from passive assistants to proactive collaborators. As we move beyond prompts, we unlock new possibilities for autonomy, adaptability, and human-AI synergy. Whether you're a developer, educator, or strategist, understanding agentic frameworks is no longer optional - it's foundational. Here are the high-level steps to get started with Agentic AI using only official Microsoft resources, each with a direct link to the relevant documentation:

Get Started with Agentic AI

1. Understand Agentic AI Concepts - Begin by learning the fundamentals of AI agents, their architecture, and use cases. See: Explore the basics in this Microsoft Learn module
2. Set Up Your Azure Environment - Create an Azure account and ensure you have the necessary roles (e.g., Azure AI Account Owner or Contributor). See: Quickstart guide for Azure AI Foundry Agent Service
3. Create Your First Agent in Azure AI Foundry - Use the Foundry portal to create a project and deploy a default agent. Customize it with instructions and test it in the playground. See: Step-by-step agent creation in Azure AI Foundry
4. Build an Agentic Web App with Semantic Kernel or Foundry - Follow a hands-on tutorial to integrate agentic capabilities into a .NET web app using Semantic Kernel or Azure AI Foundry. See: Tutorial: Build an agentic app with Semantic Kernel or Foundry
5. Deploy and Test Your Agent - Use GitHub Codespaces or Azure Developer CLI to deploy your app and connect it to your agent.
   Validate functionality using OpenAPI tools and the agent playground. See: Deploy and test your agentic app

For Further Learning:
- Develop generative AI apps with Azure OpenAI and Semantic Kernel
- Agentic app with Semantic Kernel or Azure AI Foundry (.NET) - Azure App Service
- AI Agent Orchestration Patterns - Azure Architecture Center
- Configuring Agents with Semantic Kernel Plugins
- Workflows with AI Agents and Models - Azure Logic Apps

About the author: I'm Juliet Rajan, a Lead Technical Trainer and passionate innovator in AI education. I specialize in crafting gamified, visionary learning experiences and building intelligent agents that go beyond traditional prompt-based systems. My recent work explores agentic AI, autonomous copilots, and dynamic human-AI collaboration using platforms like Azure AI Foundry and Semantic Kernel.

Building custom AI Speech models with Phi-3 and Synthetic data
Introduction

In today's landscape, speech recognition technologies play a critical role across various industries—improving customer experiences, streamlining operations, and enabling more intuitive interactions. With Azure AI Speech, developers and organizations can easily harness powerful, fully managed speech functionalities without requiring deep expertise in data science or speech engineering. Core capabilities include:
- Speech to Text (STT)
- Text to Speech (TTS)
- Speech Translation
- Custom Neural Voice
- Speaker Recognition

Azure AI Speech supports over 100 languages and dialects, making it ideal for global applications. Yet, for certain highly specialized domains—such as industry-specific terminology, specialized technical jargon, or brand-specific nomenclature—off-the-shelf recognition models may fall short. To achieve the best possible performance, you'll likely need to fine-tune a custom speech recognition model. This fine-tuning process typically requires a considerable amount of high-quality, domain-specific audio data, which can be difficult to acquire.

The Data Challenge: When training datasets lack sufficient diversity or volume—especially in niche domains or underrepresented speech patterns—model performance can degrade significantly. This not only impacts transcription accuracy but also hinders the adoption of speech-based applications. For many developers, sourcing enough domain-relevant audio data is one of the most challenging aspects of building high-accuracy, real-world speech solutions.

Addressing Data Scarcity with Synthetic Data

A powerful solution to data scarcity is the use of synthetic data: audio files generated artificially using TTS models rather than recorded from live speakers. Synthetic data helps you quickly produce large volumes of domain-specific audio for model training and evaluation. By leveraging Microsoft's Phi-3.5 model and Azure's pre-trained TTS engines, you can generate target-language, domain-focused synthetic utterances at scale—no professional recording studio or voice actors needed.

What is Synthetic Data?

Synthetic data is artificial data that replicates patterns found in real-world data without exposing sensitive details. It's especially beneficial when real data is limited, protected, or expensive to gather. Use cases include:
- Privacy Compliance: Train models without handling personal or sensitive data.
- Filling Data Gaps: Quickly create samples for rare scenarios (e.g., specialized medical terms, unusual accents) to improve model accuracy.
- Balancing Datasets: Add more samples to underrepresented classes, enhancing fairness and performance.
- Scenario Testing: Simulate rare or costly conditions (e.g., edge cases in autonomous driving) for more robust models.

By incorporating synthetic data, you can fine-tune custom STT (Speech to Text) models even when your access to real-world domain recordings is limited. Synthetic data allows models to learn from a broader range of domain-specific utterances, improving accuracy and robustness.

Overview of the Process

This blog post provides a step-by-step guide—supported by code samples—to quickly generate domain-specific synthetic data with Phi-3.5 and Azure AI Speech TTS, then use that data to fine-tune and evaluate a custom speech-to-text model.
We will cover steps 1–4 of the high-level architecture:

End-to-End Custom Speech-to-Text Model Fine-Tuning Process

Custom Speech with Synthetic data Hands-on Labs: GitHub Repository

Step 0: Environment Setup

First, configure a .env file based on the provided sample.env template to suit your environment. You'll need to:
- Deploy the Phi-3.5 model as a serverless endpoint on Azure AI Foundry.
- Provision Azure AI Speech and an Azure Storage account.

Below is a sample configuration focusing on creating a custom Italian model:

# this is a sample for keys used in this code repo.
# Please rename it to .env before you can use it

# Azure Phi3.5
AZURE_PHI3.5_ENDPOINT=https://aoai-services1.services.ai.azure.com/models
AZURE_PHI3.5_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AZURE_PHI3.5_DEPLOYMENT_NAME=Phi-3.5-MoE-instruct

# Azure AI Speech
AZURE_AI_SPEECH_REGION=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AZURE_AI_SPEECH_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=stt
CUSTOM_SPEECH_LANG=Italian
CUSTOM_SPEECH_LOCALE=it-IT
# https://speech.microsoft.com/portal?projecttype=voicegallery
TTS_FOR_TRAIN=it-IT-BenignoNeural,it-IT-CalimeroNeural,it-IT-CataldoNeural,it-IT-FabiolaNeural,it-IT-FiammaNeural
TTS_FOR_EVAL=it-IT-IsabellaMultilingualNeural

# Azure Account Storage
AZURE_STORAGE_ACCOUNT_NAME=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AZURE_STORAGE_ACCOUNT_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AZURE_STORAGE_CONTAINER_NAME=stt-container

Key Settings Explained:
- AZURE_PHI3.5_ENDPOINT / AZURE_PHI3.5_API_KEY / AZURE_PHI3.5_DEPLOYMENT_NAME: Access credentials and the deployment name for the Phi-3.5 model.
- AZURE_AI_SPEECH_REGION: The Azure region hosting your Speech resources.
- CUSTOM_SPEECH_LANG / CUSTOM_SPEECH_LOCALE: Specify the language and locale for the custom model.
- TTS_FOR_TRAIN / TTS_FOR_EVAL: Comma-separated voice names (from the Voice Gallery) for generating synthetic speech for training and evaluation.
- AZURE_STORAGE_ACCOUNT_NAME / KEY / CONTAINER_NAME: Configurations for your Azure Storage account, where training/evaluation data will be stored.

> Voice Gallery

Step 1: Generating Domain-Specific Text Utterances with Phi-3.5

Use the Phi-3.5 model to generate custom textual utterances in your target language and English. These utterances serve as a seed for synthetic speech creation. By adjusting your prompts, you can produce text tailored to your domain (such as call center Q&A for a tech brand).

Code snippet (illustrative):

topic = f"""
Call center QnA related expected spoken utterances for {CUSTOM_SPEECH_LANG} and English languages.
"""

question = f"""
create 10 lines of jsonl of the topic in {CUSTOM_SPEECH_LANG} and english.
jsonl format is required. use 'no' as number and '{CUSTOM_SPEECH_LOCALE}', 'en-US' keys for the languages.
only include the lines as the result. Do not include ```jsonl, ``` and blank line in the result.
"""

response = client.complete(
    messages=[
        SystemMessage(content="""
        Generate plain text sentences of #topic# related text to improve the recognition of domain-specific words and phrases.
        Domain-specific words can be uncommon or made-up words, but their pronunciation must be straightforward to be recognized.
        Use text data that's close to the expected spoken utterances.
        The number of utterances per line should be 1.
        """),
        UserMessage(content=f"""
        #topic#: {topic}
        Question: {question}
        """),
    ],
    ...
)
content = response.choices[0].message.content
print(content)  # Prints the generated JSONL with no, locale, and content keys

Sample Output (Contoso Electronics in Italian):

{"no":1,"it-IT":"Come posso risolvere un problema con il mio televisore Contoso?","en-US":"How can I fix an issue with my Contoso TV?"}
{"no":2,"it-IT":"Qual è la garanzia per il mio smartphone Contoso?","en-US":"What is the warranty for my Contoso smartphone?"}
{"no":3,"it-IT":"Ho bisogno di assistenza per il mio tablet Contoso, chi posso contattare?","en-US":"I need help with my Contoso tablet, who can I contact?"}
{"no":4,"it-IT":"Il mio laptop Contoso non si accende, cosa posso fare?","en-US":"My Contoso laptop won't turn on, what can I do?"}
{"no":5,"it-IT":"Posso acquistare accessori per il mio smartwatch Contoso?","en-US":"Can I buy accessories for my Contoso smartwatch?"}
{"no":6,"it-IT":"Ho perso la password del mio router Contoso, come posso recuperarla?","en-US":"I forgot my Contoso router password, how can I recover it?"}
{"no":7,"it-IT":"Il mio telecomando Contoso non funziona, come posso sostituirlo?","en-US":"My Contoso remote control isn't working, how can I replace it?"}
{"no":8,"it-IT":"Ho bisogno di assistenza per il mio altoparlante Contoso, chi posso contattare?","en-US":"I need help with my Contoso speaker, who can I contact?"}
{"no":9,"it-IT":"Il mio smartphone Contoso si surriscalda, cosa posso fare?","en-US":"My Contoso smartphone is overheating, what can I do?"}
{"no":10,"it-IT":"Posso acquistare una copia di backup del mio smartwatch Contoso?","en-US":"Can I buy a backup copy of my Contoso smartwatch?"}

These generated lines give you a domain-oriented textual dataset, ready to be converted into synthetic audio.

Step 2: Creating the Synthetic Audio Dataset

Using the generated utterances from Step 1, you can now produce synthetic speech WAV files using Azure AI Speech's TTS service. This bypasses the need for real recordings and allows quick generation of numerous training samples.

Core Function:

def get_audio_file_by_speech_synthesis(text, file_path, lang, default_tts_voice):
    ssml = f"""<speak version='1.0' xmlns="https://www.w3.org/2001/10/synthesis" xml:lang='{lang}'>
    <voice name='{default_tts_voice}'>
    {html.escape(text)}
    </voice>
    </speak>"""
    speech_synthesis_result = speech_synthesizer.speak_ssml_async(ssml).get()
    stream = speechsdk.AudioDataStream(speech_synthesis_result)
    stream.save_to_wav_file(file_path)

Execution: For each generated text line, the code produces multiple WAV files (one per specified TTS voice). It also creates a manifest.txt for reference and a zip file containing all the training data.

Note: If DELETE_OLD_DATA = True, the training_dataset folder resets each run. If you're mixing synthetic data with real recorded data, set DELETE_OLD_DATA = False to retain previously curated samples.
Code snippet (illustrative):

import zipfile
import shutil

DELETE_OLD_DATA = True

train_dataset_dir = "train_dataset"
if not os.path.exists(train_dataset_dir):
    os.makedirs(train_dataset_dir)

if DELETE_OLD_DATA:
    for file in os.listdir(train_dataset_dir):
        os.remove(os.path.join(train_dataset_dir, file))

timestamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
zip_filename = f'train_{lang}_{timestamp}.zip'
with zipfile.ZipFile(zip_filename, 'w') as zipf:
    for file in files:
        zipf.write(os.path.join(output_dir, file), file)

print(f"Created zip file: {zip_filename}")

shutil.move(zip_filename, os.path.join(train_dataset_dir, zip_filename))
print(f"Moved zip file to: {os.path.join(train_dataset_dir, zip_filename)}")

train_dataset_path = os.path.join(train_dataset_dir, zip_filename)
%store train_dataset_path

You'll also similarly create evaluation data using a different TTS voice than used for training to ensure a meaningful evaluation scenario.

Example snippet to create the synthetic evaluation data:

import datetime

print(TTS_FOR_EVAL)

languages = [CUSTOM_SPEECH_LOCALE]
eval_output_dir = "synthetic_eval_data"
DELETE_OLD_DATA = True

if not os.path.exists(eval_output_dir):
    os.makedirs(eval_output_dir)

if DELETE_OLD_DATA:
    for file in os.listdir(eval_output_dir):
        os.remove(os.path.join(eval_output_dir, file))

eval_tts_voices = TTS_FOR_EVAL.split(',')

for tts_voice in eval_tts_voices:
    with open(synthetic_text_file, 'r', encoding='utf-8') as f:
        for line in f:
            try:
                expression = json.loads(line)
                no = expression['no']
                for lang in languages:
                    text = expression[lang]
                    timestamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
                    file_name = f"{no}_{lang}_{timestamp}.wav"
                    get_audio_file_by_speech_synthesis(text, os.path.join(eval_output_dir, file_name), lang, tts_voice)
                    with open(f'{eval_output_dir}/manifest.txt', 'a', encoding='utf-8') as manifest_file:
                        manifest_file.write(f"{file_name}\t{text}\n")
            except json.JSONDecodeError as e:
                print(f"Error decoding JSON on line: {line}")
                print(e)

Step 3: Creating and Training a Custom Speech Model

To fine-tune and evaluate your custom model, you'll interact with Azure's Speech-to-Text APIs:
1. Upload your dataset (the zip file created in Step 2) to your Azure Storage container.
2. Register your dataset as a Custom Speech dataset.
3. Create a Custom Speech model using that dataset.
4. Create evaluations using that custom model with asynchronous calls until they complete.

You can also use UI-based approaches to customize a speech model with fine-tuning in the Azure AI Foundry portal, but in this hands-on we'll use the Azure Speech-to-Text REST APIs to iterate through the entire process.

Key APIs & References:
- Azure Speech-to-Text REST APIs (v3.2)
- The provided common.py in the hands-on repo abstracts API calls for convenience.
Example snippet to create the training dataset:

uploaded_files, url = upload_dataset_to_storage(data_folder, container_name, account_name, account_key)

kind = "Acoustic"
display_name = "acoustic dataset(zip) for training"
description = f"[training] Dataset for fine-tuning the {CUSTOM_SPEECH_LANG} base model"

zip_dataset_dict = {}
for display_name in uploaded_files:
    zip_dataset_dict[display_name] = create_dataset(base_url, headers, project_id, url[display_name], kind, display_name, description, CUSTOM_SPEECH_LOCALE)

You can monitor training progress using the monitor_training_status function, which polls the model's status and updates you once training completes.

Core Function:

def monitor_training_status(custom_model_id):
    with tqdm(total=3, desc="Running Status", unit="step") as pbar:
        status = get_custom_model_status(base_url, headers, custom_model_id)
        if status == "NotStarted":
            pbar.update(1)
        while status != "Succeeded" and status != "Failed":
            if status == "Running" and pbar.n < 2:
                pbar.update(1)
            print(f"Current Status: {status}")
            time.sleep(10)
            status = get_custom_model_status(base_url, headers, custom_model_id)
        while pbar.n < 3:
            pbar.update(1)
        print("Training Completed")

Step 4: Evaluate the Trained Custom Speech Model

After training, create an evaluation job using your synthetic evaluation dataset. With the custom model now trained, compare its performance (measured by Word Error Rate, WER) against the base model's WER.

Key Steps:
- Use the create_evaluation function to evaluate the custom model against your test set.
- Compare evaluation metrics between the base and custom models.
- Check WER to quantify accuracy improvements.

After evaluation, you can view the evaluation results of the base model and the fine-tuned model on the evaluation dataset created in the 1_text_data_generation.ipynb notebook in either Speech Studio or the AI Foundry fine-tuning section, depending on the resource location you specified in the configuration file.

Example snippet to create an evaluation:

description = f"[{CUSTOM_SPEECH_LOCALE}] Evaluation of the {CUSTOM_SPEECH_LANG} base and custom model"
evaluation_ids = {}
for display_name in uploaded_files:
    evaluation_ids[display_name] = create_evaluation(base_url, headers, project_id, dataset_ids[display_name], base_model_id, custom_model_with_acoustic_id, f'vi_eval_base_vs_custom_{display_name}', description, CUSTOM_SPEECH_LOCALE)

Also, you can see a simple Word Error Rate (WER) number from the code below, which you can utilize in 4_evaluate_custom_model.ipynb.

Example snippet to create a WER dataframe:

# Collect WER results for each dataset
wer_results = []
eval_title = "Evaluation Results for base model and custom model: "
for display_name in uploaded_files:
    eval_info = get_evaluation_results(base_url, headers, evaluation_ids[display_name])
    eval_title = eval_title + display_name + " "
    wer_results.append({
        'Dataset': display_name,
        'WER_base_model': eval_info['properties']['wordErrorRate1'],
        'WER_custom_model': eval_info['properties']['wordErrorRate2'],
    })

# Create a DataFrame to display the results
print(eval_info)
wer_df = pd.DataFrame(wer_results)
print(eval_title)
print(wer_df)

About WER: WER is computed as (Insertions + Deletions + Substitutions) / Total Words. A lower WER signifies better accuracy. Synthetic data can help reduce WER by introducing more domain-specific terms during training.

You'll also similarly create a WER result markdown file using the md_table_scoring_result method below.
Core Function:

# Create a markdown file for table scoring results
md_table_scoring_result(base_url, headers, evaluation_ids, uploaded_files)

Implementation Considerations

The provided code and instructions serve as a baseline for automating the creation of synthetic data and fine-tuning Custom Speech models. The WER numbers you get from model evaluation will also vary depending on the actual domain. Real-world scenarios may require adjustments, such as incorporating real data or customizing the training pipeline for specific domain needs. Feel free to extend or modify this baseline to better match your use case and improve model performance.

Conclusion

By combining Microsoft's Phi-3.5 model with Azure AI Speech TTS capabilities, you can overcome data scarcity and accelerate the fine-tuning of domain-specific speech-to-text models. Synthetic data generation makes it possible to:
- Rapidly produce large volumes of specialized training and evaluation data.
- Substantially reduce the time and cost associated with recording real audio.
- Improve speech recognition accuracy for niche domains by augmenting your dataset with diverse synthetic samples.

As you continue exploring Azure's AI and speech services, you'll find more opportunities to leverage generative AI and synthetic data to build powerful, domain-adapted speech solutions—without the overhead of large-scale data collection efforts. 🙂

Reference
- Azure AI Speech Overview
- Microsoft Phi-3 Cookbook
- Text to Speech Overview
- Speech to Text Overview
- Custom Speech Overview
- Customize a speech model with fine-tuning in the Azure AI Foundry
- Scaling Speech-Text Pre-Training with Synthetic Interleaved Data (arXiv)
- Training TTS Systems from Synthetic Data: A Practical Approach for Accent Transfer (arXiv)
- Generating Data with TTS and LLMs for Conversational Speech Recognition (arXiv)

Autonomous Visual Studio Code Desktop Automation using Computer Use Agent & PyAutoGUI
This blog post introduces an approach to desktop automation that combines Computer Use Agent (CUA) technology with PyAutoGUI to create fully autonomous VS Code workflows. The solution addresses critical gaps in existing automation tools by providing intelligent, screenshot-based decision-making for scenarios where traditional GitHub-hosted solutions fall short.
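The post itself walks through the full implementation; as a rough illustration of the pattern it describes, the sketch below captures a screenshot with PyAutoGUI and hands it to a decision step before acting. The decide_next_action function is a hypothetical placeholder for the Computer Use Agent call, not an API from the article.

import pyautogui

def decide_next_action(screenshot):
    """Hypothetical stand-in for the Computer Use Agent: given a screenshot,
    return an action dict such as {"type": "click", "x": 120, "y": 300} or {"type": "done"}."""
    raise NotImplementedError  # replace with a call to your CUA model

# Screenshot -> decide -> act loop: the core idea behind screenshot-based automation
while True:
    screenshot = pyautogui.screenshot()            # capture the current desktop state
    action = decide_next_action(screenshot)        # ask the agent what to do next
    if action["type"] == "done":
        break
    elif action["type"] == "click":
        pyautogui.click(action["x"], action["y"])  # perform the click the agent chose
    elif action["type"] == "type":
        pyautogui.typewrite(action["text"], interval=0.02)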
Fine-Tuning Together: How Our Developer Community Is Shaping the Future of Custom AI

In a world where foundation models are increasingly accessible, the real magic happens when developers make them their own. Fine-tuning is no longer a niche capability; it's becoming a core skill for developers who want to build AI that's faster, smarter, and more aligned with their users, scaling expert knowledge across their organizations. Over the past few months, we've seen something remarkable: a growing community of builders, tinkerers, and innovators coming together to push the boundaries of what fine-tuning can do, making a powerful difference for everyday organizations.

A Community Making a Big Impact: Customer Stories

At Build 2025, we saw firsthand how much the landscape has shifted. Just a year ago, many teams were still relying solely on prompt engineering or retrieval-augmented generation (RAG). Today, nearly half of developers say they're actively exploring fine-tuning. Why? Because it gives them something no off-the-shelf model can: the ability to embed their domain knowledge, tone, and logic directly into the model itself.

In our breakout session led by Product Leaders Alicia and Omkar, they dove into fine-tuning and distillation with Azure AI Foundry. We heard from the Oracle Health team about how they are restoring joy to providing patient care by relieving administrative burden. The Oracle Health team used fine-tuned GPT-4o-mini models to power a clinical AI agent that responds in under 800 milliseconds—fast enough to keep up with real-time healthcare workflows.

Earlier this year, we saw DraftWise use reinforcement fine-tuning on o-series reasoning models within Azure AI Foundry to tailor model behavior for legal-specific tasks. This approach allowed them to sharpen responses based on proprietary legal data, improving the quality of contract drafting and review. The fine-tuned models contributed to a 30% improvement in search result quality, enabling faster, more accurate legal drafting and review at scale.

And we watched CoStar Group deliver a low-latency, voice-driven home search experience that scales to over 100 million monthly users, while reducing token usage, improving cost-efficiency, and accelerating time to deployment by combining GPT-4o Realtime API audio models and Mistral 3B on Azure AI Foundry.

These aren't just technical wins - they're community wins. They show what's possible when developers have the right tools and support to build AI that truly fits their needs.

What's New in Azure AI Foundry: Tools That Empower, Not Overwhelm

The last 3 months have brought a wave of updates designed to make fine-tuning more accessible, more affordable, and more powerful—especially for developers who are just getting started.

One of the most exciting additions is Reinforcement Fine-Tuning (RFT), now available in public preview with the o4-mini model. Unlike traditional supervised fine-tuning, RFT lets you teach models how to reason through complex tasks using reward signals. It's ideal for domains where logic and nuance matter—like law, finance, or healthcare—and it's already helping teams like DraftWise build smarter, more adaptive systems. Watch this o4-mini demo to learn more.

We also introduced Global Training, which lets you fine-tune models from any of 24 Azure OpenAI regions. This means no more guessing which region supports which model, significantly lowering the barrier to entry for model customization.

One of the most common questions we hear from developers is: "How do I know if my fine-tuned model is actually better?" The new Evaluation API is our answer.
This API lets you programmatically evaluate model outputs using model-based graders, custom rubrics, and structured scoring—all from code.

And for developers who want to experiment without breaking the bank, we launched the Developer Tier. It's a new way to deploy fine-tuned models for free (for 24 hours), paying only for tokens at the same rate as base models. It's ideal for A/B testing, distillation experiments, or just kicking the tires on a new idea.

Learning Together: From Distillation to Deployment

One of the most powerful trends we've seen is the rise of distillation. Developers are using larger "teacher" models like GPT-4o to generate high-quality outputs, then fine-tuning smaller "student" models like GPT-4.1-mini or nano to replicate that performance at a fraction of the cost.

This is now supported end-to-end in Azure AI Foundry. You can generate completions, store them automatically, fine-tune your student model, and evaluate it using model-based graders—all in one place. And the results speak for themselves. In one demo at Build, we saw a distilled 4o model go from 35% to 90% accuracy just by learning from the outputs of a larger o3 model.

Let's Keep Building: Join Us for Model Mondays | Mondays at 10:30 AM PT

We're excited about what's next. More models. More techniques. More impact from developers like you. If you're already fine-tuning, we'd love to hear what you're working on. And if you're just getting started, we're here to help with Model Mondays. Model Mondays is your weekly dose of AI model magic: an hour of livestreamed demos, interactive developer-friendly deep dives, and just enough chaos to keep Mondays interesting. Watch the latest fine-tuning & distillation episode with Dave Voutila, Microsoft PM.

Check out these Resources
🧠 Get Started with fine-tuning with Azure AI Foundry on Microsoft Learn Docs
▶️ Watch On-Demand: Fine-tuning and distillation with Azure AI Foundry
👩💻 Fine-tune GPT-4o-mini model with this tutorial
👋 Continue the conversation on Discord

Automate a multi-step business process, using turnkey MCP, Logic App Integration in AI Foundry
Let's Keep Building: Join Us for Model Mondays | Mondays at 10:30 AM PT

We're excited about what's next. More models. More techniques. More impact from developers like you. If you're already fine-tuning, we'd love to hear what you're working on. And if you're just getting started, we're here to help with Model Mondays. Model Mondays is your weekly dose of AI model magic: an hour of livestreamed demos, interactive developer-friendly deep dives, and just enough chaos to keep Mondays interesting. Watch the latest fine-tuning & distillation episode with Dave Voutila, Microsoft PM.

Check out these Resources
🧠 Get Started with fine-tuning with Azure AI Foundry on Microsoft Learn Docs
▶️ Watch On-Demand: Fine-tuning and distillation with Azure AI Foundry
👩💻 Fine-tune GPT-4o-mini model with this tutorial
👋 Continue the conversation on Discord

Automate a multi-step business process, using turnkey MCP, Logic App Integration in AI Foundry

This article walks you through an application for Procure-to-Pay (P2P) anomaly detection using Azure AI Foundry Agent Service. It analyzes purchase invoice images to detect procurement anomalies and compliance issues. The key capabilities of Azure AI Foundry it showcases are:

- An Agent Service that performs a multi-step business process with only natural language instructions; the application has little to no business logic in its code.
- Invocation of the variety of automated tool actions required during the multi-step business process, showcasing the recently announced turnkey integration with MCP Servers and Azure Logic Apps. In addition, it uses (a) visual reasoning over images using GPT-4o models, (b) turnkey vector search, and (c) reasoning to apply business rules and perform P2P anomaly detection.
- The ability of the Agent Service in Azure AI Foundry to:
  - orchestrate the business process from start to end through a single call from the application to the Agent Service, sequencing the steps autonomously based on the instructions in natural language; the client application itself contains no business logic in code.
  - handle application state between different tool calls, determining what to extract from the output of one tool call to pass as input to the next.

Architecture of the Solution

Shown below is the architecture of the Solution. The Client Application is a Python-based console app that calls the P2P Anomaly Detection Agent running in the Azure AI Foundry Agent Service.

Solution Components

| Sr. No | Entity | Entity Purpose or Description |
|--------|--------|-------------------------------|
| 1 | Foundry Agent | The autonomous Agent that implements the P2P anomaly detection process end to end. It orchestrates all the steps in the business process through a single API call, handling application state between different tool calls and sequencing operations to achieve the business objective. |
| 2 | Visual Reasoning Tool Call | Analyzes the input Purchase Invoice image using GPT-4o's vision capabilities to extract the Purchase Invoice Header, Invoice Lines, Supplier ID, and Contract ID. This extracted information is then used in subsequent calls to the Logic App for contract validation. |
| 3 | Azure Logic App Tool Call | Called by the Agent via HTTP Trigger with the Supplier ID and Contract ID extracted from the invoice. Returns matching contract data by executing dynamic SQL queries on an Azure SQL Database, providing contract header and line item details for comparison with the invoice. |
| 4 | Vector Search Tool Call | Retrieves the business rules that apply to P2P anomaly detection in an enterprise environment. Uses the configured vector store to search through the uploaded business rules document (p2p-rules.txt) to find relevant compliance and validation criteria. |
| 5 | Reasoning | Applies the retrieved business rules to evaluate the Purchase Invoice against the Contract data. Determines if there are anomalies or compliance issues and generates a detailed verification report with verdicts, comparisons, and recommendations. |
| 6 | MCP Server Tool Call | Catalogs and stores the generated reports in Azure Blob Storage. The MCP Server provides the protocol-specific implementation to connect to the Blob Storage service, performing lifecycle actions like creating containers, listing containers, and uploading blobs. The Agent uses the MCP Server to upload the markdown verification report to a designated container. |
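For readers who want to see how an agent like this is wired up, the sketch below shows one plausible way to create the agent and attach its knowledge tool with the azure-ai-agents SDK. The project endpoint, model deployment name, instruction text, and vector store ID are assumptions, and the Logic App and MCP tool registrations (available as turnkey integrations in the Foundry portal) are indicated only as comments rather than exact SDK calls.

```python
import os
from azure.identity import DefaultAzureCredential
from azure.ai.agents import AgentsClient
from azure.ai.agents.models import FileSearchTool

# Assumed project endpoint, e.g. "https://<resource>.services.ai.azure.com/api/projects/<project>"
agents_client = AgentsClient(
    endpoint=os.environ["PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

# Vector store holding the uploaded p2p-rules.txt business rules document.
# (Creating the store and uploading the file are omitted here.)
file_search = FileSearchTool(vector_store_ids=[os.environ["P2P_RULES_VECTOR_STORE_ID"]])

agent = agents_client.create_agent(
    model="gpt-4o",                      # model deployment name (assumed)
    name="p2p-anomaly-detection-agent",  # illustrative name
    instructions="<the natural language P2P instructions shown later in this article>",
    tools=file_search.definitions,
    tool_resources=file_search.resources,
    # The Azure Logic App (HTTP trigger) and MCP Server tools would be added to the
    # agent's tool list as well, via the turnkey integrations in the Foundry portal
    # or their corresponding SDK tool definitions.
)
print(f"Created agent: {agent.id}")
```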
Some of the key aspects of the Solution are:

1) There is only one call from the Client Application to the Agent in Azure AI Foundry; the rest of the steps are performed by the Agent Service autonomously. The code snippet below shows the client application passing the user-provided invoice image and user instruction to the Agent in Azure AI Foundry.

2) There is no business logic in the code whatsoever, and there is no code written to perform any of the tool actions either. These are carried out autonomously by the Agent itself.

    def process_anomaly_detection(self, user_prompt: str, image_path: str, verbose: bool = True) -> str:
        """
        Process the anomaly detection request with proper error handling.

        Args:
            user_prompt: User's prompt for anomaly detection
            image_path: Path to the invoice image
            verbose: Whether to print progress messages

        Returns:
            Response from the AI agent

        Raises:
            Exception: For various processing errors
        """
        try:
            # Create message content
            self._show_progress("Creating message content...", verbose)
            content_blocks = self.create_message_content(user_prompt, image_path)

            # Create a new thread for this conversation
            self._show_progress("Creating conversation thread...", verbose)
            thread = self.client.threads.create()
            logger.info(f"Created thread, ID: {thread.id}")

            # Create the message
            self._show_progress("Sending message to agent...", verbose)
            message = self.client.messages.create(
                thread_id=thread.id,
                role="user",
                content=content_blocks
            )

            # Run the agent with proper error handling
            self._show_progress("Processing with AI agent (this may take a moment)...", verbose)
            run = self.client.runs.create_and_process(
                thread_id=thread.id,
                agent_id=self.agent.id
            )

            if run.status == "failed":
                error_msg = f"Agent run failed: {run.last_error}"
                logger.error(error_msg)
                raise Exception(error_msg)

            # Retrieve and process messages
            self._show_progress("Retrieving analysis results...", verbose)
            messages = self.client.messages.list(
                thread_id=thread.id,
                order=ListSortOrder.ASCENDING
            )

            # Extract the agent's response
            agent_response = ""
            for message in messages:
                if message.role == "assistant" and message.text_messages:
                    agent_response = message.text_messages[-1].text.value
                    break

            if not agent_response:
                raise Exception("No response received from the agent")

            self._show_progress("Analysis complete!", verbose)
            logger.info("Successfully processed anomaly detection request")
            return agent_response

        except Exception as e:
            logger.error(f"Error processing anomaly detection: {e}")
            raise
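For context, this method would be invoked from the console client roughly as follows. The enclosing class name and its constructor are not shown in the snippet above, so treat them as assumptions; the prompt and image path mirror the run shown later in this article.

```python
# Hypothetical driver code: the class name and constructor are assumptions,
# since only the process_anomaly_detection method is shown above.
client_app = P2PAnomalyDetectionClient()  # initializes the AgentsClient and agent reference

response = client_app.process_anomaly_detection(
    user_prompt=(
        "can you perform the procure to pay anomaly detection based on the instructions "
        "you have been provided with and give me a detailed response if this Purchase "
        "Invoice Image attached aligns with the Contract?"
    ),
    image_path="data_files/Invoice-002.png",
)
print(response)  # the markdown P2P verification report returned by the agent
```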
Note: The steps to configure the Solution are covered in the GitHub Repo link shared later in this article, and hence are not discussed here.

Agent Instructions

As mentioned already, the Client application does not contain any business logic. The latter is provided as natural language instructions to the Foundry Agent. Shown below are the instructions configured on the P2P Agent.

This is a Procure to Pay process. You will be provided with the Purchase Invoice image as input.
Note the sequence of execution you must adhere to strictly:
- Step 2 must be performed only after Step 1 is performed
- Step 3 below must be performed only after Step 1 and Step 2 are completed.
- Step 5 must be performed only after Step 1, Step 2, Step 3 and Step 4 have been performed successfully.
- Step 6 must be performed only after Step 5 is performed

Step 1: As a first step, you will extract the Contract ID and Supplier ID from the Purchase Invoice image along with all the line items from the Invoice in the form of a table.
Step 2: You will then use the function tool by passing the Contract ID and Supplier ID to retrieve the contract details.
Step 3: You will then use the file search tool to retrieve the business rules applicable to detection of anomalies in the Procure to Pay process.
Step 4: Then, apply the retrieved business rules to match the invoice line items with the contract details fetched from the system, and detect anomalies if any.
Step 5: Prepare a detailed 'p2p verification report' in markdown with the following content:
- Verdict: Whether the Purchase Invoice complies with the Contract?
- Purchase Invoice details: Invoice Header and Invoice Lines, in Markdown Table Format
- Contract Details: Contract Header and Contract Lines, in Markdown Table Format
- P2P Business Rules: The Rules that were retrieved using the File Search Tool
- Reasoning behind the verdict: Provide a detailed reasoning why you think the Invoice aligns with the Contract, yes or no. Use a Markdown Table format to compare each item in the Invoice with the Contract and indicate the basis for your judgement. Use icons/images to embellish the report and make it easy for comprehension by Business users. They must be able to quickly give this a once over and commit the Invoice into the System.
Step 6: You will use the MCP tool available with you to upload the 'p2p verification report' created in Step 5.
- Choose a container with name 'p2p-anomaly-detection-outcomes' to upload to. If the container with this name does not exist, create one.
- The name of the Report must start with the Invoice Number, appended with a hyphen and appended with a guid. E.g. Invoice001-001.md
- Secure the name of the Report document uploaded to the Blob Storage account through the MCP Tool
Step 7: Wait till Step 6 completes, then return the content of the Markdown document that was uploaded to Blob Storage.

Output from the Run

(.venv) PS C:\Users\sansri\agentic-ai-service-samples\p2p-anomaly-detection-agent> python p2pagent.py
2025-07-04 09:04:22,149 - __main__ - INFO - Successfully initialized Azure AI Agents client
2025-07-04 09:04:22,149 - INFO - Successfully initialized Azure AI Agents client

=== P2P Anomaly Detection Agent ===
This tool analyzes invoice images for procurement anomalies.

Enter the path to your invoice image: data_files/Invoice-002.png
Enter your analysis prompt (or press Enter for default): can you perform the procure to pay anomaly detection based on the instructions you have been provided with and give me a detailed response if this Purchase Invoice Image attached aligns with the Contract?

Processing image: C:\Users\sansri\agentic-ai-service-samples\p2p-anomaly-detection-agent\data_files\Invoice-002.png
Analyzing with Azure AI Agent...
🔄 Creating message content...
2025-07-04 09:04:46,774 - __main__ - INFO - Successfully converted image to base64: C:\Users\sansri\agentic-ai-service-samples\p2p-anomaly-detection-agent\data_files\Invoice-002.png
2025-07-04 09:04:46,774 - INFO - Successfully converted image to base64: C:\Users\sansri\agentic-ai-service-samples\p2p-anomaly-detection-agent\data_files\Invoice-002.png
2025-07-04 09:04:46,777 - __main__ - INFO - Successfully created message content blocks
2025-07-04 09:04:46,777 - INFO - Successfully created message content blocks
🔄 Creating conversation thread...
2025-07-04 09:04:47,231 - __main__ - INFO - Created thread, ID: thread_Lz8ViyvcnDMG9i9TGkgE7jq6
2025-07-04 09:04:47,231 - INFO - Created thread, ID: thread_Lz8ViyvcnDMG9i9TGkgE7jq6
🔄 Sending message to agent...
🔄 Processing with AI agent (this may take a moment)...
🔄 Retrieving analysis results...
🔄 Analysis complete!
2025-07-04 09:05:55,543 - __main__ - INFO - Successfully processed anomaly detection request
2025-07-04 09:05:55,543 - INFO - Successfully processed anomaly detection request

==================================================
ANOMALY DETECTION RESULTS
==================================================

### P2P Verification Report

#### **Verdict:**
The Purchase Invoice **does not comply** with the Contract.

---

#### **Purchase Invoice Details**

| **Attribute** | **Value** |
|-----------------------|---------------|
| **Invoice Number** | INV001002 |
| **Contract Reference**| CON000002 |
| **Supplier ID** | SUP0008 |
| **Total Invoice Value** | 113130.16 USD |
| **Invoice Date** | 2023-06-15 |

---

#### **Invoice Line Items**

| **Item ID** | **Quantity** | **Unit Price (USD)** | **Total Price** | **Description** |
|-------------|--------------|-----------------------|-----------------|---------------------------------------|
| ITEM0040 | 116 | $136.75 | $15863.00 | Description for ITEM0040 |
| ITEM0082 | 116 | $554.62 | $64335.92 | Description for ITEM0082 |
| ITEM0011 | 36 | $398.09 | $14331.24 | Description for ITEM0011 |
| ITEM0031 | 36 | $475.00 | $17100.00 | Description for ITEM0031 |
| ITEM9999 | 10 | $150.00 | $1500.00 | Extra item not in contract |

---

#### **Contract Details**

| **Attribute** | **Value** |
|--------------------------------|---------------|
| **Contract ID** | CON000002 |
| **Supplier ID** | SUP0008 |
| **Contract Date** | 2022-10-19 |
| **Expiration Date** | 2023-01-07 |
| **Contract Total Amount** | 66543.39 USD |
| **Status** | Expired |
| **Currency** | USD |

| **Item ID** | **Quantity** | **Unit Price (USD)** | **Total Price** | **Delivery Date** | **Description** |
|-------------|--------------|-----------------------|-----------------|-------------------|----------------------------|
| ITEM0040 | 78 | $136.75 | $10666.50 | 2023-01-01 | Description for ITEM0040 |
| ITEM0082 | 57 | $479.87 | $27352.58 | 2022-11-26 | Description for ITEM0082 |
| ITEM0011 | 21 | $398.09 | $8359.89 | 2022-11-29 | Description for ITEM0011 |
| ITEM0031 | 47 | $429.03 | $20164.41 | 2022-12-09 | Description for ITEM0031 |

---

#### **P2P Business Rules**

- Invoice date in Purchase Invoice must be within the Contract term【6:0†source】.
- The Contract must be valid; ensure it is not expired【6:0†source】.
- Invoice Total Value should stay within the Contract value【6:0†source】.
- Items in the Invoice must be strictly from the Contract item list, with correct quantities, descriptions, and unit prices【6:0†source】.
- Minor rounding differences in amounts should be ignored【6:0†source】.

---

| **Item ID** | **Contract Terms (Qty / Unit Price)** | **Basis for Judgement** |
|------------------|-------------------------------------------------|------------------------------------------------------------------------------------------------------------|
| ITEM0040 | Quantity: 78, Unit Price: $136.75 | Invoice quantity exceeds Contract quantity. |
| ITEM0082 | Quantity: 57, Unit Price: $479.87 | Invoice unit price differs and quantity exceeds Contract quantity. |
| ITEM0011 | Quantity: 21, Unit Price: $398.09 | Invoice quantity exceeds Contract quantity. |
| ITEM0031 | Quantity: 47, Unit Price: $429.03 | Invoice quantity exceeds Contract quantity and Contract total is lower than Invoice total. |
| ITEM9999 | Not listed in Contract | Extra item not defined in Contract—anomalous. |

### Additional Observations:
- **Contract is Expired:** The Contract expired on 2023-01-07. Invoice dated 2023-06-15 falls outside the Contract's validity term.
- **Invoice exceeds Contract Value:** Contract value capped at $66543.39 USD, while Invoice totals $113130.16 USD.
- **Currency Matches:** Invoice and Contract both indicate prices in USD.

---

### Professional Guidance and Suggested Next Steps:
- The anomalies detected prevent this Invoice from fully complying with the Contract terms.
- **Recommendation:** Rectify discrepancies (quantities, unit price adjustments, or removal of extra items) and adhere to valid Contract before committing Invoice to the System.

---

### Uploading Report Content
Proceeding to upload this Markdown report into Blob Storage.

==================================================

MCP Actions

As an outcome of the P2P evaluation, a Markdown document is created by the P2P Agent and then uploaded to Azure Blob Storage using the MCP Tool action. See the screenshot below from the Azure Portal.
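If you want to confirm the MCP tool's upload programmatically rather than through the portal, a quick check with the azure-storage-blob SDK might look like the sketch below. The container name comes from the agent instructions above; the connection string environment variable is an assumption.

```python
import os
from azure.storage.blob import BlobServiceClient

# Assumed: a connection string for the storage account targeted by the MCP Server.
service = BlobServiceClient.from_connection_string(os.environ["STORAGE_CONNECTION_STRING"])
container = service.get_container_client("p2p-anomaly-detection-outcomes")

# List the verification reports the agent has uploaded (e.g. Invoice001-<guid>.md).
for blob in container.list_blobs():
    print(blob.name, blob.last_modified)
```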
References

Here is the GitHub Repo that contains the code for the solution covered in this article.
A Linkedin article related to this topic - here.

Mastering Model Context Protocol (MCP): Building Multi Server MCP with Azure OpenAI

Create complex Multi MCP AI Agentic applications. Deep dive into Multi Server MCP implementation, connecting both local custom and ready MCP Servers in a single client session through a custom chatbot interface.

Introduction to OCR Free Vision RAG using Colpali For Complex Documents
Explore the cutting-edge world of document retrieval with "From Pixels to Intelligence: Introduction to OCR Free Vision RAG using ColPali for Complex Documents." This blog post delves into how ColPali revolutionizes the way we interact with documents by leveraging Vision Language Models (VLMs) to enhance Retrieval-Augmented Generation (RAG) processes.