Artificial Intelligence
310 TopicsSelecting the Right Agentic Solution on Azure - Part 1
Recently, we have seen a surge in requests from customers and Microsoft partners seeking guidance on building and deploying agentic solutions at various scales. With the rise of Generative AI, replacing traditional APIs with agents has become increasingly popular. There are several approaches to building, deploying, running, and orchestrating agents on Azure. In this discussion, I will focus exclusively on Azure-specific tools, services, and methodologies, setting aside Copilot and Copilot Studio for now. This article describes the options available as of today. 1. Azure OpenAI Assistants API: This feature within Azure OpenAI Service enables developers to create conversational agents (“assistants”) based on OpenAI models (such as GPT-3.5 and GPT-4). It supports capabilities like memory, tool/function calls, and retrieval (e.g., document search). However, Microsoft has already deprecated version 1 of the Azure OpenAI Assistants API, and version 2 remains in preview. Microsoft strongly recommends migrating all existing Assistants API-based agents to the Agent Service. Additionally, OpenAI is retiring the Assistants API and advises developers to use the modern “Response” API instead (see migration detail). Given these developments, it is not advisable to use the Assistants API for building agents. Instead, you should use the Azure AI Agent Service, which is part of Azure AI Foundry. 2. Workflows with AI agents and models in Azure Logic Apps (Preview) – As the name suggests, this feature is currently in public preview and is only available with Logic Apps Standard, not with the consumption plan. You can enhance your workflow by integrating agentic capabilities. For example, in a visa processing workflow, decisions can be made based on priority, application type, nationality, and background checks using a knowledge base. The workflow can then route cases to the appropriate queue and prepare messages accordingly. Workflows can be implemented either as chat assistant or APIs. If your project is workflow-dependent and you are ready to implement agents in a declarative way, this is a great option. However, there are currently limited choices for models and regional availability. For CI/CD, there is an Azure Logic Apps Standard template available for VS Code you can use. 3. Azure AI Agent Service – Part of Azure AI Foundry, the Azure AI Agent Service allows you to provision agents declaratively from the UI. You can consume various OpenAI models (with support for non-OpenAI models coming soon) and leverage important tools or knowledge bases such as files, Azure AI Search, SharePoint, and Fabric. You can connect agents together and create hierarchical agent dependencies. SDKs are available for building agents within agent services using Python, C#, or Java. Microsoft manages the infrastructure to host and run these agents in isolated containers. The service offers role-based access control, MS Entra ID integration, and options to bring your own storage for agent states and Azure Key Vault keys. You can also incorporate different actions including invoking a Logic App instance from your agent. There is also option to trigger an agent using Logic Apps (preview). Microsoft recommends using Agent Service/Azure Foundry as the destination for agents, as further enhancements and investments are focused here. 4. Agent Orchestrators – There are several excellent orchestrators available, such as LlamaIndex, LangGraph, LangChain, and two from Microsoft—Semantic Kernel and AutoGen. These options are ideal if you need full control over agent creation, hosting, and orchestration. They are developer-only solutions and do not offer a UI (barring AutoGen Studio having some UI assistance). You can create complex, multi-layered agent connections. You can then host and run these agents in you choice of Azure services like AKS or Apps Service. Additionally, you have the option to create agents using Agent Service and then orchestrate them with one of these orchestrators. Choosing the Right Solution The choice of agentic solution depends on several factors, including whether you prefer code or no-code approaches, control over the hosting platform, customer needs, scalability, maintenance, orchestration complexity, security, and cost. Customer Need: If agents need to be part of a workflow, use AI Agents in Logic Apps; otherwise, consider other options. No-Code: For workflow-based agents, Logic Apps is suitable; for other scenarios, Azure AI Agent Service is recommended. Hosting and Maintenance: If Logic Apps is not an option and you prefer not to maintain your own environment, use Azure AI Agent Service. Otherwise, consider custom agent orchestrators like Semantic Kernel or AutoGen to build the agent and services like AKS or Apps Service to host those. Orchestration Complexity: For simple hierarchical agent connections, Azure AI Agent Service is good choice. For complex orchestration, use an agent orchestrator. Versioning - If you are concerned about versioning to ensure solid CI/CD regime then you may have to chose Agent Orchestrators. Agent Service still miss this feature clarity. We have some work-around but it is not robust implementation. Hopefully we will catch up soon with a better versioning solution. Summary: When selecting the right agentic solution on Azure, consider the latest recommendations and platform developments. For most scenarios, Microsoft advises using the Azure AI Agent Service within Azure Foundry, as it is the focus of ongoing enhancements and support. For workflow-driven projects, Azure Logic Apps with agentic capabilities may be suitable, while advanced users can leverage orchestrators for custom agent architectures. In Selecting the Right Agentic Solution on Azure – Part 2 (Security) blog we will examine the security aspects of each option, one by one.874Views4likes0CommentsSelecting the Right Agentic Solution on Azure – Part 2 (Security)
Let’s pick up from where we left off in the previous post — Selecting the Right Agentic Solution on Azure - Part 1. Earlier, we explored a decision tree to help identify the most suitable Azure service for building your agentic solution. Following that discussion, we received several requests to dive deeper into the security considerations for each of these services. In this post, we’ll examine the security aspects of each option, one by one. But before going ahead and looking at the security perspective I highly recommend looking at list of Azure AI Services Technologies made available by Microsoft. This list is inclusive of all those services which were part of erstwhile cognitive services and latest additions. Workflows with AI agents and models in Azure Logic Apps (Preview) – This approach focuses on running your agents as an action or as part of an “agent loop” with multiple actions within Azure Logic Apps. It’s important not to confuse this with the alternative setup, where Azure Logic Apps integrates with AI Agents in the Foundry Agent Service—either as a tool or as a trigger. (Announcement: Power your Agents in Azure AI Foundry Agent Service with Azure Logic Apps | Microsoft Community Hub). In that scenario, your agents are hosted under the Azure AI Foundry Agent Service, which we’ll discuss separately below. Although, to create an agent workflow, you’ll need to establish a connection—either to Azure OpenAI or to an Azure AI Foundry project for connecting to a model. When connected to a Foundry project, you can view agents and threads directly within that project’s lists. Since agents here run as Logic Apps actions, their security is governed by the Logic Apps security framework. Let’s look at the key aspects: Easy Auth or App Service Auth (Preview) - Agent workflows often integrate with a broader range of systems—models, MCPs, APIs, agents, and even human interactions. You can secure these workflows using Easy Auth, which integrates with Microsoft Entra ID for authentication and authorization. Read more here: Protect Agent Workflows with Easy Auth - Azure Logic Apps | Microsoft Learn. Securing and Encrypting Data at Rest - Azure Logic Apps stores data in Azure Storage, which uses Microsoft-managed keys for encryption by default. You can further enhance security by: Restricting access to Logic App operations via Azure RBAC Limiting access to run history data Securing inputs and outputs Controlling parameter access for webhook-triggered workflows Managing outbound call access to external services More info here: Secure access and data in workflows - Azure Logic Apps | Microsoft Learn. Secure Data at transit – When exposing your Logic App as an HTTP(S) endpoint, consider using: Azure API Management for access policies and documentation Azure Application Gateway or Azure Front Door for WAF (Web Application Firewall) protection. I highly recommend the labs provided by Logic Apps Product Group to learn more about Agentic Workflows: https://azure.github.io/logicapps-labs/docs/intro. Azure AI Foundry Agent Service – As of this writing, the Azure AI Foundry Agent Service abstracts the underlying infrastructure where your agents run. Microsoft manages this secure environment, so you don’t need to handle compute, network, or storage resources—though bring-your-own-storage is an option. Securing and Encrypting Data at Rest - Microsoft guarantees that your prompts and outputs remain private—never shared with other customers or AI providers (such as OpenAI or Meta). Data (from messages, threads, runs, and uploads) is encrypted using AES-256. It remains stored in the same region where the Agent Service is deployed. You can optionally use Customer-Managed Keys (CMK) for encryption. Read more here: Data, privacy, and security for Azure AI Agent Service - Azure AI Services | Microsoft Learn. Network Security – The service allows integration with your private virtual network using a private endpoint. Note: There are known limitations, such as subnet IP restrictions, the need for a dedicated agent subnet, same-region requirements, and limited regional availability. Read more here: How to use a virtual network with the Azure AI Foundry Agent Service - Azure AI Foundry | Microsoft Learn. Secure Data at transit – Upcoming enhancements include API Management support (soon in Public Preview) for AI APIs, including Model APIs, Tool APIs/MCP servers, and Agent APIs. Here is another great article about using APIM to safeguard HTTP APIs exposed by Azure OpenAI that let your applications perform embeddings or completions by using OpenAI's language models. Agent Orchestrators – We’ve introduced the Agent Framework, which succeeds both AutoGen and Semantic Kernel. According to the product group, it combines the best capabilities of both predecessors. Support for Semantic Kernel and related documentation for AutoGen will continue to be available for some time to allow users to transition smoothly to the new framework. When discussing the security aspects of agent orchestrators, it’s important to note that these considerations also extend to the underlying services hosting them—whether on AKS or Container Apps. However, this discussion will not focus on the security features of those hosting environments, as comprehensive resources already exist for them. Instead, we’ll focus on common security concerns applicable across different orchestrators, including AutoGen, Semantic Kernel, and other frameworks such as LlamaIndex, LangGraph, or LangChain. Key areas to consider include (but are not limited to): Secure Secrets / Key Management Avoid hard-coding secrets (e.g., API keys for Foundry, OpenAI, Anthropic, Pinecone, etc.). Use secret management solutions such as Azure Key Vault or environment variables. Encrypt secrets at rest and enforce strict limits on scope and lifetime. Access Control & Least Privilege Grant each agent or tool only the minimum required permissions. Implement Role-Based Access Control (RBAC) and enforce least privilege principles. Use strong authentication (e.g., OAuth2, Azure AD) for administrative or tool-level access. Restrict the scope of external service credentials (e.g., read-only vs. write) and rotate them regularly. Isolation / Sandboxing Isolate plugin execution and use inter-process separation as needed. Prevent user inputs from executing arbitrary code on the host. Apply resource limits for model or function execution to mitigate abuse. Sensitive Data Protection Encrypt data both at rest and in transit. Mask or remove PII before sending data to models. Avoid persisting sensitive context unnecessarily. Ensure logs and memory do not inadvertently expose secrets or user data. Prompt & Query Security Sanitize or escape user input in custom query engines or chat interfaces. Protect against prompt injection by implementing guardrails to monitor and filter prompts. Set context length limits and use safe output filters (e.g., profanity filters, regex validators). Observability, Logging & Auditing Maintain comprehensive logs, including tool invocations, agent decisions, and execution paths. Continuously monitor for anomalies or unexpected behaviour. I hope this overview assists you in evaluating and implementing the appropriate security measures for your chosen agentic solution.64Views0likes0CommentsIntegrate Custom Azure AI Agents with CoPilot Studio and M365 CoPilot
Integrating Custom Agents with Copilot Studio and M365 Copilot In today's fast-paced digital world, integrating custom agents with Copilot Studio and M365 Copilot can significantly enhance your company's digital presence and extend your CoPilot platform to your enterprise applications and data. This blog will guide you through the integration steps of bringing your custom Azure AI Agent Service within an Azure Function App, into a Copilot Studio solution and publishing it to M365 and Teams Applications. When Might This Be Necessary: Integrating custom agents with Copilot Studio and M365 Copilot is necessary when you want to extend customization to automate tasks, streamline processes, and provide better user experience for your end-users. This integration is particularly useful for organizations looking to streamline their AI Platform, extend out-of-the-box functionality, and leverage existing enterprise data and applications to optimize their operations. Custom agents built on Azure allow you to achieve greater customization and flexibility than using Copilot Studio agents alone. What You Will Need: To get started, you will need the following: Azure AI Foundry Azure OpenAI Service Copilot Studio Developer License Microsoft Teams Enterprise License M365 Copilot License Steps to Integrate Custom Agents: Create a Project in Azure AI Foundry: Navigate to Azure AI Foundry and create a project. Select 'Agents' from the 'Build and Customize' menu pane on the left side of the screen and click the blue button to create a new agent. Customize Your Agent: Your agent will automatically be assigned an Agent ID. Give your agent a name and assign the model your agent will use. Customize your agent with instructions: Add your knowledge source: You can connect to Azure AI Search, load files directly to your agent, link to Microsoft Fabric, or connect to third-party sources like Tripadvisor. In our example, we are only testing the CoPilot integration steps of the AI Agent, so we did not build out additional options of providing grounding knowledge or function calling here. Test Your Agent: Once you have created your agent, test it in the playground. If you are happy with it, you are ready to call the agent in an Azure Function. Create and Publish an Azure Function: Use the sample function code from the GitHub repository to call the Azure AI Project and Agent. Publish your Azure Function to make it available for integration. azure-ai-foundry-agent/function_app.py at main · azure-data-ai-hub/azure-ai-foundry-agent Connect your AI Agent to your Function: update the "AIProjectConnString" value to include your Project connection string from the project overview page of in the AI Foundry. Role Based Access Controls: We have to add a role for the function app on OpenAI service. Role-based access control for Azure OpenAI - Azure AI services | Microsoft Learn Enable Managed Identity on the Function App Grant "Cognitive Services OpenAI Contributor" role to the System-assigned managed identity to the Function App in the Azure OpenAI resource Grant "Azure AI Developer" role to the System-assigned managed identity for your Function App in the Azure AI Project resource from the AI Foundry Build a Flow in Power Platform: Before you begin, make sure you are working in the same environment you will use to create your CoPilot Studio agent. To get started, navigate to the Power Platform (https://make.powerapps.com) to build out a flow that connects your Copilot Studio solution to your Azure Function App. When creating a new flow, select 'Build an instant cloud flow' and trigger the flow using 'Run a flow from Copilot'. Add an HTTP action to call the Function using the URL and pass the message prompt from the end user with your URL. The output of your function is plain text, so you can pass the response from your Azure AI Agent directly to your Copilot Studio solution. Create Your Copilot Studio Agent: Navigate to Microsoft Copilot Studio and select 'Agents', then 'New Agent'. Make sure you are in the same environment you used to create your cloud flow. Now select ‘Create’ button at the top of the screen From the top menu, navigate to ‘Topics’ and ‘System’. We will open up the ‘Conversation boosting’ topic. When you first open the Conversation boosting topic, you will see a template of connected nodes. Delete all but the initial ‘Trigger’ node. Now we will rebuild the conversation boosting agent to call the Flow you built in the previous step. Select 'Add an Action' and then select the option for existing Power Automate flow. Pass the response from your Custom Agent to the end user and end the current topic. My existing Cloud Flow: Add action to connect to existing Cloud Flow: When this menu pops up, you should see the option to Run the flow you created. Here, mine does not have a very unique name, but you see my flow 'Run a flow from Copilot' as a Basic action menu item. If you do not see your cloud flow here add the flow to the default solution in the environment. Go to Solutions > select the All pill > Default Solution > then add the Cloud Flow you created to the solution. Then go back to Copilot Studio, refresh and the flow will be listed there. Now complete building out the conversation boosting topic: Make Agent Available in M365 Copilot: Navigate to the 'Channels' menu and select 'Teams + Microsoft 365'. Be sure to select the box to 'Make agent available in M365 Copilot'. Save and re-publish your Copilot Agent. It may take up to 24 hours for the Copilot Agent to appear in M365 Teams agents list. Once it has loaded, select the 'Get Agents' option from the side menu of Copilot and pin your Copilot Studio Agent to your featured agent list Now, you can chat with your custom Azure AI Agent, directly from M365 Copilot! Conclusion: By following these steps, you can successfully integrate custom Azure AI Agents with Copilot Studio and M365 Copilot, enhancing you’re the utility of your existing platform and improving operational efficiency. This integration allows you to automate tasks, streamline processes, and provide better user experience for your end-users. Give it a try! Curious of how to bring custom models from your AI Foundry to your CoPilot Studio solutions? Check out this blog16KViews3likes11CommentsThe Future of AI: An Intern’s Adventure Turning Hours of Video into Minutes of Meaning
This blog post, part of The Future of AI series by Microsoft’s AI Futures team, follows an intern’s journey in developing AutoHighlight—a tool that transforms long-form video into concise, narrative-driven highlight reels. By combining Azure AI Content Understanding with OpenAI reasoning models, AutoHighlight bridges the gap between machine-detected moments and human storytelling. The post explores the challenges of video summarization, the technical architecture of the solution, and the lessons learned along the way.458Views0likes0CommentsAnnouncing the Grok 4 Fast Models from xAI: Now Available in Azure AI Foundry
These models, grok-4-fast-reasoning and grok-4-fast-non-reasoning, empower developers with distinct approaches to suit their application needs. Each model brings advanced capabilities such as structured outputs, long-context processing, and seamless integration with enterprise-grade security and governance. This release marks a significant step toward scalable, agentic AI systems that orchestrate tools, APIs, and domain data with low latency. Leveraging the Grok 4 Fast models within Azure AI Foundry Models accelerates the development of intelligent applications that combine speed, flexibility, and compliance. The unified model experience, paired with Azure’s enterprise controls, positions the Grok 4 Fast models as foundational technologies for next-generation AI-powered workflows. Why use the Grok 4 Fast Models on Azure Modern AI applications are increasingly agentic—capable of orchestrating tools, APIs, and domain data at low latency. The Grok 4 Fast models were designed for these patterns: fast, intelligent, and agent-ready, enabling parallel tool use, JSON-structured outputs, and image input for multimodal understanding. Azure AI Foundry enhances these models with enterprise controls (RBAC, private networking, customer-managed keys), observability and evaluations, and first-party hosting through Foundry Models—helping teams move confidently from prototype to production. Beyond that, using the Grok 4 Fast models on Azure offers the following: Global scalability and reliability – Azure’s worldwide infrastructure supports resilient, high-availability deployments across multiple regions. Integrated security and compliance – Enterprise-grade identity management, network isolation, encryption at rest and in transit, and compliance certifications help safeguard sensitive data and comply with regulatory requirements. Unified management experience – Centralized monitoring, governance, and cost controls through Azure Portal and Azure Resource Manager simplify operations and oversight. Native integration across Azure services – Easily connect to data sources, analytics, and other services like Azure Synapse, Cosmos DB, and Logic Apps for end-to-end solutions. Enterprise support and SLAs – Azure delivers 24/7 support, service-level agreements, and best-in-class reliability for mission-critical workloads. By building withDeploying Grok 4 Fast models throughon Azure, enables organizations tocan build robust, secure, and scalable AI applications with confidence and agility. Key capabilities The Grok 4 Fast models introduce a suite of advanced features designed to enhance agentic workflows and multimodal integration. With flexible model choices and powerful context handling, the Grok 4 Fast models are engineered for efficiency, scalability, and seamless deployment. Choose reasoning level by selecting which Grok 4 Fast model to use: grok-4-fast-reasoning: Optimized for fast reasoning in agentic workflows. grok-4-fast-non-reasoning: Uses the same underlying weights but is constrained by a non-reasoning system prompt, offering a streamlined approach for specific tasks. Multimodal: Provides image understanding when deployed with Grok image tokenizer. Tool use & structured outputs: Enables parallel function calling and supports JSON schemas for predictable integration. Long context: Supports approximately 131K tokens for deep, comprehensive understanding. Efficient H100 performance: Designed to run efficiently on H100 GPUs for agentic search and real-time orchestration. Collectively, these features make the Grok 4 Fast models a robust and versatile solution for developers and enterprises looking to push the boundaries of AI-powered workflows. What you can do with the Grok 4 Fast models Building on the advanced capabilities of the Grok 4 Fast models, developers can unlock innovative solutions across a wide variety of applications. The following use cases highlight how these models streamline complex workflows, maximize efficiency, and accelerate intelligent automation with robust, scalable AI. Real-time agentic task orchestration : Automate and coordinate multi-step processes across systems with fast, flexible reasoning for dynamic business operations. Multimodal document analysis : Extract insights and process information from both text and images for comprehensive, context-aware understanding. Enterprise search and knowledge retrieval : Leverage long-context support for enhanced semantic search, surfacing relevant information from massive data repositories. Parallel tool integration : Invoke multiple APIs and functions simultaneously, enabling sophisticated workflows with structured, predictable outputs. Scalable conversational AI : Deploy high-capacity virtual agents capable of handling extended dialogues and nuanced queries with low latency. Customizable decision support- : Empower users with AI-driven recommendations and scenario analysis tailored to organizational needs and governance requirements. With the Grok 4 Fast models, developers are equipped to build and iterate on next-generation AI solutions, leveraging powerful tools and streamlined deployment workflows. Start shaping the future of intelligent applications by harnessing the speed, scalability, and multimodal capabilities of the Grok 4 Fast models today. The Grok 4 Fast models offer developers the speed, scalability, and multimodal capabilities needed to advance intelligent applications, supporting complex workflows and innovative solutions across a range of use cases. Pricing for Grok 4 Fast Models on Azure AI Foundry Model Deployment Price $/1m tokens grok-4-fast-reasoning Global Standard (PayGo) Input - $0.43 Output - $1.73 grok-4-fast-non-reasoning Get started in minutes With the Grok 4 Fast models, developers gain access to cutting-edge AI with a massive context window, efficient GPU performance, and enterprise-grade governance. Start building the future of AI today,visit the Model Catalog in Azure AI Foundry and deploy grok-4-fast-reasoning and grok-4-fast-non-reasoning to accelerate your innovation.1.4KViews0likes1CommentAccelerating Enterprise AI Adoption with Azure AI Landing Zone
Introduction As organizations across industries race to integrate Artificial Intelligence (AI) into their business processes and realize tangible value, one question consistently arises — where should we begin? Customers often wonder: What should the first steps in AI adoption look like? Should we build a unified, enterprise-grade platform for all AI initiatives? Who should guide us through this journey — Microsoft, our partners, or both? This blog aims to demystify these questions by providing a foundational understanding of the Azure AI Landing Zone (AI ALZ) — a unified, scalable, and secure framework for enterprise AI adoption. It explains how AI ALZ builds on two key architectural foundations — the Cloud Adoption Framework (CAF) and the Well-Architected Framework (WAF) — and outlines an approach to setting up an AI Landing Zone in your Azure environment. Foundational Frameworks Behind the AI Landing Zone 1.1 Cloud Adoption Framework (CAF) The Azure Cloud Adoption Framework is Microsoft’s proven methodology for guiding customers through their cloud transformation journey. It encompasses the complete lifecycle of cloud enablement across stages such as Strategy, Plan, Ready, Adopt, Govern, Secure, and Manage. The Landing Zone concept sits within the Ready stage — providing a secure, scalable, and compliant foundation for workload deployment. CAF also defines multiple adoption scenarios, one of which focuses specifically on AI adoption, ensuring that AI workloads align with enterprise cloud governance and best practices. 1.2 Well-Architected Framework (WAF) The Azure Well-Architected Framework complements CAF by providing detailed design guidance across five key pillars: Reliability Security Cost Optimization Operational Excellence Performance Efficiency AI Landing Zones integrate these design principles to ensure that AI workloads are not only functional but also resilient, cost-effective, and secure at enterprise scale. Understanding Azure Landing Zones To understand an AI Landing Zone, it’s important to first understand Azure Landing Zones in general. An Azure Landing Zone acts as a blueprint or foundation for deploying workloads in a cloud environment — much like a strong foundation is essential for constructing a building or bridge. Each workload type (SAP, Oracle, CRM, AI, etc.) may require a different foundation, but all share the same goal: to provide a consistent, secure, and repeatable environment built on best practices. Azure Landing Zones provide: A governed, scalable foundation aligned with enterprise standards Repeatable, automated deployment patterns using Infrastructure as Code (IaC) Integrated security and management controls baked into the architecture To have more insightful understanding of Azure Landing zone architecture pls visit the official link here and refer diagram below: The Role of Azure AI Foundry in AI Landing Zones Azure AI Foundry is emerging as Microsoft’s unified environment for enterprise AI development and deployment. It acts as a one-stop platform for building, deploying, and managing AI solutions at scale. Key components include: Foundry Model Catalog: A collection of foundation and fine-tuned models Agent Service: Enables model selection, tool and knowledge integration, and control over data and security Search and Machine Learning Services: Integrated capabilities for knowledge retrieval and ML lifecycle management Content Safety and Observability: Ensures responsible AI use and operational visibility Compute Options: Customers can choose from various Azure compute services based on control and scalability needs: Azure Kubernetes Service (AKS) — full control App Service and Azure Container Apps — simplified management Azure Functions — fully serverless option What Is Azure AI Landing Zone (AI ALZ)? The Azure AI Landing Zone is a workload-specific landing zone designed to help enterprises deploy AI workloads securely and efficiently in production environments. Key Objectives of AI ALZ Accelerate deployment of production-grade AI solutions Embed security, compliance, and resilience from the start Enable cost and operational optimization through standardized architecture Support repeatable patterns for multiple AI use cases using Azure AI Foundry Empower customer-centric enablement with extensibility and modularity By adopting the AI ALZ, organizations can move faster from proof-of-concept (POC) to production, addressing common challenges such as inconsistent architectures, lack of governance, and operational inefficiencies. Core Components of AI Landing Zone The AI ALZ is structured around three major components: Design Framework – Based on the Cloud Adoption Framework (CAF) and Well-Architected Framework (WAF). Reference Architectures – Blueprint architectures for common AI workloads. Extensible Implementations – Deployable through Terraform, Bicep, or (soon) Azure Portal templates using Azure Verified Modules (AVM). Together, these elements allow customers to quickly deploy a secure, standardized, and production-ready AI environment. Customer Readiness and Discovery A common question during early customer engagements is: “Can our existing enterprise-scale landing zone support AI workloads, or do we need a new setup?” To answer this, organizations should start with a discovery and readiness assessment, reviewing their existing enterprise-scale landing zone across key areas such as: Identity and Access Management Networking and Connectivity Data Security and Compliance Governance and Policy Controls Compute and Deployment Readiness Based on this assessment, customers can either: Extend their existing enterprise-scale foundation, or Deploy a dedicated AI workload spoke designed specifically for Azure AI Foundry and enterprise-wide AI enablement. Attached excel contains the discovery question to enquire about customer current setup and propose a adoption plan to reflect architecture changes if any. The Journey Toward AI Adoption The AI Landing Zone represents the first critical step in an organization’s AI adoption journey. It establishes the foundation for: Consistent governance and policy enforcement Security and networking standardization Rapid experimentation and deployment of AI workloads Scalable, production-grade AI environments By aligning with CAF and WAF, customers can be confident that their AI adoption strategy is architecturally sound, secure, and sustainable. Conclusion The Azure AI Landing Zone provides enterprises with a structured, secure, and scalable foundation for AI adoption at scale. It bridges the gap between innovation and governance, enabling organizations to deploy AI workloads faster while maintaining compliance, performance, and operational excellence. By leveraging Microsoft’s proven frameworks — CAF and WAF — and adopting Azure AI Foundry as the unified development platform, enterprises can confidently build the next generation of responsible, production-grade AI solutions on Azure. Get Started Ready to start your AI Landing Zone journey? Microsoft can help assess your readiness and accelerate deployment through validated reference implementations and expert-led guidance. To help organizations accelerate deployment, Microsoft has published open-source Azure AI Landing Zone templates and automation scripts in Terraform and Bicep that can be directly used to implement the architecture described in this blog. 👉 Explore and deploy the Azure AI Landing Zone(Preview) on GitHub: https://github.com/Azure/AI-Landing-Zones1.8KViews4likes9CommentsReal-Time Speech Intelligence for Global Scale: gpt-4o-transcribe-diarize in Azure AI Foundry
Voice is a natural interface for communication. Now, with the general availability of gpt-4o-transcribe-diarize, the new automatic speech recognition (ASR) model in Azure AI Foundry, transforming speech into actionable text is faster, smarter, and more accurate than ever. This launch marks a significant milestone in our mission to empower organizations with AI that delivers speed, accuracy, and enterprise-grade reliability. With gpt-4o-transcribe-diarize seamlessly integrated, businesses can unlock critical insights from conversations, instantly converting audio into text with ultra-low latency and outstanding accuracy across 100+ languages. Whether you're enhancing live event accessibility, analyzing customer interactions, or enabling intelligent voice-driven applications, gpt-4o-transcribe-diarize helps capture spoken word and leverages it for real-time decision-making. Experience how Azure AI’s innovation in speech technology is helping to redefine productivity and global reach, setting a new standard for audio intelligence in the enterprise landscape. Why gpt-4o-transcribe-diarize Matters Businesses today operate in a world where conversations drive decisions. From customer support calls to virtual meetings, audio data holds critical insights. Gpt-4o-transcribe-diarize unlocks these insights, converting speech to text with ultra-low latency and high accuracy across 100+ languages. Whether you’re captioning live events, analyzing call center interactions, or building voice-driven applications, gpt-4o-transcribe-diarize offers the opportunity to help your workflows be powered by real-time intelligence. Key Features Lightning-Fast Transcription: Convert 10 minutes of audio in ~15 seconds with our new Fast Transcription API. Global Language Coverage: Support for 100+ languages and dialects for inclusive, global experiences. Seamless Integration: Available in Azure AI Foundry with managed endpoints for easy deployment and scale. Real-World Impact Imagine a reporter summarizing interviews in real time, a financial institution transcribing calls instantly, or a global retailer powering multilingual voice assistants; all with the speed and security of Azure AI Foundry. gpt-4o-transcribe-diarize can make these scenarios possible today. Pricing and regional availability for gpt-4o-transcribe-diarize Model Deployment Regions Price $/1m tokens gpt-4o-transcribe-diarize Global Standard (Paygo) East US 2, Sweden Central Text input: $2.50 Audio input: $6.00 Output: $10.00 gpt-4o-transcribe-diarize in audio AI innovation context gpt-4o-transcribe-diarize is part of a broader wave of audio AI innovation on Azure, joining new models like OpenAI gpt-realtime and gpt-audio that are purpose-built for expressive, low-latency voice experiences. While gpt-4o-transcribe-diarize delivers ultra-fast transcription with enterprise-grade accuracy, gpt-realtime enables natural, emotionally rich voice interactions with millisecond responsiveness—ideal for live conversations, voice agents, and multimodal applications. Meanwhile, audio models like gpt-4o-transcribe mini, and mini-tts extend the platform’s capabilities with customizable speech synthesis and real-time captioning, making Azure AI a comprehensive solution for building intelligent, production-ready voice systems. gpt-realtime Features OpenAI claims the gpt-realtime model introduces a new standard for voice-first applications, combining expressive audio generation with ultra-low latency and natural conversational flow. It’s designed to power real-time interactions that feel like natural, responsive speech. Key Features: Millisecond Latency: Enables live responsiveness suitable for real-time conversations, kiosks, and voice agents. Emotionally Expressive Voices: Supports nuanced speech delivery with voices like Marin and Cedar, capable of conveying tone, emotion, and intent. Natural Turn-Taking: Built-in mechanisms for detecting pauses and transitions, allowing fluid back-and-forth dialogue. Function Calling Support: Seamlessly integrates with backend systems to trigger actions based on voice input. Multimodal Readiness: Designed to work with text, audio, and visual inputs for rich, interactive experiences. Stable APIs for Production: Enterprise-grade reliability with consistent behavior across sessions and deployments. These features make gpt-realtime a foundational model for building intelligent voice interfaces that go beyond transcription—delivering conversational intelligence in real time. gpt-realtime Use Cases With its expressive audio capabilities and real-time responsiveness, gpt-realtime unlocks new possibilities across industries. Whether enhancing customer engagement or streamlining operations, it brings voice AI into the heart of enterprise workflows. Examples include: Customer Service Agents: Power virtual agents that respond instantly with natural, tones for rich expressiveness, improving customer satisfaction and reducing wait times. Retail Kiosks & Smart Devices: Enable voice-driven product discovery, troubleshooting, and checkout experiences with real-time feedback. Multilingual Voice Assistants: Deliver localized, expressive voice experiences across global markets with support for multiple languages and dialects. Live Captioning & Accessibility: Combine gpt-4o-transcribe-diarize gpt-realtime to provide real-time captions and voice synthesis for inclusive experiences. These use cases demonstrate how gpt-realtime transforms voice into a strategic interface—bridging human communication and intelligent systems with speed and accuracy. Ready to transform voice into value? Learn more and start building with gpt-4o-transcribe-diarize1.2KViews0likes0CommentsHow Azure NetApp Files Object REST API powers Azure and ISV Data and AI services – on YOUR data
This article introduces the Azure NetApp Files Object REST API, a transformative solution for enterprises seeking seamless, real-time integration between their data and Azure's advanced analytics and AI services. By enabling direct, secure access to enterprise data—without costly transfers or duplication—the Object REST API accelerates innovation, streamlines workflows, and enhances operational efficiency. With S3-compatible object storage support, it empowers organizations to make faster, data-driven decisions while maintaining compliance and data security. Discover how this new capability unlocks business potential and drives a new era of productivity in the cloud.305Views0likes0CommentsValidating Scalable EDA Storage Performance: Azure NetApp Files and SPECstorage Solution 2020
Electronic Design Automation (EDA) workloads drive innovation across the semiconductor industry, demanding robust, scalable, and high-performance cloud solutions to accelerate time-to-market and maximize business outcomes. Azure NetApp Files empowers engineering teams to run complex simulations, manage vast datasets, and optimize workflows by delivering industry-leading performance, flexibility, and simplified deployment—eliminating the need for costly infrastructure overprovisioning or disruptive workflow changes. This leads to faster product development cycles, reduced risk of project delays, and the ability to capitalize on new opportunities in a highly competitive market. In a historic milestone, Microsoft has been independently validated Azure NetApp Files for EDA workloads through the publication of the SPECstorage® Solution 2020 EDA_BLENDED benchmark, providing objective proof of its readiness to meet the most demanding enterprise requirements, now and in the future.225Views0likes0CommentsHow Microsoft Evaluates LLMs in Azure AI Foundry: A Practical, End-to-End Playbook
Deploying large language models (LLMs) without rigorous evaluation is risky: quality regressions, safety issues, and expensive rework often surface in production—when it’s hardest to fix. This guide translates Microsoft’s approach in Azure AI Foundry into a practical playbook: define metrics that matter (quality, safety, and business impact), choose the right evaluation mode (offline, online, human-in-the-loop, automated), and operationalize continuous evaluation with the Azure AI Evaluation SDK and monitoring. Quick-Start Checklist Identify your use case: Match model type (SLM, LLM, task-specific) to business needs. Benchmark models: Use Azure AI Foundry leaderboards for quality, safety, and performance, plus private datasets. Evaluate with key metrics: Focus on relevance, coherence, factuality, completeness, safety, and business impact. Combine offline & online evaluation: Test with curated datasets and monitor real-world performance. Leverage manual & automated methods: Use human-in-the-loop for nuance, automated tools for scale. Use private benchmarks: Evaluate with organization-specific data for best results. Implement continuous monitoring: Set up alerts for drift, safety, and performance issues. Terminology Quick Reference SLM: Small Language Model—compact, efficient models for latency/cost-sensitive tasks. LLM: Large Language Model—broad capabilities, higher resource requirements. MMLU: Multitask Language Understanding—academic benchmark for general knowledge. HumanEval: Benchmark for code generation correctness. BBH: BIG-Bench Hard—reasoning-heavy subset of BIG-Bench. LLM-as-a-Judge: Using a language model to grade outputs using a rubric. The Generative AI Model Selection Challenge Deploying an advanced AI solution without thorough evaluation can lead to costly errors, loss of trust, and regulatory risks. LLMs now power critical business functions, but their unpredictable behavior makes robust evaluation essential. The Issue: Traditional evaluation methods fall short for LLMs, which are sensitive to prompt changes and can exhibit unexpected behaviors. Without a strong evaluation strategy, organizations risk unreliable or unsafe AI deployments. The Solution: Microsoft Azure AI Foundry provides a systematic approach to LLM evaluation, helping organizations reduce risk and realize business value. This guide shares proven techniques and best practices so you can confidently deploy AI and turn evaluation into a competitive advantage. LLMs and Use-Case Alignment When choosing an AI model, it’s important to match it to the specific job you need done. For example, some models are better at solving problems that require logical thinking or math—these are great for tasks that need careful analysis. Others are designed to write computer code, making them ideal for building software tools or helping programmers. There are also models that excel at having natural conversations, which is especially useful for customer service or support roles. Microsoft Azure AI Foundry helps with this by showing how different models perform in various categories, making it easier to pick the right one for your needs. Key Metrics: Quality, Safety, and Business Impact When evaluating an AI model, it’s important to look beyond just how well it performs. To truly understand if a model is ready for real-world use, we need to measure its quality, ensure it’s safe, and see how it impacts the business. Quality metrics show if the model gives accurate and useful answers. Safety metrics help us catch any harmful or biased content before it reaches users. Business impact metrics connect the model’s performance to what matters most—customer satisfaction, efficiency, and meeting important rules or standards. By tracking these key areas, organizations can build AI systems that are reliable, responsible, and valuable. Dimension What it Measures Typical Evaluators Quality Relevance, coherence, factuality, completeness LLM-as-a-judge, groundedness, code eval Safety Harmful content, bias, jailbreak resistance, privacy Content safety checks, bias probes Business Impact User experience, value delivery, compliance Task completion rate, CSAT, cost/latency Organizations that align model selection with use-case-specific benchmarks deploy faster and achieve higher user satisfaction than teams relying only on generic metrics. The key is matching evaluation criteria to business objectives from the earliest stages of model selection. Now that we know which metrics and parameters to evaluate LLMs on, when and how do we run these evaluations? Let’s get right into it. Evaluation Modalities Offline vs. Online Evaluation Offline Evaluation: Pre-deployment assessment using curated datasets and controlled environments. Enables reproducible testing, comprehensive coverage, and rapid iteration. However, it may miss real-world complexity. Online Evaluation: Assesses model performance on live production data. Enables real-world monitoring, drift detection, and user feedback integration. Best practice: use offline evaluation for development and gating, then online evaluation for continuous monitoring. Manual vs. Automated Evaluation Manual Evaluation: Human insight is irreplaceable for subjective qualities like creativity and cultural sensitivity. Azure AI Foundry supports human-in-the-loop evaluation via annotation queues and feedback systems. However, manual evaluation faces scalability and consistency challenges. Automated Evaluation: Azure AI Foundry’s built-in evaluators provide scalable, rigorous assessment of relevance, coherence, safety, and performance. Best practice: The most effective approach combines automated evaluation for broad coverage with targeted manual evaluation for nuanced assessment. Leading organizations implement a "human-in-the-loop" methodology where automated systems flag potential issues for human review. Public vs. Private Benchmarks Public Benchmarks (MMLU, HumanEval, BBH): Useful for standardized comparison but may not reflect your domain or business objectives. Risk of contamination and over-optimization. Private Benchmarks: Organization-specific data and metrics provide evaluation that directly reflects deployment scenarios. Best practice: Use public benchmarks to narrow candidates, then rely on private benchmarks for final decisions. LLM-as-a-Judge and Custom Evaluators LLM-as-a-Judge uses language models themselves to assess the quality of generated content. Azure AI Foundry’s implementation enables scalable, nuanced, and explainable evaluation—but requires careful validation. Common challenges and mitigations: Position bias: Scores can skew toward the first-listed answer. Mitigate by randomizing order, evaluating both (A,B) and (B,A), and using majority voting across permutations. Verbosity bias: Longer answers may be over-scored. Mitigate by enforcing concise-answer rubrics and normalizing by token count. Inconsistency: Repeated runs can vary. Mitigate by aggregating over multiple runs and reporting confidence intervals. Custom Evaluators allow organizations to implement domain-specific logic and business rules, either as Python functions or prompt-based rubrics. This ensures evaluation aligns with your unique business outcomes. Evaluation SDK: Comprehensive Assessment Tools The Azure AI Evaluation SDK (azure-ai-evaluation) provides the technical foundation for systematic LLM assessment. The SDK's architecture enables both local development testing and cloud-scale evaluation: Cloud Evaluation for Scale: The SDK seamlessly transitions from local development to cloud-based evaluation for large-scale assessment. Cloud evaluation enables processing of massive datasets while integrating results into the Azure AI Foundry monitoring dashboard. Built-in Evaluator Library: The platform provides extensive pre-built evaluators covering quality metrics (coherence, fluency, relevance), safety metrics (toxicity, bias, fairness), and task-specific metrics (groundedness for RAG, code correctness for programming). Each evaluator has been validated against human judgment and continuously improved based on real-world usage. Real-World Workflow: From Model Selection to Continuous Monitoring Azure AI Foundry's integrated workflow guides organizations through the complete evaluation lifecycle: Stage 1: Model Selection and Benchmarking Compare models using integrated leaderboards across quality, safety, cost, and performance dimensions Evaluate top candidates using private datasets that reflect actual use cases Generate comprehensive model cards documenting capabilities, limitations, and recommended use cases Stage 2: Pre-Deployment Evaluation Systematic testing using Azure AI Evaluation SDK with built-in and custom evaluators Safety assessment using AI Red Teaming Agent to identify vulnerabilities Human-in-the-loop validation for business-critical applications Stage 3: Production Monitoring and Continuous Evaluation Real-time monitoring through Azure Monitor Application Insights integration Continuous evaluation at configurable sampling rates (e.g., 10 evaluations per hour) Automated alerting for performance degradation, safety issues, or drift detection This workflow ensures that evaluation is not a one-time gate but an ongoing practice that maintains AI system quality and safety throughout the deployment lifecycle. Next Steps and Further Reading Explore the Azure AI Foundry documentation for hands-on guides. Find the Best Model - https://aka.ms/BestModelGenAISolution Azure AI Foundry Evaluation SDK Summary Robust evaluation of large language models (LLMs) using systematic benchmarking and Azure AI Foundry tools is essential for building trustworthy, efficient, and business-aligned AI solutions Tags: #LLMEvaluation #AzureAIFoundry #AIModelSelection #Benchmarking #Skilled by MTT #MicrosoftLearn #MTTBloggingGroup254Views0likes0Comments