openai
Productize, observe, version, and automate MCP servers in Azure API Management
Introduction As organizations move from AI-assisted applications to agentic workflows, MCP servers are becoming a critical integration layer between agents, tools, APIs, data sources, and enterprise systems. Azure API Management already helps teams bring MCP servers under enterprise governance. But as MCP adoption scales, platform teams need more than basic exposure. They need a way to package MCP servers for the right consumers, understand tool usage in detail, manage changes safely, and automate configuration across environments. These are familiar API management challenges — and the same patterns that organizations already use for APIs can now be applied more deeply to MCP servers. We are excited to announce new generally available capabilities for MCP server management in Azure API Management: Add MCP servers to products to package and govern MCP capabilities for specific consumers MCP tool observability to trace tool usage, logs, errors, and payload context MCP server versioning to run multiple versions side by side and manage change safely Management API and Bicep support to automate MCP server configuration as part of CI/CD workflows Together, these capabilities extend MCP server management in Azure API Management and help make MCP servers first-class managed resources — productized, observable, versionable, and automatable. Why MCP server management matters MCP gives agents a standard way to connect with tools and external capabilities. That standardization is powerful, but it also introduces a new operational surface for enterprises. Without a management layer, teams can quickly run into questions such as: Which MCP servers are approved for use? Who can access each server? How do we expose MCP servers to different developer or agent audiences? How do we monitor tool calls, latency, errors, and cost? How do we run preview and production versions side by side? How do we automate MCP server configuration across environments? 
These are not just developer experience questions. They are enterprise governance questions. With Azure API Management, MCP servers can now be managed using the same core patterns organizations already use for APIs: products, subscriptions, policies, observability, versioning, and automation. What’s new 1. Add MCP servers to products Azure API Management products are a proven way to package APIs for consumption. With this release, you can now add one or more MCP servers to APIM products as well. This makes it easier to expose MCP capabilities to specific consumers, teams, applications, or agent experiences using familiar product-based governance. For example, a platform team can create a product for internal agents that includes approved MCP servers such as: Customer profile lookup Order status retrieval Knowledge base search Ticket creation Workflow automation tools By adding MCP servers to products, teams can use familiar controls such as subscriptions, quotas, approval workflows, and access management to govern how MCP capabilities are consumed. Why it matters: MCP servers are no longer isolated endpoints. They can be bundled, governed, and delivered as secure, consumable products. 2. MCP tool observability As agents use MCP servers to discover and invoke tools, teams need more than basic traffic visibility. They need end-to-end trace context for each agent-to-tool interaction. 
With MCP observability in Azure API Management, teams can inspect key MCP-specific details, including: Operation context: whether the request was a tools/list or tools/call operation Session context: the MCP session ID through gen_ai.conversation.id Client context: MCP client name and version Protocol context: MCP protocol name and version Server context: MCP server name and version Access context: authentication type and API type Tool context: tool name and tool type for tool invocation traces Error context: error type and error message when a call fails Payload context: tool invocation arguments and results when payload logging is enabled This is especially important for agentic workflows, where a single user request may trigger multiple tool calls across different systems. With APIM, MCP traffic can be traced, inspected, and monitored using the same operational practices teams already use across their API estate. Why it matters: MCP servers are not just accessible through APIM — they are observable. Platform teams can trace tool calls, inspect errors, and understand MCP usage with the same operational discipline they expect from managed APIs. 3. Expose multiple MCP versions Enterprise teams need safe ways to evolve MCP servers over time. With MCP server versioning in Azure API Management, you can expose multiple versions of the same MCP server side by side. This allows teams to run a stable GA version while introducing a preview or next version for early adopters. For example: v1 can serve the majority of production traffic. v2 can be exposed to a subset of consumers for testing. Teams can monitor adoption, errors, latency, and behavior. Once the new version is validated, v2 can be promoted with confidence. This pattern is especially useful when MCP tools evolve, schemas change, new capabilities are added, or teams want to validate agent behavior before rolling changes out broadly. 
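As a rough illustration of the "monitor, then promote" step, the sketch below compares error rates for two versions from a handful of request records. The record shape (`version`, `failed`), the helper names, and the promotion tolerance are all hypothetical; in practice the data would come from wherever your API Management instance sends its telemetry.

```python
from collections import defaultdict

# Hypothetical request records; real data would come from your telemetry store.
requests = [
    {"version": "v1", "failed": False},
    {"version": "v1", "failed": True},
    {"version": "v1", "failed": False},
    {"version": "v1", "failed": False},
    {"version": "v2", "failed": False},
    {"version": "v2", "failed": False},
]


def error_rate_by_version(records):
    """Return {version: fraction of requests that failed}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for record in records:
        totals[record["version"]] += 1
        failures[record["version"]] += int(record["failed"])
    return {version: failures[version] / totals[version] for version in totals}


def ready_to_promote(rates, candidate="v2", baseline="v1", tolerance=0.01):
    """Assumed rule: the candidate may be at most `tolerance` worse than the baseline."""
    return rates[candidate] <= rates[baseline] + tolerance


rates = error_rate_by_version(requests)
print(rates, ready_to_promote(rates))
```

A real promotion decision would also weigh latency, adoption, and agent behavior, as described above; error rate is used here only because it is the simplest signal to show.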
Why it matters: MCP servers can now follow a safer lifecycle model: preview, validate, route, promote, and retire. 4. Management API and Infrastructure as Code MCP server management also needs to work at enterprise scale. With Management API and Infrastructure as Code support, teams can provision and configure MCP servers programmatically through Azure API Management APIs and automation pipelines. This allows platform teams to define MCP server resources as part of repeatable deployment workflows using tools such as Bicep, Terraform, ARM, REST APIs, and CI/CD pipelines. Teams can automate configuration for: MCP server endpoints Runtime and transport settings Authentication configuration Metadata and ownership Versioning Product association Policies Environment promotion This is critical for organizations that need consistent MCP governance across development, test, staging, and production environments. Why it matters: MCP server management can now be automated, reviewed, deployed, and governed like the rest of your API platform. How these capabilities work together Individually, each capability solves an important operational need. Together, they create a complete management model for MCP servers in Azure API Management. A platform team can: Register or expose MCP servers through Azure API Management. Package them into products for specific consumers. Apply access controls, subscriptions, quotas, and policies. Observe tool-level usage, latency, errors, traces, and cost. Run multiple versions side by side. Promote changes safely. Automate deployment through APIs and Infrastructure as Code. This brings the full API management playbook to MCP. Instead of treating MCP servers as unmanaged agent extensions, organizations can operate them as governed enterprise resources. Example scenario Imagine a company building internal copilots for customer support, sales, and operations. 
Each copilot needs access to different tools:

- Customer lookup
- Order history
- Case management
- Knowledge search
- Refund workflows
- Escalation workflows

With MCP and Azure API Management, the platform team can expose these capabilities as MCP servers and organize them into products.

- The customer support copilot can subscribe to the support product.
- The sales copilot can subscribe to the sales product.
- Early adopters can be routed to a preview version of a tool.
- Operations teams can monitor usage, errors, latency, traces, and cost.
- Platform teams can automate the entire setup across environments.

The result is a more governed and scalable way to bring MCP-based tools into enterprise agent workflows.

Getting started

To get started with MCP server management in Azure API Management:

- Create or identify an MCP server you want to expose through Azure API Management.
- Add the MCP server as a managed resource in APIM.
- Add the MCP server to an APIM product.
- Configure access, subscriptions, quotas, and approval workflows.
- Enable observability to monitor tool-level usage and traces.
- Use versioning to manage preview and production versions.
- Use the Management API or Infrastructure as Code to automate configuration.

Conclusion

MCP is quickly becoming an important standard for connecting agents to tools and enterprise capabilities. But for MCP to succeed in production, organizations need more than connectivity. They need governance, lifecycle management, observability, and automation.

With these new MCP server management capabilities in Azure API Management, platform teams can manage MCP servers using the same trusted patterns they already use for APIs. MCP servers are now first-class APIM resources: productized, observable, versionable, and automatable.

We are excited to see how customers use these capabilities to build the next generation of governed, enterprise-ready agentic applications.

New AI gateway capabilities in Azure API Management
Multi-model, multi-protocol AI applications are quickly becoming the norm. Teams are mixing OpenAI, Anthropic, and Vertex AI models, exposing tools through MCP, and wiring agents together with A2A. As that surface grows, so does the work of keeping it secure, observable, and consistent. Our ongoing strategy for the AI gateway capabilities in Azure API Management centers on that problem: providing one place to manage models, MCP tools, and agents, no matter which provider or protocol is behind them. The updates below are the latest steps in that direction. Unified Model API (preview) The headline change in this release: the Unified Model API lets clients speak one API format — OpenAI Chat Completions — while API Management transforms requests to the backend provider, whether that's a model using OpenAI Chat Completions or Anthropic Messages API. By centralizing model access behind a single API layer, you can: Standardize on a single API format for clients, independently from the formats used by backend models. Unify observability, security, and governance with policies that apply across model providers. Configure failover across model providers. Decouple client-facing model names from backend model names using aliases. Learn more about the unified model API. Model aliases Model aliases give clients a stable, provider-neutral name to use when calling a model. By assigning an alias like gpt or claude-sonnet, you decouple the client-facing model name from the actual backend deployment. That makes a few common operations a lot easier: Upgrading a model. Update the alias target to point at a new version — no client code changes required. A/B tests. Shift traffic between backends behind the same alias using API Management's load balancing capabilities. Vendor swaps. Replace one provider with another without touching application code. Model discovery Developers can discover available models by calling the /models endpoint of the Unified Model API. 
API Management returns the list of model aliases, so apps and tools can adapt to what the platform team has published — without out-of-band documentation. Anthropic and Vertex AI models (GA) AI gateway policies and observability now work with Anthropic and Google Vertex AI models, alongside the providers we already support. You can: Apply runtime policies such as content safety, token limits, and semantic caching to Anthropic and Vertex AI traffic. Collect logs, traces, and metrics for these models in the same place as the rest of your AI traffic. If you're running a multi-provider setup, you no longer need a separate governance story for each vendor. Learn more about AI gateway capabilities in API Management. Anthropic API operations in Microsoft Foundry import When you import a Microsoft Foundry resource as an API in Azure API Management, the import now creates operations for Anthropic APIs alongside the existing model APIs. In a few clicks, you can stand up an API that mediates traffic to Foundry models using either the OpenAI or Anthropic API format — no manual operation definitions needed — and then apply the same policies, security, and observability you use for the rest of your AI traffic. Learn more about Microsoft Foundry import. Token metrics for additional token types (preview) Token tracking used to stop at prompt, completion, and total tokens. Modern models add cached, reasoning, and thinking tokens, which can make up a significant share of token consumption, cost, and latency. API Management now logs metrics for these additional token types into Application Insights, across API formats (OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages API) and providers (Microsoft Foundry, OpenAI, Amazon Bedrock, Google Vertex AI, and others). With richer signals, your cost dashboards, budget alerts, and capacity planning can actually reflect how today's models behave. Learn more about token metrics. 
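To make the point about additional token types concrete, here is a minimal sketch that totals token counts across requests and reports how much of the traffic is cached or reasoning tokens. The field names and sample numbers are hypothetical, not the metric names API Management emits. The sketch also assumes cached tokens are a subset of prompt tokens and reasoning tokens a subset of completion tokens, which matches OpenAI-style usage reporting; other providers may count them differently.

```python
from collections import Counter

# Hypothetical per-request token records; field names are illustrative only.
records = [
    {"prompt": 1200, "completion": 700, "cached": 800, "reasoning": 450},
    {"prompt": 500, "completion": 1000, "cached": 0, "reasoning": 900},
]


def summarize_tokens(rows):
    """Sum every token type across all requests."""
    totals = Counter()
    for row in rows:
        totals.update(row)  # Counter.update adds counts key by key
    return dict(totals)


def share(part, whole):
    """Fraction of `whole` made up by `part`; 0.0 when there is no traffic."""
    return part / whole if whole else 0.0


totals = summarize_tokens(records)
cached_share = share(totals["cached"], totals["prompt"])
reasoning_share = share(totals["reasoning"], totals["completion"])
print(totals)
print(f"cached: {cached_share:.0%} of prompt, reasoning: {reasoning_share:.0%} of completion")
```

In this made-up sample, most completion tokens are reasoning tokens, which is the kind of pattern a dashboard limited to prompt, completion, and total counts would not show.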
Content safety for MCP and A2A (GA) The llm-content-safety policy now covers MCP and A2A traffic in addition to LLM traffic. That includes MCP tool-call arguments, MCP response text, and A2A payloads. A couple of related improvements: llm-content-safety can now be configured directly as an outbound policy. Two new attributes — window-size and window-overlap-size — let you tune how messages exceeding the Azure Content Safety limit of 10,000 characters are chunked and forwarded for validation, balancing detection sensitivity with Azure Content Safety call volume. The result is one consistent safety policy across LLM, MCP, and A2A flows instead of stitching together custom filters per protocol. Learn more about the content safety policy. A2A APIs (GA) Support for Agent-to-Agent (A2A) APIs in API Management is now generally available. Agent APIs can now be governed with the same policies, identity, and observability you use for the rest of your APIs. What you can do with A2A APIs in API Management: Mediate JSON-RPC runtime operations to your agent backend with full policy support — including the content safety improvements above. Expose and manage agent cards, automatically transformed by API Management to represent the managed agent API. Log traces to Application Insights using OpenTelemetry GenAI semantic conventions for deep correlation between API and agent execution traces. What's new in GA, on top of the preview: Available in classic tiers, in addition to v2 tiers — bring A2A governance to existing API Management resources without migrating tiers. Richer diagnostic logging for A2A APIs, giving more actionable telemetry for monitoring and troubleshooting agent traffic. Learn more about A2A support in API Management. Related: Bring Your Own Model in Foundry Agent Service (GA) Last month, Bring Your Own Model (BYOM) in Foundry Agent Service went GA. 
BYOM lets enterprise teams route Foundry agent model calls through their own infrastructure, typically for compliance, governance, or to reuse an existing model gateway. This pairs naturally with the AI gateway capabilities in Azure API Management. Put API Management in front of your models, apply the policies and observability described above, and have Foundry agents call through it to get consistent governance for both your direct AI traffic and your agent workloads.

Get started

Together, these updates make Azure API Management a more complete AI gateway: consistent governance, security, and observability across models from various providers, MCP tools, and agent interactions.

Some of these features are still rolling out. They will first become available in v2 tiers of API Management and in the AI release channel for classic tiers, then continue rolling out to the rest of classic tier resources over the following weeks.

Get started with the unified model API or explore the AI gateway capabilities in API Management.

MCP Test Console and Git Repository synch in Azure API Center
Why This Matters As organizations race to build AI-powered applications, the Model Context Protocol (MCP) has emerged as the standard way to connect AI agents with external tools and data sources. Managing these MCP servers at enterprise scale, however, has been a growing challenge — until now. AI agents are only as useful as the tools they can access. MCP servers expose those tools — from databases and internal APIs to third-party services — in a standardized way that any AI agent or model can consume. As your MCP ecosystem grows, so does the challenge of keeping track of what's available, what's working, and what your teams are actually using. Azure API Center already serves as a centralized registry for APIs across your organization. Now it extends that same governance model to MCP servers, complete with developer-friendly discovery, live testing, and automated synchronization from your source repositories. New Feature: MCP Test Console in the API Center Portal Developers can now test MCP server tools interactively without leaving the Azure portal. Once an MCP server is registered in your API Center inventory, the API Center portal — your organization's customizable developer portal — surfaces a dedicated test console on the server's Documentation tab. Developers simply select a tool, click Run tool, and immediately see the response. This means your teams can: Validate tools before connecting them to agents — no more building a test harness from scratch. Explore tool schemas interactively — the portal surfaces endpoint details and input/output schemas alongside the live console. Onboard faster — developers browsing your internal MCP registry can go from discovery to verified integration in minutes. The MCP server tiles in the portal provide a clear, browsable view of all registered servers. 
Each tile surfaces the server's endpoint URL, available tools, and installation instructions for Visual Studio Code — giving developers everything they need to get started in one place. Getting started: Set up your API Center portal, then navigate to any registered MCP server. On the Documentation tab, select a tool and click Run tool to open the test console. New Feature: Synch MCP Servers from a Git Repository Managing API assets shouldn't require manual registration every time something changes. With Git repository integration, Azure API Center can automatically sync assets — including MCP server definitions — directly from your source repository. How It Works When you connect a Git repository to your API Center: An environment is created in your API Center representing the repository as an asset source. API Center regularly synchronizes MCP servers from the repository into your inventory — no manual intervention required. Assets appear in your inventory on the Inventory > Assets page with a visual link indicator, making it easy to identify which assets are source-controlled. This is especially valuable for teams that maintain MCP server definitions, skill files, or OpenAPI specs in version control. As your repository evolves, your API Center inventory stays current automatically. Setting It Up Step 1: Secure your access credentials (for private repos) If your repository is private, store a personal access token (PAT) as a secret in Azure Key Vault. Your API Center instance uses a managed identity to retrieve this secret securely — you can configure the managed identity manually or let API Center handle it automatically during the integration setup. Step 2: Connect the repository In the Azure portal, go to your API Center and navigate to Platforms > Integrations > + New integration > From Git repository. You'll configure: Repository URL — including an optional branch and subfolder path (e.g., https://github.com/<org>/<repo>/tree/main/skills). 
Git provider — such as GitHub. Asset type configuration — API Center defaults to a skill asset type with the file pattern **/skill.md, but you can add additional asset types to match your repository structure. PAT reference — select the Key Vault secret containing your PAT, if applicable. Environment details — give the repository environment a friendly name, resource ID, type (e.g., Production), and lifecycle stage for synced assets. Step 3: Let the sync run Once created, the integration runs automatically. Your assets will appear in the Inventory > Assets view, linked to their source in the repository. Access Control for Private Repositories The integration uses Azure's managed identity framework to authenticate to Key Vault. Assign your API Center's managed identity the Key Vault Secrets User role on your Key Vault to grant the necessary read access. If you prefer, API Center can configure this automatically — just enable the Automatically configure managed identity and assign permissions option during integration setup. Bringing It Together: A Complete MCP Governance Story Together, these two features complete an end-to-end workflow for enterprise MCP governance: Register → Connect your Git repository and let API Center automatically synch your MCP servers and skills as they evolve. Discover → Developers and AI engineers browse the API Center portal to find the right MCP server for their agent, with full schema visibility and endpoint details. Test → The built-in test console lets developers validate tools interactively before committing to an integration. Govern → Use API Center's access management capabilities to control who can view and consume specific MCP servers across your organization. And if you're building MCP servers on Azure services, the registry integrates directly with Azure API Management, Azure Logic Apps, and Azure Functions — so your MCP ecosystem and your API ecosystem share a single source of truth. 
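For the asset type configuration in the setup steps, it can help to check in advance which files in a repository the default `**/skill.md` pattern would select. The sketch below assumes common glob semantics (`**/` matches zero or more directories, `*` matches within a single path segment, matching is case-sensitive). This post does not spell out the matching rules API Center applies, so treat the helper as an approximation. The function name and the sample paths are hypothetical.

```python
import re


def glob_to_regex(pattern):
    """Translate a simple glob ('**/' and '*' only) into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")  # zero or more whole directories
            i += 3
        elif pattern[i] == "*":
            parts.append(r"[^/]*")  # any characters within one segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


skill_pattern = glob_to_regex("**/skill.md")

# Hypothetical repository paths, relative to the configured subfolder.
paths = [
    "skill.md",
    "skills/triage/skill.md",
    "skills/triage/README.md",
    "skills/myskill.md",
]
selected = [p for p in paths if skill_pattern.match(p)]
print(selected)
```

Under these assumptions only files named exactly `skill.md` are picked up, at any depth; a file such as `skills/myskill.md` would need its own asset type pattern.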
Get Started

- Register and discover MCP servers in Azure API Center
- Synchronize API assets from a Git repository
- Set up the API Center portal
- Explore MCP Center, Azure API Center's public MCP registry

More Control, Less Overhead: Custom Domain Upgrades in Azure API Management v2
Multiple custom domains in Premium v2 Large organizations rarely expose APIs under a single domain. A global enterprise might need api.contoso.com for external partners, apis.hrportal.contoso.com for internal teams, and dev.europe.contoso.com for a regional developer portal — all at once. Until now, achieving this required spinning up separate API Management instances, adding cost and operational complexity. Azure API Management Premium v2 now supports multiple custom domains within a single instance — across gateway, developer portal, and management endpoints. This allows organizations to: Configure distinct hostnames for different endpoints and target audiences Align API experiences with business units, products, or regional brands Simplify domain-scoped networking and security policies Reduce the need for separate APIM instances created solely for domain separation For enterprises managing large, distributed API estates, this provides greater flexibility in how APIs and developer experiences are exposed — while maintaining centralized governance. Wildcard custom hostnames in Premium v2 and Standard v2 As API estates grow, managing individual certificates for every subdomain becomes a scaling problem fast. Each new surface — payments.api.contoso.com, inventory.api.contoso.com, orders.api.contoso.com — previously required its own hostname registration and certificate. Ten new API surfaces meant ten separate management tasks. Azure API Management Premium v2 and Standard v2 now support wildcard entries in custom hostnames. A single *.api.contoso.com entry paired with a single wildcard certificate covers all subdomains automatically — no per-subdomain configuration required. 
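The sketch below shows how a single wildcard entry can stand in for many per-subdomain entries. It is an illustration only, not API Management's routing logic. It assumes the usual TLS wildcard convention, where `*` covers exactly one DNS label, so `*.api.contoso.com` matches `payments.api.contoso.com` but not `api.contoso.com` or `a.b.api.contoso.com`. The function name is hypothetical.

```python
def matches_hostname(entry, hostname):
    """Return True if `hostname` is covered by `entry` (exact or '*.' wildcard)."""
    entry = entry.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not entry.startswith("*."):
        return entry == hostname  # plain entries must match exactly
    suffix = entry[1:]  # e.g. ".api.contoso.com"
    if not hostname.endswith(suffix):
        return False
    label = hostname[: -len(suffix)]
    # Assumption: the wildcard covers exactly one non-empty label.
    return bool(label) and "." not in label


for host in ("payments.api.contoso.com", "Orders.API.contoso.com",
             "api.contoso.com", "a.b.api.contoso.com"):
    print(host, matches_hostname("*.api.contoso.com", host))
```

With one-label matching, a new surface such as `inventory.api.contoso.com` is covered as soon as DNS points at the gateway, while deeper names would still need their own entry.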
Wildcard hostnames help teams:

- Simplify certificate and domain management at scale
- Accelerate onboarding of new API surfaces without repeated hostname setup
- Maintain consistent branded endpoints across dynamic subdomains
- Reduce operational overhead for rapidly growing API environments

By extending this capability to both Premium v2 and Standard v2, Azure API Management makes flexible, scalable domain management accessible to more organizations without requiring higher-tier deployments.

Both updates are generally available now. Learn more about Azure API Management v2 tiers and how they help organizations build scalable, enterprise-grade API platforms.

Further reading: Configure a custom domain name for Azure API Management

Azure API Center Introduces a Data Plane MCP Server for Enterprise-Wide API and AI Asset Discovery
As organizations scale their adoption of MCP-based tooling and AI agents, one challenge keeps surfacing: developers spend too much time figuring out what APIs, tools, and AI assets exist — and then manually wiring up connections to each one. Today, we're excited to announce general availability of a new capability that changes that. What's new Azure API Center now provides a data plane MCP server — a unified enterprise discovery endpoint that gives agents and developer tools a single connection point to your organization's full catalog of registered MCP servers, tools, APIs, and AI assets. Instead of hunting across systems or hand-configuring integrations one by one, developers and agents can now connect once and immediately access everything that's been registered in your API Center. Why this matters The MCP ecosystem is growing fast. So is the number of enterprise APIs and AI assets that teams need to manage and consume. Without a central discovery mechanism, that growth creates friction — more manual configuration, more drift between what's available and what's actually reachable, and more integration complexity for every new agentic application. The Azure API Center data plane MCP server addresses this directly. With it, teams can: Give agents centralized access to enterprise APIs and AI assets without custom routing logic Eliminate manual configuration of connections to individual MCP servers Automatically surface newly registered MCP servers and tools without reconfiguration Simplify discovery and consumption across a rapidly growing enterprise catalog Built for how organizations actually operate Agentic applications don't just need APIs — they need to find the right APIs, trust that the catalog is current, and connect reliably at scale. By acting as a unified discovery endpoint, Azure API Center helps teams operationalize AI ecosystems with stronger discoverability, governance, and developer productivity, while meaningfully reducing integration complexity. 
This is especially valuable as enterprises move from experimenting with AI agents to deploying them in production workflows, where manual integration approaches don't scale.

How to enable the data plane MCP server

Turning on the MCP server takes just a few clicks in the Azure portal. Navigate to your API Center instance and open Data API settings under the Consumption section in the left-hand menu. From there, under MCP endpoint, toggle Enable API Center MCP endpoint to on. Once enabled, your MCP endpoint URL (in the form https://<your-instance>.data.<region>.azure-apicenter.ms/mcp) will appear and can be copied directly for use in agent configurations or developer tools.

Note: When enabled, the MCP endpoint is also surfaced on the developer portal homepage, so developers can connect via CLI without needing to look up the URL separately.

You can also enable the Plugin marketplace endpoint from the same settings page to let developers browse and install approved plugins and skills from your organization's marketplace. The Visibility section lets you control which APIs are exposed through the data plane; use Add condition to filter the catalog based on your governance requirements.

Get started

Learn more about Azure API Center and how organizations are building unified catalogs for APIs, MCP tools, agents, and AI assets.

Find what you need, faster: Azure API Center now supports custom metadata filtering
Enterprise API and AI catalogs have expanded dramatically. Where teams once managed dozens of APIs, they now govern hundreds — spanning business units, environments, compliance domains, and an ever-growing roster of AI assets. The catalog itself has become a discovery challenge. What's new Developers can now filter catalog assets using organization-defined metadata attributes. These aren't generic tags — they're the classifications your organization already uses: environments, business units, domains, compliance tiers, ownership groups, and more. Custom metadata filtering works across all major asset types in Azure API Center: APIs Skills Agents MCP tools Why it matters Discovery friction is a hidden tax on developer productivity. When a developer needs to find the right API for a project, every minute spent navigating inconsistent lists or applying manual filters is a minute not spent building. At scale, this compounds quickly. Custom metadata filtering addresses this directly by aligning the catalog's search experience with how your organization already thinks about its assets: Surface the right assets faster — filter by internal classifications and governance models instead of browsing overwhelming lists Improve discoverability at scale — no need to retag or reorganize existing assets to make them findable Align with your organizational taxonomy — filter by domain, environment, business unit, compliance requirement, or any custom attribute your teams already use Built for governed, AI-ready teams This update reinforces Azure API Center's role as the foundation for scalable, AI-ready discovery experiences — where governance and developer velocity move together, not against each other. By making enterprise catalogs easier to navigate, Azure API Center helps developers spend less time searching and more time building with governed APIs and AI assets. 
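Conceptually, custom metadata filtering keeps only those catalog entries whose organization-defined attributes match every selected value. The sketch below illustrates that idea over an in-memory list; it is not the API Center query interface, and the asset records, attribute names, and `filter_assets` helper are all hypothetical.

```python
# Hypothetical catalog entries with organization-defined metadata.
assets = [
    {"name": "orders-api", "kind": "API",
     "metadata": {"environment": "production", "business_unit": "retail"}},
    {"name": "refund-skill", "kind": "Skill",
     "metadata": {"environment": "staging", "business_unit": "retail"}},
    {"name": "hr-lookup", "kind": "MCP tool",
     "metadata": {"environment": "production", "business_unit": "hr"}},
]


def filter_assets(catalog, **criteria):
    """Keep assets whose metadata equals every requested attribute value."""
    return [
        asset for asset in catalog
        if all(asset.get("metadata", {}).get(key) == value
               for key, value in criteria.items())
    ]


retail_prod = filter_assets(assets, environment="production", business_unit="retail")
print([asset["name"] for asset in retail_prod])
```

Because the filter works on attributes rather than asset kind, the same call narrows APIs, skills, agents, and MCP tools alike, provided they carry the same metadata keys.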
Get started

Learn more about Azure API Center custom metadata filtering and how organizations are building scalable, AI-ready discovery experiences.

GA: Azure API Center Now Supports Plugin Registration
As organizations scale their AI and integration ecosystems, one challenge keeps surfacing: developers don't have a reliable, governed way to discover and reuse the plugins their teams have already built. Plugins end up siloed in individual repos, shared over Slack, or duplicated across teams — slowing down development and creating governance blind spots. What's new With this update, developers can register plugins directly into Azure API Center's enterprise catalog. Once registered, plugins are discoverable, governable, and consumable alongside the rest of an organization's API and AI portfolio — no more hunting across repositories or relying on word-of-mouth to find what's already been built. Why it matters Plugins are increasingly central to how AI-powered applications are built. Agents depend on them. Integrations are built on top of them. But without a governed home, even well-built plugins go undiscovered and get rebuilt from scratch. Plugin registration in Azure API Center helps organizations: Centralize plugin discovery within a governed catalog, so developers always know where to look Surface vetted plugins that teams can confidently find and reuse — reducing duplication and accelerating development Align plugins with source-controlled workflows, keeping development practices consistent across the catalog Reduce friction between building a plugin and enabling it for real-world integration One catalog for your entire AI ecosystem This update reflects a broader vision for Azure API Center: a single, unified catalog for everything an organization builds and consumes — APIs, plugins, agents, and AI assets. By bringing plugins into this experience, teams can operationalize reusable integration and AI capabilities at scale, with the governance and discoverability that enterprise development requires. 
Whether your teams are building copilot extensions, orchestration layers, or custom integrations, Azure API Center gives them a governed foundation to build on, and a shared place to discover what's already there.

Learn more about Azure API Center and how organizations are building centralized catalogs for APIs, plugins, agents, and AI assets.

Azure API Center now supports agent registration, agent assessment, and Git-based synchronization
What's new
Three capabilities are now generally available:

Capability | What it does
Agent registration | Register agents directly into the enterprise catalog for cross-team discovery and reuse.
Agent assessment | LLM-as-a-Judge framework scores agents across 6 criteria before catalog registration.
Git synchronization | Connect a repo and keep agent definitions automatically in sync with source control.

Agent assessment: six weighted criteria, automatically enforced
When assessment is enabled, every agent is scored on creation or update. Up to 8 criteria are supported; weights are fully configurable by platform teams.

Criterion | Weight | What it evaluates
capability-transparency | 0.15 | Documents tools, delegated capabilities, external systems, access levels, and capability boundaries.
composition-resource-discipline | 0.15 | Named skill/sub-agent references, invoke guidance, no inline duplication, resource/token constraints.
operational-protocol-quality | 0.25 | Structured workflow with named steps, decision points, failure modes, recovery paths, and pre-flight checks.
output-contract | 0.10 | Specifies output format, mandatory sections, evidence requirements, confidence semantics, and dead-end handling.
purpose-scope-clarity | 0.15 | Clear role identity, type (specialist/orchestrator), activation triggers, anti-triggers, and refusal behavior.
safety-consent-architecture | 0.20 | Classifies risk, distinguishes idempotent vs approval-required actions, enumerates NEVER/ALWAYS rules, documents failure-mode safety.

Git-based synchronization: connect once, stay in sync automatically
Integrating a Git repository creates an environment in API Center representing the repo as an asset source. API Center polls for changes and reflects them in Inventory > Assets; linked assets display a provenance icon.

Agent definitions live in source control and evolve continuously, yet keeping a catalog manually in sync with a codebase is slow, error-prone work that no platform team should have to do.
Git-based synchronization connects Azure API Center directly to your repository, polling for changes and reflecting them in the inventory automatically so the catalog always represents the current state of your agents. Linked assets carry a provenance icon that traces them back to their source repository, giving developers immediate visibility into where an agent comes from and whether it is actively maintained. Because assessment gates run before any version is promoted from the repo into the catalog, only agents that meet your organization's quality criteria ever reach developers — enforcing governance at the point of commit, not as an afterthought. A single integration supports multiple asset types through configurable file patterns, so teams can sync agents, skills, and APIs from one repository with per-type control and no duplication. A2A agent sync from Azure API Management — publish once, discover everywhere Azure API Management → continuous sync → Azure API Center One-way · updates within minutes · includes API definitions, environments & deployments As teams publish more A2A agents through Azure API Management, manually registering each one into a discovery catalog creates friction that slows developer productivity and risks catalog staleness. Azure API Center now automatically synchronizes A2A agents — alongside APIs and MCP servers — published in an API Management instance, so every agent that reaches runtime is immediately visible in the centralized catalog without any additional registration step. The sync is continuous and one-way: when agents are created, updated, versioned, or deleted in API Management, those changes propagate to API Center within minutes, keeping the inventory accurate at all times. Each synchronized agent gets an associated environment and deployment record, giving developers the runtime context they need to discover and integrate the right agent with confidence. 
This closes the loop between runtime publishing and centralized governance, helping organizations operationalize agent ecosystems at scale without burdening platform teams with manual catalog maintenance.

Why it matters
By bringing agents into Azure API Center alongside APIs, plugins, skills, and MCP tools, organizations gain a single pane of glass for everything their AI applications depend on. Teams reduce duplication, improve reuse, and accelerate development, while maintaining the governance standards enterprise deployments require.

Get started
Azure API Center overview
Set up Git-based synchronization
Sync A2A agents from API Management

Fine-Tuning and Deploying Phi-3.5 Model with Azure and AI Toolkit
What is Phi-3.5?
Phi-3.5 is a state-of-the-art language model with strong multilingual capabilities. It is designed to handle multiple languages with high proficiency, making it a versatile tool for Natural Language Processing (NLP) tasks across different linguistic backgrounds.

Key Features of Phi-3.5
Multilingual Capabilities: The model supports a wide variety of languages, including major world languages such as English, Spanish, Chinese, and French. For example, it can translate a sentence or document from one language to another while preserving context and meaning.
Fine-Tuning Ability: The model can be fine-tuned for specific use cases. For instance, in a customer support setting, the Phi-3.5 model can be fine-tuned to understand the nuances of different languages used by customers across the globe, improving response accuracy.
High Performance in NLP Tasks: Phi-3.5 is optimized for tasks like text classification, machine translation, and summarization. It handles large-scale datasets well and produces coherent, contextually appropriate output.

Applications in Real-World Scenarios
Customer Support Chatbots: For companies with global customer bases, the model’s multilingual support can enhance chatbot capabilities, allowing for real-time responses in a customer’s native language, no matter where they are located.
Content Creation for Global Markets: Businesses can use Phi-3.5 to automatically generate or translate content for different regions. For example, marketing copy can be adapted to fit cultural and linguistic nuances in multiple languages.
Document Summarization Across Languages: The model can summarize long documents or articles written in one language and then translate the summary into another language, improving access to information for non-native speakers.

Why Choose Phi-3.5 for Your Project?
Versatility: It’s not limited to just one or two languages but performs well across many.
Customization: The ability to fine-tune it for particular use cases or industries makes it highly adaptable.
Ease of Deployment: With tools like Azure ML and Ollama, deploying Phi-3.5 in the cloud or locally is accessible even for smaller teams.

Objective Of Blog
Small Language Models (SLMs) are at the forefront of advancements in Natural Language Processing, offering fine-tuned, high-performance solutions for specific tasks and languages. Among these, the Phi-3.5 model has emerged as a powerful tool, excelling in its multilingual capabilities. Whether you're working with English, Spanish, Mandarin, or any other major world language, Phi-3.5 offers robust, reliable language processing that adapts to various real-world applications. This makes it an ideal choice for businesses looking to deploy multilingual chatbots, automate content generation, or translate customer interactions in real time. Moreover, its fine-tuning ability allows for customization, making Phi-3.5 versatile across industries and tasks.

Customization and Fine-Tuning for Different Applications
The Phi-3.5 model is not limited to general language understanding tasks. It can be fine-tuned for specific applications, industries, and languages, allowing users to tailor its performance to meet their needs.
Customizable for Industry-Specific Use Cases: With fine-tuning, the model can be trained further on domain-specific data to handle particular use cases like legal document translation, medical records analysis, or technical support.
Example: A healthcare company can fine-tune Phi-3.5 to understand medical terminology in multiple languages, enabling it to assist in processing patient records or generating multilingual health reports. Adapting for Specialized Tasks: You can train Phi-3.5 to perform specialized tasks like sentiment analysis, text summarization, or named entity recognition in specific languages. Fine-tuning helps enhance the model's ability to handle unique text formats or requirements. Example: A marketing team can fine-tune the model to analyse customer feedback in different languages to identify trends or sentiment across various regions. The model can quickly classify feedback as positive, negative, or neutral, even in less widely spoken languages like Arabic or Korean. Applications in Real-World Scenarios To illustrate the versatility of Phi-3.5, here are some real-world applications where this model excels, demonstrating its multilingual capabilities and customization potential: Case Study 1: Multilingual Customer Support Chatbots Many global companies rely on chatbots to handle customer queries in real-time. With Phi-3.5’s multilingual abilities, businesses can deploy a single model that understands and responds in multiple languages, cutting down on the need to create language-specific chatbots. Example: A global airline can use Phi-3.5 to power its customer service bot. Passengers from different countries can inquire about their flight status or baggage policies in their native languages—whether it's Japanese, Hindi, or Portuguese—and the model responds accurately in the appropriate language. Case Study 2: Multilingual Content Generation Phi-3.5 is also useful for businesses that need to generate content in different languages. For example, marketing campaigns often require creating region-specific ads or blog posts in multiple languages. 
Phi-3.5 can help automate this process by generating localized content that is not just translated but adapted to fit the cultural context of the target audience. Example: An international cosmetics brand can use Phi-3.5 to automatically generate product descriptions for different regions. Instead of merely translating a product description from English to Spanish, the model can tailor the description to fit cultural expectations, using language that resonates with Spanish-speaking audiences. Case Study 3: Document Translation and Summarization Phi-3.5 can be used to translate or summarize complex documents across languages. Its ability to preserve meaning and context across languages makes it ideal for industries where accuracy is crucial, such as legal or academic fields. Example: A legal firm working on cross-border cases can use Phi-3.5 to translate contracts or legal briefs from German to English, ensuring the context and legal terminology are accurately preserved. It can also summarize lengthy documents in multiple languages, saving time for legal teams. Fine-Tuning Phi-3.5 Model Fine-tuning a language model like Phi-3.5 is a crucial step in adapting it to perform specific tasks or cater to specific domains. This section will walk through what fine-tuning is, its importance in NLP, and how to fine-tune the Phi-3.5 model using Azure Model Catalog for different languages and tasks. We'll also explore a code example and best practices for evaluating and validating the fine-tuned model. What is Fine-Tuning? Fine-tuning refers to the process of taking a pre-trained model and adapting it to a specific task or dataset by training it further on domain-specific data. In the context of NLP, fine-tuning is often required to ensure that the language model understands the nuances of a particular language, industry-specific terminology, or a specific use case. 
Why Fine-Tuning is Necessary
Pre-trained Large Language Models (LLMs) are trained on diverse datasets and can handle various tasks like text summarization, generation, and question answering. However, they may not perform optimally in specialized domains without fine-tuning. The goal of fine-tuning is to enhance the model's performance on specific tasks by leveraging its prior knowledge while adapting it to new contexts.

Challenges of Fine-Tuning
Resource Intensiveness: Fine-tuning large models can be computationally expensive, requiring significant hardware resources.
Storage Costs: Each fine-tuned model can be large, leading to increased storage needs when deploying multiple models for different tasks.

LoRA and QLoRA
To address these challenges, techniques like LoRA (Low-Rank Adaptation) and QLoRA (Quantized Low-Rank Adaptation) have emerged. Both methods aim to make the fine-tuning process more efficient:
LoRA: This technique reduces the number of trainable parameters by introducing low-rank matrices into the model while keeping the original model weights frozen. Only the small low-rank matrices are trained, which minimizes memory usage and speeds up the fine-tuning process.
QLoRA: An enhancement of LoRA, QLoRA quantizes the frozen base model weights to further reduce memory requirements. It allows large models to be fine-tuned on consumer hardware without the extensive resource demands typically associated with full fine-tuning.
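To make the parameter savings from LoRA concrete, here is a minimal arithmetic sketch. It assumes a single weight matrix of shape d_out x d_in that LoRA augments with two low-rank factors, B (d_out x r) and A (r x d_in). The layer size of 4096 x 4096 and the helper names are illustrative assumptions, not values or functions taken from Phi-3.5 or any library.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer dimensions, chosen only for illustration.
D_OUT, D_IN, RANK = 4096, 4096, 16

full = full_param_count(D_OUT, D_IN)
lora = lora_param_count(D_OUT, D_IN, RANK)

print(f"Full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (r={RANK}):    {lora:,} trainable parameters")
print(f"LoRA trains {lora / full:.2%} of the layer's parameters")
```

Under these assumptions, the low-rank factors hold well under 1% of the parameters of the full matrix, which is why the frozen base weights dominate memory and why quantizing them (as QLoRA does) helps further.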
    from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments
    from peft import get_peft_model, LoraConfig, TaskType

    # Load a pre-trained model (BERT is used here as a small stand-in)
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    # Configure LoRA
    lora_config = LoraConfig(
        task_type=TaskType.SEQ_CLS,  # sequence classification
        r=16,  # Rank of the low-rank matrices
        lora_alpha=32,
        lora_dropout=0.1,
    )

    # Wrap the model with LoRA; the base weights stay frozen
    model = get_peft_model(model, lora_config)

    # Define training arguments
    training_args = TrainingArguments(
        output_dir="./results",
        evaluation_strategy="epoch",  # renamed to eval_strategy in newer transformers releases
        learning_rate=2e-5,
        per_device_train_batch_size=16,
        per_device_eval_batch_size=16,
        num_train_epochs=3,
    )

    # Create a Trainer
    # train_dataset and eval_dataset are assumed to be tokenized datasets prepared beforehand
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )

    # Start fine-tuning
    trainer.train()

This code outlines how to set up a model for fine-tuning using LoRA, which can significantly reduce the resource requirements while still adapting the model effectively to specific tasks. In summary, fine-tuning with methods like LoRA and QLoRA is essential for optimizing pre-trained models for specific applications in NLP, making it feasible to deploy these powerful models in various domains efficiently.

Why is Fine-Tuning Important in NLP?
Task-Specific Performance: Fine-tuning helps improve performance for tasks like text classification, machine translation, or sentiment analysis in specific domains (e.g., legal, healthcare).
Language-Specific Adaptation: Since models like Phi-3.5 are trained on general datasets, fine-tuning helps them handle industry-specific jargon or linguistic quirks.
Efficient Resource Utilization: Instead of training a model from scratch, fine-tuning leverages pre-trained knowledge, saving computational resources and time.

Steps to Fine-Tune Phi-3.5 in Azure AI Foundry
Fine-tuning the Phi-3.5 model in Azure AI Foundry involves several key steps.
Azure provides a user-friendly interface to streamline model customization, allowing you to quickly configure, train, and deploy models.

Step 1: Setting Up the Environment in Azure AI Foundry
1) Access Azure AI Foundry: Log in to Azure AI Foundry. If you don’t have an account, you can create one and set up a workspace.
2) Create a New Experiment: Once in Azure AI Foundry, create a new training experiment. Choose the Phi-3.5 model from the pre-trained models provided in the Azure model catalog.
3) Set Up the Data for Fine-Tuning: Upload your custom dataset for fine-tuning. Ensure the dataset is in a compatible format (e.g., CSV, JSON). For instance, if you are fine-tuning the model for a customer service chatbot, you could upload customer queries in different languages.

Step 2: Configure Fine-Tuning Settings
1) Select the Training Dataset: Select the dataset you uploaded and link it to the Phi-3.5 model.
2) Configure the Hyperparameters: Set up training hyperparameters like the number of epochs, learning rate, and batch size. You may need to experiment with these settings to achieve optimal performance.
3) Choose the Task Type: Specify the task you are fine-tuning for, such as text classification, translation, or summarization. This helps Azure AI Foundry understand how to optimize the model during fine-tuning.
4) Fine-Tuning for Specific Languages: If you are fine-tuning for a specific language or multilingual tasks, ensure that the dataset is labeled appropriately and contains enough examples in the target language(s). This will allow Phi-3.5 to learn language-specific features effectively.

Step 3: Train the Model
1) Launch the Training Process: Once the configuration is complete, launch the training process in Azure AI Foundry. Depending on the size of your dataset and the complexity of the model, this could take some time.
2) Monitor Training Progress: Use Azure AI Foundry’s built-in monitoring tools to track performance metrics such as loss, accuracy, and F1 score.
You can view the model’s progress during training to ensure that it is learning effectively.

Code Example: Fine-Tuning Phi-3.5 for a Specific Use Case
Here's an illustrative snippet showing the shape of a fine-tuning workflow in Python for a customer support chatbot in multiple languages. Treat it as pseudocode: the Foundry and Model names are placeholders rather than a confirmed SDK surface, so check the Azure AI Foundry SDK documentation for the actual client classes and methods.

    # Illustrative pseudocode: these module, class, and method names are placeholders
    from azure.ai import Foundry
    from azure.ai.model import Model

    # Initialize Azure AI Foundry
    foundry = Foundry()

    # Load the Phi-3.5 model
    model = Model.load("phi-3.5")

    # Set up the training dataset
    training_data = foundry.load_dataset("customer_queries_dataset")

    # Fine-tune the model
    model.fine_tune(training_data, epochs=5, learning_rate=0.001)

    # Save the fine-tuned model
    model.save("fine_tuned_phi_3.5")

Best Practices for Evaluating and Validating Fine-Tuned Models
Once the model is fine-tuned, it's essential to evaluate and validate its performance before deploying it in production.
Split Data for Validation: Always split your dataset into training and validation sets. Evaluating on unseen data is how you detect overfitting.
Evaluate Key Metrics: Measure performance using key metrics such as:
Accuracy: The proportion of correct predictions.
F1 Score: The harmonic mean of precision and recall.
Confusion Matrix: Helps visualize true vs. false predictions for classification tasks.
Cross-Language Validation: If the model is fine-tuned for multiple languages, test its performance across all supported languages to ensure consistency and accuracy.
Test in Production-Like Environments: Before full deployment, test the fine-tuned model in a production-like environment to catch any potential issues.
Continuous Monitoring and Re-Fine-Tuning: Once deployed, continuously monitor the model’s performance and re-fine-tune it periodically as new data becomes available.

Deploying Phi-3.5 Model
After fine-tuning the Phi-3.5 model, the next crucial step is deploying it to make it accessible for real-world applications.
This section will cover two key deployment strategies: deploying in Azure for cloud-based scaling and reliability, and deploying locally with AI Toolkit for simpler offline usage. Each deployment strategy offers its own advantages depending on the use case. Deploying in Azure Azure provides a powerful environment for deploying machine learning models at scale, enabling organizations to deploy models like Phi-3.5 with high availability, scalability, and robust security features. Azure AI Foundry simplifies the entire deployment pipeline. Set Up Azure AI Foundry Workspace: Log in to Azure AI Foundry and navigate to the workspace where the Phi-3.5 model was fine-tuned. Go to the Deployments section and create a new deployment environment for the model. Choose Compute Resources: Compute Target: Select a compute target suitable for your deployment. For large-scale usage, it’s advisable to choose a GPU-based compute instance. Example: Choose an Azure Kubernetes Service (AKS) cluster for handling large-scale requests efficiently. Configure Scaling Options: Azure allows you to set up auto-scaling based on traffic. This ensures that the model can handle surges in demand without affecting performance. Model Deployment Configuration: Create an Inference Pipeline: In Azure AI Foundry, set up an inference pipeline for your model. Specify the Model: Link the fine-tuned Phi-3.5 model to the deployment pipeline. Deploy the Model: Select the option to deploy the model to the chosen compute resource. Test the Deployment: Once the model is deployed, test the endpoint by sending sample requests to verify the predictions. Configuration Steps (Compute, Resources, Scaling) During deployment, Azure AI Foundry allows you to configure essential aspects like compute type, resource allocation, and scaling options. Compute Type: Choose between CPU or GPU clusters depending on the computational intensity of the model. 
Resource Allocation: Define the minimum and maximum resources to be allocated for the deployment. For real-time applications, use Azure Kubernetes Service (AKS) for high availability. For batch inference, Azure Container Instances (ACI) is suitable. Auto-Scaling: Set up automatic scaling of the compute instances based on the number of requests. For example, configure the deployment to start with 1 node and scale to 10 nodes during peak usage. Cost Comparison: Phi-3.5 vs. Larger Language Models When comparing the costs of using Phi-3.5 with larger language models (LLMs), several factors come into play, including computational resources, pricing structures, and performance efficiency. Here’s a breakdown: Cost Efficiency Phi-3.5: Designed as a Small Language Model (SLM), Phi-3.5 is optimized for lower computational costs. It offers competitive performance at a fraction of the cost of larger models, making it suitable for budget-conscious projects. The smaller size (3.8 billion parameters) allows for reduced resource consumption during both training and inference. Larger Language Models (e.g., GPT-3.5): Typically require more computational resources, leading to higher operational costs. Larger models may incur additional costs for storage and processing power, especially in cloud environments. Performance vs. Cost Performance Parity: Phi-3.5 has been shown to achieve performance parity with larger models on various benchmarks, including language comprehension and reasoning tasks. This means that for many applications, Phi-3.5 can deliver similar results to larger models without the associated costs. Use Case Suitability: For simpler tasks or applications that do not require extensive factual knowledge, Phi-3.5 is often the more cost-effective choice. Larger models may still be preferred for complex tasks requiring deep contextual understanding or extensive factual recall. 
Pricing Structure
Azure Pricing: Phi-3.5 is available through Azure with a pay-as-you-go billing model, allowing users to scale costs based on usage. Pricing details for Phi-3.5 can be found on the Azure pricing page, where users can customize options based on their needs.

Code Example: API Setup and Endpoints for Live Interaction
Below is a Python code snippet demonstrating how to interact with a deployed Phi-3.5 model via an API in Azure:

    import requests

    # Define the API endpoint and your API key
    api_url = "https://<your-azure-endpoint>/predict"
    api_key = "YOUR_API_KEY"

    # Prepare the input data
    input_data = {
        "text": "What are the benefits of renewable energy?"
    }

    # Make the API request
    response = requests.post(
        api_url,
        json=input_data,
        headers={"Authorization": f"Bearer {api_key}"},
    )

    # Print the model's response
    if response.status_code == 200:
        print("Model Response:", response.json())
    else:
        print("Error:", response.status_code, response.text)

Deploying Locally with AI Toolkit
For developers who prefer to run models on their local machines, the AI Toolkit provides a convenient solution. The AI Toolkit is a lightweight platform that simplifies local deployment of AI models, allowing for offline usage, experimentation, and rapid prototyping. Deploying the Phi-3.5 model locally using the AI Toolkit is straightforward and can be used for personal projects, testing, or scenarios where cloud access is limited.

Introduction to AI Toolkit
The AI Toolkit is an easy-to-use platform for deploying language models locally without relying on cloud infrastructure. It supports a range of AI models and enables developers to work in a low-latency environment.
Advantages of deploying locally with AI Toolkit:
Offline Capability: No need for continuous internet access.
Quick Experimentation: Rapid prototyping and testing without the delays of cloud deployments.
Setup Guide: Installing and Running Phi-3.5 Locally Using AI Toolkit
1. Install AI Toolkit: Go to the AI Toolkit website and download the platform for your operating system (Linux, macOS, or Windows). Install AI Toolkit by running the appropriate installation command in your terminal.
2. Download the Phi-3.5 Model: Once AI Toolkit is installed, you can download the Phi-3.5 model locally by running:
3. Run the Model Locally: After downloading the model, start a local session by running:
This will launch a local server on your machine where the model will be available for interaction.

Code Example: Using Phi-3.5 Locally in a Project
Below is a Python code example demonstrating how to send a query to the locally deployed Phi-3.5 model running on the AI Toolkit.

    import requests

    # Define the local endpoint
    local_url = "http://localhost:8000/predict"

    # Prepare the input data
    input_data = {
        "text": "What are the benefits of renewable energy?"
    }

    # Make the API request
    response = requests.post(local_url, json=input_data)

    # Print the model's response
    if response.status_code == 200:
        print("Model Response:", response.json())
    else:
        print("Error:", response.status_code, response.text)

Comparing Language Capabilities
Test Results: How Phi-3.5 Handles Different Languages
The Phi-3.5 model demonstrates robust multilingual capabilities, effectively processing and generating text in various languages. Below are comparative examples showcasing its performance in English, Spanish, and Mandarin:

English Example:
Input: "What are the benefits of renewable energy?"
Output: "Renewable energy sources, such as solar and wind, reduce greenhouse gas emissions and promote sustainability."

Spanish Example:
Input: "¿Cuáles son los beneficios de la energía renovable?"
Output: "Las fuentes de energía renovable, como la solar y la eólica, reducen las emisiones de gases de efecto invernadero y promueven la sostenibilidad."

Mandarin Example:
Input: "可再生能源的好处是什么?"
Output: "可再生能源,如太阳能和风能,减少温室气体排放,促进可持续发展。" Performance Benchmarking and Evaluation Across Different Languages Benchmarking Phi-3.5 across different languages involves evaluating its accuracy, fluency, and contextual understanding. For instance, using BLEU scores and human evaluations, the model can be assessed on its translation quality and coherence in various languages. Real-World Use Case: Multilingual Customer Service Chatbot A practical application of Phi-3.5's multilingual capabilities is in developing a customer service chatbot that can interact with users in their preferred language. For instance, the chatbot could provide support in English, Spanish, and Mandarin, ensuring a wider reach and better user experience. Optimizing and Validating Phi-3.5 Model Model Performance Metrics To validate the model's performance in different scenarios, consider the following metrics: Accuracy: Measure how often the model's outputs are correct or align with expected results. Fluency: Assess the naturalness and readability of the generated text. Contextual Understanding: Evaluate how well the model understands and responds to context-specific queries. Tools to Use in Azure and Ollama for Evaluation Azure Cognitive Services: Utilize tools like Text Analytics and Translator to evaluate performance. Ollama: Use local testing environments to quickly iterate and validate model outputs. Conclusion In summary, Phi-3.5 exhibits impressive multilingual capabilities, effective deployment options, and robust performance metrics. Its ability to handle various languages makes it a versatile tool for natural language processing applications. Phi-3.5 stands out for its adaptability and performance in multilingual contexts, making it an excellent choice for future NLP projects, especially those requiring diverse language support. 
We encourage readers to experiment with the Phi-3.5 model using Azure AI Foundry or the AI Toolkit, explore fine-tuning techniques for their specific use cases, and share their findings with the community. For more information on optimized fine-tuning techniques, check out the Ignite Fine-Tuning Workshop.

References
Customize the Phi-3.5 family of models with LoRA fine-tuning in Azure
Fine-tune Phi-3.5 models in Azure
Fine Tuning with Azure AI Foundry and Microsoft Olive Hands on Labs and Workshop
Customize a model with fine-tuning https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=azure-openai%2Cturbo%2Cpython-new&pivots=programming-language-studio
Microsoft AI Toolkit - AI Toolkit for VSCode

Unleashing the Power of Model Context Protocol (MCP): A Game-Changer in AI Integration
Artificial Intelligence is evolving rapidly, and one of the most pressing challenges is enabling AI models to interact effectively with external tools, data sources, and APIs. The Model Context Protocol (MCP) solves this problem by acting as a bridge between AI models and external services, creating a standardized communication framework that enhances tool integration, accessibility, and AI reasoning capabilities. What is Model Context Protocol (MCP)? MCP is a protocol designed to enable AI models, such as Azure OpenAI models, to interact seamlessly with external tools and services. Think of MCP as a universal USB-C connector for AI, allowing language models to fetch information, interact with APIs, and execute tasks beyond their built-in knowledge. Key Features of MCP Standardized Communication – MCP provides a structured way for AI models to interact with various tools. Tool Access & Expansion – AI assistants can now utilize external tools for real-time insights. Secure & Scalable – Enables safe and scalable integration with enterprise applications. Multi-Modal Integration – Supports STDIO, SSE (Server-Sent Events), and WebSocket communication methods. MCP Architecture & How It Works MCP follows a client-server architecture that allows AI models to interact with external tools efficiently. Here’s how it works: Components of MCP MCP Host – The AI model (e.g., Azure OpenAI GPT) requesting data or actions. MCP Client – An intermediary service that forwards the AI model's requests to MCP servers. MCP Server – Lightweight applications that expose specific capabilities (APIs, databases, files, etc.). Data Sources – Various backend systems, including local storage, cloud databases, and external APIs. Data Flow in MCP The AI model sends a request (e.g., "fetch user profile data"). The MCP client forwards the request to the appropriate MCP server. The MCP server retrieves the required data from a database or API. 
The response is sent back to the AI model via the MCP client.

Integrating MCP with Azure OpenAI Services
Microsoft has integrated MCP with Azure OpenAI Services, allowing GPT models to interact with external services and fetch live data. This means AI models are no longer limited to static knowledge but can access real-time information.

Benefits of Azure OpenAI Services + MCP Integration
✔ Real-time Data Fetching – AI assistants can retrieve fresh information from APIs, databases, and internal systems.
✔ Contextual AI Responses – Enhances AI responses by providing accurate, up-to-date information.
✔ Enterprise-Ready – Secure and scalable for business applications, including finance, healthcare, and retail.

Hands-On Tools for MCP Implementation
To implement MCP effectively, Microsoft provides two tools: Semantic Workbench and AI Gateway.

Microsoft Semantic Workbench
A development environment for prototyping AI-powered assistants and integrating MCP-based functionalities.
Features:
Build and test multi-agent AI assistants.
Configure settings and interactions between AI models and external tools.
Supports GitHub Codespaces for cloud-based development.
Explore Semantic Workbench
Figure: Workbench interface examples

Microsoft AI Gateway
A plug-and-play interface that allows developers to experiment with MCP using Azure API Management.
Features:
Credential Manager – Securely handle API credentials.
Live Experimentation – Test AI model interactions with external tools.
Pre-built Labs – Hands-on learning for developers.
Explore AI Gateway

Setting Up MCP with Azure OpenAI Services

Step 1: Create a Virtual Environment
First, create a virtual environment using Python:

    python -m venv .venv

Activate the environment:

    # Windows
    .venv\Scripts\activate
    # macOS/Linux
    source .venv/bin/activate

Step 2: Install Required Libraries
Create a requirements.txt file and add the following dependencies:

    langchain-mcp-adapters
    langgraph
    langchain-openai

Then, install the required libraries:

    pip install -r requirements.txt

Step 3: Set Up OpenAI API Key
Ensure you have your OpenAI API key set up:

    # Windows (setx only affects new terminal sessions, so open a new terminal afterwards)
    setx OPENAI_API_KEY "<your_api_key>"
    # macOS/Linux
    export OPENAI_API_KEY="<your_api_key>"

Building an MCP Server
This server performs basic mathematical operations like addition and multiplication.

Create the Server File
First, create a new Python file:

    touch math_server.py

Then, implement the server:

    from mcp.server.fastmcp import FastMCP

    # Initialize the server
    mcp = FastMCP("Math")

    # Register each function as an MCP tool
    @mcp.tool()
    def add(a: int, b: int) -> int:
        """Add two numbers."""
        return a + b

    @mcp.tool()
    def multiply(a: int, b: int) -> int:
        """Multiply two numbers."""
        return a * b

    if __name__ == "__main__":
        # Communicate over standard input/output
        mcp.run(transport="stdio")

Your MCP server is now ready to run.

Building an MCP Client
This client connects to the MCP server and interacts with it.
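Before writing the client, it can help to see roughly what travels over such a connection. MCP messages are JSON-RPC 2.0, and a tool invocation is a tools/call request that names the tool and passes its arguments. The sketch below imitates that exchange for the add and multiply tools of the Math server using only the Python standard library. It is an illustration under stated assumptions, not the MCP SDK: make_tool_call, handle_request, and call_tool are hypothetical helpers, the initialization handshake is omitted, and the "transport" is a direct function call instead of stdio.

```python
import json

# Plain-Python stand-ins for the two tools exposed by math_server.py.
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

def make_tool_call(request_id, name, arguments):
    """Build a JSON-RPC 2.0 request asking the server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

def handle_request(raw):
    """Hypothetical server-side dispatcher: decode, run the tool, encode a reply."""
    request = json.loads(raw)
    reply = {"jsonrpc": "2.0", "id": request.get("id")}
    params = request.get("params", {})
    if request.get("method") != "tools/call":
        reply["error"] = {"code": -32601, "message": "Method not found"}
    elif params.get("name") not in TOOLS:
        reply["error"] = {"code": -32602, "message": "Unknown tool"}
    else:
        value = TOOLS[params["name"]](**params.get("arguments", {}))
        # Tool results come back as a list of content items (text only here).
        reply["result"] = {
            "content": [{"type": "text", "text": str(value)}],
            "isError": False,
        }
    return json.dumps(reply)

def call_tool(request_id, name, **arguments):
    """Hypothetical client-side helper: send one request and parse the text result."""
    request = make_tool_call(request_id, name, arguments)
    reply = json.loads(handle_request(json.dumps(request)))
    return int(reply["result"]["content"][0]["text"])

if __name__ == "__main__":
    total = call_tool(1, "add", a=4, b=6)
    product = call_tool(2, "multiply", a=total, b=14)
    print(total, product)  # prints "10 140"
```

In the real client, the MCP SDK's ClientSession performs this serialization and the language model decides which tool to call and in what order; the sketch only makes the request and response shapes visible.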
Create the Client File
First, create a new file:

    touch client.py

Then, implement the client:

    import asyncio

    from langchain_mcp_adapters.tools import load_mcp_tools
    from langchain_openai import ChatOpenAI
    from langgraph.prebuilt import create_react_agent
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    # Define server parameters: the client launches math_server.py as a subprocess
    server_params = StdioServerParameters(
        command="python",
        args=["math_server.py"],
    )

    # Define the model
    model = ChatOpenAI(model="gpt-4o")

    async def run_agent():
        async with stdio_client(server_params) as (read, write):
            async with ClientSession(read, write) as session:
                # Initialize the connection
                await session.initialize()
                # Load the server's tools and build a ReAct agent around them
                tools = await load_mcp_tools(session)
                agent = create_react_agent(model, tools)
                agent_response = await agent.ainvoke({"messages": "what's (4 + 6) x 14?"})
                # The final answer is the last message in the conversation
                return agent_response["messages"][-1].content

    if __name__ == "__main__":
        result = asyncio.run(run_agent())
        print(result)

Your client is now set up and ready to interact with the MCP server.

Running the MCP Server and Client

Step 1 (optional): Check That the MCP Server Starts
Open a terminal and run:

    python math_server.py

With the stdio transport, the server simply waits for messages on standard input. The client launches its own copy of math_server.py as a subprocess, so you can stop this process (Ctrl+C) once you have confirmed that it starts without errors.

Step 2: Run the MCP Client
In a terminal, from the directory that contains both files, run:

    python client.py

Expected Output
The agent's final message should contain the result (the exact wording can vary between runs):

    140

This means the AI agent correctly computed (4 + 6) x 14 using both the MCP server and GPT-4o.

Conclusion
Integrating MCP with Azure OpenAI Services enables AI applications to securely interact with external tools, extending functionality beyond text-based responses. With standardized communication and access to external tools, developers can build smarter and more interactive AI-powered solutions. By following this guide, you can set up an MCP server and client and give your AI application structured access to external tools.

Next Steps:
Explore more MCP tools and integrations.
Extend your MCP setup to work with additional APIs.
Deploy your solution in a cloud environment for broader accessibility.

For further details, visit the GitHub repository for MCP integration examples and best practices.
MCP GitHub Repository
MCP Documentation
Semantic Workbench
AI Gateway
MCP Video Walkthrough
MCP Blog
MCP GitHub End to End Demo