Azure API Management - Unified AI Gateway Design Pattern
Scaling AI adoption requires a unified control plane

As organizations scale generative AI adoption, they face growing complexity managing multiple AI providers, models, API formats, and rapid release cycles. Without a unified control plane, enterprises risk fragmented governance, inconsistent developer experiences, and uncontrolled AI consumption costs. As an AI Gateway, Azure API Management enables organizations to implement centralized AI mediation, governance, and developer access control across AI services.

This blog post introduces the Unified AI Gateway design pattern, a customer-developed architecture pattern designed by Uniper that builds on API Management's policy extensibility to create a flexible and maintainable solution for managing AI services across providers, models, and environments. Uniper runs this pattern in production today to optimize AI governance and operational efficiency, enhance the developer experience, and manage AI costs.

Note: The Unified AI Gateway described in this post is a customer-implemented design pattern built using Azure API Management policy extensibility.

Customer spotlight: Uniper

Uniper is a leading European energy company with a global footprint, generating, trading, and delivering electricity and natural gas through a diverse portfolio spanning hydro, wind, solar, nuclear, and flexible thermal assets. With a strategy centered on accelerating the energy transition, Uniper provides reliable and innovative energy solutions that power industries, strengthen grids, and support communities across its core markets.

Committed to becoming one of Europe's first AI-driven utilities, Uniper views artificial intelligence as a strategic cornerstone for future competitiveness, efficiency, and operational transformation. Building on a strong foundation of AI and machine-learning solutions, from plant optimization and predictive maintenance to advanced energy trading, Uniper is now scaling the adoption of generative AI (GenAI) across all business functions. At Uniper, AI is not just a technology enhancer; it is a business imperative. The momentum for AI-driven transformation starts within Uniper's business areas, with the technology organization enabling and empowering this evolution through responsible, value-focused AI deployment.

Enterprise challenges when scaling AI services

As Uniper expanded AI adoption, they encountered challenges common across enterprises implementing multi-model and multi-provider AI architectures:

API growth and management overhead – With a conventional REST/SOAP API definition approach, each combination of AI provider, model, API type, and version typically results in a separate API schema definition in API Management. As AI services evolve, the number of API definitions can grow significantly, increasing management overhead.

Limited routing flexibility – Each API schema definition is typically linked to a static backend, which prevents dynamic routing decisions based on factors like model cost, capacity, or performance (e.g., routing to gpt-4.1-mini instead of gpt-4.1).

Because AI services evolve rapidly, this approach creates exponential growth in API definitions and ongoing management overhead. Separate APIs are typically needed for each of the following:
o AI service provider (e.g., Microsoft Foundry, Google Gemini)
o API type (e.g., OpenAI, Inference, Responses)
o Model (e.g., gpt-4.1, gpt-4.1-mini, phi-4)

Each AI service also supports multiple versions.
For instance, OpenAI might include:
o 2025-01-01-preview (latest features)
o 2024-10-21 (stable release)
o 2024-02-01 (legacy support)

Different request patterns may be required. For example, Microsoft Foundry's OpenAI supports chat completion using both:
o OpenAI v1 format (/v1/chat/completions)
o Azure OpenAI format (/openai/deployments/{model}/chat/completions)

Each API definition may also be replicated across environments, for example Development, Test, and Production API Management environments.

The Unified AI Gateway design pattern

To address these challenges, Uniper implemented a policy-driven enterprise AI mediation layer using Azure API Management. At a high level, the pattern creates a single enterprise AI access layer that:
Normalizes requests across providers and models
Enforces consistent authentication and governance
Dynamically routes traffic across AI services
Provides centralized observability and cost controls

The design emphasizes modular policy components that provide centralized, auditable control over security, routing, quotas, and monitoring.

Core architecture components

The following components are involved in the Unified AI Gateway pattern:

Single wildcard API definition with wildcard operations (/*) that minimizes API management overhead. No API definition changes are required when introducing new AI providers, models, or APIs.

Unified authentication that enforces consistent authentication for every request, supporting both API key and JWT validation for inbound requests, with managed identity used for backend authentication to AI services.

Optimized path construction that automatically transforms requests to simplify consuming AI services, such as automatic API version selection (for example, transforming /deployments/gpt-4.1-mini/chat/completions to /openai/deployments/gpt-4.1-mini/chat/completions?api-version=2025-01-01-preview).

Model and API aware backend selection that dynamically routes requests to backend AI services and load balancing pools based on capacity, cost, performance, and other operational factors.

Circuit breaker and load balancing that leverages API Management's built-in circuit breaker functionality with load balancing pools to provide resiliency across backend AI services deployed in different regions. When endpoints reach failure thresholds, traffic automatically rebalances to healthy regional instances.

Tiered token limiting that enforces token consumption using API Management's llm-token-limit policy with quota thresholds.

Comprehensive trace logging and monitoring using Application Insights to provide robust usage tracking and operational insights, including token tracking through API Management's llm-emit-token-metric policy.

"The collaboration between the Uniper and Microsoft's AI and API Management teams on delivering the unified AI gateway has been exceptional. Together, we've built a robust solution that provides the flexibility to rapidly adapt to fast-paced advancements in the AI sphere, while maintaining the highest standards of security, resilience, and governance. This partnership has enabled us to deliver enterprise-grade AI solutions that our customers can trust and scale with confidence."
~ Ian Beeson – Uniper, API Centre of Excellence Lead

Uniper's results: Business and operational impact

For Uniper, shifting to the Unified AI Gateway pattern has proven to be a strategic enabler for scaling their AI adoption with API Management.
Uniper reports significant improvements across governance, efficiency, developer experience, and cost management:

Centralized AI security and governance
o Real-time content filtering – Uniper can detect, log, and alert on content filter violations.
o Centralized audit and traceability – All AI requests and responses are centrally logged, enabling unified auditing and tracing.

Operational efficiency
o Reduction in API definitions – Uniper estimates an 85% reduction in API definitions, moving from managing seven API definitions per environment (Development, Test, and Production) to a single universal wildcard API definition per environment.
o Feature deployment speed – Uniper delivers AI capabilities 60–180 days faster, enabled by immediate feature availability and the elimination of reliance on API schema updates and migrations.
o AI service availability – Uniper achieves 99.99% availability for AI services, enabled through circuit breakers and multi-regional backend routing.
o Centralized ownership and maintenance – API management responsibilities are now consolidated under a single team.

Improved developer experience
o Immediate feature availability – New AI capabilities are available immediately without requiring API definition updates, eliminating the previous 2–6-month delay before new features could be shared with Uniper's developers.
o Automatic API schema compatibility – Microsoft and third-party provider API updates no longer require migrations to new or updated API definitions. Previously, Uniper's developers had to migrate for each update.
o Consistent API interface with equivalent SDK support – A unified API surface across all AI services simplifies development and integration for Uniper's developers.
o Equivalent request performance – Uniper validated that request performance through the Unified AI Gateway pattern is equivalent to the conventional API definition approach, based on comparing the time a request is received by the gateway to the time it is sent to the backend.

AI cost management
o Token consumption visibility – Uniper uses detailed usage and token-level metrics to enable a charge-back model.
o Automated cost controls – Uniper controls costs through configurable quotas and limits at both the AI gateway and backend AI service levels.
o Optimized model routing – Uniper dynamically routes requests to the most cost-effective models based on policy.

"The Unified AI Gateway pattern has fundamentally changed how we scale and govern AI across the enterprise. By consolidating AI access behind a single, policy-driven Azure API Management layer, we've reduced operational complexity while improving security, resilience, and developer experience. Most importantly, this approach allows us to adopt new models and capabilities at the pace the AI ecosystem demands—without compromising performance, availability, or governance."
~ Hinesh Pankhania – Uniper, Head of Cloud Engineering & CCoE

When to use this pattern

The Unified AI Gateway pattern is most beneficial when organizations face growing AI service complexity. Consider using the Unified AI Gateway pattern when:

Multiple AI service providers: Your organization integrates with various AI service providers (Microsoft Foundry, Google Gemini, etc.)
Frequent model/API changes: New models/APIs need to be added regularly or existing ones updated
Dynamic routing needs: Your organization requires dynamic backend selection based on capacity, cost, or performance

When not to use this pattern: If you expect a limited number of models/API definitions with minimal ongoing changes, the conventional approach may be simpler to implement and maintain. The additional implementation and maintenance effort required by the Unified AI Gateway pattern should be weighed against the management overhead it is intended to reduce.

Refer to the next section for details on implementing the Unified AI Gateway pattern, including how the request and response pipeline is built using API Management policy fragments.

Get started

Get started by exploring a simplified sample that demonstrates the Unified AI Gateway pattern: Azure-Samples/APIM-Unified-AI-Gateway-Sample. The sample shows how to route requests to multiple AI models through a single API Management endpoint, including Phi-4, GPT-4.1, and GPT-4.1-mini from Microsoft Foundry, as well as Google Gemini 2.5 Flash-Lite. It uses a universal wildcard API definition (/*) across GET, POST, PUT, and DELETE operations, routing all requests through a unified, policy-driven pipeline built with policy fragments to ensure consistent security, dynamic routing, load balancing, rate limiting, and comprehensive logging and monitoring.

The Unified AI Gateway pattern is designed to be extensible, allowing organizations to add support for additional API types, models, versions, and more to meet their unique requirements through minimal updates to policy fragments. Each policy fragment is designed as a modular component with a single, well-defined responsibility. This modular design enables targeted customization, such as adding custom token tracking, without impacting the rest of the pipeline. A simplified sketch of such a fragment follows below.
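To make the idea concrete, here is a minimal, hypothetical policy fragment in the spirit of the pattern. It is not taken from Uniper's implementation or the sample repo: the backend IDs (foundry-openai-pool, foundry-phi-pool, default-ai-pool) and the pinned api-version are placeholder assumptions, and a real fragment would externalize these choices.

<!-- Hypothetical policy fragment: model-aware routing and path construction for a wildcard (/*) API.
     Backend IDs and the default api-version are illustrative placeholders. -->
<fragment>
    <!-- Extract the model name from a path such as /deployments/gpt-4.1-mini/chat/completions -->
    <set-variable name="modelName" value="@{
        var path = context.Request.OriginalUrl.Path;
        var marker = "/deployments/";
        var start = path.IndexOf(marker);
        if (start < 0) { return "default"; }
        return path.Substring(start + marker.Length).Split('/')[0];
    }" />
    <!-- Model-aware backend selection against pre-configured load-balancing pools -->
    <choose>
        <when condition="@(((string)context.Variables["modelName"]).StartsWith("gpt-4.1"))">
            <set-backend-service backend-id="foundry-openai-pool" />
        </when>
        <when condition="@((string)context.Variables["modelName"] == "phi-4")">
            <set-backend-service backend-id="foundry-phi-pool" />
        </when>
        <otherwise>
            <set-backend-service backend-id="default-ai-pool" />
        </otherwise>
    </choose>
    <!-- Optimized path construction: prepend the /openai segment and pin a default API version -->
    <rewrite-uri template="@("/openai" + context.Request.OriginalUrl.Path)" copy-unmatched-params="true" />
    <set-query-parameter name="api-version" exists-action="override">
        <value>2025-01-01-preview</value>
    </set-query-parameter>
    <!-- Backend authentication to the AI service with the gateway's managed identity -->
    <authentication-managed-identity resource="https://cognitiveservices.azure.com" />
</fragment>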
Acknowledgments

We would like to recognize the following Uniper contributors for their design of the Unified AI Gateway pattern and their contributions to this blog post:
~ Hinesh Pankhania, Uniper – Head of Cloud Engineering and CCoE
~ Ian Beeson, Uniper – API Centre of Excellence Lead
~ Steve Atkinson – Freelance AI Architect and AI Engineering Lead (Contract)

Introducing native Service Bus message publishing from Azure API Management (Preview)

We’re excited to announce a preview capability in Azure API Management (APIM) — you can now send messages directly to Azure Service Bus from your APIs using a built-in policy. This enhancement, currently in public preview, simplifies how you connect your API layer with event-driven and asynchronous systems, helping you build more scalable, resilient, and loosely coupled architectures across your enterprise.

Why this matters

Modern applications increasingly rely on asynchronous communication and event-driven designs. With this new integration:
Any API hosted in API Management can publish to Service Bus — no SDKs, custom code, or middleware required.
Partners, clients, and IoT devices can send data through standard HTTP calls, even if they don’t support AMQP natively.
You stay in full control, with authentication, throttling, and logging managed centrally in API Management.
Your systems scale more smoothly by decoupling front-end requests from backend processing.

How it works

The new send-service-bus-message policy allows API Management to forward payloads from API calls directly into Service Bus queues or topics.

High-level flow:
A client sends a standard HTTP request to your API endpoint in API Management.
The policy executes and sends the payload as a message to Service Bus.
Downstream consumers such as Logic Apps, Azure Functions, or microservices process those messages asynchronously.
All configuration happens in API Management — no code changes or new infrastructure are required.

Getting started

You can try it out in minutes:
Set up a Service Bus namespace and create a queue or topic.
Enable a managed identity (system-assigned or user-assigned) on your API Management instance.
Grant the identity the "Azure Service Bus Data Sender" role in Azure RBAC, scoped to your queue or topic.
Add the policy to your API operation:

<send-service-bus-message queue-name="orders">
    <payload>@(context.Request.Body.As<string>())</payload>
</send-service-bus-message>

Once saved, each API call publishes its payload to the Service Bus queue or topic. 📖 Learn more.

Common use cases

This capability makes it easy to integrate your APIs into event-driven workflows:
Order processing – Queue incoming orders for fulfillment or billing.
Event notifications – Trigger internal workflows across multiple applications.
Telemetry ingestion – Forward IoT or mobile app data to Service Bus for analytics.
Partner integrations – Offer REST-based endpoints for external systems while maintaining policy-based control.
Each of these scenarios benefits from simplified integration, centralized governance, and improved reliability.

Secure and governed by design

The integration uses managed identities for secure communication between API Management and Service Bus — no secrets required. You can further apply enterprise-grade controls:
Enforce rate limits, quotas, and authorization through APIM policies.
Gain API-level logging and tracing for each message sent.
Use Service Bus metrics to monitor downstream processing.
Together, these tools help you maintain a consistent security posture across your APIs and messaging layer (a combined sketch appears at the end of this post).

Build modern, event-driven architectures

With this feature, API Management can serve as a bridge to your event-driven backbone. Start small by queuing a single API’s workload, or extend to enterprise-wide event distribution using topics and subscriptions. You’ll reduce architectural complexity while enabling more flexible, scalable, and decoupled application patterns.
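As a rough sketch of how those governance controls can sit in front of the new policy, the inbound section below combines placeholder JWT validation and per-caller rate limiting with the message publish, then returns 202 so callers are not blocked on downstream processing. The tenant, audience, queue name, limits, and the inbound placement are illustrative assumptions, not values from the announcement.

<inbound>
    <base />
    <!-- Placeholder token validation; substitute your tenant and audience -->
    <validate-jwt header-name="Authorization" failed-validation-httpcode="401">
        <openid-config url="https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration" />
        <audiences>
            <audience>api://orders-api</audience>
        </audiences>
    </validate-jwt>
    <!-- Throttle each caller before anything is queued (illustrative limit) -->
    <rate-limit-by-key calls="100" renewal-period="60" counter-key="@(context.Request.IpAddress)" />
    <!-- Publish the request body to Service Bus; the gateway authenticates with its managed identity -->
    <send-service-bus-message queue-name="orders">
        <payload>@(context.Request.Body.As<string>())</payload>
    </send-service-bus-message>
    <!-- Acknowledge immediately; consumers such as Functions or Logic Apps process the message asynchronously -->
    <return-response>
        <set-status code="202" reason="Accepted" />
    </return-response>
</inbound>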
Learn more: Get the full walkthrough and examples in the documentation 👉 here.

Announcing the General Availability (GA) of the Premium v2 tier of Azure API Management
Superior capacity, the highest entity limits, unlimited included calls, and the most comprehensive set of features set the Premium v2 tier apart from other API Management tiers. Customers rely on the Premium v2 tier for running enterprise-wide API programs at scale, with high availability and performance.

The Premium v2 tier has a new architecture that eliminates management traffic from the customer VNet, making private networking much more secure and easier to set up. During the creation of a Premium v2 instance, you can choose between VNet injection or VNet integration (introduced in the Standard v2 tier) options.

In addition, today we are also adding three new features to Premium v2:
Inbound Private Link: You can now enable private endpoint connectivity to restrict inbound access to your Premium v2 instance. It can be enabled along with VNet injection, with VNet integration, or without a VNet.
Availability zone support: Premium v2 now supports availability zones (zone redundancy) to enhance the reliability and resilience of your API gateway.
Custom CA certificates: The Azure API Management v2 gateway can now validate TLS connections with the backend service using custom CA certificates.

New and improved VNet injection

Using VNet injection in Premium v2 no longer requires configuring routes or service endpoints. Customers can secure their API workloads without impacting API Management dependencies, while Microsoft can secure the infrastructure without interfering with customer API workloads. In short, the new VNet injection implementation enables both parties to manage network security and configuration settings independently and without affecting each other. You can now configure your APIs with complete networking flexibility: force tunnel all outbound traffic to on-premises, send all outbound traffic through an NVA, or add a WAF device to monitor all inbound traffic to your API Management Premium v2 instance—all without constraints.

Inbound Private Link

Customers can now configure an inbound private endpoint for their API Management Premium v2 instance so that API consumers can securely access the API Management gateway over Azure Private Link. The private endpoint uses an IP address from the Azure virtual network in which it's hosted. Network traffic between a client on your private network and API Management traverses the virtual network and a Private Link on the Microsoft backbone network, eliminating exposure from the public internet. Further, you can configure custom DNS settings or an Azure DNS private zone to map the API Management hostname to the endpoint's private IP address.

With a private endpoint and Private Link, you can:
Create multiple Private Link connections to an API Management instance.
Use the private endpoint to send inbound traffic on a secure connection.
Apply different API Management policies based on whether traffic comes from the private endpoint (see the sketch at the end of this post).
Limit incoming traffic only to private endpoints, preventing data exfiltration.
Combine with inbound virtual network injection or outbound virtual network integration to provide end-to-end network isolation of your API Management clients and backend services.

More details can be found here. Today, only the API Management instance's Gateway endpoint supports inbound Private Link connections, and each API Management instance can support at most 100 Private Link connections.

Availability zones

Azure API Management Premium v2 now supports availability zone (AZ) redundancy to enhance the reliability and resilience of your API gateway.
When deploying an API Management instance in an AZ-enabled region, you can choose to enable zone redundancy. This distributes the service's units, including the gateway, management plane, and developer portal, across multiple, physically separate AZs within that region. Learn how to enable AZs here.

CA certificates

If the API Management gateway needs to connect to backends secured with TLS certificates issued by private certificate authorities (CAs), you need to configure custom CA certificates in the API Management instance. Custom CA certificates can be added and managed as authorization credentials in the Backend entities. The Backend entity has been extended with new properties that allow customers to specify a list of certificate thumbprints, or subject name and issuer thumbprint pairs, that the gateway should trust when establishing a TLS connection with the associated backend endpoint. More details can be found here.

Region availability

The Premium v2 tier is now generally available in six public regions (Australia East, East US 2, Germany West Central, Korea Central, Norway East, and UK South), with additional regions coming soon. For pricing information and regional availability, please visit the API Management pricing page.

Learn more
API Management v2 tiers FAQ
API Management v2 tiers documentation
API Management overview documentation
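As a sketch of the "apply different policies based on private endpoint traffic" capability listed above: the snippet below assumes the context.Request.PrivateEndpointConnection policy-expression property, which returns null for traffic that did not arrive over a private endpoint. Verify that property name against the current policy expressions reference before relying on it; the limit values and header name are placeholders.

<choose>
    <!-- Assumption: PrivateEndpointConnection is null when the request did not arrive via Private Link -->
    <when condition="@(context.Request.PrivateEndpointConnection == null)">
        <!-- Public traffic: apply a stricter, illustrative rate limit -->
        <rate-limit-by-key calls="60" renewal-period="60" counter-key="@(context.Request.IpAddress)" />
    </when>
    <otherwise>
        <!-- Private Link traffic: tag the request for downstream logging -->
        <set-header name="X-Via-Private-Endpoint" exists-action="override">
            <value>true</value>
        </set-header>
    </otherwise>
</choose>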
Azure API Management: Your Auth Gateway For MCP Servers

The Model Context Protocol (MCP) is quickly becoming the standard for integrating Tools 🛠️ with Agents 🤖, and Azure API Management is at the forefront, ready to support this open-source protocol 🚀.

You may have already encountered discussions about MCP, so let's clarify some key concepts:
Model Context Protocol (MCP) is a standardized way (a protocol) for AI models to interact with external tools (to read data or perform actions) and to enrich context for any language model.
AI Agents/Assistants are autonomous LLM-powered applications with the ability to use tools to connect to external services required to accomplish tasks on behalf of users.
Tools are components made available to Agents, allowing them to interact with external systems, perform computation, and take actions to achieve specific goals.
Azure API Management: As a platform-as-a-service, API Management supports the complete API lifecycle, enabling organizations to create, publish, secure, and analyze APIs with built-in governance, security, analytics, and scalability.

New Cool Kid in Town - MCP

AI Agents are becoming widely adopted due to enhanced Large Language Model (LLM) capabilities. However, even the most advanced models face limitations due to their isolation from external data. Each new data source requires custom implementations to extract, prepare, and make data accessible for any model, which is a lot of heavy lifting.

Anthropic developed an open-source standard, the Model Context Protocol (MCP), to connect your agents to external data sources such as local data sources (databases or computer files) or remote services (systems available over the internet, e.g., through APIs).

MCP Hosts: LLM applications such as chat apps or AI assistants in your IDE (like GitHub Copilot in VS Code) that need to access external capabilities
MCP Clients: Protocol clients that maintain 1:1 connections with servers, inside the host application
MCP Servers: Lightweight programs that each expose specific capabilities and provide context, tools, and prompts to clients
MCP Protocol: Transport layer in the middle

At its core, MCP follows a client-server architecture where a host application can connect to multiple servers. Whenever your MCP host or client needs a tool, it connects to the MCP server. The MCP server will then connect to, for example, a database or an API. MCP hosts and servers connect with each other through the MCP protocol. You can create your own custom MCP servers that connect to your own or your organization's data sources. For a quick start, please visit our GitHub repository to learn how to build a remote MCP server using Azure Functions without authentication: https://aka.ms/mcp-remote

Remote vs. Local MCP Servers

The MCP standard supports two modes of operation:
Remote MCP servers: MCP clients connect to MCP servers over the Internet, establishing a connection using HTTP and Server-Sent Events (SSE), and authorizing the MCP client access to resources on the user's account using OAuth.
Local MCP servers: MCP clients connect to MCP servers on the same machine, using stdio as a local transport method.

Azure API Management as the AI Auth Gateway

Now that we have learned that MCP servers can connect to remote services through an API, the question arises: how can we expose our remote MCP servers in a secure and scalable way? This is where Azure API Management comes in: a way to securely and safely expose tools as MCP servers.
Azure API Management provides:
Security: AI agents often need to access sensitive data. API Management as a remote MCP proxy safeguards organizational data through authentication and authorization.
Scalability: As the number of LLM interactions and external tool integrations grows, API Management ensures the system can handle the load.

Security remains a critical piece of building MCP servers, as agents will need to securely connect to protected endpoints (tools) to perform certain actions or read protected data. When building remote MCP servers, you need a way to allow users to log in (authenticate) and to grant the MCP client access to resources on their account (authorize).

MCP - Current Authorization Challenges

State: 4/10/2025

Recent changes in MCP authorization have sparked significant debate within the community.

🔍 𝗞𝗲𝘆 𝗖𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗲𝘀 with the Authorization Changes: The MCP server is now treated as both a resource server AND an authorization server. This dual role has fundamental implications for MCP server developers and runtime operations.

💡 𝗢𝘂𝗿 𝗦𝗼𝗹𝘂𝘁𝗶𝗼𝗻: To address these challenges, we recommend using 𝗔𝘇𝘂𝗿𝗲 𝗔𝗣𝗜 𝗠𝗮𝗻𝗮𝗴𝗲𝗺𝗲𝗻𝘁 as your authorization gateway for remote MCP servers.

🔗 For an enterprise-ready solution, please check out our azd up sample repo to learn how to build a remote MCP server using Azure API Management as your authentication gateway: https://aka.ms/mcp-remote-apim-auth

The Authorization Flow

The workflow involves three core components: the MCP client, the APIM Gateway, and the MCP server, with Microsoft Entra managing authentication (AuthN) and authorization (AuthZ). Using the OAuth protocol, the client starts by calling the APIM Gateway, which redirects the user to Entra for login and consent. Once authenticated, Entra provides an access token to the Gateway, which then exchanges a code with the client to generate an MCP server token. This token allows the client to communicate securely with the server via the Gateway, ensuring user validation and scope verification (a simplified policy sketch of this validation step appears near the end of this post). Finally, the MCP server establishes a session key for ongoing communication through a dedicated message endpoint. Diagram source: https://aka.ms/mcp-remote-apim-auth-diagram

Conclusion

Azure API Management (APIM) is an essential tool for enterprise customers looking to integrate AI models with external tools using the Model Context Protocol (MCP). In this blog, we've emphasized the simplicity of connecting AI agents to various data sources through MCP, streamlining previously complex implementations. Given the critical role of secure access to platforms and services for AI agents, APIM offers robust solutions for managing OAuth tokens and ensuring secure access to protected endpoints, making it an invaluable asset for enterprises, despite the challenges of authentication.

API Management: An Enterprise Solution for Securing MCP Servers

Azure API Management is an essential tool for enterprise customers looking to integrate AI models with external tools using the Model Context Protocol (MCP). It is designed to help you securely expose your remote MCP servers. MCP servers are still very new, and as the technology evolves, API Management provides an enterprise-ready solution that will evolve with the latest technology. Stay tuned for further feature announcements soon!
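To illustrate the token-validation step in the flow above (a simplified stand-in for what the linked sample implements, not the sample itself), an inbound policy on the gateway-fronted MCP endpoint might look like the sketch below. The tenant, audience, and scope values are placeholders.

<inbound>
    <base />
    <!-- Reject tool calls that don't carry a valid token issued for the MCP server -->
    <validate-jwt header-name="Authorization" failed-validation-httpcode="401" failed-validation-error-message="Unauthorized">
        <openid-config url="https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration" />
        <audiences>
            <audience>api://my-mcp-server</audience>
        </audiences>
        <required-claims>
            <claim name="scp" match="any">
                <value>mcp.tools.invoke</value>
            </claim>
        </required-claims>
    </validate-jwt>
</inbound>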
Acknowledgments

This post and the work behind it were made possible thanks to the hard work and dedication of our incredible team. Special thanks to Pranami Jhawar, Julia Kasper, Julia Muiruri, Annaji Sharma Ganti, Jack Pa, Chaoyi Yuan, and Alex Vieira for their invaluable contributions.

Additional Resources

MCP Client Server integration with APIM as AI gateway:
Blog Post: https://aka.ms/remote-mcp-apim-auth-blog
Sequence Diagram: https://aka.ms/mcp-remote-apim-auth-diagram
APIM lab: https://aka.ms/ai-gateway-lab-mcp-client-auth
Python: https://aka.ms/mcp-remote-apim-auth
.NET: https://aka.ms/mcp-remote-apim-auth-dotnet
On-Behalf-Of Authorization: https://aka.ms/mcp-obo-sample

3rd Party APIs – Backend Auth via Credential Manager:
Blog Post: https://aka.ms/remote-mcp-apim-lab-blog
APIM lab: https://aka.ms/ai-gateway-lab-mcp
YouTube Video: https://aka.ms/ai-gateway-lab-demo

Applying DevOps Principles on Lean Infrastructure. Lessons From Scaling to 102K Users.
Hi Azure Community,

I'm a Microsoft Certified DevOps Engineer, and I want to share an unusual journey. I have been applying DevOps principles on traditional VPS infrastructure to scale to 102,000 users with 99.2% uptime.

Why am I posting this in an Azure community? Because I'm planning migration to Azure in 2026, and I want to understand: What mistakes am I already making that will bite me during migration?

THE CURRENT SETUP
Platform: Social commerce (West Africa)
Users: 102,000 active
Monthly events: 2 million
Uptime: 99.2%
Infrastructure: Single VPS
Stack: PHP/Laravel, MySQL, Redis
Yes - one VPS. No cloud. No Kubernetes. No microservices.

WHY I HAVEN'T USED AZURE YET
Honest answer: Budget constraints in an emerging-market startup ecosystem. At our current scale, fully managed Azure services would significantly increase monthly burn before product-market expansion. The funding we raised needs to last through growth milestones.
The trade: I manually optimize what Azure would auto-scale. I debug what Application Insights would catch. I do by hand what Azure Functions would automate.

DEVOPS PRACTICES THAT KEPT US RUNNING
Even on single-server infrastructure, core DevOps principles still apply:

CI/CD Pipeline (GitHub Actions)
• 3-5 deployments weekly
• Zero-downtime deploys
• Automated rollback on health check failures
• Feature flags for gradual rollouts

Monitoring & Observability
• Custom monitoring (would love Application Insights)
• Real-time alerting
• Performance tracking and slow query detection
• Resource usage monitoring

Automation
• Automated backups
• Automated database optimization
• Automated image compression
• Automated security updates

Infrastructure as Code
• Configs in Git
• Deployment scripts
• Environment variables
• Documented procedures

Testing & Quality
• Automated test suite
• Pre-deployment health checks
• Staging environment
• Post-deployment verification

KEY OPTIMIZATIONS

Async Job Processing
• Upload endpoint: 8 seconds → 340ms
• 4x capacity increase

Database Optimization
• Feed loading: 6.4 seconds → 280ms
• Strategic caching
• Batch processing

Image Compression
• 3-8MB → 180KB (94% reduction)
• Critical for mobile users

Caching Strategy
• Redis for hot data
• Query result caching
• Smart invalidation

Progressive Enhancement
• Server-rendered pages
• 2-3 second loads on 4G

WHAT I'M WORRIED ABOUT FOR AZURE MIGRATION
This is where I need your help:

Architecture Decisions
• App Service vs Functions + managed services?
• MySQL vs Azure SQL?
• When does cost/benefit flip for managed services?

Cost Management
• How do startups manage Azure costs during growth?
• Reserved instances vs pay-as-you-go?
• Which Azure services are worth the premium?

Migration Strategy
• Lift-and-shift first, or re-architect immediately?
• Zero-downtime migration with 102K active users?
• Validation approach before full cutover?

Monitoring & DevOps
• Application Insights - worth it from day one?
• Azure DevOps vs GitHub Actions for Azure deployments?
• Operational burden reduction with managed services?

Development Workflow
• Local development against Azure services?
• Cost-effective staging environments?
• Testing Azure features without constant bills?
MY PLANNED MIGRATION PATH

Phase 1: Hybrid (Q1 2026)
• Azure CDN for static assets
• Azure Blob Storage for images
• Application Insights trial
• Keep compute on VPS

Phase 2: Compute Migration (Q2 2026)
• App Service for API
• Azure Database for MySQL
• Azure Cache for Redis
• VPS for background jobs

Phase 3: Full Azure (Q3 2026)
• Azure Functions for processing
• Full managed services
• Retire VPS

QUESTIONS FOR THIS COMMUNITY

Question 1: Am I making migration harder by waiting? Should I have started with Azure at higher cost to avoid technical debt?
Question 2: What will break when I migrate? What works on VPS but fails in cloud? What assumptions won't hold?
Question 3: How do I validate before cutting over? Parallel infrastructure? Gradual traffic shift? Safe patterns?
Question 4: Cost optimization from day one? What to optimize immediately vs later? Common cost mistakes?
Question 5: DevOps practices that transfer? What stays the same? What needs rethinking for cloud-native?

THE BIGGER QUESTION
Have you migrated from self-hosted to Azure? What surprised you?
I know my setup isn't best practice by Azure standards. But it's working, and I've learned optimization, monitoring, and DevOps fundamentals in practice. Will those lessons transfer? Or am I building habits that cloud will expose as problematic?
Looking forward to insights from folks who've made similar migrations.

---

About the Author: Microsoft Certified DevOps Engineer and Azure Developer. CTO at a social commerce platform scaling in West Africa. Preparing for a phased Azure migration in 2026.

P.S. I got the Azure certifications to prepare for this migration. Now I need real-world wisdom from people who've actually done it!

Preview: Govern, Secure, and Observe A2A APIs with Azure API Management
Today, we’re announcing preview support for A2A (Agent2Agent) APIs in Azure API Management. With this capability, organizations can now manage and govern agent APIs alongside AI model APIs, Model Context Protocol (MCP) tools, and traditional APIs such as REST, SOAP, GraphQL, WebSocket, and gRPC — all within a single, consistent API management plane.

Extending API Governance into the Agentic Ecosystem

As organizations adopt agentic systems, the need for consistent governance, security, and observability grows. With A2A API support, Azure API Management enables you to extend established API practices into the agentic world — ensuring secure access, consistent policy enforcement, and complete visibility for AI agents.

A2A APIs in Azure API Management:
Mediate JSON-RPC runtime operations with policy support
Expose and manage agent cards for users, clients, or other agents
Support OpenTelemetry GenAI semantic conventions when logging traces to Application Insights — including "gen_ai.agent.id" and "gen_ai.agent.name" attributes

How It Works

When you import an A2A API, API Management mediates runtime calls to your agent backend (JSON-RPC only) and exposes the agent card as an operation within the same API. The agent card is transformed automatically to represent the A2A API managed by API Management — with the hostname replaced by API Management’s gateway address, security schemes converted to authentication configured in API Management, and unsupported interfaces removed. When integrated with Application Insights, API Management enriches traces with GenAI-compliant telemetry attributes — allowing easy identification of the agent and deep correlation between API and agent execution traces for monitoring and debugging.

Try It Out

To import an A2A API:
Navigate to the APIs page in the Azure portal and select the A2A Agent tile.
Enter your agent card URL. If accessible, the portal will automatically populate relevant settings.
Configure the remaining properties, such as the API path in API Management.

This functionality is currently available only in the v2 tiers of API Management, and it will continue to roll out to all tiers in the coming months.

Start Managing Your Agent APIs

With A2A support in Azure API Management, you can now bring agent APIs under the same governance and security umbrella as your existing APIs — strengthening control, security, and observability across your AI and API ecosystems. Learn more about A2A API support in Azure API Management.

AI Gateway in Azure API Management Is Now Available in Microsoft Foundry (Preview)
For more than a decade, Azure API Management has been the trusted control plane for API governance, security, and observability on a global scale, supporting more than 38,000 customers, almost 3 million APIs, and 3 trillion API requests every month. AI Gateway builds on this foundation, extending API Management’s proven governance, security, and observability model to AI workloads, including models, tools, and agents. Today, more than 1,200 enterprise customers use AI Gateway to safely operationalize AI at scale.

As customers accelerate AI adoption, the need for consistent, centralized governance becomes even more critical. AI systems increasingly rely on a mix of models, tools, and agents, each introducing new access patterns and governance requirements. Enterprises need a unified way to ensure all this AI traffic remains secure, compliant, and cost-efficient without slowing down developer productivity.

Today, we’re making that significantly easier. AI Gateway is now integrated directly into Microsoft Foundry. This gives Foundry users a simple way to govern, observe, and secure their AI workloads with the same reliability and trust as Azure API Management. This integration brings enterprise-grade AI governance directly into Microsoft Foundry, right where teams design, build, and operate their AI applications and agents. It provides a streamlined experience that helps organizations adopt strong governance from day one while keeping full API Management capabilities available for advanced configuration.

Governance for models

With this integration, customers can create a new AI Gateway instance (powered by API Management Basic v2) or associate an existing API Management resource with their Foundry resource. Once configured, all model deployments in the Foundry resource can be accessed through the AI Gateway hostname, ensuring that calls to models, whether to Azure OpenAI or other models, flow through consistent governance and usage controls. Long-term token quotas and short-term token limits can be managed directly within the Foundry interface, enabling teams to set and adjust usage boundaries without leaving the environment where they build and deploy AI applications and agents (see the policy sketch after the governance sections below). Learn more here.

Governance for agents

The integration also introduces a unified way to govern agents. Organizations can register agents running anywhere — in Azure, other clouds, or on-premises — into the Foundry Control Plane. These agents appear alongside Foundry-native agents for centralized inventory, monitoring, and governance. Teams can view telemetry collected by AI Gateway directly in Foundry or in Application Insights without any reconfiguration of agents at the source. Administrators can block agents posing security, compliance, or cost risks within Foundry or apply advanced governance policies, like throttling or content safety, in Azure API Management. Learn more here.

Governance for tools

Tools benefit from the same consistent governance model. Foundry users can register Model Context Protocol (MCP) tools hosted across any environment and have them automatically governed through the integrated AI Gateway. These tools appear in the Foundry inventory, making them discoverable to developers and ready for consumption by agents. This reduces the operational overhead of securing and mediating tools, simplifying the path to building agentic applications that safely interact with enterprise systems. Learn more here.
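As an illustration of the token controls described under "Governance for models" above, the underlying API Management instance can enforce limits and emit usage metrics with policies along these lines. The counter key, limit, quota, and dimensions are placeholders, and the attribute names should be checked against the current llm-token-limit and llm-emit-token-metric policy references.

<inbound>
    <base />
    <!-- Per-subscription token rate limit plus a monthly quota (illustrative values) -->
    <llm-token-limit counter-key="@(context.Subscription.Id)"
                     tokens-per-minute="5000"
                     token-quota="1000000"
                     token-quota-period="Monthly"
                     estimate-prompt-tokens="true"
                     remaining-tokens-header-name="x-remaining-tokens" />
    <!-- Emit token usage metrics to Application Insights, split by API and subscription -->
    <llm-emit-token-metric namespace="ai-gateway">
        <dimension name="API ID" />
        <dimension name="Subscription ID" />
    </llm-emit-token-metric>
</inbound>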
Unified governance across Foundry and API Management

Together, these capabilities bring the power of AI Gateway directly into Microsoft Foundry, removing barriers to adoption while strengthening governance. The experience is streamlined with simple setup, intuitive controls, and immediate value. At the same time, customers retain full access to the breadth and depth of API Management capabilities. When advanced policies, enterprise networking, federated gateways, or fine-grained controls are required, teams can seamlessly shift into the API Management experience without losing continuity.

With AI Gateway now part of Microsoft Foundry, teams can build and scale AI applications with confidence, knowing that consistent governance, security, and observability are built in from the start. AI Gateway in Microsoft Foundry gives every organization a consistent way to govern AI - models, tools, and agents - with the reliability of API Management and the velocity of Foundry.

Getting started

To set up and use AI Gateway in Foundry, follow the steps in this article. A new AI Gateway deploys an API Management Basic v2 instance for free for the first 100,000 calls.

Explore these new capabilities in depth at Microsoft Ignite. Join the Azure API Management and Microsoft Foundry sessions. If attending the conference in person, try the hands-on labs to experience how AI Gateway and Foundry help deliver secure and scalable AI applications, and stop by our booths to meet the product teams behind these innovations.

Session – Speaker(s) – Link
BRK1706: Innovation Session: Build & Manage AI Apps with Your Agent Factory – Yina Arenas, Sarah Bird, Amanda Silver, Marco Casalaina – https://ignite.microsoft.com/en-US/sessions/BRK1706?source=sessions
BRK113: Upskill AI agents with the Azure app platform – Mike Hulme, Balan Subramanian, Shawn Henry – https://ignite.microsoft.com/en-US/sessions/BRK113?source=sessions
BRK119: Don’t let your AI agents go rogue, secure with Azure API management – Anish Tallapureddy, Mike Budzynski – https://ignite.microsoft.com/en-US/sessions/BRK119?source=sessions
LAB519: Governing AI Apps & Agents with AI Gateway in Azure API Management – Annaji Sharma Ganti, Galin Iliev – https://ignite.microsoft.com/en-US/sessions/LAB519?source=sessions

Build. Secure. Launch Your Private MCP Registry with Azure API Center.
We are thrilled to embrace a new era in the world of MCP registries. As organizations increasingly build and consume MCP servers, the need for a secure, governed, robust, and easily discoverable tool catalog has become critical. Today, we are excited to show you how to do just that with MCP Center, a live example demonstrating how Azure API Center (APIC) can serve as a private and enterprise-ready MCP registry. The registry puts your MCPs just one click away for developers, ensuring no setup fuss and a direct path to coding brilliance.

Why a private registry? 🤔

Public OSS registries have been instrumental in driving growth and innovation across the MCP ecosystem. But as adoption scales, so does the need for tighter security, governance, and control; this is where private MCP registries step in. And this is where Azure API Center comes in: it offers a powerful and centralized approach to MCP discovery and governance across diverse teams and services within an organization. Let's delve into the key benefits of leveraging a private MCP registry with Azure API Center.

Security and Trust: The Foundation of AI Adoption

Review and Verification: Public registries, by their open nature, accept submissions from a wide range of developers. This can introduce risks from tools with limited security practices or even malicious intent. A private registry empowers your organization to thoroughly review and verify every MCP server before it becomes accessible to internal developers or AI agents (like Copilot Studio and AI Foundry). This eliminates the risk of introducing random, potentially vulnerable first- or third-party tools into your ecosystem.

Reduced Attack Surface: By controlling which MCP servers are accessible, organizations significantly shrink their potential attack surface. When your AI agents interact solely with known and secure internal tools, the likelihood of external attackers exploiting vulnerabilities in unvetted solutions is drastically reduced.

Enterprise-Grade Authentication and Authorization: Private registries enable the enforcement of your existing robust enterprise authentication and authorization mechanisms (e.g., OAuth 2) across all MCP servers. Public registries, in contrast, may have varying or less stringent authentication requirements.

Enforced AI Gateway Control (Azure API Management): Beyond vetting, a private registry enables organizations to route all MCP server traffic through an AI gateway such as Azure API Management. This ensures that every interaction, whether internal or external, adheres to strict security policies, including centralized authentication, authorization, rate limiting, and threat protection, creating a secure front for your AI services.

Governance and Control: Navigating the AI Landscape with Confidence

Centralized Oversight and "Single Source of Truth": A private registry provides a centralized "single source of truth" for all AI-related tools and data connections within your organization. This empowers comprehensive oversight of AI initiatives, clearly identifying ownership and accountability for each MCP server.

Preventing "Shadow AI": Without a formal registry, individual teams might independently develop or integrate AI tools, leading to "shadow AI" – unmanaged and unmonitored AI deployments that can pose significant risks. A private registry encourages a standardized approach, bringing all AI tools under central governance and visibility.
Tailored Tool Development: Organizations can develop and host MCP servers specifically tailored to their unique needs and requirements. This means optimized efficiency and utility, providing specialized tools you won't typically find in broader public registries.

Simplified Integration and Accelerated Development: A well-managed private registry simplifies the discovery and integration of internal tools for your AI developers. This significantly accelerates the development and deployment of AI-powered applications, fostering innovation.

Good news! Azure API Center can be created for free in any Azure subscription. You can find a detailed guide to help you get started: Inventory and Discover MCP Servers in Your API Center - Azure API Center

Get involved 💡

Your remote MCP server can be discoverable on API Center's MCP Discovery page today! Bring your MCP server and reach Azure customers! These Microsoft partners are shaping the future of the MCP ecosystem by making their remote MCP servers discoverable via API Center's MCP Discovery page.

Early Partners:
Atlassian – Connect to Jira and Confluence for issue tracking and documentation
Box – Use Box to securely store, manage and share your photos, videos, and documents in the cloud
Neon – Manage and query Neon Postgres databases with natural language
Pipedream – Add 1000s of APIs with built-in authentication and 10,000+ tools to your AI assistant or agent (coming soon)
Stripe – Payment processing and financial infrastructure tools

If partners would like their remote MCP servers to be featured in our Discover Panel, reach out to us here: GitHub/mcp-center and comment under the following GitHub issue: MCP Server Onboarding Request

Ready to Get Started? 🚀

Modernize your AI strategy and empower your teams with enhanced discovery, security, and governance of agentic tools. Now's the time to explore creating your own private enterprise MCP registry. Check out MCP Center, a public showcase demonstrating how you can build your own enterprise MCP registry (MCP Center - Build Your Own Enterprise MCP Registry), or go ahead and create your Azure API Center today!

Not able to setup azure private endpoint url as webservice/backend for Azure API Management service
Hi all,

I have integrated a private endpoint connected to a Private Link service. The Private Link service is created by an Azure standard load balancer, which in turn is created by a Kubernetes load balancer service using the annotations below:

annotations:
  service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  service.beta.kubernetes.io/azure-pls-create: "true"
  service.beta.kubernetes.io/azure-pls-name: myPLS
  service.beta.kubernetes.io/azure-pls-ip-configuration-subnet: YOUR SUBNET
  service.beta.kubernetes.io/azure-pls-ip-configuration-ip-address-count: "1"
  service.beta.kubernetes.io/azure-pls-ip-configuration-ip-address: SUBNET_IP
  service.beta.kubernetes.io/azure-pls-proxy-protocol: "false"
  service.beta.kubernetes.io/azure-pls-visibility: "*" # does not apply here because we will use Front Door later
  service.beta.kubernetes.io/azure-pls-auto-approval: "YOUR SUBSCRIPTION ID"

I am getting the expected response (i.e., the response from the Kubernetes service) from the private endpoint IP, which confirms that the Private Link and private endpoint integration is working fine.

We now want to integrate the above private endpoint service with the Azure API Management service, so we tried adding the private endpoint URL as the web service URL for the API Management service, but API Management is returning a 500 error:

{
  "statusCode": 500,
  "message": "Internal server error",
  "activityId": "76261291-7121-4814-b0e4-66b52284d76c"
}

I also checked the API Management Troubleshoot & analysis page for the exact error, and it shows:

BackendConnectionFailure: An attempt was made to access a socket in a way forbidden by its access permissions <private_endpoint_url>:80

Please help me understand what I am doing wrong in this implementation. Our requirement is to have a private Kubernetes load balancer and integrate it with the Azure API Management service, so that users can access the API only through API Management, and only API Management can access the load balancer service.

Thanks in advance