openai
23 Topics

Introducing GenAI Gateway Capabilities in Azure API Management
We are thrilled to announce GenAI Gateway capabilities in Azure API Management, a set of features designed specifically for GenAI use cases. Azure OpenAI Service offers a diverse set of tools, providing access to advanced models such as GPT-3.5 Turbo, GPT-4, and GPT-4 Vision, enabling developers to build intelligent applications that can understand, interpret, and generate human-like text and images.

One of the main resources you have in Azure OpenAI is tokens. Azure OpenAI assigns quota for your model deployments expressed in tokens-per-minute (TPM), which is then distributed across your model consumers: different applications, developer teams, departments within the company, and so on.

With a single application, integration is easy. Your intelligent application connects to Azure OpenAI directly using an API key, with a TPM limit configured on the model deployment. However, as your application portfolio grows, you end up with multiple apps calling one or more Azure OpenAI endpoints deployed as Pay-as-you-go or Provisioned Throughput Units (PTU) instances. That comes with certain challenges:

- How can we track token usage across multiple applications?
- How can we do cross charges for multiple applications/teams that use Azure OpenAI models?
- How can we make sure that a single app does not consume the whole TPM quota, leaving other apps with no option to use Azure OpenAI models?
- How can we make sure that the API key is securely distributed across multiple applications?
- How can we distribute load across multiple Azure OpenAI endpoints?
- How can we make sure that PTUs are used first before falling back to Pay-as-you-go instances?
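To make the quota question concrete, here is a minimal Python sketch of per-consumer TPM accounting. It is an illustration only, not how Azure API Management or Azure OpenAI enforce limits: the class name `TokensPerMinuteLimiter`, the fixed one-minute window, and the consumer keys are all assumptions made for this example.

```python
import time
from collections import defaultdict


class TokensPerMinuteLimiter:
    """Fixed-window token counter keyed by consumer (hypothetical helper)."""

    def __init__(self, tpm_limit, clock=time.monotonic):
        self.tpm_limit = tpm_limit
        self.clock = clock  # injectable so the window can be tested
        # consumer key -> [window start time, tokens used in that window]
        self.windows = defaultdict(lambda: [None, 0])

    def try_consume(self, consumer_key, tokens):
        """Return True and record usage if the consumer still has budget."""
        now = self.clock()
        window = self.windows[consumer_key]
        if window[0] is None or now - window[0] >= 60:
            window[0], window[1] = now, 0  # start a fresh one-minute window
        if window[1] + tokens > self.tpm_limit:
            return False  # over budget: reject before calling the backend
        window[1] += tokens
        return True


# Each consumer gets its own budget, so one app cannot starve the others.
limiter = TokensPerMinuteLimiter(tpm_limit=1000)
print(limiter.try_consume("team-a", 800))
print(limiter.try_consume("team-a", 300))
print(limiter.try_consume("team-b", 300))
```

Keying the counter by consumer is what keeps one application from using up the shared quota; a sliding window or token bucket would smooth out the boundary effects of the fixed window used here.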
To tackle these operational and scalability challenges, Azure API Management has built a set of GenAI Gateway capabilities:

- Azure OpenAI Token Limit Policy
- Azure OpenAI Emit Token Metric Policy
- Load Balancer and Circuit Breaker
- Import Azure OpenAI as an API
- Azure OpenAI Semantic Caching Policy (in public preview)

Azure OpenAI Token Limit Policy
The Azure OpenAI Token Limit policy allows you to manage and enforce limits per API consumer based on the usage of Azure OpenAI tokens. With this policy you can set limits expressed in tokens-per-minute (TPM). The policy provides flexibility to assign token-based limits on any counter key, such as subscription key, IP address, or any other arbitrary key defined through a policy expression. It also enables pre-calculation of prompt tokens on the Azure API Management side, minimizing unnecessary requests to the Azure OpenAI backend if the prompt already exceeds the limit. Learn more about this policy here.

Azure OpenAI Emit Token Metric Policy
The Azure OpenAI Emit Token Metric policy enables you to send token usage metrics to Azure Application Insights, providing an overview of the utilization of Azure OpenAI models across multiple applications or API consumers. This policy captures prompt, completion, and total token usage metrics and sends them to the Application Insights namespace of your choice. Moreover, you can configure or select from predefined dimensions to split token usage metrics, enabling granular analysis by subscription ID, IP address, or any custom dimension of your choice. Learn more about this policy here.

Load Balancer and Circuit Breaker
Load Balancer and Circuit Breaker features allow you to spread the load across multiple Azure OpenAI endpoints. With support for round-robin, weighted (new), and priority-based (new) load balancing, you can now define your own load distribution strategy according to your specific requirements. Define priorities within the load balancer configuration to ensure optimal utilization of specific Azure OpenAI endpoints, particularly those purchased as PTUs. In the event of any disruption, a circuit breaker mechanism kicks in, seamlessly transitioning to lower-priority instances based on predefined rules. Our updated circuit breaker now features dynamic trip duration, using values from the Retry-After header provided by the backend. This ensures precise and timely recovery of the backends and maximizes the utilization of your priority backends. Learn more about load balancer and circuit breaker here.

Import Azure OpenAI as an API
The new Import Azure OpenAI as an API experience in Azure API Management provides an easy, single-click way to import your existing Azure OpenAI endpoints as APIs. We streamline the onboarding process by automatically importing the OpenAPI schema for Azure OpenAI and setting up authentication to the Azure OpenAI endpoint using managed identity, removing the need for manual configuration. Additionally, within the same experience, you can pre-configure Azure OpenAI policies, such as token limit and emit token metric, for quick setup. Learn more about Import Azure OpenAI as an API here.

Azure OpenAI Semantic Caching Policy
The Azure OpenAI Semantic Caching policy helps you optimize token usage through semantic caching, which stores completions for prompts with similar meaning. Our semantic caching mechanism uses Azure Redis Enterprise or any other external cache compatible with RediSearch and onboarded to Azure API Management. Using the Azure OpenAI Embeddings model, this policy identifies semantically similar prompts and stores their respective completions in the cache. This approach enables completion reuse, resulting in reduced token consumption and improved response performance. Learn more about semantic caching policy here.
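The idea behind semantic caching can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the policy's implementation: `SemanticCache`, `toy_embed`, the in-memory list, and the 0.9 similarity threshold are all hypothetical stand-ins for the embeddings model and the RediSearch-compatible cache described above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def toy_embed(text):
    """Stand-in for an embeddings model: counts of a few known words."""
    vocab = ["reset", "password", "invoice", "refund"]
    words = text.lower().split()
    # The small offset keeps the vector non-zero for unknown text.
    return [words.count(term) + 0.01 for term in vocab]


class SemanticCache:
    """In-memory cache that matches prompts by embedding similarity."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # list of (embedding, completion) pairs

    def store(self, prompt, completion):
        self.entries.append((self.embed(prompt), completion))

    def lookup(self, prompt):
        """Return the closest cached completion, or None on a cache miss."""
        vector = self.embed(prompt)
        best_score, best_completion = 0.0, None
        for cached_vector, completion in self.entries:
            score = cosine_similarity(vector, cached_vector)
            if score > best_score:
                best_score, best_completion = score, completion
        return best_completion if best_score >= self.threshold else None


cache = SemanticCache(toy_embed)
cache.store("how do I reset my password", "Use the reset link on the sign-in page.")
print(cache.lookup("reset password please"))  # similar meaning: cache hit
print(cache.lookup("refund my invoice"))      # different meaning: cache miss
```

The threshold is the main design choice: a lower value saves more tokens but risks returning a completion for a prompt that only looks similar.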
Get Started with GenAI Gateway Capabilities in Azure API Management
We’re excited to introduce these GenAI Gateway capabilities in Azure API Management, designed to empower developers to efficiently manage and scale their applications leveraging Azure OpenAI services. Get started today and bring your intelligent application development to the next level with Azure API Management.

Expose REST APIs as MCP servers with Azure API Management and API Center (now in preview)
As AI-powered agents and large language models (LLMs) become central to modern application experiences, developers and enterprises need seamless, secure ways to connect these models to real-world data and capabilities. Today, we’re excited to introduce two powerful preview capabilities in the Azure API Management Platform: Expose REST APIs in Azure API Management as remote Model Context Protocol (MCP) servers Discover and manage MCP servers using API Center as a centralized enterprise registry Together, these updates help customers securely operationalize APIs for AI workloads and improve how APIs are managed and shared across organizations. Unlocking the value of AI through secure API integration While LLMs are incredibly capable, they are stateless and isolated unless connected to external tools and systems. Model Context Protocol (MCP) is an open standard designed to bridge this gap by allowing agents to invoke tools—such as APIs—via a standardized, JSON-RPC-based interface. With this release, Azure empowers you to operationalize your APIs for AI integration—securely, observably, and at scale. 1. Expose REST APIs as MCP servers with Azure API Management An MCP server exposes selected API operations to AI clients over JSON-RPC via HTTP or Server-Sent Events (SSE). These operations, referred to as “tools,” can be invoked by AI agents through natural language prompts. With this new capability, you can expose your existing REST APIs in Azure API Management as MCP servers—without rebuilding or rehosting them. Addressing common challenges Before this capability, customers faced several challenges when implementing MCP support: Duplicating development efforts: Building MCP servers from scratch often led to unnecessary work when existing REST APIs already provided much of the needed functionality. Security concerns: Server trust: Malicious servers could impersonate trusted ones. 
Credential management: Self-hosted MCP implementations often had to manage sensitive credentials like OAuth tokens.
Registry and discovery: Without a centralized registry, discovering and managing MCP tools was manual and fragmented, making it hard to scale securely across teams.

API Management now addresses these concerns by serving as a managed, policy-enforced hosting surface for MCP tools, offering centralized control, observability, and security.

Benefits of using Azure API Management with MCP
By exposing MCP servers through Azure API Management, customers gain:
- Centralized governance for API access, authentication, and usage policies
- Secure connectivity using OAuth 2.0 and subscription keys
- Granular control over which API operations are exposed to AI agents as tools
- Built-in observability through APIM’s monitoring and diagnostics features

How it works
- MCP servers: In your API Management instance, navigate to MCP servers.
- Choose an API: Select + Create a new MCP Server and select the REST API you wish to expose.
- Configure the MCP Server: Select the API operations you want to expose as tools. These can be all or a subset of your API’s methods.
- Test and Integrate: Use tools like MCP Inspector or Visual Studio Code (in agent mode) to connect, test, and invoke the tools from your AI host.

Getting started and availability
This feature is now in public preview and is being gradually rolled out to early access customers. To use the MCP server capability in Azure API Management:

Prerequisites
- Your APIM instance must be on a SKUv1 tier: Premium, Standard, or Basic
- Your service must be enrolled in the AI Gateway early update group (activation may take up to 2 hours)
- Use the Azure Portal with the feature flag: append ?Microsoft_Azure_ApiManagement=mcp to your portal URL to access the MCP server configuration experience

Note: Support for SKUv2 and broader availability will follow in upcoming updates.
Full setup instructions and test guidance can be found via aka.ms/apimdocs/exportmcp.
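For readers who have not seen MCP traffic, the JSON-RPC messages an AI client sends to discover and invoke tools look roughly like the output of this Python sketch. The method names `tools/list` and `tools/call` come from the MCP specification; the tool name `getOrderStatus` and its arguments are hypothetical, and the sketch leaves out the transport, the initialization handshake, and authentication.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry an id to match responses


def jsonrpc_request(method, params=None):
    """Serialize a JSON-RPC 2.0 request."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def list_tools_request():
    """Ask the MCP server which tools (API operations) it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(tool_name, arguments):
    """Invoke one tool by name with its input arguments."""
    return jsonrpc_request("tools/call", {"name": tool_name, "arguments": arguments})


print(list_tools_request())
# "getOrderStatus" is a made-up tool standing in for one exposed REST operation.
print(call_tool_request("getOrderStatus", {"orderId": "12345"}))
```

Each API operation you select as a tool becomes one entry in the `tools/list` response, and a `tools/call` for that entry is what the gateway maps back to the underlying REST operation.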
2. Centralized MCP registry and discovery with Azure API Center As enterprises adopt MCP servers at scale, the need for a centralized, governed registry becomes critical. Azure API Center now provides this capability—serving as a single, enterprise-grade system of record for managing MCP endpoints. With API Center, teams can: Maintain a comprehensive inventory of MCP servers. Track version history, ownership, and metadata. Enforce governance policies across environments. Simplify compliance and reduce operational overhead. API Center also addresses enterprise-grade security by allowing administrators to define who can discover, access, and consume specific MCP servers—ensuring only authorized users can interact with sensitive tools. To support developer adoption, API Center includes: Semantic search and a modern discovery UI. Easy filtering based on capabilities, metadata, and usage context. Tight integration with Copilot Studio and GitHub Copilot, enabling developers to use MCP tools directly within their coding workflows. These capabilities reduce duplication, streamline workflows, and help teams securely scale MCP usage across the organization. Getting started This feature is now in preview and accessible to customers: https://aka.ms/apicenter/docs/mcp AI Gateway Lab | MCP Registry 3. What’s next These new previews are just the beginning. We're already working on: Azure API Management (APIM) Passthrough MCP server support We’re enabling APIM to act as a transparent proxy between your APIs and AI agents—no custom server logic needed. This will simplify onboarding and reduce operational overhead. Azure API Center (APIC) Deeper integration with Copilot Studio and VS Code Today, developers must perform manual steps to surface API Center data in Copilot workflows. We’re working to make this experience more visual and seamless, allowing developers to discover and consume MCP servers directly from familiar tools like VS Code and Copilot Studio. 
For questions or feedback, reach out to your Microsoft account team or visit:
Azure API Management documentation
Azure API Center documentation

The Azure API Management & API Center Teams

📢Announcing agent loop: Build AI Agents in Azure Logic Apps 🤖
This post is written in collaboration with Kent Weare and Rohitha Hewawasam The era of intelligent business processes has arrived! Today, we are excited to announce agent loop, a groundbreaking new capability in Azure Logic Apps to build AI agents into your enterprise workflows. With agent loop, you can embed advanced AI decision-making directly into your processes – enabling your apps and automation to not just follow predefined steps, but to reason, adapt, and act autonomously towards goals. Agent loop becomes central to AI Agent development — it’s a new action type that brings together your AI model of choice, domain-specific tools, and enterprise knowledge sources. Whether you’re building an autonomous agent to process loan approvals, a conversational agent to support customers, or a multi-agent system that coordinates tasks such as Sales Report generation across agents, Agent Loop enables your workflows to go beyond static steps — making decisions, adapting to context, and delivering outcomes. Agent loop is implemented using kernel object in the Semantic Kernel. The kernel object, along with an LLM, creates the plan for what needs to be done, while Logic Apps runtime handles execution of that plan. Agent Loop is highly configurable, enabling you to build agents with diverse capabilities: Conversational or Autonomous Agents With Logic Apps' extensive gallery of connectors, you can build fully autonomous agents that respond to real-time events — like new records in a database, files added to a share, or messages in a queue. Agent Loop also supports conversational agents via Channels, allowing agents to interact with users through the Azure portal or custom chat clients. Bring your own Model Associate your AI agent with any Azure OpenAI model of your choice. As new models become available, you can easily switch or upgrade without re-architecting the solution. 
Define Agent Goals and Guardrails Specify your agent’s objective and behavioral boundaries through system prompts and user instructions. Using connectors like Outlook or Teams, you can easily introduce human-in-the-loop interactions for approvals or overrides — enabling safe, controlled autonomy. Tools and Knowledge, Built In Leverage hundreds of out-of-the-box connectors to equip agents with access to enterprise systems, APIs, and business data. Enrich their reasoning with knowledge from vector stores, structured databases, or unstructured files, and empower them to take meaningful actions across your environment. AI Agents in Action Here are some examples of AI Agents in Action that highlight the value and efficiencies of these agents across different domains and solution areas. A product return agent verifies order details, return eligibility, and refund rules, then processes the return or requests additional info from the customer. A loan approval agent evaluates credit score, income, and risk profile, applies business rules, and auto-approves or routes applications for review. A recruiting agent screens resumes, summarizes qualifications, and drafts personalized outreach to top candidates, streamlining early hiring stages. A sales report generation workflow uses a writer agent to draft content, a reviewer agent to verify accuracy, and a publisher agent to format and distribute the report. An IT operations agent triages alerts, checks recent changes, and either resolves common issues or escalates to on-call engineers when needed. A multi-agent retail supply chain solution combines inventory and logistics agents to ensure timely restocks and optimize fulfillment routes. Why agent loop matters Modern businesses thrive on agility and intelligence. Traditional workflows remain essential for deterministic tasks—especially those involving structured data or high-risk decisions. 
But when processes involve unstructured data, changing context, or require adaptive decision-making, AI agents excel. They can reason, act in real time, and dynamically sequence steps to meet goals. Agent Loop serves exactly this purpose.

What makes Agent Loop especially powerful is its deep integration with the Logic Apps ecosystem. Logic Apps comes with 1,400+ connectors for Microsoft and third-party services, from databases and ERP systems to SaaS applications and custom APIs. Logic Apps can also invoke custom code and scripts, making it easy to tap into homegrown capabilities. The agent isn’t limited to information in its prompt; it can actively retrieve knowledge, perform transactions, and effect change in the real world via these connectors. Logic Apps is uniquely positioned to enable customers to leverage their API and connector ecosystem cohesively across their workflows and AI agents to build agentic applications.

Equally important, Agent Loop is designed for flexibility. You can orchestrate single-agent workflows or coordinate multiple agents working in tandem towards a common goal. Agent Loop can even involve humans in the loop when needed, for instance pausing to get a manager’s approval or to ask for clarification, using Logic Apps’ human workflow capabilities. All of this is handled within the familiar, visual Logic Apps designer, so you get a high-level view of the entire orchestration.

How agent loop works
At a high level, Agent Loop works by pairing the reasoning capabilities of large-scale AI models with the robust action framework of Logic Apps. Built on top of Semantic Kernel, the agent loop operates in iterative cycles, allowing the agent to think, act, and learn from each step:
Reasoning (Think): The agent, powered by an LLM (such as an Azure OpenAI model) and running on Semantic Kernel, examines its goal and the current context.
It decides what needs to be done next – whether that’s gathering more information, calling a specific connector, or formulating an answer. This step is essentially the AI “planning” its next action based on the goal you’ve provided and the data it has so far. Action (Act): The agent then carries out the decided action by invoking a tool or connector through Logic Apps. This could be anything from querying a database, calling a REST API, sending an email, to running a calculation. Thanks to Logic Apps’ extensive connector library, the agent has a rich toolbox at its disposal. Each action is executed as a Logic Apps step, meaning it’s secure, managed, and logged like any other workflow action. Reflection (Learn): After the action, the agent receives the results (e.g. data retrieved, outcome of the API call, user input, etc.). It then evaluates: Did this bring it closer to the goal? Does the plan need adjusting? The agent updates its understanding based on new information. This reflection is what lets the agent handle complex, open-ended tasks – it can correct course if needed, try alternative approaches, or conclude if the goal has been satisfied. These steps repeat in a loop. The Agent Loop action manages this cycle automatically – calling the AI model to reason, executing the chosen connector operations, feeding results back, and iterating. Why Build AI Agents in Logic Apps? Building AI agents is an emerging frontier in automation but doing it from the ground up can be daunting especially when organizations build them in large numbers. Agent Loop in Logic Apps makes this dramatically easier and more scalable for several reasons: Declarative Orchestration: Logic Apps provides a visual workflow canvas and a serverless runtime. The Agent Loop action plugs into this and the platform handles the sequence of steps and iterations, so you can focus on defining the goal and selecting the connectors (tools) the agent can use. 
Code extensibility: Logic Apps supports both declarative and code-first approaches to building agents. You can combine the two, using the visual designer for orchestration and injecting code where needed through extensibility points. Write custom logic in C#, PowerShell, or JavaScript, or use inline scripts for lightweight processing. Python support is coming soon, enabling even more flexibility.
1400+ Integrated Tools: With the rich connector ecosystem at its disposal, your agent can seamlessly tap into your enterprise systems and SaaS applications. Your entire ecosystem of connectors, APIs, custom code, and agents can be used by deterministic workflows and agents to solve business problems.
Observability: Logic Apps offers full traceability into each agent’s decisions and actions. Every run is logged in the workflow history, with data stored within the customer’s own network and storage boundaries. The Agent Chat view provides insights into the agent’s reasoning, tool invocations, and goal progress. Developers can easily revisit these logs for debugging, auditing, or analysis.
Enterprise-Grade Governance: Because it runs on Azure Logic Apps, agent loop inherits all the robust monitoring, logging, security, and compliance capabilities of the platform. You can secure connections with managed identities and leverage built-in rate limiting, retries, and exception handling. Your AI agents run with the same enterprise-ready guardrails as any mission-critical workflow.
Human-in-the-Loop & Multi-Agent Coordination: Logic Apps makes it straightforward to involve people at key decision points or to coordinate multiple agents. You can chain Agent Loop actions or have agents invoke other workflows, enabling collaborative problem-solving that would be difficult to implement from scratch. The result is a system where AI and humans can smoothly interact and complement each other.
Faster Time to Value: By eliminating the boilerplate work of building an agent architecture (managing memory, planning logic, connecting to services, etc.), Agent Loop lets developers and architects concentrate on high-value logic and business goals, accelerating how you bring AI-driven improvements to your business processes. In short, agent loop combines the brains of generative AI with the brawn of Azure’s integration platform. It offers a turnkey way to build sophisticated AI-driven automation without reinventing the wheel. Companies no longer have to choose between the flexibility of custom AI solutions and the convenience of a managed workflow service – with Logic Apps and Agent Loop, you get both. Getting Started Agent Loop is available in Logic Apps Standard starting today! Here are some resources to help you begin: Documentation: Explore the agent loop concepts and detailed guide with step-by-step instructions on how to configure and use Agent Loop. Samples & Demos: Watch pre-recorded demos showcasing both conversational and autonomous agent scenarios built with Agent Loop. You'll also get a preview of exciting features coming soon. Looking Ahead Agent Loop opens up a new realm of possibilities for what you can achieve with Azure Logic Apps. It blurs the line between application integration and AI, allowing workflows to evolve from static sequences into adaptive, self-directed processes. We can’t wait to see what you will build with Agent Loop! This is just the beginning. We’re actively investing in new capabilities that are planned for release soon Multi-agent Hand-off Support – A multi-agent application with hand-off capabilities enables different agent-loops to collaborate by transferring tasks between one another based on expertise or context, which is crucial for building agentic applications that can dynamically adapt to complex, evolving goals and user needs. 
A2A (Agent-to-Agent) protocol support – A2A is a communication standard that defines how autonomous agents exchange messages, share context, and coordinate actions in a secure and structured way. It’s especially important in building agentic applications because it ensures interoperability, enables seamless hand-offs between agents, and maintains context integrity across different agents working toward a shared goal. This will allow Logic Apps agents to seamlessly integrate with other agentic platforms.
OBO Auth for Logic Apps Agents: On-Behalf-Of (OBO) auth support would allow Logic Apps agents to use the logged-in user’s identity for authentication when invoking Logic Apps connectors as part of agent loop execution. This will enable conversational applications to dynamically perform OAuth flows to obtain consent from logged-in users and invoke Logic Apps connectors on their behalf.

Contact Us
Have feedback or questions about Agent Loop? We’d love to hear from you. Reply directly to this blog post or reach out to us through this form. Your input helps shape the future of Logic Apps and agentic automation.

Azure API Management Turns 10: Celebrating a Decade of Customer-Driven Innovation and Success
This September marks a truly special occasion: Azure API Management turns 10! Since our launch in 2014, we've been on an incredible journey, transforming how businesses connect, scale and secure their digital ecosystems. As the first cloud provider to integrate API management into its platform, Azure has led the way in helping organizations seamlessly navigate the evolving digital landscape.

Integrate GPT4o (Azure Open AI) in Teams channel via Logic App with image supportability
The previous blog post (Integrate Azure Open AI in Teams Channel via Logic App - Microsoft Community Hub) supported only plain-text conversation. Azure OpenAI has since released the GPT-4o model, which integrates text and images in a single model, enabling it to handle multiple data types simultaneously. This post introduces how to upgrade to GPT-4o with image-processing capability.

Announcing AI building blocks in Logic Apps (Consumption)
We’re thrilled to announce that the Azure OpenAI and AI Search connectors, along with the Parse Document and Chunk Text actions, are now available in the Logic Apps Consumption SKU! These capabilities, already available in the Logic Apps Standard SKU, can now be leveraged in serverless, pay-as-you-go workflows to build powerful AI-driven applications providing cost-efficiency and flexibility. What’s new in Consumption SKU? This release brings almost all the advanced AI capabilities from Logic Apps Standard to Consumption SKU, enabling lightweight, event-driven workflows that automatically scale with your needs. Here’s a summary of the operations now available: Azure OpenAI connector operations Get Completions: Generate text with Azure OpenAI’s GPT models for tasks such as summarization, content creation, and more. Get Embeddings: Generate vector embeddings from text for advanced scenarios like semantic search and knowledge mining. AI Search connector operations Index Document: Add or update a single document in an AI Search index. Index Multiple Documents: Add or update multiple documents in an AI Search index in one operation. *Note: The Vector Search operation for enabling retrieval pattern will be highlighted in an upcoming release in December.* Parse Document and Chunk Text Actions Under the Data operations connector: Parse Document: Extract structured data from uploaded files like PDFs or images. Chunk Text: Split large text blocks into smaller chunks for downstream processing, such as generating embeddings or summaries. Demo workflow: Automating document ingestion with AI To showcase these capabilities, here’s an example workflow that automates document ingestion, processing, and indexing: Trigger: Start the workflow with an HTTP request or an event like a file upload to Azure Blob Storage. Get Blob Content: Retrieve the document to be processed. Parse Document: Extract structured information, such as key data points from a service agreement. 
Chunk Text: Split the document content into smaller, manageable text chunks.
Generate Embeddings: Use the Azure OpenAI connector to create vector embeddings for the text chunks.
Select array: Compose the inputs to be passed to the Index documents operation.
Index Data: Store the embeddings and metadata for downstream applications, like search or analytics.

Why choose Consumption SKU?
With this release, Logic Apps Consumption SKU allows you to:
- Build smarter, scalable workflows: Leverage advanced AI capabilities without upfront infrastructure costs.
- Pay only for what you use: Ideal for event-driven workloads where cost-efficiency is key.
- Integrate seamlessly: Combine AI capabilities with hundreds of existing Logic Apps connectors.

What’s next?
In December, we’ll be announcing the Vector Search operation for the AI Search connector, enabling retrieval capability in Logic Apps Consumption SKU to bring feature parity with Standard SKU. This will allow you to perform advanced search scenarios by matching queries with contextually similar content. Stay tuned for updates!
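The chunk-then-embed part of the demo workflow can be sketched in Python as follows. This is a rough stand-in, not the behavior of the Chunk Text action or the AI Search connector: the character-based windows, the `overlap` parameter, the record field names, and the `embed` callable are all assumptions chosen for the example.

```python
def chunk_text(text, max_chars=1000, overlap=100):
    """Split text into character windows that overlap by `overlap` characters."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    chunks = []
    step = max_chars - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def build_index_records(doc_id, text, embed, **chunk_options):
    """Pair every chunk with its embedding, ready to send to an index."""
    return [
        {"id": f"{doc_id}-{i}", "content": chunk, "embedding": embed(chunk)}
        for i, chunk in enumerate(chunk_text(text, **chunk_options))
    ]


# `fake_embed` is a placeholder for a real embeddings call.
fake_embed = lambda chunk: [float(len(chunk))]
for record in build_index_records("agreement", "abcdefghij", fake_embed,
                                  max_chars=4, overlap=1):
    print(record)
```

Overlapping windows are a common way to avoid cutting a sentence's context at a chunk boundary; the right chunk size depends on the embeddings model and the documents being indexed.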