Azure API Management
Custom Tracing in API Management
When issues such as runtime errors or unexpected behavior occur, request tracing can help isolate the problem by showing which policy, code, or other component is responsible. The trace policy in APIM can add a custom trace into the request tracing output.
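As an illustration (not part of the original post), a minimal sketch of the trace policy is shown below; the source name, message text, and metadata value are hypothetical placeholders:

```xml
<inbound>
    <base />
    <!-- Sketch: emit a custom entry into the request trace output.
         "my-app" and the message text are illustrative placeholders. -->
    <trace source="my-app" severity="information">
        <message>@("Processing request for " + context.Request.Url.Path)</message>
        <metadata name="component" value="custom-tracing-demo" />
    </trace>
</inbound>
```

The trace appears in the trace output only when tracing is enabled for the request (for example, via the portal's Test console).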
AI Resilience: Strategies to Keep Your Intelligent App Running at Peak Performance

Stay Online

Reliability is one of the five pillars of the Azure Well-Architected Framework. When you implement and take to market a new product that integrates with Azure OpenAI Service, you can face usage spikes in your workload. Even with everything scaling correctly on your side, if your Azure OpenAI Service is deployed using PTU (provisioned throughput units), you can hit the PTU threshold and start receiving 429 response codes. The response headers also tell you when you can retry the request, and you can use that information to build a solution into your business logic. In this article I will show how to use an API Management policy to handle this, and also explore the native cache to save some tokens!

Architecture Reference

The Azure Function on the left of the diagram just represents an app making requests; it can be any kind of resource (even in an on-premises environment). Our goal in this article is to show one of many possibilities for handling 429 responses. We are going to use an API Management policy to automatically redirect the backend to another Azure OpenAI Service instance in a second region deployed in Standard mode, which means you are charged only for what you use.

First, we need to create an API in API Management that forwards requests to your main Azure OpenAI Service (region 1 in the diagram). Then we add this policy to the API:

```xml
<policies>
    <inbound>
        <base />
        <set-backend-service base-url="<your_open_ai_region1_endpoint>" />
    </inbound>
    <backend>
        <!-- Retry against region 2 when region 1 returns 429 -->
        <retry condition="@(context.Response != null && context.Response.StatusCode == 429)" count="1" interval="5">
            <choose>
                <when condition="@(context.Response != null && context.Response.StatusCode == 429)">
                    <set-backend-service base-url="<your_open_ai_region2_endpoint>" />
                </when>
            </choose>
            <forward-request buffer-request-body="true" />
        </retry>
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
```

The first part of our job is done! Requests are now automatically redirected to the Azure OpenAI Service deployed in region 2 whenever the PTU threshold in region 1 is reached.

Cost consideration

Now you might ask: what about the added cost of using API Management? Even if you don't want to use any other API Management feature, you can take advantage of its native cache and, once again using a policy, store questions and answers in the built-in Redis* cache using semantic caching for Azure OpenAI Service. Let's change our policy accordingly:

```xml
<policies>
    <inbound>
        <base />
        <azure-openai-semantic-cache-lookup score-threshold="0.05" embeddings-backend-id="azure-openai-backend" embeddings-backend-auth="system-assigned">
            <vary-by>@(context.Subscription.Id)</vary-by>
        </azure-openai-semantic-cache-lookup>
        <set-backend-service base-url="<your_open_ai_region1_endpoint>" />
    </inbound>
    <backend>
        <retry condition="@(context.Response != null && context.Response.StatusCode == 429)" count="1" interval="5">
            <choose>
                <when condition="@(context.Response != null && context.Response.StatusCode == 429)">
                    <set-backend-service base-url="<your_open_ai_region2_endpoint>" />
                </when>
            </choose>
            <forward-request buffer-request-body="true" />
        </retry>
    </backend>
    <outbound>
        <base />
        <azure-openai-semantic-cache-store duration="60" />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
```

Now API Management takes the input prompt, uses semantic equivalence to decide whether it matches cached information, and either serves the cached response or forwards the request to your OpenAI endpoint. Sometimes this can help you avoid reaching the PTU threshold as well!

* Check the tier/cache capabilities to validate your business needs against the API Management cache feature: Compare API Management features across tiers and cache size across tiers.
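Earlier we noted that the 429 response also tells you, via its headers, when it is safe to retry. As a hedged sketch (not from the original article), the retry interval could be driven by that header instead of a fixed delay; this assumes the interval attribute accepts a policy expression (per the retry policy documentation) and that the upstream returns Retry-After in seconds:

```xml
<backend>
    <!-- Sketch: drive the retry delay from the Retry-After header (assumed to be
         in seconds), falling back to 5 seconds when the response or header is
         absent. The region-2 failover above is omitted for brevity. -->
    <retry condition="@(context.Response != null && context.Response.StatusCode == 429)"
           count="1"
           interval='@(context.Response != null ? int.Parse(context.Response.Headers.GetValueOrDefault("Retry-After", "5")) : 5)'>
        <forward-request buffer-request-body="true" />
    </retry>
</backend>
```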
Conclusion

API Management offers key capabilities for AI, some of which we explored in this article, along with others you can leverage for your intelligent applications. Check them out in this awesome AI Gateway HUB repository. Last but not least, dive into API Management features with experts in the field inside the API Management HUB. Thanks for reading and Happy Coding!
Import Logic Apps (Standard) into Azure API Management

API Management (APIM) is a way to create consistent and modern API gateways for existing back-end services. API Management helps organizations publish APIs to external, partner, and internal developers to unlock the potential of their data and services. Azure Logic Apps is a cloud-based platform for creating and running automated workflows that integrate your apps, data, services, and systems. With this platform, you can quickly develop highly scalable integration solutions for your enterprise and business-to-business (B2B) scenarios. To create a logic app, you use either the Logic App (Consumption) resource type or the Logic App (Standard) resource type. The Consumption resource type runs in multi-tenant Azure Logic Apps or an integration service environment, while the Standard resource type runs in single-tenant Azure Logic Apps.

This blog walks you through, step by step, how to import a Logic App (Standard) into Azure API Management. For how to import a Logic App (Consumption) into APIM, please refer to our public doc.

Prerequisites:
- Create an Azure API Management instance.
- Create a Logic App.

The functionality to import directly from "Create from Azure Resource" is not yet available for workflows in Logic App (Standard). We demonstrate how to overcome this limitation in the following steps.

Steps to import a Logic App (Standard) into Azure API Management

As an alternative, we can manually register the Request trigger URL from the workflow as a blank API in the APIM service. We need to divide the request URL (i.e., the Logic App workflow URL) into two parts, which go into the backend and the frontend respectively. For example, this request URL can be broken into two segments:

https://stdla1.azurewebsites.net:443/api/TESTWF1/triggers/manual/invoke?api-version=2020-05-01-pre...

Part 1: https://stdla1.azurewebsites.net:443/api/

Part 2: /test2/triggers/manual/invoke?api-version=2020-05-01-preview&sp=%2Ftriggers%2Fmanual%2Frun&sv=1.0&sig=<123abc>

Place the part 1 URL into either the Web service URL or the Backend HTTP(s) endpoint by clicking the highlighted area as portrayed below. Alternatively, place the part 1 URL into the backend HTTP(s) endpoint by clicking the highlighted area as portrayed below. Then select HTTP(s) endpoint as the target, click Override, and provide the first part of your request URL as shown in the screenshot. Next, add the part 2 URL into the frontend section as depicted in the screenshot below. On testing, it returns a 200 OK response. Similarly, you can add various workflows as different operations in the same API. This method is useful for Logic App (Standard) workflows. Happy Learning!! 🙂
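For reference, here is a hedged sketch of what the same backend/frontend split could look like expressed as API policy instead of portal settings; the host and workflow path are the illustrative values from above, and it assumes the caller supplies the SAS query parameters (api-version, sp, sv, sig):

```xml
<inbound>
    <base />
    <!-- Part 1: the Logic App host and /api base path become the backend -->
    <set-backend-service base-url="https://stdla1.azurewebsites.net:443/api" />
    <!-- Part 2: the workflow trigger path; copy-unmatched-params="true" forwards
         the api-version, sp, sv, and sig query parameters sent by the caller -->
    <rewrite-uri template="/test2/triggers/manual/invoke" copy-unmatched-params="true" />
</inbound>
```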
Configure rate limits for different API operations in Azure API Management

Azure API Management (APIM) is one of the PaaS products offered by Azure that allows you to publish, manage, secure, and monitor APIs. One of the features of APIM is the ability to control traffic to your APIs using policies such as rate limits and quotas. Rate limits are policies that prevent API usage spikes on a per-subscription or per-key basis by limiting the call rate to a specified number per specified time period. Quotas are policies that enforce a hard limit on the number of calls that can be made to an API within a billing period.

In this blog post, we will focus on how to configure rate limits for different operations in APIM using the `rate-limit-by-key` policy. This policy allows you to define expressions to identify the keys that are used to track traffic usage. You can use any arbitrary string value as a key, such as an IP address, subscription ID, etc.

Scenario

Let's say you have two operations in your API: Operation A and Operation B. You want to apply different rate limits for each operation based on your business requirements. For example:

- Operation A has a rate limit of 5 calls per minute
- Operation B has a rate limit of 5 calls per 30 seconds

You also want to make sure that the rate limits are independent by operation, meaning that calling one operation does not affect the counter for another operation.

Solution

As per our official documentation, operations in APIM (regardless of API) use a single counter for all scopes at which the policy is configured. Say you make 2 calls to one operation: these calls are counted toward the single counter used by all of the operations.

To achieve this scenario, you need to use the `rate-limit-by-key` policy with an expression that produces unique values for different operations. One way to ensure that is to add `context.Operation.Id` to the expression. The `context.Operation.Id` property returns a unique identifier for each operation in your API. By concatenating it with another value such as the IP address or subscription ID, you can create a key that is specific to each operation and each caller.

Here is an example of how you can apply this policy in the inbound section of your API:

```xml
<policies>
    <inbound>
        <!-- Applies a rate limit keyed on caller IP address plus operation ID,
             so each caller gets an independent counter per operation -->
        <rate-limit-by-key calls="5" renewal-period="60" counter-key="@(context.Request.IpAddress + context.Operation.Id)" />
        <base />
    </inbound>
    ...
</policies>
```

To test this solution, you can use any tool that can send HTTP requests, such as Postman or curl. You can also use the Azure portal's Test console.

Note: Due to the distributed nature of the throttling architecture, rate limiting is never completely accurate. The difference between the configured and the actual number of allowed requests varies based on request volume and rate, backend latency, and other factors.

You can also customize this policy by adding optional attributes such as increment-condition or retry-after-header-name. For more details on how this policy works and what options are available, see: https://learn.microsoft.com/en-us/azure/api-management/rate-limit-by-key-policy
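The policy above gives every operation an independent counter but the same limit (5 calls per 60 seconds). To give Operation B the tighter window from the scenario, a minimal sketch (assuming the API-level policy is moved to Operation A's scope so the two limits don't stack) is to apply the same policy at Operation B's scope with a different renewal-period:

```xml
<!-- Operation B scope: 5 calls per 30 seconds per caller -->
<policies>
    <inbound>
        <base />
        <rate-limit-by-key calls="5" renewal-period="30" counter-key="@(context.Request.IpAddress + context.Operation.Id)" />
    </inbound>
</policies>
```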
In this blog post, we have seen how to configure rate limits for different operations of an API in Azure API Management using the `rate-limit-by-key` policy. This policy lets us define expressions to identify the keys that are used to track traffic usage. We have also seen how to use `context.Operation.Id` as part of the key expression to ensure that the rate limits are independent by operation. We hope this blog helps you!
How to deploy APIM self-hosted gateway in Windows Server 2019

The APIM self-hosted gateway is packaged as an x86-64 Linux-based Docker container. On Windows 10, if we have Docker Desktop installed, we can easily switch Docker to Linux mode to spin up Linux containers. However, it takes a few more steps to run Linux containers on Windows Server OS.
API Management - Networking FAQs (Demystifying Series I)

This is a demystifying series addressing the various questions related to integrating API Management with its networking components. This article answers the most common questions around APIM and networking, with API Management as its primary point of focus. If you wish to see more details/FAQs covered in the article, mention them in the comments.
Compute Platform Versions for Azure API Management service

As an Azure API Management (APIM) user, you may have noticed that we've been upgrading the API Management compute platform version - the Azure compute resources that host the service - for instances in several service tiers, in order to enhance service capabilities. This article gives you context about the upgrade from platform version stv1 to stv2 and the major differences between these compute platforms.