This article is part of a series on API Management and Generative AI. We firmly believe that adding Azure API Management to your AI projects can help you scale your AI models, make them more secure, and make them easier to manage.
In this article, we will look at how to monitor requests made to an Azure OpenAI endpoint. This involves using the Azure API Management service and Azure Application Insights, and deciding which dimensions to monitor.
The problem: do you know what's going on with your API?
So you have created an Azure OpenAI endpoint. Everything is good: your users are enjoying your app, now with AI capabilities. Or is it? How do you know? The way to find out what's going on is through monitoring. Monitoring is the process of collecting and analyzing data to determine the performance and availability of your system.
One way to monitor is to use Azure Monitor and Application Insights directly. However, once you have Azure OpenAI endpoints, it's a good idea to put something like Azure API Management in front of them. If you do, you can enable monitoring based on dimensions of your liking, and track exactly the metrics you're interested in.
What to monitor?
When monitoring an API, you want to know quite a few things. Below are some of the things you might want to monitor:
- How many requests: The sheer volume of requests can give you an idea of how popular your API is.
- How long they take: The time it takes for a request to be processed can give you an idea of how responsive your API is. If requests take too long to process, it can be a sign that your API is underperforming and your users are getting frustrated.
- How many are failing: The number of failed requests can give you an idea of how reliable your API is. If a large number of requests are failing, it can be a sign that your API is not working as expected and your users are not getting the results they expect.
- From where: You also want to know where the requests are coming from; this allows you to identify patterns and trends in your API usage. For example, you might notice that most of your requests come from a particular region or device, which can help you optimize your API for that region or device.
- What the requests are: Knowing what the requests are allows you to identify shortcomings in your LLM. For example, you might need to add more training data, a tool, or similar to ensure that the requests are processed correctly.
- Who is making them: For example, you might notice that most of your requests come from a particular user or group of users, which can help you optimize your API for that user or group.
This information can help you identify problems, optimize performance, and make informed decisions about your API.
Azure OpenAI Emit Token Metric
This is the policy that emits a token-usage metric to Azure Monitor. The policy is added to the API's inbound policy section. The idea is to provide it with a number of dimensions that can then be used to filter the metric in Azure Monitor.

```xml
<azure-openai-emit-token-metric namespace="metric namespace">
    <dimension name="dimension name" value="dimension value" />
    <!-- ...additional dimensions... -->
</azure-openai-emit-token-metric>
```
Dimensions to monitor
Here are some dimensions you can track by; these can also be used without specifying a value:
- API ID
- Operation ID
- Product ID
- User ID
- Subscription ID
- Location
- Gateway ID
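For example, a policy that breaks token metrics down by API, subscription, and caller IP could be sketched like so. This is a minimal sketch: the namespace name `openai-metrics` is an arbitrary choice, built-in dimensions such as API ID and Subscription ID are listed without a value, and `Client IP` supplies its value through a policy expression:

```xml
<azure-openai-emit-token-metric namespace="openai-metrics">
    <dimension name="API ID" />
    <dimension name="Subscription ID" />
    <dimension name="Client IP" value="@(context.Request.IpAddress)" />
</azure-openai-emit-token-metric>
```

With this in the inbound section, each dimension shows up as a filterable custom dimension on the metric in Azure Monitor.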
Adding monitoring
For this policy to work, you need the following:
- Azure API Management instance
- Azure OpenAI Service APIs added to your API Management instance
- Azure App Insights resource
-1- Adding Azure OpenAI Service APIs to your API Management instance
One or more Azure OpenAI Service APIs must be added to your API Management instance. For more information, see Add an Azure OpenAI Service API to Azure API Management.
After you've set that up, your Azure OpenAI API should look like so:
-2- Integrate Azure API Management with Azure Application Insights
Your API Management instance must be integrated with Application insights. For more information, see How to integrate Azure API Management with Azure Application Insights.
- Add App Insights to Azure API Management: Add the Azure App Insights resource to your API Management instance, see below image:
- Enable Application Insights logging for your Azure OpenAI APIs.
- Navigate to your Azure API Management service instance in the Azure portal.
- Select APIs > APIs from the menu on the left.
- Select your API. If configured, select a version.
- Go to the Settings tab from the top bar.
- Scroll down to the Diagnostics Logs section. Check the Enable box.
- Select your attached logger in the Destination dropdown.
- Input 100 as Sampling (%) and select the Always log errors checkbox.
- Leave the rest of the settings as is. For details about the settings, see Diagnostic logs settings reference.
- Select Save.
-3- Enable custom metrics in Application Insights
Enable custom metrics with dimensions in Application Insights. For more information, see Emit custom metrics.
- Enable Custom metrics (Preview) with custom dimensions in your Application Insights instance.
- Navigate to your Application Insights instance in the portal.
- In the left menu, select Usage and estimated costs.
- Select Custom metrics (Preview) > With dimensions.
- Select OK.
Enable metrics in diagnostics settings
- Add the "metrics": true property to the applicationInsights diagnostic entity that's configured in API Management. Currently you must add this property using the API Management Diagnostic - Create or Update REST API:

```
PUT https://management.azure.com/subscriptions/{SubscriptionId}/resourceGroups/{ResourceGroupName}/providers/Microsoft.ApiManagement/service/{APIManagementServiceName}/diagnostics/applicationinsights
```

- {SubscriptionId} - The subscription ID of the Azure subscription that contains the API Management service. You can find it here: https://portal.azure.com/#view/Microsoft_Azure_Billing/SubscriptionsBladeV2
- {ResourceGroupName} - The name of the resource group that contains the API Management service.
- {APIManagementServiceName} - The name of the API Management service.
- A token to make the request. Use the Azure CLI with the following commands:

```shell
az login
az account get-access-token --resource https://management.azure.com
```

Here's an example using curl, with example values:
- SubscriptionId: 1234-abcd
- ResourceGroupName: my-resource-group
- APIManagementServiceName: my-apim
- accessToken: {accessToken}
- App Insights logger ID: my-app-insights-logger
```json
{
  "properties": {
    "loggerId": "/subscriptions/1234-abcd/resourceGroups/my-resource-group/providers/Microsoft.ApiManagement/service/my-apim/loggers/my-app-insights-logger",
    "metrics": true
  }
}
```
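Putting the pieces together, the full request can be sketched like this, using the example values above. Note two assumptions in this sketch: the `api-version` query parameter value is illustrative (check the REST API reference for a current version), and the curl call itself is shown commented out because it needs a valid access token from `az account get-access-token`:

```shell
# Hypothetical example values -- substitute your own.
SUBSCRIPTION_ID="1234-abcd"
RESOURCE_GROUP="my-resource-group"
APIM_NAME="my-apim"
LOGGER_ID="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.ApiManagement/service/${APIM_NAME}/loggers/my-app-insights-logger"

# Build the request URL; the api-version shown here is an assumption.
URL="https://management.azure.com/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.ApiManagement/service/${APIM_NAME}/diagnostics/applicationinsights?api-version=2022-08-01"
echo "$URL"

# With a valid token in $ACCESS_TOKEN, the request itself would be:
# curl -X PUT "$URL" \
#   -H "Authorization: Bearer $ACCESS_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d "{ \"properties\": { \"loggerId\": \"${LOGGER_ID}\", \"metrics\": true } }"
```

A successful response echoes back the diagnostic entity, now including "metrics": true.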
Summary
You've learned how to monitor your Azure OpenAI APIs. Monitoring is an important way to stay on top of API usage, and it helps you with things like troubleshooting, optimizing for certain regions or user types, and much more. Stay in the know - monitor!