AI - Azure AI services Blog

Monitor OpenAI Agents SDK with Application Insights

hieunhu
Mar 17, 2025

As AI agents become more prevalent in applications, monitoring their behavior and performance becomes crucial. In this blog post, we'll explore how to monitor the OpenAI Agents SDK using Azure Application Insights through OpenTelemetry integration.

Enhancing OpenAI Agents with OpenTelemetry

The OpenAI Agents SDK provides powerful capabilities for building agent-based applications. By default, the SDK doesn't emit OpenTelemetry data, as noted in GitHub issue #18. This presents an opportunity to extend the SDK's functionality with robust observability features.

Adding OpenTelemetry integration enables you to:

  • Track agent interactions across distributed systems
  • Monitor performance metrics in production
  • Gain insights into agent behavior
  • Seamlessly integrate with existing observability platforms

Fortunately, the Pydantic Logfire SDK has implemented an OpenTelemetry instrumentation wrapper for OpenAI Agents. This wrapper allows us to capture telemetry data and propagate it to an OpenTelemetry Collector endpoint.

How It Works

The integration works by wrapping the OpenAI Agents tracing provider with a Logfire-compatible wrapper that generates OpenTelemetry spans for various agent activities:

  • Agent runs
  • Function calls
  • Chat completions
  • Handoffs between agents
  • Guardrail evaluations

Each of these activities is captured as a span with relevant attributes that provide context about the operation.

Implementation Example

Here's how to set up the Logfire instrumentation in your application:

import os

import logfire
from openai import AsyncAzureOpenAI
from agents import set_default_openai_client, set_tracing_disabled

# Configure your OpenAI client
azure_openai_client = AsyncAzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT")
)

# Set as default client and enable tracing
set_default_openai_client(azure_openai_client)
set_tracing_disabled(False)

# Configure OpenTelemetry endpoint
os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = "http://0.0.0.0:4318/v1/traces"

# Configure Logfire
logfire.configure(
    service_name='my-agent-service',
    send_to_logfire=False,
    distributed_tracing=True
)

# Instrument OpenAI Agents
logfire.instrument_openai_agents()

 

Note: The send_to_logfire=False parameter ensures that data is only sent to your OpenTelemetry collector, not to Logfire's cloud service.

Environment Variables: The OTEL_EXPORTER_OTLP_TRACES_ENDPOINT environment variable tells the Logfire SDK where to send the OpenTelemetry traces.

If you're using Azure Container Apps with the built-in OpenTelemetry collector, this variable will be automatically set for you.

Similarly, when using AKS with auto-instrumentation enabled via the OpenTelemetry Operator, this environment variable is automatically injected into your pods. For other environments, you'll need to set it manually as shown in the example above.
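
For code that needs to run in both kinds of environments, a small guard can set the variable only when the platform hasn't already injected it. This is a sketch; the localhost URL is an assumption for local development against a collector started as shown below:

```python
import os

# Fall back to a local collector endpoint only when the platform
# (Azure Container Apps, AKS with the OpenTelemetry Operator) has not
# already injected one. The localhost URL is a local-dev assumption.
os.environ.setdefault(
    "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT",
    "http://localhost:4318/v1/traces",
)

print(os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"])
```

Using setdefault rather than a plain assignment means a platform-provided endpoint always wins over the local default.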

Setting Up the OpenTelemetry Collector

To collect and forward the telemetry data to Application Insights, we need to set up an OpenTelemetry Collector. There are two ways to do this:

Option 1: Run the Collector Locally

Download the OpenTelemetry Collector Contrib release for your processor architecture from: https://github.com/open-telemetry/opentelemetry-collector-releases/releases/tag/v0.121.0

Note that only the Contrib distribution includes the Azure Monitor exporter.

./otelcol-contrib --config=otel-collector-config.yaml

Option 2: Run the Collector in Docker

docker run --rm \
   -v $(pwd)/otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml \
   -p 4318:4318 \
   -p 55679:55679 \
   otel/opentelemetry-collector-contrib:latest

Collector Configuration

Here's a basic configuration for the OpenTelemetry Collector that forwards data to Azure Application Insights:

receivers:
  otlp:
    protocols:
      http:
        endpoint: "0.0.0.0:4318"

exporters:
  azuremonitor:
    connection_string: "InstrumentationKey=your-instrumentation-key;IngestionEndpoint=https://your-region.in.applicationinsights.azure.com/"
    maxbatchsize: 100
    maxbatchinterval: 10s
  debug:
    verbosity: basic

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [azuremonitor, debug]
 

Important: Replace the connection_string value with your actual Application Insights connection string.

What You Can Monitor

With this setup, you can monitor various aspects of your OpenAI Agents in Application Insights:

  • Agent Performance: Track how long each agent takes to process requests
  • Model Usage: Monitor which AI models are being used and their response times
  • Function Calls: See which tools/functions are being called by agents
  • Handoffs: Track when agents hand off tasks to other specialized agents
  • Errors: Identify and diagnose failures in agent processing
  • End-to-End Traces: Follow user requests through your entire system
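
Once traces are flowing, you can query them with KQL in the Application Insights Logs blade. Assuming the Azure Monitor exporter maps these OpenTelemetry spans to the dependencies table (its usual behaviour for internal and client spans) and that the service name matches the service_name you passed to logfire.configure, a query along these lines surfaces per-operation latency:

```kusto
dependencies
| where timestamp > ago(1h)
| where cloud_RoleName == "my-agent-service"
| summarize calls = count(), avg_duration_ms = avg(duration) by name
| order by avg_duration_ms desc
```

Adjust the table and role name to match what actually appears in your workspace.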

Example Trace Visualisation

In Application Insights, you can visualise the traces as a hierarchical timeline that shows the flow of operations across agent runs, chat completions, and function calls.

Known Issue: Span Name Display in Application Insights

When using Logfire SDK 3.8.1 with Application Insights, you might notice that span names appear as raw message templates (with unformatted placeholders) instead of showing the actual agent or model names. This makes it harder to identify specific spans in the Application Insights UI.

Issue: In the current implementation of the Logfire SDK's OpenAI Agents integration, the message template is used as the span's name, so spans are displayed with placeholders like {name!r} or {gen_ai.request.model!r} instead of actual values.

Temporary Fix

Until the Logfire SDK introduces a fix, you can modify logfire/_internal/integrations/openai_agents.py to format the span names properly. After pip install logfire, the file is usually located at venv/lib/python3.11/site-packages/logfire/_internal/integrations/openai_agents.py.

Replace the span creation code around line 100:

Original code:

logfire_span = self.logfire_instance.span(
    msg_template,
    **attributes_from_span_data(span_data, msg_template),
    **extra_attributes,
    _tags=['LLM'] * isinstance(span_data, GenerationSpanData),
)

Modified code, which formats the message template and sets the result as the span name:

attributes = attributes_from_span_data(span_data, msg_template)
message = logfire_format(msg_template, dict(attributes or {}), NOOP_SCRUBBER)
logfire_span = self.logfire_instance.span(
    msg_template,
    _span_name=message,
    **attributes,
    **extra_attributes,
    _tags=['LLM'] * isinstance(span_data, GenerationSpanData),
)
 
 

This change formats the message template with actual values and sets it as the span name, making it much easier to identify spans in the Application Insights UI.

After applying this fix, your spans will display meaningful names like "Chat completion with 'gpt-4o'" instead of "Chat completion with {gen_ai.request.model!r}".
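
To see why the raw template renders with placeholders until it is formatted, here is a simplified, self-contained sketch of the formatting step. The render_template function is an illustrative stand-in for Logfire's internal logfire_format, not its actual implementation:

```python
import re

def render_template(template: str, attributes: dict) -> str:
    # Replace each {key} / {key!r} placeholder with the matching attribute
    # value, loosely mimicking what Logfire's formatter does internally.
    def sub(match: re.Match) -> str:
        key, conv = match.group(1), match.group(2)
        if key not in attributes:
            return match.group(0)  # leave unresolved placeholders intact
        value = attributes[key]
        return repr(value) if conv == "!r" else str(value)

    return re.sub(r"\{([\w.]+)(![rs])?\}", sub, template)

template = "Chat completion with {gen_ai.request.model!r}"
print(render_template(template, {"gen_ai.request.model": "gpt-4o"}))
# → Chat completion with 'gpt-4o'
```

Without this formatting step, the template string itself becomes the span name, which is exactly what shows up in the Application Insights UI.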

 

Limitation: Even after applying this fix, HandOff spans will still not show the correct to_agent field in the span name. This occurs because the to_agent field is not set during initial span creation but later in the on_ending method of the LogfireSpanWrapper class:

 
@dataclass
class LogfireSpanWrapper(LogfireWrapperBase[Span[TSpanData]], Span[TSpanData]):
    # ...
    def on_ending(self):
        # This is where to_agent gets updated, but too late for the span name
        # ...

Until the Logfire SDK addresses this, you can still see the correct handoff values by clicking on the span and inspecting the logfire.msg property. For example, you'll see "Handoff: Customer Service Agent -> Investment Specialist" in the message property even when the span name doesn't show it.

Auto-Instrumentation for AKS

Azure Kubernetes Service (AKS) offers a codeless way to enable OpenTelemetry instrumentation for your applications. This approach simplifies the setup process and ensures that your OpenAI Agents can send telemetry data without requiring manual instrumentation.

How to Enable Auto-Instrumentation

To enable auto-instrumentation for Python applications in AKS, you can add an annotation to your pod specification:

annotations:
  instrumentation.opentelemetry.io/inject-python: 'true'

This annotation tells the OpenTelemetry Operator to inject the necessary instrumentation into your Python application.
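
In a Deployment manifest, the annotation belongs on the pod template metadata rather than on the Deployment itself, since the Operator mutates pods at admission time. A minimal sketch, where the names and image are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-service            # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: agent-service
  template:
    metadata:
      labels:
        app: agent-service
      annotations:
        # The OpenTelemetry Operator only reads annotations on the pod template
        instrumentation.opentelemetry.io/inject-python: 'true'
    spec:
      containers:
        - name: agent
          image: myregistry.azurecr.io/agent-service:latest   # placeholder image
```
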


Built-in Managed OpenTelemetry Collector in Azure Container Apps

Azure Container Apps provides a built-in Managed OpenTelemetry Collector that simplifies the process of collecting and forwarding telemetry data to Application Insights. This eliminates the need to deploy and manage your own collector instance.

Setting Up the Managed Collector

When you enable the built-in collector, Azure Container Apps automatically sets the OTEL_EXPORTER_OTLP_ENDPOINT environment variable for your applications. This allows the Logfire SDK to send traces to the collector without any additional configuration.

Here's an example of enabling the collector in an ARM template:

{
  "type": "Microsoft.App/containerApps",
  "properties": {
    "configuration": {
      "dapr": {},
      "ingress": {},
      "observability": {
        "applicationInsightsConnection": {
          "connectionString": "InstrumentationKey=your-instrumentation-key"
        }
      }
    }
  }
}


Conclusion

Monitoring OpenAI Agents with Application Insights provides valuable insights into your AI systems' performance and behavior. By leveraging the Pydantic Logfire SDK's OpenTelemetry instrumentation and the OpenTelemetry Collector, you can gain visibility into your agents' operations and ensure they're functioning as expected.

This approach allows you to integrate AI agent monitoring into your existing observability stack, making it easier to maintain and troubleshoot complex AI systems in production environments.

Resources

The full implementation can be found at https://github.com/hieumoscow/azure-openai-agents


Updated Mar 17, 2025
Version 6.0