Azure AI Foundry Blog

Observability for Multi-Agent Systems with Microsoft Agent Framework and Azure AI Foundry

Shah_Viral
Microsoft
Nov 11, 2025

Agentic applications are revolutionizing enterprise automation, but their dynamic toolchains and latent reasoning make them notoriously hard to operate. In this post, you'll learn how to instrument a Microsoft Agent Framework–based service with OpenTelemetry, ship traces to Azure AI Foundry observability, and adopt a practical workflow to debug, evaluate, and improve multi-agent behavior in production. 

We'll show how to wire spans around reasoning steps and tool calls (OpenAPI / MCP), enabling deep visibility into your agentic workflows.

Who Should Read This? 
  • Developers building agents with Microsoft Agent Framework (MAF) in .NET or Python
  • Architects/SREs seeking enterprise-grade visibility, governance, and reliability for deployments on Azure AI Foundry 
Why Observability Is Non-Negotiable for Agents 

Traditional logs fall short for agentic systems: 

  • Reasoning and routing (which tool? which doc?) are opaque without explicit spans/events 
  • Failures often occur between components (e.g., retrieval mismatch, tool schema drift) 
  • Without traces that span agents, tools, and data stores, you can't reproduce or evaluate behavior

Microsoft has introduced multi-agent observability patterns and OpenTelemetry (OTel) conventions that unify traces across Agent Framework, Foundry, and popular stacks—so you can see one coherent timeline for each task.
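
To make the "one coherent timeline" idea concrete, here is a minimal sketch using only the `System.Diagnostics` types that OpenTelemetry for .NET builds on. The source name `Demo.MultiAgent` and the span names are hypothetical, and the in-process `ActivityListener` stands in for the OpenTelemetry SDK so the snippet runs without any exporter. It is meant to illustrate that nested spans share a single trace ID, not to reproduce what Agent Framework emits.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical source name used only for this sketch.
const string DemoSource = "Demo.MultiAgent";

var finished = new List<Activity>();

// StartActivity returns null unless something is listening.
// In a real service the OpenTelemetry SDK registers this listener for you.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == DemoSource,
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = stopped => finished.Add(stopped)
};
ActivitySource.AddActivityListener(listener);

using var demoSource = new ActivitySource(DemoSource);

using (var task = demoSource.StartActivity("task"))
{
    using (var planner = demoSource.StartActivity("agent planner"))
    {
        // Child of the planner span: a tool call made while planning.
        using var tool = demoSource.StartActivity("tool search_docs");
        tool?.SetTag("tool.name", "search_docs");
    }

    // Sibling of the planner span: a second agent working on the same task.
    using var writer = demoSource.StartActivity("agent writer");
}

// Every span carries the same TraceId; ParentSpanId encodes the nesting.
foreach (var span in finished)
{
    Console.WriteLine($"{span.TraceId} {span.ParentSpanId} -> {span.SpanId} {span.OperationName}");
}
```

Because all four spans share one trace ID, a backend can render them as a single timeline for the task, which is the property the conventions above are designed to preserve across agents and tools.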

Reference Architecture
Key Capabilities
  • Agent orchestration & deployment via Microsoft Agent Framework 
  • Model access using Foundry’s OpenAI-compatible endpoint 
  • OpenTelemetry for traces/spans + attributes (agent, tool, retrieval, latency, tokens)
Step-by-Step Implementation

Assumption: This article uses Azure Monitor (via Application Insights) as the OpenTelemetry exporter, but you can configure other supported exporters in the same way.
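
As an illustration of swapping exporters, the sketch below sends the same traces to an OTLP endpoint and to the console instead of Azure Monitor. It assumes the `OpenTelemetry.Exporter.OpenTelemetryProtocol` and `OpenTelemetry.Exporter.Console` packages are installed; the service name and the `localhost:4317` endpoint are placeholder values, not settings from this article.

```csharp
using System;
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Placeholder service name and collector endpoint; replace with your own.
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("AgentObservabilityDemo"))
    .AddSource("Microsoft.Agents.AI")
    // OTLP exporter: sends spans to a collector (gRPC on port 4317 by default).
    .AddOtlpExporter(options => options.Endpoint = new Uri("http://localhost:4317"))
    // Console exporter: prints spans to stdout, handy for local debugging.
    .AddConsoleExporter()
    .Build();
```

Only the exporter calls change; the resource, sources, and instrumentation stay the same regardless of where the telemetry is sent.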

Prerequisites
  • .NET 8 SDK or later 
  • Azure OpenAI service (endpoint, API key, deployed model) 
  • Application Insights and Grafana

1. Create an Agent with OpenTelemetry (ASP.NET Core or Console App)

Install required packages: 

dotnet add package Azure.AI.OpenAI
dotnet add package Azure.Monitor.OpenTelemetry.Exporter
dotnet add package Microsoft.Agents.AI.OpenAI
dotnet add package Microsoft.Extensions.Logging
dotnet add package Microsoft.Extensions.Logging.ApplicationInsights
dotnet add package OpenTelemetry
dotnet add package OpenTelemetry.Extensions.Hosting
dotnet add package OpenTelemetry.Instrumentation.Http

Set up environment variables:

AZURE_OPENAI_ENDPOINT: https://<your_service_name>.openai.azure.com/ 
AZURE_OPENAI_API_KEY: <your_azure_openai_apikey> 
APPLICATIONINSIGHTS_CONNECTION_STRING: <your_application_insights_connectionstring_for_azuremonitor_exporter>

Configure tracing once at startup:

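using System.Diagnostics;
using System.Diagnostics.Metrics;
using Azure;
using Azure.AI.OpenAI;
using Azure.Monitor.OpenTelemetry.Exporter;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using OpenTelemetry;
using OpenTelemetry.Logs;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Example names; choose values that identify your own service and telemetry source
const string ServiceName = "AgentObservabilityDemo";
const string SourceName = "AgentObservabilityDemo.Telemetry";

var applicationInsightsConnectionString = Environment.GetEnvironmentVariable("APPLICATIONINSIGHTS_CONNECTION_STRING");

// Describe the service once and reuse the same resource for traces and metrics
var resourceBuilder = ResourceBuilder.CreateDefault()
    .AddService(serviceName: ServiceName)
    .AddAttributes(new Dictionary<string, object>
    {
        ["deployment.environment"] = "development",
        ["service.instance.id"] = Environment.MachineName
    });

// Setup OpenTelemetry TracerProvider (disposed at exit so pending spans are flushed)
using var traceProvider = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(resourceBuilder)
    .AddSource(SourceName)
    .AddSource("Microsoft.Agents.AI")
    .AddHttpClientInstrumentation()
    .AddAzureMonitorTraceExporter(options =>
    {
        options.ConnectionString = applicationInsightsConnectionString;
    })
    .Build();

// Setup OpenTelemetry MeterProvider (disposed at exit so pending metrics are flushed)
using var meterProvider = Sdk.CreateMeterProviderBuilder()
    .SetResourceBuilder(resourceBuilder)
    .AddMeter(SourceName)
    .AddAzureMonitorMetricExporter(options =>
    {
        options.ConnectionString = applicationInsightsConnectionString;
    })
    .Build();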

// Configure DI and OpenTelemetry
var serviceCollection = new ServiceCollection();

// Setup Logging with OpenTelemetry and Application Insights
serviceCollection.AddLogging(loggingBuilder =>
    {
        loggingBuilder.SetMinimumLevel(LogLevel.Debug);
        loggingBuilder.AddOpenTelemetry(options =>
        {
            options.SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(ServiceName));
            options.IncludeScopes = true;
            options.IncludeFormattedMessage = true;
            options.AddAzureMonitorLogExporter(exporterOptions =>
            {
                exporterOptions.ConnectionString = applicationInsightsConnectionString;
            });
        });
        loggingBuilder.AddApplicationInsights(
            configureTelemetryConfiguration: (config) =>
            {
                config.ConnectionString = Environment.GetEnvironmentVariable("APPLICATIONINSIGHTS_CONNECTION_STRING");
            },
            configureApplicationInsightsLoggerOptions: options =>
            {
                options.TrackExceptionsAsExceptionTelemetry = true;
                options.IncludeScopes = true;
            });
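    });

// Build the service provider and resolve the logger used in the rest of the sample
using var serviceProvider = serviceCollection.BuildServiceProvider();
var logger = serviceProvider.GetRequiredService<ILogger<Program>>();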

Configure custom metrics and activity source for tracing:

using var activitySource = new ActivitySource(SourceName);
using var meter = new Meter(SourceName);

// Create custom metrics
var interactionCounter = meter.CreateCounter<long>("chat_interactions_total", description: "Total number of chat interactions");
var responseTimeHistogram = meter.CreateHistogram<double>("chat_response_time_ms", description: "Chat response time in milliseconds");
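
Before relying on the dashboard, it can help to confirm locally that the instruments emit what you expect. The following is a minimal sketch that uses `MeterListener` from `System.Diagnostics.Metrics` to capture measurements in-process; the meter name `Demo.AgentMetrics`, the sample values, and the `Describe` helper are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical meter name used only for this sketch.
const string DemoMeterName = "Demo.AgentMetrics";

var recorded = new List<string>();

using var demoMeter = new Meter(DemoMeterName);
var demoCounter = demoMeter.CreateCounter<long>("chat_interactions_total");
var demoHistogram = demoMeter.CreateHistogram<double>("chat_response_time_ms");

using var meterListener = new MeterListener();

// Subscribe only to instruments that belong to the demo meter.
meterListener.InstrumentPublished = (instrument, listener) =>
{
    if (instrument.Meter.Name == DemoMeterName)
    {
        listener.EnableMeasurementEvents(instrument);
    }
};

// One callback per measurement type: long for the counter, double for the histogram.
meterListener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
    recorded.Add(Describe(instrument.Name, value, tags)));
meterListener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
    recorded.Add(Describe(instrument.Name, value, tags)));
meterListener.Start();

// Callbacks run synchronously, so both measurements are captured immediately.
demoCounter.Add(1, new KeyValuePair<string, object?>("status", "success"));
demoHistogram.Record(812.5, new KeyValuePair<string, object?>("status", "success"));

foreach (var line in recorded)
{
    Console.WriteLine(line);
}

// Formats one measurement together with its tags.
static string Describe(string name, double value, ReadOnlySpan<KeyValuePair<string, object?>> tags)
{
    var parts = new List<string>();
    foreach (var tag in tags)
    {
        parts.Add($"{tag.Key}={tag.Value}");
    }
    return $"{name} = {value} [{string.Join(", ", parts)}]";
}
```

A check like this makes unit mistakes easy to spot, for example a value recorded in seconds on a histogram whose name ends in `_ms`.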

2. Wire up the AI agent:

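// Create OpenAI client; fail fast if the required settings are missing
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
    ?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT is not set.");
var apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
    ?? throw new InvalidOperationException("AZURE_OPENAI_API_KEY is not set.");
var deploymentName = "gpt-4o-mini";

// EnableSensitiveData records prompt and response content in telemetry;
// turn it off outside development environments
using var client = new AzureOpenAIClient(new Uri(endpoint), new AzureKeyCredential(apiKey))
    .GetChatClient(deploymentName)
    .AsIChatClient()
    .AsBuilder()
    .UseOpenTelemetry(sourceName: SourceName, configure: (cfg) => cfg.EnableSensitiveData = true)
    .Build();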

logger.LogInformation("Creating Agent with OpenTelemetry instrumentation");

// Create AI Agent
var agent = new ChatClientAgent(
    client,
    name: "AgentObservabilityDemo",
    instructions: "You are a helpful assistant that provides concise and informative responses.")
    .AsBuilder()
    .UseOpenTelemetry(SourceName, configure: (cfg) => cfg.EnableSensitiveData = true)
    .Build();

var thread = agent.GetNewThread();

logger.LogInformation("Agent created successfully with ID: {AgentId}", agent.Id);

3. Instrument agent logic with semantic attributes and call the OpenAI-compatible API:

// Create a parent span for the entire agent session
using var sessionActivity = activitySource.StartActivity("Agent Session");
Console.WriteLine($"Trace ID: {sessionActivity?.TraceId} ");

var sessionId = Guid.NewGuid().ToString("N");
sessionActivity?
    .SetTag("agent.name", "AgentObservabilityDemo")
    .SetTag("session.id", sessionId)
    .SetTag("session.start_time", DateTimeOffset.UtcNow.ToString("O"));

logger.LogInformation("Starting agent session with ID: {SessionId}", sessionId);
using (logger.BeginScope(new Dictionary<string, object> { ["SessionId"] = sessionId, ["AgentName"] = "AgentObservabilityDemo" }))
{
    var interactionCount = 0;
    while (true)
    {
        Console.Write("You (or 'exit' to quit): ");
        var input = Console.ReadLine();

        if (string.IsNullOrWhiteSpace(input) || input.Equals("exit", StringComparison.OrdinalIgnoreCase))
        {
            logger.LogInformation("User requested to exit the session");
            break;
        }

        interactionCount++;
        logger.LogInformation("Processing interaction #{InteractionCount}", interactionCount);

        // Create a child span for each individual interaction
        using var activity = activitySource.StartActivity("Agent Interaction");
        activity?
            .SetTag("user.input", input)
            .SetTag("agent.name", "AgentObservabilityDemo")
            .SetTag("interaction.number", interactionCount);

        var stopwatch = Stopwatch.StartNew();

        try
        {
            logger.LogInformation("Starting agent execution for interaction #{InteractionCount}", interactionCount);

            // Pass the thread so the agent keeps conversation context across turns
            var response = await agent.RunAsync(input, thread);
            Console.WriteLine($"Agent: {response}");
            Console.WriteLine();

            stopwatch.Stop();
            var responseTimeMs = stopwatch.Elapsed.TotalMilliseconds;

            // Record metrics
            interactionCounter.Add(1, new KeyValuePair<string, object?>("status", "success"));
            responseTimeHistogram.Record(responseTimeMs, new KeyValuePair<string, object?>("status", "success"));

            activity?.SetTag("interaction.status", "success");
            logger.LogInformation("Agent interaction #{InteractionNumber} completed successfully in {ResponseTime:F2} ms", interactionCount, responseTimeMs);
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error: {ex.Message}");
            Console.WriteLine();

            stopwatch.Stop();
            // Use milliseconds here as well so the histogram unit stays consistent
            var responseTimeMs = stopwatch.Elapsed.TotalMilliseconds;

            // Record error metrics
            interactionCounter.Add(1, new KeyValuePair<string, object?>("status", "error"));
            responseTimeHistogram.Record(responseTimeMs,
                new KeyValuePair<string, object?>("status", "error"));

            activity?
                .SetTag("interaction.status", "error")
                .SetTag("error.message", ex.Message)
                .SetStatus(ActivityStatusCode.Error, ex.Message);
            logger.LogError(ex, "Agent interaction #{InteractionNumber} failed after {ResponseTime:F2} ms: {ErrorMessage}",
                interactionCount, responseTimeMs, ex.Message);
        }
    }

    // Add session summary to the parent span
    sessionActivity?
        .SetTag("session.total_interactions", interactionCount)
        .SetTag("session.end_time", DateTimeOffset.UtcNow.ToString("O"));

    logger.LogInformation("Agent session completed. Total interactions: {TotalInteractions}", interactionCount);
}
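
The same pattern extends to tool calls (OpenAPI / MCP) that your own code makes. The sketch below wraps a tool invocation in a child span; the helper `TraceToolCallAsync`, the `tool.*` tag names, the source name `Demo.Tools`, and the stand-in `get_weather` tool are all hypothetical and not part of Agent Framework. The framework's own instrumentation may already emit spans for tools it invokes, so treat this as an option for calls it does not cover.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical source name; in the sample above you would pass the existing activitySource.
using var toolSource = new ActivitySource("Demo.Tools");

// Stand-in for a real OpenAPI or MCP tool call.
var forecast = await TraceToolCallAsync(toolSource, "get_weather", () => Task.FromResult("sunny"));
Console.WriteLine(forecast); // prints "sunny"

// Runs a tool delegate inside a span and records its outcome and duration.
// The span is null when nothing is listening, so every call is null-conditional.
static async Task<T> TraceToolCallAsync<T>(ActivitySource source, string toolName, Func<Task<T>> invoke)
{
    using var activity = source.StartActivity($"Tool Call {toolName}", ActivityKind.Client);
    activity?.SetTag("tool.name", toolName);

    var stopwatch = Stopwatch.StartNew();
    try
    {
        var result = await invoke();
        activity?.SetTag("tool.status", "success");
        return result;
    }
    catch (Exception ex)
    {
        // Mark the span as failed, then let the caller handle the exception.
        activity?.SetTag("tool.status", "error");
        activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
        throw;
    }
    finally
    {
        activity?.SetTag("tool.duration_ms", stopwatch.Elapsed.TotalMilliseconds);
    }
}
```

When the helper is called inside the "Agent Interaction" span, the tool span becomes its child automatically through `Activity.Current`, so tool latency and failures show up inside the interaction they belong to.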

Azure Monitor dashboard

Once you run the agent and generate some traffic, your dashboard in Azure Monitor will be populated as shown below:

You can drill down to specific service / activity source / spans by applying relevant filters:

Key Features Demonstrated
  • OpenTelemetry instrumentation with Microsoft Agent Framework
  • Custom metrics for user interactions
  • End-to-end telemetry correlation
  • Real-time telemetry visualization alongside metrics and logs
Further reading
Contributors:

pranav_pratik

MariyamAshai

Updated Nov 12, 2025
Version 6.0