Agentic applications are revolutionizing enterprise automation, but their dynamic toolchains and latent reasoning make them notoriously hard to operate. In this post, you'll learn how to instrument a Microsoft Agent Framework–based service with OpenTelemetry, ship traces to Azure AI Foundry observability, and adopt a practical workflow to debug, evaluate, and improve multi-agent behavior in production.
We'll show how to wire spans around reasoning steps and tool calls (OpenAPI / MCP), enabling deep visibility into your agentic workflows.
Who Should Read This?
- Developers building agents with Microsoft Agent Framework (MAF) in .NET or Python
- Architects/SREs seeking enterprise-grade visibility, governance, and reliability for deployments on Azure AI Foundry
Why Observability Is Non-Negotiable for Agents
Traditional logs fall short for agentic systems:
- Reasoning and routing (which tool? which doc?) are opaque without explicit spans/events
- Failures often occur between components (e.g., retrieval mismatch, tool schema drift)
- Without traces across agents ⇄ tools ⇄ data stores, you can't reproduce or evaluate behavior
Microsoft has introduced multi-agent observability patterns and OpenTelemetry (OTel) conventions that unify traces across Agent Framework, Foundry, and popular stacks—so you can see one coherent timeline for each task.
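As a sketch of what a convention-aligned span can look like, the snippet below tags an agent invocation with attribute names from the OpenTelemetry GenAI semantic conventions (the `ActivitySource` name and tag values are illustrative, not from a specific framework):

```csharp
using System.Diagnostics;

// Illustrative ActivitySource; in a real service this is the same source
// name you register with your TracerProvider.
var source = new ActivitySource("MyCompany.Agents");

using var span = source.StartActivity("invoke_agent OrderAgent", ActivityKind.Client);

// Attribute names follow the OpenTelemetry GenAI semantic conventions
span?.SetTag("gen_ai.operation.name", "invoke_agent");
span?.SetTag("gen_ai.agent.name", "OrderAgent");
span?.SetTag("gen_ai.request.model", "gpt-4o-mini");
```

Because Agent Framework, Foundry, and instrumented model clients emit the same attribute names, a backend can stitch these spans into one timeline per task.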
Reference Architecture
Key Capabilities
- Agent orchestration & deployment via Microsoft Agent Framework
- Model access using Foundry’s OpenAI-compatible endpoint
- OpenTelemetry for traces/spans + attributes (agent, tool, retrieval, latency, tokens)
Step-by-Step Implementation
Assumption: This article uses Azure Monitor (via Application Insights) as the OpenTelemetry exporter, but you can configure other supported exporters in the same way.
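For instance, to ship the same traces to any OTLP-compatible backend instead, you could swap the Azure Monitor exporter for the OTLP exporter. This is a sketch only: it assumes the `OpenTelemetry.Exporter.OpenTelemetryProtocol` package is installed, and the collector endpoint shown is illustrative.

```csharp
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Same tracing pipeline as in this article, but exporting over OTLP
// instead of Azure Monitor.
var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("AgentObservabilityDemo"))
    .AddSource("Microsoft.Agents.AI")
    .AddOtlpExporter(options =>
    {
        // Illustrative endpoint; point this at your own OTLP collector
        options.Endpoint = new Uri("http://localhost:4317");
    })
    .Build();
```

Everything else in the walkthrough (sources, meters, custom attributes) stays the same; only the exporter registration changes.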
Prerequisites
- .NET 8 SDK or later
- Azure OpenAI service (endpoint, API key, deployed model)
- An Application Insights resource (used by the Azure Monitor exporter); optionally Azure Managed Grafana for dashboards
1. Create an Agent with OpenTelemetry (ASP.NET Core or Console App)
Install required packages:
dotnet add package Azure.AI.OpenAI
dotnet add package Azure.Monitor.OpenTelemetry.Exporter
dotnet add package Microsoft.Agents.AI.OpenAI
dotnet add package Microsoft.Extensions.Logging
dotnet add package Microsoft.Extensions.DependencyInjection
dotnet add package Microsoft.Extensions.Logging.ApplicationInsights
dotnet add package OpenTelemetry
dotnet add package OpenTelemetry.Extensions.Hosting
dotnet add package OpenTelemetry.Instrumentation.Http
(Note: OpenTelemetry.Trace and OpenTelemetry.Metrics are namespaces inside the core OpenTelemetry package, not separate NuGet packages.)
Set up the following environment variables:
AZURE_OPENAI_ENDPOINT: https://<your_service_name>.openai.azure.com/
AZURE_OPENAI_API_KEY: <your_azure_openai_apikey>
APPLICATIONINSIGHTS_CONNECTION_STRING: <your_application_insights_connectionstring_for_azuremonitor_exporter>
Configure tracing once at startup:
// Namespaces used throughout: System.Diagnostics, System.Diagnostics.Metrics,
// Azure, Azure.AI.OpenAI, Microsoft.Agents.AI, Microsoft.Extensions.AI,
// Microsoft.Extensions.DependencyInjection, Microsoft.Extensions.Logging,
// OpenTelemetry, OpenTelemetry.Logs, OpenTelemetry.Metrics,
// OpenTelemetry.Resources, OpenTelemetry.Trace, Azure.Monitor.OpenTelemetry.Exporter

const string ServiceName = "AgentObservabilityDemo";
const string SourceName = "AgentObservabilityDemo.Tracing";

var applicationInsightsConnectionString = Environment.GetEnvironmentVariable("APPLICATIONINSIGHTS_CONNECTION_STRING")
    ?? throw new InvalidOperationException("APPLICATIONINSIGHTS_CONNECTION_STRING is not set");

// Create a resource builder describing the service and reuse it for traces and metrics
var resourceBuilder = ResourceBuilder.CreateDefault()
    .AddService(serviceName: ServiceName)
    .AddAttributes(new Dictionary<string, object>
    {
        ["deployment.environment"] = "development",
        ["service.instance.id"] = Environment.MachineName
    });

// Setup OpenTelemetry TracerProvider
using var traceProvider = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(resourceBuilder)
    .AddSource(SourceName)
    .AddSource("Microsoft.Agents.AI")   // spans emitted by the Agent Framework
    .AddHttpClientInstrumentation()
    .AddAzureMonitorTraceExporter(options =>
    {
        options.ConnectionString = applicationInsightsConnectionString;
    })
    .Build();

// Setup OpenTelemetry MeterProvider
using var meterProvider = Sdk.CreateMeterProviderBuilder()
    .SetResourceBuilder(resourceBuilder)
    .AddMeter(SourceName)
    .AddAzureMonitorMetricExporter(options =>
    {
        options.ConnectionString = applicationInsightsConnectionString;
    })
    .Build();
// Configure DI and OpenTelemetry
var serviceCollection = new ServiceCollection();

// Setup logging with OpenTelemetry and Application Insights.
// Note: routing logs through both the OTel exporter and the classic
// Application Insights provider ingests each entry twice; keep only one in production.
serviceCollection.AddLogging(loggingBuilder =>
{
    loggingBuilder.SetMinimumLevel(LogLevel.Debug);
    loggingBuilder.AddOpenTelemetry(options =>
    {
        options.SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(ServiceName));
        options.IncludeScopes = true;
        options.IncludeFormattedMessage = true;
        options.AddAzureMonitorLogExporter(exporterOptions =>
        {
            exporterOptions.ConnectionString = applicationInsightsConnectionString;
        });
    });
    loggingBuilder.AddApplicationInsights(
        configureTelemetryConfiguration: (config) =>
        {
            config.ConnectionString = applicationInsightsConnectionString;
        },
        configureApplicationInsightsLoggerOptions: options =>
        {
            options.TrackExceptionsAsExceptionTelemetry = true;
            options.IncludeScopes = true;
        });
});

// Build the provider and resolve the logger used in the rest of the sample
using var serviceProvider = serviceCollection.BuildServiceProvider();
var logger = serviceProvider.GetRequiredService<ILogger<Program>>();
Configure custom metrics and activity source for tracing:
using var activitySource = new ActivitySource(SourceName);
using var meter = new Meter(SourceName);
// Create custom metrics
var interactionCounter = meter.CreateCounter<long>("chat_interactions_total", description: "Total number of chat interactions");
var responseTimeHistogram = meter.CreateHistogram<double>("chat_response_time_ms", description: "Chat response time in milliseconds");
2. Wire up the AI Agent:
// Create OpenAI client
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
    ?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT is not set");
var apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
    ?? throw new InvalidOperationException("AZURE_OPENAI_API_KEY is not set");
var deploymentName = "gpt-4o-mini";

// Wrap the chat client with OpenTelemetry instrumentation.
// EnableSensitiveData records prompts and completions in spans; disable it in production.
using var client = new AzureOpenAIClient(new Uri(endpoint), new AzureKeyCredential(apiKey))
    .GetChatClient(deploymentName)
    .AsIChatClient()
    .AsBuilder()
    .UseOpenTelemetry(sourceName: SourceName, configure: (cfg) => cfg.EnableSensitiveData = true)
    .Build();
logger.LogInformation("Creating Agent with OpenTelemetry instrumentation");
// Create AI Agent
var agent = new ChatClientAgent(
client,
name: "AgentObservabilityDemo",
instructions: "You are a helpful assistant that provides concise and informative responses.")
.AsBuilder()
.UseOpenTelemetry(SourceName, configure: (cfg) => cfg.EnableSensitiveData = true)
.Build();
var thread = agent.GetNewThread();
logger.LogInformation("Agent created successfully with ID: {AgentId}", agent.Id);
3. Instrument Agent logic with semantic attributes and call OpenAI-compatible API:
// Create a parent span for the entire agent session
using var sessionActivity = activitySource.StartActivity("Agent Session");
Console.WriteLine($"Trace ID: {sessionActivity?.TraceId} ");
var sessionId = Guid.NewGuid().ToString("N");
sessionActivity?
.SetTag("agent.name", "AgentObservabilityDemo")
.SetTag("session.id", sessionId)
.SetTag("session.start_time", DateTimeOffset.UtcNow.ToString("O"));
logger.LogInformation("Starting agent session with ID: {SessionId}", sessionId);
using (logger.BeginScope(new Dictionary<string, object> { ["SessionId"] = sessionId, ["AgentName"] = "AgentObservabilityDemo" }))
{
var interactionCount = 0;
while (true)
{
Console.Write("You (or 'exit' to quit): ");
var input = Console.ReadLine();
if (string.IsNullOrWhiteSpace(input) || input.Equals("exit", StringComparison.OrdinalIgnoreCase))
{
logger.LogInformation("User requested to exit the session");
break;
}
interactionCount++;
logger.LogInformation("Processing interaction #{InteractionCount}", interactionCount);
// Create a child span for each individual interaction
using var activity = activitySource.StartActivity("Agent Interaction");
activity?
.SetTag("user.input", input)
.SetTag("agent.name", "AgentObservabilityDemo")
.SetTag("interaction.number", interactionCount);
var stopwatch = Stopwatch.StartNew();
try
{
logger.LogInformation("Starting agent execution for interaction #{InteractionCount}", interactionCount);
// Pass the thread so the conversation keeps its context across turns
var response = await agent.RunAsync(input, thread);
Console.WriteLine($"Agent: {response}");
Console.WriteLine();
stopwatch.Stop();
var responseTimeMs = stopwatch.Elapsed.TotalMilliseconds;
// Record metrics
interactionCounter.Add(1, new KeyValuePair<string, object?>("status", "success"));
responseTimeHistogram.Record(responseTimeMs, new KeyValuePair<string, object?>("status", "success"));
activity?.SetTag("interaction.status", "success");
logger.LogInformation("Agent interaction #{InteractionNumber} completed successfully in {ResponseTime:F2} ms", interactionCount, responseTimeMs);
}
catch (Exception ex)
{
Console.WriteLine($"Error: {ex.Message}");
Console.WriteLine();
stopwatch.Stop();
var responseTimeMs = stopwatch.Elapsed.TotalMilliseconds;
// Record error metrics
interactionCounter.Add(1, new KeyValuePair<string, object?>("status", "error"));
responseTimeHistogram.Record(responseTimeMs,
new KeyValuePair<string, object?>("status", "error"));
activity?
    .SetTag("interaction.status", "error")
    .SetTag("error.message", ex.Message)
    .SetStatus(ActivityStatusCode.Error, ex.Message);
logger.LogError(ex, "Agent interaction #{InteractionNumber} failed after {ResponseTime:F2} ms: {ErrorMessage}",
    interactionCount, responseTimeMs, ex.Message);
}
}
// Add session summary to the parent span
sessionActivity?
.SetTag("session.total_interactions", interactionCount)
.SetTag("session.end_time", DateTimeOffset.UtcNow.ToString("O"));
logger.LogInformation("Agent session completed. Total interactions: {TotalInteractions}", interactionCount);
}
Azure Monitor dashboard
Once you run the agent and generate some traffic, your dashboard in Azure Monitor will be populated as shown below:
You can drill down to specific service / activity source / spans by applying relevant filters:
Key Features Demonstrated
- OpenTelemetry instrumentation with Microsoft Agent Framework
- Custom metrics for user interactions
- End-to-end telemetry correlation
- Real-time telemetry visualization, alongside metrics and logged interactions
Further reading
- Introducing Microsoft Agent Framework
- Azure AI Foundry docs
- OpenTelemetry Aspire Demo with Azure OpenAI