observability

eBPF-Powered Observability Beyond Azure: A Multi-Cloud Perspective with Retina
Kubernetes simplifies container orchestration but introduces observability challenges due to dynamic pod lifecycles and complex inter-service communication. eBPF technology addresses these issues by providing deep system insights and efficient monitoring. The open-source Retina project leverages eBPF for comprehensive, cloud-agnostic network observability across AKS, GKE, and EKS, enhancing troubleshooting and optimization through real-world demo scenarios.

Discover the Future of Data Engineering with Microsoft Fabric for Technical Students & Entrepreneurs
Microsoft Fabric is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, Real-Time Analytics, and business intelligence. It offers a comprehensive suite of services, including data lake, data engineering, and data integration, all in one place. This makes it an ideal platform for technical students and entrepreneurial developers looking to streamline their data engineering and analytics workflows.

Foundry Agent Service at Ignite 2025: Simple to Build. Powerful to Deploy. Trusted to Operate.
The upgraded Foundry Agent Service delivers a unified, simplified platform with managed hosting, built-in memory, tool catalogs, and seamless integration with Microsoft Agent Framework. Developers can now deploy agents faster and more securely, leveraging one-click publishing to Microsoft 365 and advanced governance features for streamlined enterprise AI operations.

Observability for Multi-Agent Systems with Microsoft Agent Framework and Azure AI Foundry
Agentic applications are revolutionizing enterprise automation, but their dynamic toolchains and latent reasoning make them notoriously hard to operate. In this post, you'll learn how to instrument a Microsoft Agent Framework–based service with OpenTelemetry, ship traces to Azure AI Foundry observability, and adopt a practical workflow to debug, evaluate, and improve multi-agent behavior in production. We'll show how to wire spans around reasoning steps and tool calls (OpenAPI / MCP), enabling deep visibility into your agentic workflows.

Who Should Read This?

- Developers building agents with Microsoft Agent Framework (MAF) in .NET or Python
- Architects/SREs seeking enterprise-grade visibility, governance, and reliability for deployments on Azure AI Foundry

Why Observability Is Non-Negotiable for Agents

Traditional logs fall short for agentic systems:

- Reasoning and routing (which tool? which doc?) are opaque without explicit spans/events
- Failures often occur between components (e.g., retrieval mismatch, tool schema drift)
- Without traces across agents ⇄ tools ⇄ data stores, you can't reproduce or evaluate behavior

Microsoft has introduced multi-agent observability patterns and OpenTelemetry (OTel) conventions that unify traces across Agent Framework, Foundry, and popular stacks, so you can see one coherent timeline for each task.

Reference Architecture

Key Capabilities

- Agent orchestration & deployment via Microsoft Agent Framework
- Model access using Foundry's OpenAI-compatible endpoint
- OpenTelemetry for traces/spans + attributes (agent, tool, retrieval, latency, tokens)

Step-by-Step Implementation

Assumption: This article uses Azure Monitor (via Application Insights) as the OpenTelemetry exporter, but you can configure other supported exporters in the same way.
Prerequisites

- .NET 8 SDK or later
- Azure OpenAI service (endpoint, API key, deployed model)
- Application Insights and Grafana

1. Create an Agent with OpenTelemetry (ASP.NET Core or Console App)

Install required packages:

```shell
dotnet add package Azure.AI.OpenAI
dotnet add package Azure.Monitor.OpenTelemetry.Exporter
dotnet add package Microsoft.Agents.AI.OpenAI
dotnet add package Microsoft.Extensions.Logging
dotnet add package OpenTelemetry
dotnet add package OpenTelemetry.Trace
dotnet add package OpenTelemetry.Metrics
dotnet add package OpenTelemetry.Extensions.Hosting
dotnet add package OpenTelemetry.Instrumentation.Http
```

Set up environment variables:

- AZURE_OPENAI_ENDPOINT: https://<your_service_name>.openai.azure.com/
- AZURE_OPENAI_API_KEY: <your_azure_openai_apikey>
- APPLICATIONINSIGHTS_CONNECTION_STRING: <your_application_insights_connectionstring_for_azuremonitor_exporter>

Configure tracing once at startup:

```csharp
var applicationInsightsConnectionString =
    Environment.GetEnvironmentVariable("APPLICATIONINSIGHTS_CONNECTION_STRING");

// Create a resource describing the service
var resource = ResourceBuilder.CreateDefault()
    .AddService(serviceName: ServiceName)
    .AddAttributes(new Dictionary<string, object>
    {
        ["deployment.environment"] = "development",
        ["service.instance.id"] = Environment.MachineName
    })
    .Build();

// Setup OpenTelemetry TracerProvider
var traceProvider = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(ServiceName))
    .AddSource(SourceName)
    .AddSource("Microsoft.Agents.AI")
    .AddHttpClientInstrumentation()
    .AddAzureMonitorTraceExporter(options =>
    {
        options.ConnectionString = applicationInsightsConnectionString;
    })
    .Build();

// Setup OpenTelemetry MeterProvider
var meterProvider = Sdk.CreateMeterProviderBuilder()
    .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(ServiceName))
    .AddMeter(SourceName)
    .AddAzureMonitorMetricExporter(options =>
    {
        options.ConnectionString = applicationInsightsConnectionString;
    })
    .Build();
```

Configure dependency injection and logging:

```csharp
// Configure DI and OpenTelemetry
var serviceCollection = new ServiceCollection();

// Setup Logging with OpenTelemetry and Application Insights
serviceCollection.AddLogging(loggingBuilder =>
{
    loggingBuilder.SetMinimumLevel(LogLevel.Debug);
    loggingBuilder.AddOpenTelemetry(options =>
    {
        options.SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(ServiceName));
        options.IncludeScopes = true;
        options.IncludeFormattedMessage = true;
        options.AddAzureMonitorLogExporter(exporterOptions =>
        {
            exporterOptions.ConnectionString = applicationInsightsConnectionString;
        });
    });
    loggingBuilder.AddApplicationInsights(
        configureTelemetryConfiguration: (config) =>
        {
            config.ConnectionString = Environment.GetEnvironmentVariable("APPLICATIONINSIGHTS_CONNECTION_STRING");
        },
        configureApplicationInsightsLoggerOptions: options =>
        {
            options.TrackExceptionsAsExceptionTelemetry = true;
            options.IncludeScopes = true;
        });
});
```

Configure custom metrics and an activity source for tracing:

```csharp
using var activitySource = new ActivitySource(SourceName);
using var meter = new Meter(SourceName);

// Create custom metrics
var interactionCounter = meter.CreateCounter<long>(
    "chat_interactions_total",
    description: "Total number of chat interactions");
var responseTimeHistogram = meter.CreateHistogram<double>(
    "chat_response_time_ms",
    description: "Chat response time in milliseconds");
```
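The chat_response_time_ms histogram above is exported to the backend as pre-aggregated bucket counts rather than raw samples; percentiles are then estimated from those buckets. The math the backend performs can be sketched in a few lines of plain Python (illustrative only — the bucket boundaries here are arbitrary examples, not the OpenTelemetry .NET SDK's default boundaries):

```python
from bisect import bisect_left

# Hypothetical explicit bucket bounds, in milliseconds
BOUNDS = [50, 100, 250, 500, 1000, 2500]

def bucketize(samples):
    """Count samples into histogram buckets (last bucket = overflow)."""
    counts = [0] * (len(BOUNDS) + 1)
    for s in samples:
        counts[bisect_left(BOUNDS, s)] += 1
    return counts

def p95_upper_bound(counts):
    """Estimate p95 as the upper bound of the bucket holding the 95th percentile."""
    total = sum(counts)
    target, cum = 0.95 * total, 0
    for i, c in enumerate(counts):
        cum += c
        if cum >= target:
            return BOUNDS[i] if i < len(BOUNDS) else float("inf")

latencies = [80, 120, 130, 200, 220, 240, 300, 310, 450, 600]
counts = bucketize(latencies)
print(counts)                 # → [0, 1, 5, 3, 1, 0, 0]
print(p95_upper_bound(counts))  # → 1000
```

The takeaway for instrumentation: percentile accuracy is bounded by bucket granularity, so a histogram (not just an average) is the right instrument for response times.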
2. Wire up the AI agent:

```csharp
// Create OpenAI client
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT");
var apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY");
var deploymentName = "gpt-4o-mini";

using var client = new AzureOpenAIClient(new Uri(endpoint), new AzureKeyCredential(apiKey))
    .GetChatClient(deploymentName)
    .AsIChatClient()
    .AsBuilder()
    .UseOpenTelemetry(sourceName: SourceName, configure: (cfg) => cfg.EnableSensitiveData = true)
    .Build();

logger.LogInformation("Creating Agent with OpenTelemetry instrumentation");

// Create AI Agent
var agent = new ChatClientAgent(
        client,
        name: "AgentObservabilityDemo",
        instructions: "You are a helpful assistant that provides concise and informative responses.")
    .AsBuilder()
    .UseOpenTelemetry(SourceName, configure: (cfg) => cfg.EnableSensitiveData = true)
    .Build();

var thread = agent.GetNewThread();
logger.LogInformation("Agent created successfully with ID: {AgentId}", agent.Id);
```

3. Instrument agent logic with semantic attributes and call the OpenAI-compatible API:

```csharp
// Create a parent span for the entire agent session
using var sessionActivity = activitySource.StartActivity("Agent Session");
Console.WriteLine($"Trace ID: {sessionActivity?.TraceId}");

var sessionId = Guid.NewGuid().ToString("N");
sessionActivity?
    .SetTag("agent.name", "AgentObservabilityDemo")
    .SetTag("session.id", sessionId)
    .SetTag("session.start_time", DateTimeOffset.UtcNow.ToString("O"));

logger.LogInformation("Starting agent session with ID: {SessionId}", sessionId);

using (logger.BeginScope(new Dictionary<string, object>
{
    ["SessionId"] = sessionId,
    ["AgentName"] = "AgentObservabilityDemo"
}))
{
    var interactionCount = 0;
    while (true)
    {
        Console.Write("You (or 'exit' to quit): ");
        var input = Console.ReadLine();
        if (string.IsNullOrWhiteSpace(input) || input.Equals("exit", StringComparison.OrdinalIgnoreCase))
        {
            logger.LogInformation("User requested to exit the session");
            break;
        }

        interactionCount++;
        logger.LogInformation("Processing interaction #{InteractionCount}", interactionCount);

        // Create a child span for each individual interaction
        using var activity = activitySource.StartActivity("Agent Interaction");
        activity?
            .SetTag("user.input", input)
            .SetTag("agent.name", "AgentObservabilityDemo")
            .SetTag("interaction.number", interactionCount);

        var stopwatch = Stopwatch.StartNew();
        try
        {
            logger.LogInformation("Starting agent execution for interaction #{InteractionCount}", interactionCount);
            var response = await agent.RunAsync(input);
            Console.WriteLine($"Agent: {response}");
            Console.WriteLine();

            stopwatch.Stop();
            var responseTimeMs = stopwatch.Elapsed.TotalMilliseconds;

            // Record metrics
            interactionCounter.Add(1, new KeyValuePair<string, object?>("status", "success"));
            responseTimeHistogram.Record(responseTimeMs, new KeyValuePair<string, object?>("status", "success"));
            activity?.SetTag("interaction.status", "success");

            logger.LogInformation("Agent interaction #{InteractionNumber} completed successfully in {ResponseTime:F2} ms",
                interactionCount, responseTimeMs);
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error: {ex.Message}");
            Console.WriteLine();

            stopwatch.Stop();
            var responseTimeMs = stopwatch.Elapsed.TotalMilliseconds;

            // Record error metrics
            interactionCounter.Add(1, new KeyValuePair<string, object?>("status", "error"));
            responseTimeHistogram.Record(responseTimeMs, new KeyValuePair<string, object?>("status", "error"));
            activity?
                .SetTag("response.success", false)
                .SetTag("error.message", ex.Message)
                .SetStatus(ActivityStatusCode.Error, ex.Message);

            logger.LogError(ex, "Agent interaction #{InteractionNumber} failed after {ResponseTime:F2} ms: {ErrorMessage}",
                interactionCount, responseTimeMs, ex.Message);
        }
    }

    // Add session summary to the parent span
    sessionActivity?
        .SetTag("session.total_interactions", interactionCount)
        .SetTag("session.end_time", DateTimeOffset.UtcNow.ToString("O"));

    logger.LogInformation("Agent session completed. Total interactions: {TotalInteractions}", interactionCount);
}
```

Azure Monitor dashboard

Once you run the agent and generate some traffic, your dashboard in Azure Monitor will be populated as shown below. You can drill down to a specific service, activity source, or span by applying the relevant filters.

Key Features Demonstrated

- OpenTelemetry instrumentation with Microsoft Agent Framework
- Custom metrics for user interactions
- End-to-end telemetry correlation
- Real-time telemetry visualization along with metrics and logging interactions

Further reading

- Introducing Microsoft Agent Framework
- Azure AI Foundry docs
- OpenTelemetry
- Aspire Demo with Azure OpenAI

Announcing new security and observability features in Azure Database for PostgreSQL
Today we’re excited to announce new enterprise-grade security features available in Azure Database for PostgreSQL – Flexible Server, giving developers more control and peace of mind when developing secure applications in Azure.