.net
429 TopicsUsing ClientConnectionId to Correlate .NET Connection Attempts in Azure SQL
Getting Better Diagnostics with ClientConnectionId in .NET A few days ago, I was working on a customer case involving intermittent connectivity failures to Azure SQL Database from a .NET application. On the surface, nothing looked unusual. Retries were happening. In this post, I want to share a simple yet effective pattern for producing JDBC-style trace logs in .NET — specifically focusing on the ClientConnectionId property exposed by SqlConnection. This gives you a powerful correlation key that aligns with backend diagnostics and significantly speeds up root cause analysis for connection problems. Why ClientConnectionId Matters Azure SQL Database assigns a unique identifier to every connection attempt from the client. In .NET, this identifier is available through the ClientConnectionId property of SqlConnection. According to the official documentation: The ClientConnectionId property gets the connection ID of the most recent connection attempt, regardless of whether the attempt succeeded or failed. Source: https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlconnection.clientconnectionid?view=netframework-4.8.1 This GUID is the single most useful piece of telemetry for correlating client connection attempts with server logs and support traces. What .NET Logging Doesn’t Give You by Default Unlike the JDBC driver, the .NET SQL Client does not produce rich internal logs of every connection handshake or retry. There’s no built-in switch to emit gateway and redirect details, attempt counts, or port information. What you do have is: Timestamps Connection attempt boundaries ClientConnectionId values Outcome (success or failure) If you capture and format these consistently, you end up with logs that are as actionable as the JDBC trace output — and importantly, easy to correlate with backend diagnostics and Azure support tooling. Below is a small console application in C# that produces structured logs in the same timestamped, [FINE] format you might see from a JDBC trace — but for .NET applications: using System; using Microsoft.Data.SqlClient; class Program { static int Main() { // SAMPLE connection string (SQL Authentication) // Replace this with your own connection string. // This is provided only for demonstration purposes. string connectionString = "Server=tcp:<servername>.database.windows.net,1433;" + "Database=<database_name>;" + "User ID=<sql_username>;" + "Password=<sql_password>;" + "Encrypt=True;" + "TrustServerCertificate=False;" + "Connection Timeout=30;"; int connectionId = 1; // Log connection creation Log($"ConnectionID:{connectionId} created by (SqlConnection)"); using SqlConnection connection = new SqlConnection(connectionString); try { // Log connection attempt Log($"ConnectionID:{connectionId} This attempt No: 0"); // Open the connection connection.Open(); // Log ClientConnectionId after the connection attempt Log($"ConnectionID:{connectionId} ClientConnectionId: {connection.ClientConnectionId}"); // Execute a simple test query using SqlCommand cmd = new SqlCommand("SELECT 1", connection) { Log($"SqlCommand:1 created by (ConnectionID:{connectionId})"); Log("SqlCommand:1 Executing (not server cursor) SELECT 1"); cmd.ExecuteScalar(); Log("SqlDataReader:1 created by (SqlCommand:1)"); } } catch (SqlException ex) { // ClientConnectionId is available even on failure Log($"ConnectionID:{connectionId} ClientConnectionId: {connection.ClientConnectionId} (failure)"); Log($"SqlException Number: {ex.Number}"); Log($"Message: {ex.Message}"); return 1; } return 0; } // Simple logger to match JDBC-style output format static void Log(string message) { Console.WriteLine( $"[{DateTime.Now:yyyy-MM-dd HH:mm:ss}] [FINE] {message}" ); } } Run the above application and you’ll get output like: [2025-12-31 03:38:10] [FINE] ConnectionID:1 This attempt server name: aabeaXXX.trXXXX.northeurope1-a.worker.database.windows.net port: 11002 InstanceName: null useParallel: false [2025-12-31 03:38:10] [FINE] ConnectionID:1 This attempt endtime: 1767152309272 [2025-12-31 03:38:10] [FINE] ConnectionID:1 This attempt No: 1 [2025-12-31 03:38:10] [FINE] ConnectionID:1 Connecting with server: aabeaXXX.trXXXX.northeurope1-a.worker.database.windows.net port: 11002 Timeout Full: 20 [2025-12-31 03:38:10] [FINE] ConnectionID:1 ClientConnectionID: 6387718b-150d-482a-9731-02d06383d38f Server returned major version: 12 [2025-12-31 03:38:10] [FINE] SqlCommand:1 created by (ConnectionID:1 ClientConnectionID: 6387718b-150d-482a-9731-02d06383d38f) [2025-12-31 03:38:10] [FINE] SqlCommand:1 Executing (not server cursor) select 1 [2025-12-31 03:38:10] [FINE] SqlDataReader:1 created by (SqlCommand:1) [2025-12-31 03:38:10] [FINE] ConnectionID:2 created by (SqlConnection) [2025-12-31 03:38:11] [FINE] ConnectionID:2 ClientConnectionID: 5fdd311e-a219-45bc-a4f6-7ee1cc2f96bf Server returned major version: 12 [2025-12-31 03:38:11] [FINE] sp_executesql SQL: SELECT 1 AS ID, calling sp_executesql [2025-12-31 03:38:12] [FINE] SqlDataReader:3 created by (sp_executesql SQL: SELECT 1 AS ID) Notice how each line is tagged with: A consistent local timestamp (yyyy-MM-dd HH:mm:ss) A [FINE] log level A structured identifier that mirrors what you’d see in JDBC logging If a connection fails, you’ll still get the ClientConnectionId logged, which is exactly what Azure SQL support teams will ask for when troubleshooting connectivity issues.Modernizing Applications by Migrating Code to Use Managed Identity with Copilot App Modernization
Migrating application code to use Managed Identity removes hard‑coded secrets, reduces operational risk, and aligns with modern cloud security practices. Applications authenticate directly with Azure services without storing credentials. GitHub Copilot app modernization streamlines this transition by identifying credential usage patterns, updating code, and aligning dependencies for Managed Identity flows. Supported Migration Steps GitHub Copilot app modernization helps accelerate: Replacing credential‑based authentication with Managed Identity authentication. Updating SDK usage to token‑based flows. Refactoring helper classes that build credential objects. Surfacing libraries or APIs that require alternative authentication approaches. Preparing build configuration changes needed for managed identity integration. Migration Analysis Open the project in Visual Studio Code or IntelliJ IDEA. GitHub Copilot app modernization analyzes: Locations where secrets, usernames, passwords, or connection strings are referenced. Service clients using credential constructors or static credential factories. Environment‑variable‑based authentication workarounds. Dependencies and SDK versions required for Managed Identity authentication. The analysis outlines upgrade blockers and the required changes for cloud‑native authentication. Migration Plan Generation GitHub Copilot app modernization produces a migration plan containing: Replacement of hard‑coded credentials with Managed Identity authentication patterns. Version updates for Azure libraries aligning with Managed Identity support. Adjustments to application configuration to remove unnecessary secrets. Developers can review and adjust before applying. Automated Transformations GitHub Copilot app modernization applies changes: Rewrites code that initializes clients using username/password or connection strings. Introduces Managed Identity‑friendly constructors and token credential patterns. Updates imports, method signatures, and helper utilities. Cleans up configuration files referencing outdated credential flows. Build & Fix Iteration The tool rebuilds the project, identifies issues, and applies targeted fixes: Compilation errors from removed credential classes. Incorrect parameter types or constructors. Dependencies requiring updates for Managed Identity compatibility. Security & Behavior Checks GitHub Copilot app modernization validates: CVEs introduced through dependency updates. Behavior changes caused by new authentication flows. Optional fixes for dependency vulnerabilities. Expected Output A migrated codebase using Managed Identity: Updated authentication logic. Removed credential references. Updated SDKs and dependencies. A summary file listing code edits, dependency changes, and items requiring manual review. Developer Responsibilities Developers should: Validate identity access on Azure resources. Reconfigure role assignments for system‑assigned or user‑assigned managed identities. Test functional behavior across environments. Review integration points dependent on identity scopes and permissions. Learn full upgrade workflows in the Microsoft Learn guide for upgrading Java projects with GitHub Copilot app modernization. Learn more Predefined tasks for GitHub Copilot app modernization Apply a predefined task Install GitHub Copilot app modernization for VS Code and IntelliJ IDEA86Views0likes0CommentsMigrating Application Credentials to Azure Key Vault with GitHub Copilot App Modernization
Storing secrets directly in applications or configuration files increases operational risk. Migrating to Azure Key Vault centralizes secret management, supports rotation, and removes embedded credentials from application code. GitHub Copilot app modernization accelerates this process by identifying credential usage areas and generating changes for Key Vault integration. What This Migration Covers GitHub Copilot app modernization helps with: Detecting secrets hard‑coded in source files, config files, or environment variables. Recommending retrieval patterns using Azure Key Vault SDKs. Updating application code to load secrets from Key Vault. Preparing configuration updates to remove stored credentials. Surfacing dependency, version, and API adjustments required for Key Vault usage. Project Analysis Once the project is opened in Visual Studio Code or IntelliJ IDEA, GitHub Copilot app modernization analyzes: Hard‑coded credentials: passwords, tokens, client secrets, API keys. Legacy configuration patterns using .properties, .yaml, or environment variables. Azure SDK usage and required upgrades for Key Vault integration. Areas requiring secure retrieval or replacement with a managed identity. Migration Plan Generation The tool creates a step‑by‑step migration plan including: Introducing Key Vault client libraries. Mapping existing credential variables to Key Vault secrets. Updating configuration loading logic to retrieve secrets at runtime. Integrating Managed Identity authentication if applicable. Removing unused credential fields from code and configuration. Automated Transformations GitHub Copilot app modernization applies targeted changes: Rewrites code retrieving credentials from files or constants. Generates Key Vault retrieval patterns using SecretClient. Updates build dependencies to current Azure SDK versions. Removes unused configuration entries and environment variables. Build & Fix Iteration The project is rebuilt and validated: Fixes constructor changes related to updated clients. Resolves missing dependency versions. Corrects updated method signatures for Key Vault API calls. Rebuilds until no actionable errors remain. Security & Behavior Checks The tool surfaces: CVEs introduced by new or updated libraries. Behavior changes tied to lazy loading of secrets at runtime. Optional fixes or alternative patterns if Key Vault integration affects existing workflows. Expected Output After modernization: Credentials removed from source and config files. Application retrieves secrets from Azure Key Vault. Updated Azure SDK versions aligned with Key Vault. A summary file detailing code changes, dependency updates, and review items. Developer Responsibilities Developers should: Provision Key Vault resources and assign required access policies. Validate permissions through Managed Identity or service principals. Test application startup, error handling, and rotation scenarios. Review semantic impacts on components relying on early secret loading. Refer to the Microsoft Learn guide on upgrading Java projects with GitHub Copilot app modernization for foundational workflow details. Learn more Predefined tasks for GitHub Copilot app modernization Apply a predefined task Install GitHub Copilot app modernization for VS Code and IntelliJ IDEA110Views0likes0CommentsHow to Resolve .NET Runtime Error 1026 after Windows 11 Update
I hope you're doing well. I am a teacher from Mexico and I'm facing a critical issue after a recent Windows 11 update to version 23H2. Ever since the update, I've been encountering a frustrating problem – a .NET Runtime Error 1026. This issue is causing applications to crash shortly after I launch them, making it impossible for me to access Mi Portal. Given that I heavily rely on Windows applications for my teaching tasks – from document editing to publications and printing, including crucial tasks like payslips handling – this situation has left me unable to perform my duties effectively. I understand that many educators are in a similar situation, where Windows OS and Office applications are our lifelines for seamless work. I urgently need a simple and effective solution to overcome this error and restore the functionality of my Windows 11 system. Your expertise and support in troubleshooting this .NET Runtime Error 1026 would be immensely appreciated. If anyone has encountered a similar issue or has successfully resolved it, I would be grateful for your insights. I need guidance or steps you can provide to help me swiftly resolve this issue and continue my teaching job without further disruptions. Warm regards and appreciation13KViews1like2CommentsMSB4184: The expression "[System.IO.Path]::GetDirectoryName('')" cannot be evaluate
Hi got this error now after I updated my VS IDE .... The path is not of a legal form. WinFormsSQLExamplePro C:\Program Files\dotnet\sdk\10.0.102\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.Windows.targets <_WinRTRuntimeDllDirectory>$([System.IO.Path]::GetDirectoryName('%(_WinRTRuntimeDllReferencePath.FullPath)'))</_WinRTRuntimeDllDirectory> <_CsWinRTOrTargetingPackLibDirectory>$([System.IO.Path]::GetDirectoryName('$(_WinRTRuntimeDllDirectory)'))</_CsWinRTOrTargetingPackLibDirectory> Note the top most path entry contains a '%' character , whereas the 2nd path contains a '$' character... Not sure what is going on ... (happened after VS update)27Views0likes0CommentsEdm Model Generation Fails
Using an Azure ASP.NET MVC API from a Maui app, I am encountering the following error in the app when attempting to access the API. A few days ago, the initalization was working properly. The cloud app has been under further development, but the base api code has not been modified, only the response from the endpoints. The latest release NuGets are in use. What elements are being sought? System.TypeInitializationException: The type initializer for 'GeneratedEdmModel' threw an exception. ---> System.InvalidOperationException: Sequence contains no elements at System.Linq.ThrowHelper.ThrowNoElementsException() at System.Linq.Enumerable.First[TSource](IEnumerable`1 source) at SentryCloud.Models.Container.GeneratedEdmModel.CreateXmlReader()20Views0likes0CommentsImplementing A2A protocol in NET: A Practical Guide
As AI systems mature into multi‑agent ecosystems, the need for agents to communicate reliably and securely has become fundamental. Traditionally, agents built on different frameworks like Semantic Kernel, LangChain, custom orchestrators, or enterprise APIs do not share a common communication model. This creates brittle integrations, duplicate logic, and siloed intelligence. The Agent‑to‑Agent Standard (A2AS) addresses this gap by defining a universal, vendor‑neutral protocol for structured agent interoperability. A2A establishes a common language for agents, built on familiar web primitives: JSON‑RPC 2.0 for messaging and HTTPS for transport. Each agent exposes a machine‑readable Agent Card describing its capabilities, supported input/output modes, and authentication requirements. Interactions are modeled as Tasks, which support synchronous, streaming, and long‑running workflows. Messages exchanged within a task contain Parts; text, structured data, files, or streams, that allow agents to collaborate without exposing internal implementation details. By standardizing discovery, communication, authentication, and task orchestration, A2A enables organizations to build composable AI architectures. Specialized agents can coordinate deep reasoning, planning, data retrieval, or business automation regardless of their underlying frameworks or hosting environments. This modularity, combined with industry adoption and Linux Foundation governance, positions A2A as a foundational protocol for interoperable AI systems. A2AS in .NET — Implementation Guide Prerequisites • .NET 8 SDK • Visual Studio 2022 (17.8+) • A2A and A2A.AspNetCore packages • Curl/Postman (optional, for direct endpoint testing) The open‑source A2A project provides a full‑featured .NET SDK, enabling developers to build and host A2A agents using ASP.NET Core or integrate with other agents as a client. Two A2A and A2A.AspNetCore packages power the experience. The SDK offers: A2AClient - to call remote agents TaskManager - to manage incoming tasks & message routing AgentCard / Message / Task models - strongly typed protocol objects MapA2A() - ASP.NET Core router integration that auto‑generates protocol endpoints This allows you to expose an A2A‑compliant agent with minimal boilerplate. Project Setup Create two separate projects: CurrencyAgentService → ASP.NET Core web project that hosts the agent A2AClient → Console app that discovers the agent card and sends a message Install the packages from the pre-requisites in the above projects. Building a Simple A2A Agent (Currency Agent Example) Below is a minimal Currency Agent implemented in ASP.NET Core. It responds by converting amounts between currencies. Step 1: In CurrencyAgentService project, create the CurrencyAgentImplementation class to implement the A2A agent. The class contains the logic for the following: a) Describing itself (agent “card” metadata). b) Processing the incoming text messages like “100 USD to EUR”. c) Returning a single text response with the conversion. The AttachTo(ITaskManager taskManager) method hooks two delegates on the provided taskManager - a) OnAgentCardQuery → GetAgentCardAsync: returns agent metadata. b) OnMessageReceived → ProcessMessageAsync: handles incoming messages and produces a response. Step 2: In the Program.cs of the Currency Agent Solution, create a TaskManager , and attach the agent to it, and expose the A2A endpoint. Typical flow: GET /agent → A2A host asks OnAgentCardQuery → returns the card POST /agent with a text message → A2A host calls OnMessageReceived → returns the conversion text. All fully A2A‑compliant. Calling an A2A Agent from .NET To interact with any A2A‑compliant agent from .NET, the client follows a predictable sequence: identify where the agent lives, discover its capabilities through the Agent Card, initialize a correctly configured A2AClient, construct a well‑formed message, send it asynchronously, and finally interpret the structured response. This ensures your client is fully aligned with the agent’s advertised contract and remains resilient as capabilities evolve. Below are the steps implemented to call the A2A agent from the A2A client: Identify the agent endpoint: Why: You need a stable base URL to resolve the agent’s metadata and send messages. What: Construct a Uri pointing to the agent service, e.g., https://localhost:7009/agent. Discover agent capabilities via an Agent Card. Why: Agent Cards provide a contract: name, description, final URL to call, and features (like streaming). This de-couples your client from hard-coded assumptions and enables dynamic capability checks. What: Use A2ACardResolver with the endpoint Uri, then call GetAgentCardAsync() to obtain an AgentCard. Initialize the A2AClient with the resolved URL. Why: The client encapsulates transport details and ensures messages are sent to the correct agent endpoint, which may differ from the discovery URL. What: Create A2AClient using new Uri (currencyCard.Url) from the Agent Card for correctness. Construct a well-formed agent request message. Why: Agents typically require structured messages for roles, traceability, and multi-part inputs. A unique message ID supports deduplication and logging. What: Build an AgentMessage: • Role = MessageRole.User clarifies intent. • MessageId = Guid.NewGuid().ToString() ensures uniqueness. • Parts contains content; for simple queries, a single TextPart with the prompt (e.g., “100 USD to EUR”). Package and send the message. Why: MessageSendParams can carry the message plus any optional settings (e.g., streaming flags or context). Using a dedicated params object keeps the API extensible. What: Wrap the AgentMessage in MessageSendParams and call SendMessageAsync(...) on the A2AClient. Outcome: Await the asynchronous response to avoid blocking and to stay scalable. Interpret the agent response. Why: Agents can return multiple Parts (text, data, attachments). Extracting the appropriate part avoids assumptions and keeps your client robust. What: Cast to AgentMessage, then read the first TextPart’s Text for the conversion result in this scenario. Best Practices 1. Keep Agents Focused and Single‑Purpose Design each agent around a clear, narrow capability (e.g., currency conversion, scheduling, document summarization). Single‑responsibility agents are easier to reason about, scale, and test, especially when they become part of larger multi‑agent workflows. 2. Maintain Accurate and Helpful Agent Cards The Agent Card is the first interaction point for any client. Ensure it accurately reflects: Supported input/output formats Streaming capabilities Authentication requirements (if any) Version information A clean and honest card helps clients integrate reliably without guesswork. 3. Prefer Structured Inputs and Outputs Although A2A supports plain text, using structured payloads through DataPart objects significantly improves consistency. JSON inputs and outputs reduce ambiguity, eliminate prompt‑engineering edge cases, and make agent behavior more deterministic especially when interacting with other automated agents. 4. Use Meaningful Task States Treat A2A Tasks as proper state machines. Transition through states intentionally (Submitted → Working → Completed, or Working → InputRequired → Completed). This gives clients clarity on progress, makes long‑running operations manageable, and enables more sophisticated control flows. 5. Provide Helpful Error Messages Make use of A2A and JSON‑RPC error codes such as -32602 (invalid input) or -32603 (internal error), and include additional context in the error payload. Avoid opaque messages, error details should guide the client toward recovery or correction. 6. Keep Agents Stateless Where Possible Stateless agents are easier to scale and less prone to hidden failures. When state is necessary, ensure it is stored externally or passed through messages or task contexts. For local POCs, in‑memory state is acceptable, but design with future statelessness in mind. 7. Validate Input Strictly Do not assume incoming messages are well‑formed. Validate fields, formats, and required parameters before processing. For example, a currency conversion agent should confirm both currencies exist and the value is numeric before attempting a conversion. 8. Design for Streaming Even if Disabled Streaming is optional, but it’s a powerful pattern for agents that perform progressive reasoning or long computations. Structuring your logic so it can later emit partial TextPart updates makes it easy to upgrade from synchronous to streaming workflows. 9. Include Traceability Metadata Embed and log identifiers such as TaskId, MessageId, and timestamps. These become crucial for debugging multi‑agent scenarios, improving observability, and correlating distributed workflows—especially once multiple agents collaborate. 10. Offer Clear Guidance When Input Is Missing Instead of returning a generic failure, consider shifting the task to InputRequired and explaining what the client should provide. This improves usability and makes your agent self‑documenting for new consumers.From Vibe Coding to Working App: How SRE Agent Completes the Developer Loop
The Most Common Challenge in Modern Cloud Apps There's a category of bugs that drive engineers crazy: multi-layer infrastructure issues. Your app deploys successfully. Every Azure resource shows "Succeeded." But the app fails at runtime with a vague error like Login failed for user ''. Where do you even start? You're checking the Web App, the SQL Server, the VNet, the private endpoint, the DNS zone, the identity configuration... and each one looks fine in isolation. The problem is how they connect and that's invisible in the portal. Networking issues are especially brutal. The error says "Login failed" but the actual causes could be DNS, firewall, identity, or all three. The symptom and the root causes are in completely different resources. Without deep Azure networking knowledge, you're just clicking around hoping something jumps out. Now imagine you vibe coded the infrastructure. You used AI to generate the Bicep, deployed it, and moved on. When it breaks, you're debugging code you didn't write, configuring resources you don't fully understand. This is where I wanted AI to help not just to build, but to debug. Enter SRE Agent + Coding Agent Here's what I used: Layer Tool Purpose Build VS Code Copilot Agent Mode + Claude Opus Generate code, Bicep, deploy Debug Azure SRE Agent Diagnose infrastructure issues and create developer issue with suggested fixes in source code (app code and IaC) Fix GitHub Coding Agent Create PRs with code and IaC fix from Github issue created by SRE Agent Copilot builds. SRE Agent debugs. Coding Agent fixes. What I Built I used VS Code Copilot in Agent Mode with Claude Opus to create a .NET 8 Web App connected to Azure SQL via private endpoint: Private networking (no public exposure) Entra-only authentication Managed identity (no secrets) Deployed with azd up. All green. Then I tested the health endpoint: $ curl https://app-tsdvdfdwo77hc.azurewebsites.net/health/sql {"status":"unhealthy","error":"Login failed for user ''.","errorType":"SqlException"} Deployment succeeded. App failed. One error. How I Fixed It: Step by Step Step 1: Create SRE Agent with Azure Access I created an SRE Agent with read access to my Azure subscription. You can scope it to specific resource groups. The agent builds a knowledge graph of your resources and their dependencies visible in the Resource Mapping view below. Step 2: Connect GitHub to SRE Agent using GitHub MCP server I connected the GitHub MCP server so the agent could read my repository and create issues. Step 3: Create Sub Agent to analyze source code I created a sub-agent for analyzing source code using GitHub mcp tools. this lets SRE Agent understand not just Azure resources, but also the Bicep and source code files that created them. "you are expert in analyzing source code (bicep and app code) from github repos" Step 4: Invoke Sub-Agent to Analyze the Error In the SRE Agent chat, I invoked the sub-agent to diagnose the error I received from my app end point. It correlated the runtime error with the infrastructure configuration Step 5: Watch the SRE Agent Think and Reason SRE Agent analyzed the error by tracing code in Program.cs, Bicep configurations, and Azure resource relationships Web App, SQL Server, VNet, private endpoint, DNS zone, and managed identity. Its reasoning process worked through each layer, eliminating possibilities one by one until it identified the root causes. Step 6: Agent Creates GitHub Issue Based on its analysis, SRE Agent summarized the root causes and suggested fixes in a GitHub issue: Root Causes: Private DNS Zone missing VNet link Managed identity not created as SQL user Suggested Fixes: Add virtualNetworkLinks resource to Bicep Add SQL setup script to create user with db_datareader and db_datawriter roles Step 7: Merge the PR from Coding Agent Assign the Github issue to Coding Agent which then creates a PR with the fixes. I just reviewed the fix. It made sense and I merged it. Redeployed with azd up, ran the SQL script: curl -s https://app-tsdvdfdwo77hc.azurewebsites.net/health/sql | jq . { "status": "healthy", "database": "tododb", "server": "tcp:sql-tsdvdfdwo77hc.database.windows.net,1433", "message": "Successfully connected to SQL Server" } 🎉 From error to fix in minutes without manually debugging a single Azure resource. Why This Matters If you're a developer building and deploying apps to Azure, SRE Agent changes how you work: You don't need to be a networking expert. SRE Agent understands the relationships between Azure resources private endpoints, DNS zones, VNet links, managed identities. It connects dots you didn't know existed. You don't need to guess. Instead of clicking through the portal hoping something looks wrong, the agent systematically eliminates possibilities like a senior engineer would. You don't break your workflow. SRE Agent suggests fixes in your Bicep and source code not portal changes. Everything stays version controlled. Deployed through pipelines. No hot fixes at 2 AM. You close the loop. AI helps you build fast. Now AI helps you debug fast too. Try It Yourself Do you vibe code your app, your infrastructure, or both? How do you debug when things break? Here's a challenge: Vibe code a todo app with a Web App, VNet, private endpoint, and SQL database. "Forget" to link the DNS zone to the VNet. Deploy it. Watch it fail. Then point SRE Agent at it and see how it identifies the root cause, creates a GitHub issue with the fix, and hands it off to Coding Agent for a PR. Share your experience. I'd love to hear how it goes. Learn More Azure SRE Agent documentation Azure SRE Agent blogs Azure SRE Agent community Azure SRE Agent home page Azure SRE Agent pricing605Views3likes0CommentsFix It Before They Feel It: Higher Reliability with Proactive Mitigation
What if your infrastructure could detect performance issues and fix them automatically—before your users even notice? This blog brings that vision to life using Azure SRE Agent, an AI-powered autonomous agent that monitors, detects, and remediates production issues in real-time. 💡 The magic: Zero human intervention required. The agent handles detection, diagnosis, remediation, and reporting—all autonomously. 📺 Watch the Demo This content was presented at .NET Day 2025. Watch the full session to see Azure SRE Agent in action: 🎬 Fix it before they feel it - .NET Day 2025 🎯 What You'll See in This Demo Watch as we intentionally deploy "bad" code to production and observe how the SRE Agent: Detects the degradation — Compares live response times against learned baselines Takes autonomous action — Executes a slot swap to roll back to healthy code Communicates the incident — Posts to Teams and creates a GitHub issue Generates reports — Summarizes deployment metrics for stakeholders 🚀 Key Capabilities Capability What It Shows Proactive Baseline Learning Agent learns normal response times and stores them in a knowledge base Real-time Anomaly Detection Instant comparison of current vs. baseline metrics Autonomous Remediation Agent executes Azure CLI commands to swap slots without human approval Cross-platform Communication Automatic Teams posts and GitHub issue creation Incident Reporting End-of-day email summaries with deployment health metrics Architecture Overview The solution uses Azure SRE Agent with three specialized sub-agents working together: Components Application Layer: .NET 9 Web API running on Azure App Service Application Insights for telemetry collection Azure Monitor Alerts for incident triggers Azure SRE Agent: AvgResponseTime Sub-Agent: Captures baseline metrics every 15 minutes, stores in Knowledge Store DeploymentHealthCheck Sub-Agent: Triggered by deployment alerts, compares metrics to baseline, auto-remediates DeploymentReporter Sub-Agent: Generates daily summary emails from Teams activity External Integrations: GitHub (issue creation, semantic code search, Copilot assignment) Microsoft Teams (deployment summaries) Outlook (summary reports) What the SRE Agent Does When a deployment occurs, the SRE Agent autonomously performs the following actions: 1. Health Check (DeploymentHealthCheck Sub-Agent) When a slot swap alert fires, the agent: Queries App Insights for current response times Retrieves the baseline from the Knowledge Store Compares current performance against baseline If degradation > 20%: Executes rollback and creates GitHub issue If healthy: Posts confirmation to Teams Healthy Deployment - No Action Needed (Teams Post): The agent confirms the deployment is healthy — response time (22ms) is 80% faster than baseline (116ms). Degraded Deployment - Automatic Rollback (Teams Post): The agent detects +332% latency regression (212ms vs 116ms baseline), executes a slot swap to rollback, and creates GitHub Issue. 2. Daily Summary (DeploymentReporter Sub-Agent) Every 24 hours, the reporter agent: Reads all Teams deployment posts from the last 24 hours Aggregates deployment metrics Sends an executive summary email Daily Summary (Outlook Email): The daily report shows 9 deployments, 6 healthy, 3 rollbacks, and 3 GitHub issues created — complete with response time details and issue links. Demo Flow View Step-by-Step Instructions → Step Action Step 1 Deploy Infrastructure + Applications Step 2 Create Sub-Agents, Triggers & Schedules Step 3 Swap bad code, watch agent remediate Setting Up the Demo Prerequisites Azure subscription with Contributor access Azure CLI installed and logged in (az login) .NET 9.0 SDK PowerShell 7.0+ Step 1: Deploy Infrastructure cd scripts .\1-setup-demo.ps1 -ResourceGroupName "sre-demo-rg" -AppServiceName "sre-demo-app-12345" This script will: Prompt for Azure subscription selection Deploy Azure infrastructure (App Service, App Insights, Alerts) Build and deploy healthy code to production Build and deploy problematic code to staging Full Setup Instructions → Step 2: Configure Azure SRE Agent Navigate to Azure SRE Agents Portal Sub Agent builder tab and create three sub-agents: Sub-Agent Purpose Tools Used AvgResponseTime Captures baseline response time metrics QueryAppInsightsByAppId, UploadKnowledgeDocument DeploymentHealthCheck Detects degradation and executes remediation SearchMemory, QueryAppInsights, PostTeamsMessage, CreateGithubIssue, Az CLI commands DeploymentReporter Generates deployment summary reports GetTeamsMessages, SendOutlookEmail Creating Each Sub-Agent In the Github links below you can find gif images that capture the creation flow AvgResponseTime + Baseline Task: Detailed Instructions → DeploymentHealthCheck + Swap Alert: Detailed Instructions → DeploymentReporter + Reporter Task: Detailed Instructions → Step 3: Run the Demo .\2-run-demo.ps1 This triggers the following flow: Slot Swap Occurs (demo script) ▼ Activity Log Alert Fires ▼ Incident Trigger Activated ▼ DeploymentHealthCheck Agent Runs ─ Queries current response time from App Insights ─ Retrieves baseline from knowledge store ─ Compares (if >20% degradation) ─ Executes: az webapp deployment slot swap ─ Creates GitHub issue (if degraded) ─ Posts to Teams channel Full Demo Instructions → Demo Timeline Time Event 0:00 Run 2-run-demo.ps1 0:30 Swap staging → production (bad code deployed) 1:00 Production now slow (~1500ms vs ~50ms baseline) ~5:00 Slot Swap Alert fires ~5:04 Agent executes slot swap (rollback) ~5:30 Production restored to healthy state ~6:00 Agent posts to Teams, creates GitHub issue How the Performance Toggle Works The app has a compile-time toggle in ProductsController.cs: private const bool EnableSlowEndpoints = false; // false = fast, true = slow The setup script creates two versions: Production: EnableSlowEndpoints = false → ~50ms responses Staging: EnableSlowEndpoints = true → ~1500ms responses (artificial delay) Get Started 🔗 Full source code and instructions: github.com/microsoft/sre-agent/samples/proactive-reliability 🔗 Azure SRE Agent documentation: https://learn.microsoft.com/en-us/azure/sre-agent/ Technology Stack Framework: ASP.NET Core 9.0 Infrastructure: Azure Bicep Monitoring: Application Insights + Log Analytics Automation: Azure SRE Agent Scripts: PowerShell 7.0+ Tags: Azure, SRE Agent, DevOps, Reliability, .NET, App Service, Application Insights, Autonomous Remediation364Views0likes1CommentBuild Long-Running AI Agents on Azure App Service with Microsoft Agent Framework
UPDATE 10/22/2025: An alternative implementation of this sample app has been added to this blog post. The alternate version uses a WebJob for background processing instead of an in-process hosted service. WebJobs are a great alternative for background processing in App Service, providing better separation of concerns, independent restarts, and dedicated logging. To learn more about WebJobs on App Service, see the Azure App Service WebJobs documentation. The AI landscape is evolving rapidly, and with the introduction of Microsoft Agent Framework, developers now have a powerful platform for building sophisticated AI agents that go far beyond simple chat completions. These agents can execute complex, multi-step workflows with persistent state, conversation threads, and structured execution—capabilities that are essential for production AI applications. Today, we're excited to share how Azure App Service provides an excellent platform for running Agent Framework workloads, especially those involving long-running operations. Let's explore why App Service is a great choice and walk through a practical example. 🔗 Quick link to sample app GitHub repo: https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet 🔗 Quick link to WebJob sample app GitHub repo: https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet-webjob The Challenge: Long-Running Agent Framework Flows Agent Framework enables AI agents to perform complex tasks that can take significant time to complete: Multi-turn reasoning: Iterative calls to large language models (LLMs) where each response informs the next prompt Tool integration: Function calling and external API interactions for real-time data Complex processing: Budget calculations, content optimization, multi-phase generation Persistent context: Maintaining conversation state across multiple interactions These workflows often take 30 seconds to several minutes to complete—far too long for synchronous HTTP request handling. Traditional web applications run into several constraints: ⏱️ Timeout Limitations: HTTP requests have timeout constraints (typically 30-230 seconds) ⚠️ Connection Issues: Clients may disconnect due to network interruptions or browser navigation 📈 Scalability Concerns: Long-running requests block worker threads and don't survive app restarts 🎯 Poor User Experience: Users see endless loading spinners with no progress feedback The Solution: Async Pattern with App Service Azure App Service provides a robust solution through the asynchronous request-reply pattern combined with background processing: API immediately returns (202 Accepted) with a task ID Background worker processes the Agent Framework workflow Client polls for status with real-time progress updates Durable state storage (Cosmos DB) maintains task status and results This pattern ensures: ✅ No HTTP timeouts—API responds in milliseconds ✅ Resilient to restarts—state survives deployments and scale events ✅ Progress tracking—users see real-time updates (10%, 45%, 100%) ✅ Better scalability—background workers process independently NOTE! This pattern can be implemented with either an in-process BackgroundService or as a separate WebJob process. Deployment Patterns: BackgroundService vs WebJob The following compares the two deployment options you have for this implementation. BackgroundService Pattern: ✅ Simpler deployment (single project) ✅ Shared process and memory ✅ Good for moderate workloads ⚠️ API and worker restart together WebJob Pattern (alternative): ✅ Separate processes (API + WebJob) ✅ Independent restart without API downtime ✅ Dedicated WebJob monitoring in portal ✅ Better for production operations ⚠️ Slightly more complex deployment (manual WebJob upload) Either of these options are a great way to help you get started with implementing long-running processes on App Service. To learn more about WebJobs on App Service, see the Azure App Service WebJobs documentation. Rapid Innovation Support The AI landscape is changing at an unprecedented pace. New models, frameworks, and capabilities are released constantly. Azure App Service's managed platform ensures your applications can adapt quickly without infrastructure rewrites: Framework Updates: Deploy new Agent Framework SDK versions like any application update Model Upgrades: Switch between GPT-4, GPT-4o, or future models with configuration changes Scaling Patterns: Start with combined API+worker, split into separate apps as needs grow New Capabilities: Integrate emerging AI services without changing hosting infrastructure App Service handles the platform complexity so you can focus on building great AI experiences. Sample Application: AI Travel Planner To demonstrate this pattern, we've built a Travel Planner application that uses Agent Framework to generate detailed, multi-day travel itineraries. The agent performs complex reasoning including: Researching destination attractions and activities Optimizing daily schedules based on location proximity Calculating detailed budget breakdowns Generating personalized travel tips and recommendations The entire application runs on a single P0v4 App Service with both the API and background worker combined—showcasing App Service's flexibility for hosting diverse workload patterns in one deployment. Key Architecture Components Azure App Service (P0v4 Premium) Hosts both REST API and background worker in a single app "Always On" feature keeps background worker running continuously Managed identity for secure, credential-less authentication Azure Service Bus Decouples API from long-running Agent Framework processing Reliable message delivery with automatic retries Dead letter queue for error handling Azure Cosmos DB Stores task status with real-time progress updates Automatic 24-hour TTL for cleanup Rich query capabilities for complex itinerary data Azure AI Foundry Hosts persistent agents with conversation threads Structured execution with Agent Framework runtime GPT-4o model for intelligent travel planning One of the powerful features of using Azure AI Foundry with Agent Framework is the ability to inspect agents and conversation threads directly in the Azure portal. This provides valuable visibility into what's happening during execution. Viewing Agents and Threads in Azure AI Foundry When you submit a travel plan request, the application creates an agent in Azure AI Foundry. You can navigate to your AI Foundry project in the Azure portal to see: Agents The application creates an agent for each request Important: Agents are **automatically deleted** after the itinerary is generated to keep your project clean Tip: You'll need to be quick! Navigate to Azure AI Foundry right after submitting a request to see the agent in action Once processing completes, the agent is removed as part of the cleanup process Conversation Threads Unlike agents, threads persist even after the agent completes You can view the complete conversation history at any time See the exact prompts sent to the model and the responses generated Useful for debugging, understanding agent behavior, and improving prompts The ephemeral nature of agents (created per request, deleted after completion) keeps your Azure AI Foundry project clean while the persistent threads give you full traceability of every interaction. Alternative Architecture: WebJob Pattern The alternate version of this app uses a WebJob for background processing instead of an in-process hosted service. However, just a single App Service is still required. WebJobs are a great alternative for background processing in App Service, providing better separation of concerns, independent restarts, and dedicated logging. To learn more about WebJobs on App Service, see the Azure App Service WebJobs documentation. Get Started Today The complete Travel Planner application is available as a reference implementation so you can quickly get started building your own apps with Agent Framework on App Service. Try one or both of these today! 🔗 GitHub Repository for background process version: https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet 🔗 GitHub Repository for WebJob version: https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet-webjob The repo includes: Complete .NET 9 source code with Agent Framework integration Infrastructure as Code (Bicep) for automated deployment Web UI with real-time progress tracking Comprehensive README with deployment instructions Deploy in minutes: git clone https://github.com/Azure-Samples/app-service-agent-framework-travel-agent-dotnet.git cd app-service-agent-framework-travel-agent-dotnet azd auth login azd up IMPORTANT! For the WebJob version, you will also need to manually deploy the WebJob. See the instructions in the README to learn how to do this. Key Takeaways ✅ Agent Framework enables sophisticated AI agents beyond simple chat completions ✅ Long-running workflows (30s-minutes) require async patterns to avoid timeouts ✅ App Service provides a simple, cost-effective platform for these workloads ✅ Async request-reply pattern with Service Bus + Cosmos DB ensures reliability ✅ Rapid innovation in AI is supported by App Service's adaptable platform Whether you're building travel planners, document processors, research assistants, or other AI-powered applications, Azure App Service gives you the flexibility and reliability you need—without the complexity of container orchestration or function programming models. What's Next? Build on This Foundation This Travel Planner is just the starting point—a foundation to help you understand the patterns and architecture. Agent Framework is designed to grow with your needs, making it easy to add sophisticated capabilities with minimal effort: 🛠️ Add Tool Calling Connect your agent to real-time APIs for weather, flight prices, hotel availability, and actual booking systems. Agent Framework's built-in tool calling makes this straightforward. 🤝 Implement Multi-Agent Systems Create specialized agents (flight expert, hotel specialist, activity planner) that collaborate to build comprehensive travel plans. Agent Framework handles the orchestration. 🧠 Enhance with RAG Add retrieval-augmented generation to give your agent deep knowledge of destinations, local customs, and insider tips from your own content library. 📊 Expand Functionality Real-time pricing and availability Interactive refinement based on user feedback Personalized recommendations from past trips Multi-language support for global users The beauty of Agent Framework is that these advanced features integrate seamlessly into the pattern we've built. Start with this sample, explore the Agent Framework documentation, and unlock powerful AI capabilities for your applications! Learn More Microsoft Agent Framework Documentation Azure App Service Documentation Async Request-Reply Pattern Azure App Service WebJobs documentation Have you built AI agents on App Service? We'd love to hear about your experience! Share your thoughts in the comments below. Questions about Agent Framework on App Service? Drop a comment and our team will help you get started.2.1KViews1like3Comments