Agents League: The Esports-Inspired Hackathon Where AI Agents Battle for Glory
Ready to put your AI skills to the ultimate test? Agents League is here, a dynamic, esports-inspired developer challenge that brings the thrill of live competition to the world of agentic AI. Whether you're a seasoned AI developer or just getting started, this is your chance to build, compete, and win.

What is Agents League?
Agents League is a week-long hackathon running as part of AI Skills Fest (June 4–14, 2026). Unlike traditional hackathons, Agents League combines live AI coding battles, asynchronous project submissions, and a thriving Discord community, with participants competing for a total prize pool of $55,000 USD. This isn't just about building; it's about showcasing what's possible with agentic AI in a format that's fast, competitive, and globally accessible.

Three Challenge Tracks: Pick One or Compete in All

1. Creative Apps
Build innovative applications using GitHub Copilot for AI-assisted development. Show off your creativity and demonstrate how AI can accelerate app creation from concept to code.

2. Reasoning Agents
Create intelligent agents using Microsoft Foundry that solve complex problems through multi-step reasoning. This track is all about building agents that can think, plan, and execute.

3. Enterprise Agents
Build business-ready knowledge agents integrated with Microsoft 365 Copilot, authored in Copilot Studio. Perfect for developers focused on real-world enterprise solutions.

Live Microsoft Reactor Events: Don't Miss the Battles!
The heart of Agents League beats through live Microsoft Reactor events.
Watch experts go head-to-head in live coding battles, learn cutting-edge techniques, and get inspired for your own submissions:

| Event | What You'll Learn |
| --- | --- |
| Creative Apps Battle | See GitHub Copilot in action as experts build innovative apps live |
| Reasoning Agents Battle | Watch multi-step reasoning agents come to life with Microsoft Foundry |
| Enterprise Agents Battle | Learn to build M365-integrated agents with Copilot Studio |

👉 View the full event series

Key Dates
Registration Deadline: June 12, 2026, 12:00 PM PT
Hacking Period: June 4–14, 2026
Submission Deadline: June 14, 2026, 11:59 PM PT

What You Get
Live coding battles with expert demonstrations
Curated technical experiences and on-demand content
Learning resources on Microsoft Learn and AI Skills Navigator
Community support through Discord
GitHub-based submissions for transparent, collaborative judging

Why Participate?
Agents League isn't just another hackathon. It's designed as a streamlined, competitive format that:
✅ Fits into your schedule with focused, time-boxed challenges
✅ Provides real-world product innovation experience
✅ Offers global accessibility: participate from anywhere
✅ Demonstrates the latest capabilities of agentic AI, including new IQ tools
✅ Connects you with a passionate developer community

Ready to Enter the Arena?
Register Now for Agents League

Before you register:
Review the Hackathon Rules and Regulations for prize categories and judging criteria
Join the Microsoft Reactor event series for live battles and learning
Check out the Microsoft Event Code of Conduct

Join the Conversation
Have questions? Want to connect with fellow competitors? Join the Agents League community on Discord and start strategizing with developers from around the world. Whether you're building creative apps, reasoning agents, or enterprise solutions, the arena awaits. May the best agent win! 🏆

Agents League hackathon is open to the public and offered at no cost.
Government employees should check with their employers to ensure participation is permitted in accordance with applicable policies.

Related Links:
Agents League Hackathon Registration
Microsoft Reactor Series
AI Skills Fest

Performance Tuning Cold Starts, Scaling Delays, and Startup Latency in Azure Container Apps
Introduction There is a particular kind of frustration that comes not when your application fails to start, but when it starts too slowly. The container is running, the health probes pass, your monitoring shows green — and yet every few minutes a user somewhere in the world hits a request that takes 15 seconds to respond. Your support team starts getting tickets. Your SLA dashboard turns amber. This is the cold start problem, and it is one of the most widely discussed pain points with any serverless container platform. Azure Container Apps is no different. But what most engineers do not realize is that the cold start is only one part of the story. Scaling delays, inefficient image layers, wrong resource allocations, and misconfigured KEDA rules all compound to create latency spikes that feel indistinguishable from cold starts but have completely different root causes and fixes. In this part of the series, we break down each cause systematically and show you exactly how to address it. Understanding What "Cold Start" Actually Means in Container Apps Before we fix anything, it helps to understand what is actually happening during a cold start. When a new replica is created, Azure Container Apps needs to do several things in sequence before your application can serve a single request: The platform schedules the new replica on available infrastructure. The container runtime pulls the image layers that are not already cached on that node. The container starts and the process inside it begins executing. Your application framework initializes (the .NET DI container, Django's ORM layer, loaded ML models, etc.). The readiness probe passes, signaling that the replica can accept traffic. Every one of these steps takes time. The total duration is your cold start latency. When you have `minReplicas: 0`, this full cycle happens for every "first request after idle" scenario. 
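As a rough illustration of how the five phases add up, the sketch below totals per-phase durations for a start on a fresh node and for a start on a node that already has the image layers cached. The phase names follow the sequence described above; the class name and every number are made-up placeholders, not measurements from Azure Container Apps.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ColdStartPhases:
    """Seconds spent in each phase of a replica start (illustrative values only)."""
    schedule: float
    image_pull: float
    process_start: float
    framework_init: float
    readiness: float

    def total(self) -> float:
        # Cold start latency is the sum of all sequential phases.
        return sum(getattr(self, f.name) for f in fields(self))

    def slowest(self) -> str:
        # Name of the phase that contributes the most time.
        return max(fields(self), key=lambda f: getattr(self, f.name)).name


# Hypothetical numbers: first pull on a fresh node vs. layers already cached.
fresh_node = ColdStartPhases(1.0, 9.0, 0.5, 4.0, 1.5)
cached_node = ColdStartPhases(1.0, 0.5, 0.5, 4.0, 1.5)

print(f"fresh node:  {fresh_node.total():.1f}s, slowest phase: {fresh_node.slowest()}")
print(f"cached node: {cached_node.total():.1f}s, slowest phase: {cached_node.slowest()}")
```

Timing each phase separately in your own environment shows which fix is worth reaching for: slimming the image only shortens the pull, while a warm replica removes the whole sequence from the idle path.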
With `minReplicas: 1`, one replica has already completed all five steps and stays warm, so a request after an idle period does not trigger a cold start. Every additional replica created during scale-out still goes through the full sequence, although step 2 is shorter when the image layers are already cached on the node.

Scenario 1: Requests Spike to 10+ Seconds After a Period of Inactivity

What You See
Everything looks fine during load testing, but the next morning after a quiet night, the first user to hit the app gets a timeout or a very slow response. You check Application Insights or Log Analytics and see exactly one request with a dramatically higher duration than all the others.

Why This Happens
You have `minReplicas` set to `0` (or it defaults to 0). When there are no replicas running and a new request arrives, the entire cold start sequence kicks off, and the request waits in the ingress queue the entire time. Depending on your image size and application initialization time, this can easily reach 15–30 seconds for a .NET application with a large DI graph, or even longer for a Python application that imports heavy libraries.

The Fix

Option A (Recommended for most workloads): Set `minReplicas` to 1. This ensures at least one replica is always warm and ready to handle requests. You will pay for that one replica's compute even during idle periods, but you eliminate the cold start for your users:

    az containerapp update \
      --name my-dotnet-api \
      --resource-group my-rg \
      --min-replicas 1 \
      --max-replicas 10

Or in your Container App YAML:

    scale:
      minReplicas: 1
      maxReplicas: 10
      rules:
        - name: http-scaling-rule
          http:
            metadata:
              concurrentRequests: "10"

Option B: Reduce image size to speed up the pull. Every megabyte in your container image adds time to cold starts. A production .NET API should not be a 2 GB image. Use multi-stage builds to strip away the SDK, test tools, and development dependencies:

    # Stage 1: Build
    FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
    WORKDIR /src
    COPY ["MyApp.csproj", "."]
    RUN dotnet restore
    COPY . .
    RUN dotnet publish -c Release -o /app/publish --no-restore

    # Stage 2: Runtime only - much smaller
    FROM mcr.microsoft.com/dotnet/aspnet:8.0 AS final
    WORKDIR /app
    COPY --from=build /app/publish .
    # Run as non-root for security
    USER app
    EXPOSE 8080
    ENTRYPOINT ["dotnet", "MyApp.dll"]

For Django, a slim base image with production-only dependencies serves the same goal:

    FROM python:3.11-slim AS base
    # Install only production dependencies
    WORKDIR /app
    COPY requirements.txt .
    # Only the cache cleanup is allowed to fail; a failed pip install still fails the build
    RUN pip install --no-cache-dir -r requirements.txt \
        && find /usr/local -name "*.pyc" -delete \
        && (find /usr/local -name "__pycache__" -type d -exec rm -rf {} + 2>/dev/null || true)
    COPY . .
    RUN SECRET_KEY=placeholder python manage.py collectstatic --noinput
    USER nobody
    EXPOSE 8000
    CMD ["gunicorn", "myproject.wsgi:application", "--bind", "0.0.0.0:8000", "--workers", "2"]

Note that stripping `.pyc` files trades image size for import time, because Python has to recompile the modules when the container starts. Measure both before keeping that step.

Option C: Use a startup probe to manage the readiness window. If your app genuinely needs 20–30 seconds to initialize (loading configuration, warming caches, establishing connection pools), configure a startup probe separately from your liveness probe. This gives the container time to start without the liveness probe killing it prematurely:

    probes:
      - type: Startup
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 12 # 12 * 5s = 60 seconds of probing after the initial delay
      - type: Liveness
        httpGet:
          path: /health
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      - type: Readiness
        httpGet:
          path: /health/ready
          port: 8080
        periodSeconds: 5
        failureThreshold: 3

Scenario 2: New Replicas Lag Behind Traffic Spikes

What You See
Your application handles steady traffic just fine. Then a sudden burst arrives (a marketing email goes out, a scheduled batch job triggers API calls, or a downstream system fires webhooks) and for 30–60 seconds your error rate jumps and your latency spikes. After that window, everything recovers. The scaling logs show new replicas were created, but they came online too late.
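To put rough numbers on the burst described above, the following sketch estimates how many replicas a given level of concurrency calls for and how many requests pile up while new replicas are still warming. It is a deliberately simplified model with hypothetical function names and inputs, not the scaler's actual algorithm.

```python
import math


def replicas_needed(concurrent_requests: int, target_per_replica: int) -> int:
    """Replicas required to keep each one at or below the target concurrency."""
    return max(1, math.ceil(concurrent_requests / target_per_replica))


def backlog_during_warmup(arrival_rps: float, per_replica_rps: float,
                          warm_replicas: int, warmup_seconds: float) -> float:
    """Requests that exceed warm capacity while new replicas are still starting."""
    excess_rps = max(0.0, arrival_rps - per_replica_rps * warm_replicas)
    return excess_rps * warmup_seconds


# Hypothetical burst: 45 concurrent requests against a target of 10 per replica.
print(replicas_needed(45, 10))
# Hypothetical burst: 100 req/s arriving, 2 warm replicas at 20 req/s each, 30 s warm-up.
print(backlog_during_warmup(100, 20, 2, 30))
```

Under these assumptions the backlog grows linearly with warm-up time, which is why shortening replica startup and keeping more replicas warm both shrink the painful window.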
Why This Happens KEDA (the scaling engine behind Container Apps) works reactively. By default, HTTP-based scaling triggers new replicas when concurrent requests exceed the configured threshold. But there is an inherent delay between the moment traffic spikes, the moment KEDA detects it, and the moment a new replica is warm and serving traffic. This window is where your users experience the pain. Additionally, if your image pull takes a long time (large image, first pull on a new node), the new replica arrives even later. KEDA cannot compensate for slow image pulls. The Fix Step 1 — Tune your KEDA scaling rules to trigger earlier. Rather than waiting until you are already at capacity, configure scaling to trigger with a lower concurrency threshold. If your app can handle 20 concurrent requests comfortably, set the threshold to 10 so new replicas spin up before you are overwhelmed: scale: minReplicas: 1 maxReplicas: 20 rules: - name: http-rule http: metadata: concurrentRequests: "10" # Scale earlier, not at capacity For Azure Service Bus or Event Hubs-triggered scaling (common in job-style workloads), use a queue length threshold that gives you a buffer: scale: minReplicas: 0 maxReplicas: 30 rules: - name: servicebus-rule custom: type: azure-servicebus metadata: queueName: my-processing-queue namespace: my-servicebus-namespace messageCount: "5" # Scale when queue depth reaches 5, not 100 auth: - secretRef: servicebus-connection triggerParameter: connection Step 2 — Pre-warm your connection pools in .NET. One of the biggest contributors to new replica slowness is the time spent establishing database connections and other external connections. The first request that hits a new replica bears the cost of opening the connection pool. 
Configure your connection pool to warm up eagerly at startup: // In Program.cs, after building the app if (app.Environment.IsProduction()) { // Warm up the database connection pool before accepting traffic using var scope = app.Services.CreateScope(); var dbContext = scope.ServiceProvider.GetRequiredService<AppDbContext>(); await dbContext.Database.ExecuteSqlRawAsync("SELECT 1"); } await app.RunAsync(); Step 3 — Enable HTTP/2 keep-alive and connection reuse. In .NET applications running behind the Container Apps ingress, configure your HTTP client to use connection pooling properly: builder.Services.AddHttpClient("downstream-api", client => { client.BaseAddress = new Uri("https://my-downstream-service"); client.DefaultRequestVersion = new Version(2, 0); }) .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler { PooledConnectionLifetime = TimeSpan.FromMinutes(5), PooledConnectionIdleTimeout = TimeSpan.FromMinutes(2), MaxConnectionsPerServer = 20 }); Scenario 3: Django Startup Is Slow Due to Import Time What You See Your Django application takes 8–12 seconds to start even on a warm node. You check the Gunicorn startup logs and see it spending most of that time in Python module imports before it ever processes a request. Why This Happens Python's import system is synchronous and single-threaded. When you import `django`, `rest_framework`, `pandas`, `numpy`, or any large library, Python reads and executes every module file in the dependency chain. A Django project with Django REST Framework, Celery, and a few third-party packages can easily spend 5–8 seconds just on imports. Multiply that by the number of Gunicorn workers (each is a separate process that imports everything independently) and startup time balloons. The Fix Step 1 — Profile import time to find the worst offenders. 
Add this to your Dockerfile's entrypoint or run it manually:

    # Run this in a container shell to see which imports take the longest.
    # -X importtime prints "self | cumulative | module" columns, so sort on the
    # cumulative column. DJANGO_SETTINGS_MODULE must be set for django.setup() to work.
    python -X importtime -c "import django; django.setup()" 2>&1 | sort -t '|' -k2,2 -rn | head -20

Step 2 — Use lazy imports for heavy dependencies that are not needed at startup. Instead of importing everything at the module level, defer imports to the functions that actually need them:

    # Instead of this at the top of your file:
    import pandas as pd
    import numpy as np

    # Do this: import only when the function is actually called
    def process_data(data):
        import pandas as pd
        import numpy as np
        df = pd.DataFrame(data)
        return df.describe().to_dict()

Step 3 — Reduce Gunicorn worker count for memory-constrained environments. Having too many workers means too many independent Python processes all importing everything at the same time. For Container Apps with 0.5–1 vCPU, 2 workers is usually the right starting point:

    CMD ["gunicorn", "myproject.wsgi:application", "--bind", "0.0.0.0:8000", "--workers", "2", "--worker-class", "gthread", "--threads", "4", "--timeout", "120", "--keep-alive", "5", "--log-level", "info"]

Step 4 — Consider switching from Gunicorn to Uvicorn for async Django. If you are on Django 4.x with ASGI support and your views are async, Uvicorn can handle significantly more concurrent I/O-bound requests per worker than synchronous Gunicorn workers:

    CMD ["uvicorn", "myproject.asgi:application", "--host", "0.0.0.0", "--port", "8000", "--workers", "2", "--log-level", "info"]

Scenario 4: Resource Limits Are Causing Throttling and Slow Responses

What You See
Your application starts fine and handles light traffic well, but under moderate to heavy load (even well below your max replicas) individual requests become slow and CPU metrics show your replicas running near 100% utilization.
You may also see the .NET GC (garbage collector) running very frequently, or Django showing slow database queries that are actually fast queries being delayed because the process has no CPU to run. Why This Happens Container Apps defaults to 0.25 vCPU and 0.5 Gi memory if you do not specify resource limits. For a production .NET API or a Django application serving real traffic, this is almost always too little. When a container hits its CPU limit, the container runtime throttles it — the process continues to run but gets less CPU time, making everything slower without any obvious error signal. The Fix Step 1 — Measure actual resource usage before guessing. Query Log Analytics for actual CPU and memory usage to establish a baseline: ContainerAppSystemLogs_CL | where ContainerAppName_s == "my-dotnet-api" | where TimeGenerated > ago(7d) | summarize AvgCpuUsage = avg(todouble(CpuUsageNanoCores_d)) / 1000000, MaxCpuUsage = max(todouble(CpuUsageNanoCores_d)) / 1000000, AvgMemoryMB = avg(todouble(MemoryWorkingSetBytes_d)) / 1048576, MaxMemoryMB = max(todouble(MemoryWorkingSetBytes_d)) / 1048576 by bin(TimeGenerated, 1h) | order by TimeGenerated desc Step 2 — Update resource allocations based on what you observed. az containerapp update --name my-dotnet-api --resource-group my-rg --cpu 0.5 --memory 1.0Gi Container Apps has specific valid CPU/memory combinations. The valid pairs are: `0.25 vCPU / 0.5 Gi`, `0.5 vCPU / 1.0 Gi`, `0.75 vCPU / 1.5 Gi`, `1.0 vCPU / 2.0 Gi`, and up to `4.0 vCPU / 8.0 Gi`. You cannot mix arbitrary values. Step 3 — Configure .NET GC for server workloads. By default, .NET uses the workstation GC mode which is tuned for interactive applications. 
For server containers, use server GC mode and configure the heap size appropriately. Two caveats apply: ASP.NET Core web projects already opt in to server GC, and the runtime falls back to workstation GC whenever the container sees only one logical CPU, so this setting typically takes effect only on allocations larger than 1 vCPU.

In runtimeconfig.template.json (805306368 bytes = 768 MiB):

    {
      "configProperties": {
        "System.GC.Server": true,
        "System.GC.HeapHardLimit": 805306368,
        "System.GC.HighMemoryPercent": 75
      }
    }

Or set GC options as environment variables in your Container App. The runtime reads these values as hexadecimal, so the same 768 MiB limit is written as `0x30000000`:

    az containerapp update \
      --name my-dotnet-api \
      --resource-group my-rg \
      --set-env-vars "DOTNET_gcServer=1" "DOTNET_GCConserveMemory=5" "DOTNET_GCHeapHardLimit=0x30000000"

Measuring the Impact of Your Changes

After making changes, use this Log Analytics query to track your startup times over the past 24 hours and confirm the improvements:

    ContainerAppConsoleLogs_CL
    | where ContainerAppName_s == "my-dotnet-api"
    | where TimeGenerated > ago(24h)
    | where Log_s contains "Application started" or Log_s contains "Now listening on"
    | project TimeGenerated, Log_s, ContainerName_s
    | order by TimeGenerated desc

And check request duration percentiles in Application Insights:

    requests
    | where cloud_RoleName == "my-dotnet-api"
    | where timestamp > ago(24h)
    | summarize
        p50 = percentile(duration, 50),
        p90 = percentile(duration, 90),
        p99 = percentile(duration, 99),
        requestCount = count()
        by bin(timestamp, 1h)
    | order by timestamp desc

Summary: Your Performance Tuning Quick Reference

Here is a quick decision guide based on what you are seeing:

| Symptom | Most Likely Cause | First Fix to Try |
| --- | --- | --- |
| First request after idle is very slow | `minReplicas: 0` | Set `minReplicas: 1` |
| Spike period has errors, then recovers | KEDA scaling too slow | Lower concurrentRequests threshold |
| New replicas start slowly | Large image size | Multi-stage Docker build |
| High CPU at moderate traffic | Under-allocated resources | Increase CPU/memory allocation |
| Django startup is slow | Heavy Python imports | Profile and defer imports |
| .NET app slow under load | Workstation GC mode | Enable server GC |

References and Sample Resources

Use these links to tune startup performance, scaling behavior, and runtime efficiency.
Azure Container Apps docs (core)
Scale applications in Azure Container Apps
Workload profiles overview
Health probes in Azure Container Apps
Revisions in Azure Container Apps
Monitoring and logging in Azure Container Apps

Runtime and framework performance references
Docker multi-stage builds
.NET performance best practices for ASP.NET Core
.NET runtime GC configuration
Django performance optimization
Uvicorn deployment guide

Scaling engine references and samples
KEDA concepts and documentation
KEDA scaler samples
Azure Samples: .NET on Azure Container Apps
Azure Samples: Python on Azure Container Apps

What's Next
In Part 3, we go deeper into the most specialized and complex scenario in this series: troubleshooting AI workloads in Azure Container Apps. Loading large ML models, managing GPU and CPU resource constraints, and dealing with memory pressure from inference workloads all require techniques that go beyond standard web application troubleshooting.

Part of the series: Troubleshooting Azure Container Apps in Production
Next: Part 3 — Troubleshooting ML Model Loading, GPU Issues, and Memory Pressure in Azure Container Apps

Troubleshooting Azure Container Apps and Jobs for .NET and Django Workloads
Introduction Deploying to Azure Container Apps feels like a huge step forward — you get serverless containers, automatic scaling, built-in ingress, and managed environments without managing Kubernetes directly. But when something goes wrong and your container refuses to start, or your Container App Job silently fails, it can feel like debugging inside a black box. This first part of our four-part series walks through the most common deployment and startup failures you will hit when running .NET and Django applications on Azure Container Apps and Container App Jobs. We cover what the real error looks like, why it is happening under the hood, and what you need to do to fix it — step by step. The Real-World Problem: "My Container App is stuck in a restart loop and I have no idea why" This is probably the most common thing engineers report when they first move workloads to Azure Container Apps. The deployment finishes successfully, the revision shows as active, but the app never becomes healthy. In the Azure portal it cycles between `Running` and `Degraded`, and in the logs you see cryptic exit codes or — even worse — nothing at all. The root causes almost always fall into one of these buckets: The container exits immediately because the process crashes on startup (misconfiguration, missing secrets, unhandled exceptions). The health probe fails because the app takes too long to start or is listening on the wrong port. A Container App Job never completes because it times out or the job process exits with a non-zero code. Let us walk through each of these in detail. Scenario 1: Your .NET Application Crashes at Startup What You See Your Container App revision goes into a restart loop. You check the Log Analytics workspace and see something like this: ContainerAppConsoleLogs_CL | where ContainerAppName_s == "my-dotnet-api" | where TimeGenerated > ago(30m) | project TimeGenerated, Log_s | order by TimeGenerated desc The output shows: Unhandled exception. 
System.InvalidOperationException: Unable to resolve service for type 'MyApp.Data.AppDbContext' at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService ... Application is shutting down. Or even more commonly with Entity Framework Core: fail: Microsoft.EntityFrameworkCore.Database.Connection An error occurred using the connection to database 'mydb' on server 'myserver.database.windows.net'. System.Net.Sockets.SocketException: Connection refused Why This Happens When .NET 6+ applications start up, they run the entire `WebApplication.Build()` pipeline before accepting traffic. If any registered service — like a database context — cannot be constructed or if the connection string is missing or wrong, the application throws an unhandled exception and the process exits with a non-zero code. Container Apps detects this exit and restarts the container. This cycle repeats indefinitely. The most frequent trigger is missing or incorrectly named environment variables and secrets. In local development you rely on `appsettings.Development.json` or `user secrets`, but in Container Apps those files are not present unless you explicitly copy them into the image (which you should never do for secrets). Step-by-Step Fix Step 1 — Verify your secrets and environment variables are configured correctly. In the Azure portal, navigate to your Container App → Configuration → Secrets and Environment variables. Make sure every value your app reads from IConfiguration is defined here. From the CLI you can inspect and update them like this: # Add or update a secret reference az containerapp secret set --name my-dotnet-api --resource-group my-rg --secrets "connectionstring=Server=myserver.database.windows.net;..." 
    # Reference that secret as an environment variable
    az containerapp update \
      --name my-dotnet-api \
      --resource-group my-rg \
      --set-env-vars "ConnectionStrings__DefaultConnection=secretref:connectionstring"

Step 2 — Make sure your .NET app reads configuration correctly. The naming convention that trips up almost everyone: the .NET environment variable configuration provider uses double underscores (`__`) to represent the colon (`:`) separator in configuration keys. So `ConnectionStrings:DefaultConnection` becomes `ConnectionStrings__DefaultConnection` as the environment variable name.

    // This reads from the "ConnectionStrings__DefaultConnection" env var automatically
    builder.Services.AddDbContext<AppDbContext>(options =>
        options.UseSqlServer(builder.Configuration.GetConnectionString("DefaultConnection")));

Step 3 — Add a startup health check that gives meaningful feedback. Configure a liveness probe with a generous initial delay to avoid a container being killed before it has had time to start:

    # In your Container App YAML configuration
    probes:
      - type: Liveness
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 30
        periodSeconds: 10
        failureThreshold: 5
      - type: Readiness
        httpGet:
          path: /health/ready
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 5
        failureThreshold: 3

Add the corresponding health endpoints in your .NET app:

    // Program.cs
    // AddSqlServer comes from the AspNetCore.HealthChecks.SqlServer NuGet package
    using Microsoft.AspNetCore.Diagnostics.HealthChecks;

    builder.Services.AddHealthChecks()
        .AddSqlServer(
            builder.Configuration.GetConnectionString("DefaultConnection")!,
            name: "database",
            tags: new[] { "ready" });

    // Liveness: run no dependency checks, so a database outage does not restart the container
    app.MapHealthChecks("/health", new HealthCheckOptions { Predicate = _ => false });

    // Readiness: run only the checks tagged "ready" (the database check above)
    app.MapHealthChecks("/health/ready", new HealthCheckOptions
    {
        Predicate = check => check.Tags.Contains("ready")
    });

Step 4 — Pull the raw container logs using the CLI to see exactly what happened before the container exited:

    az containerapp logs show \
      --name my-dotnet-api \
      --resource-group my-rg \
      --type console \
      --tail 50

Scenario 2: Your Django Application Fails to Start

What You See
Your Django app deploys, the container starts, but within seconds it exits.
In the logs you see one of these common errors:

    django.core.exceptions.ImproperlyConfigured: Set the SECRET_KEY environment variable

Or:

    django.db.utils.OperationalError: could not connect to server: Connection refused
        Is the server running on host "localhost" (127.0.0.1) and accepting
        TCP/IP connections on port 5432?

Or the static files problem that catches almost everyone:

    [Errno 2] No such file or directory: '/app/staticfiles'

Why This Happens
Django loads its settings eagerly when the WSGI/ASGI server starts. If `SECRET_KEY` is not set, or if a settings branch keyed on `DEBUG` expects a database that is not reachable from the container, startup fails. If `ALLOWED_HOSTS` does not include the ingress FQDN, Django starts but rejects every request with a 400 response. The static files error comes up because many teams forget to run `python manage.py collectstatic` as part of the container image build process. The `STATIC_ROOT` directory simply does not exist at runtime.

Step-by-Step Fix

Step 1 — Set required Django environment variables in your Container App. The variable names must match what your `settings.py` reads; the examples here assume `DJANGO_SECRET_KEY` and `DATABASE_URL`.

    az containerapp secret set \
      --name my-django-app \
      --resource-group my-rg \
      --secrets "django-secret-key=your-very-secret-key-here" "database-url=your-database-connection-url"

    az containerapp update \
      --name my-django-app \
      --resource-group my-rg \
      --set-env-vars \
        "DJANGO_SECRET_KEY=secretref:django-secret-key" \
        "DEBUG=False" \
        "ALLOWED_HOSTS=my-django-app.happyfield-abc123.eastus.azurecontainerapps.io" \
        "DATABASE_URL=secretref:database-url"

Step 2 — Run `collectstatic` during Docker image build, not at runtime. This is a very common mistake. Static files should be baked into the image, not generated when the container starts. Update your `Dockerfile`:

    FROM python:3.11-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .
    # Collect static files at build time with a dummy secret key
    RUN DJANGO_SECRET_KEY=build-time-placeholder python manage.py collectstatic --noinput
    EXPOSE 8000
    CMD ["gunicorn", "myproject.wsgi:application", "--bind", "0.0.0.0:8000", "--workers", "2", "--timeout", "120"]

Step 3 — Make sure Gunicorn is configured correctly for Container Apps. The most important thing to verify is that Gunicorn is binding to `0.0.0.0` and not `127.0.0.1`. Container Apps expects the application to listen on all interfaces so that the ingress layer can reach it. Also make sure the port matches what you defined in your Container App's ingress target port:

    # Set ingress to match Gunicorn's bind port
    az containerapp ingress update \
      --name my-django-app \
      --resource-group my-rg \
      --target-port 8000 \
      --type external

Step 4 — Handle database migrations safely. Never run `python manage.py migrate` as part of your container startup command. If you have multiple replicas, all of them will try to run migrations simultaneously, which can corrupt your schema. Instead, use a Container App Job to run migrations as a pre-deployment step:

    # Create a one-time Container App Job to run migrations.
    # A job holds its own secrets, so define them here before referencing them.
    az containerapp job create \
      --name django-migrate-job \
      --resource-group my-rg \
      --environment my-aca-env \
      --trigger-type Manual \
      --replica-timeout 300 \
      --image myregistry.azurecr.io/my-django-app:latest \
      --command "python" \
      --args "manage.py" "migrate" \
      --secrets "django-secret-key=your-very-secret-key-here" "database-url=your-database-connection-url" \
      --env-vars "DJANGO_SECRET_KEY=secretref:django-secret-key" "DATABASE_URL=secretref:database-url"

    # Execute the migration job before deploying the new revision
    az containerapp job start --name django-migrate-job --resource-group my-rg

Scenario 3: Your Container App Job Fails Silently or Times Out

What You See
You trigger a Container App Job (maybe it is a nightly data processing job, a scheduled report generator, or a cleanup task) and in the Azure portal the execution shows as Failed with no helpful error message.
Or it shows as Running for an unusually long time and then transitions to Failed with a timeout error.

Why This Happens
Container App Jobs have a `replicaTimeout` property. If your job process does not complete within that window, Azure Container Apps kills it and marks the execution as failed. This is different from Container Apps (services) where the container keeps running. Jobs are expected to run to completion and exit with code `0`. The silent failure happens when your job process exits with a non-zero exit code but does not write anything to `stdout` or `stderr`. Container Apps records the exit code but has no log content to show you.

Step-by-Step Fix

Step 1 — Make your job emit logs to stdout explicitly. Every print statement, every log line should go to `stdout` or `stderr`. In Python:

    import sys
    import logging

    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        handlers=[logging.StreamHandler(sys.stdout)]
    )
    logger = logging.getLogger(__name__)

    def process_data():
        # Placeholder: your job logic goes here
        pass

    def main():
        logger.info("Job starting")
        try:
            process_data()
            logger.info("Job completed successfully")
            sys.exit(0)
        except Exception as e:
            logger.error(f"Job failed with error: {e}", exc_info=True)
            sys.exit(1)

    if __name__ == "__main__":
        main()

In .NET:

    // Use ILogger, which writes to stdout by default in containers
    public class MyJob
    {
        private readonly ILogger<MyJob> _logger;

        public MyJob(ILogger<MyJob> logger)
        {
            _logger = logger;
        }

        public async Task RunAsync(CancellationToken cancellationToken)
        {
            _logger.LogInformation("Job starting at {Time}", DateTimeOffset.UtcNow);
            try
            {
                await ProcessDataAsync(cancellationToken);
                _logger.LogInformation("Job completed successfully");
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Job failed");
                throw; // Let the process exit with non-zero code
            }
        }
    }

Step 2 — Set an appropriate replica timeout and retry count.
Be realistic about how long your job takes in production, then add a buffer:

    # 1800 seconds = 30 minutes; retry twice before marking the execution as failed
    az containerapp job update \
      --name my-processing-job \
      --resource-group my-rg \
      --replica-timeout 1800 \
      --replica-retry-limit 2

Step 3 — Check job execution history and logs.

    # List recent job executions and their status
    az containerapp job execution list \
      --name my-processing-job \
      --resource-group my-rg \
      --output table

    # Show the status and details of a specific execution
    az containerapp job execution show \
      --name my-processing-job \
      --resource-group my-rg \
      --job-execution-name my-processing-job-abc123

From Log Analytics:

    ContainerAppConsoleLogs_CL
    | where ContainerAppName_s == "my-processing-job"
    | where TimeGenerated > ago(24h)
    | project TimeGenerated, Log_s, ContainerName_s
    | order by TimeGenerated desc

Summary: Your Startup Troubleshooting Checklist

Before you dig into complex diagnostics, run through this checklist whenever a Container App or Job fails to start:

Are all required environment variables and secrets defined and correctly referenced?
Is the application listening on `0.0.0.0` and on the port that matches the ingress target port?
Does the Dockerfile copy everything needed for the app to run (migrations, static files, etc.)?
Are health probes configured with enough initial delay for the app to start?
For jobs: is the replica timeout long enough, and does the process exit with code 0 on success?
Is the container registry accessible from the Container Apps environment (managed identity or registry credentials configured)?
Are the resource allocations (CPU and memory) sufficient for the application to start without being OOM-killed?

References and Sample Resources

Use these resources for deeper implementation details and production-ready patterns.
Azure Container Apps docs (core)

- Azure Container Apps overview
- Manage secrets in Azure Container Apps
- Manage environment variables in Azure Container Apps
- Health probes in Azure Container Apps
- Ingress in Azure Container Apps
- View logs in Azure Container Apps
- Azure Container Apps Jobs overview
- Azure Container Apps revisions

.NET and Django references

- ASP.NET Core configuration fundamentals
- ASP.NET Core health checks
- Django deployment checklist
- Gunicorn settings

Sample repositories

- Azure Samples: .NET on Azure Container Apps
- Azure Samples: Python on Azure Container Apps
- Azure Samples: TypeScript MCP container sample

What's Next

In Part 2 of this series, we move past startup failures and look at what happens after your app is running: the frustrating world of cold starts, scaling delays, and startup latency spikes that make your application feel slow under real production traffic.

Part of the series: Troubleshooting Azure Container Apps in Production
Next: Part 2 - From Slow to Snappy: Performance Tuning Cold Starts and Scaling Delays in Azure Container Apps

What's new in Azure App Service at #MSBuild 2026
At Microsoft Build 2026, Azure App Service introduced a powerful set of updates designed to help organizations accelerate their journey into AI, without increasing complexity or cost. These innovations focus on one clear business outcome: enabling teams to build, deploy, and scale AI-powered applications and agents faster, more securely, and with greater operational efficiency. A key highlight is the new Easy AI experience, which allows existing web apps to become AI-ready with no rearchitecting required. With capabilities like built-in Model Context Protocol (MCP), developers can instantly expose app functionality as agent-ready endpoints, enabling AI agents to interact with business logic securely and seamlessly. This dramatically reduces development time, allowing teams to move from idea to intelligent application in a fraction of the usual effort. Security and compliance are also strengthened with the general availability of Isolated v4 for Azure App Service Environments, delivering improved performance for customers that need single-tenant isolation and strong data residency guarantees. For enterprises operating in regulated industries, this ensures AI applications meet strict governance requirements without sacrificing scalability or speed. For modernization scenarios, Managed Instance on Azure App Service simplifies the migration of legacy applications, including those with OS-level dependencies. Faster restarts, enhanced diagnostics, and AI-assisted migration workflows help organizations modernize existing systems cost-effectively—avoiding expensive rewrites while unlocking AI capabilities. Recent updates include an AI-assisted approach to migrating legacy IIS applications using a multi-agent workflow powered by MCP. Managed Instance is supported on both Premium v4 and Isolated v4, laying the foundation for a modern compute infrastructure across the board. 
Operational efficiency is further enhanced through platform and CLI improvements designed for the “agent era.” From structured deployment diagnostics to optimized Python pipelines delivering faster deployments, these updates reduce friction and infrastructure overhead, lowering total cost of ownership. Together, these innovations position Azure App Service as a future-ready platform where businesses can rapidly build intelligent, agent-driven applications securely, efficiently, and at scale.

👉 Learn more in the full announcement: Deep dive into Azure App Service Build 2026 updates

Azure Functions MCP Extension: What's New at Build 2026
The Azure Functions MCP extension has had a breakout year! Since its initial preview, the extension has grown from a single trigger type into a full-featured platform for building remote MCP servers: with tool, resource, and prompt triggers across multiple languages, MCP Apps for interactive UIs, built-in MCP authentication, and feature enhancements. Here's what's new and what it means for developers building MCP servers on Azure Functions. The full MCP primitive set: Tools, resources, and prompts When the MCP extension first shipped, it supported tool triggers. Declare a function as an MCP tool, and any MCP client can discover and call it. That was the starting point. Since then, we've shipped the remaining MCP primitives: Resource triggers: expose a function as an MCP resource. Prompt triggers: expose a function as an MCP prompt, letting clients request structured prompt templates from your server. Like tool triggers, resource and prompt triggers are supported in multiple languages including .NET, Java, Python, TypeScript, and JavaScript. MCP Apps: interactive UI from your MCP server MCP Apps let your tools return interactive user interfaces instead of plain text. Combine tool triggers with resource triggers, and your MCP server can serve rich, rendered experiences to MCP-aware clients. The Azure Functions MCP extension supports MCP Apps natively, meaning the same function app that exposes tools and resources can also serve UI components. The launch blog post on the Azure Apps Blog walked through the pattern in detail. For .NET developers, the new fluent builder API (available in the latest NuGet release) makes it easier to compose MCP Apps by chaining tool and resource definitions in a declarative style. MCP authentication The extension supports built-in MCP authentication, implementing the requirements of the MCP auth spec. All samples in the aka.ms/remote-mcp repo enable built-in MCP auth by default with Microsoft Entra ID as the identity provider. 
Samples have also been updated to demonstrate how to exchange tokens in the On-Behalf-Of (OBO) flow, so your MCP tools can access downstream APIs using the invoking user's identity. Auth configuration in the Azure portal: Preview at Build is a one-click experience in the Azure portal for configuring built-in MCP auth. No more manual app registration creating, configuration and wiring to the server. Just open your server app on the portal and click to enable MCP auth. Try it out! Feature enhancements Beyond the headline primitives and auth, the extension has shipped a steady stream of capabilities the past few months. The following are the notable additions. Structured content Structured content lets you return machine-readable JSON metadata alongside your tool's response via the `structuredContent` field. Clients that support it can programmatically consume the data (e.g. parse fields, render tables, drive downstream logic) rather than just displaying text. Clients that don't support it still get the regular content blocks as a fallback. Rich content types Tools aren't limited to returning plain text. The extension supports the full set of MCP content block types, e.g. `TextContent`, `ImageContent`, `AudioContent`, `ResourceLink`, and `EmbeddedResource`, so your tools can return images, audio clips, references to resources, and inline file content alongside text. Input and output schemas `WithInputSchema` and `WithOutputSchema` give you explicit control over the JSON schemas advertised for your tools. This is especially useful when the auto-generated schema from function parameters doesn't capture the full contract. For example, when your tool accepts a complex nested object or returns a specific shape that clients depend on. Input and output schemas are currently supported in .NET, with support for other languages coming soon. 
builder.ConfigureMcpTool("SearchDocs") .WithOutputSchema(""" { "type": "object", "properties": { "results": { "type": "array", "items": { "type": "string" } }, "query": { "type": "string" } }, "required": ["results", "query"] } """); Fluent configuration APIs in .NET A set of fluent builder APIs that let you configure MCP primitives declaratively in `Program.cs`: ConfigureMcpTool: add properties, metadata, input/output schemas, or promote a tool to an MCP App ConfigureMcpResource: attach metadata to resources ConfigureMcpPrompt: define prompt arguments and metadata builder.ConfigureMcpTool("sayhello") .WithProperty("name", McpToolPropertyType.String, "Name of the user", required: true) .WithMetadata("ui", new { resourceUri = "ui://index.html" }); What's next Usage of the MCP extension has grown steadily since its preview launch. Tool execution volume has increased 15x over the past several months as more customers move from experimentation to production. As adoption grows, so do the expectations. Developers building production MCP servers are hitting real friction around auth complexity, client configuration, and observability. We're continuing to invest in the extension to address these gaps and help customers be more successful building and hosting MCP servers on Azure Functions. Here's where we're focusing next. Continued auth simplification Auth remains the biggest barrier to getting an MCP server into production. We'll work on: Smoother client setup: making it easier to connect any MCP client to an authenticated Azure Functions MCP server, not just VS Code. Simplified OBO flow: streamlining the experience of On-Behalf-Of authentication so developers can delegate user identity to downstream services with less configuration. Our goal: the secure path should be the easy path. Deeper integration with Microsoft Foundry We'll build tighter integration between Azure Functions MCP servers and Microsoft Foundry. 
This includes surfacing MCP servers in Foundry Toolbox, a new feature introduced to help Foundry agents discover and consume tools from a single endpoint. Developers will be able to publish an MCP server from Functions and have it available to Foundry agents through Toolbox without manual endpoint configuration.

Continued feature enhancement

We prioritize based on feedback the community raises in our GitHub repo. For example, support for streaming output and pagination are top items in our backlog today based on user demand. We also track the MCP spec's evolution closely and will continue shipping support for strategic features as they land. Examples of proposals we're following:

- MCP Tasks: the Tasks extension (SEP-2663) defines a standard pattern for async, long-running tool calls with durable task handles. This replaces hand-rolled polling patterns and aligns well with Functions' execute-and-return model.
- Stateless MCP: SEP-2575 proposes removing the mandatory initialization handshake, which is a natural fit for serverless platforms like Azure Functions where fresh instances can handle any request.

Have something you'd like us to prioritize? Let us know by filing a request on GitHub.

Get started

- Samples: Samples showcasing the most up-to-date features: aka.ms/remote-mcp
- Documentation: Model Context Protocol for Azure Functions
- MCP Extension GitHub repo: Azure Functions MCP Extension

Closing the AI-readiness gap with agentic modernization
Legacy debt is widening the AI-readiness gap Legacy systems and mounting tech debt aren't just slowing your AI agenda — they're quietly stealing its potential. Aging architectures, and complex, decades-old applications, databases, and infrastructure weren't designed for high performance, complex, dynamically scaling agentic workloads. The longer legacy lasts, the wider the gap between AI ambition and AI-readiness. This year at Microsoft Build 2026, we're taking our biggest step yet toward closing it. In a recent Forrester study, 94% of IT leaders ranked modernization as a top priority for their AI strategy, yet only 43% of their portfolios have been modernized on average, and only 32% are AI-ready.¹ Ambition for AI adoption is at a high, yet most are held back by the underlying legacy code, technical debt, and modernization backlog. Forrester’s data underscores it: on average, 35% of modernization projects stall due to legacy constraints, 65% cite security and compliance as the top challenge, 58% are held back by the complexity of monolithic applications, and 59% struggle with finding skilled talent to execute.¹ The result: AI initiatives that stall before they ever reach production. Modernization is a key step to move towards AI production; and it’s typically easier said than done. The growing problem is how to execute at the pace and scale that AI now demands. That’s why IT operators, architects, application owners, and developers are turning to agents to eliminate legacy toil, connect workstreams across teams, and scale their modernization efforts while customizing how they modernize. 
The first agentic end-to-end modernization solution that unifies IT and developer workflows

Azure Copilot migration agent and GitHub Copilot modernization agent create the first agentic, end-to-end modernization solution that unifies IT and developer workflows, helping organizations close the AI-readiness gap by connecting discovery, assessment, planning, code transformation, governance, deployment, and observability into one continuous system. Built into the tools IT teams and developers already use, the solution combines estate-scale planning with GitHub-native execution, application-aware migration, broad workload coverage across apps, infrastructure, and data, and enterprise-grade privacy, security, and flexibility, so modernization becomes a governed, scalable, portfolio-level capability across teams rather than a series of one-off projects.

Building the estate-wide modernization plan

Azure Copilot migration agent (public preview) brings AI to every step of estate modernization planning, from discovery and assessment to dependency mapping, ROI analysis, and wave planning, reducing months of manual analysis to minutes. For organizations with a clear picture of their estate, the migration agent accelerates the path from inventory to wave plan. For organizations that don’t have one, the migration agent helps them build that picture from scratch: what is running, what depends on what, what to move, what to modernize, what to retire, and in what order. By creating business-goal oriented estate plans, generating ROI analysis in minutes, and aligning IT and development teams through connected workflows, it helps enterprises move mission-critical applications, databases, and infrastructure onto Azure faster and more confidently, with a continuous, data-driven view of what to modernize next.
Freeing teams from the legacy tax GitHub Copilot modernization agent, now generally available, empowers application owners, architects, and developers to scale modernization across their entire application portfolio. Operated from the CLI, the modernization agent acts as an orchestrator that simultaneously: Assesses readiness across multiple applications at once Plans application-specific modernization journeys and executes the identified migration tasks Surfaces deep code and dependency-level insights and recommendations Automates upgrades for Java and .NET applications Recommends Azure services aligned to organizational standards With its native design into GitHub Copilot, the modernization agent creates issues, pull requests, and shareable assessment reports for each application as it works. Architects and application owners retain visibility and governance from a single view, while developers receive clear, prioritized work they can execute from the agent or finish directly in their preferred editor. Behind the scenes, the modernization agent coordinates with GitHub Copilot's coding agent to complete tasks asynchronously across repositories, with a full monitoring and audit trail in GitHub's Agent HQ. The result is a connected planning-to-execution flow that finally makes modernization at scale possible, without sacrificing oversight or control. In just a few months, the modernization agent has already accelerated modernization up to 4x faster across hundreds of thousands of legacy .NET and Java applications at hundreds of customers. Enterprise level customization Every application is built and operated as uniquely as the business it serves. The path to modernization must be equally unique: tailored to each application's architecture, dependencies, and intent. At Build, we're excited to announce the general availability of custom skills for the modernization agent. 
Custom skills (GA) let developers teach the modernization agent how their organization works by encoding proprietary patterns, libraries, Azure best practices, and migration approaches once, then reusing them across every run. Each skill is authored as a skill.md file with build instructions, sample usage, reference APIs, and more, and is built on open-standard agent skills so teams aren't locked into a proprietary format. With custom skills, developers can equip the modernization agent with: Business-specific context, knowledge, intent, and migration approaches for application-aware guidance Centralized skills library to reuse and repeat tasks across the portfolio Full traceability for every skill used in generating the modernization plan The result is portfolio-scale execution with application-level specificity, in the same agentic workflow. And because skills live in a shared library, teams can reuse and repeat for faster, more consistent modernization outcomes aligned with the application's goals. Innovating while closing the AI-readiness gap GitHub Copilot is already dramatically reducing technical debt in real world environments, helping to close the AI readiness gap and, more importantly, innovate faster with AI. Organizations that adopt agentic modernization can not only close their AI-readiness gap, they can also make modernization a continuous process, allowing them to more readily integrate AI into existing business applications and services Ready to reimagine your applications? Join us at Microsoft Build this year, online or in person, to see our product teams reimagine applications live with GitHub Copilot modernization, share customer success, and empower you to modernize with confidence in days, not months. 
- Join online or in person for Build session BRK220 on Wednesday at 9 AM PST
- Learn more about GitHub Copilot modernization: aka.ms/ghcp-modernization
- Dive deeper at the virtual .NET Agentic Modernization Day on June 16th: aka.ms/dotnetday/rsvp

¹ Forrester’s Q1 2026 Cloud and AI Application Modernization Survey [E-66670]

Azure Functions MCP extension now supports MCP Prompts
We are thrilled to announce that the MCP prompt trigger is now available in public preview in the Azure Functions MCP extension! With this release, the extension now supports all three core MCP server primitives (tools, resources, and prompts), giving you a complete platform for building rich MCP servers on Azure Functions. In case you missed it, the MCP resource trigger is generally available for serving resources and building interactive UIs in MCP Apps.

What are MCP Prompts?

In the Model Context Protocol (MCP), prompts are reusable templates that allow server authors to provide parameterized prompts for a domain, or to showcase how best to use the MCP server. Prompts are user-controlled in that they require explicit invocation rather than automatic triggering, and they can be context-aware, referencing available resources and tools to create comprehensive workflows. Unlike tools (which are model-controlled) and resources (which are application-controlled), prompts are exposed from servers to clients so users can explicitly select them. Applications typically expose prompts through slash commands, command palettes, dedicated UI buttons, or context menus.

How It Works

In Python, defining a prompt is as simple as decorating a function. Here's a prompt that returns a code review checklist:

```python
import logging

import azure.functions as func

app = func.FunctionApp()

# Register the function below as an MCP prompt named "code_review_checklist"
@app.mcp_prompt_trigger(
    arg_name="context",
    prompt_name="code_review_checklist",
    description="Returns a structured code review checklist prompt for evaluating code changes."
)
def code_review_checklist(context: func.PromptInvocationContext) -> str:
    logging.info("Code review checklist prompt invoked.")
    return """You are a senior software engineer performing a code review. Use the following checklist to evaluate the code:

1. **Correctness** - Does the code do what it's supposed to?
2. **Error Handling** - Are edge cases and failures handled?
3. **Security** - Are there any vulnerabilities (injection, auth, secrets)?
4. **Performance** - Are there obvious inefficiencies?
5. **Readability** - Is the code clear and well-named?
6. **Tests** - Are there adequate tests for the changes?

Provide your feedback in a structured format with a severity level (critical, warning, suggestion) for each finding."""
```

Prompts can accept arguments, allowing clients to customize the generated message. Here's a prompt that generates documentation with configurable parameters:

```python
# Both arguments are optional; the function supplies defaults when they are missing
@app.mcp_prompt_trigger(
    arg_name="context",
    prompt_name="generate_documentation",
    prompt_arguments=[
        func.PromptArgument("function_name", "The name of the function to document.", required=False),
        func.PromptArgument("style", "Documentation style: 'concise', 'detailed', or 'tutorial'.", required=False)
    ],
    description="Generates API documentation for a function. Arguments are configured in Program.cs."
)
def generate_documentation(context: func.PromptInvocationContext) -> str:
    function_name = context.arguments.get("function_name", "(unknown)")
    style = context.arguments.get("style", "concise")
    logging.info(f"Generate docs prompt invoked for function: {function_name}")
    return f"""Generate API documentation for the function named **{function_name}**.

Documentation style: **{style}**

Include the following sections:
- **Description** - What the function does.
- **Parameters** - List each parameter with its type and purpose.
- **Return Value** - What the function returns.
- **Example Usage** - A short code example showing how to call it."""
```

Check out the Get Started section for the complete sample and for samples in other languages.

Why Azure Functions

Azure Functions is the ideal platform for hosting remote MCP servers because of its built-in MCP authentication, event-driven scaling from 0 to N, and serverless billing. This ensures your agentic tools are secure, cost-effective, and ready to handle any load. With the MCP extension, you focus on implementing the primitives you want to expose (tools, resources, and prompts) instead of worrying about MCP protocol details and server logistics.
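Returning to the `generate_documentation` prompt shown earlier, its argument handling can be tried out without the Functions runtime. The sketch below reproduces only that logic with a plain dictionary. `render_documentation_prompt` and `ALLOWED_STYLES` are hypothetical names, not part of the Azure Functions SDK, and the sketch assumes `context.arguments` behaves like a dictionary, as the sample's `.get(...)` calls suggest. The style validation is an addition for illustration; the original sample passes the value through unchanged.

```python
from typing import Mapping

# Hypothetical: the three styles named in the sample's argument description.
ALLOWED_STYLES = ("concise", "detailed", "tutorial")


def render_documentation_prompt(arguments: Mapping[str, str]) -> str:
    """Build the prompt text with the same defaults as the sample."""
    function_name = arguments.get("function_name", "(unknown)")
    style = arguments.get("style", "concise")
    # Assumption (not in the original sample): unknown styles fall back
    # to "concise" instead of being echoed into the prompt.
    if style not in ALLOWED_STYLES:
        style = "concise"
    return (
        f"Generate API documentation for the function named **{function_name}**.\n"
        f"Documentation style: **{style}**"
    )


if __name__ == "__main__":
    # No arguments supplied: both defaults apply.
    print(render_documentation_prompt({}))
    # Client supplies both arguments.
    print(render_documentation_prompt({"function_name": "parse_order", "style": "tutorial"}))
```

Keeping the text-building logic in a plain function like this makes it possible to unit test a prompt before wiring it to the trigger.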
Get Started

You can start building today using our quickstarts and samples:

- Python
- TypeScript
- .NET
- Java

Documentation

- Azure Functions MCP extension overview
- Prompt trigger

We'd Love to Hear from You!

Let us know your thoughts about the new prompt trigger. What kinds of prompts are you building for your MCP servers? What would you like us to prioritize next? Share your feedback in our GitHub repo.

Multi-Tenant Architecture: Real Challenges and an Azure Design Walkthrough
Azure Multi-Tenant Architecture (B2C Scenario)

Let’s start with a reference design commonly used in Azure-based systems. A pretty standard setup looks something like this:

- Microsoft Entra External ID (Azure AD B2C) for authentication
- Azure API Management as the entry layer
- App Service or Functions for the compute layer
- Cosmos DB or SQL for storage
- Redis for caching
- Service Bus for async processing
- Application Insights for monitoring

If you’ve worked on Azure systems, nothing here is surprising. On paper, this architecture is clean, scalable, and “multi-tenant ready”. But once traffic starts flowing and tenants behave differently, things start breaking in subtle ways.

1. Tenant Context Propagation Across Services

A request doesn’t stay in one place. It moves across:

- API layer
- queues/topics
- background workers

What I’ve seen happen multiple times:

- tenant ID is present in the API, but missing in async flows
- background jobs process data without knowing which tenant it belongs to
- logs become useless because you can’t tie actions back to a tenant

The fix is simple in theory, but often missed in implementation: every message should carry tenant context. No exceptions. If you rely on “it will be available somewhere”, it won’t be, especially in distributed systems. Ensure tenant context is explicitly carried everywhere:

```csharp
public class TenantMessage
{
    public string TenantId { get; set; }
    public string Payload { get; set; }
}
```

Every message, event, and async operation should include tenant scope.

2. Data Isolation in Shared Databases

Most teams start with a shared database model with tenant-based partitioning. It works well initially.
Problems start creeping in later:

- someone forgets to add a tenant filter in a query
- a query suddenly scans across partitions
- one large tenant starts slowing down others

A simple query like this becomes critical:

```csharp
var query = container.GetItemQueryIterator<Order>(
    new QueryDefinition("SELECT * FROM c WHERE c.tenantId = @tenantId")
        .WithParameter("@tenantId", tenantId)
);
```

The tricky part is not writing it once, it’s making sure it’s applied everywhere, every time.

3. Authorization Beyond Tenant Boundaries

At the beginning, access control is simple: “Users can access data from their own tenant.” But then requirements grow:

- admin access
- cross-tenant visibility
- reporting across firms or regions

And this is where things usually get messy. Different services start implementing their own logic, and over time you end up with inconsistent behavior. A simple check:

```csharp
public bool CanAccess(string userTenant, string resourceTenant, bool isGlobalAdmin)
{
    if (isGlobalAdmin) return true;
    return userTenant == resourceTenant;
}
```

becomes much harder to manage when duplicated across multiple services. One thing that helps a lot here is centralizing authorization logic early.

4. Caching as a Hidden Risk

Caching is usually added later for performance. And that’s exactly why it becomes risky. I’ve seen scenarios where cached data from one tenant is returned to another because the cache key didn’t include tenant information. Fixing it is straightforward:

```csharp
public string BuildCacheKey(string tenantId, string key)
{
    return $"{tenantId}:{key}";
}
```

Cache keys must always include tenant boundaries.

5. Resource Contention (Noisy Neighbor Problem)

All tenants share resources:

- compute
- database throughput
- messaging

What happens in practice:

- one high-load tenant impacts others
- latency becomes unpredictable
- system behavior differs per tenant

You start adding controls like:

```csharp
if (RequestsPerTenant[tenantId] > 100)
{
    return StatusCode(429);
}
```

And gradually move towards:

- throttling
- workload isolation
- prioritization

This is less of a design problem and more of an operational reality.

6. Observability in Multi-Tenant Systems

Logging works great, until you scale. Then suddenly:

- logs from all tenants are mixed
- debugging becomes slow
- it’s hard to answer basic questions like “which tenant failed?”

A small change makes a huge difference:

```csharp
_logger.LogInformation(
    "Tenant={TenantId} Action=ProcessOrder OrderId={OrderId}",
    tenantId,
    orderId
);
```

It sounds obvious, but it’s often inconsistent across services.

7. Backup and Restore Considerations

Taking backups is easy. Restoring a single tenant isn’t. In most shared database setups, restore is done at the database level, which affects all tenants. So if one tenant has a problem, recovery is not straightforward. This is one of those areas where decisions made early in design matter a lot later.

Final Thoughts

Designing a multi-tenant system is not just about choosing Azure services. The real challenges come from:

- how tenant context flows
- how isolation is enforced
- how systems behave under uneven load

Most issues don’t show up on day one. They appear gradually as tenants grow, scale, and behave differently.

References and Further Reading

If you want to explore these concepts in more depth, here are some useful official resources:

- Microsoft Entra External ID (Azure AD B2C)
- Azure API Management
- Azure App Service
- Azure Cosmos DB and multitenant design
- Azure Service Bus

From "Maybe Next Quarter" to "Running Before Lunch" on Container Apps - Modernizing Legacy .NET App
In early 2025, we wanted to modernize Jon Galloway's MVC Music Store - a classic ASP.NET MVC 5 app running on .NET Framework 4.8 with Entity Framework 6. The goal was straightforward: address vulnerabilities, enable managed identity, and deploy to Azure Container Apps and Azure SQL. No more plaintext connection strings. No more passwords in config files. We hit a wall immediately. Entity Framework on .NET Framework did not support Azure.Identity or DefaultAzureCredential. We just could not add a NuGet package and call it done - we’d need EF Core, which means modern .NET - and rewriting the data layer, the identity system, the startup pipeline, the views. The engineering team estimated one week of dedicated developer work. As a product manager without extensive .NET modernization experience, I wasn't able to complete it quickly on my own, so the project was placed in the backlog. This was before the GitHub Copilot "Agent" mode, the GitHub Copilot app modernization (a specialized agent with skills for modernization) existed but only offered assessment - it could tell you what needed to change, but couldn't make the end to end changes for you. Fast-forward one year. The full modernization agent is available. I sat down with the same app and the same goal. A few hours later, it was running on .NET 10 on Azure Container Apps with managed identity, Key Vault integration, and zero plaintext credentials. Thank you GitHub Copilot app modernization! And while we were on it – GitHub Copilot helped to modernize the experience as well, built more tests and generated more synthetic data for testing. Why Azure Container Apps? Azure Container Apps is an ideal deployment target for this modernized MVC Music Store application because it provides a serverless, fully managed container hosting environment. It abstracts away infrastructure management while natively supporting the key security and operational features this project required. 
It pairs naturally with infrastructure-as-code deployments, and its per-second billing on a consumption plan keeps costs minimal for a lightweight web app like this, eliminating the overhead of managing Kubernetes clusters while still giving you the container portability that modern .NET apps benefit from. That is why I asked Copilot to modernize to Azure Container Apps - here's how it went - Phase 1: Assessment GitHub Copilot App Modernization started by analyzing the codebase and producing a detailed assessment: Framework gap analysis - .NET Framework 4.0 → .NET 10, identifying every breaking change Dependency inventory - Entity Framework 6 (not EF Core), MVC 5 references, System.Web dependencies Security findings - plaintext SQL connection strings in Web.config, no managed identity support API surface changes - Global.asax → Program.cs minimal hosting, System.Web.Mvc → Microsoft.AspNetCore.Mvc The assessment is not a generic checklist. It reads your code - your controllers, your DbContext, your views - and maps a concrete modernization path. For this app, the key finding was clear: EF 6 on .NET Framework cannot support DefaultAzureCredential. The entire data layer needs to move to EF Core on modern .NET to unlock passwordless authentication. Phase 2: Code & Dependency Modernization This is where last year's experience ended and this year's began. 
The agent performed the actual modernization:

Project structure:

- .csproj converted from legacy XML format to SDK-style targeting net10.0
- Global.asax replaced with Program.cs using minimal hosting
- packages.config → NuGet PackageReference entries

Data layer (the hard part):

- Entity Framework 6 → EF Core with Microsoft.EntityFrameworkCore.SqlServer
- DbContext rewritten with OnModelCreating fluent configuration
- System.Data.Entity → Microsoft.EntityFrameworkCore namespace throughout
- EF Core migrations generated from scratch
- Database seeding moved to a proper DbSeeder pattern with MigrateAsync()

Identity:

- ASP.NET Membership → ASP.NET Core Identity with ApplicationUser, ApplicationDbContext
- Cookie authentication configured through ConfigureApplicationCookie

Security (the whole trigger for this modernization):

- Azure.Identity + DefaultAzureCredential integrated in Program.cs
- Azure Key Vault configuration provider added via Azure.Extensions.AspNetCore.Configuration.Secrets
- Connection strings use Authentication=Active Directory Default, with no passwords anywhere
- Application Insights wired through OpenTelemetry

Views:

- Razor views updated from MVC 5 helpers to ASP.NET Core Tag Helpers and conventions
- _Layout.cshtml and all partials migrated

The code changes touched every layer of the application. This is not a find-and-replace. It's a structural rewrite that maintains functional equivalence.

Phase 3: Local Testing

After modernization, the app builds, runs locally, and connects to a local SQL Server (or SQL in a container). EF Core migrations apply cleanly, the seed data loads, and you can browse albums, add to cart, and check out. The identity system works. The Key Vault integration gracefully skips when KeyVaultName isn't configured, meaning local dev and Azure use the same Program.cs with zero code branches.
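The "skip Key Vault when KeyVaultName isn't configured" behavior is a general configuration-layering pattern, and it can be sketched outside of .NET. The Python below is only an illustration of that pattern, not the C# the agent generated. `build_config` and `load_secrets` are hypothetical names, and the secret loader is injected as a callable so the sketch runs without any Azure dependency.

```python
from typing import Callable, Dict, Mapping, Optional

# Hypothetical loader type: takes a vault name, returns secret key/value pairs.
SecretLoader = Callable[[str], Mapping[str, str]]


def build_config(
    base: Mapping[str, str],
    load_secrets: Optional[SecretLoader] = None,
) -> Dict[str, str]:
    """Layer vault secrets over base settings only when a vault is named."""
    config = dict(base)
    vault_name = config.get("KeyVaultName", "").strip()
    if not vault_name or load_secrets is None:
        # Local development: no vault configured, keep the base settings.
        return config
    # Hosted environment: vault values override base values with the same key.
    config.update(load_secrets(vault_name))
    return config


if __name__ == "__main__":
    # Local run: KeyVaultName is absent, so the loader is never needed.
    print(build_config({"Greeting": "hello from local settings"}))

    # Hosted run with a stand-in loader instead of a real vault client.
    fake_vault = lambda name: {"Greeting": f"hello from {name}"}
    print(build_config({"Greeting": "hello", "KeyVaultName": "kv-demo"}, fake_vault))
```

The point of the design is that the same startup code runs in both environments; only the presence of one setting decides whether the extra configuration source is added.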
Phase 4: azd up and Deployment to Azure

The agent also generates the deployment infrastructure:

- azure.yaml: AZD service definition pointing to the Dockerfile, targeting Azure Container Apps
- Dockerfile: Multi-stage build using mcr.microsoft.com/dotnet/sdk:10.0 and aspnet:10.0
- infra/main.bicep: Full IaC including:
  - Azure Container Apps with system + user-assigned managed identity
  - Azure SQL Server with Azure AD-only authentication (no SQL auth)
  - Azure Key Vault with RBAC, Key Vault Secrets Officer role for the managed identity
  - Container Registry with AcrPull role assignment
  - Application Insights + Log Analytics
  - All connection strings injected as Container App secrets, using Active Directory Default, not passwords

One command: azd up

It provisions everything, builds the container, pushes to ACR, and deploys to Container Apps. The app starts, runs MigrateAsync() on first boot, seeds the database, and serves traffic. Managed identity handles all auth to SQL and Key Vault. No credentials stored anywhere.

What Changed in a Year

| | Early 2025 | Now |
| Assessment | Available | Available |
| Automated code modernization | Semi-manual | ✅ Full modernization agent |
| Infrastructure generation | Semi-manual | ✅ Bicep + AZD generated |
| Time to complete | Weeks | ✅ Hours |

The technology didn't just improve incrementally. The gap between "assessment" and "done" collapsed. A year ago, knowing what to do and being able to do it were very different things. Now they're the same step.

Who This Is For

If you have a .NET Framework app sitting on a backlog because "the modernization is too expensive", revisit that assumption. The process changed. GitHub Copilot app modernization helps you rewrite your data layer, generates your infrastructure, and gets you to azd up. It can help you generate tests to increase your code coverage. If you have feature requests or want to optimize the code further for scale, bring your requirements, logs, or profiler traces; you can address all of that during the modernization process.
MVC Music Store went from .NET Framework 4.0 with Entity Framework 6 and plaintext SQL credentials to .NET 10 on Azure Container Apps with managed identity, Key Vault, and zero secrets in code. In an afternoon. That backlog item might be a lunch break now 😊. Really. Find your legacy apps and try it yourself.

Next steps

- Modernize your .NET or Java apps with GitHub Copilot app modernization: https://aka.ms/ghcp-appmod
- Open your legacy application in Visual Studio or Visual Studio Code to start the process
- Deploy to Azure Container Apps: https://aka.ms/aca/start