application modernization
111 TopicsReference Architecture for a High Scale Moodle Environment on Azure
Introduction Moodle is an open-source learning platform that was developed in 1999 by Martin Dougiamas, a computer scientist and educator from Australia. Moodle stands for Modular Object-Oriented Dynamic Learning Environment, and it is written in PHP, a popular web programming language. Moodle aims to provide educators and learners with a flexible and customizable online environment for teaching and learning, where they can create and access courses, activities, resources, and assessments. Moodle also supports collaboration, communication, and feedback among users, as well as various plugins and integrations with other systems and tools. Moodle is widely used around the world by schools, universities, businesses, and other organizations, with over 100 million registered users and 250,000 registered sites as of 2020. Moodle is also supported by a large and active community of developers, educators, and users, who contribute to its development, documentation, translation, and support. [URL] is the official website of the Moodle project, where anyone can download the software, join the forums, access the documentation, participate in events, and find out more about Moodle. Goal The goal for this architecture is to have a Moodle environment that can handle 400k concurrent users and scale in and out its application resources according to usage. Using Azure managed services to minimize operational burden was a design premise because standard Moodle reference architectures are based on Virtual Machines that comes with a heavy operational cost. Challenges Being a monolith application, scaling Moodle in a modern cloud native environment is challenging. We choose to use Kubernetes as its computing provider due to the fact that it allow us to build a Moodle artifact in an immutable way that allows it to scale out and in when needed in a fast and automatic way and also recover from potential failures by simply recreating its Deployments without the need to maintain Virtual Machine resources, introducing the concept of pets vs cattle[1] to a scenario that at first glance wouldn't be feasible. Since Moodle is written in PHP it has no concept of database polling, creating a scenario where its underlying database is heavily impacted by new client requests, making it necessary to use an external database pooling solution that had to be custom tailored in order to handle the amount of connections for a heavy-traffic setup like this instead of using Azure Database for PostgreSQL's built-in pgbouncer. The same effect is also observed in its Redis implementation, where a custom Redis cluster had to be created, whereas using Azure Cache for Redis would incur prohibitive costs due to the way it is set up for a more general usage. 1 - https://learn.microsoft.com/en-us/dotnet/architecture/cloud-native/definition#the-cloud Architecture This architecture uses Azure managed (PaaS) components to minimize operational burden by using Azure Kubernetes Service to run Moodle, Azure Storage Account to host course content, Azure Database for PostgreSQL Flexible Server as its database and Azure Front Door to expose the application to the public as well as caching commonly used assets. The solution also leverages Azure Availability Zones to distribute its component across different zones in the region to optimize its availability. Provisioning the solution The provisioning has two parts: setting up the infrastructure and the application. The first part uses Terraform to deploy easily. The second part involves creating Moodle's database and configuring the application for optimal performance based on the templates, number of users, etc. and installing templates, courses, plugins etc. The following steps walk you through all tasks needed to have this job done. Clone the repository $ git clone https://github.com/Azure-Samples/moodle-high-scale Provision the infrastructure $ cd infra/ $ az login $ az group create --name moodle-high-scale --location <region> $ terraform init $ terraform plan -var moodle-environment=production $ terraform apply -var moodle-environment=production $ az aks get-credentials --name moodle-high-scale --resource-group moodle-high-scale Provision the Redis Cluster $ cd ../manifests/redis-cluster $ kubectl apply -f redis-configmap.yaml $ kubectl apply -f redis-cluster.yaml $ kubectl apply -f redis-service.yaml Wait for all the replicas to be running $ ./init.sh Type 'yes' when prompted. Deploy Moodle and its services Change image in moodle-service.yaml and also adjust the moodle data storage account name in the nfs-pv.yaml (see commented lines in the files) $ cd ../../images/moodle $ az acr build --registry moodlehighscale<suffix> -t moodle:v0.1 --file Dockerfile . $ cd ../../manifests $ kubectl apply -f pgbouncer-deployment.yaml $ kubectl apply -f nfs-pv.yaml $ kubectl apply -f nfs-pvc.yaml $ kubectl apply -f moodle-service.yaml $ kubectl -n moodle get svc –watch Provision the frontend configuration that will be used to expose Moodle and its assets publicly $ cd ../frontend $ terraform init $ terraform plan $ terraform apply Approve the private endpoint connection request from Frontdoor in moodle-svc-pls resource. Private Link Services > moodle-svc-pls > Private Endpoint Connections > Select the request from Front Door and click on Approve. Install database $ kubectl -n moodle exec -it deployment/moodle-deployment -- /bin/bash $ php /var/www/html/admin/cli/install_database.php --adminuser=admin_user --adminpass=admin_pass --agree-license Deploy Moodle Cron Change image in moodle-cron.yaml $ cd ../manifests $ kubectl apply -f moodle-cron.yaml Your Moodle installation is now ready to use! Conclusion You can create a Moodle environment that is scalable and reliable in minutes with a very simple approach, without having to deal with the hassle of operating its parts that normally comes with standard Moodle installations.1.9KViews8likes1CommentWhat's new in Azure App Service at #MSBuild 2026
At Microsoft Build 2026, Azure App Service introduced a powerful set of updates designed to help organizations accelerate their journey into AI, without increasing complexity or cost. These innovations focus on one clear business outcome: enabling teams to build, deploy, and scale AI-powered applications and agents faster, more securely, and with greater operational efficiency. A key highlight is the new Easy AI experience, which allows existing web apps to become AI-ready with no rearchitecting required. With capabilities like built-in Model Context Protocol (MCP), developers can instantly expose app functionality as agent-ready endpoints, enabling AI agents to interact with business logic securely and seamlessly. This dramatically reduces development time, allowing teams to move from idea to intelligent application in a fraction of the usual effort. Security and compliance are also strengthened with the general availability of Isolated v4 for Azure App Service Environments, delivering improved performance for customers that need single-tenant isolation and strong data residency guarantees. For enterprises operating in regulated industries, this ensures AI applications meet strict governance requirements without sacrificing scalability or speed. For modernization scenarios, Managed Instance on Azure App Service simplifies the migration of legacy applications, including those with OS-level dependencies. Faster restarts, enhanced diagnostics, and AI-assisted migration workflows help organizations modernize existing systems cost-effectively—avoiding expensive rewrites while unlocking AI capabilities. Recent updates include an AI-assisted approach to migrating legacy IIS applications using a multi-agent workflow powered by MCP. Managed Instance is supported on both Premium v4 and Isolated v4, laying the foundation for a modern compute infrastructure across the board. Operational efficiency is further enhanced through platform and CLI improvements designed for the “agent era.” From structured deployment diagnostics to optimized Python pipelines delivering faster deployments, these updates reduce friction and infrastructure overhead, lowering total cost of ownership. Together, these innovations position Azure App Service as a future-ready platform where businesses can rapidly build intelligent, agent-driven applications securely, efficiently, and at scale. 👉 Learn more in the full announcement: Deep dive into Azure App Service Build 2026 updates1.4KViews0likes0CommentsDesigning for High Availability: The Operational Reference for Running a Geo-Replicated ACR
By Johnson Shi, Zoey (Zhuyu) Li, Huangli Wu Introduction Three of the most common questions we hear from enterprise teams running geo-replicated Azure Container Registries (ACR) are: "How do I control which region serves my traffic?" — When my AKS clusters are spread across regions, can I pin each one to its co-located replica, or am I stuck with however the global endpoint routes? "What happens during a regional incident — is failover automatic or do I have to act?" — If the registry in one region degrades, does the global endpoint reroute on its own, or do I need to manually disable the affected replica? "What happens after the region recovers — does traffic return on its own?" — Is there a cooldown, a quarantine, or any manual step before failback? We answer those head-on, then go deeper on the operational details that come up when you actually run a geo-replicated registry: authentication across endpoint switches, throttling under load concentration, eventual-consistency failure modes, home region outage scope, webhooks, and private endpoint interaction. We draw on the official geo-replication docs, the global endpoint health-aware failover blog, the regional endpoints engineering design implementation, the regional endpoints public preview and private preview announcements, and the ACR reference for various registry endpoints, . This post also draws notes from the ACR product team on roadmap items that aren't yet documented elsewhere. Key Takeaways Health-aware failover is automatic. When the registry in a region degrades, the global endpoint reroutes away from it on the order of minutes, evaluated per-registry. No customer action required. Failback is automatic too. Once health-aware failover marks a region healthy again, the global endpoint resumes routing to it. There is no cooldown period. Health-aware failover applies only to global endpoint operations. It does not apply to regional endpoints (you're talking to one replica, period) or to dedicated data endpoints (the redirect is per-region). Health-aware failover is not triggered by throttling. It responds to regional ACR service health and Azure infrastructure health, not HTTP 429 responses. Use regional endpoints to manage per-replica throttling. Regional endpoints (Step 2a) give you explicit per-region URLs for workloads that need affinity, capacity planning, push/pull consistency, troubleshooting, or client-side failover. Use myregistry.<region>.geo.azurecr.io . Regional endpoints are available on Premium SKU registries. For workloads that don't need pinning, do nothing (Step 2b). The global endpoint plus health-aware failover handles routing automatically. Re-authenticate when switching endpoints. Each global or regional endpoint is its own authenticated surface; re-auth via az acr login , SDK auth, or the Kubernetes ACR credential provider on endpoint change. Don't run a long-lived DNS cache for the global endpoint. ACR purges DNS server-side on disable and during failover; a long-lived client cache works against that. For production workloads, enable dedicated data endpoints for security and DNS predictability on layer downloads. ACR is working on bounded staleness consistency for cross-replica eventual-consistency failure modes; see the FAQ. Background What is ACR geo-replication? Geo-replication is a Premium SKU feature that turns a single ACR registry into a multi-region, multi-write service. Every geo-replica in every region is writable — you can push, pull, and delete from any of them — and content syncs asynchronously between replicas under an eventual consistency model. Per-push replication time scales with the size and number of images being pushed. Similarly, when creating a new geo-replica, the time to populate the new geo-replica scales with the total size of the registry. A geo-replicated registry exposes a global endpoint at myregistry.azurecr.io . Behind that endpoint, ACR uses an internal traffic manager to direct each request to the replica with the best network performance profile for the caller — usually the closest replica, but not always. When clients are equidistant from multiple replicas, or when the closest replica is experiencing Azure infrastructure degradation, requests may be routed elsewhere. A geo-replicated registry also exposes a regional endpoint at myregistry.<region>.geo.azurecr.io , which allows clients to pin API requests to a specific geo-replica in lieu of global endpoints, which has Azure-managed routing among geo-replicas. Zone redundancy is always enabled for geo-replicas in regions where Azure has multiple availability zones — in those regions, ACR automatically spreads replica data across multiple availability zones within each region to protect against zonal outages. Endpoints and data endpoints: what goes where A common point of confusion: when you push or pull, not every request goes to the same place. The registry endpoints (global endpoint and regional endpoints), as well as the data endpoint, do different jobs. Your choice of data endpoint configuration has real consequences for security and resilience. Two kinds of traffic flow during a typical pull: Registry API traffic — authentication, manifest reads/writes, tag resolution, referrers, repository operations, blob location lookups, listing, metadata. This is everything except the actual layer (blob) bytes. All these API requests go to the global endpoint ( myregistry.azurecr.io ) or, if you've pinned your clients to call these APIs to a specific geo-replica, a geo-replica's regional endpoint ( myregistry.<region>.geo.azurecr.io ). Behind the scenes, the global endpoint internally proxies these requests to a specific geo-replica. Layer (blob) downloads — when the client asks for a blob, the registry doesn't serve the bytes itself. It returns an HTTP 307 redirect to a regional data endpoint (separate endpoint from the global endpoint or regional endpoints), and the client follows the redirect to download the layer from that region. Where that 307 sends you depends on whether you've enabled the registry's dedicated data endpoints feature: Configuration Layer downloads redirect to Default (no dedicated data endpoints) *.blob.core.windows.net (the underlying Azure storage account) Dedicated data endpoints enabled myregistry.<region>.data.azurecr.io for the region you were routed to Private endpoints enabled myregistry.<region>.data.azurecr.io for the region you were routed to Regional by design. Dedicated data endpoints always land you on a specific geo-replica's data endpoint — there is no "global data endpoint." With the global endpoint as your registry endpoint, the 307 redirect picks the data endpoint for whichever region the global endpoint chose to serve you. With a regional endpoint pinned to a specific region, the 307 always redirects you to that same region's data endpoint — never cross-region. Why dedicated data endpoints matter. Dedicated data endpoints are a Premium SKU feature that exists primarily to address security and firewall scoping. By default, layer downloads redirect to *.blob.core.windows.net — a wildcard storage FQDN. Firewall rules to allow that wildcard either let all Azure storage accounts through or none of them, which raises data exfiltration concerns and isn't tightly scoped to your registry. Dedicated data endpoints replace the wildcard with a fully qualified domain in your registry's own domain — myregistry.<region>.data.azurecr.io — so firewall rules can be scoped tightly to your specific registry, in your specific regions. That same design choice can also make layer downloads more predictable during routing changes. With dedicated data endpoints, the data endpoint FQDN is known ahead of time and lives in the registry's domain — one predictable hostname per region, configured once. Without them, the layer download has to resolve a wildcard storage FQDN that points to whichever storage account the registry happens to have provisioned, which is a separate DNS resolution path with its own routing behavior and its own caching profile. Dedicated data endpoints simplify the DNS picture by aligning the data path with the registry path and keeping the entire pull experience inside one set of predictable, scoped FQDNs. For any geo-replicated registry where security and high availability matter, enable dedicated data endpoints. Note: Health-aware failover applies only to operations against the global endpoint, not to regional endpoints or dedicated data endpoints. Take note that health-aware failover only kicks in and directs traffic away from a geo-replica when an Azure region is experiencing significant infrastructure degradation. At this stage, it does not kick in to redirect traffic to another geo-replica if a client's data plane API requests are throttled. See the relevant section below for the full scope when health-aware auto failover kicks in or not. The three traffic control tools ACR geo-replication gives you three complementary tools for controlling where traffic lands. Each one solves a different class of problem, and customers most often run into trouble when they reach for the wrong one. We name them up front and use these names throughout the post: Tool Who controls it What it does Use cases Health-aware failover Platform (automatic) Reroutes the global endpoint away from a region whose registry can't reliably serve requests Regional incidents, automatic recovery Replica enable/disable for global routing Customer (manual) Excludes a specific replica from global endpoint routing without deleting it; data continues syncing DR rehearsals, planned maintenance, quarantining a replica without losing it Regional endpoints Customer (per request) Dedicated per-region URLs ( myregistry.<region>.geo.azurecr.io ) that bypass the internal traffic manager entirely Pinning AKS clusters to co-located replicas, push/pull consistency, capacity planning, troubleshooting, client-side failover Health-aware failover and replica enable/disable both act on the global endpoint. Regional endpoints are a separate URL surface that coexists with the global endpoint — enabling them does not disable the global endpoint myregistry.azurecr.io . You can use both simultaneously and choose per workload. The behavior in question When the registry in one region experiences a real degradation, there are three possible answers to "what happens?": (A) Nothing automatic. The customer must manually disable the affected region's endpoint to stop traffic from being routed there. (B) The system detects the regional front-door failure and reroutes within seconds. (C) A per-registry health evaluation detects the degradation and reroutes the global endpoint within minutes, with no customer action. After the region recovers, routing resumes automatically. The answer today is (C). Before health-aware failover, customers were stuck closer to (A) — the system could see whether the regional reverse proxy responded, but not whether the registry could actually serve real pull and push traffic end to end. Health-aware failover closes that gap. We walk through all three tools in the next section, in order: setting up geo-replication, using regional endpoints to pin specific workloads, keeping the global endpoint for everything else, the manual replica disable mechanism, re-enabling participation in global routing, and what to expect when health-aware failover triggers. Walkthrough The following steps assume an existing Premium SKU registry and the Azure CLI logged in. We use myregistry as the registry name, myrg as the resource group, and eastus as the home region. Substitute <your-registry> , <your-rg> , and <your-region> for your environment. Prerequisites A Premium SKU ACR registry (geo-replication requires Premium) Azure CLI ( az ) installed and logged in For regional endpoints (Step 2a): Azure CLI 2.86.0 or later. All regional endpoints commands ( --regional-endpoints , az acr show-endpoints , az acr login --endpoint ) are available natively in Azure CLI 2.86.0+. If you previously installed the acrregionalendpoint private preview CLI extension, uninstall it with az extension remove --name acrregionalendpoint to prevent conflicts with the built-in CLI commands. Step 1: Add a West US replica to a registry that lives in East US Geo-replication requires the Premium SKU. The create call below fails on Basic or Standard. # Confirm the registry is Premium az acr show --name myregistry --resource-group myrg \ --query sku.name --output tsv # Premium # Create a West US geo-replica az acr replication create --registry myregistry --location westus # Confirm both replicas are present az acr replication list --registry myregistry --output table NAME LOCATION PROVISIONING STATE STATUS REGION ENDPOINT ENABLED ------ ---------- -------------------- -------- ----------------------- eastus eastus Succeeded online True westus westus Succeeded online True Pushes and pulls continue working through the existing replica throughout initial sync. Because the registry is multi-region, multi-write, the existing replica keeps serving traffic while the new replica catches up in the background. Initial replica seeding time is a function of registry size — the total number and cumulative size of images already in the registry that need to be replicated to the new replica — not the size of any single image. Step 2a: Pin workloads to specific regions using regional endpoints Use regional endpoints when a workload needs explicit per-region control. The five common cases: Regional affinity — an AKS cluster in East US should pull from the East US replica, every time, without ever hopping to a more distant replica because of a network performance fluctuation. Predictable routing — workloads that need to know exactly which replica will serve them, for benchmarking, capacity planning, or in-region traffic SLAs. Push/pull consistency — pinning both ends of a publish-then-deploy flow to the same replica eliminates eventual-consistency races. Troubleshooting — reproducing an issue on a specific replica requires sending traffic to that specific replica. Client-side failover — customers with their own health checks and business rules want to implement failover on their own terms, on signals only they can see. Enable regional endpoints on the registry: az acr update -n myregistry -g myrg --regional-endpoints enabled When enabled, ACR automatically creates per-region login server URLs for every existing geo-replica. No per-region configuration is needed. Note: Regional endpoints can be enabled on any Premium SKU registry, even without geo-replication. A registry without geo-replication has a single geo-replica in the home region, which gets one regional endpoint URL. However, the feature is most useful when your registry has at least two geo-replicas, where you can pin different workloads to different replicas for routing control and capacity distribution. Push to a specific region using its regional endpoint: # Log in to the West US regional endpoint az acr login --name myregistry --endpoint westus # Tag and push using the regional endpoint URL docker tag myapp:v1 myregistry.westus.geo.azurecr.io/myapp:v1 docker push myregistry.westus.geo.azurecr.io/myapp:v1 Pin AKS deployments to their co-located replica by using regional endpoint URLs in the deployment manifest. The example below shows two clusters in different regions; each cluster references the regional endpoint for its own region's replica (assuming replicas exist in both eastus and westeurope ): # East US-based AKS cluster pulls from the East US replica apiVersion: apps/v1 kind: Deployment metadata: name: myapp-eastus spec: template: spec: containers: - name: myapp image: myregistry.eastus.geo.azurecr.io/myapp:v1 --- # West Europe-based AKS cluster pulls from the West Europe replica apiVersion: apps/v1 kind: Deployment metadata: name: myapp-westeurope spec: template: spec: containers: - name: myapp image: myregistry.westeurope.geo.azurecr.io/myapp:v1 This eliminates cross-region pulls when global routing would otherwise prefer a different replica for a given client, and it gives you a per-region traffic profile you can plan capacity against. Regional endpoint operational tips View all endpoints. Use az acr show-endpoints to see all endpoint URLs for your registry — global, regional (if enabled), and dedicated data endpoints (if enabled): az acr show-endpoints --name myregistry --resource-group myrg Import from a specific geo-replica. When importing images between registries, you can use a regional endpoint to import from a specific geo-replica of the source registry. This is useful when you want predictable network paths or need to import from a replica in a specific region: az acr import \ --name mydownstreamregistry \ --source myupstreamregistry.westeurope.geo.azurecr.io/myapp:v1 \ --image myapp:v1 Firewall rules for regional endpoints. If you use firewall rules, allow access to the following endpoints for each geo-replica that clients connect to: Endpoint Purpose myregistry.<region>.geo.azurecr.io Regional endpoint for registry operations myregistry.azurecr.io Global endpoint (if also used) myregistry.<region>.data.azurecr.io Layer downloads (if using private endpoints or dedicated data endpoints) *.blob.core.windows.net Layer downloads (if not using private endpoints or dedicated data endpoints) For the full list of endpoint types and FQDN patterns, see the ACR reference for various registry endpoints. DNS-based routing without changing manifests. If you don't want to maintain different deployment manifests per region, you can keep all manifests pointing to the global endpoint ( myregistry.azurecr.io ) and use software-defined networking or a regional traffic manager to resolve the global endpoint to the appropriate regional endpoint based on the originating region's traffic. This achieves the same co-location goals as regional endpoints — predictable routing and reduced latency — without embedding region-specific URLs in your deployment manifests. Step 2b: Keep using the global endpoint for everything else For workloads that don't need explicit pinning, do nothing. The global endpoint at myregistry.azurecr.io continues to work exactly as before, and the global endpoint plus health-aware failover gives you intelligent routing across replicas without configuration. ACR picks the best replica for each client based on network performance and reroutes during regional incidents. Regional endpoints coexist with the global endpoint — enabling them does not disable myregistry.azurecr.io . You can use both simultaneously and choose per workload, mixing pinned workloads (Step 2a) with workloads that ride the global endpoint (Step 2b) in the same registry. Step 3: Take a replica out of global endpoint routing Use this when you need to keep a replica alive but stop it from serving global-endpoint traffic — for DR rehearsals, planned maintenance, or troubleshooting an isolated replica. # Exclude the West US replica from global endpoint routing az acr replication update --registry myregistry --name westus \ --global-endpoint-routing false Confirm the change: az acr replication list --registry myregistry --output table NAME LOCATION PROVISIONING STATE STATUS REGION ENDPOINT ENABLED ------ ---------- -------------------- -------- ----------------------- eastus eastus Succeeded online True westus westus Succeeded online False Requests to myregistry.azurecr.io no longer route to West US. The replica still receives replicated content — and continues to replicate its own content out to other replicas — and storage quota and per-replica costs continue to accrue. If regional endpoints are enabled, the West US regional endpoint URL also continues to work; --global-endpoint-routing controls only the replica's participation in global endpoint routing. A note on naming. The CLI flag --global-endpoint-routing (on az acr replication update ) and the regional endpoints feature (enabled via az acr update --regional-endpoints enabled ) are two different things despite the similar names. --global-endpoint-routing controls whether a replica participates in global endpoint routing. The regional endpoints feature creates per-region URLs ( myregistry.<region>.geo.azurecr.io ) that bypass the global endpoint entirely. They are independent controls. In Azure CLI 2.86.0 and later, the old --region-endpoint-enabled flag has been renamed to --global-endpoint-routing . The old flag name is deprecated and will be removed in Azure CLI 2.87.0 (June 2026). If you have existing scripts or automation that use --region-endpoint-enabled , update them to use --global-endpoint-routing . CLI flags quick reference: Flag Scope Purpose --regional-endpoints Registry-level ( az acr create or az acr update ) Enables dedicated regional endpoint URLs ( myregistry.<region>.geo.azurecr.io ) for all geo-replicas. --global-endpoint-routing Per-geo-replica ( az acr replication create or az acr replication update ) Controls whether the global endpoint routes traffic to a specific geo-replica. Set to false to temporarily exclude a geo-replica from global routing. --data-endpoint-enabled Registry-level ( az acr create or az acr update ) Enables dedicated data endpoints ( myregistry.<region>.data.azurecr.io ) for layer blob downloads. Auto-enabled when at least one private endpoint is configured. This bidirectional sync during disable is intentional. When you re-enable the replica, every image pushed to the registry while the replica was disabled — from any region — is already present, so the replica can serve traffic immediately with no catch-up window. If we stopped syncing on disable, re-enabling would leave the replica with stale data and force a long catch-up before it could safely serve pulls. Step 4: Re-enable the replica to participate in global endpoint routing Re-enable the replica: az acr replication update --registry myregistry --name westus \ --global-endpoint-routing true NAME LOCATION PROVISIONING STATE STATUS REGION ENDPOINT ENABLED ------ ---------- -------------------- -------- ----------------------- eastus eastus Succeeded online True westus westus Succeeded online True There is no cooldown. The global endpoint resumes routing requests to the West US replica as soon as the change takes effect on ACR's side. Because data continued syncing while the replica was disabled (Step 3), the replica is immediately ready to serve pulls — no catch-up window. Note on DNS during disable/enable. When you take a replica out of global routing, ACR purges its own DNS records for that replica from the global endpoint on a fast path — there is no waiting on a published TTL on ACR's side. If clients run their own DNS cache for the global endpoint, however, those clients will keep resolving to the disabled replica until the client cache expires. We can't control client-side caches. The recommendation: do not run a long-lived DNS cache for the global endpoint. A short-lived DNS pin for the duration of a single push (covered in the DNS and Client-Side Considerations section) is fine and even helpful — but a long-lived DNS cache will make --global-endpoint-routing false look broken from the client's perspective. Step 5: What to expect when health-aware failover triggers Health-aware failover is automatic. ACR evaluates registry health on a per-registry basis, and when a registry in a region can't reliably serve requests, the global endpoint reroutes that registry's traffic to a healthy replica. There is no customer-invocable trigger — that's the point. End-to-end timing is on the order of minutes — fast enough to catch real regional degradation, slow enough to ride out transient errors that resolve on their own. DNS TTL may add additional propagation delay before all clients switch to the new region. Scope of health-aware failover. Health-aware failover applies only to operations against the global endpoint — the registry API calls (auth, get manifest, get tag, get referrers, get blob location). It evaluates health when those API calls come in; it does not trigger mid-operation. Two important consequences: Regional endpoints are not in scope. When you talk to a regional endpoint like myregistry.westus.geo.azurecr.io , you're talking to that one replica. There is no automatic reroute. If you've pinned a workload to a regional endpoint and that region degrades, you implement client-side failover by switching the workload to a different regional endpoint. Dedicated data endpoints are not in scope. Once a registry endpoint has redirected you to a dedicated data endpoint, you stay on that region's data endpoint for the duration of the layer download. There is no automatic reroute of an in-flight blob download. The region targeted by the redirect is decided up front by whichever registry endpoint served the blob-location call: the global endpoint chooses based on its per-registry health evaluation, and a regional endpoint always targets its own region. The signals you can use to confirm a failover is in progress: # Check replication status az acr replication list --registry myregistry --output table You can also check Resource Health for the registry in the Azure portal — navigate to your registry and select Resource health under the Help section to see platform-side degradation signals. You'll typically see: Increased pull latency as traffic shifts to a more distant replica Resource Health flagging known issues in the affected region Replication status indicating which replicas are online After the region recovers, the per-registry health evaluation marks it healthy again and the global endpoint resumes routing — automatic, no cooldown, no customer action. Note that health is evaluated per registry, not per region: if a degradation affects only a subset of registries in a region, only those registries are rerouted, and other registries in the same region continue to be served locally with no unnecessary latency penalty. Not triggered by throttling. Health-aware failover is DNS-based and responds to regional ACR service health and Azure infrastructure health. It does not reroute traffic based on HTTP 429 (throttling) responses. If a geo-replica is throttling your requests but the region's infrastructure is healthy, the global endpoint continues routing you to that geo-replica. To manage throttling, use regional endpoints to spread workloads across multiple geo-replicas for better capacity distribution. Note on long-running pushes during a failover. A multi-layer push that spans a failover boundary can land layers and the manifest on different replicas — exactly the failure mode that DNS bouncing produces during a single push. ACR is actively tightening health-aware failover behavior to minimize cross-replica scatter during these scenarios, and the recommendation today remains: pin pushes to a single replica via a regional endpoint when push/pull consistency matters. Common Questions Q1. Performance impact during initial replica creation on a live registry Because ACR is multi-region, multi-write, the existing replica continues serving pull and push traffic throughout the period when a new replica is being seeded. Replication is asynchronous and content propagates in the background; the time to populate a new geo-replica scales with the size of the registry — the cumulative number and total size of images already in the registry — not with any single image. The docs do not publish a quantified degradation percentage or a throttling window for this period, and they do not promise zero performance impact — the safe operating assumption for a live production registry is that existing replicas continue serving traffic normally, with the new replica catching up in the background. Q2. Restricted/updating state during initial sync There is no "restricted" state for the registry during normal replica creation. Writes, control-plane operations, and pushes/pulls against existing replicas continue normally. The only time configuration changes are unavailable is during a home region outage — see the relevant FAQ item later on for the full data-plane-versus-control-plane breakdown. Q3. Cooldown periods and non-straightforward failback scenarios There is no cooldown before failback, manual or automatic. Re-enabling a replica's participation in global endpoint routing takes effect immediately on ACR's side. Health-aware failover returns traffic to a region as soon as its per-registry health evaluation passes again. The failback case that is not seamless: if a recently pushed image has not yet replicated to the failover region, a pull from that region may not find the image until replication catches up. This is a function of eventual consistency, not failback timing — and it's part of a broader class of issues we cover in Q4. Q4. Common pull and push failure modes during the eventual-consistency window DNS bouncing during a single push is one well-known problem, but it isn't the only one. The eventual-consistency window between geo-replicas surfaces in several recurring failure modes worth knowing about: Push-then-immediate-pull-cross-region. Pushing myapp:v1 to one region and immediately pulling it from a different region can fail with manifest unknown until replication catches up. This shows up most painfully in CI/CD pipelines where one CI runner pushes an image and thousands of pods across other regions all try to pull from their local geo-replicas at the same time. Today, customers work around this with indeterminate sleeps before scheduling expensive compute, or with retry logic, or by waiting on a replication-complete signal — none of which is a clean planning story. Tag overwrite races. Pushing myapp:v1 , then re-pushing myapp:v1 shortly after with a fix (same tag, different digest), can leave different replicas resolving the same tag to different digests during the eventual-consistency window. Delete propagation. Deleting a tag or repository in one region takes some time to propagate to other replicas. Pulls from regions where the delete hasn't yet propagated can return the supposedly-deleted content. Mid-push failover scatter. A multi-layer push that spans a health-aware failover boundary or a DNS bouncing event can land layers on one replica and the manifest on another, surfacing as manifest validation errors or blob unknown on subsequent pulls. What ACR is doing about this. We're working on bounded staleness consistency for pushed images across all geo-replicas worldwide, which addresses these four failure modes directly. This will be covered in an upcoming blog post. If you're hitting eventual-consistency brittleness today and want to talk through your scenario, reach out to us on the Azure Container Registry GitHub repository — we want the customer signal to land in the design. Mitigations available today: Pin pushes to a single replica via a regional endpoint. Every sub-request in the push — login, blob uploads, manifest upload — goes to the same replica, eliminating the DNS bouncing and mid-push scatter classes entirely. Use a short-lived client-side DNS cache like dnsmasq scoped to the duration of a single push, only when you're not using regional endpoints. Do not run a long-lived DNS cache for the global endpoint — it interferes with --global-endpoint-routing false and with health-aware failover routing. Build retry logic into pulls that immediately follow a cross-region push. Either retry with backoff or check replication status with ACR webhooks before pulling. ACR can detect and notify you when an image or tag is available for pull in a geo-replica (say geo-replica B), after it has been pushed to another geo-replica (geo-replica A) and background replication has succeeded to geo-replica B. Design publish steps to be idempotent so retries triggered by mid-push failover are safe. Q5. Auth behavior across endpoint switches For safety, treat each global endpoint and each regional endpoint as its own authenticated surface. All registry APIs except the actual blob downloads (auth, manifests, tag resolution, referrers) flow through whichever endpoint you've chosen. If you switch from the global endpoint to a regional endpoint, or from one regional endpoint to another, re-authenticate. That means az acr login , fresh SDK auth, or — for AKS — letting the Kubernetes ACR credential provider handle re-auth, which it does automatically when the endpoint changes. Q6. Throttling under failover and pinning Throttling limits on registry API operations are per-replica, not per-registry. This has two operational implications: During health-aware failover, traffic that was spread across replicas can shift heavily onto whichever replicas remain in the global endpoint's routing pool. Capacity plan to spread traffic across two or three healthy replicas during a failover scenario rather than concentrating onto one — the global endpoint's routing already does this for you when multiple healthy replicas exist, but registries with only two regions configured can hit per-replica limits more easily during a failover. To mitigate, use regional endpoints to spread workloads across multiple geo-replicas and plan per-replica capacity. When pinning via regional endpoints (Step 2a), you concentrate traffic on whichever replica you've pinned to. If you've pinned all your AKS clusters to a single regional endpoint, you may hit that replica's per-region throttling limits at peak. Mitigations: pin different workloads to different regional endpoints across multiple regions for better topology mapping and capacity distribution, or use the global endpoint (Step 2b) for workloads where you don't need explicit pinning so ACR's routing can spread load. We're also working on improving the throttling metrics surfaced during health-aware failover events. Note: Health-aware failover does not reroute traffic based on HTTP 429 (throttling). If you're experiencing throttling but the region's infrastructure is healthy, the global endpoint continues routing you there. Use regional endpoints to explicitly spread load across replicas for capacity planning. Q7. Home region outage scope Geo-replication provides high availability for the data plane. During a home region outage, the control plane is unavailable, which means you can't create or delete replicas, modify network rules, or change replication settings until the home region recovers. ACR Tasks are also bound to the home region and don't run while it's unavailable. The data plane keeps working: Global endpoint continues routing pulls and pushes to healthy replicas. Regional endpoints continue working — you talk directly to specific replicas, and your client-side logic decides which region to use. Authentication, manifests, blob downloads, webhooks continue functioning through any healthy replica. The home region of a registry is fixed at creation and cannot be changed afterward. Microsoft's registry relocation guidance describes a redeployment procedure — creating a new registry in a different region — not an in-place change to an existing registry's home region. Note: If your registry uses a customer-managed key, review the key vault failover and redundancy guidance for maximum resilience. Key vault availability directly affects the registry's ability to encrypt and decrypt data. Q8. Webhooks during failover Webhooks fire from the replica that received the push. Because ACR also replicates content to other geo-replicas, webhooks fire from each geo-replica as the image syncs to it — so a single push results in webhook events from the receiving replica plus an event from each replica as replication completes. During a failover where pushes are routed to a different region, webhooks from those pushes fire from the new region; once the original region recovers and replication catches up, webhook events fire from there too. Webhook consumers should be designed to handle multiple events per pushed image and deduplicate as needed. Q9. Private endpoints with regional endpoints and dedicated data endpoints When a private endpoint is created against a registry, the private endpoint covers all of the registry's endpoint surfaces — the global endpoint, every regional endpoint (if regional endpoints are enabled), and every regional dedicated data endpoint. A single private endpoint in one VNet can reach the global endpoint (which routes you to a suitable replica), any regional endpoint in the same or a different region, and any region's dedicated data endpoint for blob downloads. The trade-off is private IP allocation: each endpoint surface consumes IPs in the VNet. With many replicas plus regional endpoints plus dedicated data endpoints all enabled, private endpoint creation can fail if the VNet runs out of available private IPs. IP address consumption per feature: Configuration IPs consumed per VNet Initial private endpoint (global endpoint + home region dedicated data endpoint) 2 Each geo-replication region added +1 (regional dedicated data endpoint) Regional endpoints enabled +1 per geo-replica Example: A registry with 3 geo-replicas and regional endpoints enabled consumes 7 private IPs per VNet: 1 (global) + 3 (data) + 3 (regional). Without regional endpoints, the same registry requires 4 private IPs: 1 (global) + 3 (data). Subnet sizing: Use at minimum a /27 (32 addresses) subnet for PE subnets on geo-replicated registries, and /24 where possible. To check how many private IPs are already consumed on a subnet: az network vnet subnet show \ --name <subnet-name> \ --vnet-name <vnet-name> \ --resource-group <resource-group> \ --query "{addressPrefix:addressPrefix, usedIPs:length(ipConfigurations || \`[]\`)}" \ --output table See the ACR private endpoints documentation for the full IP-allocation math and sizing guidance. Q10. Geo-replica creation stuck for private endpoint-enabled registries When creating a geo-replica for a registry that has private endpoints configured, the replica provisioning can get stuck in a Creating state if the identity performing the operation doesn't have sufficient permissions to create private endpoint networking resources. Solution: Manually delete the geo-replica that got stuck in the provisioning state. Ensure the identity has the permission Microsoft.Network/privateEndpoints/privateLinkServiceProxies/write before creating the geo-replica again. Also verify that every PE subnet connected to the registry has free IP capacity — if any PE subnet across any connected VNet does not have enough free IPs, the replication provisioning fails and rolls back. The replica appears briefly in a Creating state and then is removed. The resulting error does not identify which subnet or VNet is exhausted. Q11. Metrics, logs, and alerts for the three phases We map each phase to the signals available in the Monitoring Guidance section below. The headline: Resource Health (in the Azure portal) and az acr replication list give you the platform-side signals; Azure Monitor platform metrics are collected automatically, and resource logs require Diagnostic Settings to be enabled on the customer side. Behavior summary Scenario Automatic? Customer Action Required Notes Registry in a region degrades Yes None Health-aware failover; per-registry; minutes-scale; global endpoint operations only Region recovers after a degradation event Yes None No cooldown Pin AKS clusters to co-located replicas No Use regional endpoint URLs in deployment manifests (Step 2a) Coexists with global endpoint No pinning needed for most workloads Yes None — keep using myregistry.azurecr.io (Step 2b) Global endpoint plus health-aware failover Push/pull from the same replica (consistency) No Use a regional endpoint for both push and pull Eliminates DNS bouncing and mid-push scatter Capacity planning per region No Spread workloads across multiple regional endpoints Per-replica throttling; avoid concentrating on one replica DR rehearsal: take a replica out of global routing No az acr replication update --global-endpoint-routing false Data continues syncing both directions; costs continue accruing Re-enable replica participation in global routing No az acr replication update --global-endpoint-routing true No cooldown; replica is immediately ready Switch a workload between endpoints No Re-auth ( az acr login , SDK auth, or Kubernetes ACR credential provider) Each endpoint is its own authenticated surface Initial replica seeding on a live registry N/A None Existing replica continues serving traffic; seeding time scales with registry size Long-running push during a failover No Retry; design publishes to be idempotent Pin via regional endpoint to avoid mid-push scatter; ACR is tightening this behavior Pull of a recently pushed image from a different region No Wait for replication, retry with backoff, or check replication status Eventual consistency; bounded staleness consistency in development Home region outage Data plane: yes; control plane: no Use global or regional endpoints for data plane operations Control plane (replica config, network rules) requires home region DNS and Client-Side Considerations DNS bouncing during a single push is the most common geo-replication push problem in customer threads, and it warrants a section of its own. The failure mode. A docker push is a sequence of HTTP requests: blob uploads for each layer, then a manifest upload that references those layers by digest. If the Linux DNS resolver on the client doesn't cache myregistry.azurecr.io consistently for the duration of the push, individual sub-requests can resolve to different replicas. Because replication is eventually consistent, the manifest can land on a replica that doesn't yet have the layers it references, and the manifest validation fails. The two mitigations: Regional endpoints pin the push to a single replica end-to-end. Every sub-request — login, blob uploads, manifest upload — goes to the same replica. This is the cleanest fix and the one we recommend for any pipeline where push/pull consistency matters. A short-lived client-side DNS cache like dnsmasq scoped to the duration of a single push. For Linux VMs in Azure, follow the DNS name resolution options guidance. The pin should last the push and no longer. For other clients performing pushes, you can customize your stack's DNS resolver to have a similar short-lived DNS cache to pin the global endpoint's resolved DNS for only the duration of an image push operation. A note on long-lived DNS caching for the global endpoint. Don't run a long-lived DNS cache for myregistry.azurecr.io . ACR purges its own DNS records on the server side when a replica is taken out of global routing (Step 3) and during health-aware failover; a long-lived client-side cache will keep clients pointed at the old region after our purge, which makes both the manual disable mechanism and health-aware failover look broken from the client's perspective. Retry behavior: In-flight pushes during a failover may fail. Design publish steps to be idempotent so retries are safe. Pipelines that push in one region and immediately pull from a different region should retry with backoff or check replication status — eventual consistency means the pull may race ahead of replication. ACR is working on bounded staleness consistency that addresses this directly by enabling proxying (on ACR infrastructure) an image pull request from one geo-replica (if it does not have the image) to another geo-replica that has the image; see the relevant FAQ item. Note: Specific retry counts, back-off intervals, and push timeout values are application-layer decisions. The platform behavior is documented; the retry policy belongs to your client. Monitoring Guidance We map the three phases to the signals available from each source. Where a signal requires customer-side configuration, we flag it. Phase A: Initial replication (after creating a new replica) az acr replication list and az acr replication show — confirm the new replica reaches provisioningState: Succeeded and status: online , and view per-replica status. Azure Monitor platform metrics — push count, pull count, and other registry metrics are collected automatically and visible in the Azure portal under Metrics. No customer configuration is needed to view platform metrics. To export metrics or enable resource logs (detailed operation logs), configure Diagnostic Settings on the registry. Phase B: Failover (planned via replica disable, or automatic via health-aware failover) Per-replica regionEndpointEnabled state via az acr replication list — confirms whether a manual disable took effect, i.e. which replicas are currently eligible for global endpoint routing. Note: this flag reflects the manual configuration for configuring a geo-replica's global endpoint routing eligibility; it does not indicate whether health-aware failover has actively rerouted traffic away from a replica. Resource Health for the registry (in the Azure portal under Help > Resource health) — surfaces platform-side degradation signals during incidents. ACR does not yet expose a definitive "this region is currently serving your traffic" signal; Resource Health and client-side latency changes are the best available indicators. Pull latency from clients — increased latency from a more distant replica is the client-observable signal that traffic has rerouted. Azure Monitor platform metrics — visible per-region in the Azure portal Metrics blade. To export metrics or query them programmatically, enable Diagnostic Settings. Phase C: Failback (replica returns to global routing) az acr replication list — confirms regionEndpointEnabled: True (manual) or online status across all replicas (automatic). Pull latency normalizing as clients reach the recovered replica again. Resource Health clearing for the registry (visible in the Azure portal). Note: The health-aware failover blog calls out ongoing work to surface richer signals — including notifications for when routing changes and which region is currently serving your traffic. The signals listed above are what's available today. Pricing Considerations Storage billing vs. storage quota: Storage is billed per geo-replica — a 1 GiB image replicated to 5 geo-replicas is charged as 5 GiB of storage (1 GiB × 5 geo-replicas). However, storage quota (the tier's maximum storage limit) counts the image only once — the same 1 GiB image counts as 1 GiB toward your tier's maximum, not 5 GiB. Data transfer: Geo-replication can reduce costs by enabling in-region image pushes and pulls, which avoids cross-region data transfer charges during these push or pull operations. However, cross-region data transfer charges still apply when ACR replicates pushed content to other geo-replicas as part of eventual consistency. Disabled replicas still cost: When you take a replica out of global routing with --global-endpoint-routing false , storage and per-replica costs continue accruing because data continues syncing bidirectionally. For more information, see ACR pricing. Cleanup Run these commands to undo the walkthrough setup. Order matters: disable regional endpoints before deleting replicas, since regional endpoint URLs depend on which replicas exist. # Disable regional endpoints if you enabled them in Step 2a az acr update -n myregistry -g myrg --regional-endpoints disabled # Re-enable any replicas you disabled in Step 3 (no-op if already enabled) az acr replication update --registry myregistry --name westus \ --global-endpoint-routing true # Delete the West US replica created in Step 1 az acr replication delete --registry myregistry --name westus # Confirm only the home region replica remains az acr replication list --registry myregistry --output table Note: Replica deletion is a control-plane operation that requires the home region to be available. During a home region outage, replica configuration cannot be modified. Summary Table Question Answer When should I use regional endpoints vs the global endpoint? Use regional endpoints (Step 2a) for workloads that need affinity, predictable routing, push/pull consistency, troubleshooting, or client-side failover. Use the global endpoint (Step 2b) for everything else and let health-aware failover handle routing. What should I enable for secure, resilient layer downloads? Enable dedicated data endpoints. They scope firewall rules tightly to your registry and replace wildcard storage DNS with predictable per-region FQDNs. How do I avoid DNS-bouncing manifest validation failures on push? Pin pushes to a single replica via a regional endpoint. A short-lived client-side dnsmasq for the push duration is also fine if you're not using regional endpoints. Should I run a long-lived DNS cache for the global endpoint? No. ACR purges DNS server-side on disable and during failover; client-side caching works against that. Do I need to re-auth when switching endpoints? Yes. Each global or regional endpoint is its own authenticated surface. az acr login , SDK auth, or the Kubernetes ACR credential provider handles the re-auth. What happens during a home region outage? Data plane keeps working through any replica via the global endpoint or regional endpoints. Control plane operations (replica configuration, network rules) are unavailable until the home region recovers. The home region is fixed at registry creation. What's ACR doing about eventual-consistency pain? Bounded staleness consistency for cross-replica pushed images is in development and will be covered in an upcoming blog post. Reach out via GitHub if you want to share your scenario. For the full automation matrix — what's automatic, what requires customer action, and what to expect for each scenario — see the behavior summary above. If you have further questions about ACR geo-replication routing, pinning, capacity planning, eventual consistency, or failover behavior, reach out to us on the Azure Container Registry GitHub repository or file feedback through the Azure portal.215Views0likes0CommentsNow in preview: built-in MCP for Azure App Service
At Build 2026 last week, we announced the public preview of built-in MCP for Azure App Service. It does one thing, and it does it with almost no effort on your part: it turns a REST API you already host on App Service into a Model Context Protocol (MCP) server, so AI agents and assistants can call your API as a set of tools. No MCP code to write. No second service to deploy. What it does You give App Service an OpenAPI 3.x specification (JSON or YAML) describing the operations you want to expose. The platform reads that spec and generates one MCP tool per operation, then serves the MCP endpoint over streamable HTTP at a path you choose (the default is /mcp). From there, App Service handles the parts that are tedious to build yourself: MCP protocol negotiation Tool discovery, so clients can list the operations your spec exposes Hot reload of the spec when it changes Client cancellation Any MCP-compatible client can connect, including GitHub Copilot Chat, Cursor, Windsurf, and Claude Desktop. Why it matters Most teams already have the API the agent ecosystem wants to call. What they don't have is the time to wrap it in a bespoke MCP server, keep that server in sync with the API, and operate it. Built-in MCP removes that work entirely. If your REST API runs on App Service and you can produce an OpenAPI spec for it — and most web frameworks generate one for you — you're a configuration change away from an agent-ready endpoint. Built-in or custom? Built-in MCP is the fastest path when your tools map cleanly to REST operations. If you need behavior that doesn't — multi-step workflows, in-memory aggregation, MCP resources or prompts, or more than one MCP server on a single app — a custom MCP server built with an MCP SDK and deployed as your application code is still the right choice. The two approaches complement each other, and you can read more about choosing between them in the docs. Security Built-in MCP works with App Service Authentication, so MCP requests go through the same identity checks as every other route on your app, using Microsoft Entra or any OpenID Connect provider you've configured. When App Service Authentication is enabled, the platform also publishes OAuth protected-resource metadata so MCP clients can complete the OAuth flow automatically. As always, your application code is responsible for validating the bearer token on each request — and you should avoid exposing an MCP server publicly without authentication, since every published tool becomes callable once a client connects. Getting started Built-in MCP is configured through the aiIntegration property on your App Service app, and the preview supports three configuration paths: the Azure portal, the Azure CLI (az rest), and Bicep. It runs on dedicated pricing tiers, Basic or higher — it isn't supported on Free, Shared, Consumption, or Flex Consumption plans. This is a preview, and we'd love your feedback as you try it. To enable built-in MCP on your own app and connect an MCP client, head to the docs: Configure App Service built-in MCP (preview) Use App Service as a Model Context Protocol (MCP) server345Views0likes0CommentsClosing the AI-readiness gap with agentic modernization
Legacy debt is widening the AI-readiness gap Legacy systems and mounting tech debt aren't just slowing your AI agenda — they're quietly stealing its potential. Aging architectures, and complex, decades-old applications, databases, and infrastructure weren't designed for high performance, complex, dynamically scaling agentic workloads. The longer legacy lasts, the wider the gap between AI ambition and AI-readiness. This year at Microsoft Build 2026, we're taking our biggest step yet toward closing it. In a recent Forrester study, 94% of IT leaders ranked modernization as a top priority for their AI strategy, yet only 43% of their portfolios have been modernized on average, and only 32% are AI-ready.¹ Ambition for AI adoption is at a high, yet most are held back by the underlying legacy code, technical debt, and modernization backlog. Forrester’s data underscores it: on average, 35% of modernization projects stall due to legacy constraints, 65% cite security and compliance as the top challenge, 58% are held back by the complexity of monolithic applications, and 59% struggle with finding skilled talent to execute.¹ The result: AI initiatives that stall before they ever reach production. Modernization is a key step to move towards AI production; and it’s typically easier said than done. The growing problem is how to execute at the pace and scale that AI now demands. That’s why IT operators, architects, application owners, and developers are turning to agents to eliminate legacy toil, connect workstreams across teams, and scale their modernization efforts while customizing how they modernize. The first agentic end-to-end modernization solution that unifies IT and developer workflows Azure Copilot migration agent and GitHub Copilot modernization agent create the first agentic, end-to-end modernization solution that unifies IT and developer workflows— helping organizations close the AI-readiness gap by connecting discovery, assessment, planning, code transformation, governance, deployment, and observability into one continuous system. Built into the tools IT teams and developers already use, the solution combines estate-scale planning with GitHub-native execution, application-aware migration, broad workload coverage across apps, infrastructure, and data, and enterprise-grade privacy, security, and flexibility— so modernization becomes a governed, scalable, portfolio-level capability across teams, rather than a series of one-off projects. Building the estate-wide modernization plan Azure Copilot migration agent (public preview) brings AI to every step of estate modernization planning - from discovery and assessment to dependency mapping, ROI analysis, and wave planning - reducing months of manual analysis to minutes. For organizations with a clear picture of their estate, the migration agent accelerates the path from inventory to wave plan. For organizations that don’t, the migration agent helps them build that picture from scratch: what is running, what depends on what, what to move, what to modernize, what to retire, and in what order. By creating business-goal oriented estate plans, generating ROI analysis in minutes, and aligning IT and development teams through connected workflows, it helps enterprises move mission critical applications, databases, and infrastructure onto Azure faster and more confidently, with a continuous, data-driven view of what to modernizing next. Freeing teams from the legacy tax GitHub Copilot modernization agent, now generally available, empowers application owners, architects, and developers to scale modernization across their entire application portfolio. Operated from the CLI, the modernization agent acts as an orchestrator that simultaneously: Assesses readiness across multiple applications at once Plans application-specific modernization journeys and executes the identified migration tasks Surfaces deep code and dependency-level insights and recommendations Automates upgrades for Java and .NET applications Recommends Azure services aligned to organizational standards With its native design into GitHub Copilot, the modernization agent creates issues, pull requests, and shareable assessment reports for each application as it works. Architects and application owners retain visibility and governance from a single view, while developers receive clear, prioritized work they can execute from the agent or finish directly in their preferred editor. Behind the scenes, the modernization agent coordinates with GitHub Copilot's coding agent to complete tasks asynchronously across repositories, with a full monitoring and audit trail in GitHub's Agent HQ. The result is a connected planning-to-execution flow that finally makes modernization at scale possible, without sacrificing oversight or control. In just a few months, the modernization agent has already accelerated modernization up to 4x faster across hundreds of thousands of legacy .NET and Java applications at hundreds of customers. Enterprise level customization Every application is built and operated as uniquely as the business it serves. The path to modernization must be equally unique: tailored to each application's architecture, dependencies, and intent. At Build, we're excited to announce the general availability of custom skills for the modernization agent. Custom skills (GA) let developers teach the modernization agent how their organization works by encoding proprietary patterns, libraries, Azure best practices, and migration approaches once, then reusing them across every run. Each skill is authored as a skill.md file with build instructions, sample usage, reference APIs, and more, and is built on open-standard agent skills so teams aren't locked into a proprietary format. With custom skills, developers can equip the modernization agent with: Business-specific context, knowledge, intent, and migration approaches for application-aware guidance Centralized skills library to reuse and repeat tasks across the portfolio Full traceability for every skill used in generating the modernization plan The result is portfolio-scale execution with application-level specificity, in the same agentic workflow. And because skills live in a shared library, teams can reuse and repeat for faster, more consistent modernization outcomes aligned with the application's goals. Innovating while closing the AI-readiness gap GitHub Copilot is already dramatically reducing technical debt in real world environments, helping to close the AI readiness gap and, more importantly, innovate faster with AI. Organizations that adopt agentic modernization can not only close their AI-readiness gap, they can also make modernization a continuous process, allowing them to more readily integrate AI into existing business applications and services Ready to reimagine your applications? Join us at Microsoft Build this year, online or in person, to see our product teams reimagine applications live with GitHub Copilot modernization, share customer success, and empower you to modernize with confidence in days, not months. Join online or in person for Build session BRK220 on Wednesday 9AM PST Learn more about GitHub Copilot modernization: aka.ms/ghcp-modernization Dive deeper at the virtual .NET Agentic Modernization Day on June 16 th : aka.ms/dotnetday/rsvp ¹ Forrester’s Q1 2026 Cloud and AI Application Modernization Survey [E-66670]1.1KViews1like0CommentsFrom "Maybe Next Quarter" to "Running Before Lunch" on Container Apps - Modernizing Legacy .NET App
In early 2025, we wanted to modernize Jon Galloway's MVC Music Store - a classic ASP.NET MVC 5 app running on .NET Framework 4.8 with Entity Framework 6. The goal was straightforward: address vulnerabilities, enable managed identity, and deploy to Azure Container Apps and Azure SQL. No more plaintext connection strings. No more passwords in config files. We hit a wall immediately. Entity Framework on .NET Framework did not support Azure.Identity or DefaultAzureCredential. We just could not add a NuGet package and call it done - we’d need EF Core, which means modern .NET - and rewriting the data layer, the identity system, the startup pipeline, the views. The engineering team estimated one week of dedicated developer work. As a product manager without extensive .NET modernization experience, I wasn't able to complete it quickly on my own, so the project was placed in the backlog. This was before the GitHub Copilot "Agent" mode, the GitHub Copilot app modernization (a specialized agent with skills for modernization) existed but only offered assessment - it could tell you what needed to change, but couldn't make the end to end changes for you. Fast-forward one year. The full modernization agent is available. I sat down with the same app and the same goal. A few hours later, it was running on .NET 10 on Azure Container Apps with managed identity, Key Vault integration, and zero plaintext credentials. Thank you GitHub Copilot app modernization! And while we were on it – GitHub Copilot helped to modernize the experience as well, built more tests and generated more synthetic data for testing. Why Azure Container Apps? Azure Container Apps is an ideal deployment target for this modernized MVC Music Store application because it provides a serverless, fully managed container hosting environment. It abstracts away infrastructure management while natively supporting the key security and operational features this project required. It pairs naturally with infrastructure-as-code deployments, and its per-second billing on a consumption plan keeps costs minimal for a lightweight web app like this, eliminating the overhead of managing Kubernetes clusters while still giving you the container portability that modern .NET apps benefit from. That is why I asked Copilot to modernize to Azure Container Apps - here's how it went - Phase 1: Assessment GitHub Copilot App Modernization started by analyzing the codebase and producing a detailed assessment: Framework gap analysis - .NET Framework 4.0 → .NET 10, identifying every breaking change Dependency inventory - Entity Framework 6 (not EF Core), MVC 5 references, System.Web dependencies Security findings - plaintext SQL connection strings in Web.config, no managed identity support API surface changes - Global.asax → Program.cs minimal hosting, System.Web.Mvc → Microsoft.AspNetCore.Mvc The assessment is not a generic checklist. It reads your code - your controllers, your DbContext, your views - and maps a concrete modernization path. For this app, the key finding was clear: EF 6 on .NET Framework cannot support DefaultAzureCredential. The entire data layer needs to move to EF Core on modern .NET to unlock passwordless authentication. Phase 2: Code & Dependency Modernization This is where last year's experience ended and this year's began. The agent performed the actual modernization: Project structure: .csproj converted from legacy XML format to SDK-style targeting net10.0 Global.asax replaced with Program.cs using minimal hosting packages.config → NuGet PackageReference entries Data layer (the hard part): Entity Framework 6 → EF Core with Microsoft.EntityFrameworkCore.SqlServer DbContext rewritten with OnModelCreating fluent configuration System.Data.Entity → Microsoft.EntityFrameworkCore namespace throughout EF Core modernization generated from scratch Database seeding moved to a proper DbSeeder pattern with MigrateAsync() Identity: ASP.NET Membership → ASP.NET Core Identity with ApplicationUser, ApplicationDbContext Cookie authentication configured through ConfigureApplicationCookie Security (the whole trigger for this modernization): Azure.Identity + DefaultAzureCredential integrated in Program.cs Azure Key Vault configuration provider added via Azure.Extensions.AspNetCore.Configuration.Secrets Connection strings use Authentication=Active Directory Default — no passwords anywhere Application Insights wired through OpenTelemetry Views: Razor views updated from MVC 5 helpers to ASP.NET Core Tag Helpers and conventions _Layout.cshtml and all partials migrated The code changes touched every layer of the application. This is not a find-and-replace - it's a structural rewrite that maintains functional equivalence. Phase 3: Local Testing After modernization, the app builds, runs locally, and connects to a local SQL Server (or SQL in a container). EF Core modernizations apply cleanly, the seed data loads, and you can browse albums, add to cart, and check out. The identity system works. The Key Vault integration gracefully skips when KeyVaultName isn't configured - meaning local dev and Azure use the same Program.cs with zero code branches. Phase 4: AZD UP and Deployment to Azure The agent also generates the deployment infrastructure: azure.yaml - AZD service definition pointing to the Dockerfile, targeting Azure Container Apps Dockerfile - Multi-stage build using mcr.microsoft.com/dotnet/sdk:10.0 and aspnet:10.0 infra/main.bicep - Full IaaC including: Azure Container Apps with system + user-assigned managed identity Azure SQL Server with Azure AD-only authentication (no SQL auth) Azure Key Vault with RBAC, Secrets Officer role for the managed identity Container Registry with ACR Pull role assignment Application Insights + Log Analytics All connection strings injected as Container App secrets — using Active Directory Default, not passwords One command: AZD UP Provisions everything, builds the container, pushes to ACR, deploys to Container Apps. The app starts, runs MigrateAsync() on first boot, seeds the database, and serves traffic. Managed identity handles all auth to SQL and Key Vault. No credentials stored anywhere. What Changed in a Year Early 2025 Now Assessment Available Available Automated code modernization Semi-manual ✅ Full modernization agent Infrastructure generation Semi-manual ✅ Bicep + AZD generated Time to complete Weeks ✅ Hours The technology didn't just improve incrementally. The gap between "assessment" and "done" collapsed. A year ago, knowing what to do and being able to do it were very different things. Now they're the same step. Who This Is For If you have a .NET Framework app sitting on a backlog because "the modernization is too expensive" - revisit that assumption. The process changed. GitHub Copilot app modernization helps you rewrite your data layer, generates your infrastructure, and gets you to azd up. It can help you generate tests to increase your code coverage. If you have some feature requests or if you want to further optimize the code for scale - bring your requirements or logs or profile traces, you can take care of all of that during the modernization process. MVC Music Store went from .NET Framework 4.0 with Entity Framework 6 and plaintext SQL credentials to .NET 10 on Azure Container Apps with managed identity, Key Vault, and zero secrets in code. In an afternoon. That backlog item might be a lunch break now 😊. Really. Find your legacy apps and try it yourself. Next steps Modernize your .Net or Java apps with GitHub Copilot app modernization – https://aka.ms/ghcp-appmod Open your legacy application in Visual Studio or Visual Studio Code to start the process Deploy to Azure Container Apps https://aka.ms/aca/start498Views0likes1CommentAnnouncing Public Preview of Argo CD extension in AKS Azure Portal Experience
We are excited to announce the public preview of Argo CD in the Azure Portal for Azure Kubernetes Service. As GitOps becomes the standard for deploying and operating applications at scale, customers need a way to adopt GitOps with simpler onboarding, secure defaults, and integrated workflows. With Argo CD now available directly in the Portal, teams can enable and manage GitOps without the complexity of manual setup. Bringing GitOps into the AKS experience Argo CD is widely used across Kubernetes environments, but setup often requires manual configuration across identity, networking, and registry integrations. With the Azure Portal experience, customers can: Enable Argo CD directly from the AKS cluster Configure identity, access, ingress, and registry integration in a guided flow Manage and monitor GitOps workflows through Argo CD UI This reduces onboarding friction and helps you reach your first successful GitOps deployment faster. Trusted identity and secure access The Argo CD experience integrates with Microsoft Entra ID to provide a secure, enterprise-ready foundation: Secure authentication using Workload Identity federation to Azure Container Registry (ACR) and Azure DevOps, removing long-lived credentials and hard-coded secrets Single Sign-On (SSO) using existing Azure identities Enterprise-grade hardening and security This preview includes built-in improvements to strengthen security posture: Images built on Azure Linux for reduced CVEs and improved baseline security Optional automatic patch updates to stay current while maintaining control over change management Parity with upstream Argo CD Argo CD in AKS remains aligned with the upstream open-source project, supporting: High availability (HA) configurations for production workloads Hub-and-spoke architectures for multi-cluster GitOps Application and ApplicationSet for scalable deployment across fleets Getting Started We invite you to explore the Argo CD experience in the Azure Portal and share feedback. To get started, go to your AKS cluster in the Azure Portal, navigate to the GitOps experience, and select Enable Argo CD. Follow the guided setup to configure identity, access, ingress, and registry integration with secure defaults. Once enabled, you can monitor your deployment and view application health and sync status from the Argo CD UI linked in the GitOps blade. For customers who prefer automation and scripting, the Argo CD extension is also available via Azure CLI public preview. NOTE: You can choose between Flux and Argo CD as your GitOps solution based on your needs. The Argo CD option is available during the initial GitOps setup experience, while existing Flux users will continue to see their current configuration.526Views0likes0CommentsDon’t Start Your Application Upgrade by Changing Code
The first step of a .NET upgrade isn't code, it's assessment! That sounds obvious, but it isn't. When somebody tells you to upgrade an app, the instinct is to open the repo, bump the TargetFramework, update a few NuGet packages, and start chasing build errors, that feels like progress, but you're working without knowing what you signed up for. As we wrote at our previous posts, the Copilot modernization agent runs in three steps: assess, plan, execute. At each step it writes a Markdown file you can read and edit before it moves forward, and this time we will deep dive to learn about the assessment step! Look at the app before you touch it Jeff Fritz recorded a walkthrough of the Copilot modernization agent assessing the eShop reference app, upgrading from .NET 8 to .NET 10. The interesting part isn't that it upgrades the code. It's that it doesn't, not at first. The agent runs an assessment pass first, reading the solution, walks the dependency graph, and writes a Markdown report you can actually open and read. No commits, no `.csproj` edits. Just a file. What's in that report? In the eShop run: 10 projects that all need to move. 24 NuGet packages requiring updates, 5 of them incompatible. 52 APIs with behavioral changes that'll need testing. And every project marked "low difficulty," which tells you this particular upgrade is an afternoon, not a sprint, Knowing which project is going to be the hard one before you start changes how you plan the whole upgrade. That's what the assessment actually produced on a real codebase, and you can see it generated in about minutes with the agent running in autopilot mode in VS Code. The order matters The workflow the agent encourages is straightforward, first, assess the app as it is today, plan the upgrade based on what assessment found, lastly, implement the changes to modernize the app. The order is the whole point, if you jump straight to step three you discover the dependency conflicts and the missing test coverage live, while you're already mid-upgrade. By the time you find them, you're committed. You've already burned the easy rollback. The point to do the assessment is to move all of that to the front, when it's still cheap to change your mind. The Markdown report isn't meant to sit there untouched. You can open it, edit it, add notes, mark the projects you know are fragile, assign owners, flag the services that have a release freeze next month. Turn it into something your team will actually use to run the upgrade. Remember that Copilot doesn't know that the payments service is owned by a team that's mid-reorg, or that the reporting module is the one the CFO looks at on Monday mornings. That context still has to come from you. When you move to the planning step, the plan builds on whatever you left in that assessment. If you added constraints or flagged a project as "don't touch until Q3," that's what Copilot works from. You're setting direction, not just reviewing output. Watch it in action If you want to see the full assessment flow running on eShop, Fritz's video walks through it start to finish in VS Code with autopilot mode enabled: Modernize .NET Apps with GitHub Copilot! Test with just one app you've been putting off, run the assessment, and read the Markdown file it gives you, edit the parts that need editing, and hand it around to the two or three people who actually know that codebase. See what they push back on.514Views0likes0CommentsPerformance Tuning and Scaling Optimization for Large-Scale Azure Workloads
Summary As cloud-native systems scale, performance challenges rarely stem from a single bottleneck. Instead, they emerge from the interaction between compute, orchestration, and data layers under load. This article captures a practical optimization journey of a high-volume Azure-based workload and highlights how controlled scaling, improved orchestration design, and proactive database maintenance can significantly outperform brute-force scaling. Introduction Distributed systems are often designed with the assumption that scaling out will solve performance issues. However, for orchestration-heavy and database-intensive workloads, this approach can introduce more problems than it solves. In this scenario, the system processed millions of transactional records through Azure Functions, Durable Functions, messaging pipelines, APIs, and SQL databases. As the workload grew, the platform began experiencing: CPU and memory spikes Slower SQL queries Service Bus throttling Increased retries and execution delays What stood out was that these issues were not due to insufficient resources, but due to inefficient execution patterns at scale. The optimization effort therefore focused on controlling how the system scaled and executed, rather than simply increasing capacity. Understanding Workload Behavior A critical early step was identifying the nature of the workload—specifically, whether it was CPU-heavy or data-heavy. Rethinking Scaling: More Is Not Always Better One of the most important lessons was that scaling out aggressively can degrade performance. As more function instances processed messages in parallel: Database calls increased sharply API traffic surged Lock contention intensified Retry rates increased This created a cascading effect where retries amplified load, further slowing down the system. To address this, scaling was intentionally controlled using: Concurrency limits on function execution Batch-based processing instead of full parallel fan-out Small delays to smooth traffic spikes Chunking of large datasets into manageable units This shift from maximum parallelism to controlled throughput significantly improved system stability. Compute Optimization: CPU and Memory After stabilizing scaling behavior, the next step was optimizing compute usage. CPU Optimization CPU spikes were largely caused by excessive parallel execution and orchestration overhead. Improvements included: Breaking large workloads into smaller units Reducing unnecessary fan-outs of processes Limiting concurrent executions This resulted in more predictable CPU usage and improved execution consistency. Memory Optimization Memory pressure was primarily driven by large payloads and batch processing. Optimizations focused on: Processing data in smaller chunks Avoiding large in-memory payloads and memory leaks Reducing orchestration state size These changes improved system reliability and reduced execution failures under load. Scaling Approaches: Practical Trade-Offs Both vertical and horizontal scaling were used, but with careful consideration. Scale Up (Vertical Scaling) Quick to implement No architectural changes required Useful for immediate stabilization However, it had cost and scalability limits. Scale Out (Horizontal Scaling) Better suited for long-term scalability Enables workload distribution But without control, it can: Increase database contention Amplify retries Introduce instability Key Insight The most effective approach was not choosing one over the other but combining both with strict control over concurrency and execution patterns. Durable Functions: Orchestration Optimization Durable Functions were central to the system, making orchestration design a key factor in performance. Challenges Observed The initial design relied heavily on nested sub-orchestrators, which introduced: High orchestration overhead Increased replay and persistence operations Slower execution at scale Key Improvements Refactoring unnecessary sub-orchestrators into Activity Functions simplified execution and improved throughput. The benefits included: Reduced orchestration latency Faster execution cycles Lower infrastructure cost Note: However, sub-orchestrators remain the right choice when the design requires composing multiple dependent steps, managing scoped retry/error logic, or isolating orchestration history. The decision should be driven by the complexity and reuse requirements of each workflow segment and not applied as a blanket rule. Improved Retry Strategy Retry behavior was also optimized by redefining execution boundaries. Previously: One activity processed multiple records A single failure triggered a retry of the entire batch After optimization: One activity handled one logical unit of work This enabled: Granular retries Better failure isolation Reduced duplicate processing Database Hygiene: A Critical Foundation The database emerged as a major bottleneck due to fragmentation and stale statistics caused by continuous high-volume operations. Issues Identified Fragmented indexes Inefficient query plans Increased query execution time Optimization Approach A proactive maintenance strategy was implemented using scheduled jobs to: Update statistics regularly Rebuild indexes Maintain query performance consistency Controlled Database Load For heavy long-running workloads in multi-tenant architecture, execution of DB intensive process was intentionally run in singleton fashion at a tenant level to reduce contention. This approach: Prevented concurrent heavy operations Improved overall system stability Delivered more predictable throughput Observability: Finding the Real Problem A major challenge during optimization was distinguishing between symptoms and root causes. For example: Slow APIs were often caused by database contention High retries were triggered by upstream throttling Orchestration delays originated from downstream dependencies To address this, end-to-end observability was established using: Application-level tracing Load testing correlations Cross-service telemetry analysis This enabled accurate root cause identification and prevented misdirected optimization efforts. Key Takeaways Some key principles emerged from this optimization journey: Scaling more does not always mean performing better Controlled parallelism is more effective than unrestricted concurrency Orchestration design directly impacts system performance Database maintenance must be proactive Retry strategies should align with logical units of work Observability is essential for correct diagnosis Conclusion Performance tuning in distributed systems is less about adding resources and more about using them efficiently. By focusing on controlled scaling, simplifying orchestration, maintaining database health, and improving observability, the system achieved higher throughput, lower cost, and significantly improved stability. These lessons are broadly applicable to any Azure-based system handling large-scale, orchestration-heavy workloads and can help teams design more predictable and resilient architectures.716Views5likes0CommentsExplaining what GitHub Copilot Modernization can (and cannot do)
In the last post, we looked at the workflow: assess, plan, execute. You get reports you can review and the agent makes changes you can inspect. If you don’t know, GitHub Copilot Modernization is the new agentic tool that supports you to in modernizing older applications. Could it support you with that old 4.8 Framework app, even that forgotten VB.NET script? You're probably not modernizing one small app. It is probably a handful of projects, each with its own stack of blockers. Different frameworks, different databases, different dependencies frozen in time because nobody wants to touch them. GitHub Copilot modernization handles two big categories: upgrading .NET projects to newer versions and migrating .NET apps to Azure. But what does that look like? Upgrading .NET Projects Let’s say, you've got an ASP.NET app running on .NET Framework 4.8 or it's a web API stuck on .NET Core 3.1. Unfortunately, getting it to .NET 9 or 10 isn't just updating a target framework property. Here's what the upgrade workflow handles in Visual Studio: Assessment first. - The agent examines your project structure, dependencies, and code patterns. It generates an Assessment Report, which shows both the app information, to create the plan, and shows what it needs to do and update. Then planning. - Once you approve the assessment, it moves to planning. Here you get upgrade strategies, refactoring approaches, dependency upgrade paths, and risk mitigations documented in a plan.md file at .appmod/.migration, you can check and edit that Markdown before moving forward or ask in the Copilot Chat window to change it. # .NET 10.0 Upgrade Plan ## Execution Steps Execute steps below sequentially one by one in the order they are listed. 1. Validate that a .NET 10.0 SDK required for this upgrade is installed on the machine and if not, help to get it installed. 2. Ensure that the SDK version specified in global.json files is compatible with the .NET 10.0 upgrade. 3. Upgrade src\eShopLite.StoreFx\eShopLite.StoreFx.csproj ## Settings This section contains settings and data used by execution steps. ### Excluded projects No projects are excluded from this upgrade. ### Aggregate NuGet packages modifications across all projects NuGet packages used across all selected projects or their dependencies that need version update in projects that reference them Then execution. - After you approve the plan, and the agent breaks it into discrete tasks in a tasks.md file. Each task gets validation criteria. As it works, it updates the file with checkboxes and completion percentages so you can track progress. It makes code changes, verifies builds, runs tests. If it hits a problem, it tries to identify the cause and apply a fix. Go to the GitHub Copilot Chat window and type: The plan and progress tracker look good to me. Go ahead with the migration. It usually creates Git commits for each portion so you can review what changed or roll back if you need to. In case you don’t have a need for the Git commits for the change, you can ask the agent at the start to not commit anything. The agent primarily focuses on ASP.NET, ASP.NET Core, Blazor, Razor Pages, MVC, and Web API. It can also handle Azure Functions, WPF, Windows Forms, console apps, class libraries, and test projects. What It Handles Well (and What It Doesn't) The agent is good at code-level transformations: updating TargetFramework in .csproj files, upgrading NuGet packages, replacing deprecated APIs with their modern equivalents, fixing breaking changes like removed BinaryFormatter methods, running builds, and validating test suites. It can handle repetitive work across multiple projects in a solution without you needing to track every dependency manually. It's also solid at applying predefined Azure migration patterns, swapping plaintext credentials for managed identity, replacing file I/O with Azure Blob Storage calls, moving authentication from on-prem Active Directory to Microsoft Entra ID. These are structured transformations with clear before-and-after code patterns. But here's where you may need to pay closer attention: Language and framework coverage: It works with C# projects mainly. If your codebase includes complex Entity Framework migrations that rely on hand-tuned database scripts, the agent won't rewrite those for you. It also won't handle third-party UI framework patterns that don't map cleanly to ASP.NET Core conventions that have breaking changes between .NET Framework and later .NET versions. Web Forms migration is underway. Configuration and infrastructure: The agent doesn't migrate IIS-specific web.config settings that don't have direct equivalents in Kestrel or ASP.NET Core. It won't automatically set up a CI/CD pipeline or any modernization features; for that, you need to implement it with Copilot’s help. If you've got frontend frameworks bundled with ASP.NET (like an older Angular app served through MVC), you'll need to separate and upgrade that layer yourself. Learning and memory: The agent uses your code as context during the session, and if you correct a fix or update the plan, it tries to apply that learning within the same session. But those corrections don't persist across future upgrades. You can encode internal standards using custom skills, but that requires deliberate setup. Offline and deployment: There's no offline mode. The agent needs connectivity to run. And while it can help prepare your app for Azure deployment, it doesn't manage the actual infrastructure provisioning or ongoing operations, that's still on you. Guarantees: The suggestions aren't guaranteed to follow best practices. The agent won't always pick the best migration path. It won't catch every edge case. You're reviewing the work; pay attention to the results before putting it into production. What it does handle: the tedious parts. Reading dependency graphs. Finding all the places a deprecated API is used. Updating project files. Writing boilerplate for managed identity. Fixing compilation errors that follow a predictable pattern. Where to Start If you've been staring at a modernization backlog, pick one project. See what it comes up with! You don't have to commit to upgrading your entire portfolio. Try it on one project and see if it saves you time. Modernization at scale still happens application by application, repo by repo, and decision by decision. GitHub Copilot modernization just makes each one a little less painful. Experiment with it!1.3KViews0likes0Comments