Agentic Power for AKS: Introducing the Agentic CLI in Public Preview
We are excited to announce the agentic CLI for AKS, available now in public preview directly through the Azure CLI. A huge thank you to all our private preview customers who took the time to try out our beta releases and provide feedback to our team. The agentic CLI is now available for everyone to try. Continue reading to learn how you can get started.

Why we built the agentic CLI for AKS

The way we build software is changing with the democratization of coding agents. We believe the same should happen for how users manage their Kubernetes environments. With this feature, we want to simplify the management and troubleshooting of AKS clusters, while reducing the barrier to entry for startups and developers by bridging the knowledge gap. The agentic CLI for AKS is designed to simplify this experience by bringing agentic capabilities to your cluster operations and observability, translating natural language into actionable guidance and analysis. Whether you need to right-size your infrastructure, troubleshoot complex networking issues like DNS or outbound connectivity, or ensure smooth K8s upgrades, the agentic CLI helps you make informed decisions quickly and confidently. Our goal: streamline cluster operations and empower teams to ask questions like "Why is my pod restarting?" or "How can I optimize my cluster for cost?" and get instant, actionable answers.

The agentic CLI for AKS is built on the open-source HolmesGPT project, which has recently been accepted as a CNCF Sandbox project. With a pluggable LLM endpoint structure and open-source backing, the agentic CLI is purpose-built for customizability and data privacy.

From private to public preview: what's new?

Earlier this year, we launched the agentic CLI in private beta for a small group of AKS customers. Their feedback has shaped what's new in our public preview release, which we are excited to share with the broader AKS community.
Let's dig in:

- Simplified setup: One-time initialization of LLM parameters with 'az aks agent-init'. Configure your LLM parameters such as API key and model through a simple, guided user interface.
- AKS MCP integration: Enable the agent to install and run the AKS MCP server locally (directly in your CLI client) for advanced context-aware operations. The AKS MCP server includes tools for AKS clusters and associated Azure resources. Try it out: az aks agent "list all my unhealthy nodepools" --aks-mcp -n <cluster-name> -g <resource-group>
- Deeper investigations: A new "Task List" feature helps the agent plan and execute complex investigations, with a checklist-style tracker that keeps you updated on the agent's progress and planned tool calls.
- In-line feedback: Share insights about the agent's performance directly from the CLI using /feedback. Provide a rating of the agent's analysis and optional written feedback directly to the agentic CLI team. Your feedback is highly appreciated and will help us improve the agentic CLI's capabilities.
- Performance and security improvements: Minor improvements for faster load times and reduced latency, as well as hardened initialization and token handling.

Getting Started

1. Install the extension: az extension add --name aks-agent
2. Set up your LLM endpoint: az aks agent-init
3. Start asking questions. Some recommended scenarios to try out:
   - Troubleshoot cluster health: az aks agent "Give me an overview of my cluster's health"
   - Right-size your cluster: az aks agent "How can I optimize my node pool for cost?"
   - Try out the AKS MCP integration: az aks agent "Show me CPU and memory usage trends" --aks-mcp -n <cluster-name> -g <resource-group>
   - Get upgrade guidance: az aks agent "What should I check before upgrading my AKS cluster?"
4. Update the agentic CLI extension: az extension update --name aks-agent

Join the Conversation

We'd love your feedback!
Use the built-in '/feedback' command or visit our GitHub repository to share ideas and issues.

Learn more: https://aka.ms/aks/agentic-cli
Share feedback: https://aka.ms/aks/agentic-cli/issues

Microsoft Azure at KubeCon North America 2025 | Atlanta, GA - Nov 10-13
KubeCon + CloudNativeCon North America is back - this time in Atlanta, Georgia, and the excitement is real. Whether you're a developer, operator, architect, or just Kubernetes-curious, Microsoft Azure is showing up with a packed agenda, hands-on demos, and plenty of ways to connect and learn with our team of experts. Read on for all the ways you can connect with our team!

Kick off with Azure Day with Kubernetes (Nov 10)

Before the main conference even starts, join us for Azure Day with Kubernetes on November 10. It's a full day of learning, best practices, deep-dive discussions, and hands-on labs, all designed to help you build cloud-native and AI apps with Kubernetes on Azure. You'll get to meet Microsoft experts, dive into technical sessions, and roll up your sleeves in the afternoon labs or have focused deep-dive discussions in our whiteboarding sessions. If you're looking to sharpen your skills or just want to chat with folks who live and breathe Kubernetes on Azure, this is the place to be. Spots are limited, so register today at: https://aka.ms/AzureKubernetesDay

Catch up with our experts at Booth #500

The Microsoft booth is more than just a spot to grab swag (though, yes, there will be swag and stickers!). It's a central hub for connecting with product teams, setting up meetings, and seeing live demos. Whether you want to learn how to troubleshoot Kubernetes with agentic AI tools, explore open-source projects, or just talk shop, you'll find plenty of friendly faces ready to help. We will be running a variety of theatre sessions and demos out of the booth all week on topics including AKS Automatic, agentic troubleshooting, Azure Verified Modules, networking, app modernization, hybrid deployments, storage, and more.

🔥 Hot tip: join us for our live Kubernetes Trivia Show at the Microsoft Azure booth during the KubeCrawl on Tuesday to win exclusive swag!
Microsoft sessions at KubeCon NA 2025

Here's a quick look at all the sessions with Microsoft speakers that you won't want to miss. Click the titles for full details and add them to your schedule!

Keynotes

Date: Thu November 13, 2025
Start Time: 9:49 AM
Room: Exhibit Hall B2
Title: Scaling Smarter: Simplifying Multicluster AI with KAITO and KubeFleet
Speaker: Jorge Palma
Abstract: As demand for AI workloads on Kubernetes grows, multicluster inferencing has emerged as a powerful yet complex architectural pattern. While multicluster support offers benefits in terms of geographic redundancy, data sovereignty, and resource optimization, it also introduces significant challenges around orchestration, traffic routing, cost control, and operational overhead. To address these challenges, we'll introduce two CNCF projects, KAITO and KubeFleet, that work together to simplify and optimize multicluster AI operations. KAITO provides a declarative framework for managing AI inference workflows with built-in support for model versioning and performance telemetry. KubeFleet complements this by enabling seamless workload distribution across clusters based on cost, latency, and availability. Together, these tools reduce operational complexity, improve cost efficiency, and ensure consistent performance at scale.

Date: Thu November 13, 2025
Start Time: 9:56 AM
Room: Exhibit Hall B2
Title: Cloud Native Back to the Future: The Road Ahead
Speakers: Jeremy Rickard (Microsoft), Alex Chircop (Akamai)
Abstract: The Cloud Native Computing Foundation (CNCF) turns 10 this year, now home to more than 200 projects across the cloud native landscape. As we look ahead, the community faces new demands around security, sustainability, complexity, and emerging workloads like AI inference and agents. As many areas of the ecosystem transition to mature foundational building blocks, we are excited to explore the next evolution of cloud native development.
The TOC will highlight how these challenges open opportunities to shape the next generation of applications and ensure the ecosystem continues to thrive. How are new projects addressing these emerging workloads? How will these new projects impact security hygiene in the ecosystem? How will existing projects adapt to meet new realities? How is the CNCF evolving to support this next generation of computing? Join us as we reflect on the first decade of cloud native, and look ahead to how this community will power the age of AI, intelligent systems, and beyond.

Featured Demo

Date: Wed November 12, 2025
Start Time: 2:15-2:35 PM
Room: Expo Demo Area
Title: HolmesGPT: Agentic K8s troubleshooting in your terminal
Speakers: Pavneet Singh Ahluwalia (Microsoft), Arik Alon (Robusta)
Abstract: Troubleshooting Kubernetes shouldn't require hopping across dashboards, logs, and docs. With open-source tools like HolmesGPT and the Model Context Protocol (MCP) server, you can now bring an agentic experience directly into your CLI. In this demo, we'll show how this OSS stack can run everywhere, from lightweight kind clusters on your laptop to production-grade clusters at scale. The experience supports any LLM provider: in-cluster, local, or cloud, ensuring data never leaves your environment and costs remain predictable. We will showcase how users can ask natural-language questions (e.g., "why is my pod Pending?") and get grounded reasoning, targeted diagnostics, and safe, human-in-the-loop remediation steps, all without leaving the terminal. Whether you're experimenting locally or running mission-critical workloads, you'll walk away knowing how to extend these OSS components to build your own agentic workflows in Kubernetes.
All sessions

Microsoft speaker(s) - Session:

- Will Case - No Kubectl, No Problem: The Future With Conversational Kubernetes
- Ana Maria Lopez Moreno - Smarter Together: Orchestrating Multi-Agent AI Systems With A2A and MCP on Container
- Neha Aggarwal - 10 Years of Cilium: Connecting, Securing, and Simplifying the Cloud Native Stack
- Yi Zha - Strengthening Supply Chain for Kubernetes: Cross-Cloud SLSA Attestation Verification
- Joaquim Rocha & Oleksandr Dubenko - Contribfest: Power up Your CNCF Tools With Headlamp
- Jeremy Rickard - Shaping LTS Together: What We've Learned the Hard Way
- Feynman Zhou - Shipping Secure, Reusable, and Composable Infrastructure as Code: GE HealthCare's Journey With ORAS
- Jackie Maertens & Nilekh Chaudhari - No Joke: Two Security Maintainers Walk Into a Cluster
- Paul Yu, Sachi Desai - Rage Against the Machine: Fighting AI Complexity with Kubernetes simplicity
- Dipti Pai - Flux - The GitLess GitOps Edition
- Trask Stalnaker - OpenTelemetry: Unpacking 2025, Charting 2026
- Mike Morris - Gateway API: Table Stakes
- Anish Ramasekar, Mo Khan, Stanislav Láznička, Rita Zhang & Peter Engelbert - Strengthening Kubernetes Trust: SIG Auth's Latest Security Enhancements
- Ernest Wong - AI Models Are Huge, but Your GPUs Aren't: Mastering multi-mode distributed inference on Kubernetes
- Rita Zhang - Navigating the Rapid Evolution of Large Model Inference: Where does Kubernetes fit?
- Suraj Deshmukh - LLMs on Kubernetes: Squeeze 5x GPU Efficiency with cache, route, repeat!
- Aman Singh - Drasi: A New Take on Change-driven Architectures
- Ganeshkumar Ashokavardhanan & Qinghui Zhuang - Agent-Driven MCP for AI Workloads on Kubernetes
- Steven Jin - Contribfest: From Farm (Fork) To Table (Feature): Growing Your First (Free-range Organic) Istio PR
- Jack Francis - SIG Autoscaling Projects Update
- Mark Rossetti - Kubernetes SIG-Windows Updates
- Apurup Chevuru & Michael Zappa - Portable MTLS for Kubernetes: A QUIC-Based Plugin Compatible With Any CNI
- Ciprian Hacman - The Next Decoupling: From Monolithic Cluster, To Control-Plane With Nodes
- Keith Mattix - Istio Project Updates: AI Inference, Ambient Multicluster & Default Deny
- Jonathan Smith - How Comcast Leverages Radius in Their Internal Developer Platform
- Jon Huhn - Lightning Talk: Getting (and Staying) up To Speed on DRA With the DRA Example Driver
- Rita Zhang, Jaydip Gabani - Open Policy Agent (OPA) Intro & Deep Dive
- Bridget Kromhout - SIG Cloud Provider Deep Dive: Expanding Our Mission
- Pavneet Ahluwalia - Beyond ChatOps: Agentic AI in Kubernetes—What Works, What Breaks, and What's Next
- Ryan Zhang - Finally, a Cluster Inventory I Can USE!
- Michael Katchinskiy, Yossi Weizman - You Deployed What?! Data-driven lesson on Unsafe Helm Chart Defaults
- Mauricio Vásquez Bernal & Jose Blanquicet - Contribfest: Inspektor Gadget Contribfest: Enhancing the Observability and Security of Your K8s Clusters Through an easy to use Framework
- Wei Fu - etcd V3.6 and Beyond + Etcd-operator Updates
- Jeremy Rickard - GitHub Actions: Project Usage and Deep Dive
- Dor Serero & Michael Katchinskiy - What Doesn't Kill You Makes You Stronger: The Vulnerabilities that Redefined Kubernetes Security

We can't wait to see you in Atlanta! Microsoft's presence is all about empowering developers and operators to build, secure, and scale modern applications. You'll see us leading sessions, sharing open-source contributions, and hosting roundtables on how cloud native powers AI in production.
We're here to learn from you, too - so bring your questions, ideas, and feedback.

Azure Container Registry Repository Permissions with Attribute-based Access Control (ABAC)
General Availability announcement

Today marks the general availability of Azure Container Registry (ACR) repository permissions with Microsoft Entra attribute-based access control (ABAC). ABAC augments the familiar Azure RBAC model with namespace- and repository-level conditions, so platform teams can express least-privilege access at the granularity of specific repositories or entire logical namespaces. This capability is designed for modern multi-tenant platform engineering patterns where a central registry serves many business domains. With ABAC, CI/CD systems and runtime consumers like Azure Kubernetes Service (AKS) clusters have least-privilege access to ACR registries.

Why this matters

Enterprises are converging on a central container registry pattern that hosts artifacts and container images for multiple business units and application domains. In this model:

- CI/CD pipelines from different parts of the business push container images and artifacts only to approved namespaces and repositories within a central registry.
- AKS clusters, Azure Container Apps (ACA), Azure Container Instances (ACI), and other consumers pull only from authorized repositories within a central registry.

With ABAC, these repository and namespace permission boundaries become explicit and enforceable using standard Microsoft Entra identities and role assignments. This aligns with cloud-native zero trust, supply chain hardening, and least-privilege permissions.

What ABAC in ACR means

ACR registries now support a registry permissions mode called "RBAC Registry + ABAC Repository Permissions." Configuring a registry to this mode makes it ABAC-enabled. When a registry is ABAC-enabled, registry administrators can optionally add ABAC conditions during standard Azure RBAC role assignments. These optional ABAC conditions scope the role assignment's effect to specific repositories or namespace prefixes.
ABAC can be enabled on all new and existing ACR registries across all SKUs, either during registry creation or by configuring an existing registry.

ABAC-enabled built-in roles

Once a registry is ABAC-enabled (configured to "RBAC Registry + ABAC Repository Permissions"), registry admins can use these ABAC-enabled built-in roles to grant repository-scoped permissions:

- Container Registry Repository Reader: grants image pull and metadata read permissions, including tag resolution and referrer discoverability.
- Container Registry Repository Writer: grants Repository Reader permissions, as well as image and tag push permissions.
- Container Registry Repository Contributor: grants Repository Reader and Writer permissions, as well as image and tag delete permissions.

Note that these roles do not grant repository list permissions. The separate Container Registry Repository Catalog Lister role must be assigned to grant repository list permissions. The Container Registry Repository Catalog Lister role does not support ABAC conditions; assigning it grants permission to list all repositories in a registry.

Important role behavior changes in ABAC mode

When a registry is ABAC-enabled by configuring its permissions mode to "RBAC Registry + ABAC Repository Permissions":

- Legacy data-plane roles such as AcrPull, AcrPush, and AcrDelete are not honored in ABAC-enabled registries. For ABAC-enabled registries, use the ABAC-enabled built-in roles listed above.
- Broad roles like Owner, Contributor, and Reader previously granted full control plane and data plane permissions, which is typically overprivileged. In ABAC-enabled registries, these broad roles grant only control plane permissions to the registry. They no longer grant data plane permissions, such as image push, pull, delete, or repository list permissions.
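As a sketch of what a repository-scoped role assignment can look like, the following uses `az role assignment create` with its `--condition` and `--condition-version` parameters. The principal ID, registry scope, and the `team-a/` namespace prefix are illustrative placeholders, and the exact condition attribute for repository names should be taken from the ACR ABAC documentation rather than copied from here:

```shell
# Hypothetical example: grant pull access only to repositories under the
# "team-a/" namespace of a central registry. The IDs and the condition
# attribute below are placeholders; consult the ACR ABAC docs for the
# exact condition expression.
az role assignment create \
  --assignee "<service-principal-or-managed-identity-object-id>" \
  --role "Container Registry Repository Reader" \
  --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.ContainerRegistry/registries/<registry-name>" \
  --condition "((@Request[Microsoft.ContainerRegistry/registries/repositories:name] StringStartsWith 'team-a/'))" \
  --condition-version "2.0"
```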
- ACR Tasks, Quick Tasks, Quick Builds, and Quick Runs no longer inherit default data-plane access to source registries; assign the ABAC-enabled roles above to the calling identity as needed.

Identities you can assign

ACR ABAC uses standard Microsoft Entra role assignments. Assign RBAC roles with optional ABAC conditions to users, groups, service principals, and managed identities, including AKS kubelet and workload identities, ACA and ACI identities, and more.

Next steps

Start using ABAC repository permissions in ACR to enforce least-privilege artifact push, pull, and delete boundaries across your CI/CD systems and container image workloads. This model is now the recommended approach for multi-tenant platform engineering patterns and central registry deployments. To get started, follow the step-by-step guides in the official ACR ABAC documentation: https://aka.ms/acr/auth/abac

Scaling Azure Functions Python with orjson
Azure Functions now supports orjson in the Python worker, giving developers an easy way to boost performance by simply adding the library to their environment. Benchmarks show that orjson delivers measurable gains in throughput and latency, with the biggest improvements on small-to-medium payloads common in real-world workloads. In tests, orjson improved throughput by up to 6% on 35 KB payloads and significantly reduced response times under load, while also eliminating dropped requests in high-throughput scenarios. With its Rust-based speed, standards compliance, and drop-in adoption, orjson offers a straightforward path to faster, more scalable Python Functions without any code changes.

Beyond the Desktop: The Future of Development with Microsoft Dev Box and GitHub Codespaces
The modern developer platform has already moved past the desktop. We're no longer defined by what's installed on our laptops; instead, we look at what tooling we can use to move from idea to production. An organisation's developer platform strategy is no longer a nice-to-have: it sets the ceiling for what's possible, and an organisation can't iterate its way to developer nirvana if the foundation itself is brittle. A great developer platform shrinks TTFC (time to first commit), accelerates release velocity, and, maybe most importantly, helps alleviate the everyday frictions that lead to developer burnout.

Very few platforms deliver everything an organisation needs from a developer platform in one product. Modern development spans multiple dimensions: local tooling, cloud infrastructure, compliance, security, cross-platform builds, collaboration, and rapid onboarding. The options organisations face are then to either compromise on one or more of these areas or force developers into rigid environments that slow productivity and innovation. This is where Microsoft Dev Box and GitHub Codespaces come into play. On their own, each addresses critical parts of the modern developer platform.

Microsoft Dev Box provides a full, managed cloud workstation. Dev Box gives developers a consistent, high-performance environment while letting central IT apply strict governance and control. Internally at Microsoft, we estimate that usage of Dev Box by our development teams delivers savings of 156 hours per year per developer purely on local environment setup and upkeep. We have also seen significant gains in other key SPACE metrics, reducing context-switching friction and improving build/test cycles. Although the benefits of Dev Box are clear in the results demonstrated by our customers, it is not without its challenges. The biggest challenge often faced by Dev Box customers is its lack of native Linux support.
At the time of writing, and for the foreseeable future, Dev Box does not support native Linux developer workstations. While WSL2 provides partial parity, I know from my own engineering projects that it still does not deliver the full experience. This is where GitHub Codespaces comes into the story.

GitHub Codespaces delivers instant, Linux-native environments spun up directly from your repository. It's lightweight, reproducible, and ephemeral: ideal for rapid iteration, PR testing, and cross-platform development where you need Linux parity or containerized workflows. Unlike Dev Box, Codespaces can run fully in Linux, giving developers access to native tools, scripts, and runtimes without workarounds. It also removes much of the friction around onboarding: a new developer can open a repository and be coding in minutes, with the exact environment defined by the project's devcontainer.json. That said, Codespaces isn't a complete replacement for a full workstation. While it's perfect for isolated project work or ephemeral testing, it doesn't provide the persistent, policy-controlled environment that enterprise teams often require for heavier workloads or complex toolchains.

Used together, they fill the gaps that neither can cover alone: Dev Box gives the enterprise-grade foundation, while Codespaces provides the agile, cross-platform sandbox. For organizations, this pairing sets a higher ceiling for developer productivity, delivering a truly hybrid, agile, and well-governed developer platform.

Better Together: Dev Box and GitHub Codespaces in action

Together, Microsoft Dev Box and GitHub Codespaces deliver a hybrid developer platform that combines consistency, speed, and flexibility. Teams can spin up full, policy-compliant Dev Box workstations preloaded with enterprise tooling, IDEs, and local testing infrastructure, while Codespaces provides ephemeral, Linux-native environments tailored to each project.
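The per-project Codespaces environment mentioned above is declared in a devcontainer.json checked into the repository. As a sketch only, a Python-plus-Node project like the IoT app in this walkthrough might use something along these lines (the image tag, feature versions, and port numbers are illustrative assumptions, not a prescribed setup):

```json
// .devcontainer/devcontainer.json -- illustrative values only
{
  "name": "iot-dashboard",
  "image": "mcr.microsoft.com/devcontainers/python:3.12",
  "features": {
    "ghcr.io/devcontainers/features/node:1": {},
    "ghcr.io/devcontainers/features/docker-in-docker:2": {}
  },
  "forwardPorts": [8000, 3000],
  "postCreateCommand": "pip install -r requirements.txt && npm install",
  "customizations": {
    "vscode": {
      "extensions": ["ms-python.python"]
    }
  }
}
```

Because this file lives in the repository, every Codespace (and every teammate's Codespace) starts from the same definition, which is what makes the onboarding experience reproducible.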
One of my favourite use cases is having local testing setups, like a Docker Swarm cluster, ready to go in either Dev Box or Codespaces. New developers can jump in and start running services or testing microservices immediately, without spending hours on environment setup. Anecdotally, my time to first commit and time to delivering impact have been significantly faster on projects where one or both technologies provide local development services out of the box. Switching between Dev Boxes and Codespaces is seamless: every environment keeps its own libraries, extensions, and settings intact, so developers can jump between projects without reconfiguring or breaking dependencies. The result is a turnkey, ready-to-code experience that maximizes productivity, reduces friction, and lets teams focus entirely on building, testing, and shipping software.

To showcase this value, let's walk through an example scenario that simulates a typical modern developer workflow: a day in the life of a developer on this hybrid platform building an IoT project using Python and React.

- Spin up a ready-to-go workstation (Dev Box) for Windows development and heavy builds.
- Launch a Linux-native Codespace for cross-platform services, ephemeral testing, and PR work.
- Run "local" testing, like a Docker Swarm cluster, database, and message queue, ready to go out of the box.
- Switch seamlessly between environments without losing project-specific configurations, libraries, or extensions.

9:00 AM – Morning Kickoff on Dev Box

I start my day on my Microsoft Dev Box, which gives me a fully configured Windows environment with VS Code, design tools, and Azure integrations. I select my team's project, and the environment is pre-configured for me through the Dev Box catalogue. Fortunately for me, it's already provisioned. I could always self-service another one using the "New Dev Box" button if I wanted to.
I'll connect through the browser, but I could use the desktop app too if I wanted to. My tasks:

- Prototype a new dashboard widget for monitoring IoT device temperature.
- Use GUI-based tools to tweak the UI and preview changes live.
- Review my Visio architecture.
- Join my morning stand-up.
- Write documentation notes and plan API interactions for the backend.

In a flash, I have access to my modern work tooling like Teams, this project's files are already preloaded, and all my peripherals work without additional setup. The only downside was that I did seem to be the only person on my stand-up this morning?

Why Dev Box first:

- GUI-heavy tasks are fast and responsive. Dev Box's environment allows me to use a full desktop.
- Great for early-stage design, planning, and visual work.
- Enterprise apps are ready for me to use out of the box (P.S. It also supports my multi-monitor setup).

I use my Dev Box to make a very complicated change to my IoT dashboard: changing the title from "IoT Dashboard" to "Owain's IoT Dashboard". I preview this change live in a browser. (Time for a coffee after this hard work.) The rest of the dashboard isn't loading as my backend isn't running... yet.

10:30 AM – Switching to Linux Codespaces

Once the UI is ready, I push the code to GitHub and spin up a Linux-native GitHub Codespace for backend development. Tasks:

- Implement FastAPI endpoints to support the new IoT feature.
- Run the service in my Codespace and debug any errors.

Why Codespaces now:

- Linux-native tools ensure compatibility with the production server.
- Docker and containerized testing run natively, avoiding WSL translation overhead.
- The environment is fully reproducible across any device I log in from.

12:30 PM – Midday Testing & Sync

I toggle between Dev Box and Codespaces to test and validate the integration. I do this in my Dev Box Edge browser, viewing my Codespace (I use my Codespace in a browser throughout this demo to highlight the difference in environments.
In reality, I would leverage the VS Code "Remote Explorer" extension and its GitHub Codespaces integration to use my Codespace from within my desktop VS Code, but that is personal preference) and I use the same browser to view my frontend preview. I update the environment variable for my frontend that is running locally on my Dev Box and point it at the port exposing my API on my Codespace; in this case it was a WebSocket connection and HTTPS calls to port 8000. I can make this public by changing the port visibility in my Codespace.

https://fluffy-invention-5x5wp656g4xcp6x9-8000.app.github.dev/api/devices
wss://fluffy-invention-5x5wp656g4xcp6x9-8000.app.github.dev/ws

This allows me to:

- Preview the frontend widget on Dev Box, connecting to the backend running in Codespaces.
- Make small frontend adjustments in Dev Box while monitoring backend logs in Codespaces.
- Commit changes to GitHub, keeping both environments in sync and leveraging my CI/CD for deployment to the next environment.

We can see the Dev Box running the local frontend and the Codespace running the API, connected to each other, making requests and displaying the data in the frontend!

Hybrid advantage:

- Dev Box handles GUI previews comfortably and allows me to live-test frontend changes.
- Codespaces handles production-aligned backend testing and Linux-native tools.
- Dev Box allows me to view all of my files on one screen, with potentially multiple Codespaces running in the browser or VS Code Desktop.

Because of all those platform efficiencies, I completed the day's goals within an hour or two, and now I can spend the rest of my day learning about how to enable my developers to inner-source using GitHub Copilot and MCP (shameless plug).

The bottom line

There are some additional considerations when architecting a developer platform for an enterprise, such as private networking and security, that are not covered in this post, but these are implementation details to deliver the described developer experience.
Architecting such a platform is a valuable investment to deliver the developer platform foundations we discussed at the top of the article. While in this demo I was working in a mono repository, in real engineering teams it is likely (I hope) that an application is built from many different repositories. The great thing about Dev Box and Codespaces is that this wouldn't slow down the rapid development I can achieve when using both. My Dev Box would be specific to the project or development team, preloaded with all the tools I need and potentially some repos too! When I need to, I can quickly switch over to Codespaces, work in a clean, isolated environment, and push my changes. In both cases, any changes I want to deliver are pushed into GitHub (or ADO) and merged, and my CI/CD ensures that the next step, potentially a staging environment or, who knows, perhaps *whispering* straight into production, is taken care of. Once I'm finished, I delete my Codespace, and potentially my Dev Box if I am done with the project, knowing I can self-service either one of these anytime and be up and running again!

Now, is there overlap in terms of what can be developed in a Codespace versus what can be developed in a Dev Box? Of course. But as organisations prioritise developer experience to ensure release velocity while maintaining organisational standards and governance, providing developers both a Windows-native and a Linux-native service, both of which are primarily charged on consumption of the compute*, is a no-brainer. There are also gaps that neither fills at the moment: Microsoft Dev Box only provides Windows compute, while GitHub Codespaces only supports VS Code as your chosen IDE. It's not a question of which service to choose for my developers; these two services are better together!

* Changes have been announced to Dev Box pricing. A W365 license is already required today, and dev boxes will continue to be managed through Azure.
For more information, please see: Microsoft Dev Box capabilities are coming to Windows 365 - Microsoft Dev Box | Microsoft Learn

Leveraging Low Priority Pods for Rapid Scaling in AKS
If you're running workloads in Kubernetes, you'll know that scalability is key to keeping things available and responsive. But there's a problem: when your cluster runs out of resources, the node autoscaler needs to spin up new nodes, and this takes anywhere from 5 to 10 minutes. That's a long time to wait when you're dealing with a traffic spike. One way to handle this is using low priority pods to create buffer nodes that can be preempted when your actual workloads need the resources.

The Problem

Cloud-native applications are dynamic, and workload demands can spike quickly. Automatic scaling helps, but the delay in scaling up nodes when you run out of capacity can leave you vulnerable, especially in production. When a cluster runs out of available nodes, the autoscaler provisions new ones, and during that 5-10 minute wait you're facing:

- Increased Latency: Users experience lag or downtime whilst they're waiting for resources to become available.
- Resource Starvation: High-priority workloads don't get the resources they need, leading to degraded performance or failed tasks.
- Operational Overhead: SREs end up manually intervening to manage resource loads, which takes them away from more important work.

This is enough reason to look at creating spare capacity in your cluster, and that's where low priority pods come in.

The Solution

The idea is pretty straightforward: you run low priority pods in your cluster that don't actually do any real work; they're just placeholders consuming resources. These pods are sized to take up enough space that the cluster autoscaler provisions additional nodes for them. Effectively, you're creating a buffer of "standby" nodes that are ready and waiting. When your real workloads need resources and the cluster is under pressure, Kubernetes kicks out these low priority pods to make room; this is called preemption. Essentially, Kubernetes looks at what's running, sees the low priority pods, and terminates them to free up the nodes.
This happens almost immediately, and your high-priority workloads can use that capacity straight away. Meanwhile, those evicted low priority pods sit in a pending state, which triggers the autoscaler to spin up new nodes to replace the buffer you just used. The whole thing is self-maintaining.

How Preemption Actually Works

When a high-priority pod needs to be scheduled but there aren't enough resources, the Kubernetes scheduler kicks off preemption. This happens almost instantly compared to the 5-10 minute wait for new nodes. Here's what happens:

1. Identification: The scheduler works out which low priority pods need to be evicted to make room. It picks the lowest priority pods first.
2. Graceful Termination: The selected pods get a termination signal (SIGTERM) and a grace period (usually 30 seconds by default) to shut down cleanly.
3. Resource Release: Once the low priority pods terminate, their resources are immediately released and available for scheduling. The high-priority pod can then be scheduled onto the node, typically within seconds.
4. Buffer Pod Rescheduling: After preemption, the evicted low priority pods try to reschedule. If there's capacity on existing nodes, they'll land there. If not, they'll sit in a pending state, which triggers the cluster autoscaler to provision new nodes.

This gives you a dual benefit: your critical workloads get immediate access to the nodes that were running low priority pods, and the system automatically replenishes the buffer in the background. Whilst your high-priority workloads are running on the newly freed capacity, the autoscaler is already provisioning replacement nodes for the evicted buffer pods. Your buffer capacity is continuously maintained without any manual work, so you're always ready for the next spike.

The key advantage here is speed. Whilst provisioning a new node takes 5-10 minutes, preempting a low priority pod and scheduling a high-priority pod in its place typically completes in under a minute.
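Preemption only works if your real workloads actually carry a higher priority than the buffer pods. As a sketch (the names and values here are illustrative assumptions, not from the article), you can pair the buffer's low PriorityClass with a higher one that your production workloads reference:

```yaml
# Hypothetical PriorityClass for real workloads. Any value above the buffer
# pods' priority (0 in this article's example) lets the scheduler preempt
# the buffer when capacity runs out.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: workload-high-priority   # illustrative name
value: 1000000
globalDefault: false
preemptionPolicy: PreemptLowerPriority  # the default behaviour, shown for clarity
description: "High priority for workloads that may preempt buffer pods"
---
# Reference it from your workload's pod spec:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app   # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      priorityClassName: workload-high-priority
      containers:
      - name: app
        image: nginx:1.27   # placeholder image
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
```

Apply both with `kubectl apply -f`. The scheduler compares the `value` fields at scheduling time, so the exact numbers don't matter as long as the workload's priority is strictly greater than the buffer's.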
Why This Approach Works Well

Now that you understand how the solution works, let's look at why it's effective:

- Immediate Resource Availability: You maintain a pool of ready nodes that can rapidly scale up when needed. There's always capacity available to handle sudden load spikes without waiting for new nodes.
- Seamless Scaling: High-priority workloads never face resource starvation, even during traffic surges. They get immediate access to capacity, whilst the buffer automatically replenishes itself in the background.
- Self-Maintaining: Once set up, the system handles everything automatically. You don't need to manually manage the buffer or intervene when workloads spike.

The Trade-Off

Whilst low priority pods offer significant advantages for keeping your cluster responsive, you need to understand the cost implications. By maintaining buffer nodes with low priority pods, you're running machines that aren't hosting active, productive workloads. You're paying for additional infrastructure just for availability and responsiveness. These buffer nodes consume compute resources you're paying for, even though they're only running placeholder workloads.

The decision for your organisation comes down to whether the improved responsiveness and elimination of that 5-10 minute scaling delay justifies the extra cost. For production environments with strict SLA requirements or where downtime is expensive, this trade-off is usually worth it. However, you'll want to carefully size your buffer capacity to balance cost with availability needs.

Setting It Up

Step 1: Define Your Low Priority Pod Configurations

Start by defining low priority pods using the PriorityClass resource. This is where you create configurations that designate certain workloads as low priority.
Here's what that configuration looks like:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 0
globalDefault: false
description: "Priority class for buffer pods"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: buffer-pods
  namespace: default
spec:
  replicas: 3  # Adjust based on how much buffer capacity you need
  selector:
    matchLabels:
      app: buffer
  template:
    metadata:
      labels:
        app: buffer
    spec:
      priorityClassName: low-priority
      containers:
      - name: buffer-container
        image: registry.k8s.io/pause:3.9  # Lightweight image that does nothing
        resources:
          requests:
            cpu: "1000m"   # Size these based on your typical workload needs
            memory: "2Gi"  # Large enough to trigger node creation
          limits:
            cpu: "1000m"
            memory: "2Gi"
```

The key things to note here:

- The PriorityClass has a value of 0. Note that pods without a PriorityClass also default to priority 0, so give your real workloads a PriorityClass with a higher value (1000+ is common) to guarantee they can preempt the buffer
- We're using a Deployment rather than individual pods so we can easily scale the buffer size
- The pause image is a minimal container that does basically nothing - perfect for a placeholder
- The resource requests are what matter - these determine how much space each buffer pod takes up
- You'll want to size the CPU and memory requests based on your actual workload needs

Step 2: Deploy the Low Priority Pods

Next, deploy these low priority pods across your cluster. Use affinity configurations to spread them out and let Kubernetes manage them.

Step 3: Monitor and Adjust

You'll want to monitor your deployment to make sure your buffer nodes are scaling up when needed and scaling down during idle periods to save costs. Tools like Prometheus and Grafana work well for monitoring resource usage and pod status so you can refine your setup over time.

Best Practices

Right-Sizing Your Buffer Pods: The resource requests for your low priority pods need careful thought. They need to be big enough to consume sufficient capacity that additional buffer nodes actually get provisioned by the autoscaler.
But they shouldn't be so large that you end up over-provisioning beyond your required buffer size. Think about your typical workload resource requirements and size your buffer pods to create exactly the number of standby nodes you need.

Regular Assessment: Keep assessing your scaling strategies and adjust based on what you're seeing with workload patterns and demands. Monitor how often your buffer pods are getting evicted and whether the buffer size makes sense for your traffic patterns.

Communication and Documentation: Make sure your team understands what low priority pods do in your deployment and what this means for your SLAs. Document the cost of running your buffer nodes and the justification for this overhead.

Automated Alerts: Set up alerts for when pod eviction happens so you can react quickly and make sure critical workloads aren't being affected. Also alert on buffer pod status to ensure your buffer capacity stays available.

Wrapping Up

Leveraging low priority pods to create buffer nodes is an effective way to handle resource constraints when you need rapid scaling and can't afford to wait for the node autoscaler. This approach is particularly valuable if you're dealing with workloads that experience sudden, unpredictable traffic spikes and need to scale up immediately - think scenarios like flash sales, breaking news events, or user-facing applications with strict SLA requirements.

However, this isn't a one-size-fits-all solution. If your workloads are fairly static or you can tolerate the 5-10 minute wait for new nodes to provision, you probably don't need this. The buffer comes at an additional cost since you're running nodes that aren't doing productive work, so you need to weigh whether the improved responsiveness justifies the extra spend for your specific use case. If you do decide this approach fits your needs, remember to keep monitoring and iterating on your configuration for the best resource management.
By maintaining a buffer of low priority pods, you can address resource scarcity before it becomes a problem, reduce latency, and provide a much better experience for your users. This approach will make your cluster more responsive and free up your operational capacity to focus on improving services instead of constantly firefighting resource issues.

Search Less, Build More: Inner Sourcing with GitHub CoPilot and ADO MCP Server
Developers burn cycles context-switching: opening five repos to find a logging example, searching a wiki for a data masking rule, scrolling chat history for the latest pipeline pattern. Organisations that I speak to are often on the path of transformational platform engineering projects but always have the fear or doubt of "what if my engineers don't use these resources". While projects like Backstage still play a pivotal role in inner sourcing and discoverability, I also empathise with developers who would argue "How would I even know, in the first place, which modules have or haven't been created for reuse". In this blog we explore how we can ensure organisational standards and developer satisfaction without any heavy lifting on either side: no custom model training, no rewriting or relocating of repositories, and no stagnant local data.

Using GitHub CoPilot + the Azure DevOps MCP server (with the free `code_search` extension) we turn the IDE into an organizational knowledge interface. Instead of guessing or re-implementing, engineers can start scaffolding projects or solving issues as they would normally (hopefully using CoPilot) and without extra prompting. GitHub CoPilot can lean into organisational standards and ensure recommendations are made with code snippets directly generated from existing examples.

What Is the Azure DevOps MCP Server + code_search Extension?

MCP (Model Context Protocol) is an open standard that lets agents (like GitHub Copilot) pull in structured, on-demand context from external systems. MCP servers contain natural language explanations of the tools that the agent can utilise, allowing dynamic decisions about when to use certain toolsets over others. The Azure DevOps MCP Server is the ADO Product Team's implementation of that standard. It exposes your ADO environment in a way CoPilot can consume. Out of the box it gives you access to:

- Projects – list and navigate across projects in your organization.
- Repositories – browse repos, branches, and files.
- Work items – surface user stories, bugs, or acceptance criteria.
- Wikis – pull policies, standards, and documentation.

This means CoPilot can ground its answers in live ADO content, instead of hallucinating or relying only on what's in the current editor window. The ADO server runs locally from your own machine to ensure that all sensitive project information remains within your secure network boundary. This also means that existing permissions on ADO objects such as Projects or Repositories are respected.

Wiki search tooling is available out of the box with the ADO MCP server and is very useful; however, if I am honest, I have seen these wikis go unused, with documentation being stored elsewhere - either inside the repository or in a project management tool. This means any tool that needs to implement code requires the ability to accurately search the code stored in the repositories themselves. That is where enabling the code_search extension in ADO is so important. Most organisations have this enabled already, however it is worth noting that this prerequisite is the real unlock of cross-repo search. This allows CoPilot to:

- Query for symbols, snippets, or keywords across all repos.
- Retrieve usage examples from code, not just docs.
- Locate standards (like logging wrappers or retry policies) wherever they live.
- Back every recommendation with specific source lines.

In short: MCP connects CoPilot to Azure DevOps. code_search makes that connection powerful by turning it into a discovery engine.

What is the relevance of CoPilot Instructions?

One of the less obvious but most powerful features of GitHub CoPilot is its ability to follow instructions files. CoPilot automatically looks for these files and uses them as a "playbook" for how it should behave. There are different types of instructions you can provide:

- Organisational instructions – apply across your entire workspace, regardless of which repo you're in.
- Repo-specific instructions – scoped to a particular repository, useful when one project has unique standards or patterns.
- Personal instructions – smaller overrides layered on top of global rules when a local exception applies. (Stored in .github/copilot-instructions.md)

In this solution, I'm using a single personal instructions file. It tells CoPilot:

- When to search (e.g., always query repos and wikis before answering a standards question).
- Where to look (Azure DevOps repos, wikis, and with code_search, the code itself).
- How to answer (responses must cite the repo/file/line or wiki page; if no source is found, say so).
- How to resolve conflicts (prefer dated wiki entries over older README fragments).

As a small example, a section of a CoPilot instruction file could look like this:

# GitHub Copilot Instructions for Azure DevOps MCP Integration

This project uses Azure DevOps with MCP server integration to provide organizational context awareness. Always check to see if the Azure DevOps MCP server has a tool relevant to the user's request.

## Core Principles

### 1. Azure DevOps Integration
- **Always prioritize Azure DevOps MCP tools** when users ask about:
  - Work items, stories, bugs, tasks
  - Pull requests and code reviews
  - Build pipelines and deployments
  - Repository operations and branch management
  - Wiki pages and documentation
  - Test plans and test cases
  - Project and team information

### 2. Organizational Context Awareness
- Before suggesting solutions, **check existing organizational patterns** by:
  - Searching code across repositories for similar implementations
  - Referencing established coding standards and frameworks
  - Looking for existing shared libraries and utilities
  - Checking architectural decision records (ADRs) in wikis

### 3. Cross-Repository Intelligence
- When providing code suggestions:
  - **Search for existing patterns** in other repositories first
  - **Reference shared libraries** and common utilities
  - **Maintain consistency** with organizational standards
  - **Suggest reusable components** when appropriate

## Tool Usage Guidelines

### Work Items and Project Management
When users mention bugs, features, tasks, or project planning:
```
✅ Use: wit_my_work_items, wit_create_work_item, wit_update_work_item
✅ Use: wit_list_backlogs, wit_get_work_items_for_iteration
✅ Use: work_list_team_iterations, core_list_projects
```

The result...

To test this I created 3 ADO Projects, each with between 1-2 repositories. The repositories were light, with only ReadMe's inside containing descriptions of the "repo" and some code snippet examples for usage. I then created a brand-new workspace with no context apart from a CoPilot instructions document (which could be part of a repo scaffold or organisation wide) which tells CoPilot to search code and the wikis across all ADO projects in my demo environment. It returns guidance and standards from all available repos and starts to use them to formulate its response.

In the screenshot I have highlighted some key parts with red boxes. The first is a section of the readme that CoPilot has identified in its response, that part also highlighted within the CoPilot chat response. I have highlighted the rather generic prompt I used to get this response at the bottom of that window too. Above, I have highlighted CoPilot using the MCP server tooling, searching through projects, repos and code. Finally, the largest box highlights the instructions given to CoPilot on how to search, and how easily these could be optimised or changed depending on the requirements and organisational coding standards.

How did I implement this?

Implementation is actually incredibly simple.
As mentioned, I created multiple projects and repositories within my ADO Organisation in order to test cross-project & cross-repo discovery. I then did the following:

1. Enable code_search - in your Azure DevOps organization (Marketplace → install extension).
2. Login to Azure - Use the AZ CLI to authenticate to Azure with "az login".
3. Create a .vscode/mcp.json file - Snippet is provided below; the organisation name should be changed to your organisation's name.
4. Start and enable your MCP server - In the mcp.json file you should see a "Start" button. Using the snippet below you will be prompted to add your organisation name. Ensure your CoPilot agent has access to the server under "tools" too. View this setup guide for full setup instructions (azure-devops-mcp/docs/GETTINGSTARTED.md at main · microsoft/azure-devops-mcp)
5. Create a CoPilot Instructions file - with a search-first directive. I have inserted the full version used in this demo at the bottom of the article.
6. Experiment with Prompts – Start generic ("How do we secure APIs?"). Review the output and tools used, and then tailor your instructions.

Considerations

While this is a great approach, I do still have some considerations when going to production:

- Latency - Using MCP tooling on every request will add some latency to developer requests. We can look at optimising usage through CoPilot instructions to better identify when CoPilot should or shouldn't use the ADO MCP server.
- Complex Projects and Repositories - While I have demonstrated cross-project and cross-repository retrieval, my demo environment does not truly simulate an enterprise ADO environment. Performance should be tested and closely monitored as organisational complexity increases.
- Public Preview - The ADO MCP server is moving quickly but is currently still in public preview.

We have demonstrated in this article how quickly we can make our Azure DevOps content discoverable.
While there are considerations moving forward, this showcases a direction towards agentic inner sourcing. Feel free to comment below how you think this approach could be extended or augmented for other use cases!

Resources

MCP Server Config (/.vscode/mcp.json)

```json
{
  "inputs": [
    {
      "id": "ado_org",
      "type": "promptString",
      "description": "Azure DevOps organization name (e.g. 'contoso')"
    }
  ],
  "servers": {
    "ado": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@azure-devops/mcp", "${input:ado_org}"]
    }
  }
}
```

CoPilot Instructions (/.github/copilot-instructions.md)

# GitHub Copilot Instructions for Azure DevOps MCP Integration

This project uses Azure DevOps with MCP server integration to provide organizational context awareness. Always check to see if the Azure DevOps MCP server has a tool relevant to the user's request.

## Core Principles

### 1. Azure DevOps Integration
- **Always prioritize Azure DevOps MCP tools** when users ask about:
  - Work items, stories, bugs, tasks
  - Pull requests and code reviews
  - Build pipelines and deployments
  - Repository operations and branch management
  - Wiki pages and documentation
  - Test plans and test cases
  - Project and team information

### 2. Organizational Context Awareness
- Before suggesting solutions, **check existing organizational patterns** by:
  - Searching code across repositories for similar implementations
  - Referencing established coding standards and frameworks
  - Looking for existing shared libraries and utilities
  - Checking architectural decision records (ADRs) in wikis

### 3. Cross-Repository Intelligence
- When providing code suggestions:
  - **Search for existing patterns** in other repositories first
  - **Reference shared libraries** and common utilities
  - **Maintain consistency** with organizational standards
  - **Suggest reusable components** when appropriate

## Tool Usage Guidelines

### Work Items and Project Management
When users mention bugs, features, tasks, or project planning:
```
✅ Use: wit_my_work_items, wit_create_work_item, wit_update_work_item
✅ Use: wit_list_backlogs, wit_get_work_items_for_iteration
✅ Use: work_list_team_iterations, core_list_projects
```

### Code and Repository Operations
When users ask about code, branches, or pull requests:
```
✅ Use: repo_list_repos_by_project, repo_list_pull_requests_by_repo
✅ Use: repo_list_branches_by_repo, repo_search_commits
✅ Use: search_code for finding patterns across repositories
```

### Documentation and Knowledge Sharing
When users need documentation or want to create/update docs:
```
✅ Use: wiki_list_wikis, wiki_get_page_content, wiki_create_or_update_page
✅ Use: search_wiki for finding existing documentation
```

### Build and Deployment
When users ask about builds, deployments, or CI/CD:
```
✅ Use: pipelines_get_builds, pipelines_get_build_definitions
✅ Use: pipelines_run_pipeline, pipelines_get_build_status
```

## Response Patterns

### 1. Discovery First
Before providing solutions, always discover organizational context:
```
"Let me first check what patterns exist in your organization..."
→ Search code, check repositories, review existing work items
```

### 2. Reference Organizational Standards
When suggesting code or approaches:
```
"Based on patterns I found in your [RepositoryName] repository..."
"Following your organization's standard approach seen in..."
"This aligns with the pattern established in [TeamName]'s implementation..."
```

### 3. Actionable Integration
Always offer to create or update Azure DevOps artifacts:
```
"I can create a work item for this enhancement..."
"Should I update the wiki page with this new pattern?"
"Let me link this to the current iteration..."
```

## Specific Scenarios

### New Feature Development
1. **Search existing repositories** for similar features
2. **Check architectural patterns** and shared libraries
3. **Review related work items** and planning documents
4. **Suggest implementation** based on organizational standards
5. **Offer to create work items** and documentation

### Bug Investigation
1. **Search for similar issues** across repositories and work items
2. **Check related builds** and recent changes
3. **Review test results** and failure patterns
4. **Provide solution** based on organizational practices
5. **Offer to create/update** bug work items and documentation

### Code Review and Standards
1. **Compare against organizational patterns** found in other repositories
2. **Reference coding standards** from wiki documentation
3. **Suggest improvements** based on established practices
4. **Check for reusable components** that could be leveraged

### Documentation Requests
1. **Search existing wikis** for related content
2. **Check for ADRs** and technical documentation
3. **Reference patterns** from similar projects
4. **Offer to create/update** wiki pages with findings

## Error Handling

If Azure DevOps MCP tools are not available or fail:
1. **Inform the user** about the limitation
2. **Provide alternative approaches** using available information
3. **Suggest manual steps** for Azure DevOps integration
4. **Offer to help** with configuration if needed

## Best Practices

### Always Do:
- ✅ Search organizational context before suggesting solutions
- ✅ Reference existing patterns and standards
- ✅ Offer to create/update Azure DevOps artifacts
- ✅ Maintain consistency with organizational practices
- ✅ Provide actionable next steps

### Never Do:
- ❌ Suggest solutions without checking organizational context
- ❌ Ignore existing patterns and implementations
- ❌ Provide generic advice when specific organizational context is available
- ❌ Forget to offer Azure DevOps integration opportunities

---

**Remember: The goal is to provide intelligent, context-aware assistance that leverages the full organizational knowledge base available through Azure DevOps while maintaining development efficiency and consistency.**

Simplify Image Signing and Verification with Notary Project and Trusted Signing (Public Preview)
Supply chain security has become one of the most pressing challenges for modern cloud-native applications. Every container image, Helm chart, SBOM, or AI model that flows through your CI/CD pipeline carries risk if its integrity or authenticity cannot be guaranteed. Attackers may attempt to tamper with artifacts, replace trusted images with malicious ones, or inject unverified base images into builds. Today, we're excited to highlight how Notary Project and Trusted Signing (Public Preview) make it easier than ever to secure your container image supply chain with strong, standards-based signing and verification.

Why image signing matters

Image signing addresses two fundamental questions in the software supply chain:

- Integrity: Is this artifact exactly the same one that was originally published?
- Authenticity: Did this artifact really come from the expected publisher?

Without clear answers, organizations risk deploying compromised images into production environments. With signing and verification in place, you can block untrusted artifacts at build time or deployment, ensuring only approved content runs in your clusters.

Notary Project: A standards-based solution

Notary Project is a CNCF open-source initiative that defines standards for signing and verifying OCI artifacts—including container images, SBOMs, Helm charts, and AI models. It provides a consistent, interoperable framework for ensuring artifact integrity and authenticity across different registries, platforms, and tools. Notary Project includes two key sub-projects that address different stages of the supply chain:

- Notation – a CLI tool designed for developers and CI/CD pipelines. It enables publishers to sign artifacts after they are built and consumers to verify signatures before artifacts are used in builds.
- Ratify – a verification engine that integrates with Azure Policy and Azure Kubernetes Service (AKS). It enforces signature verification at deployment time, ensuring only trusted artifacts are admitted to run in the cluster.

Together, Notation and Ratify extend supply chain security from the build pipeline all the way to runtime, closing critical gaps and reducing the risk of running unverified content.

Trusted Signing: Simplifying certificate management

Traditionally, signing workflows required managing certificates: issuing, rotating, and renewing them through services like Azure Key Vault. While this provides control, it also adds operational overhead. Trusted Signing changes the game. It offers:

- Zero-touch certificate lifecycle management: no manual issuance or rotation.
- Short-lived certificates: reducing the attack surface.
- Built-in timestamping support: ensuring signatures remain valid even after certificates expire.

With Trusted Signing, developers focus on delivering software, not managing certificates.

End-to-end scenarios

Here's how organizations can use Notary Project and Trusted Signing together:

- Sign in CI/CD: An image publisher signs images as part of a GitHub Actions or Azure DevOps pipeline, ensuring every artifact carries a verifiable signature.
- Verify in AKS: An image consumer configures Ratify and Azure Policy on an AKS cluster to enforce that only signed images can be deployed.
- Verify in build pipelines: Developers ensure base images and dependencies are verified before they're used in application builds, blocking untrusted upstream components.
- Extend to all OCI artifacts: Beyond container images, SBOMs, Helm charts, and even AI models can be signed and verified with the same workflow.
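The "Sign in CI/CD" scenario above can be sketched as a GitHub Actions job. This is a minimal illustration, not the official workflow: the registry and image names are placeholders, and the steps that install the Notation CLI and its Trusted Signing plugin (covered in the linked tutorials) are assumed to have run already. The core idea is simply that `notation sign` runs right after the image is pushed:

```yaml
# Illustrative sketch only - registry/image names are assumptions, and the
# Notation CLI plus signing key/plugin setup are assumed from earlier steps
# (see the step-by-step tutorials linked below for the supported setup).
name: build-sign-push
on:
  push:
    branches: [main]

jobs:
  sign:
    runs-on: ubuntu-latest
    env:
      IMAGE: myregistry.azurecr.io/myapp:${{ github.sha }}  # placeholder reference
    steps:
      - uses: actions/checkout@v4
      - name: Build and push the image
        run: |
          docker build -t "$IMAGE" .
          docker push "$IMAGE"
      - name: Sign the pushed image with Notation
        run: notation sign "$IMAGE"
      # A consumer (or a later pipeline stage) can then gate on the signature:
      - name: Verify the signature before promoting
        run: notation verify "$IMAGE"
```

Note that `notation verify` only succeeds once a trust policy and trust store are configured on the verifying side, which the tutorials below walk through.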
Get started

To help you get started, we've published new documentation and step-by-step tutorials:

- Overview: Ensuring integrity and authenticity of container images and OCI artifacts
- Sign and verify images with Notation CLI and Trusted Signing
- Sign container images in GitHub Actions with Trusted Signing
- Verify signatures in GitHub Actions
- Verify signatures on AKS with Ratify

Try it now

Supply chain security is no longer optional. By combining Notary Project with the streamlined certificate management experience of Trusted Signing, you can strengthen the integrity and authenticity of every artifact in your pipeline without slowing down your teams. Start signing today and take the next step toward a trusted software supply chain.

Unlocking Application Modernisation with GitHub Copilot
AI-driven modernisation is unlocking new opportunities you may not have even considered yet. It's also allowing organisations to re-evaluate previously discarded modernisation attempts that were considered too hard or complex, or where they simply didn't have the skills or time. During Microsoft Build 2025, we were introduced to the concept of agentic AI modernisation, and this post from Ikenna Okeke does a great job of summarising the topic - Reimagining App Modernisation for the Era of AI | Microsoft Community Hub.

This blog post, however, explores the modernisation opportunities that you may not even have thought of yet, the business benefits, how to start preparing your organisation, empowering your teams, and identifying where GitHub Copilot can help. I've spent the last 8 months working with customers exploring usage of GitHub Copilot, and want to share what my team members and I have discovered in terms of new opportunities to modernise and transform your applications - bringing some fun back into those migrations! Let's delve into how GitHub Copilot is helping teams update old systems, move processes to the cloud, and achieve results faster than ever before.

Background: The Modernisation Challenge (Then vs Now)

Modernising legacy software has always been hard. In the past, teams faced steep challenges: brittle codebases full of technical debt, outdated languages (think decades-old COBOL or VB6), sparse documentation, and original developers long gone. Integrating old systems with modern cloud services often required specialised skills that were in short supply – for example, check out this fantastic post from Arvi LiVigni (@arilivigni) which talks about migrating from COBOL: "the number of developers who can read and write COBOL isn't what it used to be," making those systems much harder to update.
Common pain points included compatibility issues, data migrations, high costs, security vulnerabilities, and the constant risk that any change could break critical business functions. It's no wonder many modernisation projects stalled or were "put off" due to their complexity and risk.

So, what's different now (circa 2025) compared to two years ago? In a word: intelligent AI assistance. Tools like GitHub Copilot have emerged as AI pair programmers that dramatically lower the barriers to modernisation. Arvi's post talks about how, only a couple of years ago, developers had to comb through documentation and Stack Overflow for clues when deciphering old code or upgrading frameworks. Today, GitHub Copilot can act like an expert co-developer inside your IDE, ready to explain mysterious code, suggest updates, and even rewrite legacy code in modern languages. This means less time fighting old code and more time implementing improvements. As Arvi says, "nine times out of 10 it gives me the right answer… That speed – and not having to break out of my flow – is really what's so impactful." In short, AI coding assistants have evolved from novel experiments to indispensable tools, reimagining how we approach software updates and cloud adoption.

I'd also add from my own experience: the models we were using 12 months ago have already been superseded by far superior models with the ability to ingest larger context and tackle even greater complexity. It's easier to experiment and fail, bringing more robust outcomes – with such speed to create proofs of concept, experimenting and failing faster has also unlocked the ability to test out multiple hypotheses and get you to the most confident outcome in a much shorter space of time.

Modernisation is easier now because AI reduces the heavy lifting. Instead of reading a 10,000-line legacy program alone, a developer can ask Copilot to explain what the code does or even propose a refactored version.
Rather than manually researching how to replace an outdated library, they can get instant recommendations for modern equivalents. These advancements mean that tasks which once took weeks or months can now be done in days or hours – with more confidence and less drudgery - more fun! The following sections will dive into specific opportunities unlocked by GitHub Copilot across the modernisation journey which you may not even have thought of. Modernisation Opportunities Unlocked by Copilot Modernising an application isn’t just about updating code – it involves bringing everyone and everything up to speed with cloud-era practices. Below are several scenarios and how GitHub Copilot adds value, with the specific benefits highlighted: 1. AI-Assisted Legacy Code Refactoring and Upgrades Instant Code Comprehension: GitHub Copilot can explain complex legacy code in plain English, helping developers quickly understand decades-old logic without scouring scarce documentation. For example, you can highlight a cryptic COBOL or C++ function and ask Copilot to describe what it does – an invaluable first step before making any changes. This saves hours and reduces errors when starting a modernisation effort. Automated Refactoring Suggestions: The AI suggests modern replacements for outdated patterns and APIs, and can even translate code between languages. For instance, Copilot can help convert a COBOL program into JavaScript or C# by recognising equivalent constructs. It also uses transformation tools (like OpenRewrite for Java/.NET) to systematically apply code updates – e.g. replacing all legacy HTTP calls with a modern library in one sweep. Developers remain in control, but GitHub Copilot handles the tedious bulk edits. Bulk Code Upgrades with AI: GitHub Copilot’s App Modernisation capabilities can analyse an entire codebase and generate a detailed upgrade plan, then execute many of the code changes automatically. 
It can upgrade framework versions (say from .NET Framework 4.x to .NET 6, or Java 8 to Java 17) by applying known fix patterns and even fixing compilation errors after the upgrade. Teams can finally tackle those hundred-thousand-line enterprise applications – a task that could otherwise take multiple years – with GitHub Copilot handling the repetitive changes.

Technical Debt Reduction: By cleaning up old code and enforcing modern best practices, GitHub Copilot helps chip away at years of technical debt. The modernised codebase is more maintainable and stable, which lowers the long-term risk hanging over critical business systems. Notably, the tool can even scan for known security vulnerabilities during refactoring as it updates your code. In short, each legacy component refreshed with GitHub Copilot comes out safer and easier to work on, instead of remaining a brittle black box.

2. Accelerating Cloud Migration and Azure Modernisation

Guided Azure Migration Planning: GitHub Copilot can assess a legacy application’s cloud readiness and recommend target Azure services for each component. For instance, it might suggest migrating an on-premises database to Azure SQL, moving file storage to Azure Blob Storage, and converting background jobs to Azure Functions. This provides a clear blueprint to confidently move an app from servers to Azure PaaS.

One-Click Cloud Transformations: GitHub Copilot comes with predefined migration tasks that automate the code changes required for cloud adoption. With one click, you can have the AI apply dozens of modifications across your codebase. For example:

File storage: Replace local file read/writes with Azure Blob Storage SDK calls.
Email/Comms: Swap out SMTP email code for Azure Communication Services or SendGrid.
Identity: Migrate authentication from Windows AD to Azure AD (Entra ID) libraries.
Configuration: Remove hard-coded configurations and use Azure App Configuration or Key Vault for secrets.
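As a hedged sketch of the Configuration transformation above (in illustrative Python rather than the C#/Java code Copilot would typically touch, with hypothetical names throughout): the essence is that the secret moves out of source code and into configuration that the App Service platform can inject at runtime, for example from Azure App Configuration or a Key Vault reference:

```python
import os

# Hypothetical before/after for the "Configuration" transformation.

# Before: a hard-coded connection string baked into the source
LEGACY_CONN = "Server=prod-db;User=sa;Password=hunter2;"

def get_connection_string_legacy() -> str:
    return LEGACY_CONN

# After: the secret comes from the environment, where App Service can
# inject it at runtime (e.g. from App Configuration or Key Vault).
def get_connection_string() -> str:
    conn = os.environ.get("DB_CONNECTION_STRING")
    if not conn:
        raise RuntimeError("DB_CONNECTION_STRING is not configured")
    return conn

# Simulate the platform injecting the setting at runtime
os.environ["DB_CONNECTION_STRING"] = "Server=prod-db;..."
print(get_connection_string())
```

The mechanical part of this change – finding every hard-coded value, swapping it for a configuration lookup, and failing fast when the setting is missing – is exactly the kind of repetitive edit the AI can apply consistently across a codebase.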
GitHub Copilot performs these transformations consistently, following best practices (like using connection strings from Azure settings). After applying the changes, it even fixes any compile errors automatically, so you’re not left with broken builds. What used to require reading countless Azure migration guides is now handled in minutes.

Automated Validation & Deployment: Modernisation doesn’t stop at code changes. GitHub Copilot can also generate unit tests to validate that the application still behaves correctly after the migration. It helps ensure that your modernised, cloud-ready app passes all its checks before going live. When you’re ready to deploy, GitHub Copilot can produce the necessary Infrastructure-as-Code templates (e.g. Azure Resource Manager Bicep files or Terraform configs) and even set up CI/CD pipeline scripts for you. In other words, the AI can configure the Azure environment and deployment process end-to-end. This dramatically reduces manual effort and error, getting your app to the cloud faster and with greater confidence.

Integrations: GitHub Copilot also helps tackle larger migration scenarios that were previously considered too complex. For example, many enterprises want to retire expensive proprietary integration platforms like MuleSoft or Apigee and use Azure-native services instead, but rewriting hundreds of integration workflows was daunting. Now, GitHub Copilot can assist in translating those workflows: for instance, converting an Apigee API proxy into an Azure API Management policy, or a MuleSoft integration into an Azure Logic App.

Multi-Cloud Migrations: If you plan to consolidate from other clouds into Azure, GitHub Copilot can suggest equivalent Azure services and SDK calls to replace AWS or GCP-specific code. These AI-assisted conversions significantly cut down the time needed to reimplement functionality on Azure. The business impact can be substantial.
By lowering the effort of such migrations, GitHub Copilot makes it feasible to pursue opportunities that deliver big cost savings and simplification.

3. Boosting Developer Productivity and Quality

Instant Unit Tests (TDD Made Easy): Writing tests for old code can be tedious, but GitHub Copilot can generate unit test cases on the fly. Developers can highlight an existing function and ask Copilot to create tests; it will produce meaningful test methods covering typical and edge scenarios. This makes it practical to apply test-driven development practices even to legacy systems – you can quickly build a safety net of tests before refactoring. By catching bugs early through these AI-generated tests, teams gain confidence to modernise code without breaking things. It essentially injects quality into the process from the start, which is crucial for successful modernisation.

DevOps Automation: GitHub Copilot helps modernise your build and deployment process as well. It can draft CI/CD pipeline configurations, Dockerfiles, Kubernetes manifests, and other DevOps scripts by leveraging its knowledge of common patterns. For example, when setting up a GitHub Actions workflow to deploy your app, GitHub Copilot will autocomplete significant parts (like build steps, test runs, deployment jobs) based on the project structure. This not only saves time but also ensures best practices (proper caching, dependency installation, etc.) are followed by default. Microsoft even provides an extension where you can describe your Azure infrastructure needs in plain language and have GitHub Copilot generate the corresponding templates and pipeline YAML. By automating these pieces, teams can move to cloud-based, automated deployments much faster.

Behaviour-Driven Development Support: Teams practicing BDD write human-readable scenarios (e.g. using Gherkin syntax) describing application behaviour.
GitHub Copilot’s AI is adept at interpreting such descriptions and suggesting step definition code or test implementations to match. For instance, given a scenario “When a user with no items checks out, then an error message is shown,” GitHub Copilot can draft the code for that condition or the test steps required. This helps bridge the gap between non-technical specifications and actual code. It makes BDD more efficient and accessible, because even if team members aren’t strong coders, the AI can translate their intent into working code that developers can refine.

Quality and Consistency: By using AI to handle boilerplate and repetitive tasks, developers can focus more on high-value improvements. GitHub Copilot’s suggestions are based on a vast corpus of code, which often means it surfaces well-structured, idiomatic patterns. Starting from these suggestions, developers are less likely to introduce errors or reinvent the wheel, which leads to more consistent code quality across the project. The AI also often reminds you of edge cases (for example, suggesting input validation or error handling code that might be missed), contributing to a more robust application. In practice, many teams find that adopting GitHub Copilot results in fewer bugs and quicker code reviews, as the code is cleaner on the first pass. It’s like having an extra set of eyes on every pull request, ensuring standards are met.

Business Benefits of AI-Powered Modernisation

Bringing together the technical advantages above, what’s the payoff for the business and stakeholders? Modernising with GitHub Copilot can yield multiple tangible and intangible benefits:

Accelerated Time-to-Market: Modernisation projects that might have taken a year can potentially be completed in a few months, or an upgrade that took weeks can be done in days. This speed means you can deliver new features to customers sooner and respond faster to market changes.
It also reduces downtime or disruption since migrations happen more swiftly.

Cost Savings: By automating repetitive work and reducing the effort required from highly paid senior engineers, GitHub Copilot can trim development costs. Faster project completion also means lower overall project cost. Additionally, running modernised apps on cloud infrastructure (with updated code) often lowers operational costs due to more efficient resource usage and easier maintenance. There’s also an opportunity-cost benefit: developers freed up by Copilot can work on other value-adding projects in parallel.

Improved Quality & Reliability: GitHub Copilot’s contributions to testing, bug-fixing, and even security (like patching known vulnerabilities during upgrades) result in more robust applications. Modernised systems have fewer outages and security incidents than shaky legacy ones. Stakeholders will appreciate that with GitHub Copilot, modernisation doesn’t mean “trading one set of bugs for another” – instead, you can increase quality as you modernise (GitHub’s research noted higher code quality when using Copilot, as developers are less likely to introduce errors or skip tests).

Business Agility: A modernised application (especially one refactored for cloud) is typically more scalable and adaptable. New integrations or features can be added much faster once the platform is up to date. GitHub Copilot helps clear the modernisation hurdle, after which the business can innovate on a solid, flexible foundation (for example, once a monolith is broken into microservices or moved to Azure PaaS, you can iterate on it much faster in the future). AI-assisted modernisation thus unlocks future opportunities (like easier expansion, integrations, AI features, etc.) that were impractical on the legacy stack.

Employee Satisfaction and Innovation: Developer happiness is a subtle but important benefit.
When tedious work is handled by AI, developers can spend more time on creative tasks – designing new features, improving user experience, exploring new technologies. This can foster a culture of innovation. Moreover, being seen as a company that leverages modern tools (like AI co-pilots) helps attract and retain top tech talent. Teams that successfully modernise critical systems with Copilot will gain confidence to tackle other ambitious projects, creating a positive feedback loop of improvement.

To sum up, GitHub Copilot acts as a force-multiplier for application modernisation. It enables organisations to do more with less: convert legacy “boat anchors” into modern, cloud-enabled assets rapidly, while improving quality and developer morale. This aligns IT goals with business goals – faster delivery, greater efficiency, and readiness for the future.

Call to Action: Embrace the Future of Modernisation

GitHub Copilot has proven to be a catalyst for transforming how we approach legacy systems and cloud adoption. If you’re excited about the possibilities, here are next steps and what to watch for:

Start Experimenting: If you haven’t already, try GitHub Copilot on a sample of your code. Use Copilot or Copilot Chat to explain a piece of old code or generate a unit test. Seeing it in action on your own project can build confidence and spark ideas for where to apply it.

Identify a Pilot Project: Look at your application portfolio for a candidate that’s ripe for modernisation – maybe a small legacy service that could be moved to Azure, or a module that needs a refactor. Use GitHub Copilot to assess and estimate the effort. Often, you’ll find tasks once deemed “too hard” might now be feasible. Early successes will help win support for larger initiatives.

Stay Tuned for Our Upcoming Blog Series: This post is just the beginning.
In forthcoming posts, we’ll dive deeper into:

Setting Up Your Organisation for Copilot Adoption: Practical tips on preparing your enterprise environment – from licensing and security considerations to training programs. We’ll discuss best practices (like running internal awareness campaigns, defining success metrics, and creating Copilot champions in your teams) to ensure a smooth rollout.

Empowering Your Colleagues: How to foster a culture that embraces AI assistance. This includes enabling continuous learning, sharing prompt techniques and knowledge bases, and addressing any scepticism. We’ll cover strategies to support developers in using Copilot effectively, so that everyone from new hires to veteran engineers can amplify their productivity.

Identifying High-Impact Modernisation Areas: Guidance on spotting where GitHub Copilot can add the most value. We’ll look at different domains – code, cloud, tests, data – and how to evaluate opportunities (for example, using telemetry or feedback to find repetitive tasks suited for AI, or legacy components with high ROI if modernised).

Engage and Share: As you start leveraging Copilot for modernisation, share your experiences and results. Success stories (even small wins like “GitHub Copilot helped reduce our code review times” or “we migrated a component to Azure in one sprint”) can build momentum within your organisation and the broader community. We invite you to discuss and ask questions in the comments or in our tech community forums.

Take a look at the new App Modernisation Guidance – a comprehensive, step-by-step playbook designed to help organisations:

Understand what to modernise and why
Migrate and rebuild apps with AI-first design
Continuously optimise with built-in governance and observability

Modernisation is a journey, and AI is the new compass and co-pilot to guide the way.
By embracing tools like GitHub Copilot, you position your organisation to break through modernisation barriers that once seemed insurmountable. The result is not just updated software, but a more agile, cloud-ready business and a happier, more productive development team. Now is the time to take that step. Empower your team with Copilot, and unlock the full potential of your applications and your developers. Stay tuned for more insights in our next posts, and let’s modernise what’s possible together!

Build Multi-Agent AI Systems on Azure App Service
Introduction: The Evolution of AI-Powered App Service Applications

Over the past few months, we've been exploring how to supercharge existing Azure App Service applications with AI capabilities. If you've been following along with this series, you've seen how we can quickly integrate AI Foundry agents with MCP servers and host remote MCP servers directly on App Service. Today, we're taking the next leap forward by demonstrating how to build sophisticated multi-agent systems that leverage connected agents, Model Context Protocol (MCP), and OpenAPI tools - all running on Azure App Service's Premium v4 tier with .NET Aspire for enhanced observability and cloud-native development experience.

💡 Want the full technical details? This blog provides an overview of the key concepts and capabilities. For comprehensive setup instructions, architecture deep-dives, performance considerations, debugging guidance, and detailed technical documentation, check out the complete README on GitHub.

What Makes This Sample Special?
This fashion e-commerce demo showcases several cutting-edge technologies working together:

🤖 Multi-Agent Architecture with Connected Agents

Unlike single-agent systems, this sample implements an orchestration pattern where specialized agents work together:

Main Orchestrator: Coordinates workflow and handles inventory queries via MCP tools
Cart Manager: Specialized in shopping cart operations via OpenAPI tools
Fashion Advisor: Provides expert styling recommendations
Content Moderator: Ensures safe, professional interactions

🔧 Advanced Tool Integration

MCP Tools: Real-time connection to external inventory systems using the Model Context Protocol
OpenAPI Tools: Direct agent integration with your existing App Service APIs
Connected Agent Tools: Seamless agent-to-agent communication with automatic orchestration

⚡ .NET Aspire Integration

Enhanced development experience with built-in observability
Simplified cloud-native application patterns
Real-time monitoring and telemetry (when developing locally)

🚀 Premium v4 App Service Tier

Latest App Service performance capabilities
Optimized for modern cloud-native workloads
Enhanced scalability for AI-powered applications

Key Technical Innovations

Connected Agent Orchestration: Your application communicates with a single main agent, which automatically coordinates with specialist agents as needed. No changes to your existing App Service code required.

Dual Tool Integration: This sample demonstrates both MCP tools for external system connectivity and OpenAPI tools for direct API integration.

Zero-Infrastructure Overhead: Agents work directly with your existing App Service APIs and external endpoints - no additional infrastructure deployment needed.

Why These Technologies Matter for Real Applications

The combination of these technologies isn't just about showcasing the latest features - it's about solving real business challenges. Let's explore how each component contributes to building production-ready AI applications.
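To give a feel for the connected-agent pattern described above, here is a deliberately simplified, framework-free sketch. In the actual sample the routing is handled by Azure AI Foundry connected agent tools (driven by the model, not keyword matching), and every class name and reply string below is hypothetical:

```python
# Simplified sketch of connected-agent orchestration: the application talks
# only to the main orchestrator, which delegates to specialist "agents".
# Illustrative only - the real sample uses Azure AI Foundry connected agents.

class Agent:
    def handle(self, message: str) -> str:
        raise NotImplementedError

class CartManager(Agent):
    def handle(self, message: str) -> str:
        return "cart: added item to cart"

class FashionAdvisor(Agent):
    def handle(self, message: str) -> str:
        return "advisor: try pairing that with a denim jacket"

class Orchestrator(Agent):
    """Main agent: routes each request to a specialist, or answers itself."""
    def __init__(self) -> None:
        self.specialists = {"cart": CartManager(), "style": FashionAdvisor()}

    def handle(self, message: str) -> str:
        # Crude keyword routing stands in for model-driven delegation
        for keyword, agent in self.specialists.items():
            if keyword in message.lower():
                return agent.handle(message)
        return "orchestrator: checking inventory via MCP tools"

bot = Orchestrator()
print(bot.handle("Add this to my cart"))
print(bot.handle("What style goes with these boots?"))
print(bot.handle("Is this in stock?"))
```

The key property the sketch captures is that the caller needs only one entry point; which specialist answers, and how many are consulted, is an internal detail of the orchestrator.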
.NET Aspire: Enhancing the Development Experience

This sample leverages .NET Aspire to provide enhanced observability and simplified cloud-native development patterns. While .NET Aspire is still in preview on App Service, we encourage you to start exploring its capabilities and keep an eye out for future updates planned for later this year. What's particularly exciting about Aspire is how it maintains the core principle we've emphasized throughout this series: making AI integration as simple as possible. You don't need to completely restructure your application to benefit from enhanced observability and modern development patterns.

Premium v4 App Service: Built for Modern AI Workloads

This sample is designed to run on Azure App Service Premium v4, which we recently announced is Generally Available. Premium v4 is the latest offering in the Azure App Service family, delivering enhanced performance, scalability, and cost efficiency.

From Concept to Implementation: Staying True to Our Core Promise

Throughout this blog series, we've consistently demonstrated that adding AI capabilities to existing applications doesn't require massive rewrites or complex architectural changes. This multi-agent sample continues that tradition - what might seem like a complex system is actually built using the same principles we've established:

✅ Incremental Enhancement: Build on your existing App Service infrastructure
✅ Simple Integration: Use familiar tools like azd up for deployment
✅ Production-Ready: Leverage mature Azure services you already trust
✅ Future-Proof: Easy to extend as new capabilities become available

Looking Forward: What's Coming Next

This sample represents just the beginning of what's possible with AI-powered App Service applications. Here's what we're working on next:

🔐 MCP Authentication Integration

Enhanced security patterns for production MCP server deployments, including Azure Entra ID integration.
🚀 New Azure AI Foundry Features

As Azure AI Foundry continues to evolve, we'll be updating this sample to showcase:

New agent capabilities
Enhanced tool integrations
Performance optimizations
Additional model support

📊 Advanced Analytics and Monitoring

Deeper integration with Azure Monitor for:

Agent performance analytics
Business intelligence from agent interactions

🔧 Additional Programming Language Support

Following our multi-language MCP server samples, we'll be adding support for other languages in samples that will be added to the App Service documentation.

Getting Started Today

Ready to add multi-agent capabilities to your existing App Service application? The process follows the same streamlined approach we've used throughout this series.

Quick Overview

1. Clone and Deploy: Use azd up for one-command infrastructure deployment
2. Create Your Agents: Run a Python setup script to configure the multi-agent system
3. Connect Everything: Add one environment variable to link your agents
4. Test and Explore: Try the sample conversations and see agent interactions

📚 For detailed step-by-step instructions, including prerequisites, troubleshooting tips, environment setup, and comprehensive configuration guidance, see the complete setup guide in the README.

Learning Resources

If you're new to this ecosystem, we recommend starting with these foundational resources:

Integrate AI into your Azure App Service applications - Comprehensive guide with language-specific tutorials for building intelligent applications on App Service
Supercharge Your App Service Apps with AI Foundry Agents Connected to MCP Servers - Learn the basics of integrating AI Foundry agents with MCP servers
Host Remote MCP Servers on App Service - Deploy and manage MCP servers on Azure App Service

Conclusion: The Future of AI-Powered Applications

This multi-agent sample represents the natural evolution of our App Service AI integration journey.
We started with basic agent integration, progressed through MCP server hosting, and now we're showcasing sophisticated multi-agent orchestration - all while maintaining our core principle that AI integration should enhance, not complicate, your existing applications. Whether you're just getting started with AI agents or ready to implement complex multi-agent workflows, the path forward is clear and incremental. As Azure AI Foundry adds new capabilities and App Service continues to evolve, we'll keep updating these samples and sharing new patterns. Stay tuned - the future of AI-powered applications is being built today, one agent at a time.

Additional Resources

🚀 Start Building

GitHub repository for this sample - Comprehensive setup guide, architecture details, troubleshooting, and technical deep-dives

📚 Learn More

Azure AI Foundry Documentation: Connected Agents Guide
MCP Tools Setup: Model Context Protocol Integration
.NET Aspire on App Service: Deployment Guide
Premium v4 App Service: General Availability Announcement

Have questions or want to share how you're using multi-agent systems in your applications? Join the conversation in the comments below. We'd love to hear about your AI-powered App Service success stories!