cloud native
117 TopicsUnlocking Application Modernisation with GitHub Copilot
AI-driven modernisation is unlocking new opportunities you may not have even considered yet. It's also allowing organisations to re-evaluate previously discarded modernisation attempts that were considered too hard, complex or simply didn't have the skills or time to do. During Microsoft Build 2025, we were introduced to the concept of Agentic AI modernisation and this post from Ikenna Okeke does a great job of summarising the topic - Reimagining App Modernisation for the Era of AI | Microsoft Community Hub. This blog post however, explores the modernisation opportunities that you may not even have thought of yet, the business benefits, how to start preparing your organisation, empowering your teams, and identifying where GitHub Copilot can help. I’ve spent the last 8 months working with customers exploring usage of GitHub Copilot, and want to share what my team members and I have discovered in terms of new opportunities to modernise, transform your applications, bringing some fun back into those migrations! Let’s delve into how GitHub Copilot is helping teams update old systems, move processes to the cloud, and achieve results faster than ever before. Background: The Modernisation Challenge (Then vs Now) Modernising legacy software has always been hard. In the past, teams faced steep challenges: brittle codebases full of technical debt, outdated languages (think decades-old COBOL or VB6), sparse documentation, and original developers long gone. Integrating old systems with modern cloud services often requiring specialised skills that were in short supply – for example, check out this fantastic post from Arvi LiVigni (@arilivigni ) which talks about migrating from COBOL “the number of developers who can read and write COBOL isn’t what it used to be,” making those systems much harder to update". Common pain points included compatibility issues, data migrations, high costs, security vulnerabilities, and the constant risk that any change could break critical business functions. It’s no wonder many modernisation projects stalled or were “put off” due to their complexity and risk. So, what’s different now (circa 2025) compared to two years ago? In a word: Intelligent AI assistance. Tools like GitHub Copilot have emerged as AI pair programmers that dramatically lower the barriers to modernisation. Arvi’s post talks about how only a couple of years ago, developers had to comb through documentation and Stack Overflow for clues when deciphering old code or upgrading frameworks. Today, GitHub Copilot can act like an expert co-developer inside your IDE, ready to explain mysterious code, suggest updates, and even rewrite legacy code in modern languages. This means less time fighting old code and more time implementing improvements. As Arvi says “nine times out of 10 it gives me the right answer… That speed – and not having to break out of my flow – is really what’s so impactful.” In short, AI coding assistants have evolved from novel experiments to indispensable tools, reimagining how we approach software updates and cloud adoption. I’d also add from my own experience – the models we were using 12 months ago have already been superseded by far superior models with ability to ingest larger context and tackle even further complexity. It's easier to experiment, and fail, bringing more robust outcomes – with such speed to create those proof of concepts, experimentation and failing faster, this has also unlocked the ability to test out multiple hypothesis’ and get you to the most confident outcome in a much shorter space of time. Modernisation is easier now because AI reduces the heavy lifting. Instead of reading the 10,000-line legacy program alone, a developer can ask Copilot to explain what the code does or even propose a refactored version. Rather than manually researching how to replace an outdated library, they can get instant recommendations for modern equivalents. These advancements mean that tasks which once took weeks or months can now be done in days or hours – with more confidence and less drudgery - more fun! The following sections will dive into specific opportunities unlocked by GitHub Copilot across the modernisation journey which you may not even have thought of. Modernisation Opportunities Unlocked by Copilot Modernising an application isn’t just about updating code – it involves bringing everyone and everything up to speed with cloud-era practices. Below are several scenarios and how GitHub Copilot adds value, with the specific benefits highlighted: 1. AI-Assisted Legacy Code Refactoring and Upgrades Instant Code Comprehension: GitHub Copilot can explain complex legacy code in plain English, helping developers quickly understand decades-old logic without scouring scarce documentation. For example, you can highlight a cryptic COBOL or C++ function and ask Copilot to describe what it does – an invaluable first step before making any changes. This saves hours and reduces errors when starting a modernisation effort. Automated Refactoring Suggestions: The AI suggests modern replacements for outdated patterns and APIs, and can even translate code between languages. For instance, Copilot can help convert a COBOL program into JavaScript or C# by recognising equivalent constructs. It also uses transformation tools (like OpenRewrite for Java/.NET) to systematically apply code updates – e.g. replacing all legacy HTTP calls with a modern library in one sweep. Developers remain in control, but GitHub Copilot handles the tedious bulk edits. Bulk Code Upgrades with AI: GitHub Copilot’s App Modernisation capabilities can analyse an entire codebase and generate a detailed upgrade plan, then execute many of the code changes automatically. It can upgrade framework versions (say from .NET Framework 4.x to .NET 6, or Java 8 to Java 17) by applying known fix patterns and even fixing compilation errors after the upgrade. Teams can finally tackle those hundreds of thousand-line enterprise applications – a task that could take multiple years with GitHub Copilot handling the repetitive changes. Technical Debt Reduction: By cleaning up old code and enforcing modern best practices, GitHub Copilot helps chip away at years of technical debt. The modernised codebase is more maintainable and stable, which lowers the long-term risk hanging over critical business systems. Notably, the tool can even scan for known security vulnerabilities during refactoring as it updates your code. In short, each legacy component refreshed with GitHub Copilot comes out safer and easier to work on, instead of remaining a brittle black box. 2. Accelerating Cloud Migration and Azure Modernisation Guided Azure Migration Planning: GitHub Copilot can assess a legacy application’s cloud readiness and recommend target Azure services for each component. For instance, it might suggest migrating an on-premises database to Azure SQL, moving file storage to Azure Blob Storage, and converting background jobs to Azure Functions. This provides a clear blueprint to confidently move an app from servers to Azure PaaS. One-Click Cloud Transformations: GitHub Copilot comes with predefined migration tasksthat automate the code changes required for cloud adoption. With one click, you can have the AI apply dozens of modifications across your codebase. For example: File storage: Replace local file read/writes with Azure Blob Storage SDK calls. Email/Comms: Swap out SMTP email code for Azure Communication Services or SendGrid. Identity: Migrate authentication from Windows AD to Azure AD (Entra ID) libraries. Configuration: Remove hard-coded configurations and use Azure App Configuration or Key Vault for secrets. GitHub Copilot performs these transformations consistently, following best practices (like using connection strings from Azure settings). After applying the changes, it even fixes any compile errors automatically, so you’re not left with broken builds. What used to require reading countless Azure migration guides is now handled in minutes. Automated Validation & Deployment: Modernisation doesn’t stop at code changes. GitHub Copilot can also generate unit tests to validate that the application still behaves correctly after the migration. It helps ensure that your modernised, cloud-ready app passes all its checks before going live. When you’re ready to deploy, GitHub Copilot can produce the necessary Infrastructure-as-Code templates (e.g. Azure Resource Manager Bicep files or Terraform configs) and even set up CI/CD pipeline scripts for you. In other words, the AI can configure the Azure environment and deployment process end-to-end. This dramatically reduces manual effort and error, getting your app to the cloud faster and with greater confidence. Integrations: GitHub Copilot also helps tackle larger migration scenarios that were previously considered too complex. For example, many enterprises want to retire expensive proprietary integration platforms like MuleSoft or Apigee and use Azure-native services instead, but rewriting hundreds of integration workflows was daunting. Now, GitHub Copilot can assist in translating those workflows: for instance, converting an Apigee API proxy into an Azure API Management policy, or a MuleSoft integration into an Azure Logic App. Multi-Cloud Migrations: if you plan to consolidate from other clouds into Azure, GitHub Copilot can suggest equivalent Azure services and SDK calls to replace AWS or GCP-specific code. These AI-assisted conversions significantly cut down the time needed to reimplement functionality on Azure. The business impact can be substantial. By lowering the effort of such migrations, GitHub Copilot makes it feasible to pursue opportunities that deliver big cost savings and simplification. 3. Boosting Developer Productivity and Quality Instant Unit Tests (TDD Made Easy): Writing tests for old code can be tedious, but GitHub Copilot can generate unit test cases on the fly. Developers can highlight an existing function and ask Copilot to create tests; it will produce meaningful test methods covering typical and edge scenarios. This makes it practical to apply test-driven development practices even to legacy systems – you can quickly build a safety net of tests before refactoring. By catching bugs early through these AI-generated tests, teams gain confidence to modernise code without breaking things. It essentially injects quality into the process from the start, which is crucial for successful modernisation. DevOps Automation: GitHub Copilot helps modernise your build and deployment process as well. It can draft CI/CD pipeline configurations, Dockerfiles, Kubernetes manifests, and other DevOps scripts by leveraging its knowledge of common patterns. For example, when setting up a GitHub Actions workflow to deploy your app, GitHub Copilot will autocomplete significant parts (like build steps, test runs, deployment jobs) based on the project structure. This not only saves time but also ensures best practices (proper caching, dependency installation, etc.) are followed by default. Microsoft even provides an extension where you can describe your Azure infrastructure needs in plain language and have GitHub Copilot generate the corresponding templates and pipeline YAML. By automating these pieces, teams can move to cloud-based, automated deployments much faster. Behaviour-Driven Development Support: Teams practicing BDD write human-readable scenarios (e.g. using Gherkin syntax) describing application behaviour. GitHub Copilot’s AI is adept at interpreting such descriptions and suggesting step definition code or test implementations to match. For instance, given a scenario “When a user with no items checks out, then an error message is shown,” GitHub Copilot can draft the code for that condition or the test steps required. This helps bridge the gap between non-technical specifications and actual code. It makes BDD more efficient and accessible, because even if team members aren’t strong coders, the AI can translate their intent into working code that developers can refine. Quality and Consistency: By using AI to handle boilerplate and repetitive tasks, developers can focus more on high-value improvements. GitHub Copilot’s suggestions are based on a vast corpus of code, which often means it surfaces well-structured, idiomatic patterns. Starting from these suggestions, developers are less likely to introduce errors or reinvent the wheel, which leads to more consistent code quality across the project. The AI also often reminds you of edge cases (for example, suggesting input validation or error handling code that might be missed), contributing to a more robust application. In practice, many teams find that adopting GitHub Copilot results in fewer bugs and quicker code reviews, as the code is cleaner on the first pass. It’s like having an extra set of eyes on every pull request, ensuring standards are met. Business Benefits of AI-Powered Modernisation Bringing together the technical advantages above, what’s the payoff for the business and stakeholders? Modernising with GitHub Copilot can yield multiple tangible and intangible benefits: Accelerated Time-to-Market: Modernisation projects that might have taken a year can potentially be completed in a few months, or an upgrade that took weeks can be done in days. This speed means you can deliver new features to customers sooner and respond faster to market changes. It also reduces downtime or disruption since migrations happen more swiftly. Cost Savings: By automating repetitive work and reducing the effort required from highly paid senior engineers, GitHub Copilot can trim development costs. Faster project completion also means lower overall project cost. Additionally, running modernised apps on cloud infrastructure (with updated code) often lowers operational costs due to more efficient resource usage and easier maintenance. There’s also an opportunity cost benefit: developers freed up by Copilot can work on other value-adding projects in parallel. Improved Quality & Reliability: GitHub Copilot’s contributions to testing, bug-fixing, and even security (like patching known vulnerabilities during upgrades) result in more robust applications. Modernised systems have fewer outages and security incidents than shaky legacy ones. Stakeholders will appreciate that with GitHub Copilot, modernisation doesn’t mean “trading one set of bugs for another” – instead, you can increase quality as you modernise (GitHub’s research noted higher code quality when using Copilot, as developers are less likely to introduce errors or skip tests). Business Agility: A modernised application (especially one refactored for cloud) is typically more scalable and adaptable. New integrations or features can be added much faster once the platform is up-to-date. GitHub Copilot helps clear the modernisation hurdle, after which the business can innovate on a solid, flexible foundation (for example, once a monolith is broken into microservices or moved to Azure PaaS, you can iterate on it much faster in the future). AI-assisted modernisation thus unlocks future opportunities (like easier expansion, integrations, AI features, etc.) that were impractical on the legacy stack. Employee Satisfaction and Innovation: Developer happiness is a subtle but important benefit. When tedious work is handled by AI, developers can spend more time on creative tasks – designing new features, improving user experience, exploring new technologies. This can foster a culture of innovation. Moreover, being seen as a company that leverages modern tools (like AI Copilot) helps attract and retain top tech talent. Teams that successfully modernise critical systems with Copilot will gain confidence to tackle other ambitious projects, creating a positive feedback loop of improvement. To sum up, GitHub Copilot acts as a force-multiplier for application modernisation. It enables organisations to do more with less: convert legacy “boat anchors” into modern, cloud-enabled assets rapidly, while improving quality and developer morale. This aligns IT goals with business goals – faster delivery, greater efficiency, and readiness for the future. Call to Action: Embrace the Future of Modernisation GitHub Copilot has proven to be a catalyst for transforming how we approach legacy systems and cloud adoption. If you’re excited about the possibilities, here are next steps and what to watch for: Start Experimenting: If you haven’t already, try GitHub Copilot on a sample of your code. Use Copilot or Copilot Chat to explain a piece of old code or generate a unit test. Seeing it in action on your own project can build confidence and spark ideas for where to apply it. Identify a Pilot Project: Look at your application portfolio for a candidate that’s ripe for modernisation – maybe a small legacy service that could be moved to Azure, or a module that needs a refactor. Use GitHub Copilot to assess and estimate the effort. Often, you’ll find tasks once deemed “too hard” might now be feasible. Early successes will help win support for larger initiatives. Stay Tuned for Our Upcoming Blog Series: This post is just the beginning. In forthcoming posts, we’ll dive deeper into: Setting Up Your Organisation for Copilot Adoption: Practical tips on preparing your enterprise environment – from licensing and security considerations to training programs. We’ll discuss best practices (like running internal awareness campaigns, defining success metrics, and creating Copilot champions in your teams) to ensure a smooth rollout. Empowering Your Colleagues: How to foster a culture that embraces AI assistance. This includes enabling continuous learning, sharing prompt techniques and knowledge bases, and addressing any scepticism. We’ll cover strategies to support developers in using Copilot effectively, so that everyone from new hires to veteran engineers can amplify their productivity. Identifying High-Impact Modernisation Areas: Guidance on spotting where GitHub Copilot can add the most value. We’ll look at different domains – code, cloud, tests, data – and how to evaluate opportunities (for example, using telemetry or feedback to find repetitive tasks suited for AI, or legacy components with high ROI if modernised). Engage and Share: As you start leveraging Copilot for modernisation, share your experiences and results. Success stories (even small wins like “GitHub Copilot helped reduce our code review times” or “we migrated a component to Azure in 1 sprint”) can build momentum within your organisation and the broader community. We invite you to discuss and ask questions in the comments or in our tech community forums. Take a look at the new App Modernisation Guidance—a comprehensive, step-by-step playbook designed to help organisations: Understand what to modernise and why Migrate and rebuild apps with AI-first design Continuously optimise with built-in governance and observability Modernisation is a journey, and AI is the new compass and Copilot to guide the way. By embracing tools like GitHub Copilot, you position your organisation to break through modernisation barriers that once seemed insurmountable. The result is not just updated software, but a more agile, cloud-ready business and a happier, more productive development team. Now is the time to take that step. Empower your team with Copilot, and unlock the full potential of your applications and your developers. Stay tuned for more insights in our next posts, and let’s modernise what’s possible together!1.3KViews4likes1CommentSearch Less, Build More: Inner Sourcing with GitHub Copilot and ADO MCP Server
Developers burn cycles context‑switching: opening five repos to find a logging example, searching a wiki for a data masking rule, scrolling chat history for the latest pipeline pattern. Organisations that I speak to are often on the path of transformational platform engineering projects but always have the fear or doubt of "what if my engineers don't use these resources". While projects like Backstage still play a pivotal role in inner sourcing and discoverability I also empathise with developers who would argue "How would I even know in the first place, which modules have or haven't been created for reuse". In this blog we explore how we can ensure organisational standards and developer satisfaction without any heavy lifting on either side, no custom model training, no rewriting or relocating of repositories and no stagnant local data. Using GitHub Copilot + Azure DevOps MCP server (with the free `code_search` extension) we turn the IDE into an organizational knowledge interface. Instead of guessing or re‑implementing, engineers can start scaffolding projects or solving issues as they would normally (hopefully using Copilot) and without extra prompting. GitHub Copilot can lean into organisational standards and ensure recommendations are made with code snippets directly generated from existing examples. What Is the Azure DevOps MCP Server + code_search Extension? MCP (Model Context Protocol) is an open standard that lets agents (like GitHub Copilot) pull in structured, on-demand context from external systems. MCP servers contain natural language explanations of the tools that the agent can utilise allowing dynamic decision making of when to implement certain toolsets over others. The Azure DevOps MCP Server is the ADO Product Team's implementation of that standard. It exposes your ADO environment in a way Copilot can consume. Out of the box it gives you access to: Projects – list and navigate across projects in your organization. Repositories – browse repos, branches, and files. Work items – surface user stories, bugs, or acceptance criteria. Wiki's – pull policies, standards, and documentation. This means Copilot can ground its answers in live ADO content, instead of hallucinating or relying only on what’s in the current editor window. The ADO server runs locally from your own machine to ensure that all sensitive project information remains within your secure network boundary. This also means that existing permissions on ADO objects such as Projects or Repositories are respected. Wiki search tooling available out of the box with ADO MCP server is very useful however if I am honest I have seen these wiki's go unused with documentation being stored elsewhere either inside the repository or in a project management tool. This means any tool that needs to implement code requires the ability to accurately search the code stored in the repositories themself. That is where the code_search extension enablement in ADO is so important. Most organisations have this enabled already however it is worth noting that this pre-requisite is the real unlock of cross-repo search. This allows for Copilot to: Query for symbols, snippets, or keywords across all repos. Retrieve usage examples from code, not just docs. Locate standards (like logging wrappers or retry policies) wherever they live. Back every recommendation with specific source lines. In short: MCP connects Copilot to Azure DevOps. code_search makes that connection powerful by turning it into a discovery engine. What is the relevance of Copilot Instructions? One of the less obvious but most powerful features of GitHub Copilot is its ability to follow instructions files. Copilot automatically looks for these files and uses them as a “playbook” for how it should behave. There are different types of instructions you can provide: Organisational instructions – apply across your entire workspace, regardless of which repo you’re in. Repo-specific instructions – scoped to a particular repository, useful when one project has unique standards or patterns. Personal instructions – smaller overrides layered on top of global rules when a local exception applies. (Stored in .github/copilot-instructions.md) In this solution, I’m using a single personal instructions file. It tells Copilot: When to search (e.g., always query repos and wikis before answering a standards question). Where to look (Azure DevOps repos, wikis, and with code_search, the code itself). How to answer (responses must cite the repo/file/line or wiki page; if no source is found, say so). How to resolve conflicts (prefer dated wiki entries over older README fragments). As a small example, a section of a Copilot instruction file could look like this: # GitHub Copilot Instructions for Azure DevOps MCP Integration This project uses Azure DevOps with MCP server integration to provide organizational context awareness. Always check to see if the Azure DevOps MCP server has a tool relevant to the user's request. ## Core Principles ### 1. Azure DevOps Integration - **Always prioritize Azure DevOps MCP tools** when users ask about: - Work items, stories, bugs, tasks - Pull requests and code reviews - Build pipelines and deployments - Repository operations and branch management - Wiki pages and documentation - Test plans and test cases - Project and team information ### 2. Organizational Context Awareness - Before suggesting solutions, **check existing organizational patterns** by: - Searching code across repositories for similar implementations - Referencing established coding standards and frameworks - Looking for existing shared libraries and utilities - Checking architectural decision records (ADRs) in wikis ### 3. Cross-Repository Intelligence - When providing code suggestions: - **Search for existing patterns** in other repositories first - **Reference shared libraries** and common utilities - **Maintain consistency** with organizational standards - **Suggest reusable components** when appropriate ## Tool Usage Guidelines ### Work Items and Project Management When users mention bugs, features, tasks, or project planning: ``` ✅ Use: wit_my_work_items, wit_create_work_item, wit_update_work_item ✅ Use: wit_list_backlogs, wit_get_work_items_for_iteration ✅ Use: work_list_team_iterations, core_list_projects The result... To test this I created 3 ADO Projects each with between 1-2 repositories. The repositories were light with only ReadMe's inside containing descriptions of the "repo" and some code snippets examples for usage. I have then created a brand-new workspace with no context apart from a Copilot instructions document (which could be part of a repo scaffold or organisation wide) which tells Copilot to search code and the wikis across all ADO projects in my demo environment. It returns guidance and standards from all available repo's and starts to use it to formulate its response. In the screenshot I have highlighted some key parts with red boxes. The first being a section of the readme that Copilot has identified in its response, that part also highlighted within CoPilot chat response. I have highlighted the rather generic prompt I used to get this response at the bottom of that window too. Above I have highlighted Copilot using the MCP server tooling searching through projects, repo's and code. Finally the largest box highlights the instructions given to Copilot on how to search and how easily these could be optimised or changed depending on the requirements and organisational coding standards. How did I implement this? Implementation is actually incredibly simple. As mentioned I created multiple projects and repositories within my ADO Organisation in order to test cross-project & cross-repo discovery. I then did the following: Enable code_search - in your Azure DevOps organization (Marketplace → install extension). Login to Azure - Use the AZ CLI to authenticate to Azure with "az login". Create vscode/mcp.json file - Snippet is provided below, the organisation name should be changed to your organisations name. Start and enable your MCP server - In the mcp.json file you should see a "Start" button. Using the snippet below you will be prompted to add your organisation name. Ensure your Copilot agent has access to the server under "tools" too. View this setup guide for full setup instructions (azure-devops-mcp/docs/GETTINGSTARTED.md at main · microsoft/azure-devops-mcp) Create a Copilot Instructions file - with a search-first directive. I have inserted the full version used in this demo at the bottom of the article. Experiment with Prompts – Start generic (“How do we secure APIs?”). Review the output and tools used and then tailor your instructions. Considerations While this is a great approach I do still have some considerations when going to production. Latency - Using MCP tooling on every request will add some latency to developer requests. We can look at optimizing usage through copilot instructions to better identify when Copilot should or shouldn't use the ADO MCP server. Complex Projects and Repositories - While I have demonstrated cross project and cross repository retrieval my demo environment does not truly simulate an enterprise ADO environment. Performance should be tested and closely monitored as organisational complexity increases. Public Preview - The ADO MCP server is moving quickly but is currently still public preview. We have demonstrated in this article how quickly we can make our Azure DevOps content discoverable. While their are considerations moving forward this showcases a direction towards agentic inner sourcing. Feel free to comment below how you think this approach could be extended or augmented for other use cases! Resources MCP Server Config (/.vscode/mcp.json) { "inputs": [ { "id": "ado_org", "type": "promptString", "description": "Azure DevOps organization name (e.g. 'contoso')" } ], "servers": { "ado": { "type": "stdio", "command": "npx", "args": ["-y", "@azure-devops/mcp", "${input:ado_org}"] } } } Copilot Instructions (/.github/copilot-instructions.md) # GitHub Copilot Instructions for Azure DevOps MCP Integration This project uses Azure DevOps with MCP server integration to provide organizational context awareness. Always check to see if the Azure DevOps MCP server has a tool relevant to the user's request. ## Core Principles ### 1. Azure DevOps Integration - **Always prioritize Azure DevOps MCP tools** when users ask about: - Work items, stories, bugs, tasks - Pull requests and code reviews - Build pipelines and deployments - Repository operations and branch management - Wiki pages and documentation - Test plans and test cases - Project and team information ### 2. Organizational Context Awareness - Before suggesting solutions, **check existing organizational patterns** by: - Searching code across repositories for similar implementations - Referencing established coding standards and frameworks - Looking for existing shared libraries and utilities - Checking architectural decision records (ADRs) in wikis ### 3. Cross-Repository Intelligence - When providing code suggestions: - **Search for existing patterns** in other repositories first - **Reference shared libraries** and common utilities - **Maintain consistency** with organizational standards - **Suggest reusable components** when appropriate ## Tool Usage Guidelines ### Work Items and Project Management When users mention bugs, features, tasks, or project planning: ``` ✅ Use: wit_my_work_items, wit_create_work_item, wit_update_work_item ✅ Use: wit_list_backlogs, wit_get_work_items_for_iteration ✅ Use: work_list_team_iterations, core_list_projects ``` ### Code and Repository Operations When users ask about code, branches, or pull requests: ``` ✅ Use: repo_list_repos_by_project, repo_list_pull_requests_by_repo ✅ Use: repo_list_branches_by_repo, repo_search_commits ✅ Use: search_code for finding patterns across repositories ``` ### Documentation and Knowledge Sharing When users need documentation or want to create/update docs: ``` ✅ Use: wiki_list_wikis, wiki_get_page_content, wiki_create_or_update_page ✅ Use: search_wiki for finding existing documentation ``` ### Build and Deployment When users ask about builds, deployments, or CI/CD: ``` ✅ Use: pipelines_get_builds, pipelines_get_build_definitions ✅ Use: pipelines_run_pipeline, pipelines_get_build_status ``` ## Response Patterns ### 1. Discovery First Before providing solutions, always discover organizational context: ``` "Let me first check what patterns exist in your organization..." → Search code, check repositories, review existing work items ``` ### 2. Reference Organizational Standards When suggesting code or approaches: ``` "Based on patterns I found in your [RepositoryName] repository..." "Following your organization's standard approach seen in..." "This aligns with the pattern established in [TeamName]'s implementation..." ``` ### 3. Actionable Integration Always offer to create or update Azure DevOps artifacts: ``` "I can create a work item for this enhancement..." "Should I update the wiki page with this new pattern?" "Let me link this to the current iteration..." ``` ## Specific Scenarios ### New Feature Development 1. **Search existing repositories** for similar features 2. **Check architectural patterns** and shared libraries 3. **Review related work items** and planning documents 4. **Suggest implementation** based on organizational standards 5. **Offer to create work items** and documentation ### Bug Investigation 1. **Search for similar issues** across repositories and work items 2. **Check related builds** and recent changes 3. **Review test results** and failure patterns 4. **Provide solution** based on organizational practices 5. **Offer to create/update** bug work items and documentation ### Code Review and Standards 1. **Compare against organizational patterns** found in other repositories 2. **Reference coding standards** from wiki documentation 3. **Suggest improvements** based on established practices 4. **Check for reusable components** that could be leveraged ### Documentation Requests 1. **Search existing wikis** for related content 2. **Check for ADRs** and technical documentation 3. **Reference patterns** from similar projects 4. **Offer to create/update** wiki pages with findings ## Error Handling If Azure DevOps MCP tools are not available or fail: 1. **Inform the user** about the limitation 2. **Provide alternative approaches** using available information 3. **Suggest manual steps** for Azure DevOps integration 4. **Offer to help** with configuration if needed ## Best Practices ### Always Do: - ✅ Search organizational context before suggesting solutions - ✅ Reference existing patterns and standards - ✅ Offer to create/update Azure DevOps artifacts - ✅ Maintain consistency with organizational practices - ✅ Provide actionable next steps ### Never Do: - ❌ Suggest solutions without checking organizational context - ❌ Ignore existing patterns and implementations - ❌ Provide generic advice when specific organizational context is available - ❌ Forget to offer Azure DevOps integration opportunities --- **Remember: The goal is to provide intelligent, context-aware assistance that leverages the full organizational knowledge base available through Azure DevOps while maintaining development efficiency and consistency.**1.2KViews1like3CommentsReimagining AI Ops with Azure SRE Agent: New Automation, Integration, and Extensibility features
Azure SRE Agent offers intelligent and context aware automation for IT operations. Enhanced by customer feedback from our preview, the SRE Agent has evolved into an extensible platform to automate and manage tasks across Azure and other environments. Built on an Agentic DevOps approach - drawing from proven practices in internal Azure operations - the Azure SRE Agent has already saved over 20,000 engineering hours across Microsoft product teams operations, delivering strong ROI for teams seeking sustainable AIOps. An Operations Agent that adapts to your playbooks Azure SRE Agent is an AI powered operations automation platform that empowers SREs, DevOps, IT operations, and support teams to automate tasks such as incident response, customer support, and developer operations from a single, extensible agent. Its value proposition and capabilities have evolved beyond diagnosis and mitigation of Azure issues, to automating operational workflows and seamless integration with the standards and processes used in your organization. SRE Agent is designed to automate operational work and reduce toil, enabling developers and operators to focus on high-value tasks. By streamlining repetitive and complex processes, SRE Agent accelerates innovation and improves reliability across cloud and hybrid environments. In this article, we will look at what’s new and what has changed since the last update. What’s New: Automation, Integration, and Extensibility Azure SRE Agent just got a major upgrade. From no-code automation to seamless integrations and expanded data connectivity, here’s what’s new in this release: No-code Sub-Agent Builder: Rapidly create custom automations without writing code. Flexible, event-driven triggers: Instantly respond to incidents and operational changes. Expanded data connectivity: Unify diagnostics and troubleshooting across more data sources. Custom actions: Integrate with your existing tools and orchestrate end-to-end workflows via MCP. Prebuilt operational scenarios: Accelerate deployment and improve reliability out of the box. Unlike generic agent platforms, Azure SRE Agent comes with deep integrations, prebuilt tools, and frameworks specifically for IT, DevOps, and SRE workflows. This means you can automate complex operational tasks faster and more reliably, tailored to your organization’s needs. Sub-Agent Builder: Custom Automation, No Code Required Empower teams to automate repetitive operational tasks without coding expertise, dramatically reducing manual workload and development cycles. This feature helps address the need for targeted automation, letting teams solve specific operational pain points without relying on one-size-fits-all solutions. Modular Sub-Agents: Easily create custom sub-agents tailored to your team’s needs. Each sub-agent can have its own instructions, triggers, and toolsets, letting you automate everything from outage response to customer email triage. Prebuilt System Tools: Eliminate the inefficiency of creating basic automation from scratch, and choose from a rich library of hundreds of built-in tools for Azure operations, code analysis, deployment management, diagnostics, and more. Custom Logic: Align automation to your unique business processes by defining your automation logic and prompts, teaching the agent to act exactly as your workflow requires. Flexible Triggers: Automate on Your Terms Invoke the agent to respond automatically to mission-critical events, not wait for manual commands. This feature helps speed up incident response and eliminate missed opportunities for efficiency. Multi-Source Triggers: Go beyond chat-based interactions, and trigger the agent to automatically respond to Incident Management and Ticketing systems like PagerDuty and ServiceNow, Observability Alerting systems like Azure Monitor Alerts, or even on a cron-based schedule for proactive monitoring and best-practices checks. Additional trigger sources such as GitHub issues, Azure DevOps pipelines, email, etc. will be added over time. This means automation can start exactly when and where you need it. Event-Driven Operations: Integrate with your CI/CD, monitoring, or support systems to launch automations in response to real-world events - like deployments, incidents, or customer requests. Vital for reducing downtime, it ensures that business-critical actions happen automatically and promptly. Expanded Data Connectivity: Unified Observability and Troubleshooting Integrate data, enabling comprehensive diagnostics and troubleshooting and faster, more informed decision-making by eliminating silos and speeding up issue resolution. Multiple Data Sources: The agent can now read data from Azure Monitor, Log Analytics, and Application Insights based on its Azure role-based access control (RBAC). Additional observability data sources such as Dynatrace, New Relic, Datadog, and more can be added via the Remote Model Context Protocol (MCP) servers for these tools. This gives you a unified view for diagnostics and automation. Knowledge Integration: Rather than manually detailing every instruction in your prompt, you can upload your Troubleshooting Guide (TSG) or Runbook directly, allowing the agent to automatically create an execution plan from the file. You may also connect the agent to resources like SharePoint, Jira, or documentation repositories through Remote MCP servers, enabling it to retrieve needed files on its own. This approach utilizes your organization’s existing knowledge base, streamlining onboarding and enhancing consistency in managing incidents. Azure SRE Agent is also building multi-agent collaboration by integrating with PagerDuty and Neubird, enabling advanced, cross-platform incident management and reliability across diverse environments. Custom Actions: Automate Anything, Anywhere Extend automation beyond Azure and integrate with any tool or workflow, solving the problem of limited automation scope and enabling end-to-end process orchestration. Out-of-the-Box Actions: Instantly automate common tasks like running azcli, kubectl, creating GitHub issues, or updating Azure resources, reducing setup time and operational overhead. Communication Notifications: The SRE Agent now features built-in connectors for Outlook, enabling automated email notifications, and for Microsoft Teams, allowing it to post messages directly to Teams channels for streamlined communication. Bring Your Own Actions: Drop in your own Remote MCP servers to extend the agent’s capabilities to any custom tool or workflow. Future-proof your agentic DevOps by automating proprietary or emerging processes with confidence. Prebuilt Operations Scenarios Address common operational challenges out of the box, saving teams time and effort while improving reliability and customer satisfaction. Incident Response: Minimize business impact and reduce operational risk by automating detection, diagnosis, and mitigation of your workload stack. The agent has built-in runbooks for common issues related to many Azure resource types including Azure Kubernetes Service (AKS), Azure Container Apps (ACA), Azure App Service, Azure Logic Apps, Azure Database for PostgreSQL, Azure CosmosDB, Azure VMs, etc. Support for additional resource types is being added continually, please see product documentation for the latest information. Root Cause Analysis & IaC Drift Detection: Instantly pinpoint incident causes with AI-driven root cause analysis including automated source code scanning via GitHub and Azure DevOps integration. Proactively detect and resolve infrastructure drift by comparing live cloud environments against source-controlled IaC, ensuring configuration consistency and compliance. Handle Complex Investigations: Enable the deep investigation mode that uses a hypothesis-driven method to analyze possible root causes. It collects logs and metrics, tests hypotheses with iterative checks, and documents findings. The process delivers a clear summary and actionable steps to help teams accurately resolve critical issues. Incident Analysis: The integrated dashboard offers a comprehensive overview of all incidents managed by the SRE Agent. It presents essential metrics, including the number of incidents reviewed, assisted, and mitigated by the agent, as well as those awaiting human intervention. Users can leverage aggregated visualizations and AI-generated root cause analyses to gain insights into incident processing, identify trends, enhance response strategies, and detect areas for improvement in incident management. Inbuilt Agent Memory: The new SRE Agent Memory System transforms incident response by institutionalizing the expertise of top SREs - capturing, indexing, and reusing critical knowledge from past incidents, investigations, and user guidance. Benefit from faster, more accurate troubleshooting, as the agent learns from both successes and mistakes, surfacing relevant insights, runbooks, and mitigation strategies exactly when needed. This system leverages advanced retrieval techniques and a domain-aware schema to ensure every on-call engagement is smarter than the last, reducing mean time to resolution (MTTR) and minimizing repeated toil. Automatically gain a continuously improving agent that remembers what works, avoids past pitfalls, and delivers actionable guidance tailored to the environment. GitHub Copilot and Azure DevOps Integration: Automatically triage, respond to, and resolve issues raised in GitHub or Azure DevOps. Integration with modern development platforms such as GitHub Copilot coding agent increases efficiency and ensures that issues are resolved faster, reducing bottlenecks in the development lifecycle. Ready to get started? Azure SRE Agent home page Product overview Pricing Page Pricing Calculator Pricing Blog Demo recordings Deployment samples What’s Next? Give us feedback: Your feedback is critical - You can Thumbs Up / Thumbs Down each interaction or thread, or go to the “Give Feedback” button in the agent to give us in-product feedback - or you can create issues or just share your thoughts in our GitHub repo at https://github.com/microsoft/sre-agent. We’re just getting started. In the coming months, expect even more prebuilt integrations, expanded data sources, and new automation scenarios. We anticipate continuous growth and improvement throughout our agentic AI platforms and services to effectively address customer needs and preferences. Let us know what Ops toil you want to automate next!2.2KViews0likes0CommentsProactive Monitoring Made Simple with Azure SRE Agent
SRE teams strive for proactive operations, catching issues before they impact customers and reducing the chaos of incident response. While perfection may be elusive, the real goal is minimizing outages and gaining immediate line of sight into production environments. Today, that’s harder than ever. It requires correlating countless signals and alerts, understanding how they relate—or don’t relate—to each other, and assigning the right sense of urgency and impact. Anything that shortens this cycle, accelerates detection, and enables automated remediation is what modern SRE teams crave. What if you could skip the scripting and pipelines? What if you could simply describe what you want in plain language and let it run automatically on a schedule? Scheduled Tasks for Azure SRE Agent With Scheduled Tasks for Azure SRE Agent, that what-if scenario is now a reality. Scheduled tasks combine natural language prompts with Azure SRE Agent’s automation capabilities, so you can express intent, set a schedule, and let the agent do the rest—without writing a single line of code. This means: ⚡ Faster incident response through early detection ✅ Better compliance with automated checks 🎯 More time for high-value engineering work and innovation 💡 The shift from reactive to proactive: Instead of waiting for alerts to fire or customers to report issues, you’re continuously monitoring, validating, and catching problems before they escalate. How Scheduled Tasks Work Under the Hood When you create a Scheduled Task, the process is more than just running a prompt on a timer. Here’s what happens: 1. Prompt Interpretation and Plan Creation The SRE Agent takes your natural language prompt—such as “Scan all resources for security best practices”—and converts it into a structured execution plan. This plan defines the steps, tools, and data sources required to fulfill your request. 2. Built-In Tools and MCP Integration The agent uses its built-in capabilities (Azure CLI, Log Analytics workspace, Appinsights) and can also leverage 3 rd party data sources or tools via MCP server integration for extended functionality. 3. Results Analysis and Smart Summarization After execution, the agent analyzes results, identifies anomalies or issues, and provides actionable summaries not just raw data dumps. 4. Notification and Escalation Based on findings, the agent can: Post updates to Teams channels Create or update incidents Send email notifications Trigger follow-up actions Real-World Use Cases for Proactive Ops Here’s where scheduled tasks shine for SRE teams: Use Case Prompt Example Schedule Security Posture Check “Scan all subscriptions for resources with public endpoints and flag any that shouldn’t be exposed” Daily Cost Anomaly Detection “Compare this week’s spend against last week and alert if any service exceeds 20% growth” Weekly Compliance Drift Detection “Check all storage accounts for encryption settings and report any non-compliant resources” Daily Resource Health Summary “Summarize the health status of all production VMs and highlight any degraded instances” Every 4 hours Incident Trend Analysis “Analyze ICM incidents from the past week, identify patterns, and summarize top contributing services” Weekly Getting Started in 3 Steps Step 1: Define Your Intent Write a natural language prompt describing what you want to monitor or check. Be specific about: - What resources or scope - What conditions to look for - What action to take if issues are found Example: > “Every morning at 8 AM, check all production Kubernetes clusters for pods in CrashLoopBackOff state. If any are found, post a summary to the #sre-alerts Teams channel with cluster name, namespace, and pod details.” Step 2: Set Your Schedule Choose how often the task should run: - Cron expressions for precise control - Simple intervals (hourly, daily, weekly) Step 3: Define Where to Receive Updates Include in your prompt where you want results delivered when the task finishes execution. The agent can use its built-in tools and connectors to: - Post summaries to a Teams channel - Send email notifications - Create or update ICM incidents Example prompt with notification: > "Check all production databases for long-running queries over 30 seconds. If any are found, post a summary to the #database-alerts Teams channel." Why This Matters for Proactive Operations Traditional monitoring approaches have limitations: Traditional Approach With Scheduled Tasks Write scripts, maintain pipelines Describe in plain language Static thresholds and rules Contextual, AI-powered analysis Alert fatigue from noisy signals Smart summarization of what matters Separate tools for check vs. action Unified detection and response Requires dedicated DevOps effort Any SRE can create and modify The result? Your team spends less time building and maintaining monitoring infrastructure and more time on the work that truly requires human expertise. Best Practices for Scheduled Tasks Start simple, iterate — Begin with one or two high-value checks and expand as you gain confidence Be specific in prompts — The more context you provide, the better the results Set appropriate frequencies — Not everything needs to run hourly; match the schedule to the risk Review and refine — Check task results periodically and adjust prompts for better accuracy What’s Next? Scheduled tasks are just the beginning. We’re continuing to invest in capabilities that help SRE teams shift left—catching issues earlier, automating routine checks, and freeing up time for strategic work. Ready to Start? Use this sample that shows how to create a scheduled health check sub-agent: https://github.com/microsoft/sre-agent/blob/main/samples/automation/samples/02-scheduled-health-check-sample.md This example demonstrates: - Building a HealthCheckAgent using built-in tools like Azure CLI and Log Analytics Workspace - Scheduling daily health checks for a container app at 7 AM - Sending email alerts when anomalies are detected 🔗 Explore more samples here: https://github.com/microsoft/sre-agent/tree/main/samples More to Learn Ignite 2025 announcements: https://aka.ms/ignite25/blog/sreagent Documentation: https://aka.ms/sreagent/docs Support & Feature Requests: https://github.com/microsoft/sre-agent/issues549Views0likes0CommentsAgentic Applications on Azure Container Apps with Microsoft Foundry
Agents have exploded in popularity over the last year, reshaping not only the kinds of applications developers build but also the underlying architectures required to run them. As agentic applications grow more complex by invoking tools, collaborating with other services, and orchestrating multi-step workflows, architectures are naturally shifting toward microservice patterns. Azure Container Apps is purpose-built for this world: a fully managed, serverless platform designed to run independent, composable services with autoscaling, pay-per-second pricing, and seamless app-to-app communication. By combining Azure Container Apps with the Microsoft Agent Framework (MAF) and Microsoft Foundry, developers can run containerized agents on ACA while using Foundry to visualize and monitor how those agents behave. Azure Container Apps handles scalable, high-performance execution of agent logic, and Microsoft Foundry lights up rich observability for reasoning, planning, tool calls, and errors through its integrated monitoring experience. Together, they form a powerful foundation for building and operating modern, production-grade agentic applications. In this blog, we’ll walk through how to build an agent running on Azure Container Apps using Microsoft Agent Framework and OpenTelemetry, and then connect its telemetry to Microsoft Foundry so you can see your ACA-hosted agent directly in the Foundry monitoring experience. Prerequisites An Azure account with an active subscription. If you don't have one, you can create one for free. Ensure you have a Microsoft Foundry project setup. If you don’t already have one, you can create a project from the Azure AI Foundry portal. Azure Developer CLI (azd) installed Git installed The Sample Agent The complete sample code is available in this repo and can be deployed end-to-end with a single command. It's a basic currency agent. This sample deploys: An Azure Container Apps environment An agent built with Microsoft Agent Framework (MAF) OpenTelemetry instrumentation using the Azure Monitor exporter Application Insights to collect agent telemetry A Microsoft Foundry resource Environment wiring to integrate the agent with Microsoft Foundry Deployment Steps Clone the repository: git clone https://github.com/cachai2/foundry-3p-agents-samples.git cd foundry-3p-agents-samples/azure Authenticate with Azure: azd auth login Set the following azd environment variable azd env set AZURE_AI_MODEL_DEPLOYMENT_NAME gpt-4.1-mini Deploy to Azure azd up This provisions your Azure Container App, Application Insights, logs pipeline and required environment variables. While deployment runs, let’s break down how the code becomes compatible with Microsoft Foundry. 1. Setting up the agent in Azure Container Apps To integrate with Microsoft Foundry, the agent needs two essential capabilities: Microsoft Agent Framework (MAF): Handles agent logic, tools, schema-driven execution, and emits standardized gen_ai.* spans. OpenTelemetry: Sends the required agent/model/tool spans to Application Insights, which Microsoft Foundry consumes for visualization and monitoring. Although this sample uses MAF, the same pattern works with any agent framework. MAF and LangChain currently provide the richest telemetry support out-of-the-box. 1.1 Configure Microsoft Agent Framework (MAF) The agent includes: A tool (get_exchange_rate) An agent created by ChatAgent A runtime manager (AgentRuntime) A FastAPI app exposing /invoke Telemetry is enabled using two components already present in the repo: configure_azure_monitor: Configures OpenTelemetry + Azure Monitor exporter + auto-instrumentation. setup_observability(): Enables MAF’s additional spans (gen_ai.*, tool spans, agent lifecycle spans). From the repo (_configure_observability()): from azure.monitor.opentelemetry import configure_azure_monitor from agent_framework.observability import setup_observability from opentelemetry.sdk.resources import Resource def _configure_observability() -> None: configure_azure_monitor( resource=Resource.create({"service.name": SERVICE_NAME}), connection_string=APPLICATION_INSIGHTS_CONNECTION_STRING, ) setup_observability(enable_sensitive_data=False) This gives you: gen_ai.model.* spans (model usage + token counts) tool call spans agent lifecycle & execution spans HTTP + FastAPI instrumentation Standardized telemetry required by Microsoft Foundry No manual TracerProvider wiring or OTLP exporter setup is needed. 1.2 OpenTelemetry Setup (Azure Monitor Exporter) In this sample, OpenTelemetry is fully configured by Azure Monitor’s helper: import os from azure.monitor.opentelemetry import configure_azure_monitor from opentelemetry.sdk.resources import Resource from agent_framework.observability import setup_observability SERVICE_NAME = os.getenv("ACA_SERVICE_NAME", "aca-currency-exchange-agent") configure_azure_monitor( resource=Resource.create({"service.name": SERVICE_NAME}), connection_string=os.getenv("APPLICATION_INSIGHTS_CONNECTION_STRING"), ) # Enable Microsoft Agent Framework gen_ai/tool spans on top of OTEL setup_observability(enable_sensitive_data=False) This automatically: Installs and configures the OTEL tracer provider Enables batching + exporting of spans Adds HTTP/FastAPI/Requests auto-instrumentation Sends telemetry to Application Insights Adds MAF’s agent + tool spans All required environment variables (such as APPLICATION_INSIGHTS_CONNECTION_STRING) are injected automatically by azd up. 2. Deploy the Model and Test Your Agent Once azd up completes, you're ready to deploy a model to the Microsoft Foundry instance and test it. Find the resource name of your deployed Azure AI Services from azd up and navigate to it. From there, open it in Microsoft Foundry, navigate to the Model Catalog and add the gpt-4.1-mini model. Find the resource name of your deployed Azure Container App and navigate to it. Copy the application URL Set your container app URL environment variable in your terminal. (The below commands are for WSL.) export APP_URL="Your container app URL" Now, go back to your terminal and run the following curl command to invoke the agent curl -X POST "$APP_URL/invoke" \ -H "Content-Type: application/json" \ -d '{ "prompt": "How do I convert 100 USD to EUR?" }' 3. Verifying Telemetry to Application Insights Once your Container App starts, you can validate telemetry: Open the Application Insights resource created by azd up Go to Logs Run these queries (make sure you're in KQL mode not simple mode) Check MAF-genAI spans: dependencies | where timestamp > ago(3h) | extend genOp = tostring(customDimensions["gen_ai.operation.name"]), genSys = tostring(customDimensions["gen_ai.system"]), reqModel = tostring(customDimensions["gen_ai.request.model"]), resModel = tostring(customDimensions["gen_ai.response.model"]) | summarize count() by genOp, genSys, reqModel, resModel | order by count_ desc Check agent + tools: dependencies | where timestamp > ago(1h) | extend genOp = tostring(customDimensions["gen_ai.operation.name"]), agent = tostring(customDimensions["gen_ai.agent.name"]), tool = tostring(customDimensions["gen_ai.tool.name"]) | where genOp in ("agent.run", "invoke_agent", "execute_tool") | project timestamp, genOp, agent, tool, name, target, customDimensions | order by timestamp desc If telemetry is flowing, you’re ready to plug your agent into Microsoft Foundry. 4. Connect Application Insights to Microsoft Foundry Microsoft Foundry uses your Application Insights resource to power: Agent monitoring Tool call traces Reasoning graphs Multi-agent orchestration views Error analysis To connect: Navigate to Monitoring in the left navigation pane of the Microsoft Foundry portal. Select the Application analytics tab. Select your application insights resource created from azd up Connect the resource to your AI Foundry project. Note: If you are unable to add your application insights connection this way, you may need to follow the following: Navigate to the Overview of your Foundry project -> Open in management center -> Connected resources -> New Connection -> Application Insights Foundry will automatically start ingesting: gen_ai.* spans tool spans agent lifecycle spans workflow traces No additional configuration is required. 5. Viewing Dashboards & Traces in Microsoft Foundry Once your Application Insights connection is added, you can view your agent’s telemetry directly in Microsoft Foundry’s Monitoring experience. 5.1 Monitoring The Monitoring tab shows high-level operational metrics for your application, including: Total inference calls Average call duration Overall success/error rate Token usage (when available) Traffic trends over time This view is useful for spotting latency spikes, increased load, or changes in usage patterns, and these visualizations are powered by the telemetry emitting from your agents in Azure Container Apps. 5.2 Traces Timeline The Tracing tab shows the full distributed trace of each agent request, including all spans emitted by Microsoft Foundry and your Azure Container App with Microsoft Agent Framework. You can see: Top-level operations such as invoke_agent, chat, and process_thread_run Tool calls like execute_tool_get_exchange_rate Internal MAF steps (create_thread, create_message, run tool) Azure credential calls (GET /msi/token) Input/output tokens and duration for each span This view gives you an end-to-end breakdown of how your agent executed, which tools it invoked, and how long each step took — essential for debugging and performance tuning. Conclusion By combining Azure Container Apps, the Microsoft Agent Framework, and OpenTelemetry, you can build agents that are not only scalable and production-ready, but also fully observable and orchestratable inside Microsoft Foundry. Container Apps provides the execution engine and autoscaling foundation, MAF supplies structured agent logic and telemetry, and Microsoft Foundry ties everything together with powerful planning, monitoring, and workflow visualization. This architecture gives you the best of both worlds: the flexibility of running your own containerized agents with the dependencies you choose, and the intelligence of Microsoft Foundry to coordinate multi-step reasoning, tool call, and cross-agent workflows. As the agent ecosystem continues to evolve, Azure Container Apps and Microsoft Foundry provide a strong, extensible foundation for building the next generation of intelligent, microservice-driven applications.633Views1like0CommentsAnnouncing Advanced Kubernetes Troubleshooting Agent Capabilities (preview) in Azure Copilot
What’s new? Today, we're announcing Kubernetes troubleshooting agent capabilities in Azure Copilot, offering an intuitive, guided agentic experience that helps users detect, triage, and resolve common Kubernetes issues in their AKS clusters. The agent can provide root cause analysis for Kubernetes clusters and resources and is triggered by Kubernetes-specific keywords. It can detect problems like resource failures and scaling bottlenecks and intelligently correlates signals across metrics and events using `kubectl` commands when reasoning and provides actionable solutions. By simplifying complex diagnostics and offering clear next steps, the agent empowers users to troubleshoot independently. How it works With Kubernetes troubleshooting agent, Azure Copilot automatically investigates issues in your cluster by running targeted `kubectl` commands and analyzing your cluster’s configuration and current state. For instance, it identifies failing or pending pods, cluster events, resource utilization metrics, and configuration details to build a complete picture of what’s causing the issue. Azure Copilot then determines the most effective mitigation steps for your specific environment. It provides clear, step-by-step guidance, and in many cases, offers a one-click fix to resolve the issue automatically. If Azure Copilot can’t fully resolve the problem, it can generate a pre-populated support request with all the diagnostic details Microsoft Support needs. You’ll be able to review and confirm everything before the request is submitted. This agent is available via Azure Copilot in the Azure Portal. Learn more about how Azure Copilot works. How to Get Started To start using agents, your global administrator must request access to the agents preview at the tenant level in the Azure Copilot admin center. This confirms your interest in the preview and allows us to enable access. Once approved, users will see the Agent mode toggle in Azure Copilot chat and can then start using Copilot agents. Capacity is limited, so sign up early for the best chance to participate. Additionally, if you are interested in helping shape the future of agentic cloud ops and the role Copilot will play in it, please join our customer feedback program by filling up this form. Agents (preview) in Azure Copilot | Microsoft Learn Troubleshooting sample prompts From an AKS cluster resource, click Kubernetes troubleshooting with Copilot to automatically open Azure Copilot in context of the resource you want to troubleshoot: Try These Prompts to Get Started: Here are a few examples of the kinds of prompts you can use. If you're not already working in the context of a resource, you may need to provide the specific resource that you want to troubleshoot. "My pod keeps restarting can you help me figure out why" "Pods are stuck pending what is blocking them from being scheduled" "I am getting ImagePullBackOff how do I fix this" "One of my nodes is NotReady what is causing it" "My service cannot reach the backend pod what should I check" Note: When using these kinds of prompts, be sure agent mode is enabled by selecting the icon in the chat window: Learn More Troubleshooting agent capabilities in Agents (preview) in Azure Copilot | Microsoft Learn Announcing the CLI Agent for AKS: Agentic AI-powered operations and diagnostics at your fingertips - AKS Engineering Blog Microsoft Copilot in Azure Series - Kubectl | Microsoft Community Hub429Views3likes0CommentsBuilding AI apps and agents for the new frontier
Every new wave of applications brings with it the promise of reshaping how we work, build and create. From digitization to web, from cloud to mobile, these shifts have made us all more connected, more engaged and more powerful. The incoming wave of agentic applications, estimated to number 1.3 billion over the next 2 years[1] is no different. But the expectations of these new services are unprecedented, in part for how they will uniquely operate with both intelligence and agency, how they will act on our behalf, integrated as a member of our teams and as a part of our everyday lives. The businesses already achieving the greatest impact from agents are what we call Frontier Organizations. This week at Microsoft Ignite we’re showcasing what the best frontier organizations are delivering, for their employees, for their customers and for their markets. And we’re introducing an incredible slate of innovative services and tools that will help every organization achieve this same frontier transformation. What excites me most is how frontier organizations are applying AI to achieve their greatest level of creativity and problem solving. Beyond incremental increases in efficiency or cost savings, frontier firms use AI to accelerate the pace of innovation, shortening the gap from prototype to production, and continuously refining services to drive market fit. Frontier organizations aren’t just moving faster, they are using AI and agents to operate in novel ways, redefining traditional business processes, evolving traditional roles and using agent fleets to augment and expand their workforce. To do this they build with intent, build for impact and ground services in deep, continuously evolving, context of you, your organization and your market that makes every service, every interaction, hyper personalized, relevant and engaging. Today we’re announcing new capabilities that help you build what was previously impossible. To launch and scale fleets of agents in an open system across models, tools, and knowledge. And to run and operate agents with the confidence that every service is secure, governed and trusted. The question is, how do you get there? How do you build the AI apps and agents fueling the future? Read further for just a few highlights of how Microsoft can help you become frontier: Build with agentic DevOps Perhaps the greatest area of agentic innovation today is in service of developers. Microsoft’s strategy for agentic DevOps is redefining the developer experience to be AI-native, extending the power of AI to every stage of the software lifecycle and integrating AI services into the tools embraced by millions of developers. At Ignite, we’re helping every developer build faster, build with greater quality and security and deliver increasingly innovative apps that will shape their businesses. Across our developer services, AI agents now operate like an active member of your development and operations teams – collaborating, automating, and accelerating every phase of the software development lifecycle. From planning and coding to deployment and production, agents are reshaping how we build. And developers can now orchestrate fleets of agents, assigning tasks to agents to execute code reviews, testing, defect resolution, and even modernization of legacy Java and .NET applications. We continue to take this strategy forward with a new generation of AI-powered tools, with GitHub Agent HQ making coding agents like Codex, Claude Code, and Jules available soon directly in GitHub and Visual Studio Code, to Custom Agents to encode domain expertise, and “bring your own models” to empower teams to adapt and innovate. It’s these advancements that make GitHub Copilot, the world’s the most popular AI pair programmer, serving over 26 million users and helping organizations like Pantone, Ahold Delhaize USA, and Commerzbank streamline processes and save time. Within Microsoft’s own developer teams, we’re seeing transformative results with agentic DevOps. GitHub Copilot coding agent is now a top contributor—not only to GitHub’s core application but also to our major open-source projects like the Microsoft Agent Framework and Aspire. Copilot is reducing task completion time from hours to minutes and eliminating up to two weeks of manual development effort for complex work. Across Microsoft, 90% of pull requests are now covered by GitHub Copilot code review, increasing the pace of PR completion. Our AI-powered assistant for Microsoft’s engineering ecosystem is deeply integrated into VS Code, Teams, and other tools, giving engineers and product managers real-time, context-aware answers where they work—saving 2.2k developer days in September alone. For app modernization, GitHub Copilot has reduced modernization project timelines by as much as 88%. In production environments, Azure SRE agent has handled over 7K incidents and collected diagnostics on over 18K incidents, saving over 10,000 hours for on-call engineers. These results underscore how agentic workflows are redefining speed, scale, and reliability across the software lifecycle at Microsoft. Launch at speed and scale with a full-stack AI app and agent platform We’re making it easier to build, run, and scale AI agents that deliver real business outcomes. To accelerate the path to production for advanced AI applications and agents is delivering a complete, and flexible foundation that helps every organization move with speed and intelligence without compromising security, governance or operations. Microsoft Foundry helps organizations move from experimentation to execution at scale, providing the organization-wide observability and control that production AI requires. More than 80,000 customers, including 80% of the Fortune 500, use Microsoft Foundry to build, optimize, and govern AI apps and agents today. Foundry supports open frameworks like the Microsoft Agent Framework for orchestration, standard protocols like Model Context Protocol (MCP) for tool calling, and expansive integrations that enable context-aware, action-oriented agents. Companies like Nasdaq, Softbank, Sierra AI, and Blue Yonder are shipping innovative solutions with speed and precision. New at Ignite this year: Foundry Models With more than 11,000 models like OpenAI’s GPT-5, Anthropic’s Claude, and Microsoft’s Phi at their fingertips, developers, Foundry delivers the broadest model selection on any cloud. Developers have the power to benchmark, compare, and dynamically route models to optimize performance for every task. Model router is now generally available in Microsoft Foundry and in public preview in Foundry Agent Service. Foundry IQ, Delivering the deep context needed to make every agent grounded, productive, and reliable. Foundry IQ, now available in public preview, reimagines retrieval-augmented generation (RAG) as a dynamic reasoning process rather than a one-time lookup. Powered by Azure AI Search, it centralizes RAG workflows into a single grounding API, simplifying orchestration and improving response quality while respecting user permissions and data classifications. Foundry Agent Service now offers Hosted Agents, multi-agent workflows, built-in memory, and the ability to deploy agents directly to Microsoft 365 and Agent 365 in public preview. Foundry Tools, empowers developers to create agents with secure, real-time access to business systems, business logic, and multimodal capabilities. Developers can quickly enrich agents with real-time business context, multimodal capabilities, and custom business logic through secure, governed integration with 1,400+ systems and APIs. Foundry Control Plane, now in public preview, centralizes identity, policy, observability, and security signals and capabilities for AI developers in one portal. Build on an AI-Ready foundation for all applications Managed Instance on Azure App Service lets organizations migrate existing .NET web applications to the cloud without the cost or effort of rewriting code, allowing them to migrate directly into a fully managed platform-as-a-service (PaaS) environment. With Managed Instance, organizations can keep operating applications with critical dependencies on local Windows services, third-party vendor libraries, and custom runtimes without requiring any code changes. The result is faster modernizations with lower overhead, and access to cloud-native scalability, built-in security and Azure’s AI capabilities. MCP Governance with Azure API Management now delivers a unified control plane for APIs and MCP servers, enabling enterprises to extend their existing API investments directly into the agentic ecosystem with trusted governance, secure access, and full observability. Agent Loop and native AI integrations in Azure Logic Apps enable customers to move beyond rigid workflows to intelligent, adaptive automation that saves time and reduces complexity. These capabilities make it easier to build AI-powered, context-aware applications using low-code tools, accelerating innovation without heavy development effort. Azure Functions now supports hosting production-ready, reliable AI agents with stateful sessions, durable tool calls, and deterministic multi-agent orchestrations through the durable extension for Microsoft Agent Framework. Developers gain automatic session management, built-in HTTP endpoints, and elastic scaling from zero to thousands of instances — all with pay-per-use pricing and automated infrastructure. Azure Container Apps agents and security supercharges agentic workloads with automated deployment of multi-container agents, on-demand dynamic execution environments, and built-in security for runtime protection, and data confidentiality. Run and operate agents with confidence New at Ignite, we’re also expanding the use of agents to keep every application secure, managed and operating without compromise. Expanded agentic capabilities protect applications from code to cloud and continuously monitor and remediate production issues, while minimizing the efforts on developers, operators and security teams. Microsoft Defender for Cloud and GitHub Advanced Security: With the rise of multi-agent systems, the security threat surface continues to expand. Increased alert volumes, unprioritized threat signals, unresolved threats and a growing backlog of vulnerabilities is increasing risk for businesses while security teams and developers often operate in disconnected tools, making collaboration and remediation even more challenging. The new Defender for Cloud and GitHub Advanced Security integration closes this gap, connecting runtime context to code for faster alert prioritization and AI-powered remediation. Runtime context prioritizes security risks with insights that allow teams to focus on what matters most and fix issues faster with AI-powered remediation. When Defender for Cloud finds a threat exposed in production, it can now link to the exact code in GitHub. Developers receive AI suggested fixes directly inside GitHub, while security teams track progress in Defender for Cloud in real time. This gives both sides a faster, more connected way to identify issues, drive remediation, and keep AI systems secure throughout the app lifecycle. Azure SRE Agent is an always-on, AI-powered partner for cloud reliability, enabling production environments to become self-healing, proactively resolve issues, and optimize performance. Seamlessly integrated with Azure Monitor, GitHub Copilot, and incident management tools, Azure SRE Agent reduces operational toil. The latest update introduces no-code automation, empowering teams to tailor processes to their unique environments with minimal engineering overhead. Event-driven triggers enable proactive checks and faster incident response, helping minimize downtime. Expanded observability across Azure and third-party sources is designed to help teams troubleshoot production issues more efficiently, while orchestration capabilities support integration with MCP-compatible tools for comprehensive process automation. Finally, its adaptive memory system is designed to learn from interactions, helping improve incident handling and reduce operational toil, so organizations can achieve greater reliability and cost efficiency. The future is yours to build We are living in an extraordinary time, and across Microsoft we’re focused on helping every organization shape their future with AI. Today’s announcements are a big step forward on this journey. Whether you’re a startup fostering the next great concept or a global enterprise shaping your future, we can help you deliver on this vision. The frontier is open. Let’s build beyond expectations and build the future! Check out all the learning at Microsoft Ignite on-demand and read more about the announcements making it happen at: Recommended sessions BRK113: Connected, managed, and complete BRK103: Modernize your apps in days, not months, with GitHub Copilot BRK110: Build AI Apps fast with GitHub and Microsoft Foundry in action BRK100: Best practices to modernize your apps and databases at scale BRK114: AI Agent architectures, pitfalls and real-world business impact BRK115: Inside Microsoft's AI transformation across the software lifecycle Announcements aka.ms/AgentFactory aka.ms/AppModernizationBlog aka.ms/SecureCodetoCloudBlog aka.ms/AppPlatformBlog [1] IDC Info Snapshot, sponsored by Microsoft, 1.3 Billion AI Agents by 2028, #US53361825 and May 2025.961Views0likes0CommentsWhat's new in Azure Container Apps at Ignite'25
Azure Container Apps (ACA) is a fully managed serverless container platform that enables developers to design and deploy microservices and modern apps without requiring container expertise or needing infrastructure management. ACA is rapidly emerging as the preferred platform for hosting AI workloads and intelligent agents in the cloud. With features like code interpreter, Serverless GPUs, simplified deployments, and per-second billing, ACA empowers developers to build, deploy, and scale AI-driven applications with exceptional agility. ACA makes it easy to integrate agent frameworks, leverage GPU acceleration, and manage complex, multi-container AI environments - all while benefiting from a serverless, fully managed infrastructure. External customers like Replit, NFL Combine, Coca-Cola, and European Space Agency as well as internal teams like Microsoft Copilot (as well as many others) have bet on ACA as their compute platform for AI workloads. ACA is quickly becoming the leading platform for updating existing applications and moving them to a cloud-native setup. It allows organizations to seamlessly migrate legacy workloads - such as Java and .NET apps - by using AI-powered tools like GitHub Copilot to automate code upgrades, analyze dependencies, and handle cloud transformations. ACA’s fully managed, serverless environment removes the complexity of container orchestration. This helps teams break down monolithic or on-premises applications into robust microservices, making use of features like version control, traffic management, and advanced networking for fast iteration and deployment. By following proven modernization strategies while ensuring strong security, scalability, and developer efficiency, ACA helps organizations continuously innovate and future-proof their applications in the cloud. Customers like EY, London Stock Exchange, Chevron, and Paychex have unlocked significant business value by modernizing their workloads onto ACA. This blog presents the latest features and capabilities of ACA, enhancing its value for customers by enabling the rapid migration of existing workloads and development of new cloud applications, all while following cloud-native best practices. Secure sandboxes for AI compute ACA now supports dynamic shell sessions, currently available in public preview. These shell sessions are platform-managed built-in containers designed to execute common shell commands within an isolated, sandboxed environment. With the addition of empty shell sessions and an integrated MCP server, ACA enables customers to provision secure, isolated sandboxes instantly - ideal for use cases such as code execution, tool testing, and workflow automation. This functionality facilitates seamless integration with agent frameworks, empowering agents to access disposable compute environments as needed. Customers can benefit from rapid provisioning, improved security, and decreased operational overhead when managing agentic workloads. To learn more about how to add secure sandbox shell sessions to Microsoft Foundry agents as a tool, visit the walkthrough at https://aka.ms/aca/dynamic-sessions-mcp-tutorial. Docker Compose for Agents support ACA has added Docker Compose for Agents support in public preview, making it easy for developers to define agentic applications stack-agnostic, with MCP and custom model support. Combined with native serverless GPU support, Docker Compose for Agents allows fast iteration and scaling for AI-driven agents and application using LangGraph, LangChain CrewAI, Spring AI, Vercel AI SDK and Agno. These enhancements provide a developer-focused platform that streamlines the process for modern AI workloads, bringing together both development and production cycles into one unified environment. Additional regional availability for Serverless GPUs Serverless GPU solutions offer capabilities such as automatic scaling with NVIDIA A100 or T4 GPUs, per-second billing, and strict data isolation within container boundaries. ACA Serverless GPUs are now generally available in 11 additional regions, further facilitating developers’ ability to deploy AI inference, model training, and GPU-accelerated workloads efficiently. For further details on supported regions, please visit https://aka.ms/aca/serverless-gpu-regions. New Flexible Workload Profile The Flexible workload profile is a new option that combines the simplicity of serverless Consumption with the performance and control in Dedicated profiles. It offers a familiar pay-per-use model along with enhanced features like scheduled maintenance, dedicated networking, and support for larger replicas to meet demanding application needs. Customers can enjoy the advantages of dedicated resources together with effortless infrastructure management and billing from the Consumption model. Operating on a dedicated compute pool, this profile ensures better predictability and isolation without introducing extra operational complexity. It is designed for users who want the ease of serverless scaling, but also need more control over performance and environmental stability. Confidential Computing Confidential computing support is now available in public preview for ACA, offering hardware-based Trusted Execution Environments (TEEs) to secure data in use. This adds to existing encryption of data at rest and in transit by encrypting memory and verifying the cloud environment before processing. It helps protect sensitive data from unauthorized access, including by cloud operators, and is useful for organizations with high security needs. Confidential computing can be enabled via workload profiles, with the preview limited to certain regions. Extending Network capabilities General Availability of Rule-based Routing Rule-based routing for ACA is now generally available, offering users improved flexibility and easier composition when designing microservice architectures, conducting A/B testing, or implementing blue-green deployments. With this feature, you can route incoming HTTP traffic to specific apps within your environment by specifying host names or paths - including support for custom domains. You no longer need to set up an extra reverse proxy (like NGINX); simply define routing rules for your environment, and traffic will be automatically directed to the appropriate target apps. General Availability of Premium Ingress ACA support for Premium Ingress is now Generally Available. This feature introduces environment-level ingress configuration options, with the primary highlight being customizable ingress scaling. This capability supports the scaling of the ingress proxy, enabling customers to better handle higher demand workloads, such as large performance tests. By configuring your ingress proxy to run on workload profiles, you can scale out more ingress instances to handle more load. Running the ingress proxy on a workload profile will incur associated costs. To further enhance the flexibility of your application, this release includes other ingress-related settings, such as termination grace period, idle request timeout, and header count. Additional Management capabilities Public Preview of Deployment labels ACA now offers deployment labels in public preview, letting you assign names like dev, staging, or prod to container revisions which can be automatically assigned. This makes environment management easier and supports advanced strategies such as A/B testing and blue-green deployments. Labels help route traffic, control revisions, and streamline rollouts or rollbacks with minimal hassle. With deployment labels, you can manage app lifecycles more efficiently and reduce complexity across environments. General Availability of Durable Task Scheduler support Durable Task Scheduler (DTS) support is now generally available on ACA, empowering users with a robust pro-code workflow solution. With DTS, you can define reliable, containerized workflows as code, benefiting from built-in state persistence and fault-tolerant execution. This enhancement streamlines the creation and administration of complex workflows by boosting scalability, reliability, and enabling efficient monitoring capabilities. What’s next ACA is redefining how developers build and deploy intelligent agents. Agents deployed to Azure Container Apps with Microsoft Agent Framework and Open Telemetry can also be plugged directly into Microsoft Foundry, giving teams a single pane of glass for their agents in Azure. With serverless scale, GPU-on-demand, and enterprise-grade isolation, ACA provides the ideal foundation for hosting AI agents securely and cost-effectively. Utilizing open-source frameworks such as n8n on ACA enables the deployment of no-code automation agents that integrate seamlessly with Azure OpenAI models, supporting intelligent routing, summarization, and adaptive decision-making processes. Similarly, running other agent frameworks like Goose AI Agent on ACA enables it to operate concurrently with model inference workloads (including Ollama and GPT-OSS) within a unified, secure environment. The inclusion of serverless GPU support allows for efficient hosting of large language models such as GPT-OSS, optimizing both cost and scalability for inference tasks. Furthermore, ACA facilitates the remote hosting of Model Context Protocol (MCP) servers, granting agents secure access to external tools and APIs via streamable HTTP transport. Collectively, these features enable organizations to develop, scale, and manage complex agentic workloads - from workflow automation to AI-driven assistants - while leveraging ACA’s enterprise-grade security, autoscaling capabilities, and developer-centric user experience. In addition to these, ACA also enables a wide range of cross-compatibility with various frameworks and services, making it an ideal platform for running Azure Functions on ACA, Distributed Application Runtime (Dapr) microservices, as well as polyglot apps across .NET / Java / JavaScript. As always, we invite you to visit our GitHub page for feedback, feature requests, or questions about Azure Container Apps, where you can open a new issue or up-vote existing ones. If you’re curious about what we’re working on next, check out our roadmap. We look forward to hearing from you!904Views0likes0Comments