What’s new?
Today, we're announcing Kubernetes troubleshooting agent capabilities in Azure Copilot, offering an intuitive, guided agentic experience that helps users detect, triage, and resolve common Kubernetes issues in their AKS clusters. The agent is triggered by Kubernetes-specific keywords and can provide root cause analysis for Kubernetes clusters and resources. It detects problems such as resource failures and scaling bottlenecks, uses `kubectl` commands to correlate signals across metrics and events as it reasons, and provides actionable solutions. By simplifying complex diagnostics and offering clear next steps, the agent empowers users to troubleshoot independently.
How it works
With the Kubernetes troubleshooting agent, Azure Copilot automatically investigates issues in your cluster by running targeted `kubectl` commands and analyzing your cluster’s configuration and current state. For instance, it examines failing or pending pods, cluster events, resource utilization metrics, and configuration details to build a complete picture of what’s causing the issue. Azure Copilot then determines the most effective mitigation steps for your specific environment. It provides clear, step-by-step guidance and, in many cases, offers a one-click fix to resolve the issue automatically.
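To make the idea of reading cluster state concrete, here is a minimal Python sketch of the kind of check described above. It is not Azure Copilot's implementation, which this post does not describe; it only assumes the standard JSON shape returned by `kubectl get pods -o json`. The function name `find_unhealthy_pods`, the `PROBLEM_REASONS` set, and the sample data are hypothetical and chosen for illustration.

```python
from typing import Any

# Assumption: a small, illustrative set of container waiting reasons to flag.
PROBLEM_REASONS = {"ImagePullBackOff", "ErrImagePull", "CrashLoopBackOff"}


def find_unhealthy_pods(pod_list: dict[str, Any]) -> list[dict[str, str]]:
    """Return one finding per problem seen in a `kubectl get pods -o json` document."""
    findings = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        # A pod that has not been scheduled or started reports phase "Pending".
        if status.get("phase") == "Pending":
            findings.append({"pod": name, "issue": "Pending"})
        # Containers that cannot start or keep crashing sit in a "waiting" state.
        for container in status.get("containerStatuses", []):
            waiting = container.get("state", {}).get("waiting") or {}
            reason = waiting.get("reason", "")
            if reason in PROBLEM_REASONS:
                findings.append({"pod": name, "issue": reason})
    return findings


# Hypothetical sample data standing in for real kubectl output.
SAMPLE_PODS = {
    "items": [
        {
            "metadata": {"name": "web-1"},
            "status": {
                "phase": "Running",
                "containerStatuses": [{"state": {"running": {}}}],
            },
        },
        {"metadata": {"name": "api-2"}, "status": {"phase": "Pending"}},
        {
            "metadata": {"name": "job-3"},
            "status": {
                "phase": "Running",
                "containerStatuses": [
                    {"state": {"waiting": {"reason": "CrashLoopBackOff"}}}
                ],
            },
        },
    ]
}

findings = find_unhealthy_pods(SAMPLE_PODS)
for finding in findings:
    print(f"{finding['pod']}: {finding['issue']}")
```

Against a real cluster, the same function could be fed the parsed output of `kubectl get pods -o json`. A pod that is both Pending and stuck on an image pull would produce two findings, one for each signal.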
If Azure Copilot can’t fully resolve the problem, it can generate a pre-populated support request with all the diagnostic details Microsoft Support needs. You’ll be able to review and confirm everything before the request is submitted.
This agent is available via Azure Copilot in the Azure portal. Learn more about how Azure Copilot works.
How to Get Started
To start using agents, your global administrator must request access to the agents preview at the tenant level in the Azure Copilot admin center.
This confirms your interest in the preview and allows us to enable access. Once approved, users will see the Agent mode toggle in Azure Copilot chat and can then start using Copilot agents. Capacity is limited, so sign up early for the best chance to participate.
Additionally, if you are interested in helping shape the future of agentic cloud ops and the role Copilot will play in it, please join our customer feedback program by filling out this form.
Read more: Agents (preview) in Azure Copilot | Microsoft Learn
Troubleshooting sample prompts
From an AKS cluster resource, click Kubernetes troubleshooting with Copilot to automatically open Azure Copilot in the context of the resource you want to troubleshoot.
Try These Prompts to Get Started:
Here are a few examples of the kinds of prompts you can use. If you're not already working in the context of a resource, you may need to provide the specific resource that you want to troubleshoot.
- "My pod keeps restarting can you help me figure out why"
- "Pods are stuck pending what is blocking them from being scheduled"
- "I am getting ImagePullBackOff how do I fix this"
- "One of my nodes is NotReady what is causing it"
- "My service cannot reach the backend pod what should I check"
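For comparison, the sketch below pairs each of the symptoms in these prompts with standard `kubectl` commands a person might run by hand. This post does not list which commands the agent issues, so the mapping is an assumption for illustration only. The names `MANUAL_CHECKS` and `suggest_checks` are hypothetical, as are the pod, node, and namespace names in the usage lines.

```python
# Assumption: an illustrative symptom-to-command mapping, not the agent's own logic.
MANUAL_CHECKS = {
    "pod_restarting": [
        "kubectl describe pod {pod} -n {namespace}",
        "kubectl logs {pod} -n {namespace} --previous",
    ],
    "pod_pending": [
        "kubectl describe pod {pod} -n {namespace}",
        "kubectl get events -n {namespace} --sort-by=.lastTimestamp",
    ],
    "image_pull_backoff": [
        "kubectl describe pod {pod} -n {namespace}",
    ],
    "node_not_ready": [
        "kubectl get nodes -o wide",
        "kubectl describe node {node}",
    ],
    "service_unreachable": [
        "kubectl describe service {service} -n {namespace}",
        "kubectl get endpoints {service} -n {namespace}",
    ],
}


def suggest_checks(symptom: str, **names: str) -> list[str]:
    """Fill resource names into the command templates for one symptom."""
    return [template.format(**names) for template in MANUAL_CHECKS[symptom]]


# Hypothetical resource names, used only to show the filled-in commands.
restart_checks = suggest_checks("pod_restarting", pod="web-1", namespace="default")
node_checks = suggest_checks("node_not_ready", node="aks-nodepool1-0")

for command in restart_checks + node_checks:
    print(command)
```

The commands are only assembled as strings here, not executed, so the sketch runs safely without cluster access.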
Note: When using these kinds of prompts, be sure agent mode is enabled by selecting the icon in the chat window.
Learn More
- Troubleshooting agent capabilities in Agents (preview) in Azure Copilot | Microsoft Learn
- Announcing the CLI Agent for AKS: Agentic AI-powered operations and diagnostics at your fingertips - AKS Engineering Blog
- Microsoft Copilot in Azure Series - Kubectl | Microsoft Community Hub