Apps on Azure Blog

Using an AI Agent to Troubleshoot and Fix Azure Function App Issues

theringe
Microsoft
Apr 15, 2026

TOC

  1. Preparation
  2. Troubleshooting Workflow
  3. Conclusion

 

Preparation

Topic: Required tools

  • AI agent: for example, Copilot CLI, OpenCode, Hermes, or OpenClaw. This example uses Copilot CLI.
  • Model access: for example, Anthropic Claude Opus.
  • Relevant skills: this example does not use skills, but using relevant skills can speed up troubleshooting.

 

Topic: Compliance with your organization

  • Enterprise-level projects are sensitive, so you must confirm with the appropriate stakeholders before using an AI agent on them.
  • Enterprise environments may also have strict standards for AI agent usage.

 

Topic: Network limitations

  • If the process involves restarting the Function App container or changing settings that trigger a restart, the session between you and the agent may be interrupted, and you will need to use /resume to continue.
  • If the agent needs internet access for investigation, the app must have outbound connectivity.
  • If the Kudu container cannot be used because of network issues, this type of investigation cannot be carried out.
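
Before starting an investigation, it can help to confirm from inside the Kudu container that outbound connectivity really is available. The following is a minimal sketch using only the Python standard library; the function name can_connect is hypothetical, and which hosts to probe depends on what your agent and model provider actually need.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_connect("management.azure.com", 443) returning False would suggest that outbound traffic to Azure Resource Manager is blocked. A successful TCP handshake only shows that the port is reachable; it does not prove that TLS inspection or a proxy will let the agent's requests through.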

 

Topic: Permission limitations

  • If you are using Azure built-in ("blessed") images, the containers use the fixed password Docker!, according to the official documentation. If you are using a custom container, however, you will need to provide your own login method.
  • For resources the agent does not already have permission to investigate, you will need to enable a system-assigned managed identity (SAMI) on the Function App and assign it the appropriate RBAC roles.

 

Troubleshooting Workflow

Let’s walk through a classic case: an HTTP trigger that cannot be tested from the Azure Portal. Clicking Test/Run in the Azure Portal produces an error message.

 

However, the home page does not show any abnormal status.

 

At this point, we first obtain the Function App’s SAMI and assign it the Owner role for the entire resource group. This is only for demonstration purposes. In practice, you should follow the principle of least privilege and scope permissions down to only the specific resources and operations that are actually required.
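
As a rough illustration of what a narrower scope could look like, the sketch below builds az role assignment create commands that target individual resources instead of the whole resource group. The helper names, the subscription and resource names, and the choice of roles are all assumptions for illustration; check the role names against the current list of Azure built-in roles before using them.

```python
import shlex


def resource_scope(subscription_id: str, resource_group: str,
                   resource_type: str, name: str) -> str:
    """Build the ARM resource ID of one resource, to be used as the RBAC scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/{resource_type}/{name}"
    )


def role_assignment_command(principal_id: str, role: str, scope: str) -> list:
    """Return the az CLI arguments that assign a role to a managed identity."""
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_id,
        # A managed identity is represented as a service principal.
        "--assignee-principal-type", "ServicePrincipal",
        "--role", role,
        "--scope", scope,
    ]


if __name__ == "__main__":
    # Placeholder values; none of these come from the original walkthrough.
    sub, rg = "00000000-0000-0000-0000-000000000000", "my-rg"
    principal = "11111111-1111-1111-1111-111111111111"
    targets = [
        # Example roles only (assumed to fit this case).
        ("Website Contributor", "Microsoft.Web/sites", "my-func-app"),
        ("Storage Account Key Operator Service Role",
         "Microsoft.Storage/storageAccounts", "mystorageacct"),
    ]
    for role, rtype, name in targets:
        scope = resource_scope(sub, rg, rtype, name)
        print(shlex.join(role_assignment_command(principal, role, scope)))
```

The script only prints the commands, so they can be reviewed before anything is run.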

 

Next, go to the Kudu container, which is the always-on maintenance container dedicated to the app.

 

Install and enable Copilot CLI.

 

Then we can describe the problem we are encountering.

 

After the agent processes the issue and interacts with you further, it can generate a reasonable investigation report. In this example, it appears that the Function App’s Storage Account access key had been rotated previously, but the Function App had not updated the corresponding environment variable.
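
The kind of check behind such a finding can be sketched in a few lines. This is not the agent's actual procedure; it assumes the affected setting is a standard storage connection string (for example AzureWebJobsStorage) and that the account's current keys have already been retrieved separately, for instance with az storage account keys list. The function names are hypothetical.

```python
def parse_connection_string(value: str) -> dict:
    """Split 'Name=Value;Name=Value' pairs into a dict."""
    parts = {}
    for segment in value.split(";"):
        if not segment.strip():
            continue
        # Split on the first '=' only: base64 keys usually end with '=' padding.
        name, _, val = segment.partition("=")
        parts[name.strip()] = val
    return parts


def key_is_stale(connection_string: str, current_keys: list) -> bool:
    """Return True if the configured AccountKey matches none of the current keys."""
    configured = parse_connection_string(connection_string).get("AccountKey")
    if configured is None:
        # For example an identity-based connection; nothing to compare.
        raise ValueError("connection string has no AccountKey")
    return configured not in current_keys


if __name__ == "__main__":
    # Fake values for illustration only.
    setting = ("DefaultEndpointsProtocol=https;AccountName=mystorageacct;"
               "AccountKey=b2xkLWtleQ==;EndpointSuffix=core.windows.net")
    print(key_is_stale(setting, ["bmV3LWtleS0x", "bmV3LWtleS0y"]))
```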

 

Once we understand the issue, we could perform the follow-up actions ourselves. However, to demonstrate the agent’s capabilities, you can also allow it to fix the problem directly, provided that you have granted the corresponding permissions through SAMI.
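
If you prefer to apply the fix yourself, the update amounts to swapping the key inside the connection string and writing the setting back. The sketch below is one way to prepare that change under the same assumption as before, namely that the setting is a standard storage connection string such as AzureWebJobsStorage; the helper names and sample values are hypothetical.

```python
def replace_account_key(connection_string: str, new_key: str) -> str:
    """Return the connection string with its AccountKey value replaced."""
    segments = []
    for segment in connection_string.split(";"):
        name, sep, _ = segment.partition("=")
        if sep and name.strip() == "AccountKey":
            segment = f"AccountKey={new_key}"
        segments.append(segment)
    return ";".join(segments)


def appsettings_command(app_name: str, resource_group: str,
                        setting_name: str, value: str) -> list:
    """Return the az CLI arguments that update one Function App setting."""
    return [
        "az", "functionapp", "config", "appsettings", "set",
        "--name", app_name,
        "--resource-group", resource_group,
        # Kept as a single argument so the ';' and '=' characters in the
        # value are not interpreted by a shell.
        "--settings", f"{setting_name}={value}",
    ]


if __name__ == "__main__":
    # Fake values for illustration only.
    old = ("DefaultEndpointsProtocol=https;AccountName=mystorageacct;"
           "AccountKey=b2xkLWtleQ==;EndpointSuffix=core.windows.net")
    new = replace_account_key(old, "bmV3LWtleS0x")
    print(appsettings_command("my-func-app", "my-rg", "AzureWebJobsStorage", new))
```

Passing the returned list to subprocess.run without shell=True avoids quoting problems. Updating an app setting restarts the app, which is why the session can drop at this step.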

 

During the process, the container restart will disconnect the session, so you will need to return to the Kudu container and resume the previous session so it can continue.

 

Finally, it will inform you that the issue has been fixed, and then you can validate the result.
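
Validation can also be scripted. The sketch below polls an HTTP endpoint until it answers 200, which allows for the short window after a restart when the app is not yet serving requests. The function names, the retry counts, and the URL you pass in are assumptions; a real HTTP trigger may also require a function key.

```python
import time
import urllib.error
import urllib.request
from typing import Optional


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; keep the code.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return None


def wait_until_ok(url: str, attempts: int = 5, delay: float = 5.0) -> bool:
    """Poll url until it returns 200, giving up after the given attempts."""
    for attempt in range(attempts):
        if http_status(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, wait_until_ok("https://<your-app>.azurewebsites.net/api/<your-function>") would be called with your own app and function names filled in.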

 

This is the validation result, and it looks like the repair was successful.

 

 

Conclusion

After each repair, we can even extract the experience from that case into a skill and store it in a Storage Account for future reuse. In this way, we can not only reduce the agent’s initial investigation time for similar issues, but also save tokens. This makes both time and cost management more efficient.
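
As a minimal sketch of what extracting a case into a skill might look like, the code below renders the symptom, root cause, and fix into a small text file. The file layout is a made-up convention, not a format required by Copilot CLI or any other agent, and the upload to a Storage Account is left out; the file is only written to a local directory.

```python
from pathlib import Path


def render_skill(name: str, symptom: str, root_cause: str, fix_steps: list) -> str:
    """Render one troubleshooting case as a short, reusable skill document."""
    lines = [
        f"Skill: {name}",
        "",
        "Symptom",
        symptom,
        "",
        "Root cause",
        root_cause,
        "",
        "Fix",
    ]
    lines += [f"{i}. {step}" for i, step in enumerate(fix_steps, start=1)]
    return "\n".join(lines) + "\n"


def save_skill(directory: Path, name: str, content: str) -> Path:
    """Write the skill to <directory>/<name>.txt and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.txt"
    path.write_text(content, encoding="utf-8")
    return path


if __name__ == "__main__":
    text = render_skill(
        "rotated-storage-key",
        "Test/Run in the Azure Portal returns an error for an HTTP trigger.",
        "The Storage Account access key was rotated, but the Function App "
        "setting still holds the old key.",
        ["Retrieve the current access key.",
         "Update the Function App setting with the new key.",
         "Wait for the restart and test the trigger again."],
    )
    print(save_skill(Path("skills"), "rotated-storage-key", text))
```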
