Overview
Connect Azure SRE Agent to AWS services using the official AWS MCP server. Query AWS documentation, execute any of the 15,000+ AWS APIs, run operational workflows, and kick off incident investigations through AWS DevOps Agent, which is now generally available.
The AWS MCP server connects Azure SRE Agent to AWS documentation, APIs, regional availability data, pre-built operational workflows (Agent SOPs), and AWS DevOps Agent for incident investigation. When connected, the proxy exposes 23 MCP tools organized into four categories: documentation and knowledge, API execution, guided workflows, and DevOps Agent operations.
How it works
The MCP Proxy for AWS runs as a local stdio process that SRE Agent spawns via uvx. The proxy handles AWS authentication using credentials you provide as environment variables. No separate infrastructure or container deployment is needed.
In the portal, you use the generic MCP server (User provided connector) option with stdio transport.
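As a rough illustration of what "spawned via uvx" means, the sketch below assembles the same command line the connector uses and starts it as a child process with the credentials in its environment. The function names `build_proxy_command` and `spawn_proxy` are hypothetical, and this is not how SRE Agent itself is implemented; it only mirrors the command, arguments, and environment variables described in this guide.

```python
import os
import subprocess


def build_proxy_command(endpoint: str, region: str) -> list[str]:
    """Return the argv for the stdio proxy, mirroring the connector fields."""
    return [
        "uvx",
        "mcp-proxy-for-aws@latest",
        endpoint,
        "--metadata",
        f"AWS_REGION={region}",
    ]


def spawn_proxy(endpoint: str, region: str,
                access_key_id: str, secret_access_key: str) -> subprocess.Popen:
    """Start the proxy as a child process that talks over stdin/stdout.

    Hypothetical helper: a real MCP client would also run the MCP
    handshake over these pipes, which is omitted here.
    """
    env = dict(os.environ)
    env["AWS_ACCESS_KEY_ID"] = access_key_id
    env["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    return subprocess.Popen(
        build_proxy_command(endpoint, region),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        env=env,
    )
```

Keeping the command construction separate from the spawn makes it easy to inspect the exact arguments without starting a process.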
Key capabilities
| Area | Capabilities |
|---|---|
| Documentation | Search all AWS docs, API references, and best practices; retrieve pages as markdown |
| API execution | Execute authenticated calls across 15,000+ AWS APIs with syntax validation and error handling |
| Agent SOPs | Pre-built multi-step workflows following AWS Well-Architected principles |
| Regional info | List all AWS regions, check service and feature availability by region |
| Infrastructure | Provision VPCs, databases, compute instances, storage, and networking resources |
| Troubleshooting | Analyze CloudWatch logs, CloudTrail events, permission issues, and application failures |
| Cost management | Set up billing alerts, analyze resource usage, and review cost data |
| DevOps Agent | Start AWS incident investigations, read root cause analyses, get remediation recommendations, and chat with AWS DevOps Agent |
Note: The AWS MCP Server is free to use. You pay only for the AWS resources consumed by API calls made through the server. All actions respect your existing IAM policies.
Prerequisites
- Azure SRE Agent resource deployed in Azure
- AWS account with IAM credentials configured
- uv package manager installed on the SRE Agent host (used to run the MCP proxy via uvx)
- IAM permissions: aws-mcp:InvokeMcp, aws-mcp:CallReadOnlyTool, and optionally aws-mcp:CallReadWriteTool
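If you have shell access to the host, a quick way to confirm the uv prerequisite is to check whether `uvx` resolves on PATH. The snippet below is a minimal sketch using only the Python standard library; `find_uvx` is a hypothetical helper name.

```python
import shutil


def find_uvx(which=shutil.which):
    """Return the full path to uvx if it is on PATH, otherwise None.

    The `which` parameter exists only so the lookup can be swapped out
    in tests; by default it uses shutil.which.
    """
    return which("uvx")


if __name__ == "__main__":
    path = find_uvx()
    if path:
        print(f"uvx found at {path}")
    else:
        print("uvx not found on PATH; install uv before adding the connector")
```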
Step 1: Create AWS access keys
The AWS MCP server authenticates using AWS access keys (an Access Key ID and a Secret Access Key). These keys are tied to an IAM user in your AWS account. You create them in the AWS Management Console.
Navigate to IAM in the AWS Console
- Sign in to the AWS Management Console
- In the top search bar, type IAM and select IAM from the results (Direct URL: https://console.aws.amazon.com/iam/)
- In the left sidebar, select Users (Direct URL: https://console.aws.amazon.com/iam/home#/users)
Create a dedicated IAM user
Create a dedicated user for SRE Agent rather than reusing a personal account. This makes it easy to scope permissions and rotate keys independently.
- Select Create user
- Enter a descriptive user name (e.g., sre-agent-mcp)
- Do not check "Provide user access to the AWS Management Console" (this user only needs programmatic access)
- Select Next
- Select Attach policies directly
- Select Create policy (opens in a new tab) and paste the following JSON in the JSON editor:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "aws-mcp:InvokeMcp",
        "aws-mcp:CallReadOnlyTool",
        "aws-mcp:CallReadWriteTool"
      ],
      "Resource": "*"
    }
  ]
}
- Select Next, give the policy a name (e.g., SREAgentMCPAccess), and select Create policy
- Back on the Create user tab, select the refresh button in the policy list, search for SREAgentMCPAccess, and check it
- Select Next > Create user
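Because aws-mcp:CallReadWriteTool is optional, you may prefer to generate the policy document rather than hand-edit it. The following sketch builds the same JSON as above, with write access switched on or off. `build_mcp_policy` is a hypothetical helper, not part of any AWS tooling.

```python
import json

# The two actions this guide lists as required, plus the optional write action.
MCP_READ_ACTIONS = ["aws-mcp:InvokeMcp", "aws-mcp:CallReadOnlyTool"]
MCP_WRITE_ACTION = "aws-mcp:CallReadWriteTool"


def build_mcp_policy(allow_write: bool = False) -> dict:
    """Return an IAM policy document granting access to the AWS MCP server."""
    actions = list(MCP_READ_ACTIONS)
    if allow_write:
        actions.append(MCP_WRITE_ACTION)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": actions, "Resource": "*"},
        ],
    }


if __name__ == "__main__":
    # Paste the printed JSON into the policy editor in the IAM console.
    print(json.dumps(build_mcp_policy(allow_write=False), indent=2))
```

Starting with `allow_write=False` gives a read-only policy that you can widen later if the agent needs to create or modify resources.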
Generate access keys
After the user is created, generate the access keys that SRE Agent will use:
- From the Users list, select the user you just created (e.g., sre-agent-mcp)
- Select the Security credentials tab
- Scroll down to the Access keys section
- Select Create access key
- For the use case, select Third-party service
- Check the confirmation checkbox and select Next
- Optionally add a description tag (e.g., Azure SRE Agent) and select Create access key
- Copy both values immediately:
| Value | Example format | Where you'll use it |
|---|---|---|
| Access Key ID | <your-access-key-id> | Connector environment variable AWS_ACCESS_KEY_ID |
| Secret Access Key | <your-secret-access-key> | Connector environment variable AWS_SECRET_ACCESS_KEY |
Important: The Secret Access Key is shown only once on this screen. If you close the page without copying it, you must delete the key and create a new one. Select Download .csv file as a backup, then store the file securely and delete it after configuring the connector.
Tip: For production use, also add service-specific IAM permissions for the AWS APIs you want SRE Agent to call. The MCP permissions above grant access to the MCP server itself, but individual API calls (e.g., ec2:DescribeInstances, logs:GetQueryResults) require their own IAM actions. Start broad for testing, then scope down using the principle of least privilege.
Required permissions summary
| Permission | Description | Required? |
|---|---|---|
| aws-mcp:InvokeMcp | Base access to the AWS MCP server | Yes |
| aws-mcp:CallReadOnlyTool | Read operations (describe, list, get, search) | Yes |
| aws-mcp:CallReadWriteTool | Write operations (create, update, delete resources) | Optional |
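To check a policy document against the summary above before attaching it, a small script can list which required actions are missing. This is a deliberately simplified sketch: it only looks at literal action names in Allow statements and ignores wildcards, Deny statements, and conditions, so it is no substitute for the IAM policy simulator. The function names are hypothetical.

```python
REQUIRED_ACTIONS = {"aws-mcp:InvokeMcp", "aws-mcp:CallReadOnlyTool"}


def allowed_actions(policy: dict) -> set[str]:
    """Collect literal action names from every Allow statement."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]
    allowed: set[str] = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM permits a single action string
            actions = [actions]
        allowed.update(actions)
    return allowed


def missing_required_actions(policy: dict) -> set[str]:
    """Return the required MCP actions the policy does not explicitly allow."""
    return REQUIRED_ACTIONS - allowed_actions(policy)
```

An empty result means both required permissions are present by name.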
Step 2: Add the MCP connector
Connect the AWS MCP server to your SRE Agent using the portal. The proxy runs as a local stdio process that SRE Agent spawns via uvx. It handles SigV4 signing using the AWS credentials you provide as environment variables.
Determine the AWS MCP endpoint for your region
The AWS MCP server has regional endpoints. Choose the one matching your AWS resources:
| AWS Region | MCP Endpoint URL |
|---|---|
| us-east-1 (default) | https://aws-mcp.us-east-1.api.aws/mcp |
| us-west-2 | https://aws-mcp.us-west-2.api.aws/mcp |
| eu-west-1 | https://aws-mcp.eu-west-1.api.aws/mcp |
Note: Without the --metadata AWS_REGION=<region> argument, operations default to us-east-1. You can always override the region in your query.
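The endpoint URLs in the table follow a single pattern, so the lookup can be sketched as a small function. `mcp_endpoint` is a hypothetical helper, and limiting it to the three regions listed above is an assumption made for this example; other regional endpoints may exist.

```python
# Regions taken from the table in this guide; treat the list as illustrative.
KNOWN_REGIONS = ("us-east-1", "us-west-2", "eu-west-1")


def mcp_endpoint(region: str = "us-east-1") -> str:
    """Return the AWS MCP endpoint URL for one of the regions listed above."""
    if region not in KNOWN_REGIONS:
        raise ValueError(
            f"{region!r} is not in the endpoint table; known: {', '.join(KNOWN_REGIONS)}"
        )
    return f"https://aws-mcp.{region}.api.aws/mcp"
```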
Using the Azure portal
- In Azure portal, navigate to your SRE Agent resource
- Select Builder > Connectors
- Select Add connector
- Select MCP server (User provided connector) and select Next
- Configure the connector with these values:
| Field | Value |
|---|---|
| Name | aws-mcp |
| Connection type | stdio |
| Command | uvx |
| Arguments | mcp-proxy-for-aws@latest https://aws-mcp.us-east-1.api.aws/mcp --metadata AWS_REGION=us-west-2 |
| Environment variables | AWS_ACCESS_KEY_ID=<your-access-key-id>, AWS_SECRET_ACCESS_KEY=<your-secret-access-key> |
- Select Next to review
- Select Add connector
Apart from the environment variables, this is equivalent to the following MCP client configuration used by tools like Claude Desktop or Amazon Kiro CLI:
{
  "mcpServers": {
    "aws-mcp": {
      "command": "uvx",
      "args": [
        "mcp-proxy-for-aws@latest",
        "https://aws-mcp.us-east-1.api.aws/mcp",
        "--metadata", "AWS_REGION=us-west-2"
      ]
    }
  }
}
Important: Store the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY securely. In the portal, environment variables for connectors are stored encrypted. For production deployments, consider using a dedicated IAM user with scoped-down permissions (see Step 1). Never commit credentials to source control.
Tip: If your SRE Agent host already has AWS credentials configured (e.g., via aws configure or an instance profile), the proxy will pick them up automatically from the environment. In that case, you can omit the explicit AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
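To decide whether the connector needs explicit keys, it can help to see which of the two variables are actually set in a given environment. The sketch below only inspects the two environment variables named in this guide; it does not look at `aws configure` profiles or instance profiles, and `credential_source` is a hypothetical helper.

```python
import os


def credential_source(env=None) -> str:
    """Classify how credentials would reach the proxy.

    Returns "explicit" when both key variables are set, "incomplete" when
    only one is set, and "ambient" when neither is set (the proxy would
    then rely on whatever credentials the host already provides).
    """
    env = os.environ if env is None else env
    has_id = bool(env.get("AWS_ACCESS_KEY_ID"))
    has_secret = bool(env.get("AWS_SECRET_ACCESS_KEY"))
    if has_id and has_secret:
        return "explicit"
    if has_id or has_secret:
        return "incomplete"
    return "ambient"
```

An "incomplete" result usually points to a typo or a missing value in the connector's environment variable field.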
Note: After adding the connector, the agent service initializes the MCP connection. This may take up to 30 seconds as uvx downloads the proxy package on first run (~89 dependencies). If the connector does not show Connected status after a minute, see the Troubleshooting section below.
Step 3: Add an AWS skill
Skills give agents domain knowledge and best practices for specific tool sets. Create an AWS skill so your agent knows how to troubleshoot AWS services, provision infrastructure, and follow operational workflows.
Tip: Why skills over subagents? Skills inject domain knowledge into the main agent's context, so it can use AWS expertise without handing off to a separate agent. Conversation context stays intact and there's no handoff latency. Use a subagent when you need full isolation with its own system prompt and tool restrictions.
- Navigate to Builder > Skills
- Select Add skill
- Paste the following skill configuration:
api_version: azuresre.ai/v1
kind: SkillConfiguration
metadata:
  owner: your-team@contoso.com
  version: "1.0.0"
spec:
  name: aws_infrastructure_operations
  display_name: AWS Infrastructure & Operations
  description: |
    AWS infrastructure and operations: EC2, EKS, Lambda, S3, RDS, CloudWatch,
    CloudTrail, IAM, VPC, and others. Also covers AWS DevOps Agent for
    incident investigation, root cause analysis, and remediation. Use for
    querying AWS resources, investigating issues, provisioning infrastructure,
    searching documentation, running AWS API calls via the AWS MCP server,
    and coordinating investigations between Azure SRE Agent and AWS DevOps Agent.
  instructions: |
    ## Overview
    The AWS MCP Server is a managed remote MCP server that gives AI
    assistants authenticated access to AWS services. It combines
    documentation access, authenticated API execution, and pre-built
    Agent SOPs in a single interface.

    **Authentication:** Handled automatically by the MCP Proxy for AWS,
    running as a local stdio process. All actions respect the existing IAM
    policies of the credentials configured in the connector environment
    variables.

    **Regional endpoints:** The MCP server has regional endpoints. The proxy
    is configured with a default region; you can override by specifying a
    region in your queries (e.g., "list my EC2 instances in eu-west-1").

    ## Searching Documentation
    Use aws___search_documentation to find information across all AWS docs.

    ## Executing AWS API Calls
    Use aws___call_aws to execute authenticated AWS API calls. The tool
    handles SigV4 signing and provides syntax validation.

    ## Using Agent SOPs
    Use aws___retrieve_agent_sop to find and follow pre-built workflows.
    SOPs provide step-by-step guidance following AWS Well-Architected
    principles.

    ## Regional Operations
    Use aws___list_regions to see all available AWS regions and
    aws___get_regional_availability to check service support in
    specific regions.

    ## AWS DevOps Agent Integration
    The AWS MCP server includes tools for AWS DevOps Agent:
    - aws___list_agent_spaces / aws___create_agent_space: Manage AgentSpaces
    - aws___create_investigation: Start incident investigations (5-8 min async)
    - aws___get_task: Poll investigation status
    - aws___list_journal_records: Read root cause analysis
    - aws___list_recommendations / aws___get_recommendation: Get remediation steps
    - aws___start_evaluation: Run proactive infrastructure evaluations
    - aws___create_chat / aws___send_message: Chat with AWS DevOps Agent

    ## Troubleshooting
    | Issue | Solution |
    |-------|----------|
    | Access denied errors | Verify IAM policy includes aws-mcp:InvokeMcp and aws-mcp:CallReadOnlyTool |
    | API call fails | Check IAM policy includes the specific service action |
    | Wrong region results | Specify the region explicitly in your query |
    | Proxy connection error | Verify uvx is installed and the proxy can reach aws-mcp.<region>.api.aws |
  mcp_connectors:
    - aws-mcp
- Select Save
Note: The mcp_connectors: - aws-mcp at the bottom links this skill to the connector you created in Step 2. The skill's instructions teach the agent how to use the 23 AWS MCP tools effectively.
Step 4: Test the integration
Open a new chat session with your SRE Agent and try these example prompts to verify the connection is working.
Quick verification
Start with this simple test to confirm the AWS MCP proxy is connected and authenticating correctly:
What AWS regions are available?
If the agent returns a list of regions, the connection is working. If you see authentication errors, go back and verify the IAM credentials and permissions from Step 1.
Documentation and knowledge
Search AWS documentation for EKS best practices for production clusters
What AWS regions support Amazon Bedrock?
Read the AWS documentation page about S3 bucket policies
Infrastructure queries
List all my running EC2 instances in us-east-1
Show me the details of my EKS cluster named "production-cluster"
What Lambda functions are deployed in my account?
CloudWatch and monitoring
What CloudWatch alarms are currently in ALARM state?
Show me the CPU utilization metrics for my RDS instance over the last 24 hours
Search CloudWatch Logs for errors in the /aws/lambda/my-function log group
Troubleshooting workflows
My EC2 instance i-0abc123 is not reachable. Help me troubleshoot.
My Lambda function is timing out. Walk me through the investigation.
Find an Agent SOP for troubleshooting EKS pod scheduling failures
Cross-cloud scenarios
My Azure Function is failing when calling AWS S3. Check if there are any S3
service issues and review the bucket policy for "my-data-bucket".
Compare the health of my AWS EKS cluster with my Azure AKS cluster.
AWS DevOps Agent investigations
List all available AWS DevOps Agent spaces in my account
Create an AWS DevOps Agent investigation for the high error rate on my
Lambda function "order-processor" in us-west-2
Start a chat with AWS DevOps Agent about my EKS cluster performance
Cross-agent investigation (Azure SRE Agent + AWS DevOps Agent)
My application is failing across both Azure and AWS. Start an AWS DevOps
Agent investigation for the AWS side while you check Azure Monitor for
errors on the Azure side. Then combine the findings into a unified root
cause analysis.
What's New: AWS DevOps Agent Integration
The AWS MCP server now includes full integration with AWS DevOps Agent, which recently became generally available. This means Azure SRE Agent can start autonomous incident investigations on AWS infrastructure and get back root cause analyses and remediation recommendations, all within the same chat session.
Available tools by category
AgentSpace management
| Tool | Description |
|---|---|
| aws___list_agent_spaces | Discover available AgentSpaces |
| aws___get_agent_space | Get AgentSpace details including ARN and configuration |
| aws___create_agent_space | Create a new AgentSpace for investigations |
Investigation lifecycle
| Tool | Description |
|---|---|
| aws___create_investigation | Start an incident investigation (async, 5-8 min) |
| aws___get_task | Poll investigation task status |
| aws___list_tasks | List investigation tasks with filters |
| aws___list_journal_records | Read root cause analysis journal |
| aws___list_executions | List execution runs for a task |
| aws___list_recommendations | Get prioritized mitigation recommendations |
| aws___get_recommendation | Get full remediation specification |
Proactive evaluations
| Tool | Description |
|---|---|
| aws___start_evaluation | Start an evaluation to find preventive recommendations |
| aws___list_goals | List evaluation goals and criteria |
Real-time chat
| Tool | Description |
|---|---|
| aws___create_chat | Start a real-time chat session with AWS DevOps Agent |
| aws___list_chats | List recent chat sessions |
| aws___send_message | Send a message and get a streamed response |
Cross-Agent Investigation Workflow
With the AWS MCP server connected, SRE Agent can run parallel investigations across both clouds. Here's how the cross-agent workflow works:
- Start an AWS investigation: Ask SRE Agent to create an AWS DevOps Agent investigation for the AWS-side symptoms
- Investigate Azure in parallel: While the AWS investigation runs (5-8 minutes), SRE Agent uses its native tools to check Azure Monitor, Log Analytics, and resource health
- Read AWS results: When the investigation completes, SRE Agent reads the journal records and recommendations
- Correlate findings: SRE Agent combines both sets of findings into a single root cause analysis with remediation steps for both clouds
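The four steps above amount to "start, run in parallel, poll, then merge". The sketch below shows that control flow with asyncio. Only the tool names come from this guide. The `call_tool` and `check_azure` callables, the argument keys (`description`, `task_id`), and the response fields (`task_id`, `status`, and the `COMPLETED`/`FAILED` values) are all assumptions made for illustration, since the actual tool schemas are not documented here.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical signature: invoke an MCP tool by name with a dict of arguments.
ToolCaller = Callable[[str, dict], Awaitable[dict]]


async def wait_for_task(call_tool: ToolCaller, task_id: str,
                        poll_seconds: float = 30.0,
                        timeout_seconds: float = 900.0) -> dict:
    """Poll aws___get_task until the task finishes or the timeout passes."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_seconds
    while True:
        task = await call_tool("aws___get_task", {"task_id": task_id})
        if task.get("status") in ("COMPLETED", "FAILED"):  # assumed status values
            return task
        if loop.time() >= deadline:
            raise TimeoutError(f"investigation {task_id} did not finish in time")
        await asyncio.sleep(poll_seconds)


async def cross_cloud_investigation(call_tool: ToolCaller,
                                    check_azure: Callable[[], Awaitable[Any]],
                                    symptom: str,
                                    poll_seconds: float = 30.0,
                                    timeout_seconds: float = 900.0) -> dict:
    """Run the AWS investigation and the Azure checks side by side."""
    # Step 1: start the AWS investigation (assumed to return a task id).
    started = await call_tool("aws___create_investigation", {"description": symptom})
    task_id = started["task_id"]

    # Step 2: wait on AWS while the Azure-side checks run concurrently.
    aws_task, azure_findings = await asyncio.gather(
        wait_for_task(call_tool, task_id, poll_seconds, timeout_seconds),
        check_azure(),
    )

    # Step 3: read the AWS results.
    journal = await call_tool("aws___list_journal_records", {"task_id": task_id})
    recommendations = await call_tool("aws___list_recommendations", {"task_id": task_id})

    # Step 4: hand both sets of findings back for correlation.
    return {
        "aws_status": aws_task.get("status"),
        "aws_journal": journal,
        "aws_recommendations": recommendations,
        "azure": azure_findings,
    }
```

Polling every 30 seconds fits the 5-8 minute investigation time quoted above; the 15-minute timeout is an arbitrary safety margin.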
Common cross-cloud scenarios:
- Azure app calling AWS services: Investigate Azure Function errors that correlate with AWS API failures
- Hybrid deployments: Check AWS EKS clusters alongside Azure AKS clusters during multi-cloud outages
- Data pipeline issues: Trace data flow across Azure Event Hubs and AWS Kinesis or SQS
- Agent-to-agent investigation: Start an AWS DevOps Agent investigation for the AWS side while Azure SRE Agent checks Azure resources in parallel
Architecture
The integration uses a stdio proxy architecture. SRE Agent spawns the proxy as a child process, and the proxy forwards requests to the AWS MCP endpoint:
Azure SRE Agent
    |
    |  stdio (local process)
    v
mcp-proxy-for-aws (spawned via uvx)
    |
    |  Authenticated HTTPS requests
    v
AWS MCP Server (aws-mcp.<region>.api.aws)
    |
    |--- Authenticated AWS API calls --> AWS Services
    |      (EC2, S3, CloudWatch, EKS, Lambda, etc.)
    |
    '--- DevOps Agent API calls ------> AWS DevOps Agent
                                          |-- AgentSpaces (workspaces)
                                          |-- Investigations (async root cause analysis)
                                          |-- Recommendations (remediation specs)
                                          '-- Chat sessions (real-time interaction)
Troubleshooting
Authentication and connectivity issues
| Error | Cause | Solution |
|---|---|---|
| 403 Forbidden | IAM user lacks MCP permissions | Add aws-mcp:InvokeMcp and aws-mcp:CallReadOnlyTool to the IAM policy |
| 401 Unauthorized | Invalid or expired AWS credentials | Rotate access keys and update the connector environment variables |
| Proxy fails to start | uvx not installed or not on PATH | Install uv on the SRE Agent host |
| Connection timeout | Proxy cannot reach the AWS MCP endpoint | Verify outbound HTTPS (port 443) is allowed to aws-mcp.<region>.api.aws |
| Connector added but tools not available | MCP connections are initialized at agent startup | Redeploy or restart the agent service from the Azure portal |
| Slow first connection | uvx downloads ~89 dependencies on first run | Wait up to 30 seconds for the initial connection |
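For the connection timeout row, a basic TCP probe from the host can show whether outbound port 443 to the endpoint is open at all. This is a minimal sketch with hypothetical helper names. It confirms only that a TCP connection can be opened, not that TLS, authentication, or the MCP handshake succeed.

```python
import socket


def mcp_host(region: str) -> str:
    """Build the endpoint hostname from the aws-mcp.<region>.api.aws pattern."""
    return f"aws-mcp.{region}.api.aws"


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    host = mcp_host("us-east-1")
    print(f"{host}:443 reachable: {can_connect(host)}")
```

If the probe fails from the agent host but succeeds elsewhere, a firewall or egress rule is the likely cause.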
API and permission issues
| Error | Cause | Solution |
|---|---|---|
| AccessDenied on API call | IAM user lacks the service-specific permission | Add the required IAM action (e.g., ec2:DescribeInstances) to the user's policy |
| CallReadWriteTool denied | Write permission not granted | Add aws-mcp:CallReadWriteTool to the IAM policy |
| Wrong region data | Proxy configured for a different region | Update the AWS_REGION metadata in the connector arguments, or specify the region in your query |
| API not found | Newly released or unsupported API | Use aws___suggest_aws_commands to find the correct API name |
Verify the connection
Test that the proxy can authenticate by opening a new chat session and asking:
What AWS regions are available?
If the agent returns a list of regions, the connection is working. If you see authentication errors, verify the IAM credentials and permissions from Step 1.
Re-authorize the integration
If you encounter persistent authentication issues:
- Navigate to the IAM console
- Select the user created in Step 1
- Navigate to Security credentials > Access keys
- Deactivate or delete the old access key
- Create a new access key
- Update the connector environment variables in the SRE Agent portal with the new credentials