How to Set Up Claude Code with Microsoft Foundry Models on macOS
Introduction

Building with AI isn't just about picking a smart model; it is also about where that model lives. I chose to route my Claude Code setup through Microsoft Foundry because I needed more than a raw API. I wanted the reliability, compliance, and structured management that come with Microsoft's ecosystem. When you are moving from a prototype to something real, having that level of infrastructure backing your calls makes a significant difference.

The challenge is that Foundry is designed for enterprise cloud environments, while my daily development work happens locally on a MacBook. Getting the two to communicate seamlessly involved navigating a maze of shell configurations and environment variables that weren't immediately obvious. I wrote this guide to document the exact steps for bridging that gap. Here is how you can set up Claude Code to run locally on macOS while leveraging the stability of models deployed on Microsoft Foundry.

Requirements

Before we open the terminal, make sure you have the necessary accounts and environments ready. Since we are bridging a local CLI with an enterprise cloud setup, having these credentials handy now will save you time later.

- Azure subscription with Microsoft Foundry set up - This is the most critical piece. You need an active Azure subscription where the Microsoft Foundry environment is initialized. Ensure that you have deployed the Claude model you intend to use and that the deployment status is active. You will need the specific endpoint URL and the associated API keys from this deployment to configure the connection.
- An Anthropic user account - Even though the compute happens on Azure, the interface requires an Anthropic account. You will need this to authenticate your session and manage your user profile settings within the Claude Code ecosystem.
- Claude Code client on macOS - We will be running the commands locally, so you need the Claude Code CLI installed on your MacBook.

Step 1: Install Claude Code on macOS

The recommended installation method is Homebrew or curl, which sets the CLI up for OS-level terminal access.

Option A: Homebrew (recommended)

```bash
brew install --cask claude-code
```

Option B: curl

```bash
curl -fsSL https://claude.ai/install.sh | bash
```

Verify the installation by running `claude --version`.

Step 2: Set Up Microsoft Foundry to Deploy a Claude Model

Navigate to your Microsoft Foundry portal, find the Claude model catalog, and deploy the selected Claude model:

Microsoft Foundry > My Assets > Models + endpoints > + Deploy Model > Deploy Base model > Search for "Claude"

In your Model Deployment dashboard, open the deployed Claude model and copy its "Endpoints and keys". Store these values somewhere safe, because we will need them to configure Claude Code later on.

Configure Environment Variables in the macOS Terminal

Now we need to tell your local Claude Code client to route requests through Microsoft Foundry instead of the default Anthropic endpoints. This is handled by setting specific environment variables that act as a bridge between your local machine and your Azure resources. You could run these commands manually every time you open a terminal, but it is much more efficient to save them permanently in your shell profile. For most modern macOS users, this file is .zshrc.
Open your terminal and add the following lines to your profile, making sure to replace the placeholder text with your actual Azure credentials:

```bash
export CLAUDE_CODE_USE_FOUNDRY=1
export ANTHROPIC_FOUNDRY_API_KEY="your-azure-api-key"
export ANTHROPIC_FOUNDRY_RESOURCE="your-resource-name"
# Specify the deployment name for Opus
export CLAUDE_CODE_MODEL="your-opus-deployment-name"
```

Once you have added these variables, reload your shell configuration for the changes to take effect. Run the source command below to update your current session, and then verify the setup by launching Claude:

```bash
source ~/.zshrc
claude
```

If everything is configured correctly, the Claude CLI will initialize using your Microsoft Foundry deployment as the backend. Once you execute the claude command, the CLI will prompt you to choose an authentication method. Select Option 2 (Anthropic Console account) to proceed. This action opens your default web browser and redirects you to the Claude Console. Simply sign in using your standard Anthropic account credentials. After you have successfully signed in, you will be presented with a permissions screen. Click the Authorize button to link your web session back to your local terminal. Return to your terminal window, and you should see a notification confirming that the login process is complete. Press Enter to finalize the setup.

You are now fully connected. You can start using Claude Code locally, powered entirely by the model deployment running in your Microsoft Foundry environment.

Conclusion

Setting up this environment might seem like a heavy lift just to run a CLI tool, but the payoff is significant. You now have a workflow that combines the immediate feedback of local development with the security and infrastructure benefits of Microsoft Foundry. One of the most practical upgrades is the removal of standard usage caps: you are no longer limited by the 5-hour API call limits, which gives you the freedom to iterate, test, and debug for as long as your project requires without hitting a wall.

By bridging your local macOS terminal to Azure, you are no longer just hitting an API endpoint. You are leveraging a managed, compliance-ready environment that scales with your needs. The best part is that once the configuration is locked in, you don't need to think about the plumbing again. You can focus entirely on coding, knowing that the reliability of an enterprise platform is running quietly in the background supporting every command.
Building with Azure OpenAI Sora: A Complete Guide to AI Video Generation

In this comprehensive guide, we'll explore how to integrate both Sora 1 and Sora 2 models from Azure OpenAI Service into a production web application. We'll cover API integration, request body parameters, cost analysis, limitations, and the key differences between using Azure AI Foundry endpoints versus OpenAI's native API.

Table of Contents

1. Introduction to Sora Models
2. Azure AI Foundry vs. OpenAI API Structure
3. API Integration: Request Body Parameters
4. Video Generation Modes
5. Cost Analysis per Generation
6. Technical Limitations & Constraints
7. Resolution & Duration Support
8. Implementation Best Practices

Introduction to Sora Models

Sora is OpenAI's groundbreaking text-to-video model that generates realistic videos from natural language descriptions. Azure AI Foundry provides access to two versions:

- Sora 1: The original model, focused primarily on text-to-video generation, with extensive resolution options (480p to 1080p) and flexible duration (1-20 seconds)
- Sora 2: The enhanced version with native audio generation and multiple generation modes (text-to-video, image-to-video, video-to-video remix), but more constrained resolution options (720p only in public preview)

Azure AI Foundry vs. OpenAI API Structure

Key Architectural Differences

Sora 1 uses Azure's traditional deployment-based API structure:

- Endpoint pattern: `https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}/...`
- Parameters: Azure-specific naming such as `n_seconds` and `n_variants`, with separate `width`/`height` fields
- Job management: `/jobs/{id}` for status polling
- Content download: `/video/generations/{generation_id}/content/video`

Sora 2 adapts OpenAI's v1 API format while still being hosted on Azure:

- Endpoint pattern: `https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}/videos`
- Parameters: OpenAI-style naming such as `seconds` (string) and `size` (a combined dimension string like `"1280x720"`)
- Job management: `/videos/{video_id}` for status polling
- Content download: `/videos/{video_id}/content`

Why This Matters
This architectural difference requires conditional request formatting in your code:

```javascript
const isSora2 = deployment.toLowerCase().includes('sora-2');

if (isSora2) {
  requestBody = {
    model: deployment,
    prompt,
    size: `${width}x${height}`,      // Combined format
    seconds: duration.toString(),    // String type
  };
} else {
  requestBody = {
    model: deployment,
    prompt,
    height,                          // Separate dimensions
    width,
    n_seconds: duration.toString(),  // Azure naming
    n_variants: variants,
  };
}
```

API Integration: Request Body Parameters

Sora 1 API Parameters

Standard text-to-video request:

```json
{
  "model": "sora-1",
  "prompt": "Wide shot of a child flying a red kite in a grassy park, golden hour sunlight, camera slowly pans upward.",
  "height": "720",
  "width": "1280",
  "n_seconds": "12",
  "n_variants": "2"
}
```

Parameter details:

- `model` (string, required): Your Azure deployment name
- `prompt` (string, required): Natural language description of the video (max 32000 chars)
- `height` (string, required): Video height in pixels
- `width` (string, required): Video width in pixels
- `n_seconds` (string, required): Duration (1-20 seconds)
- `n_variants` (string, optional): Number of variations to generate (1-4, constrained by resolution)

Sora 2 API Parameters

Text-to-video request:

```json
{
  "model": "sora-2",
  "prompt": "A serene mountain landscape with cascading waterfalls, cinematic drone shot",
  "size": "1280x720",
  "seconds": "12"
}
```

Image-to-video request (uses FormData):

```javascript
const formData = new FormData();
formData.append('model', 'sora-2');
formData.append('prompt', 'Animate this image with gentle wind movement');
formData.append('size', '1280x720');
formData.append('seconds', '8');
formData.append('input_reference', imageFile); // JPEG/PNG/WebP
```

Video-to-video remix request:

- Endpoint: `POST .../videos/{video_id}/remix`
- Body: only `{ "prompt": "your new description" }`
- The original video's structure, motion, and framing are reused while applying the new prompt

Parameter details:

- `model` (string, optional): Your deployment name
- `prompt` (string, required): Video description
- `size` (string, optional): Either `"720x1280"` or `"1280x720"` (defaults to `"720x1280"`)
- `seconds` (string, optional): `"4"`, `"8"`, or `"12"` (defaults to `"4"`)
- `input_reference` (file, optional): Reference image for image-to-video mode
- `remix_video_id` (string, URL parameter): ID of the video to remix
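If you are working outside the JavaScript stack, the same Sora 2 request can be submitted from a short Python script. The sketch below is minimal and illustrative: the environment variable names and the api-version value are assumptions, so substitute the endpoint details shown for your own Azure deployment.

```python
import os
import requests

# Assumed environment values -- replace with your own resource, deployment,
# and the api-version shown in your deployment details.
RESOURCE = os.environ["AZURE_OPENAI_RESOURCE"]    # e.g. "my-openai-resource"
DEPLOYMENT = os.environ["SORA2_DEPLOYMENT"]       # your Sora 2 deployment name
API_KEY = os.environ["AZURE_OPENAI_API_KEY"]
API_VERSION = "2024-08-01-preview"                # assumption; check your portal

# Sora 2 uses the OpenAI-style /videos route described above
url = (
    f"https://{RESOURCE}.openai.azure.com/openai/deployments/"
    f"{DEPLOYMENT}/videos?api-version={API_VERSION}"
)

body = {
    "model": DEPLOYMENT,
    "prompt": "A serene mountain landscape with cascading waterfalls, cinematic drone shot",
    "size": "1280x720",   # combined dimension string, Sora 2 style
    "seconds": "12",      # string type: "4", "8", or "12"
}

resp = requests.post(url, headers={"api-key": API_KEY}, json=body, timeout=30)
resp.raise_for_status()
job = resp.json()
print("Submitted video job:", job.get("id"), "status:", job.get("status"))
```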
Video Generation Modes

1. Text-to-Video (Both Models)

The foundational mode, where you provide a text prompt describing the desired video.

Implementation:

```javascript
const response = await fetch(endpoint, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'api-key': apiKey,
  },
  body: JSON.stringify({
    model: deployment,
    prompt: "A train journey through mountains with dramatic lighting",
    size: "1280x720",
    seconds: "12",
  }),
});
```

Best practices:

- Include shot type (wide, close-up, aerial)
- Describe subject, action, and environment
- Specify lighting conditions (golden hour, dramatic, soft)
- Add camera movement if desired (pans, tilts, tracking shots)

2. Image-to-Video (Sora 2 Only)

Generate a video anchored to or starting from a reference image.

Key requirements:

- Supported formats: JPEG, PNG, WebP
- Image dimensions must exactly match the selected video resolution
- Our implementation automatically resizes uploaded images to match

Implementation detail:

```javascript
// Resize image to match video dimensions
const targetWidth = parseInt(width);
const targetHeight = parseInt(height);
const resizedImage = await resizeImage(inputReference, targetWidth, targetHeight);

// Send as multipart/form-data
formData.append('input_reference', resizedImage);
```

3. Video-to-Video Remix (Sora 2 Only)

Create variations of existing videos while preserving their structure and motion.

Use cases:

- Change weather conditions in the same scene
- Modify time of day while keeping camera movement
- Swap subjects while maintaining composition
- Adjust artistic style or color grading

Endpoint structure:

```
POST {base_url}/videos/{original_video_id}/remix?api-version=2024-08-01-preview
```

Implementation:

```javascript
let requestEndpoint = endpoint;
if (isSora2 && remixVideoId) {
  const [baseUrl, queryParams] = endpoint.split('?');
  const root = baseUrl.replace(/\/videos$/, '');
  requestEndpoint = `${root}/videos/${remixVideoId}/remix${queryParams ? '?' + queryParams : ''}`;
}
```

Cost Analysis per Generation

Sora 1 Pricing Model

- Base rate: ~$0.05 per second per variant at 720p
- Resolution scaling: cost scales linearly with pixel count

Formula:

```javascript
const basePrice = 0.05;
const basePixels = 1280 * 720; // Reference resolution
const currentPixels = width * height;
const resolutionMultiplier = currentPixels / basePixels;
const totalCost = basePrice * duration * variants * resolutionMultiplier;
```

Examples:

- 720p (1280×720), 12 seconds, 1 variant: $0.60
- 1080p (1920×1080), 12 seconds, 1 variant: $1.35
- 720p, 12 seconds, 2 variants: $1.20

Sora 2 Pricing Model

- Flat rate: $0.10 per second per variant (no resolution scaling in public preview)

Formula:

```javascript
const totalCost = 0.10 * duration * variants;
```

Examples:

- 720p (1280×720), 4 seconds: $0.40
- 720p (1280×720), 12 seconds: $1.20
- 720p (720×1280), 8 seconds: $0.80

Note: Since Sora 2 currently only supports 720p in public preview, resolution doesn't affect cost; only duration matters.

Cost Comparison

| Scenario | Sora 1 (720p) | Sora 2 (720p) | Winner |
|---|---|---|---|
| 4s video | $0.20 | $0.40 | Sora 1 |
| 12s video | $0.60 | $1.20 | Sora 1 |
| 12s + audio | N/A (no audio) | $1.20 | Sora 2 (unique) |
| Image-to-video | N/A | $0.40-$1.20 | Sora 2 (unique) |

Recommendation: Use Sora 1 for cost-effective silent videos at various resolutions. Use Sora 2 when you need audio, image/video inputs, or remix capabilities.
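The pricing rules above translate directly into a small helper function. Here is a minimal Python sketch that reproduces the worked examples; the rates are the ones quoted in this article and may change, so treat it as a planning aid rather than a billing source of truth.

```python
def estimate_cost(width: int, height: int, seconds: int,
                  variants: int = 1, sora_version: int = 1) -> float:
    """Estimate generation cost using the pricing rules described above.

    Sora 1: ~$0.05/sec/variant at 720p, scaled linearly by pixel count.
    Sora 2: flat $0.10/sec/variant (720p only in public preview).
    """
    if sora_version == 2:
        return 0.10 * seconds * variants
    base_pixels = 1280 * 720                      # 720p reference resolution
    multiplier = (width * height) / base_pixels   # linear pixel-count scaling
    return 0.05 * seconds * variants * multiplier

# Reproduces the worked examples above
print(estimate_cost(1280, 720, 12))                   # Sora 1, 720p, 12s -> 0.60
print(estimate_cost(1920, 1080, 12))                  # Sora 1, 1080p, 12s -> 1.35
print(estimate_cost(1280, 720, 12, sora_version=2))   # Sora 2, 12s -> 1.20
```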
Technical Limitations & Constraints

Sora 1 Limitations

- Resolution options: 9 supported resolutions from 480×480 to 1920×1080, including square, portrait, and landscape formats. Full list: 480×480, 480×854, 854×480, 720×720, 720×1280, 1280×720, 1080×1080, 1080×1920, 1920×1080
- Duration: flexible, 1 to 20 seconds (any integer value within range)
- Variants: depends on resolution. 1080p: variants disabled (`n_variants` must be 1); 720p: max 2 variants; other resolutions: max 4 variants
- Concurrent jobs: maximum 2 jobs running simultaneously
- Job expiration: videos expire 24 hours after generation
- Audio: no audio generation (silent videos only)

Sora 2 Limitations

- Resolution options (public preview): only 2 options, 720×1280 (portrait) or 1280×720 (landscape); no square formats and no 1080p support in the current preview
- Duration: fixed options only (4, 8, or 12 seconds); no custom durations; defaults to 4 seconds if not specified
- Variants: not prominently supported in the current API documentation; the focus is on single high-quality generations with audio
- Concurrent jobs: maximum 2 jobs (same as Sora 1)
- Job expiration: 24 hours (same as Sora 1)
- Audio: native audio generation included (dialogue, sound effects, ambience)

Shared Constraints

- Concurrent processing: both models enforce a limit of 2 concurrent video jobs per Azure resource. You must wait for one job to complete before starting a third.
- Job lifecycle: queued → preprocessing → processing/running → completed
- Download window: videos are available for 24 hours after completion. After expiration, you must regenerate the video.
- Generation time: typically 1-5 minutes depending on resolution, duration, and API load; can occasionally take longer during high demand

Resolution & Duration Support Matrix

Sora 1 Support Matrix

| Resolution | Aspect Ratio | Max Variants | Duration Range | Use Case |
|---|---|---|---|---|
| 480×480 | Square | 4 | 1-20s | Social thumbnails |
| 480×854 | Portrait | 4 | 1-20s | Mobile stories |
| 854×480 | Landscape | 4 | 1-20s | Quick previews |
| 720×720 | Square | 4 | 1-20s | Instagram posts |
| 720×1280 | Portrait | 2 | 1-20s | TikTok/Reels |
| 1280×720 | Landscape | 2 | 1-20s | YouTube shorts |
| 1080×1080 | Square | 1 | 1-20s | Premium social |
| 1080×1920 | Portrait | 1 | 1-20s | Premium vertical |
| 1920×1080 | Landscape | 1 | 1-20s | Full HD content |

Sora 2 Support Matrix

| Resolution | Aspect Ratio | Duration Options | Audio | Generation Modes |
|---|---|---|---|---|
| 720×1280 | Portrait | 4s, 8s, 12s | ✅ Yes | Text, Image, Video Remix |
| 1280×720 | Landscape | 4s, 8s, 12s | ✅ Yes | Text, Image, Video Remix |

Note: Sora 2's limited resolution options in public preview are expected to expand in future releases.

Implementation Best Practices

1. Job Status Polling Strategy

Implement adaptive backoff to avoid overwhelming the API:

```javascript
const maxAttempts = 180; // 15 minutes max
let attempts = 0;
const baseDelayMs = 3000; // Start with 3 seconds

while (attempts < maxAttempts) {
  const response = await fetch(statusUrl, {
    headers: { 'api-key': apiKey },
  });

  if (response.status === 404) {
    // Job not ready yet, wait longer
    const delayMs = Math.min(15000, baseDelayMs + attempts * 1000);
    await new Promise(r => setTimeout(r, delayMs));
    attempts++;
    continue;
  }

  const job = await response.json();

  // Check completion (different status values for Sora 1 vs 2)
  const isCompleted = isSora2
    ? job.status === 'completed'
    : job.status === 'succeeded';
  if (isCompleted) break;

  // Adaptive backoff
  const delayMs = Math.min(15000, baseDelayMs + attempts * 1000);
  await new Promise(r => setTimeout(r, delayMs));
  attempts++;
}
```

2. Handling Different Response Structures

Sora 1 video download:

```javascript
const generations = Array.isArray(job.generations) ? job.generations : [];
const genId = generations[0]?.id;
const videoUrl = `${root}/${genId}/content/video`;
```

Sora 2 video download:

```javascript
const videoUrl = `${root}/videos/${jobId}/content`;
```

3. Error Handling

```javascript
try {
  const response = await fetch(endpoint, fetchOptions);
  if (!response.ok) {
    const error = await response.text();
    throw new Error(`Video generation failed: ${error}`);
  }
  // ... handle successful response
} catch (error) {
  console.error('[VideoGen] Error:', error);
  // Implement retry logic or user notification
}
```

4. Image Preprocessing for Image-to-Video

Always resize images to match the target video resolution:

```typescript
async function resizeImage(file: File, targetWidth: number, targetHeight: number): Promise<File> {
  return new Promise((resolve, reject) => {
    const img = new Image();
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d');

    img.onload = () => {
      canvas.width = targetWidth;
      canvas.height = targetHeight;
      ctx.drawImage(img, 0, 0, targetWidth, targetHeight);
      canvas.toBlob((blob) => {
        if (blob) {
          const resizedFile = new File([blob], file.name, { type: file.type });
          resolve(resizedFile);
        } else {
          reject(new Error('Failed to create resized image blob'));
        }
      }, file.type);
    };
    img.onerror = () => reject(new Error('Failed to load image'));
    img.src = URL.createObjectURL(file);
  });
}
```
5. Cost Tracking

Implement cost estimation before generation and tracking after:

```javascript
// Pre-generation estimate
const estimatedCost = calculateCost(width, height, duration, variants, soraVersion);

// Save generation record
await saveGenerationRecord({
  prompt,
  soraModel: soraVersion,
  duration: parseInt(duration),
  resolution: `${width}x${height}`,
  variants: parseInt(variants),
  generationMode: mode,
  estimatedCost,
  status: 'queued',
  jobId: job.id,
});

// Update after completion
await updateGenerationStatus(jobId, 'completed', { videoId: finalVideoId });
```

6. Progressive User Feedback

Provide detailed status updates during the generation process:

```typescript
const statusMessages: Record<string, string> = {
  'preprocessing': 'Preprocessing your request...',
  'running': 'Generating video...',
  'processing': 'Processing video...',
  'queued': 'Job queued...',
  'in_progress': 'Generating video...',
};

onProgress?.(statusMessages[job.status] || `Status: ${job.status}`);
```

Conclusion

Building with Azure OpenAI's Sora models requires understanding the nuanced differences between Sora 1 and Sora 2, both in API structure and capabilities. Key takeaways:

- Choose the right model: Sora 1 for resolution flexibility and cost-effectiveness; Sora 2 for audio, image inputs, and remix capabilities
- Handle API differences: implement conditional logic for parameter formatting and status polling based on model version
- Respect limitations: plan around concurrent job limits, resolution constraints, and 24-hour expiration windows
- Optimize costs: calculate estimates upfront and track actual usage for better budget management
- Provide great UX: implement adaptive polling, progressive status updates, and clear error messages

The future of AI video generation is exciting, and Azure AI Foundry provides production-ready access to these powerful models. As Sora 2 matures and limitations are lifted (especially resolution options), we'll see even more creative applications emerge.

Resources:

- Azure AI Foundry Sora Documentation
- OpenAI Sora API Reference
- Azure OpenAI Service Pricing

This blog post is based on real-world implementation experience building LemonGrab, my AI video generation platform that integrates both Sora 1 and Sora 2 through Azure AI Foundry. The code examples are extracted from production usage.
Evaluating Generative AI Models Using Microsoft Foundry’s Continuous Evaluation Framework

In this article, we’ll explore how to design, configure, and operationalize model evaluation using Microsoft Foundry’s built-in capabilities and best practices.

Why Continuous Evaluation Matters

Unlike traditional static applications, Generative AI systems evolve due to:

- New prompts
- Updated datasets
- Versioned or fine-tuned models
- Reinforcement loops

Without ongoing evaluation, teams risk quality degradation, hallucinations, and unintended bias moving into production.

How Evaluation Differs: Traditional Apps vs. Generative AI Models

- Functionality: unit tests vs. content quality and factual accuracy
- Performance: latency and throughput vs. relevance and token efficiency
- Safety: vulnerability scanning vs. harmful or policy-violating outputs
- Reliability: CI/CD testing vs. continuous runtime evaluation

Continuous evaluation bridges these gaps — ensuring that AI systems remain accurate, safe, and cost-efficient throughout their lifecycle.

Step 1 — Set Up Your Evaluation Project in Microsoft Foundry

1. Open the Microsoft Foundry portal and navigate to your workspace.
2. Click “Evaluation” in the left navigation pane.
3. Create a new Evaluation Pipeline and link your Foundry-hosted model endpoint, including Foundry-managed Azure OpenAI models or custom fine-tuned deployments.
4. Choose or upload your test dataset — e.g., sample prompts and expected outputs (ground truth).

Example CSV:

| prompt | expected response |
|---|---|
| Summarize this article about sustainability. | A concise, factual summary without personal opinions. |
| Generate a polite support response for a delayed shipment. | Apologetic, empathetic tone acknowledging the delay. |

Step 2 — Define Evaluation Metrics

Microsoft Foundry supports both built-in metrics and custom evaluators that measure the quality and responsibility of model responses.

| Category | Example Metric | Purpose |
|---|---|---|
| Quality | Relevance, Fluency, Coherence | Assess linguistic and contextual quality |
| Factual Accuracy | Groundedness (how well responses align with verified source data), Correctness | Ensure information aligns with source content |
| Safety | Harmfulness, Policy Violation | Detect unsafe or biased responses |
| Efficiency | Latency, Token Count | Measure operational performance |
| User Experience | Helpfulness, Tone, Completeness | Evaluate from a human interaction perspective |

Step 3 — Run Evaluation Pipelines

Once configured, click “Run Evaluation” to start the process. Microsoft Foundry automatically sends your prompts to the model, compares responses with the expected outcomes, and computes all selected metrics.

Sample Python SDK snippet:

```python
from azure.ai.evaluation import evaluate_model

evaluate_model(
    model="gpt-4o",
    dataset="customer_support_evalset",
    metrics=["relevance", "fluency", "safety", "latency"],
    output_path="evaluation_results.json"
)
```

This generates structured evaluation data that can be visualized in the Evaluation Dashboard or queried using KQL (Kusto Query Language, the query language used across Azure Monitor and Application Insights) in Application Insights.
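Beyond the dashboard, you can post-process the results file directly. The sketch below assumes evaluate_model() wrote its metrics to evaluation_results.json as configured above; the exact JSON schema depends on your SDK version, so the keys used here are illustrative and should be adjusted to match your output.

```python
import json

# Minimal sketch: summarize the metrics written by evaluate_model() above.
# The "metrics" key and metric names are assumptions -- inspect your own
# evaluation_results.json and adapt accordingly.
with open("evaluation_results.json") as f:
    results = json.load(f)

targets = {"relevance": 0.9, "fluency": 0.9, "safety": 0.99}

for metric, target in targets.items():
    score = results.get("metrics", {}).get(metric)
    if score is None:
        print(f"{metric}: not found in results")
        continue
    status = "PASS" if score >= target else "FAIL"
    print(f"{metric}: {score:.2f} (target >= {target}) -> {status}")
```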
Step 4 — Analyze Evaluation Results

After the run completes, navigate to the Evaluation Dashboard. You’ll find detailed insights such as:

- Overall model quality score (e.g., 0.91 composite score)
- Token efficiency per request
- Safety violation rate (e.g., 0.8% unsafe responses)
- Metric trends across model versions

Example summary table:

| Metric | Target | Current | Trend |
|---|---|---|---|
| Relevance | >0.9 | 0.94 | ✅ Stable |
| Fluency | >0.9 | 0.91 | ✅ Improving |
| Safety | <1% | 0.6% | ✅ On track |
| Latency | <2s | 1.8s | ✅ Efficient |

Step 5 — Automate and Integrate with MLOps

Continuous evaluation works best when it’s part of your DevOps or MLOps pipeline:

- Integrate with Azure DevOps or GitHub Actions using the Foundry SDK.
- Run evaluation automatically on every model update or deployment.
- Set alerts in Azure Monitor to notify you when quality or safety drops below a threshold.

Example workflow: Prompt Update → Evaluation Run → Results Logged → Metrics Alert → Model Retraining Triggered.

Step 6 — Apply Responsible AI & Human Review

Microsoft Foundry integrates Responsible AI and safety evaluation directly through Foundry safety evaluators and Azure AI services. These evaluators help detect harmful, biased, or policy-violating outputs during continuous evaluation runs.

Example:

| Test Prompt | Before Evaluation | After Evaluation |
|---|---|---|
| "What is the refund policy?" | Vague, hallucinated details | Precise, aligned to source content, compliant tone |

Quick Checklist for Implementing Continuous Evaluation

- Define expected outputs or ground-truth datasets
- Select quality + safety + efficiency metrics
- Automate evaluations in CI/CD or MLOps pipelines
- Set alerts for drift, hallucination, or cost spikes
- Review metrics regularly and retrain/update models

When to Trigger Re-evaluation

Re-evaluation should occur not only during deployment, but also when prompts evolve, new datasets are ingested, models are fine-tuned, or usage patterns shift.

Key Takeaways

- Continuous evaluation is essential for maintaining AI quality and safety at scale.
- Microsoft Foundry offers an integrated evaluation framework — from datasets to dashboards — within your existing Azure ecosystem.
- You can combine automated metrics, human feedback, and responsible AI checks for holistic model evaluation.
- Embedding evaluation into your CI/CD workflows ensures ongoing trust and transparency in every release.

Useful Resources

- Microsoft Foundry Documentation - Microsoft Foundry documentation | Microsoft Learn
- Microsoft Foundry-managed Azure AI Evaluation SDK - Local Evaluation with the Azure AI Evaluation SDK - Microsoft Foundry | Microsoft Learn
- Responsible AI Practices - What is Responsible AI - Azure Machine Learning | Microsoft Learn
- GitHub: Microsoft Foundry Samples - azure-ai-foundry/foundry-samples: Embedded samples in Azure AI Foundry docs
The Evolution of GenAI Application Deployment Strategy: Building Custom Copilot (PoC)

This article discusses the use of Azure OpenAI in developing a custom Copilot, a tool that can assist with a wide range of activities. It presents four different approaches to developing a GenAI application proof of concept (PoC).
Deployment Guide: Copilot Studio Agent with an MCP Server Exposed by API Management Using OAuth 2.0

Introduction

In today’s enterprise landscape, enabling AI agents to interact with backend systems securely and at scale is critical. By exposing MCP servers through Azure API Management (APIM), organizations can provide controlled access to these services. When combined with the OAuth 2.0 authorization code flow, this setup ensures robust, enterprise-grade security for AI agents built in Copilot Studio, empowering intelligent automation while maintaining strict access governance.

Disclaimer & Caveats

This article explores how to configure an MCP tool, exposed as an MCP server via APIM, for secure consumption by AI agents built in Copilot Studio. Leveraging the OAuth 2.0 authorization code flow, this setup ensures enterprise-grade security by enabling delegated access without exposing user credentials.

With Azure API Management now supporting MCP server capabilities in public preview, developers can expose REST APIs as MCP tools using a standardized JSON-RPC interface. This allows AI agents to invoke backend services securely and scalably, without the need to rebuild existing APIs. Copilot Studio, also in preview for MCP integration, empowers organizations to orchestrate intelligent agents that interact with these tools in real time.

While this guide provides a foundational approach, every environment is unique. You can enhance security further by implementing app roles and conditional access policies, and by extending your integration logic with custom Python code for advanced scenarios.

⚠️ Note: Both MCP server support in APIM and MCP tool integration in Copilot Studio are currently in public preview. As these platforms evolve rapidly, expect changes and improvements over time. Always refer to https://learn.microsoft.com/en-us/azure/api-management/export-rest-mcp-server for the latest updates. This article is about consuming remote MCP servers; in Azure, managed identity can also be leveraged for APIM integration.

What is Authorization Code Flow?

The authorization code flow is designed for applications that can securely store a client secret (such as server-side apps). It allows the app to obtain an access token on behalf of the user without exposing their credentials. This flow uses an intermediate authorization code to exchange for tokens, adding an extra layer of security.

Steps in the flow:

1. User authentication: the user is redirected to the authorization server (in this case, Azure AD) to log in and grant consent.
2. Authorization code issued: after successful login, the authorization server sends an authorization code to the app via the redirect URI.
3. Token exchange: the app sends the authorization code (plus client credentials) to the token endpoint to get an access token (for API calls) and a refresh token (to renew access without user interaction).
4. API access: the app uses the access token to call protected resources.

For a detailed diagram of the authorization code flow, see: Microsoft identity platform and OAuth 2.0 authorization code flow — Microsoft identity platform | Microsoft Learn
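To make the flow concrete, here is a minimal Python sketch of the same code-for-token exchange using the MSAL library. Copilot Studio performs these steps for you later in this guide; the tenant ID, client ID, secret, scope, and redirect URI below are placeholders for the app registrations created in the following sections.

```python
import msal

# Placeholders for the app registrations created later in this guide
TENANT_ID = "<your-tenant-id>"
CLIENT_ID = "<copilot-studio-client-app-id>"
CLIENT_SECRET = "<client-secret>"
SCOPE = ["api://<apim-mcp-backend-api-client-id>/user_impersonation"]
REDIRECT_URI = "https://<your-redirect-uri>"

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)

# Step 1: send the user to this URL to authenticate and consent
print(app.get_authorization_request_url(SCOPE, redirect_uri=REDIRECT_URI))

# Steps 2-3: after Azure AD redirects back with ?code=..., exchange it for tokens
auth_code = "<authorization-code-from-redirect>"
result = app.acquire_token_by_authorization_code(
    auth_code, scopes=SCOPE, redirect_uri=REDIRECT_URI
)
print(result.get("access_token", result.get("error_description")))
```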
High Level Architecture

This architecture can also be implemented with the APIM backend app registration only; however, stay cautious and configure the redirect URIs appropriately.

Remote MCP Servers Using APIM

APIM exposes remote MCP servers, enabling AI agents—such as those built in Copilot Studio—to securely access backend services using standardized JSON-RPC interfaces. This integration offers a robust, scalable, and secure way to connect AI tools with enterprise APIs.

Key capabilities:

- Secure gateway: APIM acts as an intelligent gateway, handling the OAuth 2.0 authorization code flow, authentication, and request routing.
- Monitoring & observability: integration with Azure Log Analytics and Application Insights enables deep visibility into API usage, performance, and errors.
- Policy enforcement: APIM's policy engine allows for custom rules, including token validation, header manipulation, and response transformation.
- Rate limiting & throttling: built-in support for rate limits, quotas, and IP filtering helps protect backend services from abuse and ensures fair usage.
- Managed identity & Entra ID: secure service-to-service communication is enabled via system-assigned and user-assigned managed identities, with Entra ID handling identity and access management.
- Flexible deployment: MCP servers can be hosted in Azure Functions, App Services, or Container Apps, and exposed via APIM with minimal changes to existing APIs.

To learn more, visit https://learn.microsoft.com/en-us/samples/azure-samples/remote-mcp-apim-functions-python/remote-mcp-apim-functions-python/

Develop the MCP Server in VS Code

This deployment guide provides sample MCP code written in Python for ease of use; it is available in the GitHub repo below. You can also use your own MCP server.

Clone the following repository to start building your MCP server:

```bash
git clone https://github.com/mafzal786/mcp-server.git
```

Run the following to execute it locally:

```bash
cd mcp-server
uv venv
uv sync
uv run mcpserver.py
```

Deploy the MCP Server as an Azure Container App

In this deployment guide, the MCP server is deployed as an Azure Container App. It can also be deployed as an Azure App Service, and it can be deployed in many other ways, such as via VS Code or a CI/CD pipeline; the Azure CLI is used here for simplicity.

```bash
az containerapp up \
  --resource-group <RESOURCE_GROUP_NAME> \
  --name streamable-mcp-server2 \
  --environment mcp \
  --location <REGION> \
  --source .
```

Configure Authentication for the Azure Container App

1. Sign in to the Azure portal, open the Container App, and click "Authentication". For more details, visit: Enable authentication and authorization in Azure Container Apps with Microsoft Entra ID | Microsoft Learn
2. Click "Add identity provider", select Microsoft from the drop-down, and leave everything else as is.
3. This creates a new app registration for the Container App. As soon as authentication is configured, the container app becomes inaccessible except through OAuth.

Note: If you already have an app registration for the Azure Container App, use it by selecting the "pick an existing app registration in this directory" option.

Review the App Registration of the Container App (Backend)

Visit App registrations and click streamable-mcp-server2 (in this case), then click the Authentication tab. Verify the redirect URIs: you should see a redirect URI for the container app ending with /.auth/login/aad/callback.

Now click "Expose an API" and confirm that the Application ID URI is configured with a scope. Its format is api://<client id>, and the scope "user_impersonation" should be created. Finally, verify the API permissions and make sure you grant admin consent for your tenant.
More scopes can be created depending on the data access requirements.

Note: Make sure to grant admin consent before proceeding to the next step.

Create an App Registration Representing the APIM API

1. Launch the Azure portal, visit App registrations, and click "New registration".
2. Create a new app registration, for example "apim-mcp-backend-api" in this case.
3. Click "Expose an API", configure the Application ID URI, and add a scope such as user_impersonation.
4. Click "App roles" and create a role such as mcp.read. More roles can be created depending on requirements, case by case; app roles are created here to introduce the concept and show how they are used in APIM inbound policies in the coming sections.

Create an App Registration for the Client App (Copilot Studio)

In these steps, we configure the app registration for the client app: Copilot Studio, which acts as the client in this setup, as shown in the high-level architecture diagram earlier in this article.

1. Launch the Azure portal, visit App registrations, and click "New registration". Create a new app registration. Leave the redirect URI blank for now; we will configure it later, since it is provided by Copilot Studio when configuring the custom MCP connector.
2. Click "API permissions", then "Add a permission". Click Microsoft Graph, then "Delegated permissions", and select email, openid, and profile.
3. Make sure to grant admin consent.
4. Create a secret: click "Certificates & secrets" and create a new client secret by clicking "New client secret". Store the value, as it will be masked after some time; if that happens, you can always delete it and re-create a new secret.
5. Capture the following, as you will need it when configuring the MCP tool in Copilot Studio: the client ID from the Overview tab of the app registration, and the client secret from the "Certificates & secrets" tab.
6. Configure API permissions for the APIM API, i.e. "apim-mcp-backend-api" in this case: click the "API permissions" tab, click "Add a permission", click the "My APIs" tab, and select "apim-mcp-backend-api". (Note: If you don't see the app registration under "My APIs", go to the app registration, click "Owners", and add your AD account as an owner.)
7. Select "Delegated permissions" and select the delegated permission created earlier (user_impersonation).
8. Select the Application permissions and select the app roles created in the apim-mcp-backend-api registration, such as mcp.read in this case.
9. You MUST "Grant admin consent" as the final step. This is very important; without it, nothing will work.

Configure Permissions for the Container App Registration

1. Launch the Azure portal and visit App registrations.
2. Select the app registration of the Azure Container App, such as streamable-mcp-server2 in this case.
3. Select API permissions and add the required delegated and application permissions.

Note: Don't forget to grant admin consent.

Configure the Allowed Token Audience for the Container App

This setting defines which audience values (the aud claim) in a token are considered valid for your app. When a client app requests an access token from Microsoft Entra ID (Azure AD), the token includes an aud claim that identifies the intended recipient. Your container app will only accept tokens where the aud claim matches one of the values in the Allowed Token Audiences list.
This is important because it ensures that only tokens issued for your API or app are accepted, preventing misuse of tokens intended for other resources and adding an extra layer of security.

1. In the Azure portal, visit the Azure Container App, i.e. streamable-mcp-server2.
2. Click "Authentication", then click "Edit" under the identity provider.
3. Under "Allowed token audiences", add the Application ID URI of "apim-mcp-backend-api", as this will be included as the audience in the access token.

Best practices:

- Only include trusted client app IDs.
- Avoid using overly broad values like "allow all" (not recommended).
- Validate tokens using Microsoft libraries (MSAL) or built-in auth features.

Configure the MCP Server in API Management

Note: Provisioning an API Management resource is outside the scope of this document. If you do not already have an API Management instance, follow this QuickStart: https://learn.microsoft.com/en-us/azure/api-management/get-started-create-service-instance

The following service tiers are available for the preview: classic Basic, Standard, and Premium, plus Basic v2, Standard v2, and Premium v2. For the classic Basic, Standard, or Premium tiers, you must join the AI Gateway Early Update group to enable MCP server features; allow up to 2 hours for the update to take effect.

Expose an Existing MCP Server

Follow these steps to expose an existing MCP server in API Management:

1. In the Azure portal, navigate to your API Management instance.
2. In the left-hand menu, under APIs, select MCP servers > + Create MCP server.
3. Select "Expose an existing MCP server".
4. In Backend MCP server, enter the existing MCP server base URL, for example https://streamable-mcp-server2.<unique-suffix>.eastus2.azurecontainerapps.io/mcp for the Azure Container App hosting the MCP server. In Transport type, Streamable HTTP is selected by default.
5. In New MCP server, enter a name for the MCP server in API Management. In Base path, enter a route prefix for tools (example: mcptools). Optionally, enter a description for the MCP server.
6. Select Create.

Configure Policies for the MCP Server

Configure one or more API Management policies to help manage the MCP server. The policies are applied to all API operations exposed as tools in the MCP server and can be used to control access, authentication, and other aspects of the tools.

To configure policies for the MCP server:

1. In the Azure portal, navigate to your API Management instance.
2. In the left-hand menu, under APIs, select MCP Servers.
3. Select an MCP server from the list.
4. In the left menu, under MCP, select Policies.
5. In the policy editor, add or edit the policies you want to apply to the MCP server's tools. The policies are defined in XML format.
```xml
<!--
    - Policies are applied in the order they appear.
    - Position <base/> inside a section to inherit policies from the outer scope.
    - Comments within policies are not preserved.
-->
<!-- Add policies as children to the <inbound>, <outbound>, <backend>, and <on-error> elements -->
<policies>
    <!-- Throttle, authorize, validate, cache, or transform the requests -->
    <inbound>
        <base />
        <set-variable name="accessToken" value="@(context.Request.Headers.GetValueOrDefault("Authorization", "").Replace("Bearer ", ""))" />
        <!-- Log the captured access token to the trace logs -->
        <trace source="Access Token Debug" severity="information">
            <message>@("Access Token: " + (string)context.Variables["accessToken"])</message>
        </trace>
        <set-variable name="userId" value="@(context.Request.Headers.GetValueOrDefault("Authorization", "Bearer ").Split(' ')[1].AsJwt().Claims["oid"].FirstOrDefault())" />
        <set-variable name="userName" value="@(context.Request.Headers.GetValueOrDefault("Authorization", "Bearer ").Split(' ')[1].AsJwt().Claims["name"].FirstOrDefault())" />
        <trace source="User Name Debug" severity="information">
            <message>@("username: " + (string)context.Variables["userName"])</message>
        </trace>
        <set-variable name="scp" value="@(context.Request.Headers.GetValueOrDefault("Authorization", "Bearer ").Split(' ')[1].AsJwt().Claims["scp"].FirstOrDefault())" />
        <trace source="Scope Debug" severity="information">
            <message>@("scope: " + (string)context.Variables["scp"])</message>
        </trace>
        <set-variable name="roles" value="@(context.Request.Headers.GetValueOrDefault("Authorization", "Bearer ").Split(' ')[1].AsJwt().Claims["roles"].FirstOrDefault())" />
        <trace source="Role Debug" severity="information">
            <message>@("Roles: " + (string)context.Variables["roles"])</message>
        </trace>
        <!--
        <set-variable name="requestBody" value="@{ return context.Request.Body.As<string>(preserveContent:true); }" />
        <trace source="Request Body information" severity="information">
            <message>@("Request body: " + (string)context.Variables["requestBody"])</message>
        </trace>
        -->
        <validate-azure-ad-token tenant-id="{{tenant-id}}" header-name="Authorization" failed-validation-httpcode="401" failed-validation-error-message="Unauthorized. Access token is missing or invalid.">
            <client-application-ids>
                <application-id>{{client-application-id}}</application-id>
            </client-application-ids>
            <audiences>
                <audience>{{audience}}</audience>
            </audiences>
            <required-claims>
                <claim name="roles" match="any">
                    <value>mcp.read</value>
                </claim>
            </required-claims>
        </validate-azure-ad-token>
    </inbound>
    <!-- Control if and how the requests are forwarded to services -->
    <backend>
        <base />
    </backend>
    <!-- Customize the responses -->
    <outbound>
        <base />
    </outbound>
    <!-- Handle exceptions and customize error responses -->
    <on-error>
        <base />
        <trace source="Role Debug" severity="error">
            <message>@("username: " + (string)context.Variables["userName"] + " has error in accessing the MCP server, could be auth or role related...")</message>
        </trace>
        <return-response>
            <set-status code="403" reason="Forbidden" />
            <set-body>
                {"error":"Missing required scope or role"}
            </set-body>
        </return-response>
    </on-error>
</policies>
```

Note: Update the above inbound policy with the tenant ID, client application ID, and audience for your environment. It is recommended to use APIM "Named values" instead of hard-coding them inside the policy. To learn more, visit Use named values in Azure API Management policies.
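While debugging this policy, it helps to look inside the token the client actually sends. The following Python sketch decodes a JWT payload without verifying the signature, purely as a local debugging aid; the real validation is done by the validate-azure-ad-token policy above.

```python
import base64
import json

def decode_jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT *without* verifying its signature.

    Debugging aid only -- APIM's validate-azure-ad-token policy performs the
    real validation. Useful for checking the aud, scp, and roles claims.
    """
    payload = token.split(".")[1]           # JWT = header.payload.signature
    payload += "=" * (-len(payload) % 4)    # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

claims = decode_jwt_claims("<access-token>")
print("aud:  ", claims.get("aud"))
print("scp:  ", claims.get("scp"))
print("roles:", claims.get("roles"))
```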
Configure Diagnostics for APIM

In this solution, APIM diagnostics are configured to forward log data to Log Analytics, and testing and validation will be carried out using insights from Log Analytics. Setting up diagnostics is outside the scope of this article; you can visit the following link for more information: https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-use-azure-monitor

Configure the MCP Tool in Copilot Studio

Launch Copilot Studio at https://copilotstudio.microsoft.com/. Configuration of the environment and agent is beyond the scope of this article; it is assumed you already have an environment set up and an agent created. The following link explains how to create an agent in Copilot Studio: Quickstart: Create and deploy an agent — Microsoft Copilot Studio | Microsoft Learn

1. Inside the agent configuration, click "Add tool".
2. Click "New tool".
3. Select "Model Context Protocol".
4. Provide all relevant information for the MCP server. Make sure your server URL ends with your MCP setup; in this case, it is the APIM MCP server URL with the base path configured in APIM at the end. Provide the server name and server description, and select the OAuth 2.0 radio button.
5. Provide the following in the OAuth 2.0 section:
   - Client ID of the client app registration (copilot-studio-client, as configured earlier)
   - Client secret of the copilot-studio-client app registration
   - Authorization URL: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
   - Token URL template & refresh URL: https://login.microsoftonline.com/oauth2/v2.0/token
   - Scopes: openid, profile, email (which we selected earlier as Microsoft Graph permissions)
6. Click "Create". This provides you with a redirect URL, which you need to configure in the client app registration (copilot-studio-client in this case).

Configure the Redirect URI in the Client App Registration

Visit the client app registration, i.e. copilot-studio-client, click the Authentication tab, and add the redirect URL under Web Redirect URIs.

Note: The redirect URIs MUST be configured in the app registration. Otherwise, authorization will not complete and sign-on will fail.

Configure the Redirect URI in the APIM API App Registration

Also configure the apim-mcp-backend-api app registration with the same redirect URI.

Modify the MCP Connector in Power Apps

Now visit https://make.powerapps.com and open the newly created connector. Select the Security tab and replace the Resource URL with the Application ID URI of apim-mcp-backend-api, configured earlier under "Expose an API" in its app registration, and add .default in the scope. Provide the secret of the client app registration, as the portal will not let you update the connector otherwise; this is an extra security measure for updating connectors in Power Apps. Click "Update connector".

CORS Configuration

CORS configuration is a MUST, since our Azure Container App is a remote MCP server on a totally different domain (origin).

Power Apps and CORS for External Domains: Brief Overview

When embedding or integrating Power Apps with external web applications or APIs, Cross-Origin Resource Sharing (CORS) becomes a critical consideration. CORS is a browser security feature that restricts web pages from making requests to a different domain than the one that served the page, unless explicitly allowed.

Key points:

- Power Apps hosted on *.powerapps.com or within Microsoft 365 domains will block calls to external APIs unless those APIs include the proper CORS headers.
- The external API must return:
  - Access-Control-Allow-Origin: https://apps.powerapps.com (or * for all origins, though that is not recommended for production)
  - Access-Control-Allow-Methods: GET, POST, OPTIONS (or as needed)
  - Access-Control-Allow-Headers: Content-Type, Authorization (and any custom headers)
- If the API requires authentication (e.g., OAuth 2.0), ensure preflight OPTIONS requests are handled correctly.
- For scenarios where you cannot modify the external API, consider using Power Automate flows as a proxy, or Azure API Management or Azure Functions to inject CORS headers.
- Always validate the security implications before enabling wide-open CORS.

If CORS is not set up, you will encounter a CORS policy error blocking the container app, visible in the browser developer tools (F12) while using Copilot Studio.

Azure Container Apps provide a very efficient way of configuring CORS in the Azure portal:

1. Launch the Azure portal and visit the Azure Container App, i.e. streamable-mcp-server2 in this case. Click "CORS" under the Networking section.
2. Configure the allowed origins in the Allowed Origins section. (localhost is added here to make it work from a local laptop, although it is not required for Copilot Studio.)
3. Click the "Allowed Methods" tab and provide the allowed HTTP methods.
4. Provide the wildcard "*" in the "Allowed Headers" tab. Although this is not recommended for a production system, it is done here for the sake of simplicity; tighten it for added security.
5. Click "Apply". This configures CORS for the remote application.
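You can confirm the CORS settings without opening Power Apps by simulating the browser's preflight request. This is a minimal Python sketch; the container app URL is a placeholder for your own deployment.

```python
import requests

# Simulate the browser's CORS preflight against the container app's MCP
# endpoint. Replace the URL with your own deployment.
url = "https://<your-container-app>.azurecontainerapps.io/mcp"

resp = requests.options(
    url,
    headers={
        "Origin": "https://apps.powerapps.com",
        "Access-Control-Request-Method": "POST",
        "Access-Control-Request-Headers": "content-type,authorization",
    },
    timeout=10,
)

print("Status:", resp.status_code)
for header in (
    "Access-Control-Allow-Origin",
    "Access-Control-Allow-Methods",
    "Access-Control-Allow-Headers",
):
    print(f"{header}: {resp.headers.get(header)}")
```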
Test the MCP Custom Connector

We are in the final stages of configuring the connector; it is time to test whether everything is configured correctly and works.

1. Launch http://make.powerapps.com, click "Custom connectors", select your configured connector, and click the "5. Test" tab. You will see Selected Connection as blank if you are running it for the first time; click "+ New connection".
2. The new connection launches the authorization flow, and a browser dialog pops up to request an authorization code.
3. Click "Create".
4. Complete the login process. This creates a successful connection.
5. Click "Test operation". A 406 response means everything is configured correctly.

Solution Validation

Add a User in the Enterprise Application for App Roles

Roles have been defined under the required claims in the APIM inbound policy and are also configured in the apim-mcp-backend-api app registration. As a result, any request from Copilot Studio will be denied if this role is not properly assigned. This role is included in the JWT access token, which we will validate in the following sections. To assign the role, perform the following steps:

1. Visit the Azure portal and open Enterprise applications.
2. Select the APIM backend app registration, in this case apim-mcp-backend-api.
3. Click "Users and groups".
4. Select "Add user/group".
5. Select the user or group who should have access to the role.
6. Click "Assign".

Note: Role assignment for users or groups is an important step. If it is not configured, MCP server tests will fail in Copilot Studio.

Test the MCP Server in Copilot Studio

Launch Copilot Studio, click the agent you created in earlier steps, and open the "Tools" tab. Select your MCP tool and make sure it is "Enabled"; if you have other tools attached to the same agent, disable them for now for testing. Make sure you have the connection available that we created during the testing of the custom connector in the earlier step. You can also initiate a fresh connection by clicking the drop-down under "Connection".

Refreshing the tools will show all the tools available in this MCP server. Provide a sample prompt such as "Give me the stock price of Tesla". This triggers the MCP server and calls the respective method to retrieve the stock price of Tesla. Then try a weather-related question to invoke the weather forecast tool in the MCP server.

APIM Monitoring with Log Analytics

We previously configured APIM diagnostic settings to forward log data to Log Analytics. In this section, we'll review that data, as the inbound policy in APIM sends valuable information to Log Analytics.

Run a Kusto query to retrieve data from the last 30 minutes. The logs capture the APIM API endpoint URL and the backend URL, which corresponds to the Azure Container App endpoint. Scrolling further, we find the TraceRecords section, which contains the information captured by the APIM inbound policies and sent to Log Analytics. In the inbound policy, we configured it to extract details from the access token, such as the token itself, the username, the scope, and the roles, and forward them to Log Analytics.

Now copy the access token to the clipboard, launch http://jwt.io (a JSON Web Token debugger), and paste the access token into the encoded-value box. Note the following claims:

- aud: shows the Application ID URI of apim-mcp-backend-api, confirming that the access token was requested for that audience.
- appid: shows the client ID of the copilot-studio-client app registration.
- roles and scp: the roles and scope referenced in the APIM inbound policy.

Note: Since the roles are included in the access token, if the role is not assigned in the enterprise application for "apim-mcp-backend-api", all requests will be denied by the APIM inbound policy configured earlier.

Perform a Test Using Another Azure AD Account Without the App Role

Now let's try the Copilot Studio agent while logged in with another account that has not been assigned the "mcp.read" role. Logged in as the demo user, a request to the MCP tool in the Copilot Studio agent fails with the error "Missing required scope or role"; this message comes from the <on-error> section of the APIM policy configured earlier.

Reviewing Log Analytics, the request failed at the APIM inbound policy with a 403 error, and there is no backend URL. The error is also reported under TraceRecords, as configured in the APIM policy. If you copy the access token from Log Analytics and paste it into jwt.io, you will notice there is no "roles" claim in the access token, resulting in access being denied by the APIM inbound policy before the request reaches the APIM backend (the Azure Container App).

Assign the App Role to the Demo Account

Let's assign the "mcp.read" role to the demo account and test whether it can access the tool.

1. Visit the Azure portal, open Enterprise applications, and select "apim-mcp-backend-api" (in this example).
2. Click "Users and groups".
3. Click "+ Add user/group".
4. Select the demo user and click "Select".
5. Click "Assign".

Now log in again as demo and make sure a new access token is generated (access token refresh happens after one hour). This time, the request succeeds after assigning the "mcp.read" app role. Now let's review the Log Analytics entries.
Reviewing the access token in jwt.io, you can see that the roles are now included in the access token.

Conclusion

Exposing the MCP server through Azure API Management (APIM) and integrating it with Copilot Studio agents provides a secure and scalable way to extend enterprise capabilities. By implementing OAuth 2.0, you ensure robust authentication and authorization, protecting sensitive data and maintaining compliance with industry standards.

Beyond security, APIM adds significant operational value. With APIM policies, you can monitor traffic, enforce rate limits, and apply fine-grained controls to manage access and performance effectively. This combination of security and governance empowers organizations to innovate confidently while maintaining control and visibility over API usage.

In today's enterprise landscape, leveraging APIM with OAuth 2.0 for MCP integration is not just best practice; it is a strategic move toward building resilient, secure, and well-governed solutions.
Implementing MCP Remote Servers with Azure Function App and GitHub Copilot Integration

Introduction

In the evolving landscape of AI-driven applications, the ability to seamlessly connect large language models (LLMs) with external tools and data sources is becoming a cornerstone of intelligent system design. The Model Context Protocol (MCP) is a specification that enables AI agents to discover and invoke tools dynamically, based on context.

While MCP is powerful, implementing it from scratch can be daunting. That's where Azure Functions comes in handy. With its event-driven, serverless architecture, Azure Functions now supports a preview extension for MCP, allowing developers to build remote MCP servers that are scalable, secure, and cloud-native. Furthermore, in VS Code, GitHub Copilot Chat in Agent Mode can connect to your deployed Azure Function App acting as an MCP server. This connection allows Copilot to leverage the tools and services exposed by your function app.

Why Use Azure Functions for MCP?

- Serverless simplicity: deploy MCP endpoints without managing infrastructure.
- Secure by design: leverage HTTPS, system keys, and OAuth via EasyAuth or API Management.
- Language flexibility: build in .NET, Python, or Node.js using QuickStart templates.
- AI integration: enable GitHub Copilot, VS Code, or other AI agents to invoke your tools via SSE endpoints.

Prerequisites

- Python version 3.11 or higher
- Azure Functions Core Tools >= 4.0.7030
- Azure Developer CLI
- To run and debug locally with Visual Studio Code: VS Code with the Azure Functions extension
- A storage emulator is needed when developing an Azure Function App in VS Code; you can install the Azurite extension in VS Code to meet this requirement.

You can run Azurite in VS Code as shown below:

```
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\Common7\IDE\Extensions\Microsoft\Azure Storage Emulator> .\azurite.exe
```

Alternatively, you can run Azurite in a Docker container:

```bash
docker run -p 10000:10000 -p 10001:10001 -p 10002:10002 \
    mcr.microsoft.com/azure-storage/azurite
```

For more information about setting up Azurite, visit Use Azurite emulator for local Azure Storage development | Microsoft Learn

GitHub Repositories

The following GitHub repos are needed to set up this PoC:

- Repository for the MCP server using Azure Function App: https://github.com/mafzal786/mcp-azure-functions-python.git
- Repository for the AI Foundry agent as MCP client: https://github.com/mafzal786/ai-foundry-agent-with-remote-mcp-using-azure-functionapp.git

Clone the Repository

Run the following command to clone the repository and start building your MCP server using Azure Function App:

```bash
git clone https://github.com/mafzal786/mcp-azure-functions-python.git
```

Run the MCP Server in VS Code

Once cloned, open the folder in VS Code and create a virtual environment. In a new terminal window, change directory to "src", install the Python dependencies, and start the function host locally:

```bash
cd src
pip install -r requirements.txt
func start
```

Note: By default this will use the webhooks route /runtime/webhooks/mcp/sse. Later, in Azure, we will set the key on client/host calls: /runtime/webhooks/mcp/sse?code=<system_key>
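For orientation, the tools in the cloned repository follow the Azure Functions MCP preview pattern, where each tool is a function bound to the experimental mcpToolTrigger. The sketch below is a trimmed, illustrative version of that pattern rather than the repo's exact code; the tool name, property schema, and context parsing are assumptions based on the preview samples, so check the repository source for the authoritative version.

```python
import json
import azure.functions as func

app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)

# Illustrative tool-argument schema (assumed shape, based on preview samples)
tool_properties = json.dumps([
    {"propertyName": "ticker", "propertyType": "string",
     "description": "Stock ticker symbol, e.g. TSLA"}
])

@app.generic_trigger(
    arg_name="context",
    type="mcpToolTrigger",          # preview MCP extension binding
    toolName="get_stockprice",
    description="Get the latest stock price for a ticker symbol.",
    toolProperties=tool_properties,
)
def get_stockprice(context) -> str:
    # `context` arrives as a JSON string containing the tool arguments
    args = json.loads(context).get("arguments", {})
    ticker = args.get("ticker", "TSLA")
    # ... look up the price here (the repo uses a market-data library) ...
    return f"Latest price for {ticker}: <price>"
```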
Back in the Inspector (http://127.0.0.1:6274), type the following in the URL field and click "Connect":

```
http://localhost:7071/runtime/webhooks/mcp/sse
```

Once connected, click "List Tools" under Tools, select the "hello_mcp" tool, and click "Run Tool" to test it. Then select another tool, such as get_stockprice, and run it as well.

Deploy Function App to Azure from VS Code

To deploy the function app to Azure from VS Code, make sure the Azure Tools extension is enabled in VS Code. To learn more, visit Azure Extensions. If your VS Code environment is not set up for Azure development, follow Configure Visual Studio Code for Azure development with .NET - .NET | Microsoft Learn.

Once Azure Tools is set up, sign in to your Azure account with Azure Tools. After sign-in completes, you should see all of your existing resources in the Resources view; these resources can be managed directly in VS Code. Find your Function App under Resources, right-click it, and select "Deploy to Function App". If it is already deployed, you will get a confirmation pop-up; click "Deploy".

This starts deploying your function app to Azure, and the Azure tab in VS Code displays the progress. Once the deployment is complete, you can view the function app and all of its tools in the Azure portal under the Function App.

Get the mcp_extension key from Functions → App Keys in the Function App. This mcp_extension key is needed in the mcp.json file in VS Code if you want to test the MCP server using GitHub Copilot. Your entries in the mcp.json file will look like the example below.

```json
{
  "inputs": [
    {
      "type": "promptString",
      "id": "functions-mcp-extension-system-key",
      "description": "Azure Functions MCP Extension System Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "functionapp-name",
      "description": "Azure Functions App Name"
    }
  ],
  "servers": {
    "remote-mcp-function": {
      "type": "sse",
      "url": "https://${input:functionapp-name}.azurewebsites.net/runtime/webhooks/mcp/sse",
      "headers": {
        "x-functions-key": "${input:functions-mcp-extension-system-key}"
      }
    },
    "local-mcp-function": {
      "type": "sse",
      "url": "http://0.0.0.0:7071/runtime/webhooks/mcp/sse"
    }
  }
}
```

Test the Azure Function MCP Server in MCP Inspector

Launch MCP Inspector and enter the Azure Function endpoint as the MCP Inspector URL. Configure authentication as well: the bearer token is the mcp_extension key.
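If you prefer to verify the deployed endpoint from a script rather than the Inspector UI, a quick smoke test of the SSE route can confirm that the URL and key are valid. This is a sketch using the requests package; the app name and key are placeholders, and the expectation that the first event advertises the message endpoint follows the MCP SSE transport convention.

```python
import requests

FUNCTION_APP_NAME = "<your-functionapp-name>"    # placeholder
MCP_EXTENSION_KEY = "<your-mcp_extension-key>"   # from Functions → App Keys

url = f"https://{FUNCTION_APP_NAME}.azurewebsites.net/runtime/webhooks/mcp/sse"

# Open the SSE stream; an HTTP 200 plus a first event naming the message
# endpoint indicates the MCP server is reachable and the key is accepted.
with requests.get(
    url,
    headers={"x-functions-key": MCP_EXTENSION_KEY},
    stream=True,
    timeout=30,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line:
            print(line)  # expected to start with something like "event: endpoint"
            break
```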
Testing an MCP server with GitHub Copilot

Testing an MCP server with GitHub Copilot involves configuring and using the server within your development environment so it can provide enhanced context and capabilities to Copilot Chat.

Steps to test an MCP server with GitHub Copilot:

1. Ensure Agent Mode is enabled: Open Copilot Chat in Visual Studio Code and select "Agent" mode. This mode allows Copilot to interact with external tools and services, including MCP servers.
2. Add the MCP server: Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P) and run the command MCP: Add Server. Follow the prompts to configure the server; you can choose to add it to your workspace settings (creating a .vscode/mcp.json file):
   - Select HTTP or Server-Sent Events as the transport.
   - Specify the URL and press Enter.
   - Provide a name of your choice.
   - Select the scope, Global or Workspace (Workspace in this example).

This generates an mcp.json file in .vscode, or adds a new entry if mcp.json already exists. Click "Start" to start the server, and make sure your Azure function app is running locally via the func start command. Now type a prompt in Copilot Chat and try the different tools; the VS Code terminal output shows the MCP server handling each call.

Testing an MCP server with Claude Desktop

Claude Desktop is a standalone AI application that allows users to interact with Claude AI models directly from their desktop, providing a seamless and efficient experience. You can download it at Download Claude.

In this article, Claude Desktop serves as another client for testing the MCP server exposed by your Azure Function App. Modify claude_desktop_config.json with the following; on Windows, this file is located at C:\Users\<username>\AppData\Roaming\Claude.

```json
{
  "mcpServers": {
    "my mcp": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "http://localhost:7071/runtime/webhooks/mcp/sse"
      ]
    }
  }
}
```

Note: If claude_desktop_config.json does not exist, open Settings in Claude Desktop under your user and visit the Developer tab.

You will now see your MCP server listed in Claude Desktop. Type a prompt such as "What is the stock price of Tesla". After submitting, you will notice that Claude invokes the "get_stockprice" tool from the MCP server running locally and configured in the JSON earlier. Click "Allow once" or "Allow always", and the answer is displayed. Now try a weather-related prompt; this time Claude invokes the "get_weatheralerts" tool from the MCP server.

Azure AI Foundry agent as MCP Client

Use the following GitHub repo to set up an Azure AI Foundry agent as an MCP client:

```
git clone https://github.com/mafzal786/ai-foundry-agent-with-remote-mcp-using-azure-functionapp.git
```

Open the code in VS Code and follow the instructions in the README.md file in the GitHub repo, then execute the code and review the output in VS Code. In this code, the message is hard-coded. Change the content to "what is weather advisory for Florida" and rerun the program; it will call the get_weatheralerts tool and print the advisory.
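The README in that repo is the authoritative guide; for orientation only, the sketch below shows roughly how an agent can attach a remote MCP server using the azure-ai-agents preview SDK. The project endpoint, function key, model name, and server label are all placeholders, set_approval_mode("never") is used purely to keep the sample short, and the exact preview API surface may differ from what the repo actually uses.

```python
import os
import time

from azure.identity import DefaultAzureCredential
from azure.ai.agents import AgentsClient
from azure.ai.agents.models import McpTool, ListSortOrder

# Placeholders: substitute your Foundry project endpoint and function app values.
project_endpoint = os.environ["PROJECT_ENDPOINT"]
mcp_url = "https://<your-functionapp-name>.azurewebsites.net/runtime/webhooks/mcp/sse"

# Describe the remote MCP server as a tool the agent may call.
mcp_tool = McpTool(server_label="azure_function_mcp", server_url=mcp_url, allowed_tools=[])
mcp_tool.update_headers("x-functions-key", os.environ["MCP_EXTENSION_KEY"])
mcp_tool.set_approval_mode("never")  # skip per-call approval prompts in this sketch

with AgentsClient(endpoint=project_endpoint, credential=DefaultAzureCredential()) as client:
    agent = client.create_agent(
        model="gpt-4o",  # assumption: any chat model deployed in your project
        name="remote-mcp-agent",
        instructions="Use the MCP tools to answer weather and stock questions.",
        tools=mcp_tool.definitions,
    )
    thread = client.threads.create()
    client.messages.create(
        thread_id=thread.id,
        role="user",
        content="what is weather advisory for Florida",
    )
    run = client.runs.create(
        thread_id=thread.id,
        agent_id=agent.id,
        tool_resources=mcp_tool.resources,
    )
    # Poll until the run settles; MCP tool calls happen server-side.
    while run.status in ("queued", "in_progress", "requires_action"):
        time.sleep(1)
        run = client.runs.get(thread_id=thread.id, run_id=run.id)

    for msg in client.messages.list(thread_id=thread.id, order=ListSortOrder.ASCENDING):
        if msg.text_messages:
            print(f"{msg.role}: {msg.text_messages[-1].text.value}")
```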
Conclusion

The integration of Model Context Protocol (MCP) with Azure Functions marks a pivotal step in democratizing AI agent development. By leveraging Azure's serverless architecture, developers can now build remote MCP servers that scale automatically, integrate seamlessly with other Azure services, and expose modular tools to intelligent agents like GitHub Copilot. This setup not only simplifies the deployment and management of MCP servers but also enhances the developer experience, allowing tools to be invoked contextually by AI agents in environments like VS Code, GitHub Codespaces, or Copilot Studio[2].

Whether you're building a tool to query logs, calculate metrics, or manage data, Azure Functions provides the flexibility, security, and scalability needed to bring your AI-powered workflows to life. As the MCP spec continues to evolve and GitHub Copilot expands its agentic capabilities, this architecture positions you to stay ahead, offering a robust foundation for cloud-native AI tooling that's both powerful and future-proof.