genaiops
15 Topics

Implementing MCP Remote Servers with Azure Function App and GitHub Copilot Integration
Introduction

In the evolving landscape of AI-driven applications, the ability to seamlessly connect large language models (LLMs) with external tools and data sources is becoming a cornerstone of intelligent system design. Model Context Protocol (MCP) is a specification that enables AI agents to discover and invoke tools dynamically, based on context. While MCP is powerful, implementing it from scratch can be daunting. That is where Azure Functions comes in handy. With its event-driven, serverless architecture, Azure Functions now supports a preview extension for MCP, allowing developers to build remote MCP servers that are scalable, secure, and cloud-native. Furthermore, in VS Code, GitHub Copilot Chat in Agent Mode can connect to your deployed Azure Function App acting as an MCP server. This connection allows Copilot to leverage the tools and services exposed by your function app.

Why Use Azure Functions for MCP?

Serverless Simplicity: Deploy MCP endpoints without managing infrastructure.
Secure by Design: Leverage HTTPS, system keys, and OAuth via EasyAuth or API Management.
Language Flexibility: Build in .NET, Python, or Node.js using QuickStart templates.
AI Integration: Enable GitHub Copilot, VS Code, or other AI agents to invoke your tools via SSE endpoints.

Prerequisites

Python version 3.11 or higher
Azure Functions Core Tools >= 4.0.7030
Azure Developer CLI
To run and debug locally: Visual Studio Code with the Azure Functions extension

A storage emulator is needed when developing an Azure Function App in VS Code; installing the Azurite extension in VS Code meets this requirement. You can run Azurite from VS Code as shown below.

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\Common7\IDE\Extensions\Microsoft\Azure Storage Emulator> .\azurite.exe

Alternatively, you can run Azurite in a Docker container:

docker run -p 10000:10000 -p 10001:10001 -p 10002:10002 \
  mcr.microsoft.com/azure-storage/azurite

For more information about setting up Azurite, visit Use Azurite emulator for local Azure Storage development | Microsoft Learn.

GitHub Repositories

The following GitHub repos are needed to set up this PoC.

Repository for the MCP server using Azure Function App: https://github.com/mafzal786/mcp-azure-functions-python.git
Repository for the AI Foundry agent as MCP client: https://github.com/mafzal786/ai-foundry-agent-with-remote-mcp-using-azure-functionapp.git

Clone the repository

Run the following command to clone the repository and start building your MCP server with Azure Function App.

git clone https://github.com/mafzal786/mcp-azure-functions-python.git

Run the MCP server in VS Code

Once cloned, open the folder in VS Code and create a virtual environment. In a new terminal window, change directory to "src", install the Python dependencies, and start the function host locally:

cd src
pip install -r requirements.txt
func start

Note: by default this uses the webhooks route /runtime/webhooks/mcp/sse. Later, in Azure, we will pass the system key on client/host calls: /runtime/webhooks/mcp/sse?code=<system_key>

MCP Inspector

In a new terminal window, install and run MCP Inspector:

npx @modelcontextprotocol/inspector

Click the printed link to load the MCP Inspector, and provide the generated proxy session token.
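Before connecting the inspector, it helps to see how a tool is declared inside the function app. The sketch below follows the Azure Functions MCP tool trigger (preview) pattern used in the Python samples; the exact decorator arguments and the real get_stockprice implementation in the cloned repo may differ, and the placeholder lookup shown here is only an assumption.

import json
import azure.functions as func

app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)

# A no-argument tool: lets clients verify connectivity to the MCP server.
@app.generic_trigger(
    arg_name="context",
    type="mcpToolTrigger",
    toolName="hello_mcp",
    description="Simple hello-world MCP tool.",
    toolProperties="[]",
)
def hello_mcp(context) -> str:
    return "Hello from the MCP server running on Azure Functions!"

# A tool with one string argument. Tool inputs are declared as a JSON string of
# property definitions; at invocation time the arguments arrive serialized in `context`.
@app.generic_trigger(
    arg_name="context",
    type="mcpToolTrigger",
    toolName="get_stockprice",
    description="Return the latest stock price for a ticker symbol.",
    toolProperties=json.dumps([
        {"propertyName": "symbol", "propertyType": "string",
         "description": "Ticker symbol, e.g. TSLA."}
    ]),
)
def get_stockprice(context) -> str:
    args = json.loads(context)["arguments"]
    symbol = args.get("symbol", "MSFT")
    # Placeholder: the repo's real tool presumably calls a market-data API here.
    return f"Stock price lookup for {symbol} goes here."

Every function decorated this way becomes a tool that any MCP client can discover once it connects to the /runtime/webhooks/mcp/sse endpoint.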
Open the MCP Inspector at http://127.0.0.1:6274/#resources. In the URL field, enter http://localhost:7071/runtime/webhooks/mcp/sse and click "Connect".

Once connected, click List Tools under Tools, select the "hello_mcp" tool, and click "Run Tool" to test it. Then select another tool, such as get_stockprice, and run it as well.

Deploy Function App to Azure from VS Code

To deploy the function app to Azure from VS Code, make sure the Azure Tools extension is enabled. To learn more about the extension, visit Azure Extensions. If your VS Code environment is not set up for Azure development, follow Configure Visual Studio Code for Azure development with .NET — .NET | Microsoft Learn.

Once Azure Tools is set up, sign in to your Azure account with Azure Tools. After sign-in completes, you should see all of your existing resources in the Resources view; these resources can be managed directly in VS Code. Locate your Function App under Resources, right-click it, and choose "Deploy to Function App". If you already have it deployed, a pop-up appears; click "Deploy". This starts deploying your function app to Azure, and the Azure tab in VS Code shows the progress. Once the deployment is complete, you can view the function app and all of its tools in the Azure portal.

Get the mcp_extension key from Functions → App Keys in the Function App. This mcp_extension key is needed in the mcp.json file in VS Code if you would like to test the MCP server using GitHub Copilot. Your entries in the mcp.json file will look like the following example:

{
  "inputs": [
    {
      "type": "promptString",
      "id": "functions-mcp-extension-system-key",
      "description": "Azure Functions MCP Extension System Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "functionapp-name",
      "description": "Azure Functions App Name"
    }
  ],
  "servers": {
    "remote-mcp-function": {
      "type": "sse",
      "url": "https://${input:functionapp-name}.azurewebsites.net/runtime/webhooks/mcp/sse",
      "headers": {
        "x-functions-key": "${input:functions-mcp-extension-system-key}"
      }
    },
    "local-mcp-function": {
      "type": "sse",
      "url": "http://0.0.0.0:7071/runtime/webhooks/mcp/sse"
    }
  }
}

Test the Azure Function MCP server in MCP Inspector

Launch MCP Inspector and provide the Azure Function endpoint in the MCP Inspector URL field. Provide authentication as well: the bearer token is the mcp_extension key.
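If you prefer to verify the deployed endpoint from a script rather than MCP Inspector, a small client built on the MCP Python SDK can list and invoke the tools. This is a sketch, not part of the repo: it assumes `pip install mcp`, uses placeholder values for the function app name and system key, and the SDK's client API may shift between package versions.

# Sketch: connect to the deployed MCP server and list its tools using the MCP Python SDK.
import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

FUNCTION_APP = "<functionapp-name>"          # placeholder
SYSTEM_KEY = "<mcp_extension-system-key>"    # from Functions -> App Keys
URL = f"https://{FUNCTION_APP}.azurewebsites.net/runtime/webhooks/mcp/sse"
# For the locally running host, use http://localhost:7071/runtime/webhooks/mcp/sse with no key.

async def main() -> None:
    # Pass the system key in the x-functions-key header, exactly as mcp.json does.
    async with sse_client(URL, headers={"x-functions-key": SYSTEM_KEY}) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Tools:", [tool.name for tool in tools.tools])
            result = await session.call_tool("hello_mcp", {})
            print("hello_mcp ->", result.content)

asyncio.run(main())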
Testing an MCP server with GitHub Copilot

Testing an MCP server with GitHub Copilot involves configuring and using the server within your development environment so that it provides additional context and capabilities to Copilot Chat.

Steps to test an MCP server with GitHub Copilot:

Ensure Agent Mode is enabled: Open Copilot Chat in Visual Studio Code and select "Agent" mode. This mode allows Copilot to interact with external tools and services, including MCP servers.
Add the MCP server: Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P) and run the command MCP: Add Server. Follow the prompts to configure the server; you can choose to add it to your workspace settings (creating a .vscode/mcp.json file).
Select HTTP or Server-Sent Events as the server type.
Specify the URL and press Enter.
Provide a name of your choice.
Select the scope, Global or Workspace. I selected Workspace.

This generates an mcp.json file in .vscode, or adds a new entry if mcp.json already exists. Click Start to start the server, and make sure your Azure Function App is running locally with the func start command. Now type a prompt in Copilot Chat and try the different tools; the VS Code terminal output shows the tool invocations.

Testing an MCP server with Claude Desktop

Claude Desktop is a standalone AI application that allows users to interact with Claude AI models directly from their desktop, providing a seamless and efficient experience. You can download it at Download Claude. In this article, Claude Desktop serves as another client for testing the MCP server running in the Azure Function App. Modify claude_desktop_config.json with the following; on Windows you can find this file at C:\Users\<username>\AppData\Roaming\Claude.

{
  "mcpServers": {
    "my mcp": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "http://localhost:7071/runtime/webhooks/mcp/sse"
      ]
    }
  }
}

Note: if claude_desktop_config.json does not exist, click Settings in Claude Desktop under your user profile and open the Developer tab.

You will see your MCP server listed in Claude Desktop. Type a prompt such as "What is the stock price of Tesla". After submitting, you will notice that it invokes the "get_stockprice" tool from the MCP server running locally and configured in the .json file earlier. Click Allow once or Allow always, and the tool's output is displayed. Now try a weather-related prompt; this time Claude invokes the "get_weatheralerts" tool from the MCP server.

Azure AI Foundry agent as MCP client

Use the following GitHub repo to set up an Azure AI Foundry agent as an MCP client:

git clone https://github.com/mafzal786/ai-foundry-agent-with-remote-mcp-using-azure-functionapp.git

Open the code in VS Code and follow the instructions in the README.md file in the GitHub repo. When you execute the code, the agent's output appears in VS Code. In this code, the message is hard-coded. Change the content to "what is weather advisory for Florida" and rerun the program; it will call the get_weatheralerts tool and produce the corresponding output.

Conclusion

The integration of Model Context Protocol (MCP) with Azure Functions marks a pivotal step in democratizing AI agent development. By leveraging Azure's serverless architecture, developers can now build remote MCP servers that scale automatically, integrate seamlessly with other Azure services, and expose modular tools to intelligent agents like GitHub Copilot. This setup not only simplifies the deployment and management of MCP servers but also enhances the developer experience, allowing tools to be invoked contextually by AI agents in environments like VS Code, GitHub Codespaces, or Copilot Studio.
Whether you're building a tool to query logs, calculate metrics, or manage data, Azure Functions provides the flexibility, security, and scalability needed to bring your AI-powered workflows to life. As the MCP spec continues to evolve and GitHub Copilot expands its agentic capabilities, this architecture positions you to stay ahead, offering a robust foundation for cloud-native AI tooling that is both powerful and future-proof.

The Future of AI: The paradigm shifts in Generative AI Operations
Dive into the transformative world of Generative AI Operations (GenAIOps) with Microsoft Azure. Discover how businesses are overcoming the challenges of deploying and scaling generative AI applications. Learn about the innovative tools and services Azure AI offers, and how they empower developers to create high-quality, scalable AI solutions. Explore the paradigm shift from MLOps to GenAIOps and see how continuous improvement practices ensure your AI applications remain cutting-edge. Join us on this journey to harness the full potential of generative AI and drive operational excellence.

The Future of AI: Maximize your fine-tuned model performance with the new Azure AI Evaluation SDK
In this article, we will explore how to effectively evaluate fine-tuned AI models using the new Azure AI Evaluation SDK. This comprehensive guide is the fourth part of our series on making large language model distillation easier. We delve into the importance of model evaluation, outline a systematic process for assessing the performance of a distilled student model against a baseline model, and demonstrate the use of advanced metrics provided by Azure's SDK. Join us as we navigate the intricacies of AI evaluation and provide insights for continuous model improvement and operational efficiency.

The Future Of AI: Deconstructing Contoso Chat - Learning GenAIOps in practice
How can AI engineers build applied knowledge for GenAIOps practices? By deconstructing working samples! In this multi-part series, we deconstruct Contoso Chat (a RAG-based retail copilot sample) and use it to learn the tools and workflows that streamline our end-to-end developer journey using Azure AI Foundry.

The Future of AI: Reduce AI Provisioning Effort - Jumpstart your solutions with AI App Templates
In the previous post, we introduced Contoso Chat – an open-source RAG-based retail chat sample for Azure AI Foundry, that serves as both an AI App template (for builders) and the basis for a hands-on workshop (for learners). And we briefly talked about five stages in the developer workflow (provision, setup, ideate, evaluate, deploy) that take them from the initial prompt to a deployed product. But how can that sample help you build your app? The answer lies in developer tools and AI App templates that jumpstart productivity by giving you a fast start and a solid foundation to build on. In this post, we answer that question with a closer look at Azure AI App templates - what they are, and how we can jumpstart our productivity with a reuse-and-extend approach that builds on open-source samples for core application architectures.

The Future of AI: Developing Lacuna - an agent for Revealing Quiet Assumptions in Product Design
A conversational agent named Lacuna is helping product teams uncover hidden assumptions embedded in design decisions. Built with Copilot Studio and powered by Azure AI Foundry, Lacuna analyzes product documents to identify speculative beliefs and assess their risk using design analysis lenses: impact, confidence, and reversibility. By surfacing cognitive biases and prompting reflection, Lacuna encourages teams to validate assumptions through lightweight evidence-gathering methods. This experiment in human-AI collaboration explores how agents can foster epistemic humility and transform static documents into dynamic conversations.

The Future of AI: Creating a Web Application with Vibe Coding
Discover how vibe coding with GPT-5 in Azure AI Foundry transforms web development. This post walks through building a Translator API-powered web app using natural language instructions in Visual Studio Code. Learn how adaptive translation, tone and gender customization, and Copilot agent collaboration redefine the developer experience.

Automate Quota Discovery in Azure AI Foundry: A Tale of 3 APIs
Automate the discovery of Azure regions that meet your AI deployment needs using three essential APIs: the Models API, the Usages API, and the Locations API. This process helps reduce decision fatigue and ensures compliance with enterprise-wide model deployment standards. Key learnings:

Model Deployment Requirements: Understand the needs of a standard Retrieval-Augmented Generation (RAG) application, which involves deploying multiple models.
Automation Benefits: Streamline your deployment process and ensure compliance with enterprise standards.
Three Essential APIs:
Models API: Query available models for a specific subscription within a chosen location.
Usages API: Assess current usages and limits to infer available quotas.
Locations API: Obtain a list of all available regions.

A comprehensive Jupyter notebook with the implementation steps is available in the accompanying GitHub repository. This resource is invaluable for AI developers looking to streamline their deployment processes and ensure their applications meet all necessary requirements.
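To make the three calls concrete, the sketch below strings them together with the Azure Identity library and plain REST requests. It is not the notebook from the accompanying repository: the api-version values and response fields are assumptions you may need to adjust, and paging via nextLink is omitted for brevity.

# Sketch of the three ARM calls described above, using DefaultAzureCredential and REST.
# Assumptions: `pip install azure-identity requests`; api-version values are examples.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder
ARM = "https://management.azure.com"

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
HEADERS = {"Authorization": f"Bearer {token}"}

def arm_get(path: str, api_version: str) -> list:
    resp = requests.get(f"{ARM}{path}", headers=HEADERS, params={"api-version": api_version})
    resp.raise_for_status()
    return resp.json().get("value", [])

# Locations API: every region available to the subscription.
locations = arm_get(f"/subscriptions/{SUBSCRIPTION_ID}/locations", "2022-12-01")

for region in [loc["name"] for loc in locations][:5]:  # limit for brevity
    base = (f"/subscriptions/{SUBSCRIPTION_ID}/providers/"
            f"Microsoft.CognitiveServices/locations/{region}")
    # Models API: models offered in this region for the subscription.
    models = arm_get(f"{base}/models", "2023-05-01")
    # Usages API: current usage vs. limit, from which remaining quota can be inferred.
    usages = arm_get(f"{base}/usages", "2023-05-01")
    print(f"{region}: {len(models)} models, {len(usages)} usage entries")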
New evaluation tools for multimodal apps, benchmarking, CI/CD integration and more

If not designed carefully, GenAI applications can produce outputs that have errors, lack grounding in verifiable data, or are simply irrelevant or incoherent, resulting in poor customer experiences and attrition. Even worse, an application's outputs could perpetuate bias, promote misinformation, or expose organizations to malicious attacks. By conducting proactive risk evaluations throughout the GenAIOps lifecycle, organizations can better understand and mitigate risks to achieve more secure, safe, and trustworthy customer experiences.

Whether you're evaluating and comparing models at the start of an AI project or running a final evaluation of your application to demonstrate production-readiness, every evaluation has these key components: the evaluation target, whether a base model or an application in development or in production, which is the thing you are trying to assess; the evaluation data, comprised of inputs and generated outputs that form the basis of evaluation; and evaluators, or metrics, that help measure and compare performance in a consistent, interpretable way.

Today, we're excited to announce enhancements across these key components, making evaluations in Azure AI Foundry even more comprehensive and accessible for a broad set of generative AI use cases. Here's a quick summary before we dive into details:

Simplify model selection with enhanced benchmarks and model evaluations
We've enhanced the model benchmarking experience in Azure AI Foundry, adding new performance metrics (e.g., latency, estimated cost, and throughput) and generation quality metrics. This allows users to compare base models across diverse criteria, to better understand potential trade-offs.
Evaluate and compare base models using your own private data. This capability simplifies the model selection process by allowing organizations to compare how different models behave in real-world settings and assess which models align best with their unique requirements.

Drive robust, measurable insights with new and advanced evaluators
New risk and safety evaluations for image and multimodal content provide an out-of-the-box way to assess the frequency and severity of harmful content in generative AI interactions containing imagery. These evaluations can help inform targeted mitigations and demonstrate production-readiness.
Evaluations for quality metrics are now generally available for text-based generative AI models and apps. Using no-code and/or code-first experiences, users can assess generative AI models and applications for key quality attributes such as groundedness, coherence, recall, and fluency.

Operationalize evaluations as part of your GenAIOps
A new Python API allows developers to run built-in and custom text-based evaluations remotely in the cloud, streamlining the evaluation process at scale with the convenience of easy CI/CD integration.
GitHub Actions for GenAI evaluations enable developers to run automated evaluations of their models and applications, for faster experimentation and iteration within their coding environment.
In related news, continuous online evaluations of generated outputs are now available, allowing teams to monitor and improve AI applications in production. Additionally, as applications transition from development to production, developers will soon have the capability to document and share evaluation results, along with other key information about their fine-tuned models or applications, through AI reports.
With these expanded capabilities, cross-functional teams are empowered to iterate, launch, and govern their GenAI applications with greater observability and confidence.

New benchmarking experience in Azure AI Foundry

Picture this: you're a developer exploring the Azure AI model catalog, trying to find the right fit for your use case. You use search filters, explore available models, and read the model cards to identify strong contenders, but you're still not sure which model to choose. Why? Selecting the optimal model for an application isn't just about learning as much as you can about each individual model. Organizations need to understand and compare performance from multiple angles (accuracy, relevance, coherence, cost, and computational efficiency) to understand the trade-offs.

Now, an enhanced benchmarking experience enables developers to view comprehensive, detailed performance data for models in the Azure AI model catalog while also allowing for direct comparison across multiple models. This gives developers a clearer picture of each model's relative performance across critical metrics to identify models that meet business requirements. Azure AI Foundry supports four categories of metrics to facilitate robust comparisons:

Quality: Assess the accuracy, groundedness, coherence, and relevance of each model's output.
Cost: Assess estimated costs associated with deploying and running the models.
Latency: Assess the response times for each model to understand speed and responsiveness.
Throughput: Assess the number of tasks each model can process within a specific time frame, to gauge scalability and efficiency.

Learn more in our documentation.

Evaluate and compare models using your own data

Once you have compared various models using benchmarks on public data, you might still be wondering which model will perform best for your specific use case. At this point, it is more helpful to compare each model using your own test dataset, one that reflects the inputs and outputs typical of your intended use case. We're excited to provide developers with an easier way to do just that. Now, developers can easily evaluate and compare both base models and fine-tuned models from within the Azure AI Foundry portal. This is also helpful when comparing base models to fine-tuned models, to see the impact of your training data. With this update, developers can assess models using their own test data and pre-built quality and safety evaluators, for easier side-by-side model comparisons and data-driven decisions when building GenAI applications.

Key components of this update, now available in public preview, include:

A new entry point in the Azure AI model catalog to guide users through model evaluation.
Expanded support for Azure OpenAI Service and Models as a Service (MaaS) models, so developers can evaluate these models and user-defined prompts directly within the Azure AI Foundry portal.
A simplified evaluation setup wizard, so both experienced GenAI developers and those new to GenAI can navigate and evaluate models with ease.
A new tool for real-time test data generation, helping developers rapidly create sample data for evaluation purposes.
An enhanced evaluation results page to help developers visualize and quickly grasp the trade-offs between various evaluation metrics.

Learn more in our documentation.
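The own-data evaluation flow above expects a test dataset of inputs and outputs. The sketch below prepares a minimal one in JSONL form; the field names (query, context, response, ground_truth) are illustrative, and the setup wizard typically lets you map your own column names to the evaluators' expected inputs.

# Build a tiny evaluation dataset in JSONL form: one record per line with the model
# input, any grounding context, the generated output, and a reference answer.
import json

rows = [
    {
        "query": "Which tent is the most waterproof?",
        "context": "The Alpine Explorer tent has a 3000 mm waterproof rating.",
        "response": "The Alpine Explorer tent is the most waterproof option.",
        "ground_truth": "The Alpine Explorer tent.",
    },
    {
        "query": "What is the warranty period?",
        "context": "All products include a two-year limited warranty.",
        "response": "Products are covered by a two-year limited warranty.",
        "ground_truth": "Two years.",
    },
]

with open("eval_data.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")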
Evaluate for risk and safety in image and multimodal content

Risk and safety evaluations for images and multimodal content are now available in public preview in Azure AI Foundry. These evaluations can help organizations assess the frequency and severity of harmful content in human and AI-generated outputs to prioritize relevant risk mitigations. For example, these evaluations can help assess content risks in cases where 1) text inputs yield image outputs, 2) a combination of image and text inputs produces text outputs, and 3) images containing text (like memes) generate text and/or image outputs.

Azure AI Foundry provides AI-assisted evaluators to streamline these evaluations at scale, where each evaluator functions like a grading assistant, using consistent and predefined grading instructions to assess large datasets of inputs and outputs across specific target metrics. Today, organizations can use these evaluations to assess generated outputs for hateful or unfair, violent, sexual, and self-harm-related content, as well as protected materials that may present infringement risks. These evaluators use a large multimodal language model hosted by Microsoft to not only grade the test datasets but also provide explanations for the evaluation results so they are interpretable and actionable.

Making evaluations actionable is essential. Evaluation insights can help organizations compare base models and fine-tuned models to see which models are a better fit for their application. Or, they can help inform proactive steps to mitigate risk, such as activating image and multimodal content filters in Azure AI Content Safety to detect and block harmful content in real time. After making changes, users can re-run an evaluation and compare the new scores to their baseline results side-by-side to understand the impact of their work and demonstrate production readiness for stakeholders. Learn more in our documentation.

Evaluate GenAI models and applications for quality

We're excited to announce the general availability of quality evaluators for GenAI in Azure AI Foundry, accessible through the code-first Azure AI Foundry SDK experience and the no-code Azure AI Foundry portal. These evaluators provide a scalable way to assess models and applications against key performance and quality metrics. This update also includes improvements to pre-existing AI-assisted metrics, as well as explanations for evaluation results to help ensure they are interpretable and actionable.

Generally available evaluators include:

AI-assisted evaluators (these require an Azure OpenAI deployment to assist the evaluation), which are commonly used for retrieval-augmented generation (RAG) and business and creative writing scenarios:
• Groundedness
• Retrieval
• Relevance
• Coherence
• Fluency
• Similarity

Natural Language Processing (NLP) evaluators, which support assessments of the accuracy, precision, and recall of generative AI:
• F1 score
• ROUGE score
• BLEU score
• GLEU score
• METEOR score

Learn more in our documentation.

Announcing a Python API for remote evaluation

Previously, developers could only run local evaluations on their own machines when using the Azure AI Foundry SDK. Now, we're providing developers with a new, simplified Python API to run remote evaluations in the cloud. This API supports both built-in and custom prompt-based evaluators, allowing for scalable evaluation runs, seamless integration into CI/CD pipelines, and a more streamlined evaluation workflow. Plus, remote evaluation means developers don't need to manage their own infrastructure for orchestrating evaluations. Instead, they can offload the task to Azure. Learn more in our documentation.
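To make the SDK experience concrete, here is a minimal local evaluation run using two of the generally available evaluators, reusing the eval_data.jsonl file sketched earlier. It assumes `pip install azure-ai-evaluation` and an Azure OpenAI deployment for the AI-assisted groundedness metric; exact constructor and keyword names have shifted across preview releases, so treat them as assumptions to check against the documentation. The remote (cloud) evaluation API mentioned above uses a separate, project-based client and is not shown here.

# Minimal local evaluation with built-in evaluators from the Azure AI Evaluation SDK.
import os
from azure.ai.evaluation import evaluate, GroundednessEvaluator, F1ScoreEvaluator

# AI-assisted evaluators need a judge model; NLP evaluators such as F1 do not.
model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": os.environ["AZURE_OPENAI_DEPLOYMENT"],
}

result = evaluate(
    data="eval_data.jsonl",                      # columns: query, context, response, ground_truth
    evaluators={
        "groundedness": GroundednessEvaluator(model_config),
        "f1_score": F1ScoreEvaluator(),
    },
    output_path="eval_results.json",
)
print(result["metrics"])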
GitHub Actions for GenAI evaluations are now available

Given trade-offs between business impact, risk, and cost, you need to be able to continuously evaluate your AI applications and run A/B experiments at scale. We are significantly simplifying this process with GitHub Actions that can be integrated seamlessly into existing CI/CD workflows in GitHub. With these actions, you can now run automated evaluations after each commit, using the Azure AI Foundry SDK to assess your applications for metrics such as groundedness, coherence, and fluency. First announced at GitHub Universe in October, these capabilities are now available in public preview.

GitHub Actions for online A/B experimentation are available to try in private preview. These enable developers to seamlessly and automatically run A/B experiments comparing different models, prompts, and/or general UX changes to an AI application after deploying to production, as part of a CD workflow. Analysis via out-of-the-box model monitoring metrics and custom metrics is seamless, with results posted back directly to GitHub. To participate in the private preview, please sign up here.

Build production-ready GenAI apps with Azure AI Foundry

Want to learn about more ways to build trustworthy AI applications? Here are other exciting announcements from Microsoft Ignite to support your GenAIOps and governance workflows:

Explore tracing and debugging capabilities to drive continuous improvement
Monitor and improve GenAI apps in production
Document and share evaluation results with business stakeholders

Whether you're joining in person or online, we can't wait to see you at Microsoft Ignite 2024. We'll share the latest from Azure AI and go deeper into best practices for evaluations and trustworthy AI in these sessions:

Microsoft Ignite Keynote
Trustworthy AI: Future trends and best practices
Trustworthy AI: Advanced risk evaluation and mitigation
Azure AI and the dev toolchain you need to infuse AI in all your apps
Simulate, evaluate, and improve GenAI outputs with Azure AI Foundry

Please note: This article was edited on Dec 30, 2024 to reflect the availability of risk and safety evaluations for images in public preview in Azure AI Foundry. This feature was previously announced as "coming soon" at Microsoft Ignite.