Semantic Kernel
58 Topics

Staying in the flow: SleekFlow and Azure turn customer conversations into conversions
A customer adds three items to their cart but never checks out. Another asks about shipping, gets stuck waiting eight minutes, and drops the call. A lead responds to an offer but never gets a follow-up in time. Each of these moments represents lost revenue, and they happen to businesses every day. SleekFlow was founded in 2019 to help companies turn those almost-lost-customer moments into connection, retention, and growth. Today we serve more than 2,000 mid-market and enterprise organizations across industries including retail and e-commerce, financial services, healthcare, travel and hospitality, telecommunications, real estate, and professional services. In total, those customers rely on SleekFlow to orchestrate more than 600,000 daily customer interactions across WhatsApp, Instagram, web chat, email, and more.

Our name reflects what makes us different. Sleek is about unified, polished experiences—consolidating conversations into one intelligent, enterprise-ready platform. Flow is about orchestration—AI and human agents working together to move each conversation forward, from first inquiry to purchase to renewal.

The drive for enterprise-ready agentic AI

Enterprises today expect always-on, intelligent conversations—but delivering that at scale is daunting. When we set out to build AgentFlow, our agentic AI platform, we quickly ran into familiar roadblocks: downtime that disrupted peak-hour interactions, vector search delays that hurt accuracy, and costs that ballooned under multi-tenant workloads. Development slowed because of limited compatibility with other technologies, while customer onboarding stalled without clear compliance assurances. To move past these barriers, we needed a foundation that could deliver the performance, trust, and global scale enterprises demand.

The platform behind the flow: How Azure powers AgentFlow

We chose Azure because building AgentFlow required more than raw compute power. Chatbots built on a single-agent model often stall out. They struggle to retrieve the right context, they miss critical handoffs, and they return answers too slowly to keep a customer engaged. To fix that, we needed an ecosystem capable of supporting a team of specialized AI agents working together at enterprise scale. Azure Cosmos DB provides the backbone for memory and context, managing short-term interactions, long-term histories, and vector embeddings in containers that respond in 15–20 milliseconds. Our agents use Azure OpenAI models in Azure AI Foundry to understand and generate responses natively in multiple languages. Whether in English, Chinese, or Portuguese, the responses feel natural and aligned with the brand. Semantic Kernel acts as the conductor, orchestrating multiple agents, each of which retrieves the necessary knowledge and context, including chat histories, transactional data, and vector embeddings, directly from Azure Cosmos DB. For example, one agent could be retrieving pricing data, another summarizing it, and a third preparing it for a human handoff.

The result is not just responsiveness but accuracy. A telecom provider can resolve a billing question while surfacing an upsell opportunity in the same dialogue. A financial advisor can walk into a call with a complete dossier prepared in seconds rather than hours. A retailer can save a purchase by offering an in-stock substitute before the shopper abandons the cart. Each of these conversations is different, yet they all run on the same AgentFlow foundation.
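The article doesn't share SleekFlow's code, but the pattern it describes, with Semantic Kernel coordinating a few narrowly scoped agents that each handle one step, can be sketched briefly. The Python sketch below is purely illustrative: the agent names, instructions, and the pricing example are invented here, and the exact agent APIs vary between Semantic Kernel releases.

import asyncio
from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

# One chat-completion service shared by three single-purpose agents,
# mirroring the "pricing / summary / handoff" example above.
service = AzureChatCompletion()  # endpoint, key, and deployment come from environment variables

pricing_agent = ChatCompletionAgent(
    service=service, name="Pricing",
    instructions="State the pricing facts relevant to the customer's question.")
summary_agent = ChatCompletionAgent(
    service=service, name="Summarizer",
    instructions="Summarize the pricing facts in two brand-friendly sentences.")
handoff_agent = ChatCompletionAgent(
    service=service, name="Handoff",
    instructions="Write a short internal note so a human agent can take over.")

async def main() -> None:
    question = "How much does the premium plan cost for 50 seats?"
    facts = await pricing_agent.get_response(messages=question)
    summary = await summary_agent.get_response(messages=str(facts))
    note = await handoff_agent.get_response(messages=str(summary))
    print(summary)
    print("--- handoff note ---")
    print(note)

asyncio.run(main())

In a production system like the one described, each agent would also pull its context (chat history, transactional data, embeddings) from a store such as Azure Cosmos DB before responding; that plumbing is omitted here.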
Fast, fluent, and focused: Azure keeps conversations moving

Speed is the heartbeat of a good conversation. A delayed answer feels like a dropped call, and an irrelevant one breaks trust. For AgentFlow to keep customers engaged, every operation behind the scenes has to happen in milliseconds. A single interaction can involve dozens of steps. One agent pulls product information from embeddings, another checks it against structured policy data, and a third generates a concise, brand-aligned response. If any of these steps lag, the dialogue falters. On Azure, they don't.

Azure Cosmos DB manages conversational memory and agent state across dedicated containers for short-term exchanges, long-term history, and vector search. Sharded DiskANN indexing powers semantic lookups that resolve in the 15–20 millisecond range—fast enough that the customer never feels a pause. Microsoft's Phi-4 model, along with Azure OpenAI in Foundry Models such as o3-mini and o4-mini, provides the reasoning, and Azure Container Apps scales elastically, so performance holds steady during event-driven bursts, such as campaign broadcasts that can push the platform from a few to thousands of conversations per minute, and during daily peak-hour surges. To support that level of responsiveness, we run Azure Container Apps on the pay-as-you-go consumption plan, using KEDA-based autoscaling to expand from five idle containers to more than 160 within seconds. Meanwhile, Microsoft Orleans coordinates lightweight in-memory clustering to keep conversations sleek and flowing.

The results are tangible. Retrieval-augmented generation recall improved from 50 to 70 percent. Execution speed is about 50 percent faster. For SleekFlow's customers, that means carts are recovered before they're abandoned, leads are qualified in real time, and support inquiries move forward instead of stalling out. With Azure handling the complexity under the hood, conversations flow naturally on the surface—and that's what keeps customers engaged.

Secure enough for enterprises, human enough for customers

AgentFlow was built with security-by-design as a first principle, giving businesses confidence that every interaction is private, compliant, and reliable. On Azure, every AI agent operates inside guardrails enterprises can depend on. Azure Cosmos DB enforces strict per-tenant isolation through logical partitioning, encryption, and role-based access control, ensuring chat histories, knowledge bases, and embeddings remain auditable and contained. Models deployed through Azure AI Foundry, including Azure OpenAI and Microsoft Phi, process data entirely within SleekFlow's Azure environment, that data is never used to train public models, and activity is logged for transparency. And Azure's certifications—including ISO 27001, SOC 2, and GDPR compliance—are backed by continuous monitoring and regional data residency options, supporting compliance at global scale.

But trust is more than a checklist of certifications. AgentFlow brings human-like fluency and empathy to every interaction, powered by Azure OpenAI running with high token-per-second throughput so responses feel natural in real time. Quality control isn't left to chance. Human override workflows are orchestrated through Azure Container Apps and Azure App Service, ensuring AI agents can carry conversations confidently until they're ready to hand off to human agents. Enterprises gain the confidence to let AI handle revenue-critical moments, knowing Azure provides the foundation and SleekFlow provides the human-centered design.
Shaping the next era of conversational AI on Azure

The benefits of Azure show up not only in customer conversations but also in the way our own teams work. Faster processing speeds and high token-per-second throughput reduce latency, so we spend less time debugging and more time building. Stable infrastructure minimizes downtime and troubleshooting, lowering operational costs. That same reliability and scalability have transformed the way we engineer AgentFlow. AgentFlow started as part of our monolithic system. Shipping new features used to take about a month of development and another week of heavy testing to make sure everything held together. After moving AgentFlow to a microservices architecture on Azure Container Apps, we can now deploy updates almost daily with no downtime or customer impact, thanks to native support for rolling updates and blue-green deployments.

This agility is what excites us most about what's ahead. With Azure as our foundation, SleekFlow is not simply keeping pace with the evolution of conversational AI—we are shaping what comes next. Every interaction we refine, every second we save, and every workflow we streamline brings us closer to our mission: keeping conversations sleek, flowing, and valuable for enterprises everywhere.

Implementing MCP Remote Servers with Azure Function App and GitHub Copilot Integration
Introduction

In the evolving landscape of AI-driven applications, the ability to seamlessly connect large language models (LLMs) with external tools and data sources is becoming a cornerstone of intelligent system design. Enter the Model Context Protocol (MCP), a specification that enables AI agents to discover and invoke tools dynamically, based on context. While MCP is powerful, implementing it from scratch can be daunting. That's where Azure Functions comes in handy. With its event-driven, serverless architecture, Azure Functions now supports a preview extension for MCP, allowing developers to build remote MCP servers that are scalable, secure, and cloud-native. In VS Code, GitHub Copilot Chat in Agent Mode can then connect to your deployed Azure Function App acting as an MCP server, allowing Copilot to leverage the tools and services exposed by your function app.

Why Use Azure Functions for MCP?

Serverless Simplicity: Deploy MCP endpoints without managing infrastructure.
Secure by Design: Leverage HTTPS, system keys, and OAuth via EasyAuth or API Management.
Language Flexibility: Build in .NET, Python, or Node.js using QuickStart templates.
AI Integration: Enable GitHub Copilot, VS Code, or other AI agents to invoke your tools via SSE endpoints.

Prerequisites

Python version 3.11 or higher
Azure Functions Core Tools >= 4.0.7030
Azure Developer CLI
To run and debug locally: Visual Studio Code with the Azure Functions extension

A storage emulator is needed when developing an Azure function app in VS Code; installing the Azurite extension in VS Code meets this requirement. You can run Azurite from within VS Code:

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\Common7\IDE\Extensions\Microsoft\Azure Storage Emulator> .\azurite.exe

Alternatively, you can run Azurite in a Docker container:

docker run -p 10000:10000 -p 10001:10001 -p 10002:10002 \
  mcr.microsoft.com/azure-storage/azurite

For more information about setting up Azurite, see "Use Azurite emulator for local Azure Storage development" on Microsoft Learn.

GitHub Repositories

The following GitHub repos are needed to set up this PoC:

Repository for the MCP server using Azure Function App: https://github.com/mafzal786/mcp-azure-functions-python.git
Repository for the AI Foundry agent as MCP client: https://github.com/mafzal786/ai-foundry-agent-with-remote-mcp-using-azure-functionapp.git

Clone the repository

Run the following command to clone the repository and start building your MCP server with Azure Functions:

git clone https://github.com/mafzal786/mcp-azure-functions-python.git

Run the MCP server in VS Code

Once cloned, open the folder in VS Code and create a virtual environment. In a new terminal window, change into the src directory, install the Python dependencies, and start the function host locally:

cd src
pip install -r requirements.txt
func start

Note: by default this uses the webhooks route /runtime/webhooks/mcp/sse. Later, in Azure, we will use this to set the key on client/host calls: /runtime/webhooks/mcp/sse?code=<system_key>
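The tools this server exposes (hello_mcp, get_stockprice, get_weatheralerts) are ordinary Python functions registered with the preview MCP tool trigger binding. As a rough sketch of the shape rather than the repo's exact code, a tool might look like the following; the decorator arguments follow the Azure Functions MCP extension preview and may differ slightly between versions, and the returned price is a placeholder.

import json
import azure.functions as func

app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)

# The MCP extension expects the tool's input schema as a JSON string.
tool_properties = json.dumps([
    {"propertyName": "symbol", "propertyType": "string", "description": "Ticker symbol, e.g. TSLA"}
])

@app.generic_trigger(
    arg_name="context",
    type="mcpToolTrigger",
    toolName="get_stockprice",
    description="Returns the latest price for a ticker symbol.",
    toolProperties=tool_properties,
)
def get_stockprice(context: str) -> str:
    # The trigger delivers the tool call as JSON; argument values sit under "arguments".
    args = json.loads(context).get("arguments", {})
    symbol = args.get("symbol", "MSFT")
    # A real implementation would call a market-data API here.
    return f"The latest price for {symbol} is 123.45 (placeholder)."

The other tools follow the same pattern, each as another decorated function that the Functions host discovers when it starts.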
MCP Inspector

In a new terminal window, install and run MCP Inspector:

npx @modelcontextprotocol/inspector

Open the Inspector in the browser (for example, http://127.0.0.1:6274/#resources) and provide the generated proxy session token. In the URL field, enter http://localhost:7071/runtime/webhooks/mcp/sse and click "Connect". Once connected, click List Tools under Tools, select the hello_mcp tool, and click "Run Tool" to test it. Then select another tool, such as get_stockprice, and run it as well.

Deploy the Function App to Azure from VS Code

To deploy the function app to Azure from VS Code, make sure the Azure Tools extension is enabled in VS Code. To learn more about the extension, see Azure Extensions; if your VS Code environment is not set up for Azure development, follow "Configure Visual Studio Code for Azure development with .NET" on Microsoft Learn.

Once Azure Tools is set up, sign in to your Azure account. After sign-in completes, you should see all of your existing resources in the Resources view, where they can be managed directly from VS Code. Find your Function App under Resources, right-click it, and select "Deploy to Function App". If the app has already been deployed, a pop-up asks you to confirm; click "Deploy". VS Code then starts deploying your function app to Azure, with progress shown in the Azure tab.

Once the deployment is complete, you can view the function app and all of its tools in the Azure portal under the function app.

Get the mcp_extension key from Functions → App Keys in the Function App. This mcp_extension key is needed in the mcp.json file in VS Code if you want to test the MCP server using GitHub Copilot. Your entries in mcp.json will look like this, for example:

{
  "inputs": [
    {
      "type": "promptString",
      "id": "functions-mcp-extension-system-key",
      "description": "Azure Functions MCP Extension System Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "functionapp-name",
      "description": "Azure Functions App Name"
    }
  ],
  "servers": {
    "remote-mcp-function": {
      "type": "sse",
      "url": "https://${input:functionapp-name}.azurewebsites.net/runtime/webhooks/mcp/sse",
      "headers": {
        "x-functions-key": "${input:functions-mcp-extension-system-key}"
      }
    },
    "local-mcp-function": {
      "type": "sse",
      "url": "http://0.0.0.0:7071/runtime/webhooks/mcp/sse"
    }
  }
}

Test the Azure Function MCP Server in MCP Inspector

Launch MCP Inspector and enter the Azure Function URL in the Inspector. For authentication, use the mcp_extension key as the bearer token.

Testing an MCP server with GitHub Copilot

Testing an MCP server with GitHub Copilot involves configuring and using the server within your development environment so it can provide additional context and capabilities to Copilot Chat.

Ensure Agent Mode is enabled: Open Copilot Chat in Visual Studio Code and select "Agent" mode. This mode allows Copilot to interact with external tools and services, including MCP servers.
Add the MCP server: Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P) and run the command MCP: Add Server. Follow the prompts to configure the server. You can choose to add it to your workspace settings (creating a .vscode/mcp.json file).
Select HTTP or Server-Sent Events as the transport, specify the URL and press Enter, provide a name of your choice, and select Global or Workspace scope (Workspace in this example). This generates an mcp.json file in .vscode, or adds a new entry if mcp.json already exists. Click Start to start the server, and make sure your Azure function app is running locally with func start. You can then type prompts in Copilot Chat and watch them invoke the MCP tools; the VS Code terminal output shows the corresponding calls.

Testing an MCP server with Claude Desktop

Claude Desktop is a standalone AI application that lets users interact with Claude models directly from their desktop. You can download it from the Claude download page. It is another convenient way to test your MCP server running in the Azure function app. Modify claude_desktop_config.json with the following; in a Windows environment, the file is located at C:\Users\<username>\AppData\Roaming\Claude:

{
  "mcpServers": {
    "my mcp": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "http://localhost:7071/runtime/webhooks/mcp/sse"
      ]
    }
  }
}

Note: If claude_desktop_config.json does not exist, open Settings in Claude Desktop under your user profile and visit the Developer tab.

You will see your MCP server listed in Claude Desktop. Type a prompt such as "What is the stock price of Tesla?". After submitting it, you will notice Claude invoking the get_stockprice tool from the MCP server that is running locally and was configured in the JSON file earlier; click Allow once or Allow always, and the output is displayed. A weather-related prompt works the same way, invoking the get_weatheralerts tool from the MCP server.

Azure AI Foundry agent as MCP Client

Use the following GitHub repo to set up an Azure AI Foundry agent as an MCP client:

git clone https://github.com/mafzal786/ai-foundry-agent-with-remote-mcp-using-azure-functionapp.git

Open the code in VS Code and follow the instructions in the repo's README.md. When you execute the code, the output appears in VS Code. The message in this code is hard-coded; change the content to "what is weather advisory for Florida" and rerun the program, and it will call the get_weatheralerts tool and show the corresponding output.

Conclusion

The integration of Model Context Protocol (MCP) with Azure Functions marks a pivotal step in democratizing AI agent development. By leveraging Azure's serverless architecture, developers can now build remote MCP servers that scale automatically, integrate seamlessly with other Azure services, and expose modular tools to intelligent agents like GitHub Copilot. This setup not only simplifies the deployment and management of MCP servers but also enhances the developer experience — allowing tools to be invoked contextually by AI agents in environments like VS Code, GitHub Codespaces, or Copilot Studio[2].
Whether you're building a tool to query logs, calculate metrics, or manage data, Azure Functions provides the flexibility, security, and scalability needed to bring your AI-powered workflows to life. As the MCP spec continues to evolve and GitHub Copilot expands its agentic capabilities, this architecture positions you to stay ahead — offering a robust foundation for cloud-native AI tooling that's both powerful and future-proof.

The Future of AI: The paradigm shifts in Generative AI Operations
Dive into the transformative world of Generative AI Operations (GenAIOps) with Microsoft Azure. Discover how businesses are overcoming the challenges of deploying and scaling generative AI applications. Learn about the innovative tools and services Azure AI offers, and how they empower developers to create high-quality, scalable AI solutions. Explore the paradigm shift from MLOps to GenAIOps and see how continuous improvement practices ensure your AI applications remain cutting-edge. Join us on this journey to harness the full potential of generative AI and drive operational excellence.

The Future of AI: Customizing AI agents with the Semantic Kernel agent framework
The blog post Customizing AI agents with the Semantic Kernel agent framework discusses the capabilities of the Semantic Kernel SDK, an open-source tool developed by Microsoft for creating AI agents and multi-agent systems. It highlights the benefits of using single-purpose agents within a multi-agent system to achieve more complex workflows with improved efficiency. The Semantic Kernel SDK offers features like telemetry, hooks, and filters to ensure secure and responsible AI solutions, making it a versatile tool for both simple and complex AI projects.

The Future of AI: Computer Use Agents Have Arrived
Discover the groundbreaking advancements in AI with Computer Use Agents (CUAs). In this blog, Marco Casalaina shares how to use the Responses API from Azure OpenAI Service, showcasing how CUAs can launch apps, navigate websites, and reason through tasks. Learn how CUAs utilize multimodal models for computer vision and AI frameworks to enhance automation. Explore the differences between CUAs and traditional Robotic Process Automation (RPA), and understand how CUAs can complement RPA systems. Dive into the future of automation and see how CUAs are set to revolutionize the way we interact with technology.

Essential Microsoft Resources for MVPs & the Tech Community from the AI Tour
Unlock the power of Microsoft AI with redeliverable technical presentations, hands-on workshops, and open-source curriculum from the Microsoft AI Tour! Whether you're a Microsoft MVP, Developer, or IT Professional, these expertly crafted resources empower you to teach, train, and lead AI adoption in your community. Explore top breakout sessions covering GitHub Copilot, Azure AI, Generative AI, and security best practices—designed to simplify AI integration and accelerate digital transformation. Dive into interactive workshops that provide real-world applications of AI technologies. Take it a step further with Microsoft's Open-Source AI Curriculum, offering beginner-friendly courses on AI, Machine Learning, Data Science, Cybersecurity, and GitHub Copilot—perfect for upskilling teams and fostering innovation. Don't just learn—lead. Access these resources, host impactful training sessions, and drive AI adoption in your organization. Start sharing today! Explore now: Microsoft AI Tour Resources.

Azure AI Foundry: Advancing OpenTelemetry and delivering unified multi-agent observability
Microsoft is enhancing multi-agent observability by introducing new semantic conventions to OpenTelemetry, developed collaboratively with Outshift, Cisco's incubation engine. These additions—built upon OpenTelemetry and W3C Trace Context—establish standardized practices for tracing and telemetry within multi-agent systems, facilitating consistent logging of key metrics for quality, performance, safety, and cost. This systematic approach enables more comprehensive visibility into multi-agent workflows, including tool invocations and collaboration. These advancements have been integrated into Azure AI Foundry, Microsoft Agent Framework, Semantic Kernel, and the Azure AI packages for LangChain, LangGraph, and the OpenAI Agents SDK, enabling customers to get unified observability for agentic systems built using any of these frameworks with Azure AI Foundry. The additional semantic conventions and integration across different frameworks equip developers to monitor, troubleshoot, and optimize their AI agents in a unified solution with increased efficiency and valuable insights.

"Outshift, Cisco's Incubation Engine, worked with Microsoft to add new semantic conventions in OpenTelemetry. These conventions standardize multi-agent observability and evaluation, giving teams comprehensive insights into their AI systems." Giovanna Carofiglio, Distinguished Engineer, Cisco

Multi-agent observability challenges

Multi-agent systems involve multiple interacting agents with diverse roles and architectures. Such systems are inherently dynamic—they adapt in real time by decomposing complex tasks into smaller, manageable subtasks and distributing them across specialized agents. Each agent may invoke multiple tools, often in parallel or in sequence, to fulfill its part of the task, resulting in emergent workflows that are non-linear and highly context dependent. Given the dynamic nature of multi-agent systems and the coordination required across agents, observability becomes critical for debugging, performance tuning, security, and compliance in such systems.

Multi-agent observability presents unique challenges that traditional GenAI telemetry standards fail to address. Traditional observability conventions are optimized for single-agent reasoning-path visibility and lack the semantic depth to capture collaborative workflows across multiple agents. In multi-agent systems, tasks are often decomposed and distributed dynamically, requiring visibility into agent roles, task hierarchies, tool usage, and decision-making processes. Without standardized task spans and a unified namespace, it's difficult to trace cross-agent coordination, evaluate task outcomes, or analyze retry patterns. These gaps hinder white-box observability, making it hard to assess performance, safety, and quality across complex agentic workflows.

Extending OpenTelemetry with multi-agent observability

Microsoft has proposed new spans and attributes for the OpenTelemetry GenAI agent and framework semantic conventions. They enrich insights and capture the complexity of agent, tool, task, and plan interactions in multi-agent systems. Below is a list of all the new additions proposed to OpenTelemetry.

New span:
- execute_task: Captures task planning and event propagation, providing insights into how tasks are decomposed and distributed.

New child spans under invoke_agent:
- agent_to_agent_interaction: Traces communication between agents.
- agent.state.management: Covers effective context and short- or long-term memory management.
- agent_planning: Logs the agent's internal planning steps.
- agent orchestration: Captures agent-to-agent orchestration.

New attributes in the invoke_agent span:
- tool_definitions: Describes the tool's purpose or configuration.
- llm_spans: Records model call spans.

New attributes in the execute_tool span:
- tool.call.arguments: Logs the arguments passed during tool invocation.
- tool.call.results: Records the results returned by the tool.

New event:
- Evaluation, with attributes (name, error.type, label): Enables structured evaluation of agent performance and decision-making.
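As a rough illustration of how these conventions might surface in code, the sketch below emits an execute_tool span carrying the proposed attributes using the standard OpenTelemetry Python API. The span name, the gen_ai.tool.name attribute, and the tool logic are illustrative; the new attribute names come from the list above and are still proposals, so the exact keys may change.

import json
from opentelemetry import trace

tracer = trace.get_tracer("agent-framework-demo")

def lookup_price(symbol: str) -> float:
    # Placeholder for a real market-data call.
    return 123.45

def execute_tool(tool_name: str, arguments: dict) -> dict:
    # One span per tool invocation, named after the tool being executed.
    with tracer.start_as_current_span(f"execute_tool {tool_name}") as span:
        span.set_attribute("gen_ai.tool.name", tool_name)
        # Proposed attributes from the new conventions.
        span.set_attribute("tool.call.arguments", json.dumps(arguments))
        result = {"price": lookup_price(arguments["symbol"])}
        span.set_attribute("tool.call.results", json.dumps(result))
        return result

print(execute_tool("get_stockprice", {"symbol": "TSLA"}))

Because this uses only the OpenTelemetry API, the same code works whether spans are exported to Azure Monitor, an OTLP collector, or nowhere at all during local testing.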
More details can be found in the following pull requests merged into OpenTelemetry:

Add tool definition plus tool-related attributes in invoke-agent, inference, and execute-tool spans
Capture evaluation results for GenAI applications

Azure AI Foundry delivers unified observability for Microsoft Agent Framework, LangChain, LangGraph, OpenAI Agents SDK

Agents built with Azure AI Foundry (SDK or portal) get out-of-the-box observability in Foundry. With the new additions, agents built on different frameworks, including Microsoft Agent Framework, Semantic Kernel, LangChain, LangGraph, and the OpenAI Agents SDK, can use Foundry for monitoring, analysis, debugging, and evaluation with full observability. Agents built on Microsoft Agent Framework and Semantic Kernel get out-of-the-box tracing and evaluation support in Foundry Observability. Agents built with LangChain, LangGraph, and the OpenAI Agents SDK can use the corresponding packages and the detailed instructions listed in the documentation to get tracing and evaluation support in Foundry Observability.

Customer benefits

With standardized multi-agent observability and support across multiple agent frameworks, customers get the following benefits:

Unified observability platform for multi-agent systems: Foundry Observability is the unified multi-agent observability platform for agents built with the Foundry SDK or other agent frameworks such as Microsoft Agent Framework, LangGraph, LangChain, and the OpenAI Agents SDK.
End-to-end visibility into multi-agent systems: Customers can see not just what the system did, but how and why—from user request, through agent collaboration and tool usage, to final output.
Faster debugging and root-cause analysis: When something goes wrong (for example, a wrong answer, a safety violation, or a performance bottleneck), customers can trace the exact path, see which agent, tool, or task failed, and why.
Quality and safety assurance: Built-in metrics and evaluation events (e.g., task success and validation scores) help customers monitor and improve the quality and safety of their AI workflows.
Cost and performance optimization: Detailed metrics on token usage, API calls, resource consumption, and latency help customers optimize efficiency and cost.

Get started today with end-to-end agent observability with Foundry

Azure AI Foundry Observability is a unified solution for evaluating, monitoring, tracing, and governing the quality, performance, and safety of your AI systems end-to-end—all built into your AI development loop and backed by the power of Azure Monitor for full-stack observability. From model selection to real-time debugging, Foundry Observability capabilities empower teams to ship production-grade AI with confidence and speed. It's observability, reimagined for the enterprise AI era.
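Since Foundry Observability is backed by Azure Monitor, one simple way to see your own OpenTelemetry spans alongside agent telemetry is the azure-monitor-opentelemetry distro. The minimal sketch below assumes you already have an Application Insights connection string; the span name is just an example, and the Foundry documentation covers the framework-specific setup in more detail.

from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

# Export OpenTelemetry traces (including agent and tool spans) to Azure Monitor /
# Application Insights. Replace the connection string with your own resource's value.
configure_azure_monitor(
    connection_string="InstrumentationKey=<your-key>;IngestionEndpoint=<your-endpoint>",
)

tracer = trace.get_tracer("my-agent-app")
with tracer.start_as_current_span("invoke_agent customer_support"):
    # Agent or framework calls made here are traced and exported automatically
    # when their libraries are instrumented.
    pass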
With the above OpenTelemetry enhancements, Azure AI Foundry now provides detailed multi-agent observability for agents built with different frameworks, including Azure AI Foundry, Microsoft Agent Framework, LangChain, LangGraph, and the OpenAI Agents SDK. Learn more about Azure AI Foundry Observability and get end-to-end agent observability today!

Building AI Apps with the Foundry Local C# SDK
What Is Foundry Local?

Foundry Local is a lightweight runtime designed to run AI models directly on user devices. It supports a wide range of hardware (CPU, GPU, NPU) and provides a consistent developer experience across platforms. The SDKs are available in multiple languages, including Python, JavaScript, Rust, and now C#.

Why a C# SDK?

The C# SDK brings Foundry Local into the heart of the .NET ecosystem. It allows developers to:

Download and manage models locally.
Run inference using OpenAI-compatible APIs.
Integrate seamlessly with existing .NET applications.

This means you can build intelligent apps that run offline, reduce latency, and maintain data privacy—all without sacrificing developer productivity.

Bootstrap Process: How the SDK Gets You Started

One of the most developer-friendly aspects of the C# SDK is its automatic bootstrap process. Here's what happens under the hood when you initialise the SDK:

Service discovery and startup: The SDK automatically locates the Foundry Local installation on the device and starts the inference service if it's not already running.
Model download and caching: If the specified model isn't already cached locally, the SDK downloads the most performant model variant (e.g. GPU, CPU, or NPU) for the end user's hardware from the Foundry model catalog. This ensures you're always working with the latest optimised version.
Model loading into the inference service: Once downloaded (or retrieved from cache), the model is loaded into the Foundry Local inference engine, ready to serve requests.

This streamlined process means developers can go from zero to inference with just a few lines of code—no manual setup or configuration required.

Leverage Your Existing AI Stack

One of the most exciting aspects of the Foundry Local C# SDK is its compatibility with popular AI tools:

OpenAI SDK: Foundry Local provides an OpenAI-compliant chat completions (and embeddings) API, meaning that if you're already using the OpenAI chat completions API, you can reuse your existing code with minimal changes.
Semantic Kernel: Foundry Local also integrates well with Semantic Kernel, Microsoft's open-source framework for building AI agents. You can use Foundry Local models as plugins or endpoints within Semantic Kernel workflows—enabling advanced capabilities like memory, planning, and tool calling.

Quick Start Example

Follow these three steps:

1. Create a new project

Create a new C# project and navigate to it:

dotnet new console -n hello-foundry-local
cd hello-foundry-local

2. Install NuGet packages

Install the following NuGet packages into your project:

dotnet add package Microsoft.AI.Foundry.Local --version 0.1.0
dotnet add package OpenAI --version 2.2.0-beta.4

3. Use the OpenAI SDK with Foundry Local

The following example demonstrates how to use the OpenAI SDK with Foundry Local. The code initializes the Foundry Local service, loads a model, and generates a response using the OpenAI SDK.
Copy and paste the following code into a C# file named Program.cs:

using Microsoft.AI.Foundry.Local;
using OpenAI;
using OpenAI.Chat;
using System.ClientModel;
using System.Diagnostics.Metrics;

var alias = "phi-3.5-mini";

var manager = await FoundryLocalManager.StartModelAsync(aliasOrModelId: alias);
var model = await manager.GetModelInfoAsync(aliasOrModelId: alias);

ApiKeyCredential key = new ApiKeyCredential(manager.ApiKey);
OpenAIClient client = new OpenAIClient(key, new OpenAIClientOptions { Endpoint = manager.Endpoint });

var chatClient = client.GetChatClient(model?.ModelId);

var completionUpdates = chatClient.CompleteChatStreaming("Why is the sky blue?");

Console.Write($"[ASSISTANT]: ");
foreach (var completionUpdate in completionUpdates)
{
    if (completionUpdate.ContentUpdate.Count > 0)
    {
        Console.Write(completionUpdate.ContentUpdate[0].Text);
    }
}

Run the code using the following command:

dotnet run

Final thoughts

The Foundry Local C# SDK empowers developers to build intelligent, privacy-preserving applications that run anywhere. Whether you're working on desktop, mobile, or embedded systems, this SDK offers a robust and flexible way to bring AI closer to your users. Ready to get started? Dive into the official documentation: the Getting started guide and the C# reference documentation. You can also contribute to the C# SDK by creating a PR on GitHub: Foundry Local on GitHub.