azure ai model

19 Topics

Unleashing the Power of Model Context Protocol (MCP): A Game-Changer in AI Integration
Artificial Intelligence is evolving rapidly, and one of the most pressing challenges is enabling AI models to interact effectively with external tools, data sources, and APIs. The Model Context Protocol (MCP) solves this problem by acting as a bridge between AI models and external services, creating a standardized communication framework that enhances tool integration, accessibility, and AI reasoning capabilities. What is Model Context Protocol (MCP)? MCP is a protocol designed to enable AI models, such as Azure OpenAI models, to interact seamlessly with external tools and services. Think of MCP as a universal USB-C connector for AI, allowing language models to fetch information, interact with APIs, and execute tasks beyond their built-in knowledge. Key Features of MCP Standardized Communication – MCP provides a structured way for AI models to interact with various tools. Tool Access & Expansion – AI assistants can now utilize external tools for real-time insights. Secure & Scalable – Enables safe and scalable integration with enterprise applications. Multi-Modal Integration – Supports STDIO, SSE (Server-Sent Events), and WebSocket communication methods. MCP Architecture & How It Works MCP follows a client-server architecture that allows AI models to interact with external tools efficiently. Here’s how it works: Components of MCP MCP Host – The AI model (e.g., Azure OpenAI GPT) requesting data or actions. MCP Client – An intermediary service that forwards the AI model's requests to MCP servers. MCP Server – Lightweight applications that expose specific capabilities (APIs, databases, files, etc.). Data Sources – Various backend systems, including local storage, cloud databases, and external APIs. Data Flow in MCP The AI model sends a request (e.g., "fetch user profile data"). The MCP client forwards the request to the appropriate MCP server. The MCP server retrieves the required data from a database or API. The response is sent back to the AI model via the MCP client. Integrating MCP with Azure OpenAI Services Microsoft has integrated MCP with Azure OpenAI Services, allowing GPT models to interact with external services and fetch live data. This means AI models are no longer limited to static knowledge but can access real-time information. Benefits of Azure OpenAI Services + MCP Integration ✔ Real-time Data Fetching – AI assistants can retrieve fresh information from APIs, databases, and internal systems. ✔ Contextual AI Responses – Enhances AI responses by providing accurate, up-to-date information. ✔ Enterprise-Ready – Secure and scalable for business applications, including finance, healthcare, and retail. Hands-On Tools for MCP Implementation To implement MCP effectively, Microsoft provides two powerful tools: Semantic Workbench and AI Gateway. Microsoft Semantic Workbench A development environment for prototyping AI-powered assistants and integrating MCP-based functionalities. Features: Build and test multi-agent AI assistants. Configure settings and interactions between AI models and external tools. Supports GitHub Codespaces for cloud-based development. Explore Semantic Workbench Workbench interface examples Microsoft AI Gateway A plug-and-play interface that allows developers to experiment with MCP using Azure API Management. Features: Credential Manager – Securely handle API credentials. Live Experimentation – Test AI model interactions with external tools. Pre-built Labs – Hands-on learning for developers. Explore AI Gateway Setting Up MCP with Azure OpenAI Services Step 1: Create a Virtual Environment First, create a virtual environment using Python: python -m venv .venv Activate the environment: # Windows venv\Scripts\activate # MacOS/Linux source .venv/bin/activate Step 2: Install Required Libraries Create a requirements.txt file and add the following dependencies: langchain-mcp-adapters langgraph langchain-openai Then, install the required libraries: pip install -r requirements.txt Step 3: Set Up OpenAI API Key Ensure you have your OpenAI API key set up: # Windows setx OPENAI_API_KEY "<your_api_key> # MacOS/Linux export OPENAI_API_KEY=<your_api_key> Building an MCP Server This server performs basic mathematical operations like addition and multiplication. Create the Server File First, create a new Python file: touch math_server.py Then, implement the server: from mcp.server.fastmcp import FastMCP # Initialize the server mcp = FastMCP("Math") MCP.tool() def add(a: int, b: int) -> int: return a + b MCP.tool() def multiply(a: int, b: int) -> int: return a * b if __name__ == "__main__": mcp.run(transport="stdio") Your MCP server is now ready to run. Building an MCP Client This client connects to the MCP server and interacts with it. Create the Client File First, create a new file: touch client.py Then, implement the client: import asyncio from mcp import ClientSession, StdioServerParameters from langchain_openai import ChatOpenAI from mcp.client.stdio import stdio_client # Define server parameters server_params = StdioServerParameters( command="python", args=["math_server.py"], ) # Define the model model = ChatOpenAI(model="gpt-4o") async def run_agent(): async with stdio_client(server_params) as (read, write): async with ClientSession(read, write) as session: await session.initialize() tools = await load_mcp_tools(session) agent = create_react_agent(model, tools) agent_response = await agent.ainvoke({"messages": "what's (4 + 6) x 14?"}) return agent_response["messages"][3].content if __name__ == "__main__": result = asyncio.run(run_agent()) print(result) Your client is now set up and ready to interact with the MCP server. Running the MCP Server and Client Step 1: Start the MCP Server Open a terminal and run: python math_server.py This starts the MCP server, making it available for client connections. Step 2: Run the MCP Client In another terminal, run: python client.py Expected Output 140 This means the AI agent correctly computed (4 + 6) x 14 using both the MCP server and GPT-4o. Conclusion Integrating MCP with Azure OpenAI Services enables AI applications to securely interact with external tools, enhancing functionality beyond text-based responses. With standardized communication and improved AI capabilities, developers can build smarter and more interactive AI-powered solutions. By following this guide, you can set up an MCP server and client, unlocking the full potential of AI with structured external interactions. Next Steps: Explore more MCP tools and integrations. Extend your MCP setup to work with additional APIs. Deploy your solution in a cloud environment for broader accessibility. For further details, visit the GitHub repository for MCP integration examples and best practices. MCP GitHub Repository MCP Documentation Semantic Workbench AI Gateway MCP Video Walkthrough MCP Blog MCP Github End to End Demo
Sharda_Kaur
Mar 27, 2025 Place Educator Developer Blog
63KViews
11likes
6Comments
Azure OpenAI Model Upgrades: Prompt Safety Pitfalls with GPT-4o and Beyond
Upgrading to New Azure OpenAI Models? Beware Your Old Prompts Might Break. I recently worked on upgrading our Azure OpenAI integration from gpt-35-turbo to gpt-4o-mini, expecting it to be a straightforward configuration change. Just update the Azure Foundry resource endpoint, change the model name, deploy the code — and voilà, everything should work as before. Right? Not quite. The Unexpected Roadblock As soon as I deployed the updated code, I started seeing 400 status errors from the OpenAI endpoint. The message was cryptic: The response was filtered due to the prompt triggering Azure OpenAI's content management policy. At first, I assumed it was a bug in my SDK call or a malformed payload. But after digging deeper, I realized this wasn’t a technical failure — it was a content safety filter kicking in before the prompt even reached the model. The Prompt That Broke It Here’s the original system prompt that worked perfectly with gpt-35-turbo: YOU ARE A QNA EXTRACTOR IN TEXT FORMAT. YOU WILL GET A SET OF SURVEYJS QNA JSONS. YOU WILL CONVERT THAT INTO A TEXT DOCUMENT. FOR THE QUESTIONS WHERE NO ANSWER WAS GIVEN, MARK THOSE AS NO ANSWER. HERE IS THE QNA: BE CREATIVE AND PROFESSIONAL. I WANT TO GENERATE A DOCUMENT TO BE PUBLISHED. {{$style}} +++++ {{$input}} +++++ This prompt had been reliable for months. But with gpt-4o-mini, it triggered Azure’s new input safety layer, introduced in mid-2024. What Changed with GPT-4o-mini? Unlike gpt-35-turbo, the gpt-4o family: Applies stricter content filtering — not just on the output, but also on the input prompt. Treats system messages and user messages as role-based chat messages, passing them through moderation before the model sees them. Flags prompts that look like prompt injection attempts like aggressive instructions like “YOU ARE…”, “BE CREATIVE”, “GENERATE”, “PROFESSIONAL”. Flags unusual formatting (like `+++++`), artificial delimiters or token markers as it may look like encoded content. In short, the model didn’t even get a chance to process my prompt — it was blocked at the gate. Fixing It: Softening the Prompt The solution wasn’t to rewrite the entire logic, but to soften the system prompt and remove formatting that could be misinterpreted. Here’s what helped: - Replacing “YOU ARE…” with a gentler instruction like “Please help convert the following Q&A data…” - Removing creative directives like “BE CREATIVE” or “PROFESSIONAL” unless clearly contextualized. - Avoiding raw JSON markers and template syntax (`{{ }}`, `+++++`) in the prompt. Once I made these changes, the model responded smoothly — and the upgrade was finally complete. Evolving the Prompt — Not Abandoning It Interestingly, for some prompts I didn’t have to completely eliminate the “YOU ARE…” structure. Instead, I refined it to be more natural and less directive. Here’s a comparison: ❌ Old Prompt (Blocked) ✅ New Prompt (Accepted) YOU ARE A SOURCING AND PROCUREMENT MANAGER. YOU WILL GET BUYER'S REQUIREMENTS IN QNA FORMAT. HERE IS THE QNA: {{$input}} +++++ YOU WILL GENERATE TOP 10 {{$category}} RELATED QUESTIONS THAT CAN BE ASKED OF A SUPPLIER IN JSON FORMAT. THE JSON MUST HAVE QUESTION NUMBER AS THE KEY AND QUESTION TEXT AS THE QUESTION. DON'T ADD ANY DESCRIPTION TEXT OR FORMATTING IN THE OUTPUT. BE CREATIVE AND PROFESSIONAL. I WANT TO GENERATE AN RFX. You are an AI assistant that helps clarify sourcing requirements. You will receive buyer's requirements in QnA format. Here is the QnA: {$input} Your task is to generate the top 10 {$category} related questions that can be asked of a supplier, in JSON format. - The JSON must use the question number as the key and the question text as the value. - Do not include any description text or formatting in the output. - Focus on creating clear, professional, and relevant questions that will help prepare an RFX. Key Takeaways - Model upgrades aren’t just about configuration changes — they can introduce new moderation layers that affect prompt design. - Prompt safety filtering is now a first-class citizen in Azure OpenAI, especially for newer models. - System prompts need to be rewritten with moderation in mind, not just clarity or creativity. This experience reminded me that even small upgrades can surface big learning moments. If you're planning to move to gpt-4o-mini or any newer Azure OpenAI model, take a moment to review your prompts — they might need a little more finesse than before.
ShwetaI
Nov 10, 2025 Place Microsoft Foundry Discussions
675Views
3likes
1Comment
Power Up Your Open WebUI with Azure AI Speech: Quick STT & TTS Integration
Introduction Ever found yourself wishing your web interface could really talk and listen back to you? With a few clicks (and a bit of code), you can turn your plain Open WebUI into a full-on voice assistant. In this post, you’ll see how to spin up an Azure Speech resource, hook it into your frontend, and watch as user speech transforms into text and your app’s responses leap off the screen in a human-like voice. By the end of this guide, you’ll have a voice-enabled web UI that actually converses with users, opening the door to hands-free controls, better accessibility, and a genuinely richer user experience. Ready to make your web app speak? Let’s dive in. Why Azure AI Speech? We use Azure AI Speech service in Open Web UI to enable voice interactions directly within web applications. This allows users to: Speak commands or input instead of typing, making the interface more accessible and user-friendly. Hear responses or information read aloud, which improves usability for people with visual impairments or those who prefer audio. Provide a more natural and hands-free experience especially on devices like smartphones or tablets. In short, integrating Azure AI Speech service into Open Web UI helps make web apps smarter, more interactive, and easier to use by adding speech recognition and voice output features. If you haven’t hosted Open WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure. Proceed to the next step if you have Open WebUI deployed already. Learn More about OpenWeb UI here. Deploy Azure AI Speech service in Azure. Navigate to the Azure Portal and search for Azure AI Speech on the Azure portal search bar. Create a new Speech Service by filling up the fields in the resource creation page. Click on “Create” to finalize the setup. After the resource has been deployed, click on “View resource” button and you should be redirected to the Azure AI Speech service page. The page should display the API Keys and Endpoints for Azure AI Speech services, which you can use in Open Web UI. Settings things up in Open Web UI Speech to Text settings (STT) Head to the Open Web UI Admin page > Settings > Audio. Paste the API Key obtained from the Azure AI Speech service page into the API key field below. Unless you use different Azure Region, or want to change the default configurations for the STT settings, leave all settings to blank. Text to Speech settings (TTS) Now, let's proceed with configuring the TTS Settings on OpenWeb UI by toggling the TTS Engine to Azure AI Speech option. Again, paste the API Key obtained from Azure AI Speech service page and leave all settings to blank. You can change the TTS Voice from the dropdown selection in the TTS settings as depicted in the image below: Click Save to reflect the change. Expected Result Now, let’s test if everything works well. Open a new chat / temporary chat on Open Web UI and click on the Call / Record button. The STT Engine (Azure AI Speech) should identify your voice and provide a response based on the voice input. To test the TTS feature, click on the Read Aloud (Speaker Icon) under any response from Open Web UI. The TTS Engine should reflect Azure AI Speech service! Conclusion And that’s a wrap! You’ve just given your Open WebUI the gift of capturing user speech, turning it into text, and then talking right back with Azure’s neural voices. Along the way you saw how easy it is to spin up a Speech resource in the Azure portal, wire up real-time transcription in the browser, and pipe responses through the TTS engine. From here, it’s all about experimentation. Try swapping in different neural voices or dialing in new languages. Tweak how you start and stop listening, play with silence detection, or add custom pronunciation tweaks for those tricky product names. Before you know it, your interface will feel less like a web page and more like a conversation partner.
suzarilshah
May 16, 2025 Place Educator Developer Blog
2.5KViews
3likes
2Comments
Integrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration
Step 1: Deploying Models on Microsoft Foundry Let us kick things off in the Azure portal. To get our OpenClaw agent thinking like a genius, we need to deploy our models in Microsoft Foundry. For this guide, we are going to focus on deploying gpt-5.2-codex on Microsoft Foundry with OpenClaw. Navigate to your AI Hub, head over to the model catalog, choose the model you wish to use with OpenClaw and hit deploy. Once your deployment is successful, head to the endpoints section. Important: Grab your Endpoint URL and your API Keys right now and save them in a secure note. We will need these exact values to connect OpenClaw in a few minutes. Step 2: Installing and Initializing OpenClaw Next up, we need to get OpenClaw running on your machine. Open up your terminal and run the official installation script: curl -fsSL https://openclaw.ai/install.sh | bash The wizard will walk you through a few prompts. Here is exactly how to answer them to link up with our Azure setup: First Page (Model Selection): Choose "Skip for now". Second Page (Provider): Select azure-openai-responses. Model Selection: Select gpt-5.2-codex , For now only the models listed (hosted on Microsoft Foundry) in the picture below are available to be used with OpenClaw. Follow the rest of the standard prompts to finish the initial setup. Step 3: Editing the OpenClaw Configuration File Now for the fun part. We need to manually configure OpenClaw to talk to Microsoft Foundry. Open your configuration file located at ~/.openclaw/openclaw.json in your favorite text editor. Replace the contents of the models and agents sections with the following code block: { "models": { "providers": { "azure-openai-responses": { "baseUrl": "https://<YOUR_RESOURCE_NAME>.openai.azure.com/openai/v1", "apiKey": "<YOUR_AZURE_OPENAI_API_KEY>", "api": "openai-responses", "authHeader": false, "headers": { "api-key": "<YOUR_AZURE_OPENAI_API_KEY>" }, "models": [ { "id": "gpt-5.2-codex", "name": "GPT-5.2-Codex (Azure)", "reasoning": true, "input": ["text", "image"], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "contextWindow": 400000, "maxTokens": 16384, "compat": { "supportsStore": false } }, { "id": "gpt-5.2", "name": "GPT-5.2 (Azure)", "reasoning": false, "input": ["text", "image"], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "contextWindow": 272000, "maxTokens": 16384, "compat": { "supportsStore": false } } ] } } }, "agents": { "defaults": { "model": { "primary": "azure-openai-responses/gpt-5.2-codex" }, "models": { "azure-openai-responses/gpt-5.2-codex": {} }, "workspace": "/home/<USERNAME>/.openclaw/workspace", "compaction": { "mode": "safeguard" }, "maxConcurrent": 4, "subagents": { "maxConcurrent": 8 } } } } You will notice a few placeholders in that JSON. Here is exactly what you need to swap out: Placeholder Variable What It Is Where to Find It <YOUR_RESOURCE_NAME> The unique name of your Azure OpenAI resource. Found in your Azure Portal under the Azure OpenAI resource overview. <YOUR_AZURE_OPENAI_API_KEY> The secret key required to authenticate your requests. Found in Microsoft Foundry under your project endpoints or Azure Portal keys section. <USERNAME> Your local computer's user profile name. Open your terminal and type whoami to find this. Step 4: Restart the Gateway After saving the configuration file, you must restart the OpenClaw gateway for the new Foundry settings to take effect. Run this simple command: openclaw gateway restart Configuration Notes & Deep Dive If you are curious about why we configured the JSON that way, here is a quick breakdown of the technical details. Authentication Differences Azure OpenAI uses the api-key HTTP header for authentication. This is entirely different from the standard OpenAI Authorization: Bearer header. Our configuration file addresses this in two ways: Setting "authHeader": false completely disables the default Bearer header. Adding "headers": { "api-key": "<key>" } forces OpenClaw to send the API key via Azure's native header format. Important Note: Your API key must appear in both the apiKey field AND the headers.api-key field within the JSON for this to work correctly. The Base URL Azure OpenAI's v1-compatible endpoint follows this specific format: https://<your_resource_name>.openai.azure.com/openai/v1 The beautiful thing about this v1 endpoint is that it is largely compatible with the standard OpenAI API and does not require you to manually pass an api-version query parameter. Model Compatibility Settings "compat": { "supportsStore": false } disables the store parameter since Azure OpenAI does not currently support it. "reasoning": true enables the thinking mode for GPT-5.2-Codex. This supports low, medium, high, and xhigh levels. "reasoning": false is set for GPT-5.2 because it is a standard, non-reasoning model. Model Specifications & Cost Tracking If you want OpenClaw to accurately track your token usage costs, you can update the cost fields from 0 to the current Azure pricing. Here are the specs and costs for the models we just deployed: Model Specifications Model Context Window Max Output Tokens Image Input Reasoning gpt-5.2-codex 400,000 tokens 16,384 tokens Yes Yes gpt-5.2 272,000 tokens 16,384 tokens Yes No Current Cost (Adjust in JSON) Model Input (per 1M tokens) Output (per 1M tokens) Cached Input (per 1M tokens) gpt-5.2-codex $1.75 $14.00 $0.175 gpt-5.2 $2.00 $8.00 $0.50 Conclusion: And there you have it! You have successfully bridged the gap between the enterprise-grade infrastructure of Microsoft Foundry and the local autonomy of OpenClaw. By following these steps, you are not just running a chatbot; you are running a sophisticated agent capable of reasoning, coding, and executing tasks with the full power of GPT-5.2-codex behind it. The combination of Azure's reliability and OpenClaw's flexibility opens up a world of possibilities. Whether you are building an automated devops assistant, a research agent, or just exploring the bleeding edge of AI, you now have a robust foundation to build upon. Now it is time to let your agent loose on some real tasks. Go forth, experiment with different system prompts, and see what you can build. If you run into any interesting edge cases or come up with a unique configuration, let me know in the comments below. Happy coding!
suzarilshah
Feb 23, 2026 Place Educator Developer Blog
11KViews
2likes
2Comments
Deploy Open Web UI on Azure VM via Docker: A Step-by-Step Guide with Custom Domain Setup.
Introductions Open Web UI (often referred to as "Ollama Web UI" in the context of LLM frameworks like Ollama) is an open-source, self-hostable interface designed to simplify interactions with large language models (LLMs) such as GPT-4, Llama 3, Mistral, and others. It provides a user-friendly, browser-based environment for deploying, managing, and experimenting with AI models, making advanced language model capabilities accessible to developers, researchers, and enthusiasts without requiring deep technical expertise. This article will delve into the step-by-step configurations on hosting OpenWeb UI on Azure. Requirements: Azure Portal Account - For students you can claim $USD100 Azure Cloud credits from this URL. Azure Virtual Machine - with a Linux of any distributions installed. Domain Name and Domain Host Caddy Open WebUI Image Step One: Deploy a Linux – Ubuntu VM from Azure Portal Search and Click on “Virtual Machine” on the Azure portal search bar and create a new VM by clicking on the “+ Create” button > “Azure Virtual Machine”. Fill out the form and select any Linux Distribution image – In this demo, we will deploy Open WebUI on Ubuntu Pro 24.04. Click “Review + Create” > “Create” to create the Virtual Machine. Tips: If you plan to locally download and host open source AI models via Open on your VM, you could save time by increasing the size of the OS disk / attach a large disk to the VM. You may also need a higher performance VM specification since large resources are needed to run the Large Language Model (LLM) locally. Once the VM has been successfully created, click on the “Go to resource” button. You will be redirected to the VM’s overview page. Jot down the public IP Address and access the VM using the ssh credentials you have setup just now. Step Two: Deploy the Open WebUI on the VM via Docker Once you are logged into the VM via SSH, run the Docker Command below: docker run -d --name open-webui --network=host --add-host=host.docker.internal:host-gateway -e PORT=8080 -v open-webui:/app/backend/data --restart always ghcr.io/open-webui/open-webui:dev This Docker command will download the Open WebUI Image into the VM and will listen for Open Web UI traffic on port 8080. Wait for a few minutes and the Web UI should be up and running. If you had setup an inbound Network Security Group on Azure to allow port 8080 on your VM from the public Internet, you can access them by typing into the browser: [PUBLIC_IP_ADDRESS]:8080 Step Three: Setup custom domain using Caddy Now, we can setup a reverse proxy to map a custom domain to [PUBLIC_IP_ADDRESS]:8080 using Caddy. The reason why Caddy is useful here is because they provide automated HTTPS solutions – you don’t have to worry about expiring SSL certificate anymore, and it’s free! You must download all Caddy’s dependencies and set up the requirements to install it using this command: sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list sudo apt update && sudo apt install caddy Once Caddy is installed, edit Caddy’s configuration file at: /etc/caddy/Caddyfile , delete everything else in the file and add the following lines: yourdomainname.com { reverse_proxy localhost:8080 } Restart Caddy using this command: sudo systemctl restart caddy Next, create an A record on your DNS Host and point them to the public IP of the server. Step Four: Update the Network Security Group (NSG) To allow public access into the VM via HTTPS, you need to ensure the NSG/Firewall of the VM allow for port 80 and 443. Let’s add these rules into Azure by heading to the VM resources page you created for Open WebUI. Under the “Networking” Section > “Network Settings” > “+ Create port rule” > “Inbound port rule” On the “Destination port ranges” field, type in 443 and Click “Add”. Repeat these steps with port 80. Additionally, to enhance security, you should avoid external users from directly interacting with Open Web UI’s port - port 8080. You should add an inbound deny rule to that port. With that, you should be able to access the Open Web UI from the domain name you setup earlier. Conclusion And just like that, you’ve turned a blank Azure VM into a sleek, secure home for your Open Web UI, no magic required! By combining Docker’s simplicity with Caddy’s “set it and forget it” HTTPS magic, you’ve not only made your app accessible via a custom domain but also locked down security by closing off risky ports and keeping traffic encrypted. Azure’s cloud muscle handles the heavy lifting, while you get to enjoy the perks of a pro setup without the headache. If you are interested in using AI models deployed on Azure AI Foundry on OpenWeb UI via API, kindly read my other article: Step-by-step: Integrate Ollama Web UI to use Azure Open AI API with LiteLLM Proxy
suzarilshah
Mar 06, 2025 Place Educator Developer Blog
4.8KViews
2likes
1Comment
Tiny But Mighty: Unleashing the Power of Small Language Models 🚀
While Large Language Models (LLMs) like GPT-4 dominate headlines with their extensive capabilities, they often come at the cost of high computational requirements and complexity. For developers and organizations looking to implement AI solutions on edge devices or with limited resources, Small Language Models (SLMs) are emerging as a practical alternative. SLMs are not just "smaller" versions of their larger counterparts—they're designed to be faster, more efficient, and adaptable for specific tasks. With fewer parameters and lower computational needs, SLMs open the door to deploying AI on mobile devices, IoT systems, and edge environments without compromising performance. What You Stand to Learn 🧠 Introduction to Microsoft's AI Ecosystem Discover Microsoft's end-to-end AI development tools, from Azure AI Services to ONNX Runtime, enabling efficient and secure deployment of AI models across cloud and edge environments. The Advantages of SLMs over LLMs SLMs are game-changers for edge AI applications, providing faster training and inference times, reduced energy costs, and scalability across diverse devices. Hands-On with Phi-3 and ONNX Runtime Experience live demonstrations of SLMs in action with tools like Phi-3 and ONNX Runtime, showcasing how to fine-tune and deploy models on mobile devices, IoT, and hybrid cloud environments. Responsible AI Practices Understand how to safeguard your AI applications with Microsoft's Responsible AI toolkit, ensuring ethical and trustworthy deployments. Watch the Full Session 👨‍💻 📅 Date: December 12, 2024 ⏰ Time: 4 PM GMT | 5 PM CEST | 8 AM PT | 11 AM ET | 7 PM EAT A session packed with live demos, practical examples, and Q&A opportunities. Register NOW | Events | Microsoft Reactor Agenda 🔍 Introduction (5 min) A brief overview of the session and its focus on SLMs and LLMs. Microsoft AI Tooling (5 min) Explore the latest tools like Azure AI Services, Azure Machine Learning, and Responsible AI Tooling. How to Choose the Right Model (10 min) Key considerations such as performance, customizability, and ethical implications. Comparing SLMs vs LLMs (10 min) The strengths, weaknesses, and best use cases for both Small and Large Language Models. Deploying Models at the Edge (10 min) Insights into optimizing AI for mobile, IoT, and edge devices. Q&A Addressing participant questions about AI development and deployment.
RayanPopat
Dec 05, 2024 Place Educator Developer Blog
542Views
2likes
0Comments
Kickstart Your AI Journey: Using Azure AI App Templates to Build AI Applications
In this blog, We will explore what Azure AI App Templates are, how students can use them, and why they’re a great starting point for building AI applications, even with minimal prior experience.
Sohaibkamran-25
Nov 26, 2024 Place Educator Developer Blog
876Views
2likes
0Comments
How to Set Up Claude Code with Microsoft Foundry Models on macOS
Introduction Building with AI isn't just about picking a smart model. It is about where that model lives. I chose to route my Claude Code setup through Microsoft Foundry because I needed more than just a raw API. I wanted the reliability, compliance, and structured management that comes with Microsoft's ecosystem. When you are moving from a prototype to something real, having that level of infrastructure backing your calls makes a significant difference. The challenge is that Foundry is designed for enterprise cloud environments, while my daily development work happens locally on a MacBook. Getting the two to communicate seamlessly involved navigating a maze of shell configurations and environment variables that weren't immediately obvious. I wrote this guide to document the exact steps for bridging that gap. Here is how you can set up Claude Code to run locally on macOS while leveraging the stability of models deployed on Microsoft Foundry. Requirements Before we open the terminal, let's make sure you have the necessary accounts and environments ready. Since we are bridging a local CLI with an enterprise cloud setup, having these credentials handy now will save you time later. Azure Subscription with Microsoft Foundry Setup - This is the most critical piece. You need an active Azure subscription where the Microsoft Foundry environment is initialized. Ensure that you have deployed the Claude model you intend to use and that the deployment status is active. You will need the specific endpoint URL and the associated API keys from this deployment to configure the connection. An Anthropic User Account - Even though the compute is happening on Azure, the interface requires an Anthropic account. You will need this to authenticate your session and manage your user profile settings within the Claude Code ecosystem. Claude Code Client on macOS - We will be running the commands locally, so you need the Claude Code CLI installed on your MacBook. Step 1: Install Claude Code on macOS The recommended installation method is via Homebrew or Curl, which sets it up for terminal access ("OS level"). Option A: Homebrew (Recommended) brew install --cask claude-code Option B: Curl curl -fsSL https://claude.ai/install.sh | bash Verify Installation: Run claude --version. Step 2: Set Up Microsoft Foundry to deploy Claude model Navigate to your Microsoft Foundry portal, and find the Claude model catalog, and deploy the selected Claude model. [Microsoft Foundry > My Assets > Models + endpoints > + Deploy Model > Deploy Base model > Search for "Claude"] In your Model Deployment dashboard, go to the deployed Claude Models and get the "Endpoints and keys". Store it somewhere safe, because we will need them to configure Claude Code later on. Configure Environment Variables in MacOS terminal: Now we need to tell your local Claude Code client to route requests through Microsoft Foundry instead of the default Anthropic endpoints. This is handled by setting specific environment variables that act as a bridge between your local machine and your Azure resources. You could run these commands manually every time you open a terminal, but it is much more efficient to save them permanently in your shell profile. For most modern macOS users, this file is .zshrc. Open your terminal and add the following lines to your profile, making sure to replace the placeholder text with your actual Azure credentials: export CLAUDE_CODE_USE_FOUNDRY=1 export ANTHROPIC_FOUNDRY_API_KEY="your-azure-api-key" export ANTHROPIC_FOUNDRY_RESOURCE="your-resource-name" # Specify the deployment name for Opus export CLAUDE_CODE_MODEL="your-opus-deployment-name" Once you have added these variables, you need to reload your shell configuration for the changes to take effect. Run the source command below to update your current session, and then verify the setup by launching Claude: source ~/.zshrc claude If everything is configured correctly, the Claude CLI will initialize using your Microsoft Foundry deployment as the backend. Once you execute the claude command, the CLI will prompt you to choose an authentication method. Select Option 2 (Antrophic Console account) to proceed. This action triggers your default web browser and redirects you to the Claude Console. Simply sign in using your standard Anthropic account credentials. After you have successfully signed in, you will be presented with a permissions screen. Click the Authorize button to link your web session back to your local terminal. Return to your terminal window, and you should see a notification confirming that the login process is complete. Press Enter to finalize the setup. You are now fully connected. You can start using Claude Code locally, powered entirely by the model deployment running in your Microsoft Foundry environment. Conclusion Setting up this environment might seem like a heavy lift just to run a CLI tool, but the payoff is significant. You now have a workflow that combines the immediate feedback of local development with the security and infrastructure benefits of Microsoft Foundry. One of the most practical upgrades is the removal of standard usage caps. You are no longer limited to the 5-hour API call limits, which gives you the freedom to iterate, test, and debug for as long as your project requires without hitting a wall. By bridging your local macOS terminal to Azure, you are no longer just hitting an API endpoint. You are leveraging a managed, compliance-ready environment that scales with your needs. The best part is that now the configuration is locked in, you don't need to think about the plumbing again. You can focus entirely on coding, knowing that the reliability of an enterprise platform is running quietly in the background supporting every command.
suzarilshah
Feb 10, 2026 Place Educator Developer Blog
1.3KViews
1like
0Comments
Introducing Azure AI Models: The Practical, Hands-On Course for Real Azure AI Skills
Hello everyone, Today, I’m excited to share something close to my heart. After watching so many developers, including myself—get lost in a maze of scattered docs and endless tutorials, I knew there had to be a better way to learn Azure AI. So, I decided to build a guide from scratch, with a goal to break things down step by step—making it easy for beginners to get started with Azure, My aim was to remove the guesswork and create a resource where anyone could jump in, follow along, and actually see results without feeling overwhelmed. Introducing Azure AI Models Guide. This is a brand new, solo-built, open-source repo aimed at making Azure AI accessible for everyone—whether you’re just getting started or want to build real, production-ready apps using Microsoft’s latest AI tools. The idea is simple: bring all the essentials into one place. You’ll find clear lessons, hands-on projects, and sample code in Python, JavaScript, C#, and REST—all structured so you can learn step by step, at your own pace. I wanted this to be the resource I wish I’d had when I started: straightforward, practical, and friendly to beginners and pros alike. It’s early days for the project, but I’m excited to see it grow. If you’re curious.. Check out the repo at https://github.com/DrHazemAli/Azure-AI-Models Your feedback—and maybe even your contributions—will help shape where it goes next!
Solved
hazem
Jun 22, 2025 Place Microsoft Foundry Discussions
1.1KViews
1like
5Comments
Introducing AzureImageSDK — A Unified .NET SDK for Azure Image Generation And Captioning
Hello 👋 I'm excited to share something I've been working on — AzureImageSDK — a modern, open-source .NET SDK that brings together Azure AI Foundry's image models (like Stable Image Ultra, Stable Image Core), along with Azure Vision and content moderation APIs and Image Utilities, all in one clean, extensible library. While working with Azure’s image services, I kept hitting the same wall: Each model had its own input structure, parameters, and output format — and there was no unified, async-friendly SDK to handle image generation, visual analysis, and moderation under one roof. So... I built one. AzureImageSDK wraps Azure's powerful image capabilities into a single, async-first C# interface that makes it dead simple to: 🎨 Inferencing Image Models 🧠 Analyze visual content (Image to text) 🚦 Image Utilities — with just a few lines of code. It's fully open-source, designed for extensibility, and ready to support new models the moment they launch. 🔗 GitHub Repo: https://github.com/DrHazemAli/AzureImageSDK Also, I've posted the release announcement on the Azure AI Foundry's GitHub Discussions 👉🏻 feel free to join the conversation there too. The SDK is available on NuGet too. Would love to hear your thoughts, use cases, or feedback!
hazem
Jun 11, 2025 Place Microsoft Foundry Discussions
179Views
1like
0Comments