model

6 Topics

Signing in to Microsoft Foundry from OpenClaw using Azure AD: a smoother way to bring your models in
This post is a quick update to walk through the new flow. If you read the previous one, think of this as the easier path I wish I had the first time round. If you have not seen the original, you can find it here: Integrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration | Microsoft Community Hub

Prerequisite: You will need the Azure CLI (azure-cli) installed on your machine. The official install guide for Linux is here: https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-linux?view=azure-cli-latest

I am on Linux, so I went the Homebrew route, which keeps things simple. The formula is here: https://formulae.brew.sh/formula/azure-cli

Microsoft also has official docs covering the Homebrew/Linuxbrew install: https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew

Once Homebrew is ready, run this in your terminal:

    brew install azure-cli

Why this matters: Before this update, every Foundry model you wanted to use in OpenClaw needed its own API key and endpoint pasted into the config. It worked, but it was tedious, and keys are easy to leak if you are copying them around. The Azure AD path solves both problems. You authenticate as yourself (or a service principal), OpenClaw asks Azure for the list of Foundry resources you have access to, and it brings the models in automatically.

Signing in to Microsoft Foundry from OpenClaw via Azure AD

A device-code OAuth handshake replaces the old static-API-key flow. OpenClaw delegates auth to the local Azure CLI; the CLI handles the browser-side sign-in, holds the resulting tokens, and refreshes them silently. OpenClaw then walks the Azure resource graph (subscriptions → Foundry resources → model deployments) and registers each model into its own config. No API keys move through OpenClaw at any point.

Figure: Sequence diagram of the OAuth 2.0 device-authorization flow as orchestrated by OpenClaw. Phases 1–3 establish identity (the developer authenticates once, in a real browser, against Azure AD). Phases 4–5 perform service discovery (OpenClaw walks the ARM resource hierarchy, subscriptions → Foundry accounts → model deployments, and persists the result to a local provider config). After registration, every model call OpenClaw makes against Foundry reuses the same Azure-CLI-managed token cache: tokens refresh transparently, and access is gated by the Foundry resource's RBAC assignments rather than a static API key. Dashed lines denote return values; the teal line in step 7 marks the single token-issuance event the rest of the system pivots on.

Walking through the new flow:

Kick things off with the OpenClaw onboard command, the same one you would use when setting up OpenClaw for the first time:

    openclaw onboard

When it prompts you, choose update values.

Next, you will be asked to configure your models. Scroll down a little and you will see Microsoft Foundry listed as a supported provider. Pick it.

From here, you have two options. You can sign in with an API key, which is what I covered in the previous blog post, or you can sign in through Azure AD. The Azure AD path is easier and more secure, so that is the one we will use.

OpenClaw will give you a URL and a device code. Copy the URL into your browser and use the code to complete the sign-in. (This is where the az CLI from the prerequisite section earns its keep.) If everything worked, you should see a success prompt.

Once you are signed in, OpenClaw will ask you to pick the Azure subscription that your Microsoft Foundry resource lives in. Pick the subscription, then pick the Foundry resource where your models are deployed.

And that is pretty much it. All the models you have deployed to that Foundry resource get pulled into OpenClaw automatically.
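The discovery step described above (subscriptions → Foundry resources → model deployments) boils down to flattening a three-level hierarchy into one entry per model. Here is a minimal sketch of that idea, using an in-memory sample rather than real Azure calls. The field names (`subscriptions`, `foundryResources`, `deployments`), the `provider` label, and the shape of the resulting entries are all assumptions for illustration, not OpenClaw's actual config format.

```javascript
// Hypothetical sample of what a discovery walk might return.
// None of these field names come from OpenClaw or Azure; they are illustrative.
const discovered = {
  subscriptions: [
    {
      id: 'sub-1',
      foundryResources: [
        { name: 'foundry-a', deployments: ['gpt-4o', 'gpt-4o-mini'] },
        { name: 'foundry-b', deployments: [] },
      ],
    },
    { id: 'sub-2', foundryResources: [] },
  ],
};

// Flatten subscriptions -> Foundry resources -> deployments into one entry per model.
function flattenDeployments(tree) {
  const entries = [];
  for (const sub of tree.subscriptions) {
    for (const resource of sub.foundryResources) {
      for (const deployment of resource.deployments) {
        entries.push({
          provider: 'microsoft-foundry', // assumed label, not a real OpenClaw value
          subscription: sub.id,
          resource: resource.name,
          model: deployment,
        });
      }
    }
  }
  return entries;
}

const models = flattenDeployments(discovered);
console.log(models.map((m) => `${m.resource}/${m.model}`).join(', '));
```

Resources with no deployments simply contribute nothing, which matches the behaviour described in the post: only models actually deployed to the chosen Foundry resource show up.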
Compared to the old way of pasting API keys and endpoints one by one, this is a huge time saver, and you do not have to babysit any keys. From here you can start using your Foundry-deployed models inside OpenClaw straight away.

Wrapping up

The Azure AD sign-in option in OpenClaw is one of those small updates that quietly removes a real pain point. If you have ever juggled multiple Foundry endpoints and rotated keys across them, you already know why. With this flow, you sign in once, your models show up, and you can get back to actually building.

If you have not tried OpenClaw with Microsoft Foundry yet, this is a good time to give it a go. And if you were holding off because of the key management overhead, that excuse is gone now.

References

- Previous post on integrating Microsoft Foundry with OpenClaw using API keys: Integrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration | Microsoft Community Hub
- Install the Azure CLI on Linux: https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-linux?view=azure-cli-latest
- Install the Azure CLI on macOS: https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew
- Homebrew formula for azure-cli: https://formulae.brew.sh/formula/azure-cli
suzarilshah
May 20, 2026 Place Educator Developer Blog
138 Views
0 likes
0 Comments
Microsoft Foundry Model Router: A Developer's Guide to Smarter AI Routing
Introduction

When building AI-powered applications on Azure, one of the most impactful decisions you make isn't about which model to use; it's about how your application selects models at runtime. Microsoft Foundry Model Router, available through Microsoft Foundry, automatically routes your inference requests to the best available model based on prompt complexity, latency targets, and cost efficiency.

But how do you know it's actually routing correctly? And how do you compare its behavior across different API paths?

That's exactly the problem RouteLens solves. It's an open-source Node.js CLI and web-based testing tool that sends configurable prompts through two distinct Azure AI runtime paths and produces a detailed comparison of routing decisions, latency profiles, and reliability metrics.

In this post, we'll walk through what Model Router does, why it matters, how to use the validator tool, and best practices for designing applications that get the most out of intelligent model routing.

What Is Microsoft Foundry Model Router?

Microsoft Foundry Model Router is a deployment option in Microsoft Foundry that sits between your application and a pool of AI models. Instead of hard-coding a specific model like gpt-4o or gpt-4o-mini, you deploy a Model Router endpoint and let Azure decide which underlying model serves each request.

How It Works

1. Your application sends an inference request to the Model Router deployment.
2. Model Router analyzes the request (prompt complexity, token count, required capabilities).
3. It selects the most appropriate model from the available pool.
4. The response is returned transparently; your application code doesn't change.

Why This Matters

- Cost optimization: Simple prompts get routed to smaller, cheaper models. Complex prompts go to more capable (and expensive) ones.
- Latency reduction: Lightweight prompts complete faster when they don't need a heavyweight model.
- Resilience: If one model is experiencing high load or throttling, traffic can shift to alternatives.
- Simplified application code: No need to build your own model-selection logic.

The Two Runtime Paths

Microsoft Foundry offers two distinct endpoint configurations for hitting Model Router. Even though both use the Chat Completions API, they may have different routing behaviour:

Path | SDK | Endpoint
AOAI + Chat Completions | OpenAI JS SDK | https://<resource>.cognitiveservices.azure.com/openai/deployments/<deployment>
Foundry Project + Chat Completions | OpenAI JS SDK (separate client) | https://<resource>.cognitiveservices.azure.com/openai/deployments/<deployment>

Understanding whether these two paths produce the same routing decisions is critical for production applications. If the same prompt routes to different models depending on which endpoint you use, that's a signal you need to investigate.

Introducing RouteLens

RouteLens is a Node.js tool that automates this comparison. It:

- Sends a configurable set of prompts across categories (echo, summarize, code, reasoning) through both paths.
- Logs every response to structured JSONL files for post-hoc analysis.
- Computes statistics including p50/p95 latency, error rates, and model-choice distribution.
- Highlights routing differences, where the same prompt was served by different models across paths.
- Provides a web dashboard for interactive testing and real-time result visualization.
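RouteLens's own implementation is not shown in this post, so the following is only a minimal sketch of how statistics like p50/p95 latency, error rate, and model-choice distribution could be computed from JSONL-style records. The record fields (`path`, `latencyMs`, `model`, `ok`) and the nearest-rank percentile method are assumptions, not the tool's actual schema or algorithm.

```javascript
// Hypothetical JSONL lines; field names are assumptions, not RouteLens's schema.
const jsonl = [
  '{"path":"aoai","latencyMs":120,"model":"gpt-4o-mini","ok":true}',
  '{"path":"aoai","latencyMs":300,"model":"gpt-4o","ok":true}',
  '{"path":"aoai","latencyMs":90,"model":"gpt-4o-mini","ok":true}',
  '{"path":"aoai","latencyMs":0,"model":null,"ok":false}',
].join('\n');

// Nearest-rank percentile over an ascending-sorted copy of the values.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Parse the log text and compute summary statistics over successful requests.
function summarize(text) {
  const records = text.split('\n').filter(Boolean).map((line) => JSON.parse(line));
  const ok = records.filter((r) => r.ok);
  const latencies = ok.map((r) => r.latencyMs);
  const modelCounts = {};
  for (const r of ok) modelCounts[r.model] = (modelCounts[r.model] || 0) + 1;
  return {
    total: records.length,
    errorRate: (records.length - ok.length) / records.length,
    p50: percentile(latencies, 50),
    p95: percentile(latencies, 95),
    modelCounts,
  };
}

const summary = summarize(jsonl);
console.log(summary);
```

Failed requests are excluded from the latency percentiles here so that errors do not skew them; whether RouteLens makes the same choice is not stated in the post.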
The Web Dashboard

The built-in web UI makes it easy to run tests and explore results without parsing log files. The dashboard includes:

- KPI Dashboard: Key metrics at a glance: Success Rate, Avg TPS, Gen TPS, Peak TPS, Fastest Response, p50/p95 Latency, Most Reliable Path, Total Tokens
- Summary view: Per-path/per-category stats with success rate, TPS, and latency
- Model Comparison: Side-by-side view of which models were selected by each routing path for every prompt
- Latency Charts: Visual bar charts comparing p50 and p95 latencies
- Error Analysis: Error distribution and detailed error messages
- Live Feed: Real-time streaming of results as they come in
- Log Viewer: Browse and inspect historical JSONL log files with parsed table views
- Mobile Responsive: The UI adapts to smaller screens

Getting Started

Prerequisites

- Node.js 18+ (LTS recommended)
- An Azure subscription with a Foundry project
- Model Router deployed in your Foundry project
- An API key from your Azure OpenAI / Foundry resource
- The API version (e.g. 2024-05-01-preview)

Setup

    # Clone and install
    git clone https://github.com/leestott/modelrouter-routelens/
    cd modelrouter-routelens
    npm install

    # Configure your endpoints
    cp .env.example .env
    # Edit .env with your Azure endpoints (see below)

Configuration

The .env file needs these key settings:

    # Your Foundry / Cognitive Services deployment endpoint
    # Format: https://<resource>.cognitiveservices.azure.com/openai/deployments/<deployment>
    # Do NOT include /chat/completions or ?api-version
    FOUNDRY_PROJECT_ENDPOINT=https://<resource>.cognitiveservices.azure.com/openai/deployments/model-router
    AOAI_BASE_URL=https://<resource>.cognitiveservices.azure.com/openai/deployments/model-router

    # API key from your Azure OpenAI / Foundry resource
    AOAI_API_KEY=your-api-key-here

    # Azure OpenAI API version
    AOAI_API_VERSION=2024-05-01-preview

Running Tests

    # Full test matrix: sends all prompts through both paths
    npm run run:matrix

    # 408 timeout diagnostic: focuses on the Responses path timeout issue
    npm run run:repro408

    # Web UI: interactive dashboard
    npm run ui
    # Then open http://localhost:3002 (or the port set in UI_PORT)

Understanding the Results

Latency Comparison

The latency charts show p50 (median) and p95 (tail) latency for each path and prompt category. Key things to look for:

- Large p50 differences between paths suggest one path has consistently higher overhead.
- High p95 values indicate tail latency problems, possibly timeouts or retries.
- Category-specific patterns: If code prompts are slow on one path but fast on another, that's a routing difference worth investigating.

Model Comparison

The model comparison view shows which models were selected for each prompt. When both paths select the same model, you see a green "Match" indicator. When they differ, it's flagged in red; these are the cases you want to investigate.
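The match/mismatch check described above can be thought of as a join on prompt between the two paths' results. A minimal sketch under stated assumptions: the record fields (`promptId`, `path`, `model`) and the path labels (`aoai`, `foundry`) are hypothetical and are not taken from RouteLens's logs.

```javascript
// Hypothetical per-request results; field names and path labels are assumptions.
const results = [
  { promptId: 'echo-1', path: 'aoai', model: 'gpt-4o-mini' },
  { promptId: 'echo-1', path: 'foundry', model: 'gpt-4o-mini' },
  { promptId: 'code-1', path: 'aoai', model: 'gpt-4o' },
  { promptId: 'code-1', path: 'foundry', model: 'gpt-4o-mini' },
];

// Group the selected model by prompt, then flag prompts whose paths disagree.
function compareRouting(records) {
  const byPrompt = new Map();
  for (const { promptId, path, model } of records) {
    if (!byPrompt.has(promptId)) byPrompt.set(promptId, {});
    byPrompt.get(promptId)[path] = model;
  }
  return [...byPrompt.entries()].map(([promptId, models]) => ({
    promptId,
    models,
    match: new Set(Object.values(models)).size === 1,
  }));
}

const comparison = compareRouting(results);
for (const row of comparison) {
  console.log(`${row.promptId}: ${row.match ? 'Match' : 'Differs'}`, row.models);
}
```

With repeated runs per prompt, a single "selected model" per path no longer exists, so a real comparison would more likely contrast model-choice distributions than single values.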
Error Analysis

The errors view helps diagnose reliability issues. Common error patterns:

- 408 Timeout: The Responses path may take longer for certain prompt categories
- 401 Unauthorized: Authentication configuration issues
- 429 Rate Limited: You're hitting throughput limits
- 500 Internal Server Error: Backend model issues

Best Practices for Designing Applications with Model Router

1. Design Prompts with Routing in Mind

Model Router makes decisions based on prompt characteristics. To get the best routing:

- Keep prompts focused: A clear, single-purpose prompt is easier for the router to classify than a multi-part prompt that spans multiple complexity levels.
- Use system messages effectively: A well-structured system message helps the router understand the task complexity.
- Separate complex chains: If you have a multi-step workflow, make each step a separate API call rather than one massive prompt. This lets the router use a cheaper model for simple steps.

2. Set Appropriate Timeouts

Different models have different latency profiles. Your timeout settings should account for the slowest model the router might select:

    // Too aggressive: may time out when routed to a larger model
    const TIMEOUT_AGGRESSIVE = 5000; // 5s

    // Better: allows headroom for model variation
    const TIMEOUT_DEFAULT = 30000; // 30s

    // Best: use different timeouts based on expected complexity
    function getTimeout(category) {
      switch (category) {
        case 'echo': return 10000;
        case 'summarize': return 20000;
        case 'code': return 45000;
        case 'reasoning': return 60000;
        default: return 30000;
      }
    }

3. Implement Robust Retry Logic

Because the router may select different models on retry, transient failures can resolve themselves:

    // `client` is an OpenAI SDK client configured elsewhere
    async function callWithRetry(prompt, maxRetries = 3) {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
          return await client.chat.completions.create({
            model: 'model-router',
            messages: [{ role: 'user', content: prompt }],
          });
        } catch (err) {
          // Out of attempts: surface the last error to the caller
          if (attempt === maxRetries - 1) throw err;
          // Exponential backoff: 1s, 2s, 4s, ...
          await new Promise(r => setTimeout(r, 1000 * Math.pow(2, attempt)));
        }
      }
    }

4. Monitor Model Selection in Production

Log which model was selected for each request so you can track routing patterns over time:

    const response = await client.chat.completions.create({
      model: 'model-router',
      messages: [{ role: 'user', content: prompt }],
    });

    // The model field in the response tells you which model was actually used
    console.log(`Routed to: ${response.model}`);
    console.log(`Tokens: ${response.usage.total_tokens}`);

5. Use the Right API Path for Your Use Case

Based on our testing with RouteLens, consider:

- Chat Completions path: The standard path for chat-style interactions. Uses the openai SDK directly.
- Foundry Project path: Uses the same Chat Completions API but through the Foundry project endpoint. Useful for comparing routing behaviour across different endpoint configurations.

Note: The Responses API (/responses) is not currently available on cognitiveservices.azure.com Model Router deployments. Both paths in RouteLens use Chat Completions.

6. Test Before You Ship

Run RouteLens as part of your pre-production validation:

    # In your CI/CD pipeline or pre-deployment check
    npm run run:matrix -- --runs 10 --concurrency 4

This helps you:

- Catch routing regressions when Azure updates model pools
- Verify that your prompt changes don't cause unexpected model selection shifts
- Establish latency baselines for alerting

Architecture Overview

RouteLens sends configurable prompts through two distinct Azure AI runtime paths and compares routing decisions, latency, and reliability. The Matrix Runner dispatches prompts to both the Chat Completions Client (OpenAI JS SDK → AOAI endpoint) and the Project Responses Client (azure/ai-projects → Foundry endpoint). Both paths converge at Azure Model Router, which intelligently selects the optimal backend model. Results are logged to JSONL files and rendered in the web dashboard.

Key Benefits of Model Router

Benefit | Description
Cost savings | Automatically routes simple prompts to cheaper models, reducing spend by 30-50% in typical workloads
Lower latency | Simple prompts complete faster on lightweight models
Zero code changes | Same API contract as a standard model deployment; just change the deployment name
Future-proof | As Azure adds new models to the pool, your application benefits automatically
Built-in resilience | Routing adapts to model availability and load conditions

Conclusion

Azure Model Router represents a shift from "pick a model" to "describe your task and let the platform decide." This is a natural evolution for AI applications: just as cloud platforms abstract away server selection, Model Router abstracts away model selection.

RouteLens gives you the visibility to trust that abstraction. By systematically comparing routing behavior across API paths and prompt categories, you can deploy Model Router with confidence and catch issues before your users do.

The tool is open source under the MIT license. Try it out, file issues, and contribute improvements:

- GitHub Repository
- Model Router Documentation
- Microsoft Foundry
Lee_Stott
Mar 23, 2026 Place Educator Developer Blog
807 Views
1 like
0 Comments
Showcasing Phi-4-Reasoning: A Game-Changer for AI Developers
Introduction

Phi-4-Reasoning is a state-of-the-art AI model developed by Microsoft Research, designed to excel in complex reasoning tasks. With its advanced capabilities, Phi-4-Reasoning is a powerful tool for AI developers, enabling them to tackle intricate problems with ease and precision.

What is Phi-4-Reasoning?

Phi-4-Reasoning is a 14-billion parameter open-weight reasoning model that has been fine-tuned from the Phi-4 model using supervised fine-tuning on a dataset of chain-of-thought traces. We are also releasing Phi-4-reasoning-plus, a variant enhanced through a short phase of outcome-based reinforcement learning that offers higher performance by generating longer reasoning traces. This model is designed to handle complex reasoning tasks, including mathematical problems, scientific queries, and coding challenges.

Why Phi-4-reasoning and Phi-4-reasoning-plus Excel

- Advanced Reasoning Capabilities: Phi-4-Reasoning is trained to generate detailed reasoning chains, making it highly effective in solving complex problems. It leverages additional inference-time compute to enhance its reasoning abilities.
- High-Quality Data Curation: The model's training involved meticulous data curation and the use of high-quality synthetic datasets, allowing it to compete with much larger models.
- Performance: Despite its smaller size, Phi-4-Reasoning outperforms larger models like OpenAI's o1-mini and DeepSeek-R1-Distill-Llama-70B on various benchmarks, including mathematical reasoning and PhD-level science questions.
- Industry Applications: Phi-4-Reasoning is versatile and can be deployed in various industry scenarios, including medicine, where it can be used for reasoning tasks on edge devices.

Real-World Examples

Wordplay Riddle: Phi-4-Reasoning can solve wordplay riddles by recognizing patterns and applying logical reasoning. For example, it deduced that collecting three strawberries would yield 9 r's because the word "strawberry" contains three r's.

Example 1: How many strawberries for 9 r’s?

User: How many strawberries do I need to collect 9 r’s?

Phi-4-Reasoning: <think> TRUNCATED </think> It’s a wordplay riddle! Notice that the word “strawberry” has three r’s in it. So if you “collect” three strawberries, you get 3 × 3 = 9 r’s. Thus, you need 3 strawberries.

Solving a Riddle: The model can decrypt complex riddles by identifying patterns and performing arithmetic operations. It demonstrated this capability by solving a riddle involving foreign language number-words and reversed number-words.

What This Problem Proves:

- Pattern Recognition: Phi-4-Reasoning demonstrates its ability to identify patterns in words and apply logical reasoning to solve the riddle.
- Wordplay Understanding: The model showcases its proficiency in understanding and solving wordplay riddles, which require a combination of linguistic knowledge and logical deduction.
- Efficiency: Phi-4-Reasoning quickly arrives at the correct solution by applying a straightforward multiplication operation, highlighting its efficiency in solving such problems.

Example 2: Solving a Riddle

User: I will give you a riddle to solve with a few examples, and something to complete at the end. What is a likely completion for X that is consistent with the examples above?

Phi-4-Reasoning: <think> TRUNCATED </think> We can “decrypt” the riddle by noticing that each line has five parts. In every example the parts are:

1. A “number-word” in a foreign language (or with an extra minus sign),
2. A “first name” (of any nationality),
3. A “reversed number-word” (or two, joined by a hyphen),
4. A “second name”, and
5. A final number.

When you “reverse” the third word you get a familiar number in English:

- In line 1 “evif” is “five” (5) and the first number is “uno” = 1, and 5 – 1 = 4.
- In line 2 “neves” is “seven” (7) and “tres” = 3, and 7 – 3 = 4.
- In line 3 “evlewt” is “twelve” (12) and “drei” = 3, and 12 – 3 = 9.
- In lines 4 and 5 with two reversed “number-words”:
  – Line 4: “xis” → “six” (6) and “ytnewt” → “twelve” (12); 6 + 12 = 18, plus 3 = 21.
  – Line 5: “eerht” → “three” (3) and “ytnewt” → “twelve” (12); 3 + 12 = 15, plus 3 = 18.

Now look at the final line: “minus one-point-two Satya eno Bill X”. It follows the same five–part format. Notice that “eno” reversed is “one” (1). And the “first number” is “minus one-point-two” (–1.2). Since for a one–number–word line the rule is: final number = (number from reversed word) − (first number), we compute: final = 1 − (−1.2) = 1 + 1.2 = 2.2. Thus, a likely completion for X is 2.2.

Phi-4-Reasoning: Phi-4-Reasoning decrypts the riddle by identifying a pattern in the examples provided. Each line consists of five parts: a number-word in a foreign language, a first name, a reversed number-word, a second name, and a final number. By reversing the third word and performing arithmetic operations, Phi-4-Reasoning determines that the likely completion for X is 2.2.

What This Problem Proves:

- Complex Pattern Recognition: Phi-4-Reasoning excels in recognizing complex patterns involving multiple elements, such as foreign language number-words and reversed number-words.
- Logical Deduction: The model demonstrates its ability to apply logical deduction to solve the riddle by performing arithmetic operations based on the identified patterns.
- Versatility: Phi-4-Reasoning showcases its versatility in handling diverse types of riddles and problems, making it a valuable tool for AI developers.

Solving a Maze: Phi-4-Reasoning can solve complex graph traversal tasks, such as finding a path through a maze. The maze is represented as an adjacency list, which is a common way to describe graphs in computer science. Each node in the graph (or maze) is connected to other nodes, and the task is to find a valid path from the starting node (node 0) to the destination node.

Example 3: Solving a maze

User: Please solve the maze provided in adjacency list form below, by outputting a valid path from node 0 to node {destination}, in a space-separated list of integers. Adjacency list: {depicted as an image below}

Phi-4-Reasoning: <think> TRUNCATED </think> {path depicted as an image below}

What This Problem Proves:

- Graph Traversal Capability: Phi-4-Reasoning can effectively navigate through graphs using algorithms like Depth-First Search (DFS) or Breadth-First Search (BFS). This capability is crucial for solving problems that involve finding paths, cycles, or connectivity in graphs.
- Logical Reasoning: The model demonstrates its ability to apply logical reasoning to determine the correct sequence of nodes to traverse from the start to the destination. This involves understanding the structure of the graph and making decisions based on the connections between nodes.
- Pattern Recognition: Phi-4-Reasoning can recognize patterns in the adjacency list and use them to find a solution. This is important for tasks that require identifying and following specific paths or routes.
- Versatility: The ability to solve a maze using an adjacency list showcases the model's versatility in handling different types of data structures and problem-solving scenarios. This is beneficial for AI developers who need to work with various data representations and algorithms.
- Efficiency: The model's ability to quickly and accurately find a valid path through the maze highlights its efficiency in solving complex problems. This is valuable for applications that require fast and reliable solutions.

Conclusion: Phi-4-Reasoning's ability to solve a maze using an adjacency list demonstrates its advanced reasoning capabilities, making it a powerful tool for AI developers. Its proficiency in graph traversal, logical reasoning, pattern recognition, versatility, and efficiency makes it well-suited for tackling a wide range of complex problems.

Deployment and Integration

Phi-4-Reasoning can be deployed on various platforms, including Azure AI Foundry and Hugging Face. It supports quantization using tools like Microsoft Olive, making it suitable for deployment on edge devices such as IoT, laptops, and mobile devices.

Phi-4-Reasoning is a groundbreaking AI model that offers advanced reasoning capabilities, high performance, and versatility. Its ability to handle complex reasoning tasks makes it an invaluable tool for AI developers, enabling them to create innovative solutions across various industries.

References

- Make Phi-4-mini-reasoning more powerful with industry reasoning on edge devices | Microsoft Community Hub
- Phi-4 Reasoning Technical Paper
- Phi-4-Mini-Reasoning Technical Paper
- One year of Phi: Small language models making big leaps in AI | Microsoft Azure Blog
- PhiCookBook
- Access Phi-4-reasoning models
- Phi Models at Azure AI Foundry Models
- Phi Models on Hugging Face
- Phi Models on GitHub Marketplace Models
Lee_Stott
Apr 30, 2025 Place Educator Developer Blog
4K Views
0 likes
0 Comments
GitHub Copilot Update: New AI Model That Also Filters Out Security Vulnerabilities
GitHub Copilot has announced major updates to its AI model, resulting in significant improvements to the quality and responsiveness of code suggestions. The updated Codex model has led to a large-scale improvement in the quality of code suggestions and a reduction in the time it takes to serve those suggestions to users. In addition, the new vulnerability filtering system uses advanced AI models to detect insecure coding patterns in real time, making suggestions more secure. These updates are available to all users of GitHub Copilot and are free to students with the GitHub Student Developer Pack.
AnthonyBartolo
Feb 16, 2023 Place Educator Developer Blog
23K Views
2 likes
0 Comments
Form Recognizer from Power Apps
How can I integrate the "GENERAL DOCUMENTS" model from Power Apps? What is the ID of this model? I would appreciate your contributions. Regards
ja_Hor_365
Feb 09, 2023 Place Microsoft Foundry Discussions
1.1K Views
0 likes
1 Comment
I have a challenging excel problem
Hi guys,

Have got quite a challenging problem. I'm trying to build a model that produces the probability of horses winning a race. There are 4 "speedmaps" (i.e. scenarios of how the race might play out) and each scenario has a percentage of how likely it is to happen. There are 3 horses in the race. So,

Speedmap 1 (happens 40% of the time): Horse A wins 70%, Horse B wins 25%, Horse C wins 5%
Speedmap 2 (happens 30% of the time): Horse A wins 30%, Horse B wins 50%, Horse C wins 20%
Speedmap 3 (happens 20% of the time): Horse A wins 10%, Horse B wins 20%, Horse C wins 70%
Speedmap 4 (happens 10% of the time): Horse A wins 80%, Horse B wins 10%, Horse C wins 10%

I manually worked out that for this race, the probability of each horse winning is: A = 47%, B = 30%, C = 23%.

How do I build a model in Excel where I can input any values and the probability of the horses winning is spat out?

Would appreciate any help/discussion!! Thanks!
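For reference, the manual result above is a weighted sum (the law of total probability): each horse's overall chance is the sum, over speedmaps, of the speedmap's probability times the horse's win probability in that speedmap. A minimal sketch of that calculation in JavaScript, using the numbers from the post; the data layout and function name are illustrative choices, not an Excel feature.

```javascript
// Scenario probabilities and per-scenario win probabilities from the post.
const speedmaps = [
  { weight: 0.4, wins: { A: 0.70, B: 0.25, C: 0.05 } },
  { weight: 0.3, wins: { A: 0.30, B: 0.50, C: 0.20 } },
  { weight: 0.2, wins: { A: 0.10, B: 0.20, C: 0.70 } },
  { weight: 0.1, wins: { A: 0.80, B: 0.10, C: 0.10 } },
];

// P(horse wins) = sum over scenarios of P(scenario) * P(horse wins | scenario).
function overallWinProbabilities(scenarios) {
  const totals = {};
  for (const { weight, wins } of scenarios) {
    for (const [horse, p] of Object.entries(wins)) {
      totals[horse] = (totals[horse] || 0) + weight * p;
    }
  }
  return totals;
}

const overall = overallWinProbabilities(speedmaps);
for (const [horse, p] of Object.entries(overall)) {
  console.log(`${horse}: ${(p * 100).toFixed(0)}%`);
}
// Prints A: 47%, B: 30%, C: 23%
```

In Excel the same weighted sum can be written with SUMPRODUCT. Assuming, purely for illustration, that the speedmap probabilities sit in B2:B5 and Horse A's win rates in C2:C5, the formula would be =SUMPRODUCT($B$2:$B$5, C2:C5), filled across for the other horses.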
Lyle Lazarus
Apr 03, 2019 Place Excel
1.3K Views
0 likes
2 Comments