best practices
179 TopicsAzure Container Registry Repository Permissions with Attribute-based Access Control (ABAC)
General Availability announcement Today marks the general availability of Azure Container Registry (ACR) repository permissions with Microsoft Entra attribute-based access control (ABAC). ABAC augments the familiar Azure RBAC model with namespace and repository-level conditions so platform teams can express least-privilege access at the granularity of specific repositories or entire logical namespaces. This capability is designed for modern multi-tenant platform engineering patterns where a central registry serves many business domains. With ABAC, CI/CD systems and runtime consumers like Azure Kubernetes Service (AKS) clusters have least-privilege access to ACR registries. Why this matters Enterprises are converging on a central container registry pattern that hosts artifacts and container images for multiple business units and application domains. In this model: CI/CD pipelines from different parts of the business push container images and artifacts only to approved namespaces and repositories within a central registry. AKS clusters, Azure Container Apps (ACA), Azure Container Instances (ACI), and consumers pull only from authorized repositories within a central registry. With ABAC, these repository and namespace permission boundaries become explicit and enforceable using standard Microsoft Entra identities and role assignments. This aligns with cloud-native zero trust, supply chain hardening, and least-privilege permissions. What ABAC in ACR means ACR registries now support a registry permissions mode called “RBAC Registry + ABAC Repository Permissions.” Configuring a registry to this mode makes it ABAC-enabled. When a registry is configured to be ABAC-enabled, registry administrators can optionally add ABAC conditions during standard Azure RBAC role assignments. This optional ABAC conditions scope the role assignment’s effect to specific repositories or namespace prefixes. ABAC can be enabled on all new and existing ACR registries across all SKUs, either during registry creation or configured on existing registries. ABAC-enabled built-in roles Once a registry is ABAC-enabled (configured to “RBAC Registry + ABAC Repository Permissions), registry admins can use these ABAC-enabled built-in roles to grant repository-scoped permissions: Container Registry Repository Reader: grants image pull and metadata read permissions, including tag resolution and referrer discoverability. Container Registry Repository Writer: grants Repository Reader permissions, as well as image and tag push permissions. Container Registry Repository Contributor: grants Repository Reader and Writer permissions, as well as image and tag delete permissions. Note that these roles do not grant repository list permissions. The separate Container Registry Repository Catalog Lister must be assigned to grant repository list permissions. The Container Registry Repository Catalog Lister role does not support ABAC conditions; assigning it grants permissions to list all repositories in a registry. Important role behavior changes in ABAC mode When a registry is ABAC-enabled by configuring its permissions mode to “RBAC Registry + ABAC Repository Permissions”: Legacy data-plane roles such as AcrPull, AcrPush, AcrDelete are not honored in ABAC-enabled registries. For ABAC-enabled registries, use the ABAC-enabled built-in roles listed above. Broad roles like Owner, Contributor, and Reader previously granted full control plane and data plane permissions, which is typically an overprivileged role assignment. In ABAC-enabled registries, these broad roles will only grant control plane permissions to the registry. They will no longer grant data plane permissions, such as image push, pull, delete or repository list permissions. ACR Tasks, Quick Tasks, Quick Builds, and Quick Runs no longer inherit default data-plane access to source registries; assign the ABAC-enabled roles above to the calling identity as needed. Identities you can assign ACR ABAC uses standard Microsoft Entra role assignments. Assign RBAC roles with optional ABAC conditions to users, groups, service principals, and managed identities, including AKS kubelet and workload identities, ACA and ACI identities, and more. Next steps Start using ABAC repository permissions in ACR to enforce least-privilege artifact push, pull, and delete boundaries across your CI/CD systems and container image workloads. This model is now the recommended approach for multi-tenant platform engineering patterns and central registry deployments. To get started, follow the step-by-step guides in the official ACR ABAC documentation: https://aka.ms/acr/auth/abac330Views1like0CommentsOn‑Device AI with Windows AI Foundry
From “waiting” to “instant”- without sending data away AI is everywhere, but speed, privacy, and reliability are critical. Users expect instant answers without compromise. On-device AI makes that possible: fast, private and available, even when the network isn’t - empowering apps to deliver seamless experiences. Imagine an intelligent assistant that works in seconds, without sending a text to the cloud. This approach brings speed and data control to the places that need it most; while still letting you tap into cloud power when it makes sense. Windows AI Foundry: A Local Home for Models Windows AI Foundry is a developer toolkit that makes it simple to run AI models directly on Windows devices. It uses ONNX Runtime under the hood and can leverage CPU, GPU (via DirectML), or NPU acceleration, without requiring you to manage those details. The principle is straightforward: Keep the model and the data on the same device. Inference becomes faster, and data stays local by default unless you explicitly choose to use the cloud. Foundry Local Foundry Local is the engine that powers this experience. Think of it as local AI runtime - fast, private, and easy to integrate into an app. Why Adopt On‑Device AI? Faster, more responsive apps: Local inference often reduces perceived latency and improves user experience. Privacy‑first by design: Keep sensitive data on the device; avoid cloud round trips unless the user opts in. Offline capability: An app can provide AI features even without a network connection. Cost control: Reduce cloud compute and data costs for common, high‑volume tasks. This approach is especially useful in regulated industries, field‑work tools, and any app where users expect quick, on‑device responses. Hybrid Pattern for Real Apps On-device AI doesn’t replace the cloud, it complements it. Here’s how: Standalone On‑Device: Quick, private actions like document summarization, local search, and offline assistants. Cloud‑Enhanced (Optional): Large-context models, up-to-date knowledge, or heavy multimodal workloads. Design an app to keep data local by default and surface cloud options transparently with user consent and clear disclosures. Windows AI Foundry supports hybrid workflows: Use Foundry Local for real-time inference. Sync with Azure AI services for model updates, telemetry, and advanced analytics. Implement fallback strategies for resource-intensive scenarios. Application Workflow Code Example 1. Only On-Device: Tries Foundry Local first, falls back to ONNX if foundry_runtime.check_foundry_available(): # Use on-device Foundry Local models try: answer = foundry_runtime.run_inference(question, context) return answer, source="Foundry Local (On-Device)" except Exception as e: logger.warning(f"Foundry failed: {e}, trying ONNX...") if onnx_model.is_loaded(): # Fallback to local BERT ONNX model try: answer = bert_model.get_answer(question, context) return answer, source="BERT ONNX (On-Device)" except Exception as e: logger.warning(f"ONNX failed: {e}") return "Error: No local AI available" 2. Hybrid approach: On-device first, cloud as last resort def get_answer(question, context): """ Priority order: 1. Foundry Local (best: advanced + private) 2. ONNX Runtime (good: fast + private) 3. Cloud API (fallback: requires internet, less private) # in case of Hybrid approach, based on real-time scenario """ if foundry_runtime.check_foundry_available(): # Use on-device Foundry Local models try: answer = foundry_runtime.run_inference(question, context) return answer, source="Foundry Local (On-Device)" except Exception as e: logger.warning(f"Foundry failed: {e}, trying ONNX...") if onnx_model.is_loaded(): # Fallback to local BERT ONNX model try: answer = bert_model.get_answer(question, context) return answer, source="BERT ONNX (On-Device)" except Exception as e: logger.warning(f"ONNX failed: {e}, trying cloud...") # Last resort: Cloud API (requires internet) if network_available(): try: import requests response = requests.post( '{BASE_URL_AI_CHAT_COMPLETION}', headers={'Authorization': f'Bearer {API_KEY}'}, json={ 'model': '{MODEL_NAME}', 'messages': [{ 'role': 'user', 'content': f'Context: {context}\n\nQuestion: {question}' }] }, timeout=10 ) answer = response.json()['choices'][0]['message']['content'] return answer, source="Cloud API (Online)" except Exception as e: return "Error: No AI runtime available", source="Failed" else: return "Error: No internet and no local AI available", source="Offline" Demo Project Output: Foundry Local answering context-based questions offline : The Foundry Local engine ran the Phi-4-mini model offline and retrieved context-based data. : The Foundry Local engine ran the Phi-4-mini model offline and mentioned that there is no answer. Practical Use Cases Privacy-First Reading Assistant: Summarize documents locally without sending text to the cloud. Healthcare Apps: Analyze medical data on-device for compliance. Financial Tools: Risk scoring without exposing sensitive financial data. IoT & Edge Devices: Real-time anomaly detection without network dependency. Conclusion On-device AI isn’t just a trend - it’s a shift toward smarter, faster, and more secure applications. With Windows AI Foundry and Foundry Local, developers can deliver experiences that respect user specific data, reduce latency, and work even when connectivity fails. By combining local inference with optional cloud enhancements, you get the best of both worlds: instant performance and scalable intelligence. Whether you’re creating document summarizers, offline assistants, or compliance-ready solutions, this approach ensures your apps stay responsive, reliable, and user-centric. References Get started with Foundry Local - Foundry Local | Microsoft Learn What is Windows AI Foundry? | Microsoft Learn https://devblogs.microsoft.com/foundry/unlock-instant-on-device-ai-with-foundry-local/Capture .NET Profiler Trace on the Azure App Service platform
Summary The article provides guidance on using the .NET Profiler Trace feature in Microsoft Azure App Service to diagnose performance issues in ASP.NET applications. It explains how to configure and collect the trace by accessing the Azure Portal, navigating to the Azure App Service, and selecting the "Collect .NET Profiler Trace" feature. Users can choose between "Collect and Analyze Data" or "Collect Data only" and must select the instance to perform the trace on. The trace stops after 60 seconds but can be extended up to 15 minutes. After analysis, users can view the report online or download the trace file for local analysis, which includes information like slow requests and CPU stacks. The article also details how to analyze the trace using Perf View, a tool available on GitHub, to identify performance issues. Additionally, it provides a table outlining scenarios for using .NET Profiler Trace or memory dumps based on various factors like issue type and symptom code. This tool is particularly useful for diagnosing slow or hung ASP.NET applications and is available only in Standard or higher SKUs with the Always On setting enabled. In this article How to configure and collect the .NET Profiler Trace How to download the .NET Profiler Trace How to analyze a .NET Profiler Trace When to use .NET Profilers tracing vs. a memory dump The tool is exceptionally suited for scenarios where an ASP.NET application is performing slower than expected or gets hung. As shown in Figure 1, this feature is available only in Standard or higher Stock Keeping Unit (SKU) and Always On is enabled. If you try to configure .NET Profiler Trace, without both configurations the following messages is rendered. Azure App Service Diagnose and solve problems blade in the Azure Portal error messages Error – This tool is supported only on Standard, Premium, and Isolated Stock Keeping Unit (SKU) only with AlwaysOn setting enabled to TRUE. Error – We determined that the web app is not "Always-On" enabled and diagnostic does not work reliably with Auto Heal. Turn on the Always-On setting by going to the Application Settings for the web app and then run these tools. How to configure and collect the .NET Profiler Trace To configure a .NET Profiler Trace access the Azure Portal and navigate to the Azure App Service which is experiencing a performance issue. Select Diagnose and solve problems and then the Diagnostic Tools tile. Azure App Service Diagnose and solve problems blade in the Azure Portal Select the "Collect .NET Profiler Trace" feature on the Diagnostic Tools blade and the following blade is rendered. Notice that you can only select Collect and Analyze Data or Collect Data only. Choose the one you prefer but do consider having the feature perform the analysis. You can download the trace for offline analysis if necessary. Also notice that you need to **select the instance** on which you want to perform the trace. In the scenario, there is only one, so the selection is simple. However, if your app runs on multiple instances, either select them all or if you identify a specific instance which is behaving slowly, select only that one. You realize the best results if you can isolate a single instance enough so that the request you sent is the only one received on that instance. However, in a scenario where the request or instance is not known, the trace adds value and insights. Adding a thread report provides list of all the threads in the process is also collected at the end of the profiler trace. The thread report is useful especially if you are troubleshooting hung processes, deadlocks, or requests taking more than 60 seconds. This pauses your process for a few seconds until the thread dump is generated. CAUTION: a thread report is NOT recommended if you are experiencing High CPU in your application, you may experience issues during trace analysis if CPU consumption is high. Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace blade in the Azure Portal There are a few points called out in the previous image which are important to read and consider. Specifically the .NET Profiler Trace will stop after 60 seconds from the time that it is started. Therefore, if you can reproduce the issue, have the reproduction steps ready before you start the profiling. If you are not able to reproduce the issue, then you may need to run the trace a few times until the slowness or hang occurs. The collection time can be increased up to 15 minutes (900 seconds) by adding an application setting named IIS_PROFILING_TIMEOUT_IN_SECONDS with a value of up to 900. After selecting the instance to perform the trace on, press the Collect Profiler Trace button, wait for the profiler to start as seen here, then reproduce the issue or wait for it to occur. Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace status starting window After the issue is reproduced the .NET Profiler Trace continues to the next step of stopping as seen here. Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace status stopping window Once stopped, the process continues to the analysis phase if you selected the Collect and Analyze Data option, as seen in the following image, otherwise you are provided a link to download the file for analysis on your local machine. The analysis can take some time, so be patient. Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace status analyzing window After the analysis is complete, you can either view the Analysis online or download the trace file for local development. How to download the .NET Profiler Trace Once the analysis is complete you can view the report by selecting the link in the Reports column, as seen here. Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace status complete window Clicking on the report you see the following. There is some useful information in this report, like a list of slow requests, Failed Requests, Thread Call stacks, and CPU stacks. Also shown is a breakdown of where the time was spent during the response generation into categories like Application Code, Platform, and Network. In this case, all the time is spent in the Application code. Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace review the Report To find out specifically where in the Application Code this request performs the analysis of the trace locally. How to analyze a .NET Profiler Trace After downloading the network trace by selecting the link in the Data column, you can use a tool named Perf View which is downloadable on GitHub here. Begin by opening Perf View and double-clicking on the ".DIAGSESSION" file, after some moments expand it to render the Event Trace Log (ETL) file, as shown here. Analyze Azure App Service .NET Profiler Trace with Perf View Double-click on the Thread Time (with startStop Activities) Stacks which open up a new window similar to shown next. If your App Service is configured as out-of-process select the dotnet process which is associated to your app code. If your App Service is in-process select the w3wp process. Analyze Azure App Service .NET Profiler Trace with Perf View, dotnet out-of-process Double-click on dotnet and another window is rendered, as shown here. From the previous image, .NET Profiler Trace reviews the Report, it is clear where slowness is coming from, find that in the Name column or search for it by entering the page name into the Find text box. Analyze Azure App Service .NET Profiler Trace with Perf View, dotnet out-of-process, method, and class discovery Once found right-click on the row and select Drill Into from the pop-up menu, shown here. Select the Call Tree tab and the reason for the issue renders showing which request was performing slow. Analyze Azure App Service .NET Profiler Trace with Perf View, dotnet out-of-process, root cause This example is relatively. As you analyze more performance issues using Perf View to analyze a .NET Profiler Trace your ability to find the root cause of more complicated performance issues can be realized. When to use .NET Profilers tracing vs. a memory dump That same issue is seen in a memory dump, however there are some scenarios where a .NET Profile trace would be best. Here is a table, Table 1, which describes scenarios for when to capture a .NET profile trace or to capture a memory dump. Issue Type Symptom Code Symptom Stack Startup Issue Intermittent Scenario Performance 200 Requests take 500 ms to 2.5 seconds, or takes <= 60 seconds ASP.NET/ASP.NET Core No No Profiler Performance 200 Requests take > 60 seconds & < 230 seconds ASP.NET/ASP.NET Core No No Dump Performance 502.3/500.121/503 Requests take >=120 to <= 230 seconds ASP.NET No No Dump, Profiler Performance 502.3/500.121/503 Requests timing out >=230 ASP.NET/ASP.NET Core Yes/No Yes/No Dump Performance 502.3/500.121/503 App hangs or deadlocks (ex: due to async anti-pattern) ASP.NET/ASP.NET Core Yes/No Yes/No Dump Performance 502.3/500.121/503 App hangs on startup (ex: caused by nonasync deadlock issue) ASP.NET/ASP.NET Core No Yes/No Dump Performance 502.3/500.121 Request timing out >=230 (time out) ASP.NET/ASP.NET Core No No Dump Availability 502.3/500.121/503 High CPU causing app downtime ASP.NET No No Profiler, Dump Availability 502.3/500.121/503 High Memory causing app downtime ASP.NET/ASP.NET Core No No Dump Availability 500.0[121]/503 SQLException or Some Exception causes app downtime ASP.NET No No Dump, Profiler Availability 500.0[121]/503 App crashing due to fatal exception at native layer ASP.NET/ASP.NET Core Yes/No Yes/No Dump Availability 500.0[121]/503 App crashing due to exit code (ex: 0xC0000374) ASP.NET/ASP.NET Core Yes/No Yes/No Dump Availability 500.0 App begin nonfatal exceptions (during a context of a request) ASP.NET No No Profiler, Dump Availability 500.0 App begin nonfatal exceptions (during a context of a request) ASP.NET/ASP.NET Core No Yes/No Dump Table 1, when to capture a .NET Profiler Trace or a Memory Dump on Azure App Service, Diagnose and solve problems Use this list as a guide to help decide how to approach the solving of performance and availability applications problems which are occurring in your application source code. Here are some descriptions regarding the column heading. - Issues Type – Performance means that a request to the app is responding or processing the response but not at a speed in which it is expected to. Availability means that the request is failing or consuming more resources than expected. - Symptom Code – the HTTP Status and/or sub status which is returned by the request. - Symptom – a description of the behavior experienced while engaging with the application. - Stack – this table targets .NET, specifically ASP.NET, and ASP.NET Core applications. - Startup Issue – if "No" then the Scenario can or should be used, "No" represents that the issue is not at startup. If "Yes/No" it means the Scenario is useful for troubleshooting startup issues. - Intermittent – if "No" then the Scenario can or should be used, "No" means the issue is not intermittent or that it can be reproduced. If "Yes/No" it means the Scenario is useful if the issue happens randomly or cannot be reproduced. Meaning that the tool can be set to trigger on a specific event or left running for a specific amount of time until the exception happens. - Scenario – "Profiler" means that the collection of a .NET Profiler Trace would be recommended. "Dump" means that a memory dump would be your best option. If both are provided, then both can be useful when the given symptoms and system codes are present. You might find the videos in Table 2 useful which instruct you how to collect and analyze a memory dump or .NET Profiler Trace. Product Stack Hosting Symptom Capture Analyze Scenario App Service Windows in High CPU link link Dump App Service Windows in High Memory link link Dump App Service Windows in Terminate link link Dump App Service Windows in Hang link link Dump App Service Windows out High CPU link link Dump App Service Windows out High Memory link link Dump App Service Windows out Terminate link link Dump App Service Windows out Hang link link Dump App Service Windows in High CPU link link Dump Function App Windows in High Memory link link Dump Function App Windows in Terminate link link Dump Function App Windows in Hang link link Dump Function App Windows out High CPU link link Dump Function App Windows out High Memory link link Dump Function App Windows out Terminate link link Dump Function App Windows out Hang link link Dump Azure WebJob Windows in High CPU link link Dump App Service Windows in High CPU link link .NET Profiler App Service Windows in Hang link link .NET Profiler App Service Windows in Exception link link .NET Profiler App Service Windows out High CPU link link .NET Profiler App Service Windows out Hang link link .NET Profiler App Service Windows out Exception link link .NET Profiler Table 2, short video instructions on capturing and analyzing dumps and profiler traces Here are a few other helpful videos for troubleshooting Azure App Service Availability and Performance issues: View Application EventLogs Azure App Service Add Application Insights To Azure App Service Prior to capturing and analyzing memory dumps, consider viewing this short video: Setting up WinDbg to analyze Managed code memory dumps and this blog post titled: Capture memory dumps on the Azure App Service platform. Question & Answers - Q: What are the prerequisites for using the .NET Profiler Trace feature in Azure App Service? A: To use the .NET Profiler Trace feature in Azure App Service, the application must be running on a Standard or higher Stock Keeping Unit (SKU) with the Always On setting enabled. If these conditions are not met, the tool will not function, and error messages will be displayed indicating the need for these configurations. - Q: How can you extend the default collection time for a .NET Profiler Trace beyond 60 seconds? A: The default collection time for a .NET Profiler Trace is 60 seconds, but it can be extended up to 15 minutes (900 seconds) by adding an application setting named IIS_PROFILING_TIMEOUT_IN_SECONDS with a value of up to 900. This allows for a longer duration to capture the necessary data for analysis. - Q: When should you use a .NET Profiler Trace instead of a memory dump for diagnosing performance issues in an ASP.NET application? A: A .NET Profiler Trace is recommended for diagnosing performance issues where requests take between 500 milliseconds to 2.5 seconds or less than 60 seconds. It is also useful for identifying high CPU usage causing app downtime. In contrast, a memory dump is more suitable for scenarios where requests take longer than 60 seconds, the application hangs or deadlocks, or there are issues related to high memory usage or app crashes due to fatal exceptions. Keywords Microsoft Azure, Azure App Service, .NET Profiler Trace, ASP.NET performance, Azure debugging tools, .NET performance issues, Azure diagnostic tools, Collect .NET Profiler Trace, Analyze .NET Profiler Trace, Azure portal, Performance troubleshooting, ASP.NET application, Slow ASP.NET app, Azure Standard SKU, Always On setting, Memory dump vs profiler trace, Perf View analysis, Azure performance diagnostics, .NET application profiling, Diagnose ASP.NET slowness, Azure app performance, High CPU usage ASP.NET, Azure app diagnostics, .NET Profiler configuration, Azure app service performance2KViews3likes1CommentLangChain v1 is now generally available!
Today LangChain v1 officially launches and marks a new era for the popular AI agent library. The new version ushers in a more streamlined, and extensible foundation for building agentic LLM applications. In this post we'll breakdown what’s new, what changed, and what “general availability” means in practice. Join Microsoft Developer Advocates, Marlene Mhangami and Yohan Lasorsa, to see live demos of the new API and find out more about what JavaScript and Python developers need to know about v1. Register for this event here. Why v1? The Motivation Behind the Redesign The number of abstractions in LangChain had grown over the years to include chains, agents, tools, wrappers, prompt helpers and more, which, while powerful, introduced complexity and fragmentation. As model APIs evolve (multimodal inputs, richer structured output, tool-calling semantics), LangChain needed a cleaner, more consistent core to ensure production ready stability. In v1: All existing chains and agent abstractions in the old LangChain are deprecated; they are replaced by a single high-level agent abstraction built on LangGraph internals. LangGraph becomes the foundational runtime for durable, stateful, orchestrated execution. LangChain now emphasizes being the “fast path to agents” that doesn’t hide but builds upon LangGraph. The internal message format has been upgraded to support standard content blocks (e.g. text, reasoning, citations, tool calls) across model providers, decoupling “content” from raw strings. Namespace cleanup: the langchain package now focuses tightly on core abstractions (agents, models, messages, tools), while legacy patterns are moved into langchain-classic (or equivalents). What’s New & Noteworthy for Developers Here are key changes developers should pay attention to: 1. create_agent becomes the default API The create_agent function is now the idiomatic way to spin up agents in v1. It replaces older constructs (e.g. create_react_agent) with a clearer, more modular API. You can also now compose middleware around model calls, tool calls, before/after hooks, error handling, etc. 2. Standard content blocks & normalized message model One of LangChain's greatest stregnth's is it's model agnosticism. Content blocks move to standardize all outputs, so developers know exactly what to expect regardless of the model they are using. Responses from models are no longer opaque strings. Instead, they carry structured `content_blocks` which classify parts of the output (e.g. “text”, “reasoning”, “citation”, “tool_call”). 3. Multimodal and richer model inputs / outputs LangChain continues to support more than just text-based interactions, but in a more comprehensive way in v1. Models can accept and return files, images, video, etc., and the message format reflects this flexibility. This upgrade prepares us well for the next generation of models with mixed modalities (vision, audio, etc.). 4. Middleware hooks Because create_agent is designed as a pluggable pipeline, developers can now inject logic before/after model calls, before tool calls and more. New middleware such as 'human in the loop' and 'summarization' middleware have been added. This is a feature of the new package that I am most excited about it! Even with the simplified agents API, this option provides more room to customize workflows! Developers can try pre-built middleware or make their own. 5. Simplified, leaner namespace Many formerly top-level modules or helper classes have been removed or relocated to langchain-classic (or similarly stamped “legacy”) to declutter the main API surface. A migration guide is available to help projects transition from v0 to v1. While v1 is now the main line, older v0 is still documented and maintained for compatibility. What “General Availability” Means (and Doesn’t) v1 is production-ready, after testing the alpha version. The stable v0 release line remains supported for those unwilling or unable to migrate immediately. Breaking changes in public APIs will be accompanied by version bumps (i.e. minor version increments) and deprecation notices. The roadmap anticipates minor versions every 2–3 months (with patch releases more frequently). Because the field of LLM applications is evolving rapidly, the team expects continued iterations in v1—even in GA mode—with users encouraged to surface feedback, file issues, and adopt the migration path. (This is in line with the philosophy stated in docs.) Developer Callouts & Suggested Steps Some things we recommend for developers to do to get started with v1: Try the new API Now! LangChain Azure AI and Azure OpenAI have migrated to LangChain v1 and are ready to test! Learn more about using LangChain and Azure AI: Python: https://docs.langchain.com/oss/python/integrations/providers/azure_ai JavaScript: https://docs.langchain.com/oss/javascript/integrations/providers/microsoft Join us for a Live Stream on Wednesday 22 October 2025 Join Microsoft Developer Advocates Marlene Mhangami and Yohan Lasorsa for a livestream this Wednesday to see live demos and find out more about what JavaScript and Python developers need to know about v1. Register for this event here.Using Keycloak with Azure AD to integrate AKS Cluster authentication process
Integrating Azure Kubernetes Service (AKS) with Keycloak through Azure Active Directory (Azure AD) as an intermediary leverages Azure AD’s support for OpenID Connect (OIDC) to handle authentication and authorization. This integration enhances security, streamlines user management, and simplifies the authentication process for users accessing the AKS cluster.AMA Spotlight: Build Smarter with Azure Developer CLI 'AZD'
Weekly AMA 'Ask Me Anything': Build Smarter with Azure Developer CLI Calling all AI engineers, developers, and builders of the future, this is your backstage pass to the tools shaping scalable, agentic AI deployments. Join Kristen Womack, Product Manager for the Azure Developer CLI (azd) Developer CLI (azd), and the engineering team behind azd for a live Ask Me Anything session every Thursday at 12:30pm PT in the Azure AI Foundry Discord. Whether you're: 🧠 Orchestrating multi-agent systems 📦 Deploying LLM-powered apps with Azure AI Foundry 🔐 Navigating least-privilege infrastructure setups 🛠️ Debugging and optimizing reproducible workflows …this AMA is your chance to connect directly with the team building the CLI that powers it all. 💡 Why Join? Real-time answers from the azd engineers and product team Deployment walkthroughs for Foundry templates, from chatbots to document processors Tips for CI/CD, debugging, and reproducibility in enterprise environments Community-first mindset: bring your feedback, challenges, and ideas Kristen Womack brings deep insight into developer experience and product strategy; this is a rare opportunity to learn from the source and shape the future of AI tooling. 🔧 Get Ready Before you join: Install azd 👉 Install Guide Explore Kristen’s work 👉 www.kristenwomack.io Join the Discord 👉 Azure AI Foundry Community 🗓️ Weekly Schedule 🕧 Thursdays at 12:30pm PT 📍 Azure AI Foundry Discord Bring your questions. Bring your curiosity. Build with the best. Additional resources: check out the AZD for Beginners course https://aka.ms/azd-for-beginnersBeyond the Desktop: The Future of Development with Microsoft Dev Box and GitHub Codespaces
The modern developer platform has already moved past the desktop. We’re no longer defined by what’s installed on our laptops, instead we look at what tooling we can use to move from idea to production. An organisations developer platform strategy is no longer a nice to have, it sets the ceiling for what’s possible, an organisation can’t iterate it's way to developer nirvana if the foundation itself is brittle. A great developer platform shrinks TTFC (time to first commit), accelerates release velocity, and maybe most importantly, helps alleviate everyday frictions that lead to developer burnout. Very few platforms deliver everything an organization needs from a developer platform in one product. Modern development spans multiple dimensions, local tooling, cloud infrastructure, compliance, security, cross-platform builds, collaboration, and rapid onboarding. The options organizations face are then to either compromise on one or more of these areas or force developers into rigid environments that slow productivity and innovation. This is where Microsoft Dev Box and GitHub Codespaces come into play. On their own, each addresses critical parts of the modern developer platform: Microsoft Dev Box provides a full, managed cloud workstation. Dev Box gives developers a consistent, high-performance environment while letting central IT apply strict governance and control. Internally at Microsoft, we estimate that usage of Dev Box by our development teams delivers savings of 156 hours per year per developer purely on local environment setup and upkeep. We have also seen significant gains in other key SPACE metrics reducing context-switching friction and improving build/test cycles. Although the benefits of Dev Box are clear in the results demonstrated by our customers it is not without its challenges. The biggest challenge often faced by Dev Box customers is its lack of native Linux support. At the time of writing and for the foreseeable future Dev Box does not support native Linux developer workstations. While WSL2 provides partial parity, I know from my own engineering projects it still does not deliver the full experience. This is where GitHub Codespaces comes into this story. GitHub Codespaces delivers instant, Linux-native environments spun up directly from your repository. It’s lightweight, reproducible, and ephemeral ideal for rapid iteration, PR testing, and cross-platform development where you need Linux parity or containerized workflows. Unlike Dev Box, Codespaces can run fully in Linux, giving developers access to native tools, scripts, and runtimes without workarounds. It also removes much of the friction around onboarding: a new developer can open a repository and be coding in minutes, with the exact environment defined by the project’s devcontainer.json. That said, Codespaces isn’t a complete replacement for a full workstation. While it’s perfect for isolated project work or ephemeral testing, it doesn’t provide the persistent, policy-controlled environment that enterprise teams often require for heavier workloads or complex toolchains. Used together, they fill the gaps that neither can cover alone: Dev Box gives the enterprise-grade foundation, while Codespaces provides the agile, cross-platform sandbox. For organizations, this pairing sets a higher ceiling for developer productivity, delivering a truly hybrid, agile and well governed developer platform. Better Together: Dev Box and GitHub Codespaces in action Together, Microsoft Dev Box and GitHub Codespaces deliver a hybrid developer platform that combines consistency, speed, and flexibility. Teams can spin up full, policy-compliant Dev Box workstations preloaded with enterprise tooling, IDEs, and local testing infrastructure, while Codespaces provides ephemeral, Linux-native environments tailored to each project. One of my favourite use cases is having local testing setups like a Docker Swarm cluster, ready to go in either Dev Box or Codespaces. New developers can jump in and start running services or testing microservices immediately, without spending hours on environment setup. Anecdotally, my time to first commit and time to delivering “impact” has been significantly faster on projects where one or both technologies provide local development services out of the box. Switching between Dev Boxes and Codespaces is seamless every environment keeps its own libraries, extensions, and settings intact, so developers can jump between projects without reconfiguring or breaking dependencies. The result is a turnkey, ready-to-code experience that maximizes productivity, reduces friction, and lets teams focus entirely on building, testing, and shipping software. To showcase this value, I thought I would walk through an example scenario. In this scenario I want to simulate a typical modern developer workflow. Let's look at a day in the life of a developer on this hybrid platform building an IOT project using Python and React. Spin up a ready-to-go workstation (Dev Box) for Windows development and heavy builds. Launch a Linux-native Codespace for cross-platform services, ephemeral testing, and PR work. Run "local" testing like a Docker Swarm cluster, database, and message queue ready to go out-of-the-box. Switch seamlessly between environments without losing project-specific configurations, libraries, or extensions. 9:00 AM – Morning Kickoff on Dev Box I start my day on my Microsoft Dev Box, which gives me a fully-configured Windows environment with VS Code, design tools, and Azure integrations. I select my teams project, and the environment is pre-configured for me through the Dev Box catalogue. Fortunately for me, its already provisioned. I could always self service another one using the "New Dev Box" button if I wanted too. I'll connect through the browser but I could use the desktop app too if I wanted to. My Tasks are: Prototype a new dashboard widget for monitoring IoT device temperature. Use GUI-based tools to tweak the UI and preview changes live. Review my Visio Architecture. Join my morning stand up. Write documentation notes and plan API interactions for the backend. In a flash, I have access to my modern work tooling like Teams, I have this projects files already preloaded and all my peripherals are working without additional setup. Only down side was that I did seem to be the only person on my stand up this morning? Why Dev Box first: GUI-heavy tasks are fast and responsive. Dev Box’s environment allows me to use a full desktop. Great for early-stage design, planning, and visual work. Enterprise Apps are ready for me to use out of the box (P.S. It also supports my multi-monitor setup). I use my Dev Box to make a very complicated change to my IoT dashboard. Changing the title from "IoT Dashboard" to "Owain's IoT Dashboard". I preview this change in a browser live. (Time for a coffee after this hardwork). The rest of the dashboard isnt loading as my backend isnt running... yet. 10:30 AM – Switching to Linux Codespaces Once the UI is ready, I push the code to GitHub and spin up a Linux-native GitHub Codespace for backend development. Tasks: Implement FastAPI endpoints to support the new IoT feature. Run the service on my Codespace and debug any errors. Why Codespaces now: Linux-native tools ensure compatibility with the production server. Docker and containerized testing run natively, avoiding WSL translation overhead. The environment is fully reproducible across any device I log in from. 12:30 PM – Midday Testing & Sync I toggle between Dev Box and Codespaces to test and validate the integration. I do this in my Dev Box Edge browser viewing my codespace (I use my Codespace in a browser through this demo to highlight the difference in environments. In reality I would leverage the VSCode "Remote Explorer" extension and its GitHub Codespace integration to use my Codespace from within my own desktop VSCode but that is personal preference) and I use the same browser to view my frontend preview. I update the environment variable for my frontend that is running locally in my Dev Box and point it at the port running my API locally on my Codespace. In this case it was a web socket connection and HTTPS calls to port 8000. I can make this public by changing the port visibility in my Codespace. https://fluffy-invention-5x5wp656g4xcp6x9-8000.app.github.dev/api/devices wss://fluffy-invention-5x5wp656g4xcp6x9-8000.app.github.dev/ws This allows me to: Preview the frontend widget on Dev Box, connecting to the backend running in Codespaces. Make small frontend adjustments in Dev Box while monitoring backend logs in Codespaces. Commit changes to GitHub, keeping both environments in sync and leveraging my CI/CD for deployment to the next environment. We can see the Dev Box running local frontend and the Codespace running the API connected to each other, making requests and displaying the data in the frontend! Hybrid advantage: Dev Box handles GUI previews comfortably and allows me to live test frontend changes. Codespaces handles production-aligned backend testing and Linux-native tools. Dev Box allows me to view all of my files in one screen with potentially multiple Codespaces running in browser of VS Code Desktop. Due to all of those platform efficiencies I have completed my days goals within an hour or two and now I can spend the rest of my day learning about how to enable my developers to inner source using GitHub CoPilot and MCP (Shameless plug). The bottom line There are some additional considerations when architecting a developer platform for an enterprise such as private networking and security not covered in this post but these are implementation details to deliver the described developer experience. Architecting such a platform is a valuable investment to deliver the developer platform foundations we discussed at the top of the article. While in this demo I have quickly built I was working in a mono repository in real engineering teams it is likely (I hope) that an application is built of many different repositories. The great thing about Dev Box and Codespaces is that this wouldn’t slow down the rapid development I can achieve when using both. My Dev Box would be specific for the project or development team, pre loaded with all the tools I need and potentially some repos too! When I need too I can quickly switch over to Codespaces and work in a clean isolated environment and push my changes. In both cases any changes I want to deliver locally are pushed into GitHub (Or ADO), merged and my CI/CD ensures that my next step, potentially a staging environment or who knows perhaps *Whispering* straight into production is taken care of. Once I’m finished I delete my Codespace and potentially my Dev Box if I am done with the project, knowing I can self service either one of these anytime and be up and running again! Now is there overlap in terms of what can be developed in a Codespace vs what can be developed in Azure Dev Box? Of course, but as organisations prioritise developer experience to ensure release velocity while maintaining organisational standards and governance then providing developers a windows native and Linux native service both of which are primarily charged on the consumption of the compute* is a no brainer. There are also gaps that neither fill at the moment for example Microsoft Dev Box only provides windows compute while GitHub Codespaces only supports VS Code as your chosen IDE. It's not a question of which service do I choose for my developers, these two services are better together! * Changes have been announced to Dev Box pricing. A W365 license is already required today and dev boxes will continue to be managed through Azure. For more information please see: Microsoft Dev Box capabilities are coming to Windows 365 - Microsoft Dev Box | Microsoft Learn757Views2likes0CommentsDeployment and Build from Azure Linux based Web App
TOC Introduction Deployment Sources From Laptop From CI/CD tools Build Source From Oryx Build From Runtime From Deployment Sources Walkthrough Laptop + Oryx Laptop + Runtime Laptop CI/CD concept Conclusion 1. Introduction Deployment on Azure Linux Web Apps can be done through several different methods. When a deployment issue occurs, the first step is usually to identify which method was used. The core of these methods revolves around the concept of Build, the process of preparing and loading the third-party dependencies required to run an application. For example, a Python app defines its build process as pip install packages, a Node.js app uses npm install modules, and PHP or Java apps rely on libraries. In this tutorial, I’ll use a simple Python app to demonstrate four different Deployment/Build approaches. Each method has its own use cases and limitations. You can even combine them, for example, using your laptop as the deployment tool while still using Oryx as the build engine. The same concepts apply to other runtimes such as Node.js, PHP, and beyond. 2. Deployment Sources From Laptop Scenarios: Setting up a proof of concept Developing in a local environment Advantages: Fast development cycle Minimal configuration required Limitations: Difficult for the local test environment to interact with cloud resources OS differences between local and cloud environments may cause integration issues From CI/CD tools Scenarios: Projects with established development and deployment workflows Codebases requiring version control and automation Advantages: Developers can focus purely on coding Automatic deployment upon branch commits Limitations: Build and runtime environments may still differ slightly at the OS level 3. Build Source From Oryx Build Scenarios: Offloading resource-intensive build tasks from your local or CI/CD environment directly to the Azure Web App platform, reducing local computing overhead. Advantages: Minimal extra configuration Multi-language support Limitations: Build performance is limited by the App Service SKU and may face performance bottlenecks The build environment may differ from the runtime environment, so apps sensitive to minor package versions should take caution From Runtime Scenarios: When you want the benefits and pricing of a PaaS solution but need control similar to an IaaS setup Advantages: Build occurs in the runtime environment itself Allows greater flexibility for low-level system operations Limitations: Certain system-level settings (e.g., NTP time sync) remain inaccessible From Deployment Sources Scenarios: Pre-package all dependencies and deploy them together, eliminating the need for a separate build step. Advantages: Supports proprietary or closed-source company packages Limitations: Incompatibility may arise if the development and runtime environments differ significantly in OS or package support Type Method Scenario Advantage Limitation Deployment From Laptop POC / Dev Fast setup Poor cloud link Deployment From CI/CD Auto pipeline Focus on code OS mismatch Build From Oryx Platform build Simple, multi-lang Performance cap Build From Runtime High control Flexible ops Limited access Build From Deployment Pre-built deploy Use private pkg Env mismatch 4. Walkthrough Laptop + Oryx Add Environment Variables SCM_DO_BUILD_DURING_DEPLOYMENT=false (Purpose: prevents the deployment environment from packaging during publish; this must also be set in the deployment environment itself.) WEBSITE_RUN_FROM_PACKAGE=false (Purpose: tells Azure Web App not to run the app from a prepackaged file.) ENABLE_ORYX_BUILD=true (Purpose: allows the Azure Web App platform to handle the build process automatically after a deployment event.) Add startup command bash /home/site/wwwroot/run.sh (The run.sh file corresponds to the script in your project code.) Check sample code requirements.txt — defines Python packages (similar to package.json in Node.js). Flask==3.0.3 gunicorn==23.0.0 app.py — main Python application code. from flask import Flask app = Flask(__name__) @app.route("/") def home(): return "Deploy from Laptop + Oryx" if __name__ == "__main__": import os app.run(host="0.0.0.0", port=8000) run.sh — script used to start the application. #!/bin/bash gunicorn --bind=0.0.0.0:8000 app:app .deployment — VS Code deployment configuration file. [config] SCM_DO_BUILD_DURING_DEPLOYMENT=false Deployment Once both the deployment and build processes complete successfully, you should see the expected result. Laptop + Runtime Add Environment Variables (Screenshots omitted since the process is similar to previous steps) SCM_DO_BUILD_DURING_DEPLOYMENT=false Purpose: Prevents the deployment environment from packaging during the publishing process. This setting must also be added in the deployment environment itself. WEBSITE_RUN_FROM_PACKAGE=false Purpose: Instructs Azure Web App not to run the application from a prepackaged file. ENABLE_ORYX_BUILD=false Purpose: Ensures that Azure Web App does not perform any build after deployment; all build tasks will instead be handled during the startup script execution. Add Startup Command (Screenshots omitted since the process is similar to previous steps) bash /home/site/wwwroot/run.sh (The run.sh file corresponds to the script of the same name in your project code.) Check Sample Code (Screenshots omitted since the process is similar to previous steps) requirements.txt: Defines Python packages (similar to package.json in Node.js). Flask==3.0.3 gunicorn==23.0.0 app.py: The main Python application code. from flask import Flask app = Flask(__name__) @app.route("/") def home(): return "Deploy from Laptop + Runtime" if __name__ == "__main__": import os app.run(host="0.0.0.0", port=8000) run.sh: Startup script. In addition to launching the app, it also creates a virtual environment and installs dependencies, all build-related tasks happen here. #!/bin/bash python -m venv venv source venv/bin/activate pip install -r requirements.txt gunicorn --bind=0.0.0.0:8000 app:app .deployment: VS Code deployment configuration file. [config] SCM_DO_BUILD_DURING_DEPLOYMENT=false Deployment (Screenshots omitted since the process is similar to previous steps) Once both deployment and build are completed, you should see the expected output. Laptop Add Environment Variables (Screenshots omitted as the process is similar to previous steps) SCM_DO_BUILD_DURING_DEPLOYMENT=false Purpose: Prevents the deployment environment from packaging during publish. This must also be set in the deployment environment itself. WEBSITE_RUN_FROM_PACKAGE=false Purpose: Instructs Azure Web App not to run the app from a prepackaged file. ENABLE_ORYX_BUILD=false Purpose: Prevents Azure Web App from building after deployment. All build tasks will instead execute during the startup script. Add Startup Command (Screenshots omitted as the process is similar to previous steps) bash /home/site/wwwroot/run.sh (The run.sh corresponds to the same-named file in your project code.) Check Sample Code (Screenshots omitted as the process is similar to previous steps) requirements.txt: Defines Python packages (like package.json in Node.js). Flask==3.0.3 gunicorn==23.0.0 app.py: The main Python application. from flask import Flask app = Flask(__name__) @app.route("/") def home(): return "Deploy from Laptop" if __name__ == "__main__": import os app.run(host="0.0.0.0", port=8000) run.sh: The startup script. In addition to launching the app, it activates an existing virtual environment. The creation of that environment and installation of dependencies will occur in the next section. #!/bin/bash source venv/bin/activate gunicorn --bind=0.0.0.0:8000 app:app .deployment: VS Code deployment configuration file. [config] SCM_DO_BUILD_DURING_DEPLOYMENT=false Deployment Before deployment, you must perform a local build process. Run commands locally (depending on the language, usually for installing dependencies). python -m venv venv source venv/bin/activate pip install -r requirements.txt After completing the local build, deploy your app. Once deployment finishes, you should see the expected result. CI/CD concept For example, when using Azure DevOps (ADO) as your CI/CD tool, its behavior conceptually mirrors deploying directly from a laptop, but with enhanced automation, governance, and reproducibility. Essentially, ADO pipelines translate your manual local deployment steps into codified, repeatable workflows defined in a YAML pipeline file, executed by Microsoft-hosted or self-hosted agents. A typical azure-pipelines.yml defines the stages (e.g., build, deploy) and their corresponding jobs and steps. Each stage runs on a specified VM image (e.g., ubuntu-latest) and executes commands, the same npm install, pip install which you would normally run on your laptop. The ADO pipeline acts as your automated laptop, every build command, environment variable, and deployment step you’d normally execute locally is just formalized in YAML. Whether you build inline, use Oryx, or deploy pre-built artifacts, the underlying concept remains identical: compile, package, and deliver code to Azure. The distinction lies in who performs it. 5. Conclusion Different deployment and build methods lead to different debugging and troubleshooting approaches. Therefore, understanding the selected deployment method and its corresponding troubleshooting process is an essential skill for every developer and DevOps engineer.368Views0likes0CommentsAMA: Azure AI Foundry Voice Live API: Build Smarter, Faster Voice Agents
Join us LIVE in the Azure AI Foundry Discord on the 14th October, 2025, 10am PT to learn more about Voice Live API Voice is no longer a novelty, it's the next-gen interface between humans and machines. From automotive assistants to educational tutors, voice-driven agents are reshaping how we interact with technology. But building seamless, real-time voice experiences has often meant stitching together a patchwork of services: STT, GenAI, TTS, avatars, and more. Until now. Introducing Azure AI Foundry Voice Live API Launched into general availability on October 1, 2025, the Azure AI Foundry Voice Live API is a game-changer for developers building voice-enabled agents. It unifies the entire voice stack—speech-to-text, generative AI, text-to-speech, avatars, and conversational enhancements, into a single, streamlined interface. That means: ⚡ Lower latency 🧠 Smarter interactions 🛠️ Simplified development 📈 Scalable deployment Whether you're prototyping a voice bot for customer support or deploying a full-stack assistant in production, Voice Live API accelerates your journey from idea to impact. Ask Me Anything: Deep Dive with the CoreAI Speech Team Join us for a live AMA session where you can engage directly with the engineers behind the API: 🗓️ Date: 14th Oct 2025 🕒 Time: 10am PT 📍 Location: https://aka.ms/foundry/discord See the EVENTS 🎤 Speakers: Qinying Liao, Principal Program Manager, CoreAI Speech Jan Gorgen, Senior Program Manager, CoreAI Speech They’ll walk through real-world use cases, demo the API in action, and answer your toughest questions, from latency optimization to avatar integration. Who Should Attend? This AMA is designed for: AI engineers building multimodal agents Developers integrating voice into enterprise workflows Researchers exploring conversational UX Foundry users looking to scale voice prototypes Why It Matters Voice Live API isn’t just another endpoint, it’s a foundation for building natural, responsive, and production-ready voice agents. With Azure AI Foundry’s orchestration and deployment tools, you can: Skip the glue code Focus on experience design Deploy with confidence across platforms Bring Your Questions Curious about latency benchmarks? Want to know how avatars sync with TTS? Wondering how to integrate with your existing Foundry workflows? This is your chance to ask the team directly.From Cloud to Chip: Building Smarter AI at the Edge with Windows AI PCs
As AI engineers, we’ve spent years optimizing models for the cloud, scaling inference, wrangling latency, and chasing compute across clusters. But the frontier is shifting. With the rise of Windows AI PCs and powerful local accelerators, the edge is no longer a constraint it’s now a canvas. Whether you're deploying vision models to industrial cameras, optimizing speech interfaces for offline assistants, or building privacy-preserving apps for healthcare, Edge AI is where real-world intelligence meets real-time performance. Why Edge AI, Why Now? Edge AI isn’t just about running models locally, it’s about rethinking the entire lifecycle: - Latency: Decisions in milliseconds, not round-trips to the cloud. - Privacy: Sensitive data stays on-device, enabling HIPAA/GDPR compliance. - Resilience: Offline-first apps that don’t break when the network does. - Cost: Reduced cloud compute and bandwidth overhead. With Windows AI PCs powered by Intel and Qualcomm NPUs and tools like ONNX Runtime, DirectML, and Olive, developers can now optimize and deploy models with unprecedented efficiency. What You’ll Learn in Edge AI for Beginners The Edge AI for Beginners curriculum is a hands-on, open-source guide designed for engineers ready to move from theory to deployment. Multi-Language Support This content is available in over 48 languages, so you can read and study in your native language. What You'll Master This course takes you from fundamental concepts to production-ready implementations, covering: Small Language Models (SLMs) optimized for edge deployment Hardware-aware optimization across diverse platforms Real-time inference with privacy-preserving capabilities Production deployment strategies for enterprise applications Why EdgeAI Matters Edge AI represents a paradigm shift that addresses critical modern challenges: Privacy & Security: Process sensitive data locally without cloud exposure Real-time Performance: Eliminate network latency for time-critical applications Cost Efficiency: Reduce bandwidth and cloud computing expenses Resilient Operations: Maintain functionality during network outages Regulatory Compliance: Meet data sovereignty requirements Edge AI Edge AI refers to running AI algorithms and language models locally on hardware, close to where data is generated without relying on cloud resources for inference. It reduces latency, enhances privacy, and enables real-time decision-making. Core Principles: On-device inference: AI models run on edge devices (phones, routers, microcontrollers, industrial PCs) Offline capability: Functions without persistent internet connectivity Low latency: Immediate responses suited for real-time systems Data sovereignty: Keeps sensitive data local, improving security and compliance Small Language Models (SLMs) SLMs like Phi-4, Mistral-7B, Qwen and Gemma are optimized versions of larger LLMs, trained or distilled for: Reduced memory footprint: Efficient use of limited edge device memory Lower compute demand: Optimized for CPU and edge GPU performance Faster startup times: Quick initialization for responsive applications They unlock powerful NLP capabilities while meeting the constraints of: Embedded systems: IoT devices and industrial controllers Mobile devices: Smartphones and tablets with offline capabilities IoT Devices: Sensors and smart devices with limited resources Edge servers: Local processing units with limited GPU resources Personal Computers: Desktop and laptop deployment scenarios Course Modules & Navigation Course duration. 10 hours of content Module Topic Focus Area Key Content Level Duration 📖 00 Introduction to EdgeAI Foundation & Context EdgeAI Overview • Industry Applications • SLM Introduction • Learning Objectives Beginner 1-2 hrs 📚 01 EdgeAI Fundamentals Cloud vs Edge AI comparison EdgeAI Fundamentals • Real World Case Studies • Implementation Guide • Edge Deployment Beginner 3-4 hrs 🧠 02 SLM Model Foundations Model families & architecture Phi Family • Qwen Family • Gemma Family • BitNET • μModel • Phi-Silica Beginner 4-5 hrs 🚀 03 SLM Deployment Practice Local & cloud deployment Advanced Learning • Local Environment • Cloud Deployment Intermediate 4-5 hrs ⚙️ 04 Model Optimization Toolkit Cross-platform optimization Introduction • Llama.cpp • Microsoft Olive • OpenVINO • Apple MLX • Workflow Synthesis Intermediate 5-6 hrs 🔧 05 SLMOps Production Production operations SLMOps Introduction • Model Distillation • Fine-tuning • Production Deployment Advanced 5-6 hrs 🤖 06 AI Agents & Function Calling Agent frameworks & MCP Agent Introduction • Function Calling • Model Context Protocol Advanced 4-5 hrs 💻 07 Platform Implementation Cross-platform samples AI Toolkit • Foundry Local • Windows Development Advanced 3-4 hrs 🏭 08 Foundry Local Toolkit Production-ready samples Sample applications (see details below) Expert 8-10 hrs Each module includes Jupyter notebooks, code samples, and deployment walkthroughs, perfect for engineers who learn by doing. Developer Highlights - 🔧 Olive: Microsoft's optimization toolchain for quantization, pruning, and acceleration. - 🧩 ONNX Runtime: Cross-platform inference engine with support for CPU, GPU, and NPU. - 🎮 DirectML: GPU-accelerated ML API for Windows, ideal for gaming and real-time apps. - 🖥️ Windows AI PCs: Devices with built-in NPUs for low-power, high-performance inference. Local AI: Beyond the Edge Local AI isn’t just about inference, it’s about autonomy. Imagine agents that: - Learn from local context - Adapt to user behavior - Respect privacy by design With tools like Agent Framework, Azure AI Foundry and Windows Copilot Studio, and Foundry Local developers can orchestrate local agents that blend LLMs, sensors, and user preferences, all without cloud dependency. Try It Yourself Ready to get started? Clone the Edge AI for Beginners GitHub repo, run the notebooks, and deploy your first model to a Windows AI PC or IoT devices Whether you're building smart kiosks, offline assistants, or industrial monitors, this curriculum gives you the scaffolding to go from prototype to production.