adaptive cloud
77 TopicsAnsible + Azure Arc: Use Ansible modules to deploy and manage Azure Arc machine extensions at scale
We are making Azure Arc extensible and increasing the flexibility of the tooling you can use to operate your machines using Azure’s control plane. We are excited to announce new modules in Ansible Galaxy that make it easier to manage Azure Arc machine extensions at scale. With the latest updates to the azure.azcollection on Ansible Galaxy, you no longer need to switch between existing tools. You can now deploy and manage Azure Arc extensions using familiar, declarative Ansible workflows. These new modules include: Azure Arc machine extensions module Azure Arc extensions info module Together, they enable infrastructure and platform teams to automate extension lifecycle management across their hybrid estate—bringing consistency, security, and efficiency to Azure Arc-enabled servers. Why this matters Azure Arc machine extensions power critical scenarios such as security, monitoring, update management, configuration and compliance. Until now, managing these Azure Arc extensions across hybrid estates often required Azure CLI scripts, ARM templates, or manual operations. With these new Ansible modules, you can: Integrate Azure Arc extension management into existing Ansible playbooks Enforce consistent configuration across hybrid servers Reduce operational overhead through declarative automation Align extension deployment with broader configuration management workflows What’s included azure_rm_arcmachineextensions This module allows you to manage the full lifecycle of Azure Arc machine extensions, including: Creating and deploying extensions Updating extension settings Removing extensions when no longer needed You can define extension state declaratively, ensuring consistent enforcement across your Azure Arc-enabled servers. azure_rm_arcmachineextensions_info This module provides visibility into extension state by retrieving: Installed extensions on Azure Arc-enabled machines Provisioning status and configuration details Extension metadata for reporting and validation This is useful for compliance validation, auditing, and conditional automation in playbooks. Scenario: Enforcing identity-based SSH access across a hybrid fleet Consider a regulated enterprise that must ensure all Linux servers—whether on-premises or in a multicloud environment—use Microsoft Entra ID for SSH access. The organization wants to: Eliminate local SSH credentials Enforce centralized identity and access controls Audit access consistently across all environments By combining Azure Arc with Ansible, the organization can deploy the Microsoft Entra SSH for Linux extension across all Azure Arc-enabled servers as part of a standardized playbook, ensuring compliance and reducing operational overhead. Example: Deploy Microsoft Entra SSH for Linux extension Below is an example of using Ansible to deploy the Microsoft Entra SSH extension to an Azure Arc-enabled server: - name: Deploy Entra SSH extension to Arc server hosts: localhost connection: local tasks: - name: Install Entra SSH extension for Linux azure_rm_arcmachineextensions: resource_group: myResourceGroup machine_name: myArcServer name: AADSSHLoginForLinux publisher: Microsoft.Azure.ActiveDirectory type: AADSSHLoginForLinux type_handler_version: "1.0" settings: {} state: present Example: Retrieve extension information Below is an example of using Ansible to retrieve details about your Azure Arc extensions: - name: Get Arc machine extension details hosts: localhost connection: local tasks: - name: Fetch extensions azure_rm_arcmachineextensions_info: resource_group: myResourceGroup machine_name: myArcServer Integrating with existing Ansible workflows If you’re already using Ansible for: OS configuration Patch and update management Application deployment You can now extend those workflows to include Azure Arc extension management—without introducing new tools or processes. This allows you to manage on-premises servers, Edge infrastructure and multicloud environments through a unified automation approach powered by Azure Arc and Ansible. Read more at Enable VM Extensions Using Red Hat Ansible - Azure Arc | Microsoft Learn What’s next These modules are part of our continued investment in making Azure Arc a first-class platform for managing Windows and Linux machines in hybrid and multicloud infrastructure. By bringing extension lifecycle management into Ansible, we’re enabling teams to enforce security, compliance, and operational consistency at scale—using the tools they already trust. Stay connected Join the Azure Arc Monthly Forum here: aka.ms/ArcServerForumSignup Let us know what you’d like to see next in the comments!424Views0likes0CommentsEmbed intelligence into physical systems with smaller form factor infrastructure (preview)
Written by Cosmos Darwin, Azure Edge PM, and Michael MacKenzie, VP of Digital Operations AI is transforming how we work, but so far it's mostly lived on your screen: agents and models assisting with information work. How can that intelligence take on physical work, too? Jobs that happen out in the world, like transporting goods, inspecting equipment, manufacturing products, and serving retail customers. This is already possible today, but developing autonomous robots remains highly complex and specialized. The real breakthrough will come when using AI in physical work is as simple and ubiquitous as it is on a screen. To get there, we need to go beyond software agents and embed intelligence directly into physical systems. Today at Microsoft Build 2026, we're announcing several new capabilities to help organizations everywhere get started. We're extending AI-ready Azure-managed infrastructure to smaller form factor hardware, bringing Foundry Local to it for running local AI agents and models, and adding support for Azure Kubernetes Service and Azure IoT Operations. Demo: a simple robot that thinks for itself Applied in combination, these capabilities can be surprisingly powerful. For Microsoft Build this week, we wanted to show you just how easy this can be. We put together a basic agentic robot using nothing but open-source AI models, commercial off-the-shelf sensors and robot hardware, and the new Azure previews we're announcing today. It's a playful example, but it illustrates what’s possible – check it out: Lightweight deployments on smaller form factor hardware (preview) First, we're extending Azure-based provisioning and management to smaller hardware form factors, using a lightweight, performance-oriented architecture built for AI workloads. Unlike hyperconverged and disaggregated deployments, this doesn’t rely on virtualization, and instead runs Linux (initially Azure Linux) directly on bare metal to host containers. You can choose whichever runtime tools you prefer, like Docker, open source k3s, or fully managed Azure Kubernetes Service. Each deployment is provisioned and managed from the cloud using a new type of resource called Provisioned Machine that looks and behaves a lot like an Azure VM – for example, you can see it in the Azure portal and govern access with Microsoft Entra ID. Over the coming months, we’ll be rolling out more features like update management, metrics, security configuration, and natively configurable child resources for network interfaces and disks. Screenshot of the new Provisioned Machine resource type in Azure portal. Provisioned Machines support lifecycle operations centrally from the Azure portal and APIs. Effectively, you can treat physical machines like cloud resources, removing the need for separate on-site IT tools. This makes it much more practical to scale across many distributed locations. For an organization like Chevron, whose operations span field sites around the world, that’s significant: "Chevron has a growing fleet of industrial edge devices that collect data in the field and increasingly perform local AI processing. Technologies like Azure Local on smaller form factors can help us manage these systems centrally and in a more automated way – reducing complexity compared to the customized OS environments and tools we use today." — Ed Moore, OT Strategist and Distinguished Engineer, Chevron Run agents and models locally with Foundry Local (preview) To embed intelligence into physical systems, Foundry Local is now available as a lightweight container image for Linux infrastructure. Foundry Local provides a consistent way to deploy and run agents and models, including an inference server that runs alongside your app container and exposes an OpenAI-compatible REST endpoint. It also offers a trusted source for the latest open-source models with an extensive online catalog. Although it integrates closely with Microsoft Foundry, at run time everything stays local: there's no round-trip to the cloud. Data stays on the machine, responses start instantly with zero network latency, and inferences continue even without connectivity. There are no per-token costs, either. Optimized for edge and industrial form factors, the new Foundry Local preview automatically detects and uses available accelerators like GPUs (and soon NPUs), lining up the full stack for you, from kernel drivers to user-mode libraries. For example, in our demo above, Foundry Local taps an Nvidia RTX 2000E GPU to deliver snappy inferences in real time. Diagram of the lightweight Linux architecture with container-based Azure services. More popular Azure services In addition to Foundry Local, these popular Azure services are validated too: Azure Kubernetes Service (AKS), the fully-managed enterprise-grade Kubernetes service, now runs directly on bare metal with small form factor deployments – no virtualization layer required. It's the same AKS already available in the cloud and on servers. Once deployed, the cluster looks and works exactly like AKS anywhere else – with Azure-based RBAC, networking, upgrades, monitoring, and even integrations like AKS Fleet Manager – so the controls and tooling you rely on in the cloud extend all the way to the industrial edge. Learn more and join the AKS preview Azure IoT Operations provides a unified data and control plane for physical assets at the edge. It includes a variety of connectors and an industrial-grade MQTT broker where local agents and logic can run – even with intermittent connectivity – to shape operational data into AI-ready forms, act on it autonomously, and connect into broader cloud analytics and AI systems. It provides a no-code graphical interface to configure data flows and contextualize data before sending it to destinations like Microsoft Fabric for Real-Time Intelligence, and allows you to send messages back to the physical machines it’s connected to. It's already generally available, and as seen in our demo above, it now works on small form factor deployments too. Learn more about Azure IoT Operations Choose the hardware that fits your requirements We're delighted to partner with leading makers of edge and industrial computers so you can deploy Azure-managed infrastructure on smaller form factor hardware that’s available to buy today – straight from your preferred vendor or distributor, with no special customization required. We’re partnering with leading makers of AI-ready edge and industrial computers. The most compact and affordable options are the ASUS NUC 14 Pro and 15 Pro. At barely 4 inches square and under 2 pounds, they pack the latest Intel® Core™ Ultra processors into a remarkably trim package, well suited to space-constrained scenarios like retail. Learn more about NUC 15 Pro “With ASUS NUC 14 Pro and 15 Pro, organizations have a powerful yet compact platform for innovation at the edge. When paired with Azure Local, these devices make it easy to deploy, manage, and scale AI workloads at the edge – unlocking real-time intelligence for retail stores and manufacturing environments while maintaining seamless integration with the cloud.” – (ASUS) KuoWei Chao, General Manager of ASUS NUC Business Unit For more flexibility, the industrial-grade Lenovo ThinkEdge SE100 offers expandable storage and networking, plus an optional Nvidia RTX A1000 (8GB) or 2000E (16GB) GPU to accelerate demanding edge AI inferencing. Learn more about ThinkEdge SE100 For the toughest operational and regulatory constraints, the OnLogic Helix 521 offers a fan-less design with no moving parts. Designed, assembled, and supported entirely in the USA, it takes the uncertainty out of meeting stringent supply-chain requirements. Learn more about the Hx521 Get started today We're excited to bring AI-ready infrastructure to where physical work happens, and we genuinely had a lot of fun making the agentic robot demo above. Now it's your turn. Small form factor deployments are available in public preview today, starting in the East US region. There is no charge during the preview. Once your hardware is ready, the Azure-based provisioning experience gets most previewers up and running in about an hour. Instructions to get started are on Microsoft Learn, and if you’d like to engage directly with our team, get in touch here. (If you need to evaluate before committing to hardware, you can spin it up on a virtual machine, though it’s not quite the same as real hardware.) Whether you're bringing intelligence to a fleet of machines, standing up inference next to your data, or building something we haven't even imagined yet, we can't wait to see what you create! - Cosmos & Mike on behalf of our global team in Redmond, Mountain View, Pittsburgh, and Bengaluru821Views2likes0CommentsBuild, deploy, and govern sovereign AI with Foundry Local on Azure Local
Not every AI workload can run in the cloud. For many of our customers, data needs to stay within defined boundaries, connectivity may be limited or absent, and latency, governance, and auditability are non-negotiable. With Foundry Local on Azure Local, you can use the same model catalog, developer workflows, and governance capabilities you know from Azure, while running AI entirely within your own environment where your data resides. Foundry Local provides the model catalog and developer experience. Azure Local provides the customer-managed infrastructure. Azure Arc provides unified policy, governance, and lifecycle management across cloud and local environments. This gives developers a consistent way to build, deploy, and operate AI. The same az commands, the same model catalog, the same Arc policies, all running on hardware you control. Expansion of Foundry Local on Azure Local We're expanding the Foundry Local model offering on Azure Local, with support for multi-node deployments and new agents and tools that run locally, in preview. Deploy and run AI models locally. Run models with Foundry Local in customer-managed environments on Azure Local, across sovereign, private, and edge scenarios, including fully disconnected operation. Choose from a flexible, high-performance model catalog. Access proprietary and community models through Foundry Local, now expanded with vLLM-optimized models alongside ONNX-based offerings. You explore and deploy through the same catalog API experience, then operate locally on Azure Local. Build for production realities. Bring governance, identity, and auditability into your applications while keeping execution inside your controlled boundary. See what’s new in Foundry Local on Azure Local in the Tech Community blog. From intelligence to action: agents and tools inside the enterprise boundary Most production AI use cases need two things: grounded answers and the ability to act on them, without sending data outside the environment. Here's how we're enabling that locally. Preview: Agentic retrieval with Foundry Local: Ground agents in enterprise data using retrieval-augmented generation across local Microsoft 365 services, including Exchange and SharePoint. Read the Tech Community blog to learn more. Preview: Agents and tools with Foundry Local: Build AI systems that reason, retrieve information, and take action within customer-controlled environments. Learn more. Preview: Developer acceleration templates: Jump-start local AI application development with new Foundry solution templates, including local chat experiences and video agents, powered by Azure AI Video Indexer. Read the Tech Community to learn more. GitHub Enterprise Local: Now available in public preview Sovereign AI is also about how systems are built and secured, not just where they run. With GitHub Enterprise Local on Azure Local, you can bring your full software development lifecycle on-premises: Source control and repositories CI/CD pipelines Security and DevSecOps workflows GitHub Enterprise Local deploys entirely within customer-owned infrastructure, so teams get the developer tools they expect without compromising on data residency or operational control. This extends modern DevSecOps practice into sovereign environments and pairs naturally with the AI development workflows above: build, secure, and ship your AI applications within the same boundary where they run. Read the tech community blog to learn more about GitHub Enterprise Local and how to join the preview. Accelerating High-performance AI at the Edge with NVIDIA We are expanding our collaboration with NVIDIA to deliver high-performance AI capabilities directly at the edge. At Build, we are bringing: Azure Local and Foundry Local on NVIDIA-powered GPUs, including NVIDIA RTX PRO 6000 Blackwell Server Edition, with expanded GPU support coming soon Integration with Nemotron models, optimized for enterprise performance A scalable foundation for data-intensive, low-latency workloads This partnership ensures that organizations can run advanced AI workloads where data is generated - without dependency on centralized cloud infrastructure. Hardware options: AI factory configurations are available now in the catalog Alongside our hardware partners, we’re bringing integrated solutions to customers building AI within sovereign environments. The Azure Local hardware catalog now includes AI factory configurations from our OEM partners, including NVIDIA-certified 8xH100 systems, with options from DataON, Dell, HPE, and Lenovo. These configurations are sized for the performance that model serving and agentic workloads require on customer-managed infrastructure. Together with Microsoft, we are advancing sovereign AI by bringing the open NVIDIA Nemotron model family to Microsoft Foundry Local on Azure Local. This collaboration gives organizations a production-ready AI platform that enables them to deploy AI where their data resides while maintaining the governance, control, and performance needed to scale AI across the enterprise.” Kari Briski, VP Generative AI Software Products, NVIDIA ”Sovereign AI is becoming increasingly important for governments, regulated industries, and enterprises that want to use AI while maintaining control of their data, location, and operations. Lenovo’s ThinkAgile MX Series delivers trusted, enterprise-grade infrastructure with global deployment expertise to help customers run AI wherever their data resides. Co-engineered with Foundry Local and Azure Local, this solution provides an optimized platform to deploy, run, and scale AI locally with greater simplicity, consistency, and control, while helping meet strict data residency, security, and compliance requirements." Scott Patti - VP Infrastructure Solutions Group (ISG), Lenovo From AI models to trusted, mission-critical systems: what this unlocks for developers and operators AI is evolving from systems that answer questions to systems that plan, reason, and take action across workloads. These capabilities move AI from a cloud-only assumption to something you can deploy where sensitive work actually happens, with governance and operational controls intact. For our customers, this means you can now: Keep data, identities, and audit trails inside your sovereign boundary. Run AI inference and agentic workloads in connected, intermittently connected, or fully disconnected modes. Apply consistent policy and governance across cloud and local environments through Azure Arc. Use the same Foundry catalog and developer experience you already know, on infrastructure you own. Build, secure, and ship your AI applications with GitHub Enterprise Local, keeping source control, CI/CD, and DevSecOps workflows inside the same sovereign boundary. Resources Join us at Build OD837 Shipping physical AI to the edge with Azure Local and Foundry Local https://github.com/microsoft/build26-OD837 OD839 Foundry Local: AI solutions for industrial and sovereign needs https://github.com/microsoft/build26-OD839 LTG425 Expanding horizons: Foundry Local for devices and on-prem https://build.microsoft.com/en-US/sessions/LTG425 Request to join the Foundry Local on Azure Local preview Hands-on walkthrough: Your first model deployment on Foundry Local on Azure Local: from catalog to inference in 10 minutes | Microsoft Community Hub Read our Tech Community blogs: Foundry Local announcing multi-node and vLLM support Agentic Retrival with Foundry Local blog: https://aka.ms/AgentsAndToolsBuildBlog2026 Code sample / model catalog blog: https://aka.ms/foundry-local-model-catalog-blog For more details on the expanded capabilities of Foundry Local for highly secure environments, contact your Microsoft account team Discover Microsoft Sovereign Cloud Explore product documentation at: Foundry Local models on Azure Local: https://aka.ms/FoundryLocalonAzureLocal_documentation Local Agentic retrieval with Foundry Local: https://aka.ms/edge-agentic-retrieval-docs384Views0likes0CommentsUnlock On-Prem Productivity with Agentic Retrieval in Foundry Local
In today’s connected world, customers expect instant, context-rich interactions, even in environments where cloud connectivity isn’t guaranteed. That’s where Retrieval-Augmented Generation at the edge comes in. Since we launched into public preview, we’ve watched teams across regulated, disconnected, and mission-critical environments push this technology into places cloud GenAI simply couldn’t reach. What we heard back shaped everything in this release: customers don’t just want retrieval. They want reasoning, they want agency, and they want an end-user experience that feels as natural as the one they already use in the cloud. Today at Build 2026, we're excited to introduce Agentic Retrieval, the next evolution of our on-prem RAG platform, enabled by Azure Arc and powered by Foundry language models. Agentic Retrieval is part of Microsoft's Adaptive Cloud approach, which extends Azure capabilities to wherever customer data and workloads actually live, with Edge AI focused on bringing reasoning and grounding to on-prem, distributed, and disconnected environments. Together with Foundry Local, Agentic Retrieval continues to shape Microsoft's Foundry Anywhere commitment: flexibility, resilience, and intelligence wherever customers operate. What’s new at Build 2026 This release introduces three major pillars that work independently or together: Agentic Retrieval engine: a first-party orchestration runtime for planning, reasoning, conversation state, and tool calls over your local data Knowledge: a dedicated layer for organizing, curating, and governing your grounding data, exposed via MCP and connectable to any agentic retrieval layer Chat UI: a production-ready, polished conversational experience that ships as the default UX for Agentic Retrieval and can also be deployed standalone Alongside, we’re delivering the platform upgrades customers asked for: flexible deployment modes (Agentic-only, Knowledge-only, or Combined), BYOM with pluggable backends, Foundry Local model catalog integration, Entra ID support, disconnected-ready, and hybrid search combined with agentic retrieval. Agentic Retrieval: From Answering to Reasoning Classic RAG retrieves, then generates. Agentic Retrieval plans, reasons, and acts, running multi-step retrieval and tool invocation under a first-party orchestration runtime, entirely on your infrastructure. Under the hood it manages query planning, iterative multi-hop retrieval, tool calls via MCP, conversation state, and mandatory grounding with citations and audit logging built in. What customers can achieve: Compliance, policy, and permit workflows for public sector, regulators, and defense operations, with data never leaving sovereign infrastructure Multi-document synthesis across standards, technical manuals, contracts, and field procedures for industrial operators An agentic chat experience for regulated and operational teams (engineers, inspectors, analysts) that reasons like a subject-matter expert Auditable AI for sovereign and mission-critical environments, with every answer traceable to its source Knowledge: A First-Class, Governed Data Layer Great answers start with great knowledge. Knowledge is now a standalone component customers can deploy on its own or alongside Agentic Retrieval, exposed through an MCP wrapper so it can connect to any agentic retrieval layer, ours or yours. This release brings Collections (segmented groups of indexed knowledge with granular access permissions), multi-source ingestion across documents, tables, images, and SharePoint (indexed source moving to public preview), high-fidelity parsing for complex enterprise content, Bring Your Own MCP to connect customer-owned data sources directly into Agentic Retrieval and the chat experience, and governance enforced at the data layer itself. ent view - collections, sources, and permission scopes What customers can achieve: Scope knowledge access to different slices of the same corpus, by plant, site, classification, or jurisdiction Enforce data sovereignty, residency, and regulatory compliance at the knowledge layer itself Ground both first-party Agentic Retrieval and BYO orchestration through a single governed source of truth across distributed sites Keep classified, proprietary, and operational data fully on-prem while delivering premium chat experiences Chat UI: Production-Ready Conversational Experience Agentic Retrieval now ships with a polished, production-ready Chat UI as its default experience, and the same component can be deployed standalone for customers building their own stack on Foundry Local. Highlights include Entra ID authentication (MSAL login, Bearer tokens, user identity display), pluggable backends across AI Foundry, BYOM, or mock mode with zero code changes, Chain-of-Thought visibility and inline citations that make grounding transparent to end users, standalone frontend deployment via Helm chart and container image, and disconnected-ready operation for air-gapped environments. What customers can achieve: Deliver a polished end-user experience to operators, inspectors, and analysts without building UI from scratch Build trust in regulated and industrial workflows through transparent, inspectable reasoning and grounding Run the same UI across air-gapped facilities, sovereign clouds, and connected industrial sites Accelerate rollout across public sector, defense, manufacturing, and other mission-critical environments Why This Release Matters Every update to our on-prem RAG platform has moved us toward a simple conviction: GenAI should be useful wherever customers operate, whether regulated or open, connected or disconnected, centralized or distributed. With Agentic Retrieval, Knowledge, and Chat UI coming together, backed by Foundry on Arc, BYOM, and fully disconnected support, this is no longer “cloud RAG, but local.” It’s an agentic knowledge platform purpose-built for the realities of enterprise data: on-prem, governed, and increasingly autonomous. Learn More Explore Agentic retrieval documentation Read Foundry Local on Azure Local model inferencing blog post For more information reach out to the team at FoundryLocalOnAzure@microsoft.com299Views0likes0CommentsScale On-Prem AI with Foundry Local on Azure Local: Multi-Node Inference and vLLM Support
Since announcing the public preview of Foundry Local on Azure Local for single-node, we’ve seen strong adoption in regulated industries and consistent customer demand to expand the platform for scalable deployments. Today, we’re expanding Foundry Local model offering on Azure Local (preview) with three additions that broaden where and how you can use it: Multi-node scheduling - distribute inference workloads across the GPU capacity in your Azure Local cluster, not just a single node vLLM runtime support - a high-throughput serving engine purpose-built for large language models and concurrent workloads An expanded model catalog - new models available in vLLM optimized format alongside the existing ONNX offerings Together, these additions let you scale to higher concurrency, serve more users from a single endpoint, and run larger models on-premises. They round out Foundry Local on Azure Local into a more complete, production-grade on-premises inference platform - covering a wider range of model sizes, concurrency profiles, and hardware footprints, while preserving the same Kubernetes-native, OpenAI-compatible patterns you're already using. Runs disconnected - no cloud round-trip required Foundry Local on Azure Local is designed to run fully on-premises, including in disconnected and intermittently-connected environments. Model weights, prompts, and inference traffic stay entirely inside your Arc-enabled cluster - there is no per-request call to Azure, no data exfiltration to the cloud, and no dependency on a live WAN to serve inference. Models are cached locally on Persistent Volumes after the first pull. Once cached, the inference endpoint keeps serving even when the WAN is down - across reboots, network outages, and extended disconnected operation. API-key authentication continues working uninterrupted during disconnected periods. Microsoft Entra ID auth resumes seamlessly when connectivity returns. The control plane is local to the cluster. The Foundry Local operator, the model catalog, and the inference runtimes all live inside Azure Local - Arc is used for fleet management and updates, not for the inference data path. For factory floors, offshore platforms, sovereign data centers, classified sites, and remote branch offices where cloud connectivity is unreliable, restricted, or prohibited, this is what makes on-premises AI inference actually viable in production. Multi-node scheduling: more scenarios, more capacity Foundry Local on Azure Local now expands to support multiple nodes in your cluster. The inference operator schedules and manages deployments across the GPU capacity available cluster-wide, so you can: GPU capacity from any node in the cluster, not just a single node’s resources Place inference workloads where the hardware lives, with the operator managing deployments across nodes The same Model Deployment custom resource you already use defines the workload, and it is served through the standard OpenAI-compatible endpoint (POST /v1/chat/completions). The API used to interact with conversational AI models by sending structured messages and receiving model-generated responses. Existing applications work against multi-node deployments with zero code changes. vLLM runtime: high-throughput serving for production workloads Alongside ONNX-GenAI, Foundry Local now offers vLLM as a first-class inference runtime. vLLM is an open-source, high-throughput serving engine that has become the standard for production LLM inference in the cloud. Bringing it to Foundry Local on Azure Local means the same performance characteristics are available on your factory floor, in your sovereign data center, or at your remote site. Why vLLM matters for edge and on-premises inference Capability ONNX-GenAI vLLM Hardware CPU and GPU GPU only Throughput Optimized for single-user, low-latency Optimized for high-throughput, multi-user concurrency Memory management Standard allocation PagedAttention - efficient KV-cache management reduces VRAM waste Continuous batching Not supported Supported - incoming requests are batched dynamically for higher GPU utilization FP8 KV cache Not supported Supported on compatible models and GPUs - roughly doubles token capacity Best for Compact models, CPU-only nodes, single-client scenarios Larger models, multi-user workloads, GPU-equipped clusters Automatic GPU inference tuning with the vLLM planner One of the operational challenges with vLLM is configuration tuning - setting GPU memory utilization, context length, batch sizes, and other parameters for a given model on a given hardware profile. Get it wrong and the pod either OOMs (runs out of memory) on startup or wastes GPU capacity. Foundry Local addresses this with the vLLM planner, an automatic tuning component that inspects the available GPU resources, analyzes the target model's footprint, and generates a memory-safe, high-performance configuration before the model server starts. You declare what model you want to run; the planner figures out how to run it optimally on your hardware. Full configuration reference is in the vLLM planner docs. Identity-based access for multi-user workloads Serving more concurrent users isn't only a throughput problem - it's also an access-control problem. Foundry Local supports two authentication modes side by side on the same endpoint: API keys - primary and secondary keys per deployment, with zero-downtime rotation. Ideal for service-to-service traffic and automated pipelines. Microsoft Entra ID with Azure RBAC - per-identity access using the Cognitive Services OpenAI User role (or any role granting the equivalent data-plane action). JWT validation runs inside the inference pod; authorization is enforced through the cluster's Arc-managed identity. Enable both, and clients can present either credential type in the same Authorization: Bearer header - the platform detects which one was sent and routes to the right validation path. API-key callers also keep working uninterrupted if external connectivity is briefly lost, giving you a natural degradation story for edge and disconnected sites. For a multi-user AI assistant on the factory floor or in a sovereign data center, this is the difference between a shared service account and a per-user audit trail. Expanded model catalog: ONNX and vLLM side by side The Foundry Local model catalog now includes models in both ONNX and vLLM formats. The same model can appear multiple times in the catalog - once per runtime/compute target - so you can pick the build that matches your hardware without leaving the platform. The operator selects the right container image automatically based on the entry you reference. Broader open-model support Beyond the Phi and GPTOSS families, the catalog now includes additional models across multiple open-source lineups that customers have requested for on-prem and sovereign deployments, including Mistral and NVIDIA Nemotron. Both are available as catalog entries, served by the vLLM runtime on GPU, and accessible through the same OpenAI-compatible endpoint you already use. In collaboration with NVIDIA, Foundry Local now supports the latest Nemotron models, optimized for enterprise performance on NVIDIA powered Azure Local hardware including NVIDIA RTX Pro 6000. Nemotron models are tuned for reasoning, instruction-following, and agentic workflows, and run on the vLLM runtime with PagedAttention, continuous batching, and FP8 KV cache on compatible GPUs. The vLLM planner handles GPU memory utilization and context-length sizing automatically. you declare the catalog entry, the platform sizes the deployment to your hardware. Models available in vLLM format (see the model catalog docs for the full, regularly updated list) Model ONNX vLLM Notes Phi-4 ✓ ✓ Microsoft's flagship SLM Phi-4-mini ✓ ✓ Compact, fast inference Phi-4-mini-reasoning ✓ ✓ Chain-of-thought reasoning Phi-4-reasoning — ✓ vLLM-only, reasoning-focused gpt-oss-20b ✓ ✓ Mid-range generative gpt-oss-120b — ✓ Large generative, vLLM-only Mistral-7B-v0.2 ✓ ✓ Popular open-source LLM DeepSeek-R1 (7b/14b) ✓ — Reasoning-focused Qwen2.5 (0.5b–14b) ✓ — Multilingual, coder variants Qwen3 (0.6b–14b) ✓ — Latest generation Whisper (multiple sizes) ✓ — Speech-to-text Nemotron ✓ (CPU) ✓ The catalog now includes a growing list of models across both runtimes. Models in vLLM format are served using the vLLM engine with all its performance benefits - PagedAttention, continuous batching, FP8 KV cache - while ONNX models continue to serve on CPU or GPU through the ONNX-GenAI runtime. Bring-your-own model (BYOM) When you need a model that isn’t in the catalog, bring-your-own model still works the same way: package your model as an OCI artifact in any ORAS-compatible registry (Azure Container Registry, GitHub Container Registry, Docker Hub) and reference it from your ModelDeployment. The operator caches it locally and reuses the cached copy on subsequent deployments. Choosing the right runtime ONNX-GenAI when you're running on CPU-only hardware, serving a single application with a compact model, or need the broadest model compatibility including speech and predictive workloads. vLLM when you have GPU hardware, need to serve concurrent users, want to run larger models, or need production-grade throughput from your inference endpoint. Both runtimes expose the same OpenAI-compatible REST API - the choice is transparent to application code. vLLM ModelDeployment is as simple as this: Everything else - memory utilization, context length, batch sizing - is handled by the vLLM planner. See the model catalog docs for the BYO pattern and full configuration options. What hasn't changed Everything from the public preview remains fully supported: Two installation paths - Azure Arc extension (recommended for fleet management) and Helm chart (for platform engineers who need full control) OpenAI-compatible REST endpoints - POST /v1/chat/completions and standard patterns API key and Microsoft Entra ID authentication - secured with bearer tokens, with the per-identity RBAC model described above TLS-enabled ingress - encrypted traffic in transit Disconnected operation - models cached on local PersistentVolumes continue serving when WAN connectivity drops Bring-your-own predictive models - deploy custom ONNX models from OCI registries Multi-model orchestration - agent-style patterns coordinating multiple local models Your existing ModelDeployment manifests continue to work. Applications targeting the ONNX-GenAI runtime don't need any changes. The new capabilities are additive. Real-world scenarios, now at scale Over the past few months, we’ve partnered with customers in early preview to build and validate real-world scenarios. A consistent theme across these engagements is the need to run AI where data resides—on-premises—while maintaining the governance and consistency enabled by Azure Arc. "In energy operations, AI needs to run where the work happens – at remote facilities, offshore platforms, and field locations where connectivity is often limited, and safety is paramount. Foundry Local gives us a path to bring AI-driven decision-making closer to our operational data, with the governance our industry demands. The ability to deploy and run AI workloads consistently across edge and field environments, even when disconnected, is critical as we advance Chevron's vision for autonomous and intelligent operations." (Chevron) Ed Moore - OT Strategist and Distinguished Engineer With multi-node and vLLM, the scenarios from our initial preview scale to meet production demands: Manufacturing: multi-user quality inspection A quality-control system on a production line previously ran Phi-4-mini for single-station anomaly explanation. With vLLM's continuous batching, the same Foundry Local endpoint now serves 10+ inspection stations concurrently - each sending defect images and sensor telemetry for real-time root-cause analysis - without response-time degradation. Sovereign: identity-scoped document processing A government agency processing sensitive casework needs production-grade throughput and a strict audit trail. Foundry Local serves the workload on-premises across multiple GPU nodes, with per-analyst access enforced through Entra ID and Azure RBAC, so every inference call is tied to a real identity - and no data leaves the cluster. Energy: disconnected multi-user operations An offshore platform runs Foundry Local on a multi-node Azure Local cluster. When WAN connectivity drops, the vLLM-powered endpoint continues serving safety procedure lookups, maintenance guidance, and operational queries to multiple crew members simultaneously - each accessing the inference endpoint from their local application. API-key auth keeps working through the outage; Entra ID resumes seamlessly when the WAN comes back. Getting started If you're already running Foundry Local on Azure Local in the public preview: Once installed the Foundry Local extension is automatically kept up to date, with multi-node and vLLM support included. Browse the updated catalog to discover models available in vLLM format Deploy a vLLM model by setting runtime: vllm in your ModelDeployment manifest Let the vLLM planner optimize - override only the preferences you care about and let the planner handle the rest If you're new to Foundry Local on Azure Local: Follow the get-started code-sample blog to see the end-to-end flow Request preview deployment access to get started Read the documentation for architecture overview and deployment guide What's next Multi-node and vLLM are just the beginning. We're continuing to invest in: Distributed LLM serving with LLM-D - KV-cache-aware routing and disaggregated serving for large models that span multiple nodes Autoscaling for inference workloads - dynamic capacity that follows demand Broader model catalog expansion - more model families, more sizes, more task types Enhanced monitoring and observability for inference workloads Performance optimization for specific Azure Local hardware profiles Expanded GPU hardware validation across the Azure Local catalog We're building Foundry Local to be the production AI inference platform for edge and sovereign environments. Your feedback is shaping every release - keep it coming. Learn more: Foundry Local Model and inferencing on multi node demo Foundry Local for devices (GA) For more information reach out to the team at FoundryLocalOnAzure@microsoft.com241Views0likes0CommentsIntroducing GitHub Enterprise Local (Preview): DevOps for Sovereign and Private Cloud Environments
Across the world, many organizations, particularly in government, defense, financial services, and critical infrastructure, must operate within strict sovereign boundaries, often due to regulatory, security, or disconnected environment requirements. Microsoft’s Sovereign Private Cloud is a customer operated cloud model designed for scenarios where sovereignty, operational control, and resiliency are non negotiable. It enables organizations to operate securely and at scale, even in restricted or disconnected environments, while maintaining governance aligned with regulatory and national obligations. Azure Local is the foundation that makes this possible. With Azure Local, organizations can run critical workloads—including virtual machines, Kubernetes, virtual desktop infrastructure, and AI workloads—on infrastructure they own and control, while still benefiting from Azure consistent management, governance, and lifecycle operations. We’re continuing to expand the set of workloads and capabilities supported on Azure Local to meet the needs of organizations operating in sovereign and highly regulated environments. With Microsoft 365 Local, Azure Local now extends beyond infrastructure to support communication and collaboration workloads, enabling productivity and resiliency even in disconnected or restricted conditions. And with Foundry Local, we are supporting modern AI workloads on Azure Local, bringing advanced AI capabilities to infrastructure customers own and operate. We are excited to announce the public preview of GitHub Enterprise Local, which brings GitHub’s enterprise developer platform into sovereign and private cloud environments. GitHub Enterprise Local is fully hosted on customer owned infrastructure, enabling organizations to modernize application development while keeping source code, build pipelines, and development artifacts entirely within their own operational boundaries. What Is GitHub Enterprise Local? GitHub Enterprise Local enables organizations to deploy GitHub Enterprise Server (GHES) entirely within customer‑owned infrastructure using Azure Local as the underlying private cloud platform. The solution is delivered as a prebuilt virtual machine image that runs on Azure Local and operates fully within the customer’s security and network perimeter. All repositories, metadata, CI/CD workflows, and artifacts remain on‑premises. GitHub Enterprise Local is designed to run without internet connectivity by default, making it suitable for both connected and fully disconnected or air‑gapped environments. At the same time, it preserves a GitHub‑consistent experience for developers, allowing teams to continue using familiar workflows for source control, collaboration, and automation. Developer and Platform Capabilities GitHub Enterprise Local provides a comprehensive set of enterprise developer platform capabilities. Teams can host private repositories, manage organizations, and collaborate through pull requests, branch protection rules, and structured code reviews. Issues, wikis, and project collaboration features are also available, enabling end‑to‑end development workflows within the same platform. GitHub Enterprise Local can run on either a single-node or multi-node Azure Local instance depending on customer needs. Single‑node Azure Local runs GHES as a standalone VM, ideal for preview, PoC, and low‑risk scenarios focused on simplicity and cost efficiency. For production-oriented deployments, the same single GHES VM can run on a multi‑node Azure Local cluster, where Azure Local provides VM‑level high availability and failover. For automation and delivery, GitHub Enterprise Local supports GitHub Actions using self‑hosted runners. This allows organizations to build and run CI/CD pipelines entirely within their own environments, with full control over execution context, dependencies, and network access. GitHub Packages can be used for artifact management, supporting common ecosystems such as npm, NuGet, Maven, and container images. GitHub Enterprise Local extends modern development workflows with AI assisted experiences while keeping sensitive data within customer-controlled environments. Developers can use GitHub Copilot in several ways, including as a standalone experience, through Copilot CLI, and in VS Code. They can choose GitHub-managed models by connecting to GitHub.com, or connecting directly to model providers from Copilot CLI, allowing source code to avoid passing through GitHub Cloud. Foundry Local provides an on-premises inference layer that keeps prompts, code context, and model execution inside organizational boundaries. Together, these capabilities create a clear integration path across code automation and AI application development, enabling organizations to modernize the developer experience while preserving operational control, compliance, and auditability. Developer AI Workflow Architecture This architecture demonstrates how GitHub Enterprise Local serves as the secure, customer-managed foundation for source control, collaboration, and workflow orchestration, enabling developers to layer AI-assisted capabilities through GitHub Copilot, GitHub CLI, and Foundry Local—while ensuring that code, data, and AI execution remain fully within organizational boundaries. Architecture Overview GitHub Enterprise Local follows a layered architecture model. Infrastructure Layer Azure Local forms the foundation, deployed on Azure Local–certified hardware. It provides: The virtualization platform for running GitHub Enterprise Local Infrastructure availability and update management Customer‑controlled networking, identity, and security policies Azure Arc‑enabled management for infrastructure lifecycle operations GitHub Enterprise Local Appliance Layer GitHub Enterprise Server (GHES) is deployed as a prebuilt virtual machine image on Azure Local. This VM includes: The GHES application stack Persistent data disks for repositories and metadata Support for replica‑based failover configurations, depending on customer requirements All application data remains within customer infrastructure boundaries. Operations Layer Operational responsibilities are clearly separated: Azure Local administrators manage the Azure Local infrastructure through Azure GitHub administrators manage GHES configuration, upgrades, user access, and ongoing maintenance through the GitHub Management control and site admin dashboard This separation aligns with common enterprise operational models. Connectivity Modes and Deployment Scenarios GHES is designed to operate fully offline, making it suitable for air‑gapped and restricted environments. Azure Local complements this capability by supporting both connected and fully disconnected operational modes. In connected environments, customers can take advantage of centralized management and monitoring of GHES appliance. In disconnected environments, the entire solution can operate in complete isolation, ensuring compliance with strict sovereignty or security mandates. This flexibility allows organizations to adopt a deployment model that aligns with their regulatory, operational, and security requirements. Hardware and Capacity Planning GitHub Enterprise Local virtual machine sizing depends on customer use cases, including: Number of developers Repository size and growth CI/CD pipeline frequency Artifact storage requirements Azure Local supports running GitHub Enterprise Local on both Integrated and Premier hardware solutions, provided sufficient capacity is available. Customers should plan compute, memory, storage, and network resources accordingly. Minimum recommended requirements Billing Overview GitHub Enterprise Local combines user-based application licensing, Azure Local infrastructure-based billing, and separate pricing for AI services such as Copilot and Foundry. GitHub Enterprise Local is billed per user seat. (GitHub Enterprise license) Azure Local is billed per physical CPU core. (Azure Local Billing) Copilot and Foundry have separate service-based pricing. (GitHub Copilot Plans & pricing) Public Preview Access GitHub Enterprise Local on Azure Local is available today in public preview. Customers can request access by completing the public preview registration form. Submissions are reviewed as part of the preview onboarding process. Participate in public preview: GitHub Enterprise Local Preview Sign-Up Learn More GitHub Enterprise Local documentation986Views0likes0CommentsSimplified access to Hotpatching enabled by Azure Arc for Windows Server 2025
With Windows Server 2025, we introduced hotpatch enabled by Azure Arc, delivering security updates to Windows Server across hybrid and multicloud environments – minimizing downtime (no reboot), accelerating protection, and unifying patch management. We know that keeping your servers updated with the latest patches is one of the critical tasks that IT teams perform day-to-day. We want to make it simpler to install the latest operating system (OS) updates without rebooting machines after every installation. The resounding feedback we have received from you underscored the criticality of this feature in the lifecycle management and security of your infrastructure. We are now taking it one step further to reduce the friction to deploying these critical updates: hotpatch enabled by Azure Arc is now available at no additional cost for Windows Server 2025. Which machines are eligible for this offer? To use hotpatch for Windows Servers running on-premises or in multicloud environments, you must be using Windows Server 2025 Standard or Datacenter, and your server must be connected to Azure Arc. With this announcement, enabling and usage of the hotpatching service is available at no additional charge. Please take note that there are no charges for customers running on Azure IaaS, or Azure Local, wherein hotpatching is available as part of the functionality of Windows Server Datacenter: Azure Edition. This feature is already included both with Windows Server 2022 Datacenter: Azure Edition and Windows Server 2025 Datacenter: Azure Edition. How do I manage hotpatches enabled by Azure Arc for Windows Server 2025? If your Windows Server 2025 machines aren't already connected to Azure Arc, install the Azure Connected Machine agent — it takes just a few minutes per server and supports at-scale rollout via Group Policy, service principal, or Terraform. Once connected, enable Hotpatch from the Azure portal, Azure PowerShell, Azure CLI, or the REST API — just confirm Virtualization-based security (VBS is enabled) first. From there, use Azure Update Manager to schedule and monitor rollouts at scale. For instructions on how to enable hotpatch for Azure Arc-enabled machines using group policy or scripts, learn more here: https://aka.ms/ws-hotpatch For patch orchestration at scale, you can use Azure Update Manager to deliver hotpatches enabled by Azure Arc for Windows server 2025 machines. This enables greater uptime with fewer reboots and faster deployment of updates with easy patch orchestration. Alternatively, you can use APIs or other management tools to manage hotpatches. Centralized management of hotpatch updates across hybrid and multicloud environments enabled by Azure Arc Once your machines are connected to Azure Arc, you can also use the cloud-native services from Azure to manage your windows machines running on-prem. Azure Arc enables you to standardize security and governance across a wide range of resources so you can easily organize, govern and secure Windows, Linux, SQL servers, and Kubernetes clusters running across data centers, edge, and multi-cloud environments – using Azure services such as Azure Policy, Azure Monitor, Microsoft Defender and more. At no additional cost for machines attached to Azure Arc Basic inventory across on-prem and multi-cloud Tag your resources, organize them into resource groups, subscriptions, and management groups, and query at scale with Azure Resource Graph to unify your environments. Infra as Code (Bicep, Terraform) Infra as code for provisioning and management of resources. VM Self Service Perform lifecycle management such as (create, resize, update and delete) and power cycle operations such as (start, stop, and restart on VMware vCenter and System Center Virtual Machine Manager Virtual Machines. Hotpatch for Windows Server 2025 NEW Windows Server hot patching enables you to apply security updates without rebooting, keeping systems secure while maintaining continuous uptime. VM Management Administrate your servers anywhere using SSH for Azure Arc, Run Command, and Custom Script Extension. Mgmt. Services included for no additional costs with Windows Server Software Assurance or Extended Security Updates Azure Update Manager Provides a unified, centralized service to monitor, orchestrate, and automate patching across Azure, on‑prem, and multi‑cloud environments ensuring security, compliance, and minimal downtime at scale. Azure Machine Configuration (Policy) Policy‑driven auditing and enforcement of OS and application settings as code across Azure and hybrid machines—ensuring consistent, compliant state at scale. Including compliance policies like CIS Benchmark and WinRE Change Tracking & Inventory Real‑time visibility into configuration changes and system state across your fleet enabling faster troubleshooting, improved security, and continuous compliance at scale. VM insights from Azure Monitor Delivers a unified, pre‑built observability experience that provides real‑time performance, health, and dependency visibility across VMs—enabling faster troubleshooting, optimization, and capacity planning at scale. Windows Admin Center Unified, browser‑based management plane to securely manage Windows servers, VMs, and hybrid infrastructure from anywhere—simplifying operations and improving efficiency at scale. Best Practices Assessment Continuously evaluation your server configurations against Microsoft-recommended standards to proactively identify risks and provide actionable remediation guidance—improving security, performance, and operational health at scale. Frequently Asked Questions What are hotpatch updates? Hotpatch updates are monthly security updates that take effect without requiring you to restart the device. They contain a full set of security updates equivalent to the standard updates released the same day. What is the hotpatch update cycle? All eligible Windows Server 2025 machines enrolled in hotpatch are offered up to 8 monthly hotpatch updates in a calendar year in a quarterly cycle: Baseline month: In January, April, July, and October, devices install the monthly cumulative security update and must restart for the update to take effect. This update includes the latest security fixes, cumulative new features, and enhancements since the last baseline. Subsequent two months: Devices receive hotpatch updates, which only include security updates and don't require a restart for the update to take effect. These devices will catch up on features and enhancements with the next cumulative baseline month (quarterly). Will billing be stopped for existing enrolled machines? Yes, as of 15 th May 2026 all billing for hotpatch has been stopped for all existing machines enrolled in hotpatch. What action do we need to take if we have machines enrolled in hotpatch already? There is no additional action needed for machines that are currently enrolled in hotpatch. These machines will remain enrolled in hotpatch and receive hotpatch updates when available. I want all my Windows Server 2025 machines to get hotpatches. How do I do it? If you have Windows Server 2025 machines on-premises or on cloud (other than Azure) then you can enable hotpatch on them. To do so, ensure these machines have Virtualization Based Security enabled and are connected to Azure Arc and then you can use Azure Arc portal, Azure Update manager or APIs to enable hotpatch. Learn more: https://aka.ms/ws-hotpatch Is anything changing for Hotpatching on Azure? Hotpatch continues to be available on Azure for your Windows Server 2022 and Windows Server 2025 VMs when using Azure Edition. There is no fee associated with Hotpatching on Azure. Learn more here. Is there a community forum for Arc? Yes, you can join the Azure Arc Monthly Forum here: aka.ms/ArcServerForumSignup3.4KViews10likes5CommentsWhat’s new in Azure Local: Cloud infrastructure for distributed locations enabled by Azure Arc
Today’s enterprises are navigating competing challenges: delivering AI-enabled digital experiences at the edge while also meeting growing demands for data sovereignty and regulatory compliance. Whether it’s a hospital needing local compute for patient care, or a government agency requiring full control over its infrastructure, the need for flexible, secure, and cloud scale solutions has never been greater. That’s why we introduced Azure Local—Microsoft’s solution for running Azure services and workloads at distributed locations, all managed through Azure Arc. With Azure Local, customers can deploy cloud-native and traditional applications on their own infrastructure while maintaining centralized visibility and control through the Azure portal. This approach is resonating: Microsoft has been named a Leader in the Gartner® Magic Quadrant™ for Distributed Hybrid Infrastructure every year since its inception. Azure Local is the foundation of Microsoft’s Sovereign Private Cloud, delivering Azure consistent services in customer controlled environments which meet strict data residency and compliance requirements. Read more about our recent Sovereign announcements here. See the Sovereign Private Cloud come to life here: Today, we’re so excited to tell you about the incredible new capabilities on Azure Local including support for external SAN storage, rack aware clustering, larger scale deployments, and more. Operate and scale with the power of the cloud Azure Local empowers organizations to operate and scale infrastructure with the power of the cloud, no matter where it’s deployed. From the Azure portal, customers can define and deploy infrastructure across distributed locations, apply one-click updates to entire clusters, and centrally monitor performance, health, and security. This cloud-based control plane ensures consistency and agility across environments—whether in datacenters, branch offices, or sovereign sites. NEW: Local Identity with Azure Key Vault (Preview) Azure Local now supports deployments without Active Directory using local identity with Azure Key Vault, currently in preview. This new option simplifies setup by removing the need for domain controllers, while still providing secure access and centralized secret management through Azure. Read the announcement here. Ready for all your apps, VMs and containers alike Azure Local is built to run all your applications—whether they’re virtual machines, containers, or Azure services. It offers full-featured, general-purpose VMs with cloud-consistent management, and includes Azure Kubernetes Service (AKS) built-in for modern containerized workloads. Customers can also deploy some of Azure’s most popular PaaS services like Azure Virtual Desktop, SQL Managed Instance, and Azure IoT Operations directly on Azure Local. With support for GPU-enabled nodes and Arc VM extensions, Azure Local is ready for everything from legacy line-of-business apps to AI-powered workloads. Migrate from VMware to Azure Local (Generally Available) Azure Migrate from VMware to Azure Local is now generally available, enabling customers to seamlessly move VMware virtual machines into their Azure Local infrastructure. This agentless migration path keeps data flows local, minimizes downtime, and simplifies onboarding with a cloud-consistent experience. Customers can discover, replicate, and migrate workloads using the Azure portal, with support for validated hardware and reference architectures. Azure Migrate unlocks a fast path to modernization for organizations consolidating legacy infrastructure. Read the announcement here. Customer Spotlight: How Publix Employees Federal Credit Union strengthened its disaster recovery strategy with Azure Loc... NEW: Microsoft 365 Local to meet your Private Sovereign Cloud needs (Generally Available) Microsoft 365 Local brings trusted productivity services like Exchange Server, SharePoint Server, and Skype for Business Server into customer-controlled environments, running directly on Azure Local infrastructure. Designed for those who need productivity tools in a private cloud environment, it leverages Azure Arc to provide a unified control plane for easy infrastructure management, simplified deployment, and streamlined updates. The solution features a validated reference architecture with certified hardware to ensure optimal performance and reliability, along with a hardened security baseline and robust controls to safeguard your infrastructure. It’s a key part of Microsoft’s Sovereign Private Cloud strategy, now generally available. Read the announcement here. Flexibility to meet your requirements Azure Local gives customers the flexibility to deploy infrastructure that fits their exact needs—whether that’s choosing from over 100 validated hardware platforms in the Azure Local catalog or operating in fully connected or disconnected environments. You can run Azure Local in public Azure regions or in Azure Government cloud, supporting both commercial and regulated workloads. Azure Local adapts to everything from retail edge sites to sovereign datacenters, disconnected oil rigs to connected manufacturing plants, all while maintaining a consistent Azure management experience. NEW: SAN Support (Preview) Azure Local now delivers greater infrastructure flexibility with expanded support for leading external SAN storage solutions, a capability that customers have long sought. Customers can now integrate their existing Fiber Channel-based SAN storage from leading vendors such as Pure Storage, NetApp, Dell, Lenovo, HPE, and Hitachi directly with Azure Local clusters. External storage support allows organizations to achieve high performance, scalability, and resilience while continuing to use their trusted storage infrastructure. It also enables consistent management across virtual machines, AKS clusters, and Arc-enabled services through the familiar Azure experience. Customers now have the freedom to modernize their environments while maximizing the value of their existing investments. Our customers are already exploring the impact this brings to enterprise customers. “We’re excited to partner with Microsoft and their trusted storage vendors to test external storage support for Azure Local,” said David McKenney, VP of Public Cloud Products at TierPoint. “This milestone gives customers greater flexibility to address performance, scalability, resilience, and investment protection needs. It reflects Microsoft’s ongoing dedication to making Azure Local the leading distributed cloud solution by listening to the needs of their customers and partners.” Support for more Storage protocols and other storage capabilities coming soon. Reach out to Microsoft or our storage partners to be part of this limited preview. NEW: Rack Aware Clusters (Preview) Rack aware clustering is now available in preview for Azure Local, enabling intelligent placement and resiliency across multi-rack deployments using one storage pool. This feature allows Azure Local to detect physical rack boundaries and distribute workloads accordingly, improving fault tolerance and minimizing impact from localized hardware failures. It’s especially valuable for larger deployments where high availability and service continuity are critical. Rack awareness integrates seamlessly with Azure Local’s update orchestration and VM placement logic, helping ensure infrastructure stays resilient at scale. Read the announcement here. NEW: Support for NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs (Generally Available) Azure Local now supports the NVIDIA RTX PRO 6000 Blackwell Server Edition GPU, generally available for high-performance workloads including AI inferencing, simulation, and visualization. This enterprise-grade GPU delivers exceptional compute density and energy efficiency, making it ideal for deployments that require advanced acceleration. Customers can deploy this powerful GPU in new Azure Local solutions—including Dell AX-770, Lenovo ThinkAgile MX650a V4, and HPE ProLiant DL380 Gen 12. Read the announcement here. NEW: Azure Local for larger deployments (Preview) Azure Local now scales further, with instances of up to 10,000+ cores across 100+ nodes delivered as multiple integrated racks with disaggregated storage. This enables customers to run the same familiar Azure Arc-enabled infrastructure and services at significantly larger scale, supporting a greater variety of workloads and scenarios. This new capability is available now in preview. Contact your Azure account representatives to learn more. Secure by default Azure Local is built with security at its core, offering a hardened infrastructure stack aligned with Microsoft’s secure-by-default principles, built-in Microsoft Defender for Cloud integration, and trusted launch VMs. Every VM is Azure Arc-enabled, allowing customers to apply security baselines, monitor threats, and enforce policies using familiar Azure tools. These protections are automatically enabled, so customers can operate confidently from day one. Network segmentation (Generally Available) To protect and isolate your network traffic between VMs or logical networks, Azure Local now supports network security groups (NSGs), generally available as of the 2510 release. NSGs enable precise filtering of network traffic using policy-driven access controls by applying inbound and outbound allow/deny rules. Rules support the full five-tuple of source IP, source port, destination IP, destination port, and protocol, and are enforced within the virtual switch at the virtual port level. NSGs can be applied to both logical networks and individual network interfaces and can be managed using the Azure Portal for centralized policy management of your edge workloads. Read the announcement here. Get Started Today For new production deployments Azure Local is generally available for production use. Explore the solutions catalog to find hardware from your preferred vendor and read the deployment overview to get started today. For evaluation (virtual) Want to try out Azure Local but don’t have hardware? Get a dedicated Azure Local sandbox in one click with Azure Arc Jumpstart. All you need is an Azure subscription to get started. Thank you! As we mark the second year since announcing Azure Local, we want to extend a heartfelt thank you to our customers, partners, and community. It’s incredibly rewarding to see Azure Local continue to be the infrastructure of choice for enterprises seeking flexibility, security, and innovation at the edge. We’re excited to continue delivering the solutions you need to thrive in a rapidly evolving world. Thank you for trusting Azure Local to power your most important workloads—here’s to another year of partnership and progress! If you’re at Ignite this week, please come say hello at: Our session dedicated to Azure Local What’s new in Azure Local Our booth “Azure Arc and Azure Local” in the Cloud and AI Platforms neighborhood See everything going on with Adaptive Cloud on our Ignite website Adaptive Cloud @Ignite 2025 FAQ What is Azure Local? Azure Local is Microsoft’s full-stack infrastructure software that runs on validated hardware in your own facilities. It brings Azure capabilities to distributed or sovereign locations, so you can run virtual machines, containers, and select Azure services locally while maintaining a consistent management experience through Azure Arc. How are Azure Local and Private Sovereign Cloud related? Azure Local is the foundation and core product fueling Microsoft’s Private Sovereign Cloud offering. It enables customers to meet strict data residency and regulatory requirements by hosting workloads on-premises, disconnected or semi-connected, while still benefiting from Azure innovation and security. When should I use Azure Local? Use Azure Local when you need modern cloud capabilities in locations where connectivity is limited, data sovereignty is critical, or latency-sensitive applications must run close to where data is generated. It’s ideal for industries like manufacturing, retail, and government that require local control with Azure consistency.11KViews4likes3CommentsAzure Local expands to sovereign-scale infrastructure with disaggregated deployments
As organizations accelerate digital transformation across datacenters, sovereign environments, and edge locations, infrastructure architectures must evolve to meet new operational and regulatory demands. The first feature update of Azure Local in CY 2026 (version 2604) marks a significant step forward—expanding Azure Local as a platform for sovereign private cloud infrastructure, introducing larger scale, disaggregated deployment architectures, expanded storage ecosystem partnerships, and simplified identity capabilities that unlock entirely new infrastructure scenarios from edge locations to enterprise-scale environments. This release is focused on enabling: Sovereign private cloud deployments at scale from single node up to multi-rack infrastructure Infrastructure modernization through SAN reuse and disaggregated architectures Simplified edge deployment without Microsoft Active Directory dependencies Faster lifecycle operations across deployment and update workflows Introducing disaggregated larger scale deployments using SAN storage Azure Local now supports a disaggregated infrastructure architecture, allowing customers to deploy compute and storage resources independently—while continuing to benefit from an Azure-consistent management and operational experience. This enables organizations to scale infrastructure more flexibly separating compute and storage to align with workload demands and long-term growth. This architecture enables: Independent scaling of compute nodes and storage infrastructure SAN‑only and hybrid storage architectures for Azure Local infrastructure and workloads Fibre Channel (FC) connectivity support beginning with 2604 (iSCSI coming soon) With disaggregated deployments and SAN storage, Azure Local clusters can now scale from a single node at the edge to multi-rack environments spanning beyond 16 nodes and up to thousands of nodes, addressing growing demand for large-scale deployments across sovereign, government, defense, and regulated environments. This unlocks new class of Azure -consistent infrastructure deployments at sovereign scale. This unlocks a new class of Azure-consistent infrastructure deployments at sovereign scale. This new capability is generally available with the release of Azure Local 2604. General Availability of SAN Support for Azure Local Support for attaching SAN storage to Azure Local was introduced as public preview back in November 2025. Today this brownfield expansion capability is generally available and allows external SAN devices to be introduced into already deployed Azure Local instances via Fibre Channel (FC)—supporting virtual machines, Kubernetes environments, and Azure Virtual Desktop workloads without requiring disruptive infrastructure changes or full system refresh. Azure Local instances now support the coexistence of Storage Spaces Direct volumes and external SAN volumes. Support for SAN-attached deployments allows organizations to: Reuse existing enterprise SAN investments Modernize infrastructure without replacing existing storage estates Manage rising disk costs associated with hyperconverged architectures Enable workload scenarios that depend on massive storage requirements These innovative capabilities supporting disaggregated deployments and SAN storage are supported by a strong ecosystem of hardware partners. DataON, Dell Technologies, Everpure, HPE, Hitachi Vantara, Lenovo and NetApp are working with Microsoft to deliver configurations, giving customers more flexibility in how they design and scale their infrastructure. General Availability of Local Identity with Azure Key Vault While disaggregated architectures primarily target sovereign and centralized datacenter deployments, Azure Local 2604 also introduces a major advancement for distributed and edge scenarios. With the General Availability of Local Identity with Key Vault, Azure Local can now be provisioned without infrastructure dependencies on Microsoft Active Directory, enabling simplified deployment in disconnected, air-gapped, and regulated environments. This simplifies deployment and adoption, by removing the need for extra hardware running domain controllers and removing the complexity of firewall configurations when installing in isolated network environments. Azure Local 2604 adds support for deploying rack-aware clusters using Local Identity with Azure Key Vault. This combines reduced requirements with the high availability that customers demand across manufacturing, energy, and other industries. This capability removes one of the key barriers to deploying Azure-consistent infrastructure in sovereign and edge environments. Pricing Changes Pricing for multi-rack and sovereign-scale deployments is being introduced as part of this release. Customers should connect with their Microsoft account team to learn more about pricing, configuration options, and early access programs as these offerings continue to actively evolve. Getting started Release 2604 is available for both existing and new Azure Local instances. Review the release note for Azure Local 2604 release here Learn more about disaggregated deployments here Learn more about SAN attach here Learn more about Local Identity with Azure Key Vault here. Learn more about hardware configurations that support disaggregated deployments using the solutions catalog or learn directly from our partners: o DataON: “DataON Premier Solutions for Azure Local provide a premium Azure Local experience that includes deployment, integration, training, and white glove service & support. Our goal is to not only get you up and running quickly but also to help your team to be confident in managing Azure Local.” o Dell Technologies: “Coming Soon, Dell Private Cloud–Microsoft enables a modern disaggregated architecture, simplifying operations across Dell PowerEdge compute, Dell PowerStore storage, and Azure Local.” “Available now, Dell PowerStore delivers high-performance, scalable, and resilient storage for Azure Local, with support for Dell Private Cloud coming soon to make it easier to streamline operations for storage, compute, and your Azure Local license.” o Everpure: “Azure Local now supports external storage with Everpure FlashArray, offering Azure Local customers unprecedented levels of scale, performance and efficiency with the added benefit of seamless hybrid cloud integration with Everpure Cloud in Azure.” o Hitachi Vantara: “Hitachi Vantara VSP and VSP One Block, fully validated to meet Microsoft's Azure Local storage requirements, deliver enterprise SAN reliability for Azure Local.” o HPE: “HPE ProLiant Compute Premier Solutions for Azure Local enable customers to gain full control over data residency, and accelerate innovation with industry-leading performance, security, and management automation.” “HPE Alletra Storage MP B10000 integrated with Azure Local delivers a unified, Azure managed experience with the simplicity of Azure Local plus the advanced data services of a modern enterprise storage platform.” o Lenovo: “Lenovo is expanding its Azure Local portfolio to support disaggregated infrastructure designs that deliver greater choice across compute and storage. The ThinkAgile Disaggregated Solution for Microsoft Azure Local with new compute-only configurations on ThinkAgile MX Series enables customers to integrate ThinkSystem DM, DS, and DG Series storage arrays or bring their own Azure Local validated third party SAN arrays into new or existing Azure Local environments, allowing fully disaggregated, independent scaling using enterprise class Lenovo solutions for sovereign private cloud deployments and emerging AI workloads.” o NetApp: “With Azure Local, NetApp delivers support across NetApp® AFF, ASA, and FAS systems.” Thank you! This first feature release of 2026 is packed with innovation for Azure Local, and we can’t wait for you to try it and share feedback. We are committed to listening to your feedback and delivering the next wave of capabilities in a continuously evolving world. Thank you to all our customers who trust Azure Local to run their business—and to our engineering partners for the incredible collaboration in building solutions together.4.5KViews7likes0CommentsFrom fragmented sites to consistent governance: Azure Arc patterns for adaptive cloud strategy.
In Manufacturing companies, hybrid architectures aren’t transitional—they’re persistent. Most large manufacturers operate across remote plants, branch sites, private datacenters, and Azure. The main challenge manufacturers face isn’t adopting cloud services, it is preventing long‑term operational fragmentation: multiple teams, multiple tools, inconsistent security controls, and uneven governance as the estate grows. When manufacturing IT grows organically, systems end up scattered across factories, edge, and cloud—creating fragmentation instead of flow. Azure Arc addresses this as an architectural control‑plane pattern: it extends Azure management to infrastructure and Kubernetes outside Azure by projecting them into Azure Resource Manager (ARM) so they can be governed using Azure-native primitives such as policy, RBAC, and monitoring. This article describes three architecture patterns that consistently emerge in manufacturing and edge scenarios. Each pattern addresses a distinct set of constraints—ranging from centralized governance across hybrid estates, to plant‑adjacent platforms, to fully disconnected environments—and illustrates how Azure services can be composed to support these realities in a scalable, well‑governed way. Typical manufacturing environments must contend with some or many of the following components: Latency & determinism: plant-floor systems often require local execution Distributed footprint: dozens/hundreds of sites with varying maturity Connectivity variability: some sites are intermittently connected Regulatory & data constraints: some workloads must remain on premises Cloud: Native cloud applications including the AI based research applications, SAP systems, etc. As a result, the estate becomes a mix of Azure + non‑Azure infrastructure. The failure mode isn’t performance—it’s inconsistent operations: different patching methods, different monitoring stacks, and uneven security baselines. Azure Arc is positioned specifically to create unity across that operational model by bringing hybrid resources into the Azure control plane. A helpful way to think about Arc in manufacturing scenario is to separate the control plane and the data plane: Arc enables a centralized control plane by projecting resources, like the ones below, into ARM: Azure Resource Manager (resource inventory, tags, RBAC, Policy) Security posture & compliance (Defender for Cloud, policy initiatives) Observability and operations workflows (Azure Monitor, Update Manager, etc.) Whereas the data plane remains at distributed locations meaning: Workload execution remains at plants, private DCs, or edge sites Kubernetes API endpoints, runtime traffic, OT systems remain local This separation is an architectural lever allowing organizations to standardize governance without forcing workload relocation. A high-level design decision matrix Constraint Recommended starting pattern Why Many sites + inconsistent tooling Arc as distributed control plane Standardizes governance and inventory via ARM projection Plant workloads require local platform Azure Local + Arc Uses Azure Local baseline + Arc integration for operations Connectivity cannot be assumed Disconnected/intermittent design Forces control-plane boundary design + local autonomy Pattern 1 — Azure Arc as the distributed control plane (for VM, SQL severs+ Kubernetes) When this pattern fits Use this pattern when: You need consistent governance across plants, datacenters, and multicloud You can maintain at least periodic connectivity for control-plane sync You want Azure policy/security/monitoring to apply uniformly Architecture intent Azure Arc projects existing bare metal, VM, and Kubernetes infrastructure resources into Azure to handle operations with Azure management and security tools. Azure Arc simplifies governance and management by delivering a consistent multicloud and on-premises management platform experience for Azure services. Once projected, you can operate hybrid resources using Azure-native constructs (inventory, compliance reporting, policy scope) and apply standardized guardrails. From an architectural standpoint, Azure Arc establishes a centralized control plane in Azure (ARM, RBAC, Policy, Resource Graph) and decentralized data plane remaining at plants, datacenters, or edge sites. This separation enables organizations to apply management‑group–scoped policies, standardized tagging, and Defender for Cloud controls consistently across environments, while preserving local execution and latency characteristics required by manufacturing workloads. Why this pattern matters: It moves organizations from managing individual sites to governing the entire estate as one. It minimizes operational drift as environments expand across plants and edge locations. Centralized control simplifies enforcement of standards without slowing local operations. The pattern creates predictability at scale in highly distributed environments. It establishes a stable foundation for future modernization initiatives. Pattern 2 — Azure Local + Azure Arc (plant-adjacent platform pattern) When this pattern fits Use this pattern when: Workloads must run on premises for latency, sovereignty, or operational control You want cloud-consistent operations without creating a separate tooling island You need a standardized platform for virtualized + containerized workloads at sites You need the local AI inferencing where data needs to be processed at the source/plant site Architecture intent Azure Local Microsoft’s distributed infrastructure solution that extends Azure capabilities to customer-owned environments. It facilitates the local deployment of both modern and legacy applications across distributed or sovereign locations. Azure Local accelerates cloud and AI innovation by seamlessly delivering new applications, workloads, and services from cloud to edge, using Azure Arc as the unifying control plane. From an architectural perspective, Azure Local serves as the local data plane for applications—supporting general‑purpose virtual machines, managed Kubernetes (AKS), and selected Azure services—while Azure Arc extends the Azure control plane to that environment for inventory, policy, monitoring, and security integration. This separation allows workloads to run close to manufacturing systems without creating a parallel or disconnected operational model. Azure Local supports a broad spectrum of workload types on the same platform foundation, including: Traditional line‑of‑business applications on virtual machines Modern containerized workloads using AKS on Azure Local Azure‑consistent platform services that can be deployed locally, such as Azure Virtual Desktop and SQL Managed Instance GPU‑accelerated workloads for AI inferencing and computer vision scenarios Why this pattern matters: Without a platform like Azure Local integrated through Azure Arc, on‑premises manufacturing workloads tend to evolve into bespoke environments with inconsistent security, monitoring, and lifecycle management—making long‑term scale and governance increasingly difficult. Pattern 3 — Disconnected edge workloads (connectivity-constrained design) When this pattern fits Use this pattern when: Sites cannot assume continuous connectivity Local autonomy is required for safety or production continuity You still want centralized governance when connected Architecture intent In manufacturing and edge scenarios, some environments must operate without continuous internet connectivity due to regulatory constraints, physical isolation, or operational risk tolerance. In these cases, architectures must assume that cloud control‑plane access is intermittent or unavailable, while local execution must continue without disruption. Disconnected architectures shift the primary design concern from availability of services to autonomy of execution. This pattern applies to environments that are fully offline, intermittently connected, or explicitly restricted from sending data to public cloud endpoints. Azure supports this model through Disconnected-containers, where containerized services are deployed and operated fully offline. Once provisioned, these containers run entirely on local infrastructure with no runtime dependency on Azure endpoints, enabling uninterrupted execution even during extended disconnection periods. Disconnected containers are offered through commitment tier pricing, each offering a discounted rate compared to the Standard pricing model. Learn more about pricing here: Plan and Manage Costs - Microsoft Foundry | Microsoft Learn Before attempting to run a Docker container in an offline environment, make sure you know the steps to successfully download and use the container. For example: Host computer requirements and recommendations. The Docker pull command you use to download the container. How to validate that a container is running. How to send queries to the container's endpoint once it's running. Why this pattern matters: This pattern matters because not all environments can rely on continuous connectivity. It enables critical workloads to operate independently at the edge while remaining aligned to central governance when connectivity is available. The pattern prioritizes local autonomy without sacrificing architectural discipline. It reduces operational risk in constrained or disconnected sites. This approach ensures resilience and continuity in environments where connectivity cannot be assumed. Manufacturing IT will remain distributed by design. The risk is not hybrid complexity, but fragmented operations. By centralizing the control plane while keeping execution local, Arc enables consistent security, compliance, and operations across cloud, datacenter, and edge.758Views0likes0Comments