adaptive cloud
47 TopicsIntroducing Azure Local: cloud infrastructure for distributed locations enabled by Azure Arc
Today at Microsoft Ignite 2024 we're introducing Azure Local, cloud-connected infrastructure that can be deployed at your physical locations and under your operational control. With Azure Local, you can run the foundational Azure compute, networking, storage, and application services locally on hardware from your preferred vendor, providing flexibility to meet your requirements and budget.86KViews24likes26CommentsAnnouncing General Availability: Windows Server Management enabled by Azure Arc
Windows Server Management enabled by Azure Arc offers customers with Windows Server licenses that have active Software Assurances or Windows Server licenses that are active subscription licenses the following key benefits: Azure Update Manager Azure Change Tracking and Inventory Azure Machine Configuration Windows Admin Center in Azure for Arc Remote Support Network HUD Best Practices Assessment Azure Site Recovery (Configuration Only) Upon attestation, customers receive access to the following at no additional cost beyond associated networking, compute, storage, and log ingestion charges. These same capabilities are also available for customers enrolled in Windows Server 2025 Pay as you Go licensing enabled by Azure Arc. Learn more at Windows Server Management enabled by Azure Arc - Azure Arc | Microsoft Learn or watch Video: Free Azure Services for Non-Azure Windows Servers Covered by SA Powered by Azure Arc! To get started, connect your servers to Azure Arc, attest for these benefits, and deploy management services as you modernize to Azure's AI-enabled set of server management capabilities across your hybrid, multi-cloud, and edge infrastructure!18KViews10likes10CommentsEvolving Stretch Clustering for Azure Local
Stretched clusters in Azure Local, version 22H2 (formerly Azure Stack HCI, version 22H2) entail a specific technical implementation of storage replication that spans a cluster across two sites. Azure Local, version 23H2 has evolved from a cloud-connected operating system to an Arc-enabled solution with Arc Resource Bridge, Arc VM, and AKS enabled by Azure Arc. Azure Local, version 23H2 expands the requirements for multi-site scenarios beyond the OS layer, while Stretched clusters do not encompass the entire solution stack. Based on customer feedback, the new Azure Local release will replace the Stretched clusters defined in version 22H2 with new high availability and disaster recovery options. For Short Distance Rack Aware Cluster is a new cluster option which spans two separate racks or rooms within the same Layer-2 network at a single location, such as a manufacturing plant or a campus. Each rack functions as a local availability zone across layers from OS to Arc management including Arc VMs and AKS enabled by Azure Arc, providing fault isolation and workload placement within the cluster. The solution is configured with one storage pool to reduce additional storage replication and enhance storage efficiency. This solution delivers the same Azure deployment and management experience as a standard cluster. This setup is suitable for edge locations and can scale up to 8 nodes, with 4 nodes in each rack. Rack Aware Cluster is currently in private preview and is slated to public preview and general release in 2025. For Long Distance Azure Site Recovery can be used to replicate on-premises Azure Local virtual machines into Azure and protect business-critical workloads. This allows Azure cloud to serve as a disaster recovery site, enabling critical VMs to be failed over to Azure in case of a local cluster disaster, and then failed back to the on-premises cluster when it becomes operational again. If you cannot fail over certain workloads to cloud and require long distance of disaster recovery, like in two different cities, you can leverage Hyper-V Replica to replicate Arc VMs to the secondary site. Those VMs will become Hyper-V VMs on the secondary site, they will become Arc VMs once they fail back to the original cluster on the first site. Additional Options beyond Azure Local If the above solutions in Azure Local do not cover your needs, you can fully customize your solution with Windows Server 2025 which introduces several advanced hybrid cloud capabilities designed to enhance operational flexibility and connectivity across various environments. Additionally, it offers various replication technologies like Hyper-V Replica, Storage Replica and external SAN replication that enable the development of tailored datacenter disaster recovery solutions. Learn more from the Windows Server 2025 now generally available, with advanced security, improved performance, and cloud agility - Microsoft Windows Server Blog What to do with existing Stretched clusters on version 22H2 Stretched clusters and Storage Replica are not supported in Azure Local, version 23H2 and beyond. However, version 22H2 stretched clusters can stay in supported state in version 23H2 by performing the first step of operating system upgrade as shown in the following diagram to 23H2 OS. The second step of the solution upgrade to Azure Local is not applicable to stretched clusters. This provides extra time to assess the most suitable future solution for your needs. Please refer to the About Azure Local upgrade to version 23H2 - Azure Local | Microsoft Learn for more information on the 23H2 upgrade. Refer the blog on Upgrade from Azure Stack HCI, version 22H2 to Azure Local | Microsoft Community Hub. Conclusion We are excited to be bringing Rack Aware Clusters and Azure Site Recovery to Azure Local. These high availability and disaster recovery options allow customers to address various scenarios with a modern cloud experience and simplified management.14KViews16likes0CommentsExtending Azure's AI Platform with an adaptive cloud approach
Authored by Derek Bogardus and Sanjana Mohan, Azure Edge AI Product Management Ignite 2024 is here, and nothing is more top of mind for customers than the potential to transform their businesses with AI wherever they operate. Today, we are excited to announce the preview of two new Arc-enabled services that extend the power of Azure’s AI platform to on-premises and edge environments. Sign up to join the previews here! An adaptive cloud approach to AI The goal of Azure’s adaptive cloud approach is to extend just enough Azure to customers’ distributed environments. For many of these customers, valuable data is generated and stored locally, outside of the hyperscale cloud, whether due to regulation, latency, business continuity, or simply the large volume of data being generated in real time. AI inferencing can only occur where the data exists. So, while the cloud has become the environment of choice for training models, we see a tremendous need to extend inferencing services beyond the cloud to enable complete cloud-to-edge AI scenarios. Search on-premises data with generative AI Over the past couple of years, generative AI has come to the forefront of AI innovation. Language models give any user the ability to interact with large, complex data sets in natural language. Public tools like ChatGPT are great for queries about general knowledge, but they can’t answer questions about private enterprise data on which they were not trained. Retrieval Augmented Generation, or "RAG", helps address this need by augmenting language models with private data. Cloud services like Azure AI Search and Azure AI Foundry simplify how customers can use RAG to ground language models in their enterprise data. Today, we are announcing the preview of a new service that brings generative AI and RAG to your data at the edge. Within minutes, customers can deploy an Arc extension that contains everything needed to start asking questions about their on-premises data, including: Popular small and large language models running locally with support for both CPU and GPU hardware A turnkey data ingestion and RAG pipeline that keeps all data completely local, with RBAC controls to prevent unauthorized access An out-of-the-box prompt engineering and evaluation tool to find the best settings for a particular dataset Azure-consistent APIs to integrate into business applications, as well as a pre-packaged UI to get started quickly This service is available now in gated private preview for customers running Azure Local infrastructure, and we plan to make it available on other Arc-enabled infrastructure platforms in the near future. Sign up here! Deploy curated open-source AI models via Azure Arc Another great thing about Azure’s AI platform is that it provides a catalog of curated AI models that are ready to deploy and provide consistent inferencing endpoints that can be integrated directly into customer applications. This not only makes deployment easy, but customers can also be confident that the models are secure and validated These same needs exist on the edge as well, which is why we are now making a set of curated models deployable directly from the Azure Portal. These models have been selected, packaged, and tested specifically for edge deployments, and are currently available on Azure Local infrastructure. Phi-3.5 Mini (3.8 billion parameter language model) Mistral 7B (7.3 billion parameter language model) MMDetection YOLO (object detection) OpenAI Whisper Large (speech to text) Google T5 Base (translation) Models can be deployed from a familiar Azure Portal wizard to an Arc AKS cluster running on premises. All available models today can be run on just a CPU. Phi-3.5 and Mistral 7B also have GPU versions available for better performance. Once complete, the deployment can be managed directly in Azure ML Studio, and an inferencing endpoint is available on your local network. Wrap up Sign up now to join either of the previews at the link below or stop by and visit us in person in the Azure Arc and Azure Local Expert Meet Up station in the Azure Infrastructure neighborhood at Ignite. We’re excited to get these new capabilities into our customers’ hands and hear from you how it’s going. Sign up to join the previews here5.2KViews7likes2CommentsIgnite 2024: AKS enabled by Azure Arc - New Capabilities and Expanded Workload Support
Microsoft Ignite 2024 has been a showcase of innovation across the Azure ecosystem, bringing forward major advancements in AI, cloud-native applications, and hybrid cloud solutions. This year’s event featured key updates, including enhancements to AKS enabled by Azure Arc, which introduced new capabilities and expanded workload support. These updates reinforce the value and versatility that AKS enabled by Azure Arc brings to organizations looking to scale and optimize their operations. With these advancements, AKS Arc continues to support seamless management, increased scalability, and enhanced workload performance across diverse infrastructures. AKS Enabled by Azure Arc AKS enabled by Azure Arc brings the power of Azure’s managed Kubernetes service to any environment, providing consistent management and security across on-premises, edge, and multi-cloud deployments. It encompasses: AKS on Azure Local: A full-featured Kubernetes platform integrated with Azure Local for comprehensive container orchestration in hybrid setups. Notably, AKS on Azure Local has earned recognition as a leader in the 2024 Gartner Magic Quadrant for Distributed Hybrid Infrastructure, underscoring Microsoft's dedication to delivering comprehensive, enterprise-ready solutions for hybrid cloud deployments. AKS Edge Essentials: A lightweight version designed for edge computing, ensuring operational consistency on constrained hardware. AKS on Azure Local Disconnected Operations: It is now available on Azure Local Disconnected Operations. This latest addition to AKS enabled by Azure Arc portfolio is the support for fully disconnected scenario. It allows AKS enabled by Azure Arc to operate in air-gapped, isolated environments without the need for continuous Azure connectivity. It is crucial for organizations that require secure, self-sufficient Kubernetes operations in highly controlled or remote locations. With this support, businesses can maintain robust Kubernetes functionality while meeting stringent compliance and security standards. Key Features and Expanded Workload Support This year's Ignite announcements unveiled a series of public preview and GA features that enhance the capabilities of AKS enabled by Azure Arc. These advancements reflect our commitment to delivering robust, scalable solutions that meet the evolving needs of our customers. Below are the key highlights that showcase the enhanced capabilities of AKS enabled by Azure Arc: Edge Workload Azure IoT Operations - enabled by Azure Arc: Available on AKS Edge Essentials (AKS-EE) and AKS on Azure Local with public preview support. Azure IoT Operations in the management and scaling of IoT solutions. It provides robust support for deploying and overseeing IoT applications within Kubernetes environments, enhancing operational control and scalability. Organizations can leverage this tool to maintain seamless management of distributed IoT workloads, ensuring consistent performance and simplified scaling across diverse deployment scenarios. Azure Container Storage - enabled by Azure Arc: Available on both AKS Edge Essentials (AKS-EE) and AKS on Azure Local, this support enables seamless integration for persistent storage needs in Kubernetes environments. It provides scalable, reliable, and high-performance storage solutions that enhance data management and support stateful applications running in hybrid and edge deployments. This addition ensures that organizations can efficiently manage their containerized workloads with robust storage capabilities. Azure Key Vault Secret Store extension for Kubernetes: Now available as public preview on AKS Edge Essentials and AKS on Azure Local, this extension automatically synchronizes secrets from an Azure Key Vault to an AKS enabled by Azure Arc cluster for offline access, providing essential tools for proactive monitoring and policy enforcement. It offers advanced security and compliance capabilities tailored for robust governance and regulatory adherence, ensuring that organizations can maintain compliance with industry standards and best practices while safeguarding their infrastructure. Azure Monitor Pipeline: The Azure Monitor pipeline is a data ingestion solution designed to provide consistent, centralized data collection for Azure Monitor. Once deployed for AIO on AKS cluster enabled by Azure Arc, it enables at-scale telemetry data collection and routing at the edge. The pipeline can cache data locally, syncing with the cloud when connectivity is restored, and supports segmented networks where direct data transfer to the cloud isn’t possible. Built on OpenTelemetry Collector, the pipeline’s configuration includes data flows, cache properties, and destination rules defined in the DCR to ensure seamless data processing and transmission to the cloud. Arc Workload Identity Federation: Now available as public preview on AKS Edge Essentials and AKS on Azure Local, providing secure federated identity management to enhance security for customer workloads Arc Gateway: Now available as public preview for AKS Edge Essentials and AKS on Azure Local. Arc Gateway support on AKS enabled by Azure Arc enhances secure connectivity across hybrid environments, reducing required firewall rules and improving security for customer deployments. Azure AI Video Indexer - enabled by Azure Arc: Supported on AKS Edge Essentials and AKS on Azure Local. Arc-enabled Video Indexer enables comprehensive AI-powered video analysis, including transcription, facial recognition, and object detection. It allows organizations to deploy sophisticated video processing solutions within hybrid and edge environments, ensuring efficient local data processing with improved security and minimal latency. MetalLB - Azure Arc Extension: Now supported on AKS Edge Essentials and AKS on Azure Local, MetalLB ensures efficient load balancing capabilities. This addition enhances network resilience and optimizes traffic distribution within Kubernetes environments. Comprehensive AI and Machine Learning Capabilities GPUs for AI Workloads: Now AKS enabled by Azure Arc supports a range of GPUs tailored for AI and machine learning workloads with GPU Partitioning) and GPU Passthrough Virtualization support. These options enable robust performance for resource-intensive AI and machine learning workloads, allowing for efficient use of GPU resources to run complex models and data processing tasks. Arc-enabled Azure Machine Learning: Support on AKS on Azure Local, AML capabilities for running sophisticated AI models. Businesses can leverage Azure’s powerful machine learning tools seamlessly across different environments, enabling them to develop, deploy, and manage machine learning models effectively on-premises and at the edge. Arc-enabled Video Indexer: It extends Azure's advanced video analytics capabilities to AKS enabled by Azure Arc. Organizations can now process and analyze video content in real-time, harnessing Azure's robust video AI tools to enhance video-based insights and operations. This support provides businesses with greater flexibility to conduct video analysis seamlessly in remote or hybrid environments Kubernetes AI Toolchain Orchestrator (Kaito + LoRA + QLoRA): Fully validated and support for fine-tuning and optimizing AI models, Kaito, LoRA and QLoRA are designed for edge deployments such as AKS on Azure Local. This combination enhances the ability to run and refine AI applications effectively in edge environments, ensuring performance and flexibility. Flyte Integration: Now supported on AKS on Azure Local, Flyte offers a scalable orchestration platform for managing machine learning workflows. This capability enables teams to build, execute, and manage complex AI pipelines efficiently, enhancing productivity and simplifying the workflow management process. Enhanced Infrastructure and Operations Management Infrastructure as Code (IaC) with Terraform: Now supported on AKS on Azure Local for both Connected and Air-gapped scenario, providing streamlined deployment capabilities through code. This support enables teams to automate and manage their Kubernetes infrastructure at scale more efficiently with Terraform. Anti-affinity, Pod CIDR, Taints/Labels: Available on AKS on Azure Local, these features provide enhanced infrastructure capabilities by allowing refined workload placement and advanced network configuration. Anti-affinity rules help distribute pods across different nodes to avoid single points of failure, while Pod CIDR simplifies network management by allocating IP ranges to pods. Taints and labels offer greater control over node selection, ensuring that specific workloads run on designated nodes and enhancing the overall efficiency and reliability of Kubernetes operations. Optimized Windows Node Pool Management: AKS enabled by Azure Arc now includes the capability to enable and disable Windows node pools for clusters. This enhancement helps prevent unnecessary binary downloads, benefiting customers with low-speed or limited internet connection. It optimizes resource usage, reduces bandwidth consumption, and enhances overall deployment efficiency, making it ideal for environments with network constraints. Kubernetes Development AKS-WSL: With AKS-WSL, developers can set up a local environment that mimics the experience of working with AKS. This makes it easier for developers to write, debug, and test Kubernetes applications locally before deploying them to a full AKS cluster. AKS-WSL VSCode Extension: The Visual Studio Code extension for AKS-WSL allows developers to write, debug, and deploy Kubernetes applications locally, streamlining development workflows. This setup improves productivity by providing efficient tools and capabilities, making it easier to develop, test, and refine Kubernetes workloads directly from a local machine. Arc Jumpstart: Supported AKS Edge Essentials and AKS on Azure Local. Arc Jumpstart simplifies deployment initiation, providing developers with a streamlined way to set up and start working with Kubernetes environments quickly. It makes it easier for teams to evaluate and experiment with AKS enabled by Azure Arc, offering pre-configured scenarios and comprehensive guidance. By reducing complexity and setup time, Arc Jumpstart enhances the developer experience, facilitating faster prototyping and smoother onboarding for new projects in hybrid and edge settings. Conclusion Microsoft Ignite 2024 has underscored the continued evolution of AKS enabled by Azure Arc, bringing more comprehensive, scalable, and secure solutions to diverse environments. These advancements support organizations in running cloud-native applications anywhere, enhancing operational efficiency and innovation. We welcome your feedback (aksarcfeedback@microsoft.com) and look forward to ongoing collaboration as we continue to evolve AKS enabled by Azure Arc.4KViews5likes0CommentsUpgrade Azure Local operating system to new version
Today, we’re sharing more details about the end of support for Azure Local, with OS version 25398.xxxx (23H2) on October 31, 2025. After this date, monthly security and quality updates stop, and Microsoft Support remains available only for upgrade assistance. Your billing continues, and your systems keep working, including registration and repair. There are several options to upgrade to Azure Local, with OS version 26100.xxxx (24H2) depending on which scenario applies to you. Scenario #1: You are on Azure Local solution, with OS version 25398.xxxx If you're already running the Azure Local solution, with OS version 25398.xxxx, there is no action required. You will automatically receive the upgrade to OS version 26100.xxxx via a solution update to 2509. Azure Local, version 23H2 and 24H2 release information - Azure Local | Microsoft Learn for the latest version of the diagram. If you are interested in upgrading to OS version 26100.xxxx before the 2509 release, there will be an opt-in process available in the future with production support. Scenario #2: You are on Azure Stack HCI and haven’t performed the solution upgrade yet Scenario #2a: You are still on Azure Stack HCI, version 22H2 With the 2505 release, a direct upgrade path from version 22H2 OS (20349.xxxx) to 24H2 OS (26100.xxxx) has been made available. To ensure a validated, consistent experience, we have reduced the process to using the downloadable media and PowerShell to install the upgrade. If you’re running Azure Stack HCI, version 22H2 OS, we recommend taking this direct upgrade path to the version 24H2 OS. Skipping the upgrade to the version 23H2 OS will be one less upgrade hop and will help reduce reboots and maintenance planning prior to the solution upgrade. After then, perform post-OS upgrade tasks and validate the solution upgrade readiness. Consult with your hardware vendor to determine if version 24H2 OS is supported before performing the direct upgrade path. The solution upgrade for systems on the 24H2 OS is not yet supported but will be available soon. Scenario #2b: You are on Azure Stack HCI, version 23H2 OS If you performed the upgrade from Azure Stack HCI, version 22H2 OS to version 23H2 OS (25398.xxxx), but haven’t applied the solution upgrade, then we recommend that you perform post-OS upgrade tasks, validate the solution upgrade readiness, and apply the solution upgrade. Diagram of Upgrade Paths Conclusion We invite you to identify which scenarios apply to you and take action to upgrade your systems. On behalf of the Azure Local team, we thank you for your continuous trust and feedback! Learn more To learn more, refer to the upgrade documentation. For known issues and remediation guidance, see the Azure Local Supportability GitHub repository.3KViews4likes9CommentsCloud infrastructure for disconnected environments enabled by Azure Arc
Organizations in highly regulated industries such as government, defense, financial services, healthcare, and energy often operate under strict security and compliance requirements and across distributed locations, some with limited or no connectivity to public cloud. Leveraging advanced capabilities, including AI, in the face of this complexity can be time-consuming and resource intensive. Azure Local, enabled by Azure Arc, offers simplicity. Azure Local’s distributed infrastructure extends cloud services and security across distributed locations, including customer-owned on-premises environments. Through Azure Arc, customers benefit from a single management experience and full operational control that is consistent from cloud to edge. Available in preview to pre-qualified customers, Azure Local with disconnected operations extends these capabilities even further – enabling organizations to deploy, manage, and operate cloud-native infrastructure and services in completely disconnected or air-gapped networks. What is disconnected operations? Disconnected operations is an add-on capability of Azure Local, delivered as a virtual appliance, that enables the deployment and lifecycle management of your Azure Local infrastructure and Arc-enabled services, without any dependency on a continuous cloud connection. Key Benefits Consistent Azure Experience: You can operate your disconnected environment using the same tools you already know - Azure Portal, Azure CLI and ARM Templates extended through a local control plane. Built-in Azure Services: Through Azure Arc, you can deploy, update, and manage Azure services such as Azure Local VMs, Azure Kubernetes Service (AKS), etc. Data Residency and Control: You can govern and keep data within your organization's physical and legal jurisdiction to meet data residency, operational autonomy, and technological isolation requirements. Key Use Cases Azure Local with disconnected operations unlocks a range of impactful use cases for regulated industries: Government and Defense: Running sensitive government workloads and classified data more securely in air-gapped and tactical environments with familiar Azure management and operations. Manufacturing: Deploying and managing mission-critical applications like industrial process automation and control systems for real-time optimizations in more highly secure environments with zero connectivity. Financial Services: Enhanced protection of sensitive financial data with real time data analytics and decision making, while ensuring compliance with strict regulations in isolated networks. Healthcare: Running critical workloads with a need for real-time processing, storing and managing sensitive patient data with the increased levels of privacy and security in disconnected environments Energy: Operating critical infrastructure in isolated environments, such as electrical production and distribution facilities, oil rigs, or remote pipelines. Here is an example of how disconnected operations for Azure Local can provide mission critical emergency response and recovery efforts by providing essential services when critical infrastructure and networks are unavailable. Core Features and capabilities Simplified Deployment and Management Download and deploy the disconnected operations virtual appliance on Azure Local Premier Solutions through a streamlined user interface. Create and manage Azure Local instances using the local control plane, with the same tooling experience as Azure. Offline Updates The monthly update package includes all the essential components: the appliance, Azure Local software, AKS, and Arc-enabled service agents. You can update and manage the entire Azure Local instance using the local control plane without an internet connection. Monitoring Integration You can monitor your Azure Local instances and VMs using external monitoring solutions like SCOM by installing custom management packs and monitor AKS Clusters through 3 rd party open-source solutions like Prometheus and Grafana. Run Mission-Critical Workloads – Anytime, Anywhere Azure Local VMs You can run VMs with flexible sizing, support for custom VM images, and high availability through storage replication and automatic failover – all managed through the local Azure interface. AI & Containers with AKS You can use disconnected AI containers with Azure Kubernetes Service (AKS) on Azure Local to deploy and manage AI applications in disconnected scenarios where data residency and operational autonomy is required. AKS enables the deployment and management of containerized applications such as AI agents and models, deep learning frameworks, and related tools, which can be leveraged for inferencing, fine-tuning, and training in isolated networks. AKS also automates resource scaling, allowing for the dynamic addition and removal of container instances to more efficiently utilize hardware resources, including GPUs, which are critical for AI workloads. This provides consistent Azure experience in managing Kubernetes clusters and AI workloads with the same tooling and processes in connected environments. Get Started: Resources and Next Steps Microsoft is excited to announce the upcoming preview of Disconnected Operations for Azure Local in Q3 ‘CY25 for both Commercial and Government Cloud customers. To Learn more, please visit Disconnected operations for Azure Local overview (preview) - Azure Local Ready to participate? Get Qualified! or contact your Microsoft account team. Please also check out this session at Microsoft Build https://build.microsoft.com/en-US/sessions/BRK195 by Mark Russinovich, one of the most influential minds in cloud computing. His insights into the latest Azure innovations, the future of cloud architecture and computing, is a must-watch event!2.3KViews7likes3CommentsAKS Arc - Optimized for AI Workloads
Overview Azure is the world’s AI supercomputer providing the most comprehensive AI capabilities ranging from infrastructure, platform services to frontier models. We’ve seen emerging needs among Azure customers to use the same Azure-based solution for AI/ML on the edge with minimized latencies while staying compliant with industry regulation or government requirement. Azure Kubernetes Service enabled by Azure Arc (AKS Arc) is a managed Kubernetes service that empowers customers to deploy and manage containerized workload whether they are in data centers or at edge locations. We want to ensure AKS Arc provides optimal experience for AI/ML workload on the edge, throughout the whole development lifecycle from AI infrastructure, Model deployment, Inference, Fine-tuning, and Application. AI infrastructure AKS Arc supports Nvidia A2, A16, and T4 for compute-intensive workload such as machine learning, deep learning, model training. When GPUs are enabled in Azure Local; AKS Arc customers can provision GPU node pools from Azure and host AI/ML workload in the Kubernetes cluster on the edge. For more details, please visit instructions from GPU Nodepool in AKS Arc. Model deployment and fine tuning Use KAITO for language model deployment, inference and fine tuning Kubernetes AI Toolchain Operator (KAITO) is an open-source operator that automates and simplifies the management of model deployments on a Kubernetes cluster. With KAITO, you can deploy popular open-source language models such as Phi-3 and Falcon, and host them in the cloud or on the edge. Along with the currently supported models from KAITO, you can also onboard and deploy custom language models following this guidance in just a few steps. AKS Arc has been validated with the latest KAITO operator via helm-based installation, and customers can now use KAITO in the edge to: Deploy language models such as Falcon, Phi-3, or their custom models Automate and optimize AI/ML model inferencing for cost-effective deployments, Fine-tune a model directly in a Kubernetes cluster, Perform parameter efficient fine tuning using low-rank adaptation (LoRA) Perform parameter efficient fine tuning using quantized adaptation (QLoRA) You can get started by installing KAITO and deploying a model for inference on your edge GPU nodes with KAITO Quickstart Guidance. You may also refer to KAITO experience in AKS in cloud: Deploy an AI model with the AI toolchain operator (Preview) Use Arc-enabled Machine Learning to train and deploy models in the edge For customers who are already familiar with Azure Machine Learning (AML), Azure Arc-enabled ML extends AML in Azure and enables customers to target any Arc enabled Kubernetes cluster for model training, evaluation and inferencing. With Arc ML extension running in AKS Arc, customers can meet data-residency requirements by storing data on premises during model training and deploy models in the cloud for global service access. To get started with Arc ML extension, please view instructions from Azure Machine Learning document . In addition, AML extension can now be used for a fully automated deployment of a curated list of pre-validated language and traditional AI models to AKS clusters, perform CPU and GPU-based inferencing, and subsequently manage them via Azure ML Studio. This experience is currently in gated preview, please view another Ignite blog for more details. Use Azure AI Services with disconnected container in the edge Azure AI services enable customers to rapidly create cutting-edge AI applications with out-of-the-box and customizable APIs and models. It simplified the developer experience to use APIs and embed the ability to see, hear, speak, search, understand and accelerate decision-making into the application. With disconnected Azure AI service containers, customers can now download the container to an offline environment such as AKS Arc and use the same APIs available from Azure. Containers enable you to run Azure AI services APIs in your own environment and are great for your specific security and data governance requirements. Disconnected containers enable you to use several of these APIs disconnected from the internet. Currently, the following containers can be run in this manner: Speech to text Custom Speech to text Neural Text to speech Text Translation (Standard) Azure AI Vision - Read Document Intelligence Azure AI Language Sentiment Analysis Key Phrase Extraction Language Detection Summarization Named Entity Recognition Personally Identifiable Information (PII) detection To get started with disconnected container, please view instructions at Use Docker containers in disconnected environments . Build and deploy data and machine learning pipelines with Flyte Flyte is an open-source orchestrator that facilitates building production-grade data and ML pipelines. It is a Kubernetes native workflow automation tool. Customers can focus on experimentation and providing business value without being an expert in infrastructure and resource management. Data scientists and ML engineers can use Flyte to create data pipelines for processing petabyte-scale data, building analytics workflow for business or finance, or leveraging it as ML pipeline for industry applications. AKS Arc has been validated with the latest Flyte operator via helm-based installation, customers are welcome to use Flyte for building data or ML pipelines. For more information, please view instructions from Introduction to Flyte - Flyte and Build and deploy data and machine learning pipelines with Flyte on Azure Kubernetes Service (AKS). AI-powered edge applications with cloud-connected control plane Azure AI Video Indexer, enabled by Azure Arc Azure AI Video Indexer enabled by Arc enables video and audio analysis, generative AI on edge devices. It runs as Azure Arc extension on AKS Arc and supports many video formats including MP4 and other common formats. It also supports several languages in all basic audio-related models. The Phi 3 language model is included and automatically connected with your Video Indexer extension. With Arc enabled VI, you can bring AI to the content for cases when indexed content can’t move to the cloud due to regulation or data store being too large. Other use cases include using on-premises workflow to lower the indexing duration latency or pre-indexing before uploading to the cloud. You can find more details from What is Azure AI Video Indexer enabled by Arc (Preview) Search on-premises data with a language model via Arc extension Retrieval Augmented Generation (RAG) is emerging to augment language models with private data, and this is especially important for enterprise use cases. Cloud services like Azure AI Search and Azure AI Studio simplify how customers can use RAG to ground language models in their enterprise data in cloud. The same experience is coming to the edge and now customers can deploy an Arc extension and ask questions about on-premises data within a few clicks. Please note this experience is currently in gated preview and please see another Ignite blog for more details. Conclusion Developing and running AI workload at distributed edges brings clear benefits such as using cloud as universal control plane, data residency, reduced network bandwidth, and low latency. We hope the products and features we developed above can benefit and enable new scenarios in Retail, Manufacturing, Logistics, Energy, and more. As Microsoft-managed Kubernetes on the edge, AKS Arc not only can host critical edge applications but also optimized for AI workload from hardware, runtime to application. Please share your valuable feedback with us (aksarcfeedback@microsoft.com) and we would love to hear from you regarding your scenarios and business impact.2.3KViews3likes1CommentOperate everywhere with AI-enhanced management and security
Farzana Rahman and Dushyant Gill from Microsoft discuss new AI-enhanced features in Azure that make it simpler to acquire, connect, and operate with Azure's management offerings across multiple clouds, on-premises, and at the edge. Key updates include enhanced management for Windows servers and virtual machines with Windows Software Assurance, Windows Server 2025 hotpatching support in Azure Update Manager, simplified hybrid environment connectivity with Azure Arc gateway, a multicloud connector for AWS, and Log Analytics Simple Mode. Additionally, Azure Migrate Business Case helps compare the total cost of ownership, and new Copilot in Azure capabilities that simplify cloud management and provide intelligent recommendations.2.1KViews1like0Comments