Azure Arc Blog

Extending Azure's AI Platform with an adaptive cloud approach

Derek_Bogardus
Nov 19, 2024

Ignite 2024 is here, and nothing is more top of mind for customers than the potential to transform their businesses with AI wherever they operate. Today, we are excited to announce the preview of two new Arc-enabled services that extend the power of Azure’s AI platform to on-premises and edge environments. 

Sign up to join the previews here! 

An adaptive cloud approach to AI 

The goal of Azure’s adaptive cloud approach is to extend just enough Azure to customers’ distributed environments. For many of these customers, valuable data is generated and stored locally, outside of the hyperscale cloud, whether due to regulation, latency, business continuity, or simply the large volume of data being generated in real time. 

AI inferencing can only occur where the data exists. So, while the cloud has become the environment of choice for training models, we see a tremendous need to extend inferencing services beyond the cloud to enable complete cloud-to-edge AI scenarios. 

Search on-premises data with generative AI  

Over the past couple of years, generative AI has come to the forefront of AI innovation. Language models give any user the ability to interact with large, complex data sets in natural language. 

Public tools like ChatGPT are great for queries about general knowledge, but they can’t answer questions about private enterprise data on which they were not trained. 

Retrieval Augmented Generation, or "RAG", helps address this need by augmenting language models with private data. Cloud services like Azure AI Search and Azure AI Foundry simplify how customers can use RAG to ground language models in their enterprise data.  
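To make the idea concrete, here is a minimal sketch of the retrieve-then-prompt pattern behind RAG. It is not how Azure AI Search or the new service works internally: it swaps in a toy bag-of-words cosine similarity where a real system would typically use embeddings and a vector index, and the documents, function names, and prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=2):
    """Return up to top_k documents that share vocabulary with the question,
    most similar first. Documents with zero similarity are dropped."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine_similarity(q_vec, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:top_k] if score > 0]


def build_prompt(question, passages):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


# Hypothetical private documents that a public model was never trained on.
DOCUMENTS = [
    "Pump 7 on line B was serviced on March 3 and its seal was replaced.",
    "The cafeteria menu changes every Monday.",
    "Line B runs at reduced speed while pump 7 vibration is above threshold.",
]

question = "When was pump 7 serviced?"
passages = retrieve(question, DOCUMENTS, top_k=2)
prompt = build_prompt(question, passages)
print(prompt)
```

The resulting prompt is what would be sent to a language model: the model never needs to have seen the private documents during training, because the relevant passages travel with the question.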

Today, we are announcing the preview of a new service that brings generative AI and RAG to your data at the edge.  

 

Within minutes, customers can deploy an Arc extension that contains everything needed to start asking questions about their on-premises data, including:   

  • Popular small and large language models running locally with support for both CPU and GPU hardware 
  • A turnkey data ingestion and RAG pipeline that keeps all data completely local, with RBAC controls to prevent unauthorized access  
  • An out-of-the-box prompt engineering and evaluation tool to find the best settings for a particular dataset  
  • Azure-consistent APIs to integrate into business applications, as well as a pre-packaged UI to get started quickly 

This service is available now in gated private preview for customers running Azure Local infrastructure, and we plan to make it available on other Arc-enabled infrastructure platforms in the near future.  

Sign up here 

Deploy curated open-source AI models via Azure Arc 

Azure’s AI platform also provides a catalog of curated AI models that are ready to deploy and that expose consistent inferencing endpoints, which can be integrated directly into customer applications. This makes deployment easy, and it gives customers confidence that the models are secure and validated.

These same needs exist on the edge as well, which is why we are now making a set of curated models deployable directly from the Azure Portal. These models have been selected, packaged, and tested specifically for edge deployments, and are currently available on Azure Local infrastructure. 

  • Mistral 7B (7.3 billion parameter language model) 

Models can be deployed from a familiar Azure Portal wizard to an Arc AKS cluster running on premises. All models available today can run on a CPU alone. Phi-3.5 and Mistral 7B also have GPU versions available for better performance.

 

 

Once complete, the deployment can be managed directly in Azure ML Studio, and an inferencing endpoint is available on your local network. 
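As a rough illustration of what calling such a local endpoint from an application might look like, the sketch below builds an HTTP POST request with Python's standard library. The URL, the `/score` path, the JSON payload fields, and the bearer-token header are all assumptions made for illustration; the post does not describe the actual request schema or authentication, so check the deployed endpoint's details before relying on any of them.

```python
import json
import urllib.request

# Hypothetical address of an inferencing endpoint on the local network.
ENDPOINT_URL = "http://192.168.1.50/score"


def build_inference_request(endpoint_url, prompt, api_key=None, max_tokens=128):
    """Build (but do not send) a JSON POST request for a text prompt.

    The payload shape and the Authorization header are assumptions,
    not the documented schema of the service.
    """
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_inference_request(
    ENDPOINT_URL, "Summarize today's maintenance log.", api_key="example-key"
)
print(request.get_method(), request.full_url)
```

Calling `send(request)` would perform the actual network call; it is left out of the example so the sketch runs without a live endpoint.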

 

Wrap up

Sign up now to join either of the previews at the link below, or visit us in person at the Azure Arc and Azure Local Expert Meet Up station in the Azure Infrastructure neighborhood at Ignite. We’re excited to get these new capabilities into our customers’ hands and to hear from you how it’s going.

Sign up to join the previews here 

Updated Nov 19, 2024