Azure CycleCloud – the simplest way to execute HPC on Azure

Published May 24 2021 01:55 AM 1,818 Views
Microsoft

Accelerated computing.  Microsoft HPC continues to invest to deliver the broadest range of Accelerated and high-performance computing (HPC) capabilities in the public cloud. From InfiniBand-enabled Virtual Machine families for artificial intelligence and HPC, to Hyperscale services like Cray supercomputing, Azure enables customers to deliver the full spectrum of AI and machine learning applications.

 

Azure CycleCloud – the simplest way to execute HPC on Azure

 Azure CycleCloud, a tool for creating, managing, operating, and optimizing HPC clusters of any scale in Azure. With Azure CycleCloud, we are making it even easier for everyone to deploy, use, and optimize HPC burst, hybrid, or cloud-only clusters. For users running traditional HPC clusters, using schedulers including SLURM, PBS Pro, Grid Engine, LSF, HPC Pack, or HTCondor, this will be the easiest way to get clusters up and running in the cloud, and manage the compute/data workflows, user access, and costs for their HPC workloads over time.

Cyclecloud.png

 

With a few clicks, HPC IT administrators can deploy high-performance clusters of compute, storage, filesystem, and application capability in Azure. Azure CycleCloud’s role-based policies and governance features make it easy for their organizations to deliver the hybrid compute power where needed while avoiding runaway costs. Users can rely on Azure CycleCloud to orchestrate their job and data workflows across these clusters.

 

 

 

NVIDIA GPU Cloud with Azure

As GPUs provide outstanding performance for AI and HPC, Microsoft Azure provides a variety of virtual machines enabled with NVIDIA GPUs. Starting today, Azure users and cloud developers have a new way to accelerate their AI and HPC workflows with powerful GPU-optimized software that takes full advantage of supported NVIDIA GPUs on Azure.


Containers from the NVIDIA GPU Cloud (NGC) container registry are now supported. The NGC container registry includes NVIDIA tuned, tested, and certified containers for deep learning software such as Microsoft Cognitive Toolkit, TensorFlow, PyTorch, and NVIDIA TensorRT. Through extensive integration and testing, NVIDIA creates an optimal software stack for each framework – including required operating system patches, NVIDIA deep learning libraries, and the NVIDIA CUDA Toolkit – to allow the containers to take full advantage of NVIDIA GPUs. The deep learning containers from NGC are refreshed monthly with the latest software and component updates.

NGC also provides fully tested, GPU-accelerated applications and visualization tools for HPC, such as NAMD, GROMACS, LAMMPS, ParaView, and VMD. These containers simplify deployment and get you up and running quickly with the latest features.


To make it easy to use NGC containers with Azure, a new image called NVIDIA GPU Cloud Image for Deep Learning and HPC is available on Azure Marketplace. This image provides a pre-configured environment for using containers from NGC on Azure. Containers from NGC on Azure NCv2, NCv3, and ND virtual machines can also be run with Azure Batch AI by following these GitHub instructions.To access NGC containers from this image, simply signup for a free account and then pull the containers into your Azure instance. 

 


Resources on Microsoft Learn
 

Co-Authors
Version history
Last update:
‎May 24 2021 01:57 AM
Updated by: