Mar 27 2023 07:21 AM
Written by Matt Vegas, Principal Product Manager, Azure HPC+AI
Delivering on the promise of advanced AI for our customers requires supercomputing infrastructure, services, and expertise to address the exponentially increasing size and complexity of the latest models. At Microsoft, we are meeting this challenge by applying a decade of experience in supercomputing and supporting the largest AI training workloads to create AI infrastructure capable of massive performance at scale. The Microsoft Azure cloud, and specifically our graphics processing unit (GPU) accelerated virtual machines (VMs), provide the foundation for many generative AI advancements from both Microsoft and our customers.
“Co-designing supercomputers with Azure has been crucial for scaling our demanding AI training needs, making our research and alignment work on systems like ChatGPT possible.”—Greg Brockman, President and Co-Founder of OpenAI.
Today, Microsoft is introducing the ND H100 v5 VM which enables on-demand in sizes ranging from eight to thousands of NVIDIA H100 GPUs interconnected by NVIDIA Quantum-2 InfiniBand networking. Customers will see significantly faster performance for AI models over our last generation ND A100 v4 VMs with innovative technologies like: