Recent Blog Articles

Breaking the Million-Token Barrier: The Technical Achievement of Azure ND GB300 v6
Azure ND GB300 v6 Virtual Machines with NVIDIA GB300 NVL72 rack-scale systems achieve unprecedented performance of 1,100,000 tokens/s on Llama2 70B inference, beating the previous Azure ND GB200 v6 r...

Optimizing Large-Scale AI Performance with Pretraining Validation on a Single Azure ND GB200 v6
Small performance gaps on a single virtual machine lead to large and costly performance losses at scale. Running small-scale pretraining jobs enables single-VM validation and allows for fine-grained ...

Performance at Scale: The Role of Interconnects in Azure HPC & AI Infrastructure
Microsoft Azure's high-performance computing (HPC) & AI infrastructure is designed from the ground up to support the world's most demanding workloads. High-performance AI workloads are bandwidth-hung...

Azure's ND GB200 v6 Delivers Record Performance for Inference Workloads
Achieving peak AI performance requires both cutting-edge hardware and a finely optimized infrastructure. Azure's ND GB200 v6 Virtual Machines, accelerated by NVIDIA GB200 Blackwell GPUs, have alr...

Unpacking the Performance of Microsoft Azure ND GB200 v6 Virtual Machines
For a comprehensive understanding of our benchmarking methodologies and detailed performance results, please refer to our benchmarking guide available on the official Azure GitHub repository: Azure A...

Optimizing Language Model Inference on Azure
Inefficient inference optimization can lead to skyrocketing costs for customers, making it crucial to establish clear performance benchmarking numbers. This blog sets the standard for expected perfor...

A Quick Start Guide to Benchmarking AI Models in Azure: Llama 2 from MLPerf Inference v4.0
Microsoft Azure has delivered industry-leading results for AI inference workloads among cloud service providers in the most recent MLPerf Inference results published publicly by MLCommons. The Azur...

A Quick Guide to Benchmarking AI Models on Azure: ResNet with MLPerf Training v3.0
Azure is pleased to showcase results from our MLPerf Training v3.0 submission. For this submission, we benchmarked our ND H100 v5 virtual machine (preview). In this blog post, you will learn how to r...

Accelerating AI Applications Using the JAX Framework on Azure's NDm A100 v4 Virtual Machines
The results highlight good scaling from 1 to 16 nodes on both the Large and XLarge T5 models running with JAX on Azure. The Large T5 model has a scaling efficiency of 84% at 16 nodes (128 GPUs...

Tackling AI Inference Workloads on Azure's NC A100 v4 Virtual Machines with Time to Spare
The NC A100 v4-series offers great flexibility through MIG technology to handle different sizes of workload, from small to medium. While we compared the performance of a single MIG instance (1/7th of...
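Scaling-efficiency figures like the 84% quoted for the Large T5 model are conventionally computed as measured speedup divided by ideal linear speedup. A minimal sketch of that arithmetic (function and variable names are illustrative, not taken from the posts; the 13.44x speedup below is a hypothetical value chosen to reproduce an 84% figure):

```python
def scaling_efficiency(throughput_1_node: float,
                       throughput_n_nodes: float,
                       n_nodes: int) -> float:
    """Scaling efficiency = actual speedup / ideal (linear) speedup."""
    speedup = throughput_n_nodes / throughput_1_node
    return speedup / n_nodes

# Hypothetical example: if 16 nodes deliver 13.44x the single-node
# throughput, efficiency is 13.44 / 16 = 0.84, i.e. 84%.
print(f"{scaling_efficiency(1.0, 13.44, 16):.0%}")  # 84%
```

The same formula applies whether throughput is measured in samples/s or tokens/s, as long as both measurements use the same unit.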