As industries increasingly rely on high-performance computing and AI for real-time inferencing, remote work, and advanced visualization, Azure’s Virtual Machine (VM) portfolio continues to evolve to meet these demands. Today, we’re excited to introduce the Azure NV V710 v5, the latest VM tailored for small-to-medium AI/ML inferencing, Virtual Desktop Infrastructure (VDI), visualization, and cloud gaming workloads.
Powered by AMD’s latest Radeon™ PRO V710 GPUs and 4th Generation AMD EPYC™ (formerly “Genoa”) high-frequency CPUs, the NV V710 v5 delivers high compute performance and flexible GPU partitioning to address a wide range of industry needs.
Why Choose Azure NV V710 v5?
The NV V710 v5 brings a new level of flexibility and performance to the cloud, specifically designed for small-to-medium real-time AI/ML inferencing workloads and graphics-intensive applications.
Key Features of NV V710 v5
Real-Time Inferencing (RTI) and AI Inferencing:
The NV V710 v5 is optimized for small-to-medium AI model inferencing and real-time machine learning processing, offering the computational power and speed needed by industries that rely on immediate data processing. With support for vLLM, users can run AI/ML inferencing more efficiently, delivering near-instant results for workloads such as edge AI applications and intelligent decision-making systems, all at a lower total cost of ownership.
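As a rough illustration, here is a minimal offline-inference sketch using vLLM’s Python API; the model name is a placeholder, and it assumes a ROCm-enabled build of vLLM is already installed on the VM.

```python
# Minimal vLLM offline-inference sketch.
# Assumes a ROCm-enabled vLLM build is installed; the model name is
# illustrative -- any small-to-medium model that fits in the GPU
# partition's GDDR6 memory can be substituted.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-1.3b")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(
    ["Summarize the benefits of GPU partitioning for inferencing:"],
    params,
)
for out in outputs:
    print(out.outputs[0].text)
```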
GPU Partitioning for Flexibility:
A standout feature of the NV V710 v5 is its GPU partitioning capability, allowing customers to allocate fractions of the GPU according to their workload requirements. This flexibility is ideal for multi-tenant environments, enabling organizations to support a variety of inferencing and graphical workloads efficiently without needing a full GPU for each application.
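One way to explore the available fractional-GPU sizes is to list VM sizes with the Azure Python SDK, as in the sketch below; the region, subscription ID, and the "V710" name filter are placeholders, and the authoritative size names live in the Azure documentation.

```python
# Sketch: list VM sizes in a region and filter for the V710 family.
# The subscription ID and region are placeholders; actual size names
# are documented in the Azure VM size reference.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

credential = DefaultAzureCredential()
client = ComputeManagementClient(credential, subscription_id="<subscription-id>")

for size in client.virtual_machine_sizes.list(location="eastus2"):
    if "V710" in size.name:
        print(size.name, size.number_of_cores, "vCPUs,", size.memory_in_mb, "MB RAM")
```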
High-Performance AMD EPYC CPUs:
Equipped with AMD 4th Gen EPYC CPUs that boast a 3.95 GHz base frequency and a 4.3 GHz maximum frequency, the NV V710 v5 is optimized for demanding compute tasks requiring both high CPU and GPU performance. This makes it suitable for complex simulations, graphics rendering, and real-time inferencing.
Massive GPU Memory:
With 28 GB of GDDR6 GPU memory, the NV V710 v5 can keep small-to-medium models resident in GPU memory for inferencing while also handling high-resolution rendering and intricate visual content. The memory capacity ensures smooth processing and loading of substantial datasets in real time.
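For a back-of-the-envelope sense of what fits in 28 GB, the sketch below estimates weight storage only; it deliberately ignores KV cache, activations, and framework overhead, so treat the numbers as lower bounds rather than a sizing guide.

```python
# Rough estimate of GPU memory needed just to hold model weights.
# Ignores KV cache, activations, and framework overhead (assumption).
def weights_gib(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * 1e9 * bytes_per_param / (1024 ** 3)

for params in (7, 13):
    print(f"{params}B fp16: {weights_gib(params, 2):5.1f} GiB | "
          f"int8: {weights_gib(params, 1):5.1f} GiB")
# A 13B model at fp16 needs roughly 24 GiB for weights alone, which
# still leaves headroom within the 28 GB GDDR6 of a full V710 GPU.
```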
Azure Integration and High-Speed Networking:
Integrated with Azure Accelerated Networking, the NV V710 v5 provides up to 80 Gbps bandwidth, ensuring high performance and low latency for AI inferencing, VDI applications, and cloud gaming workloads. This high-speed networking capability facilitates seamless data transfer, supporting intensive graphical and inferencing operations.
Real-World Applications
One of the key applications of the NV V710 v5 is in the automotive industry, where AI-based sensor simulation and inferencing play a vital role in developing intelligent edge devices for autonomous vehicles. Platforms like the Automated Driving Perception Hub (ADPH) offer automotive customers a virtual environment to evaluate a range of automotive sensors, such as cameras, lidars, and radars.
- Accurate Inferencing: The NV V710 v5 supports batch-processed inferencing, providing a trusted environment for evaluating AI model accuracy in various simulations.
- Cross-Platform Support: Its compatibility with ROCm/HIP enables cross-platform inferencing, which is crucial for intelligent edge devices (see the sketch after this list).
- Broader Applications: Beyond the automotive industry, the NV V710 v5 can support a variety of edge AI devices, such as security cameras, industrial equipment, and drones.
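One quick way to verify the ROCm/HIP path is the sketch below; it assumes a ROCm build of PyTorch is installed on the VM, in which case the familiar torch.cuda API is backed by HIP and CUDA-style code runs unmodified.

```python
# Sanity check that a ROCm build of PyTorch sees the V710 (or a partition of it).
# Assumes PyTorch was installed from the ROCm wheel index.
import torch

print("HIP runtime version:", torch.version.hip)   # set on ROCm builds, None on CUDA builds
print("Device available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device name:", torch.cuda.get_device_name(0))
    # Small matmul on the GPU to confirm end-to-end execution.
    x = torch.randn(1024, 1024, device="cuda")
    print("Checksum:", (x @ x).sum().item())
```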
NV V710 v5 Technical Specifications
| Specification | Details |
| --- | --- |
| vCPUs | 4 to 28 vCPUs (3.95 GHz base, 4.3 GHz max frequency) |
| Memory | 16 GB to 160 GB |
| GPU | AMD Radeon PRO V710 GPU with 28 GB GDDR6 memory, partitioned from 1/6 to full GPU; supports the latest ROCm releases for vLLM to enhance real-time AI inferencing |
| Storage | Up to 1 TB temporary disk |
| Networking | Up to 80 Gbps Azure Accelerated Networking |
For more detailed technical information, visit our Azure documentation here.
AI Inferencing Opportunities with NV V710 v5
The NV V710 v5 provides a versatile platform for real-time AI/ML inferencing and visualization tasks. With support for vLLM, it enables enterprises to execute complex AI models in real time efficiently, making it an essential asset for industries focused on AI-driven insights. By leveraging GPU partitioning, companies can optimize their resources across various workloads, ensuring a cost-effective approach to cloud-based inferencing and graphics rendering.
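As a sketch of what a serving workflow might look like, the snippet below queries a vLLM OpenAI-compatible endpoint with the openai Python client; the port, model name, and the assumption that a vLLM server is already running on the VM are all illustrative.

```python
# Sketch: query a vLLM OpenAI-compatible server from a client application.
# Assumes such a server is already running on the VM at the given port;
# the model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="facebook/opt-1.3b",
    messages=[{"role": "user", "content": "List three workloads suited to fractional GPUs."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```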
Additional Use Cases
- VDI and Remote Workstations: For enterprises deploying virtual desktops, the NV V710 v5 provides high-performance computing resources that can be dynamically adjusted based on user requirements. This flexibility is valuable for media production, design, and financial services, where high-end graphics capabilities are crucial.
- Cloud Gaming: The NV V710 v5 is built to handle cloud gaming with low-latency performance, offering gamers a seamless, high-quality experience comparable to traditional gaming consoles. Its robust architecture supports real-time rendering, delivering a premium gaming experience in the cloud.
Conclusion: The Future of AI Inferencing and Graphics Workloads with Azure NV V710 v5
The Azure NV V710 v5 VM is set to transform the landscape of AI inferencing, real-time visualization, and cloud gaming. By combining high-frequency 4th Gen AMD EPYC (Genoa) CPUs, 28 GB of GDDR6 GPU memory, ROCm 6 support, and vLLM, it provides an all-in-one solution for a wide range of applications.
The NV V710 v5 opens up new opportunities for businesses to run real-time AI/ML model inferencing in the cloud, scale graphical workloads efficiently, and deliver high-quality user experiences. With its advanced partitioning and high-speed networking capabilities, it’s tailored to meet the demands of modern, graphics-intensive, and AI-driven industries.
Ready to experience the power of the NV V710 v5? Sign up for the public preview here.