Optimizing Network Throughput on Azure M-series VMs
Published Jul 27 2022

1.    HCMT, niping & iperf3 Throughput Testing on Azure M-series VMs

Under some circumstances, the SAP HCMT Network Topology Test, niping, and/or iperf3 results are below the expected values in a VM hosted on Azure M-series.

The SAP HCMT output could show errors such as those in the screenshot below.

The error below is from a 5-node scale-out SAP HANA landscape.  The problem is most often seen on SAP HANA scale-out systems, but may also occur on large Oracle or DB2 systems.

The network optimization described in this blog may benefit Azure NetApp Files (ANF) scenarios as well, such as a large Oracle server running dNFS to ANF. 
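To establish a baseline before making any changes, a quick iperf3 run between two VMs is usually sufficient. A minimal sketch, assuming iperf3 is installed on both VMs and <server-ip> stands for the IP address of the receiving VM (the stream count and duration are example values only):

iperf3 -s                                # on the receiving VM (server side)
iperf3 -c <server-ip> -P 8 -t 30         # on the sending VM: 8 parallel streams for 30 seconds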

 

It is recommended that readers be fully familiar with terminology such as NUMA node, processor, core, logical processor and Hyperthreading before continuing with this blog.  These terms are explained in the links at the end of this blog.

 

[Screenshot: SAP HCMT Network Topology Test errors on a 5-node scale-out SAP HANA landscape]

 

2.    How to Analyze Network Performance

Network processing consumes a significant amount of CPU.  Modern network cards and network drivers leverage technologies such as multiple queues and Receive Side Scaling (RSS) to distribute and balance CPU consumption across multiple CPU cores.

Mellanox provides a set of tools that can “pin” network processing operations to a specific set of CPU cores or NUMA Nodes. The links at the end of this blog describe the process in more detail.

Microsoft internal testing has shown that limiting network processing queues to 16 and pinning network processing to NUMA node 0 as seen from the Linux guest OS in the VM is a configuration that delivers stable high throughput. 

Excluding the first CPU core, core 0 (logical processors 0 and 1), is recommended because many other operations are affinitized to this core.

 

The procedure detailed in this blog has been implemented successfully many times, but careful testing should be done before implementation (to establish a baseline) and again after implementation.

The procedure below should not be implemented on a production system without careful testing and validation in non-production systems.  The procedure in this blog only applies to Azure M-series (M, Mv2 and similar).

Test scenarios have shown network throughput improvements of approximately 20%.  

 

Most Linux tools, such as ‘vmstat’, show averages and are not useful for characterizing network interrupt utilization. It is recommended to install ‘nmon’ and sysstat (sar).

  1. Unfortunately, nmon is not available via zypper, apt, or yum and must be downloaded (http://nmon.sourceforge.net/pmwiki.php)
  2. sysstat (sar) may or may not be installed and activated by default.  Typically, most SUSE gallery images have sar running by default.  Check the directory /var/log/sa.  If the directory does not exist or does not contain recent saXX/sarXX files, follow the steps below (a quick check is also sketched just after this list)
  3. KSAR is a graphical tool that presents system performance information in a simple, easy-to-interpret way.  This tool requires a Java runtime (JVM): https://github.com/vlsi/ksar
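A quick way to confirm that sar data is actually being collected (the saXX/sarXX file names vary by distribution):

ls -l /var/log/sa/        # recent saXX / sarXX files indicate sar is collecting data
sar -u 5 3                # live check: 3 CPU utilization samples, 5 seconds apart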

If sysstat needs to be installed, follow the steps below

# sudo yum install sysstat

# sudo service sysstat restart

Redirecting to /bin/systemctl restart sysstat.service
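On SUSE the equivalent installation uses zypper; the sketch below assumes the sysstat package provides a systemd service of the same name (service and timer names can vary by sysstat version):

sudo zypper install sysstat
sudo systemctl enable --now sysstat      # start data collection now and enable it at boot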

The /var/log/sa/sarXX files can be copied onto a Windows PC with sftp

sftp -i <keyfilename>.pem azureuser@<xx.xx.xx.xx>

get /var/log/sa/sar<XX>

 

Run "Java -jar C:\sap_media\ksar.jar" and capture graphs showing overall CPU consumption, then consumption on the first NUMA node (node 0) and Kernel/System CPU consumption on each Logical Processor.

 

An optimal pattern in KSAR shows approximately the same CPU consumption on all NUMA nodes and logical processors.  NUMA node 0 and logical processors 0 and 1 may show slightly higher CPU time, especially system CPU time (kernel time).

CPU consumption over 80% on some NUMA nodes and/or logical processors, combined with near 0-10% consumption on other NUMA nodes and/or logical processors, indicates a problem.
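The same imbalance can also be spotted directly on the VM with sar, either live or from a collected file. sa<XX> below is a placeholder for one of the files in /var/log/sa:

sar -P ALL 10 6                       # live: per-CPU utilization (%system = kernel time), 6 samples at 10-second intervals
sar -P ALL -f /var/log/sa/sa<XX>      # historical: per-CPU utilization read back from a collected file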

 

nmon can be used to watch CPU consumption in real time and can help isolate high-CPU processes.
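nmon is interactive; after starting it, single-key toggles switch on the relevant views:

nmon        # then press: c = per-CPU utilization, n = network, t = top processes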

 

3.    How to Optimize Network Interrupts on Azure M-series VMs

Follow this procedure to resolve the problem.  This procedure “pins” the interrupt processing onto the first NUMA node (node 0).  The first CPU core (logical processors 0 and 1) is excluded because these logical processors may be heavily used by other operating system components.

1. Download the Mellanox tools using wget:
                           wget https://github.com/Mellanox/mlnx-tools/archive/refs/tags/v5.1.3.zip

                           Unzip the file.  The two scripts used in this blog are:
                           a. set_irq_affinity_cpulist.sh
                           b. show_irq_affinity.sh
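The download and extraction can be scripted as shown below. The extracted directory name is an assumption based on the v5.1.3 tag, so the find command is used to locate the two scripts regardless of the archive layout:

                           wget https://github.com/Mellanox/mlnx-tools/archive/refs/tags/v5.1.3.zip
                           unzip v5.1.3.zip
                           cd mlnx-tools-*
                           find . -name "set_irq_affinity_cpulist.sh" -o -name "show_irq_affinity.sh"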

2. Turn off the irqbalance service:
                           systemctl stop irqbalance
                           systemctl disable irqbalance

and then double-check that it is really inactive:
                           systemctl status irqbalance

[Screenshot: systemctl status irqbalance showing the service as inactive]

 

  3. Run the command “ip link ls” to find the interface names for Accelerated Networking. In this example there are three vNICs, but Accelerated Networking is only enabled for two of them. Accelerated Networking interfaces are identified by the keyword “SLAVE”.

    Moving interrupts with the Mellanox tools is a concept that only applies to Accelerated Networking interfaces.
    [Screenshot: “ip link ls” output with the Accelerated Networking interfaces marked SLAVE]

     

    4. Run the command “lscpu | grep NUMA” to find the range of vCPU numbers for virtual NUMA node 0. In this example it is vCPU 0 to vCPU 51 (52 logical processors running on 26 cores with Hyperthreading enabled).

[Screenshot: “lscpu | grep NUMA” output showing NUMA node 0 with vCPUs 0-51]

 

  5. Run the Mellanox script and check on which vCPUs the interrupts of a specific Accelerated Networking interface are running:
                               ./show_irq_affinity.sh eth3 show_cpu_number (note: there are underscores in this command)
    In this example the interrupts start with interrupt #24 on vCPU 0. Interrupt #24 can in fact run on all vCPUs of virtual NUMA node 0, whereas the other interrupts are pinned to vCPUs 0 through 30.
    [Screenshot: show_irq_affinity.sh output for eth3]

    6. The interrupt numbers of the second Accelerated Networking interface start with #56, but they sit on the same vCPUs as the interrupts of the first Accelerated Networking interface.  This is normal.
    [Screenshot: show_irq_affinity.sh output for the second Accelerated Networking interface]

Note: depending on the workload, reducing the number of queues via the command “ethtool” may improve performance. The command below reduces the number of queues to 16; run it for each of the two Accelerated Networking interfaces:

 
                           ethtool -L <Accelerated Networking interface> combined 16
After reducing the queues, place the interrupts on separate vCPUs on virtual NUMA node 0 as described in step 7 below.
The command above is not permanent.  To make this setting permanent, follow the procedure below.

Prior to implementing this change, display the current number of queues with the command ethtool -l eth0

 

SUSE

/etc/sysconfig/network/ifcfg-eth0

Add the line: ETHTOOL_OPTIONS='-L eth0 combined 16'

How to configure network parameters for AutoYaST network installation | Support | SUSE
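A minimal sketch of the SUSE change, assuming the interface is eth0 and is managed by wicked (the restart command may differ on your image; a reboot achieves the same):

echo "ETHTOOL_OPTIONS='-L eth0 combined 16'" | sudo tee -a /etc/sysconfig/network/ifcfg-eth0
sudo systemctl restart wicked            # then re-check with: ethtool -l eth0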

 

Red Hat

/etc/sysconfig/network-scripts/ifcfg-eth0

On Red Hat 8 or later, the following procedure can be used to permanently reduce the queues to 16.

Step #1.  Create a file in /etc/NetworkManager/dispatcher.d/set_combined_queues

Step #2.  Enter a comment to document the script file (such as the link to this blog)

Step #3.  Enter the lines:

#!/bin/sh

/sbin/ethtool -L eth0 combined 16

Step #4.  Save the file and exit 

Step #5.  Change the mode of the file to allow execution - suggested is chmod 751

Step #6.  Restart the Red Hat OS, then run ethtool -l eth0 and confirm the queues are correctly configured
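Steps 1 through 6 can be condensed into the sketch below. It assumes eth0 is the Accelerated Networking interface and uses the file name and mode suggested above; NetworkManager dispatcher scripts also receive the interface name and action as arguments, which a more selective script could check before calling ethtool:

sudo tee /etc/NetworkManager/dispatcher.d/set_combined_queues <<'EOF'
#!/bin/sh
# Reduce the Accelerated Networking queues to 16 (see: Optimizing Network Throughput on Azure M-series VMs)
/sbin/ethtool -L eth0 combined 16
EOF
sudo chmod 751 /etc/NetworkManager/dispatcher.d/set_combined_queues
sudo reboot                              # afterwards confirm with: ethtool -l eth0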

 

Note: There are differences in syntax and quoting between SUSE and Red Hat. Replace eth0 with the active Accelerated Networking interface.

 

7. Use the Mellanox script “set_irq_affinity_cpulist.sh” to modify the vCPUs on which the interrupts of a specific Accelerated Networking interface are running. This command can “pin” or move them all to virtual NUMA node 0. As the guest OS usually uses logical processors 0 and 1 for specific tasks, it is recommended to let the interrupts start at logical processor 2 (the first logical processor on core 1, i.e. the second core on the processor).
                           ./set_irq_affinity_cpulist.sh 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33 eth3

[Screenshot: set_irq_affinity_cpulist.sh run for eth3]
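Typing the full cpulist by hand is error-prone; the same call can be generated with seq (a convenience only, not part of the Mellanox tooling):

                           ./set_irq_affinity_cpulist.sh $(seq -s, 2 33) eth3      # expands to 2,3,...,33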

 

  8. Run the command below to validate that the IRQs (interrupts) have moved to NUMA node 0:

                           ./show_irq_affinity.sh eth3 show_cpu_number (note: there are underscores in this command)
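The kernel's own view can be checked as well; each interrupt of the interface should now be allowed only on vCPUs of NUMA node 0. Interrupt naming in /proc/interrupts varies, so grep for the interface name or for mlx; <irq number> is a placeholder:

                           grep -E "eth3|mlx" /proc/interrupts               # list the interrupt numbers used by the interface
                           cat /proc/irq/<irq number>/smp_affinity_list      # allowed vCPUs for one interrupt, e.g. 2-33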
 

4.    Validate Performance After Setting IRQ Affinity

Repeat the SAP HCMT, niping, and/or iperf3 tests.

In addition, it is recommended to repeat the analysis with KSAR and nmon.

In most cases SAP HCMT and other tools will show approximately 20% improvement.  If there are still performance concerns open a Microsoft Support case and quote this blog.

Thanks to Hermann Daeubler for contributing to this blog.

 

Interesting SAP OSS Notes and Links

Windows 2008 R2 - Groups, Processors, Sockets, Cores Threads, NUMA nodes what is all this? - Microso...

GitHub - Mellanox/mlnx-tools: Mellanox userland tools and scripts

SAP HANA Hardware and Cloud Measurement Tools (HCMT) – Replacement of HWCCT Tool | SAP Blogs

How to Use the SAP HANA Hardware and Cloud Measurement Tools

2973899 - Running HCMT - HANA Hardware and Cloud Measurement Tools - SAP ONE Support Launchpad

SAP HANA Hardware and Cloud Measurement Tools | SAP Help Portal

SAP HANA Hardware and Cloud Measurement Analysis (ondemand.com)

Queues, RSS, interrupts and cores (mellanox.com)

Performance Configuration and Troubleshooting (mellanox.com)

What is IRQ Affinity? (mellanox.com)

500235 - Network Diagnosis with NIPING - SAP ONE Support Launchpad

How to test network throughput using iperf3 tool - Tutorials and How To - CloudCone

 

Short throughput/stability test
Server: niping -s -I 0 (the last character is zero, not the letter O)
Client: niping -c -H <nipingsvr> -B 1000000 -L 100

Measuring throughput
Server: niping -s -I 0 (the last character is zero, not the letter O)
Client: niping -c -H <nipingsvr> -B 100000

 

 

3rd party content in this blog is used under “fair use” copyright exception for the purpose of promoting scholarship, discussion, research, learning and education

 

 

 

 

 

 

 

 
