Used in many industries, including engineering, mathematics, and finance, MATLAB is a proprietary programming language and multi-paradigm numerical computing environment. With the increasing complexity of data analysis, simulation, and modeling tasks, the performance of MATLAB plays a crucial role in the speed and accuracy of these operations. Microsoft Azure offers a cloud-based platform that provides virtual machines (VMs) to run MATLAB. However, selecting the right VM SKU can be difficult, and choosing the wrong one can lead to suboptimal performance and potentially higher costs. In this blog article, we'll discuss how processor selection and other factors may affect MATLAB's performance and how to choose the right Azure VM SKU to achieve the best performance for your MATLAB workloads. We'll also explore some best practices to optimize MATLAB performance on Azure VMs.
For background, MATLAB, short for Matrix Laboratory, is a numerical computing environment developed by MathWorks. MATLAB provides a wide range of tools for performing calculations, data analysis, visualization, and simulation tasks. It offers a high-level language that allows users to express complex mathematical computations easily and efficiently. With a vast library of built-in functions and toolboxes, MATLAB provides a platform for solving complex engineering, scientific, and financial problems. MATLAB's user-friendly interface, combined with its powerful computing capabilities, has made it a popular choice for researchers, engineers, and scientists across various industries.
The default answer for many organizations is to run MATLAB calculations or simulations directly on end-user workstations. However, this can be suboptimal because it leads to over-provisioning the desktop environment, especially when using Terminal Services or Virtual Desktop Infrastructure (VDI). Last year I worked with a very large nonprofit non-governmental organization (NGO) that had such an environment. Their VDI was a difficult-to-manage RDS environment in which users shared access to large compute nodes sized to run their data scientists' jobs.
Offloading MATLAB Workload to Dedicated Compute Nodes
Using MATLAB Parallel Compute Services and its native HPC Pack integration lets you right-size the front end to run the desktop client and size the back end to handle the simulations. Offloading MATLAB calculations to HPC Pack can significantly improve performance and scalability. HPC Pack provides a platform for running parallel and distributed MATLAB applications across a cluster of machines. It also offers job scheduling, data management, and native cloud orchestration ("AutoGrowShrink") to minimize compute costs when no jobs are in the queue. With HPC Pack, users can take advantage of the full power of their cluster environment, enabling faster and more efficient data processing and analysis. The NGO mentioned above fixed their front end by implementing Azure Virtual Desktop (AVD) and built their compute infrastructure on HC44rs SKUs.
For the compute nodes running the calculations, there are several performance recommendations. Primarily, MATLAB is a compute-intensive program that requires enough memory to hold the models being processed. Being a multi-threaded application, MATLAB benefits from having several physical cores available. In general, hyperthreading does not benefit the calculations once a sufficient number of cores are present. To choose an optimal memory-to-core ratio, it is important to know the size of your company's models, as any paging activity will seriously degrade performance. Local disk performance can also affect simulation performance, because MATLAB writes the results back out to disk. The general Azure recommendation is to use the local ephemeral disk for this transient data and to ensure the Server Message Block (SMB) share location is performant.
Benchmarking MATLAB Workload
MATLAB provides a built-in benchmarking utility called bench, which measures the execution time of specific MATLAB functions and compares them against standard reference values. The bench function evaluates different types of computation and tests various combinations of data sizes and algorithms to provide a comprehensive performance profile. The benchmarking process helps identify performance bottlenecks and guide optimization efforts, such as parallelizing computations or optimizing code. Use the MATLAB function timeit to help produce reliable and repeatable performance benchmarks. Use gputimeit to benchmark GPU code. Using bench, you can compare candidate Azure Virtual Machine (VM) SKUs against one another.
From a methodology standpoint, I ran the same Windows Server 2019 OS with the latest patches and the same MATLAB version across all likely HPC VM SKUs. I disabled hyperthreading for any General Purpose SKU VM families utilizing metatags. I ran the benchmark command three times on each VM family and averaged the results. If a result was dramatically out of range compared with the other two, I discarded it and ran the benchmark one additional time. In each case, I used the local ephemeral drive to run the MATLAB bench command.
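The averaging procedure described above can be sketched in a few lines of Python. This is only an illustration, not the script used for these tests: the `average_runs` helper and the `tolerance` cutoff are hypothetical, since the article does not define how far "dramatically out of range" is. Here a sample is treated as an outlier when it differs from the median by more than half of the median.

```python
from statistics import mean, median
from typing import Callable


def average_runs(run_once: Callable[[], float],
                 runs: int = 3,
                 tolerance: float = 0.5) -> float:
    """Average `runs` timings, re-running the single worst outlier once.

    `tolerance` is a hypothetical cutoff: a sample counts as "dramatically
    out of range" when it differs from the median by more than this
    fraction of the median.
    """
    samples = [run_once() for _ in range(runs)]
    mid = median(samples)
    # Find the sample that sits farthest from the median.
    worst = max(range(runs), key=lambda i: abs(samples[i] - mid))
    if abs(samples[worst] - mid) > tolerance * mid:
        # Throw out the bad result and run the benchmark one more time.
        samples[worst] = run_once()
    return mean(samples)


# Example with made-up timings (seconds); the third run is an outlier.
fake_timings = iter([0.21, 0.22, 0.90, 0.20])
print(average_runs(lambda: next(fake_timings)))
```

In practice `run_once` would wrap whatever launches the benchmark on the VM and returns one timing; the made-up numbers are there only to show the outlier being replaced.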
Azure VMs being Benchmarked:
VM Name | HC44rs | HB120rs_v3 | HB120rs_v2 | D64ds_v5 | D64ads_v5 |
Number of pCPUs | 44 (Constrained Core 16, 32 options available) | 120 (Constrained Core 16, 32, 64, 96 options available) | 120 (Constrained Core 16, 32, 64, 96 options available) | 32 | 32 |
Processor | Intel Xeon Platinum 8168 | AMD EPYC 7V73X CPU cores (“Milan-X”) | AMD EPYC 7742 CPU cores | Intel® Xeon® Platinum 8370C (Ice Lake) | AMD EPYC 7763v CPU cores |
Peak CPU Frequency | 3.70 GHz | 3.5 GHz | 3.4 GHz | 3.5 GHz | 3.5 GHz |
RAM per VM | 352 GB | 448 GB | 456 GB | 256 GB | 256 GB |
RAM per core | 8 GB (22, 11 GB) | 3.75 GB (28, 14, 7, 4.6 GB) | 3.8 GB (28, 14, 7, 4.6 GB) | 8 GB | 8 GB |
Memory B/W per core | 4.3 GB/s | 5.25 GB/s | 2.9 GB/s | 4.26 GB/s | 4.26 GB/s |
L3 Cache per VM | 33 MB | 768 MB | 256 MB | 48 MB | 256 MB |
Attached Disk | 1 x 700 GB NVMe | 2 x 0.9 TB NVMe | 1 x 0.9 TB NVMe | 2400 GB SSD | 2400 GB SSD |
Disk per Core | 15.9 GB (43.8, 21.8) | 15 GB (113, 56, 28, 19) | 7.5 GB (56, 28, 14, 9) | 75 GB | 75 GB |
Accelerated Networking | Yes | Yes | Yes | Yes | Yes |
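The bracketed "RAM per core" figures follow from dividing the RAM per VM by the number of active cores for each constrained-core option. A minimal Python sketch of that arithmetic, using values from the table above (the `ram_per_core` helper is a hypothetical name introduced for illustration):

```python
from typing import Dict, List


def ram_per_core(ram_gb: float, core_counts: List[int]) -> Dict[int, float]:
    """Return GB of RAM per active core for each core count."""
    return {cores: round(ram_gb / cores, 2) for cores in core_counts}


# RAM per VM and core options taken from the table above.
print(ram_per_core(352, [44, 32, 16]))            # HC44rs
print(ram_per_core(448, [120, 96, 64, 32, 16]))   # HB120rs_v3
```

The computed values may differ slightly from the table because of rounding. The point is that a constrained-core size keeps the full RAM of the VM, so fewer active cores means more memory per core for large models.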
MATLAB Benchmark Results:
VM SKU | MATLAB: LU | MATLAB: FFT | MATLAB: ODE | MATLAB: Sparse |
HC44rs | 0.2121 | 0.6646 | 0.2604 | 0.5576 |
HB120rs_v3 | 0.2236 | 0.401 | 0.2082 | 1.3275 |
HB120rs_v2 | 0.2309 | 0.3290 | 0.2482 | 1.5880 |
D64ds_v5 | 0.1697 | 0.23 | 0.1879 | 0.4406 |
D64ads_v5 | 0.2106 | 0.2809 | 0.1948 | 1.1102 |
For an explanation of what the columns mean, I refer to the MATLAB Benchmark page:
LU (Lower-Upper Decomposition) Benchmark: The LU benchmark tests the performance of MATLAB for the lower-upper decomposition of large matrices. This benchmark involves factoring a matrix into lower and upper triangular matrices using different algorithms. Performance Factors: Floating-point, regular memory access
FFT (Fast Fourier Transform) Benchmark: The FFT benchmark tests the performance of MATLAB for computing the fast Fourier transform of large data sets. This benchmark involves transforming a time-domain signal into its frequency-domain representation. The results of the FFT benchmark are influenced by the size of the input data set and the complexity of the signal being transformed. Performance Factors: Floating-point, irregular memory access
ODE (Ordinary Differential Equation) Benchmark: The ODE benchmark tests the performance of MATLAB for solving systems of ordinary differential equations. This benchmark involves simulating the behavior of a physical system over time using differential equations. The results of the ODE benchmark are influenced by the complexity of the system being modeled and the accuracy of the numerical methods used to solve the equations. Performance Factors: Data structures and MATLAB function files, Disk Performance
Sparse Benchmark: The Sparse benchmark tests the performance of MATLAB for manipulating sparse matrices. This benchmark involves performing operations on matrices that have a large number of zero elements. The results of the Sparse benchmark are influenced by the size and sparsity of the input matrix, as well as the specific operation being performed. Performance Factors: Mixed integer and floating-point
Performance Comparison:
Using HC44rs as the performance baseline (1.00), a result of 1.50 means 150% of the HC44rs performance on that test.
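One way to derive such baseline-relative figures from the results table is sketched below in Python. It rests on an assumption that the article does not state explicitly: the table values are bench execution times in seconds, so lower is better and relative performance is the baseline time divided by the SKU's time. The `relative_performance` helper is a hypothetical name used for illustration.

```python
from typing import Dict, Tuple

TESTS = ("LU", "FFT", "ODE", "Sparse")

# Benchmark values copied from the results table above.
RESULTS: Dict[str, Tuple[float, float, float, float]] = {
    "HC44rs":     (0.2121, 0.6646, 0.2604, 0.5576),
    "HB120rs_v3": (0.2236, 0.401,  0.2082, 1.3275),
    "HB120rs_v2": (0.2309, 0.3290, 0.2482, 1.5880),
    "D64ds_v5":   (0.1697, 0.23,   0.1879, 0.4406),
    "D64ads_v5":  (0.2106, 0.2809, 0.1948, 1.1102),
}


def relative_performance(results, baseline: str = "HC44rs"):
    """Return baseline_time / sku_time per test (assumes lower time is better)."""
    base = results[baseline]
    return {
        sku: {test: b / t for test, b, t in zip(TESTS, base, times)}
        for sku, times in results.items()
    }


for sku, scores in relative_performance(RESULTS).items():
    print(sku, {test: round(score, 2) for test, score in scores.items()})
```

Under that assumption, a value above 1.00 means the SKU finished the test faster than HC44rs, and a value below 1.00 means it was slower.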
You may notice a third column for HB120rs_v3 for AVX2. There is some belief within MATLAB circles that MATLAB is "crippled" on AMD processors. That was not my experience. I tested the supposition by forcing MATLAB into MKL Debug mode, using a Windows batch file to launch MATLAB in AVX2 mode:
@echo off
rem Ask the Intel MKL library used by MATLAB to take its AVX2 code path
set MKL_DEBUG_CPU_TYPE=5
matlab.exe
While performance was slightly higher (roughly 1-5% faster), the difference was within the margin of error for the results, so the workaround proved largely unnecessary.
Conclusion:
MATLAB is a powerful computational tool that is widely used, particularly within the Financial Services Industry (FSI). However, to achieve optimal performance and efficiency, it's crucial to understand the factors that affect MATLAB's performance and how to optimize the workload for the hardware environment. Choosing the right Azure VM SKU, offloading computations to HPC Pack, benchmarking workloads, and optimizing MATLAB code are all effective ways to improve MATLAB's performance and scalability. Understanding your technical requirements for the computational environment will lead you to a specific SKU and help you decide whether to purchase a cloud savings plan or reserved instances for a portion of the nodes. By following these best practices, MATLAB users can reduce processing time, enhance data analysis and simulations, and ultimately improve their productivity and decision-making. Whether running MATLAB on-premises or in the cloud, optimizing its performance is critical for data scientists' satisfaction and for delivering results faster.