If your team runs high-performance computing (HPC) workloads on Azure, you can use a package manager to save time building and installing software. Spack is a popular open-source package management tool written in Python that builds software packages from source. Spack allows you to rapidly and consistently install many HPC packages using scripts called recipes. These recipes make it easy to deploy all the Azure resources needed for a particular piece of software, and more than 2,500 recipes for HPC packages already exist.
In this post, we explore using Spack to build and install packages on Azure Virtual Machines for HPC workloads (HBv2, HB, and HC). For testing purposes, this deployment takes advantage of the MPI libraries installed on the CentOS-HPC 7.7 image available from Azure Marketplace. It also integrates the Spack buildcache with Azure Blob storage to create a repository for the prebuilt binaries.
To show how this process works, we will use Spack to build OSU micro-benchmarks that measure the performance of the CentOS MPI libraries. The steps below explain how to upload the build cache to Blob storage, install OSU micro-benchmarks using the Blob storage build cache, and use Portable Batch System (PBS) scripts to run the resulting executables.
Install Spack on Azure
The azurehpc GitHub repository contains scripts (see the apps/spack directory) to automatically install Spack on Azure, set up suitable configuration files, and integrate the CentOS-HPC 7.7 MPI libraries. The repository also provides example PBS scripts that you can use to run the OSU micro-benchmarks with the different CentOS-HPC 7.7 MPI libraries (Open MPI, MVAPICH2, HPC-X, and Intel MPI).
To clone the GitHub repository:
git clone git@github.com:Azure/azurehpc.git
You can use the azurehpc scripts as is for testing, but you’ll probably want to customize your Spack installation. To do that, edit the configuration files (config.yaml, modules.yaml, compilers.yaml, and packages.yaml). The azurehpc Spack installation script (apps/spack/build_spack.sh) creates these files automatically and sets them as the Spack system defaults.
The example scripts do the following:
Install all software packages in /apps/spack/$sku_type (config.yaml) (where sku_type is hbv2, hb, or hc to indicate the target processor architecture).
Generate the Tcl module files at /apps/modulefiles/spack/tcl/$sku_type and the Lmod module files in /apps/modules/spack/lmod/$sku_type (config.yaml).
Use 16 processes by default for a parallel build (for example, make -j) (config.yaml).
Use the CentOS-HPC 7.7 MPI libraries as they are, without rebuilding them. See the packages.yaml configuration file for details (packages.yaml, compilers.yaml).
Note: Make sure you build on one of the virtual machine types designed for HPC workloads—HBv2, HB, or HC.
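To make the settings above concrete, here is a minimal sketch of what the config.yaml part could look like. It is not the file that build_spack.sh generates. The key names (install_tree, module_roots, build_jobs) are the ones older Spack releases accept in config.yaml, and newer releases may expect the module roots in modules.yaml instead. The SKU_TYPE and OUT_DIR variables are our own assumptions for illustration, and the file is written to a scratch directory rather than a real Spack configuration scope.

```shell
# Sketch only: write an illustrative config.yaml to a scratch directory.
sku_type="${SKU_TYPE:-hbv2}"          # hbv2, hb, or hc (hypothetical variable name)
out_dir="${OUT_DIR:-$(mktemp -d)}"    # scratch location, not a Spack config scope

cat > "${out_dir}/config.yaml" <<EOF
config:
  # Where Spack installs the packages it builds
  install_tree: /apps/spack/${sku_type}
  # Where the Tcl and Lmod module files are generated
  module_roots:
    tcl: /apps/modulefiles/spack/tcl/${sku_type}
    lmod: /apps/modules/spack/lmod/${sku_type}
  # Number of parallel build processes (make -j)
  build_jobs: 16
EOF

echo "Wrote ${out_dir}/config.yaml"
cat "${out_dir}/config.yaml"
```

Compare the result with the files the installation script actually creates before relying on any of these values.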
Build and install OSU micro-benchmarks
To build OSU micro-benchmarks with MVAPICH2 (from the CentOS-HPC 7.7 image), run the following command in Azure CLI:
spack install osu-micro-benchmarks%<COMPILER>@<VERSION>^mvapich2
Where %<COMPILER>@<VERSION>^mvapich2 is the spec, the syntax you use in Spack to specify versions and configuration options. Here, the spec tells Spack to use the named compiler at the given version and the latest mvapich2 MPI library.
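The spec syntax is easier to see when it is assembled from its parts. The sketch below composes a spec from a package name, a compiler (after %), and a dependency (after ^). The COMPILER and MPI variables and the gcc default are assumptions for illustration, so substitute the compiler and version available on your image. If Spack is on the PATH, the script asks Spack to resolve the spec without installing anything. Otherwise it only prints the install command.

```shell
# Compose a Spack spec: package, %compiler, ^dependency.
package="osu-micro-benchmarks"
compiler="${COMPILER:-gcc}"    # assumption: append @<version> to pin a compiler version
mpi="${MPI:-mvapich2}"         # the MPI library named after ^

spec="${package}%${compiler}^${mpi}"
echo "spec: ${spec}"

if command -v spack >/dev/null 2>&1; then
    # Show how Spack would resolve the spec; nothing is installed.
    spack spec "${spec}"
else
    echo "would run: spack install ${spec}"
fi
```

Swapping MPI=openmpi into the same script produces the corresponding Open MPI spec.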
To see the detailed installation options and software dependencies for a given package, use:
spack info <PACKAGE_NAME>
Similarly, you can build OSU micro-benchmarks using Open MPI:
spack install osu-micro-benchmarks%<COMPILER>@<VERSION>^openmpi
Creating the buildcache on Azure is straightforward because azurehpc contains a patch that integrates Azure Blob storage with the Spack buildcache. You just need to specify the location of the buildcache on Blob storage and upload the software. After your software is in a buildcache on Blob storage, you can install it at any time from the buildcache without recompiling the package.
1. Set up a mirror to tell Spack the location of the buildcache on Azure Blob storage:
spack mirror add <MIRROR_NAME> <BUILDCACHE_URL>
Note: The storage account must be created in advance with a container called buildcache.
When you upload software to the buildcache, SPEC corresponds to the installed software (for example, osu-micro-benchmarks%<COMPILER>@<VERSION>^openmpi), and the -k option specifies the GPG key used to sign the software. (A GPG key was generated as part of the azurehpc Spack installation. To see all available GPG keys, use spack gpg list.) The -m option specifies which mirror, and therefore which buildcache, to use.
If you need to retrieve or install software from the buildcache later, remember to save your GPG keys using spack gpg export. If you need to upload additional software to your buildcache, you will need to keep your private GPG key.
Note: To access your buildcache, set the environment variable AZURE_STORAGE_CONNECTION_STRING to your storage connection string (Azure portal > storage account > Access keys).
4. Check that you can see all the software (identified by <SPEC> or <HASH>) available in the buildcache:
spack buildcache list
5. To install software from the buildcache:
spack buildcache install <SPEC> or <HASH>
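The steps above can be sketched end to end as a dry run. The script below only prints each Spack command unless you set DRY_RUN=0 on a machine where Spack is installed. The mirror name is hypothetical, the mirror URL and GPG key are left as placeholders because their format depends on the azurehpc patch and on your own keys, and the -k and -m options follow the description above, so newer Spack releases may use different options.

```shell
# Dry-run sketch of the buildcache workflow (prints commands by default).
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "${DRY_RUN}" = "0" ]; then "$@"; fi
}

mirror_name="${MIRROR_NAME:-azure_buildcache}"   # hypothetical mirror name
mirror_url="${MIRROR_URL:-<BUILDCACHE_URL>}"     # placeholder: location of the buildcache container
gpg_key="${GPG_KEY:-<GPG_KEY>}"                  # placeholder: pick one from "spack gpg list"
spec="${SPEC:-osu-micro-benchmarks^openmpi}"     # software to upload and reinstall

# The buildcache on Blob storage is reached through this connection string.
if [ -z "${AZURE_STORAGE_CONNECTION_STRING:-}" ]; then
    echo "warning: AZURE_STORAGE_CONNECTION_STRING is not set" >&2
fi

run spack mirror add "${mirror_name}" "${mirror_url}"                        # register the mirror
run spack gpg list                                                           # show available keys
run spack buildcache create -k "${gpg_key}" -m "${mirror_name}" "${spec}"    # sign and upload
run spack buildcache list                                                    # check what is available
run spack buildcache install "${spec}"                                       # install without recompiling
```

Printing the commands first makes it easy to review the mirror, key, and spec before anything is uploaded.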
Test the installed software
We will use PBS to test OSU micro-benchmarks built with Spack. The PBS run scripts are available in the azurehpc repository. The following test uses the MVAPICH2 osu_bw/osu_latency PBS run script (osu_bw_latency_mvapich2.pbs). Similar scripts are available for the other MPI libraries.
With Spack’s spec syntax, you can specify the software to load using just enough of the name to uniquely identify it—for example, spack load osu-micro-benchmarks^mvapich2. You can also load the software with regular module load syntax if you prefer.
Assuming you have two HBv2 nodes, each running a single MPI process, you would submit the test script as follows:
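As a rough sketch, a submission for that layout could be built like this. The select statement uses standard PBS resource syntax. The value of 120 cores per node is an assumption based on the HBv2 VM size, so adjust it to match your nodes and queue configuration. The script prints the command when qsub or the run script is not available.

```shell
# Sketch: request two nodes with one MPI process each, then submit the run script.
nodes=2
procs_per_node=1
cores_per_node="${CORES_PER_NODE:-120}"   # assumption: cores on one HBv2 node
script="osu_bw_latency_mvapich2.pbs"      # run script from the azurehpc repository

select="select=${nodes}:ncpus=${cores_per_node}:mpiprocs=${procs_per_node}"

if command -v qsub >/dev/null 2>&1 && [ -f "${script}" ]; then
    qsub -l "${select}" "${script}"
else
    echo "would run: qsub -l ${select} ${script}"
fi
```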
Spack can really save you time when managing HPC clusters on Azure, because you don’t have to build code and libraries by hand. It’s also very flexible. You can easily customize it to suit your requirements and get started quickly using the recipes that already exist for more than 2,500 HPC packages. It’s pretty easy to write your own recipes, too. For more information, see the Spack website.