Run Spack on Azure and integrate the build cache with Azure Blob Storage
Published Jan 17 2020 05:46 PM
Microsoft

If your team runs high-performance computing (HPC) workloads on Azure, you can use a package manager to save time building and installing software. Spack is a popular open-source package management tool written in Python that builds software packages from source. Spack allows you to rapidly and consistently install many HPC packages using scripts called recipes. Each recipe describes how to build a particular piece of software and its dependencies, and more than 2,500 recipes for HPC packages already exist.


In this post, we explore using Spack to build and install packages on Azure Virtual Machines for HPC workloads (HBv2, HB, and HC). For testing purposes, this deployment takes advantage of the MPI libraries installed on the CentOS-HPC 7.7 image available from Azure Marketplace. It also integrates the Spack buildcache with Azure Blob storage to create a repository for the prebuilt binaries.


To show how this process works, we will use Spack to build OSU micro-benchmarks that measure the performance of the CentOS MPI libraries. The steps below explain how to upload the build cache to Blob storage, install OSU micro-benchmarks using the Blob storage build cache, and use Portable Batch System (PBS) scripts to run the resulting executables.


Install Spack on Azure

The azurehpc GitHub repository contains scripts (see the apps/spack directory) to automatically install Spack on Azure, set up suitable configuration files, and integrate the CentOS-HPC 7.7 MPI libraries. The repository also provides example PBS scripts that you can use to run the OSU micro-benchmarks with the different CentOS-HPC 7.7 MPI libraries (Open MPI, MVAPICH2, HPC-X, and Intel MPI).


To clone the GitHub repository:

git clone git@github.com:Azure/azurehpc.git

 

You can use the azurehpc scripts as is for testing, but you’ll probably want to customize your Spack installation. To do that, edit the configuration files (config.yaml, modules.yaml, compilers.yaml, and packages.yaml). The azurehpc Spack installation script (apps/spack/build_spack.sh) creates these files automatically and sets them as the Spack system defaults.


The example scripts do the following:

 

  • Install all software packages in /apps/spack/$sku_type (config.yaml) (where sku_type is hbv2, hb, or hc to indicate the target processor architecture).
  • Generate the Tcl module files at /apps/modulefiles/spack/tcl/$sku_type and the Lmod module files in /apps/modules/spack/lmod/$sku_type (config.yaml).
  • Use 16 processes by default for a parallel build (for example, make -j) (config.yaml).
  • Use the CentOS-HPC 7.7 MPI libraries as they are, without rebuilding them (packages.yaml, compilers.yaml).


Note: Make sure you build on one of the virtual machine types designed for HPC workloads—HBv2, HB, or HC.
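As a rough illustration of the settings listed above, the following sketch writes a config.yaml fragment to a scratch directory. The key names (install_tree, module_roots, build_jobs) follow the Spack configuration schema of releases from that period and are an assumption about what build_spack.sh generates. OUT_DIR and the default sku_type are hypothetical.

```shell
#!/bin/bash
# Sketch only: write a Spack config.yaml fragment that mirrors the settings above.
# Assumptions: the key names match the Spack config schema in use at the time;
# OUT_DIR is a hypothetical scratch location, not the path build_spack.sh uses.
sku_type="${sku_type:-hbv2}"            # hbv2, hb, or hc
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

cat > "${OUT_DIR}/config.yaml" <<EOF
config:
  install_tree: /apps/spack/${sku_type}
  module_roots:
    tcl: /apps/modulefiles/spack/tcl/${sku_type}
    lmod: /apps/modules/spack/lmod/${sku_type}
  build_jobs: 16
EOF

echo "Wrote ${OUT_DIR}/config.yaml"
cat "${OUT_DIR}/config.yaml"
```

Compare the result with the files that build_spack.sh actually produces before relying on it.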


Build and install OSU micro-benchmarks

To build OSU micro-benchmarks with MVAPICH2 (from the CentOS-HPC 7.7 image), run the following command in a shell on the VM:

 

spack install osu-micro-benchmarks%gcc@9.2.0^mvapich2

Here, osu-micro-benchmarks%gcc@9.2.0^mvapich2 is the spec, the syntax you use in Spack to specify versions and configuration options. The %gcc@9.2.0 part tells Spack to use the gcc@9.2.0 compiler, and the ^mvapich2 part tells it to use the mvapich2 MPI library as a dependency.


To see the detailed installation options and software dependencies for a given package, use:

 

spack info <PACKAGE_NAME>

 

Similarly, you can build OSU micro-benchmarks using Open MPI:

 

spack install osu-micro-benchmarks%gcc@9.2.0^openmpi 


Or HPC-X:

 

spack install osu-micro-benchmarks%gcc@9.2.0^hpcx 


Or Intel MPI:

 

spack install osu-micro-benchmarks%gcc@9.2.0^intel-mpi
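If you want all four builds, the commands above can be generated in a loop. The following is a minimal sketch, not part of azurehpc. The osu_spec helper and the DRY_RUN switch are hypothetical names introduced for illustration, and by default the script only prints the commands.

```shell
#!/bin/bash
# Sketch only: build the OSU micro-benchmarks once per MPI library.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
COMPILER="gcc@9.2.0"
MPI_LIBS="mvapich2 openmpi hpcx intel-mpi"

# Compose a spec of the form <package>%<compiler>^<mpi>.
osu_spec() {
    printf 'osu-micro-benchmarks%%%s^%s\n' "$COMPILER" "$1"
}

for mpi in $MPI_LIBS; do
    spec="$(osu_spec "$mpi")"
    if [ "$DRY_RUN" = "1" ]; then
        echo "spack install ${spec}"
    else
        spack install "${spec}"
    fi
done
```

Set DRY_RUN=0 to run the installs for real on a node where Spack is already set up.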


Set up the build cache on Blob storage

Creating the buildcache on Azure is straightforward because azurehpc contains a patch that integrates Azure Blob storage with the Spack buildcache. You just need to specify the location of the buildcache on Blob storage and upload the software. After your software is in a buildcache on Blob storage, you can install it at any time from the buildcache without recompiling the package.

 

1. Set up a mirror to tell Spack the location of the buildcache on Azure Blob storage:

    Note: The storage account must be created in advance, with a container called buildcache.

 

spack mirror add <SKU_TYPE>_buildcache "azure://<STORAGE_NAME>.blob.core.windows.net/buildcache/<SKU_TYPE>"

2. Create a public/private GPG key pair for signing software in the buildcache:

 

spack gpg init
spack gpg create <SKU_TYPE>_gpg <YOUR_EMAIL_ADDRESS>

3. Upload your built software to your buildcache:

 

spack buildcache create --rebuild-index -k <SKU_TYPE>_gpg  -m <SKU_TYPE>_buildcache <SPEC>

    Here, SPEC corresponds to the installed software (for example, osu-micro-benchmarks%gcc@9.2.0^openmpi). The -k option specifies the GPG key used to sign the software, and the -m option specifies which buildcache mirror to use. (A GPG key was generated as part of the azurehpc Spack installation. To see all available GPG keys, use spack gpg list.)


  If you need to retrieve or install software from the buildcache later, remember to save your GPG keys using spack gpg export. To upload additional software to your buildcache, you also need to keep your private GPG key.

NOTE: To access your buildcache, set the environment variable AZURE_STORAGE_CONNECTION_STRING to your storage connection string (in the portal: storage account > Access keys).

 

4. Check that you can see all the software (identified by <SPEC> or <HASH>) available in the buildcache:

 

spack buildcache list

5. To install software from the buildcache:

 

spack buildcache install <SPEC> or <HASH>
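The five steps above can be sketched as a single script. This is an illustration under stated assumptions rather than a script shipped with azurehpc. The default values for SKU_TYPE, STORAGE_NAME, EMAIL, and SPEC are hypothetical placeholders, the run helper and DRY_RUN switch are introduced only for this example, and the mirror URL assumes the standard blob.core.windows.net endpoint. The azure:// scheme works only with the azurehpc patch applied to Spack. By default the script prints the commands without running them.

```shell
#!/bin/bash
# Sketch only: the buildcache steps above as one script.
# Placeholder values below are hypothetical; replace them with your own.
# DRY_RUN=1 (the default here) prints the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"
SKU_TYPE="${SKU_TYPE:-hbv2}"
STORAGE_NAME="${STORAGE_NAME:-mystorageaccount}"
EMAIL="${EMAIL:-user@example.com}"
SPEC="${SPEC:-osu-micro-benchmarks%gcc@9.2.0^openmpi}"

MIRROR="${SKU_TYPE}_buildcache"
KEY="${SKU_TYPE}_gpg"
MIRROR_URL="azure://${STORAGE_NAME}.blob.core.windows.net/buildcache/${SKU_TYPE}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# A real run needs the connection string to reach the buildcache container.
if [ "$DRY_RUN" != "1" ] && [ -z "${AZURE_STORAGE_CONNECTION_STRING:-}" ]; then
    echo "AZURE_STORAGE_CONNECTION_STRING is not set" >&2
    exit 1
fi

run spack mirror add "$MIRROR" "$MIRROR_URL"                          # step 1
run spack gpg init                                                     # step 2
run spack gpg create "$KEY" "$EMAIL"
run spack buildcache create --rebuild-index -k "$KEY" -m "$MIRROR" "$SPEC"   # step 3
run spack buildcache list                                              # step 4
run spack buildcache install "$SPEC"                                   # step 5, usually on another node
```

Keeping the dry-run default makes it easy to review the exact commands before anything is uploaded.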

 

Test the installed software

We will use PBS to test the OSU micro-benchmarks built with Spack. The PBS run scripts are available in the azurehpc repository. The following test uses the MVAPICH2 osu_bw/osu_latency PBS run script (osu_bw_latency_mvapich2.pbs). Similar scripts are available for the other MPI libraries.

 


#!/bin/bash 
SHARED_APPS=/apps
export OMP_NUM_THREADS=1

module load gcc-9.2.0
module load mpi/mvapich2
spack load osu-micro-benchmarks^mvapich2
cat $PBS_NODEFILE

mpirun osu_bw
sleep 2
mpirun osu_latency

 

With Spack’s spec syntax, you can specify the software to load using just enough of the name to uniquely identify it—for example, spack load osu-micro-benchmarks^mvapich2. You can also load the software with regular module load syntax if you prefer.


Assuming you have two HBv2 nodes, each running a single MPI process, you would submit the test script as follows:

 

qsub -l select=2:ncpus=120:mpiprocs=1 osu_bw_latency_mvapich2.pbs 
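To adapt the resource request to a different node count or rank layout, you could compose the select string with a small helper. The following sketch only prints the qsub command. pbs_select is a hypothetical function introduced for illustration and is not part of PBS or azurehpc.

```shell
#!/bin/bash
# Sketch only: compose the PBS resource request used above.
# pbs_select is a hypothetical helper, not part of azurehpc or PBS.
pbs_select() {
    local nodes="$1" ncpus="$2" mpiprocs="$3"
    printf 'select=%s:ncpus=%s:mpiprocs=%s\n' "$nodes" "$ncpus" "$mpiprocs"
}

# Two HBv2 nodes (120 cores each), one MPI rank per node.
echo "qsub -l $(pbs_select 2 120 1) osu_bw_latency_mvapich2.pbs"
```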


Summary

Spack can really save you time when managing HPC clusters on Azure, because you don’t have to build code and libraries by hand. It’s also very flexible. You can easily customize it to suit your requirements and get started quickly using the recipes that already exist for more than 2,500 HPC packages. It’s pretty easy to write your own recipes, too. For more information, see the Spack website.

Last update: Oct 25 2022 12:41 PM