High Performance Computing (HPC) clusters in Azure are almost exclusively deployed per Azure Region (e.g., East US, South Central US, West Europe). Data gravity usually drives this: your data should be as close to the compute as possible to reduce latency. If a need arises to use a different Region, the default answer is to create a separate cluster in that Region and manage multiple clusters. This additional management overhead isn't always desired, and many customers ask how to create a single cluster that can span multiple Azure Regions. This blog provides an example of how to create a Multi-Region Slurm cluster using Azure CycleCloud (CC).
#FOLLOWING EXAMPLE ASSUMES BOTH VNETS IN SAME RESOURCE GROUP
#CREATE VNET PEERING FROM VNET-1 TO VNET-2
az network vnet peering create -g MyResourceGroup -n VNET1ToVNET2 \
--vnet-name VNET-1 --remote-vnet VNET-2 --allow-vnet-access
#CREATE VNET PEERING FROM VNET-2 TO VNET-1
az network vnet peering create -g MyResourceGroup -n VNET2ToVNET1 \
--vnet-name VNET-2 --remote-vnet VNET-1 --allow-vnet-access
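Once both peerings exist, it can be worth confirming that each side reports a Connected state before moving on. The following is a minimal sketch and not part of the original walkthrough: peering_connected is a hypothetical helper name, and it assumes the resource group, VNet, and peering names used in the commands above.

```shell
#!/bin/bash
# Hypothetical helper: print the peering state and succeed only when it is "Connected".
# Arguments: resource group, local VNet name, peering name.
peering_connected() {
  local rg="$1" vnet="$2" peering="$3" state
  state=$(az network vnet peering show -g "$rg" --vnet-name "$vnet" -n "$peering" \
            --query peeringState -o tsv) || return 1
  echo "${peering}: ${state}"
  [ "$state" = "Connected" ]
}
```

Calling peering_connected MyResourceGroup VNET-1 VNET1ToVNET2 and peering_connected MyResourceGroup VNET-2 VNET2ToVNET1 checks both directions; a peering that was only created from one side typically stays in the Initiated state.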
# CREATE PRIVATE DNS ZONE
az network private-dns zone create -g MyResourceGroup \
-n private.ccmr.net
#LINK VNETS TO PRIVATE DNS ZONE
az network private-dns link vnet create -g MyResourceGroup -n CCMRClusterLink1 \
-z private.ccmr.net -v VNET-1 -e true
az network private-dns link vnet create -g MyResourceGroup -n CCMRClusterLink2 \
-z private.ccmr.net -v VNET-2 -e true
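Because the cluster nodes in both VNets depend on auto-registration in the Private DNS Zone, a quick check that each link has registration enabled can catch a missed -e true. This is a sketch under the same naming assumptions as the commands above; dns_registration_enabled is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: succeed only when the given virtual network link
# has auto-registration enabled.
# Arguments: resource group, Private DNS Zone name, link name.
dns_registration_enabled() {
  local rg="$1" zone="$2" link="$3" enabled
  enabled=$(az network private-dns link vnet show -g "$rg" -z "$zone" -n "$link" \
              --query registrationEnabled -o tsv) || return 1
  echo "${link}: registrationEnabled=${enabled}"
  # Compare case-insensitively; the exact casing of the tsv boolean is not assumed here.
  case "$enabled" in
    [Tt][Rr][Uu][Ee]) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, dns_registration_enabled MyResourceGroup private.ccmr.net CCMRClusterLink1 would be run once per link (CCMRClusterLink1 and CCMRClusterLink2).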
* Credentials is the common name of the CC credential in your environment. This can be found in your CC GUI or with the CC CLI command: cyclecloud account list
* Primary* represents the scheduler and HTC partition, whereas Secondary* represents the HPC partition
* PrimarySubnet, PrimaryRegion, SecondarySubnet & SecondaryRegion:
  * *Subnet is of the format resource-group-name/vnet-name/subnet-name (the template has a placeholder name)
  * *Region name can be found with the azure-cli command: az account list-locations -o table
* HPCMachineType, MaxHPCExecuteCoreCount, HTCMachineType & MaxHTCExecuteCoreCount: adjust as necessary
* Replace private.ccmr.net with your specific Private DNS Zone name

cyclecloud import_cluster slurm-multigregion-cluster -c Slurm -f slurm-multiregion-git.txt -p slurm-multiregion-params-min.json
* slurm-multigregion-cluster = a name for the cluster chosen by you (no spaces)
* -c Slurm = name of the cluster defined in the template file (i.e., line #6)
* -f slurm-multiregion-git.txt = file name of the template to upload
* -p slurm-multiregion-params-min.json = file name of the parameters file to upload

Example job script (submit with sbatch mpi.sh):
#!/bin/bash
#SBATCH --job-name=mpiMultiRegion
#SBATCH --partition=hpc
#SBATCH -N 2
#SBATCH -n 120 # 60 MPI processes per node
#SBATCH --chdir /tmp
#SBATCH --exclusive
set -x
source /etc/profile.d/modules.sh
module load mpi/hpcx
echo "SLURM_JOB_NODELIST = " $SLURM_JOB_NODELIST
# Assign the number of processors
NPROCS=$SLURM_NTASKS
#Run the job
mpirun -n $NPROCS --report-bindings echo "hello world!"
mv slurm-${SLURM_JOB_ID}.out $HOME
NOTE: the default Slurm working directory is the path from which the job was submitted, typically the user home directory. As the home directory will likely be in Region1, it's important to explicitly set the working directory to something local to Region2. In the above example I set it to the VM-local /tmp (#SBATCH --chdir /tmp) and added a line at the end to move the Slurm output file to the user home directory.
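One possible variation on that final mv line is to do the move from an EXIT trap, so the output file is still brought back to the home directory when mpirun fails partway through. The sketch below is an assumption-laden illustration rather than part of the original script: stage_out is a hypothetical helper, and the job ID and directories are passed in explicitly so it can be tried outside of Slurm.

```shell
#!/bin/bash
# Hypothetical helper: move the Slurm output file from the job's working
# directory (e.g. node-local /tmp) to a destination directory.
# Arguments: job ID, source directory, destination directory.
stage_out() {
  local job_id="$1" src_dir="$2" dest_dir="$3"
  local out="${src_dir}/slurm-${job_id}.out"
  # Nothing to move if the job never produced an output file.
  [ -f "$out" ] || { echo "no output file: $out" >&2; return 1; }
  mkdir -p "$dest_dir" && mv "$out" "$dest_dir/"
}

# Inside the job script, the trap would replace the trailing mv line:
#   trap 'stage_out "$SLURM_JOB_ID" /tmp "$HOME"' EXIT
```

As with the original mv, anything written to the output file after the move is not carried over, so the trap is best registered once and left to run last.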
With careful planning and implementation, it is possible to create a Multi-Region Slurm cluster with Azure CycleCloud. This blog is not all-inclusive, and additional customization will likely be required for a customer-specific environment, such as adding mounts (e.g., datasets) specific to the workflow in Region2.