On Azure there are performance and cost advantages to deploying a BeeGFS parallel filesystem that uses NVMe SSDs for both storage and metadata (e.g. the L8s_v2 SKU). The NVMe SSDs provide superior throughput and IOPS, and because they come included with the SKU (e.g. the Lsv2 series), you avoid the cost and complexity of configuring, installing, and paying for additional data disks.
The primary disadvantage of using NVMe SSDs in your BeeGFS deployment is that these disks are not persistent: you will lose your data once you stop or deallocate your BeeGFS filesystem VMs. One approach to overcome this limitation is the storage pools feature of BeeGFS, which allows mixed disk types within the same BeeGFS filesystem. The idea is to add cheap HDD managed disks to an NVMe SSD based BeeGFS parallel filesystem to provide data persistence. The details of how to deploy BeeGFS with NVMe and HDD storage pools are discussed below.
The AzureCAT HPC azurehpc repository will be used to deploy the BeeGFS storage pools architecture:
git clone git@github.com:Azure/azurehpc.git
We will be following the azurehpc beegfs_pools example (in azurehpc/examples/beegfs_pools) closely. First, run azhpc-init with the -s option to see which variables must be set:
$ azhpc-init -c $azhpc_dir/examples/beegfs_pools -d beegfs_pools -s
Fri Jun 28 08:50:25 UTC 2019 : variables to set: "-v location=,resource_group="
Then initialize the project with those variables filled in:
azhpc-init -c $azhpc_dir/examples/beegfs_pools -d beegfs_pools -v location=westus2,resource_group=azhpc-cluster
Build the cluster:
azhpc-build
Once the deployment completes, log in to the BeeGFS management node:
azhpc-connect -u hpcuser beegfsm
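From there you can verify that the storage targets are up and see how they are grouped. For example, to list the registered storage nodes (a quick sanity check; node names will vary with your configuration):
beegfs-ctl --listnodes --nodetype=storage --details
Then list the storage pools: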
$ beegfs-ctl --liststoragepools
Pool ID   Pool Description   Targets   Buddy Groups
=======   ================   =======   ============
      1   Default            2,4
      2   hdd_pool           1,3
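The azurehpc deployment scripts create the hdd_pool shown above automatically. For reference, setting this up by hand would look roughly like the following (the target IDs match the listing above and are specific to this deployment):
beegfs-ctl --addstoragepool --desc=hdd_pool --targets=1,3
beegfs-ctl --setpattern --storagepoolid=2 /beegfs/hdd_pools
New files created under /beegfs/hdd_pools are then striped across the HDD targets in pool 2, while files under /beegfs/data stay on the NVMe SSD targets in the default pool.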
After I/O processing is complete, the data can be migrated to persistent HDD storage using either of the following commands.
cp -R /beegfs/data /beegfs/hdd_pools
or
beegfs-ctl --migrate --storagepoolid=1 --destinationpoolid=2 /beegfs/data
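Note the difference between the two approaches: cp writes a second copy of the data into a directory assigned to the HDD pool, while beegfs-ctl --migrate moves the existing files' chunks onto the targets of the destination pool without changing their paths.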
Before stopping or deallocating the VMs, also save the BeeGFS metadata from the metadata servers; the tar --xattrs option is required because BeeGFS stores its metadata in extended attributes. Here pdsh is used, with WCOLL pointing to a file listing the metadata server hostnames:
WCOLL=beegfssm pdsh "cd /mnt/beegfs; sudo tar czvf /home/${user}/beegfs_meta.tar.gz meta/ --xattrs"
After the BeeGFS parallel filesystem is restarted, the metadata can be restored and the data migrated back to the BeeGFS NVMe SSDs with the following procedure.
WCOLL=beegfssm pdsh "cd /mnt/beegfs; sudo tar xvf /home/${user}/beegfs_meta.tar.gz --xattrs"
cp -R /beegfs/hdd_pools/data /beegfs/data
or
beegfs-ctl --migrate --storagepoolid=2 --destinationpoolid=1 /beegfs/data
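To verify which pool a file or directory now resides in, query its entry information:
beegfs-ctl --getentryinfo /beegfs/data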
The azurehpc repository contains a number of scripts to measure storage throughput and IOPS using the IOR and FIO benchmark codes; see azurehpc/apps/ior and azurehpc/apps/fio.
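For example, a simple IOR throughput test against the NVMe pool might look like the following (the process count, transfer size, and block size are illustrative and should be tuned to your cluster):
mpirun -np 64 ior -a POSIX -w -r -t 1m -b 4g -F -o /beegfs/data/ior_test
The -F option uses one file per process, the access pattern where a parallel filesystem's aggregate throughput is most visible.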
BeeGFS has a built-in feature called storage pools which allows a parallel filesystem to be deployed with different types of disks, creating pools of storage resources with different performance characteristics. This feature can be used to build an ephemeral NVMe SSD based BeeGFS parallel filesystem whose data is persisted by adding low-cost HDD data disks.
The deployment of BeeGFS storage pools has been automated in the AzureCAT HPC azurehpc repository. We have also shown how to checkpoint/restart a BeeGFS parallel filesystem, migrating the data between NVMe SSDs and HDDs.
This solution provides the best of both worlds: the performance of NVMe SSDs with the low cost of persistent HDDs.