This blog post walks through how to set up an Azure Managed Lustre Filesystem (AMLFS) that automatically synchronises to an Azure BLOB Storage container. The synchronisation is achieved using the Lustre HSM (Hierarchical Storage Management) interface combined with the Robinhood policy engine and a tool that reads the Lustre changelog and synchronises metadata with the archived storage. The lfsazsync repository on GitHub contains a Bicep template to deploy and set up a virtual machine for this purpose.
Disclaimer: The lfsazsync deployment is not a supported Microsoft product; you are responsible for the deployment and operation of the solution. Some updates need to be applied to AMLFS, which requires a Support Request to be raised through the Azure Portal. These updates could affect the stability of AMLFS, and customers requiring the same level of SLA should speak to their Microsoft representative.
The following is required before running the lfsazsync Bicep template:
The lfsazsync repository contains a test/infra.bicep example to create the required resources:
To deploy, first create a resource group, for example:
# TODO: set the variables below
resource_group=
location=
az group create --name $resource_group --location $location
Then deploy into this resource group:
az deployment group create --resource-group $resource_group --template-file test/infra.bicep
Note: The Bicep file has parameters for names, IP ranges, etc. that should be set if you do not want the default values.
Once deployment is complete, navigate to the Azure Portal, locate the AMLFS resource and click on "New Support Request". The following shows the suggested request to get AMLFS updated:
The lctl commands needed are listed here.
The lfsazsync deployment sets up a single virtual machine for all tasks. The HSM copytools could be run on multiple virtual machines to increase transfer performance. The bandwidth for archiving and retrieval is constrained to approximately half the network bandwidth available to the virtual machine, because the same network is used for both accessing the Lustre filesystem and accessing Azure Storage. This should be considered when deciding the virtual machine size. The virtual machine sizes and their expected network performance are available here.
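To make the sizing trade-off concrete, the sketch below estimates how long a transfer would take through a single virtual machine. It assumes the "approximately half" rule described above holds exactly; the function name and the example figures are hypothetical and only illustrate the arithmetic.

```shell
# Rough transfer-time estimate for a single copytool virtual machine.
# Assumption: usable archive/restore bandwidth is half of the VM's
# nominal network bandwidth, as described above.
# Arguments: data size in GB (decimal), nominal VM bandwidth in Gbit/s.
estimate_transfer_seconds() {
    awk -v gb="$1" -v gbps="$2" 'BEGIN {
        usable_gbps = gbps / 2
        printf "%.0f\n", (gb * 8) / usable_gbps
    }'
}

# Hypothetical figures: 1000 GB through a VM with 16 Gbit/s of network bandwidth
estimate_transfer_seconds 1000 16   # prints 1000
```

Real throughput will also depend on file sizes, the storage account limits and the Lustre filesystem itself, so this is best treated as a lower bound on the transfer time.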
The Bicep template has the following parameters:
Parameter | Description |
---|---|
subnet_id | The ID of the subnet to deploy the virtual machine to |
vmsku | The SKU of the virtual machine to deploy |
admin_user | The username of the administrator account |
ssh_key | The public key for the administrator account |
lustre_mgs | The IP address/hostname of the Lustre MGS |
storage_account_name | The name of the Azure storage account |
storage_container_name | The container to use for synchronising the data |
storage_sas_key | A SAS key for the storage account |
ssh_port | The port used by sshd on the virtual machine |
github_release | Release tag from which Robinhood and Lemur will be downloaded |
os | The OS to use for the VM (options: ubuntu2004 or almalinux87) |
The SAS key can be generated using the following Azure CLI command:
# TODO: set the account name and container name below
account_name=
container_name=
start_date=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
expiry_date=$(date -u +"%Y-%m-%dT%H:%M:%SZ" --date "next month")
az storage container generate-sas \
--account-name $account_name \
--name $container_name \
--permissions rwld \
--start $start_date \
--expiry $expiry_date \
-o tsv
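The token generated above is valid for one month, so it can be useful to check what a given token actually allows and when it expires. The helper below is a minimal sketch that splits a SAS token into its query fields; the function name and the example token are hypothetical, and the token shown is not a valid signature.

```shell
# Extract one field (for example "se" for expiry or "sp" for permissions)
# from a SAS token of the form key1=value1&key2=value2...
# Arguments: the SAS token, the field name.
sas_field() {
    printf '%s\n' "$1" | tr '&' '\n' | sed -n "s/^$2=//p"
}

# Hypothetical token for illustration only
example_sas='se=2030-01-31T00%3A00%3A00Z&sp=rwdl&sv=2022-11-02&sr=c&sig=REDACTED'
sas_field "$example_sas" se   # prints 2030-01-31T00%3A00%3A00Z
sas_field "$example_sas" sp   # prints rwdl
```

The expiry value is URL-encoded, so colons appear as %3A.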
The following Azure CLI command can be used to get the subnet ID:
# TODO: set the variables below
resource_group=
vnet_name=
subnet_name=
az network vnet subnet show --resource-group $resource_group --vnet-name $vnet_name --name $subnet_name --query id --output tsv
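Because the subnet ID is passed straight into the deployment, a quick sanity check on its shape can catch an empty or truncated value early. The sketch below assumes the ID uses the casing that the Azure CLI normally returns; the function name is hypothetical.

```shell
# Return success if the argument looks like a full subnet resource ID.
# Assumption: the ID uses the casing normally returned by the Azure CLI.
is_subnet_id() {
    case "$1" in
        /subscriptions/*/resourceGroups/*/providers/Microsoft.Network/virtualNetworks/*/subnets/*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example: capture the output of the command above and check it
# subnet_id=$(az network vnet subnet show --resource-group "$resource_group" \
#     --vnet-name "$vnet_name" --name "$subnet_name" --query id --output tsv)
# is_subnet_id "$subnet_id" || echo "unexpected subnet ID: $subnet_id" >&2
```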
The following Azure CLI command can be used to deploy the Bicep template (as an alternative to setting environment variables, the parameters could be set in a parameters.json file):
# TODO: set the variables below
resource_group=
subnet_id=
vmsku="Standard_D32ds_v4"
admin_user=
ssh_key=
lustre_mgs=
storage_account_name=
storage_container_name=
storage_sas_key=
ssh_port=
github_release="v1.0.1"
os="almalinux87"
az deployment group create \
--resource-group $resource_group \
--template-file lfsazsync.bicep \
--parameters \
subnet_id="$subnet_id" \
vmsku=$vmsku \
admin_user="$admin_user" \
ssh_key="$ssh_key" \
lustre_mgs=$lustre_mgs \
storage_account_name=$storage_account_name \
storage_container_name=$storage_container_name \
storage_sas_key="$storage_sas_key" \
ssh_port=$ssh_port \
github_release=$github_release \
os=$os
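For the parameters.json alternative mentioned above, the following is a minimal sketch that writes the same shell variables into a deployment parameters file. The function name is hypothetical. It assumes that none of the values contain double quotes or backslashes (no JSON escaping is done) and that ssh_port is declared as an integer in the template; if it is a string, wrap the value in quotes.

```shell
# Write a deployment parameters file from the shell variables set above.
# Assumptions: values contain no double quotes or backslashes, and
# ssh_port is an integer parameter in the template.
# Argument: path of the file to write.
write_parameters_file() {
    cat > "$1" <<EOF
{
  "\$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "subnet_id": { "value": "$subnet_id" },
    "vmsku": { "value": "$vmsku" },
    "admin_user": { "value": "$admin_user" },
    "ssh_key": { "value": "$ssh_key" },
    "lustre_mgs": { "value": "$lustre_mgs" },
    "storage_account_name": { "value": "$storage_account_name" },
    "storage_container_name": { "value": "$storage_container_name" },
    "storage_sas_key": { "value": "$storage_sas_key" },
    "ssh_port": { "value": $ssh_port },
    "github_release": { "value": "$github_release" },
    "os": { "value": "$os" }
  }
}
EOF
}

# Example:
# write_parameters_file parameters.json
# az deployment group create --resource-group "$resource_group" \
#     --template-file lfsazsync.bicep --parameters @parameters.json
```

Since the file contains the SAS key, it should be kept out of source control and removed once the deployment has finished.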
After this call completes, the virtual machine will be deployed, although it will take more time to install the software and import the metadata from Azure BLOB storage into the Lustre filesystem. Progress can be monitored in the /var/log/cloud-init-output.log file on the virtual machine.
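Rather than watching the log by hand, a small check can report when cloud-init has finished. The sketch below assumes the image uses cloud-init's default final message, which contains the text "finished at", and that the install and import run as part of cloud-init; the function name is hypothetical.

```shell
# Return success once cloud-init has written its final message to the log.
# Assumption: the default final message ("Cloud-init v. ... finished at ...")
# has not been customised on this image.
# Argument (optional): log file path, defaults to the cloud-init output log.
cloud_init_finished() {
    grep -q 'Cloud-init v\. .* finished at' "${1:-/var/log/cloud-init-output.log}"
}

# Example (run on the virtual machine): poll every 30 seconds
# until cloud_init_finished; do sleep 30; done
```

On images where it is available, cloud-init status --wait offers a similar blocking check.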
The install will set up three systemd services for lhsmd, robinhood and lustremetasync. The log files are located here:
The synchronisation parameters can be controlled through the Robinhood config file, /opt/robinhood/etc/robinhood.d/lustre.conf. Below are some of the default settings and their locations in the config file:
Name | Default | Location |
---|---|---|
Archive interval | 5 minutes | lhsm_archive_parameters.lhsm_archive_trigger |
Rate limit | 1000 files | lhsm_archive_parameters.rate_limit.max_count |
Rate limit interval | 10 seconds | lhsm_archive_parameters.rate_limit.period_ms |
Archive threshold | last modified time > 30 minutes | lhsm_archive_parameters.lhsm_archive_rules |
Release trigger | 85% of OST usage | lhsm_archive_parameters.lhsm_release_trigger |
Small file release | last access > 1 year | lhsm_archive_parameters.lhsm_release_rules |
Default file release | last access > 1 day | lhsm_archive_parameters.lhsm_release_rules |
File remove | removal time > 5 minutes | lhsmd.lhsmd_remove_rules |
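As a worked example of the rate limit defaults in the table, the sketch below computes the upper bound on archive requests per hour. It assumes the rate limit caps archiving at max_count files per period_ms milliseconds; the function name is hypothetical.

```shell
# Upper bound on archive requests per hour implied by the rate limit.
# Arguments: max_count (files), period_ms (milliseconds).
max_archive_files_per_hour() {
    awk -v count="$1" -v period_ms="$2" 'BEGIN {
        printf "%.0f\n", count * 3600000 / period_ms
    }'
}

# Defaults from the table: 1000 files every 10 seconds
max_archive_files_per_hour 1000 10000   # prints 360000
```

If a workload creates files faster than this, the backlog of unarchived files will grow until the rate drops or the limits are raised.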
To change these settings, edit the config file and then restart the robinhood service with systemctl restart robinhood.
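A config error can prevent the service from starting, so it is worth confirming that it is active after the restart. The following is a minimal sketch with a hypothetical function name; it assumes the systemd unit is called robinhood, as above.

```shell
# Restart Robinhood after a config change and confirm it came back up.
# Returns non-zero if the restart fails or the service is not active.
restart_robinhood() {
    systemctl restart robinhood || return 1
    systemctl is-active --quiet robinhood
}

# Example (as root on the virtual machine), keeping a copy of the old config:
# cp /opt/robinhood/etc/robinhood.d/lustre.conf /root/lustre.conf.bak
# ... edit lustre.conf ...
# restart_robinhood || journalctl -u robinhood -n 50
```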
The lustremetasync service processes the Lustre changelog continuously, so actions normally happen immediately. After a large burst of I/O, it may take a few minutes to catch up. The following operations are handled:
Create/delete directories
Directories are created in BLOB storage as an empty object with the name of the directory. There is metadata on this file to indicate that it is a directory. The same object is deleted when removed on the filesystem.
Create/delete symbolic links
Symbolic links are created in BLOB storage as an empty object with the name of the symbolic link. There is metadata on this object to indicate that it is a symbolic link, and this contains the path that it links to. The same object is deleted when the link is removed from the filesystem.
Moving files or directories
Moving files or directories requires everything being moved to be restored to the Lustre filesystem. The files are then marked as dirty in their new location and the existing files are deleted from BLOB storage. Robinhood will handle archiving the files again in their new location.
Updating metadata (e.g. ownership and permissions)
The metadata is only updated for archived files that have not been modified. Modified files have their metadata set when Robinhood updates the archived file.
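The metadata described above can be inspected directly on the objects in BLOB storage. The sketch below wraps the Azure CLI metadata command and reuses the variables from the deployment step. The function name and the example path are hypothetical, and the metadata key names that lfsazsync writes are not listed here.

```shell
# Show the metadata stored on a blob, for example one that represents a
# directory or symbolic link.
# Argument: blob name (path relative to the container root).
show_blob_metadata() {
    az storage blob metadata show \
        --account-name "$storage_account_name" \
        --container-name "$storage_container_name" \
        --name "$1" \
        --sas-token "$storage_sas_key" \
        --output json
}

# Example with a hypothetical path:
# show_blob_metadata "projects/run1"
```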