Azure Managed Lustre
Lustre is an open-source parallel filesystem born for high performance computing as a research project back in 1999. Its name is the fusion of Linux and cluster, suggesting its strong vocation to deliver extreme parallel I/O performance for Linux-based clusters.
During standard Lustre operations, data is striped across object storage servers (OSS), while metadata (filenames, directories, permissions) is stored on separate metadata servers (MDS). This separation is the key to the parallel I/O performance the file system can deliver, and it allows performance and capacity to scale by increasing the number of MDS and OSS.
Back in February we announced the Public Preview of Azure Managed Lustre, a fully managed Lustre service that provides scalable, high-performance storage for HPC/AI workloads on Azure.
Today we are proudly announcing the General Availability of Azure Managed Lustre.
Azure Managed Lustre delivers all the performance and scalability benefits of Lustre, without the burden of managing the underlying infrastructure. Moreover, it integrates with Azure Blob Storage through Lustre HSM for data retrieval and archival. This allows HPC/AI workloads to keep their working datasets on the hot tier while the remaining data stays in Azure Blob, minimizing operational costs.
Azure Managed Lustre delivers a nominal bandwidth per provisioned TB that depends on the tier, and parallel I/O benchmarks have already shown that the aggregate bandwidth reaches this nominal target.
Considering all the details above, it is clear how Azure Managed Lustre File System (AMLFS) is a service strongly oriented toward Linux HPC/AI infrastructure and accessible by installing the specific kernel modules on a Linux client.
This article is focused on providing a recipe to expose Azure Managed Lustre File Systems to Windows clients through SMB/CIFS protocol.
Disclaimer: This deployment recipe is not a supported Microsoft product. You are responsible for the deployment and operation of this SAMBA solution.
Lustre is designed to be accessed primarily from Linux clients through the installation of Lustre kernel modules. However, in several scenarios, users of HPC/AI infrastructure need access to simulation input/output files for pre-processing or post-processing, or simply to make data available to the cluster. In these scenarios, a Windows client can benefit from accessing the Lustre file system directly from Windows Explorer, without the need for SCP or other file transfer methods.
In the following sections, after a brief introduction to Samba and the architecture that will be deployed on Azure, we will describe how to set up a Samba server on different Linux operating systems with local user authentication (Linux managed) or with Active Directory Domain integration.
SAMBA is a free and open-source software suite that provides seamless file and print services to SMB/CIFS (Server Message Block/Common Internet File System) clients. Samba allows for interoperability between Linux servers and Windows-based clients.
It was originally developed by Andrew Tridgell in 1992, and since then, it has become a standard tool for virtually all Linux distributions.
SAMBA allows specific folders on a Linux server to be exported to SMB/CIFS clients, including Windows clients.
SAMBA on Linux
SAMBA can be configured to fine tune several aspects of the SMB/CIFS shares including authentication, authorization, user mapping and advanced features like ACLs and extended attributes.
In general, when configuring a SAMBA server, three aspects are critical to plan:
- Server operating mode
- Server security mode
- User ID Mapping
A full description of SAMBA configuration is out of the scope of this article. Very good references are:
- For Server operating modes: Chapter 3. Using Samba as a server Red Hat Enterprise Linux 8 | Red Hat Customer Portal
- For Server security modes: Chapter 3. Using Samba as a server Red Hat Enterprise Linux 8 | Red Hat Customer Portal
- For User ID Mapping: Chapter 3. Using Samba as a server Red Hat Enterprise Linux 8 | Red Hat Customer Portal
In the following sections we will showcase two scenarios:
- The configuration of a standalone (no domain joined) SAMBA server, operating in user security mode
- The configuration of a domain joined SAMBA server, operating in Active Directory security mode
If interested in the scripts and the procedure used in the current documentation, we would like to point you to the related GitHub repository, containing some automated configuration scripts for the different operating systems.
Deploying a SAMBA server exporting AMLFS with local user authentication
In this section we will implement the architecture described in the diagram below, where a Linux VM operates as a standalone SAMBA server with local user authentication.
To configure a SAMBA server exporting an AMLFS volume with local Linux authentication, it is necessary to deploy an Azure Virtual Machine keeping in mind the following:
- It is suggested to use the latest version of a RedHat-based or Debian-based OS for performance and out-of-the-box access to the latest SAMBA versions. The procedure in this article has been tested on Alma Linux 8.5, CentOS 7.9, RedHat 7.9 and 8.8, Ubuntu 20.04 and 22.04.
- The VM should be located in the same Availability Zone as the Azure Managed Lustre File System for best performance.
- The VM should have accelerated networking enabled.
- The VM should have a line of sight with AMLFS from a network perspective, ideally without any Firewall or Network device in the middle. This means that the preferred configuration is the same Virtual Network of a Lustre mount. This would allow for maximum performance.
- VM size should consider the number of clients that will connect to the servers for CPU and RAM sizing.
- VM size should also consider network bandwidth limits
For most scenarios, we suggest using VMs of the Dasv5-series, Dv5-series, Easv5-series or Ev5-series. The SAMBA server may benefit from E-series VMs for increased caching capabilities. At the same time, it is not easy to provide a formula for the number of CPUs/RAM per user, since it greatly depends on the usage profile of the SAMBA server.
The suggested approach is to start with a guessed size and to perform monitoring of RAM/CPU usage. Afterward, it will be possible to adjust size accordingly thanks to Azure VM resizing options.
Installing Lustre Kernel modules
After VM deployment, the first step to carry out is installing the Lustre kernel modules and client in order to be able to mount the designated Lustre filesystem.
After the installation of the kernel modules is completed, a quick check that the installation was successful can be done with the following command (to be executed as root or with sudo):
sudo modprobe -v lustre
On Alma Linux 8 for example the output should look like the following:
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/net/libcfs.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/net/lnet.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/obdclass.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/ptlrpc.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/osc.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/fld.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/lov.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/fid.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/mdc.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/lmv.ko
insmod /lib/modules/4.18.0-348.20.1.el8_5.x86_64/extra/lustre-client-4.18.0.348.20.1.el8.5-2.15.1_29_gbae0abe/fs/lustre.ko
After this step is completed, you can now mount the Lustre file system as usual.
sudo mount -t lustre -o noatime,flock <MGS_IP_ADDRESS>@tcp:/lustrefs /lustre-fs
We will assume that the file system is mounted on /lustre-fs on the server.
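Before exporting the path through SAMBA, it can be useful to confirm that /lustre-fs is really backed by Lustre and is not just an empty local directory. The following is a minimal sketch: the helper name is_lustre_mounted and its optional second argument (an alternative mounts table, handy for testing) are illustrative assumptions and not part of any Lustre tooling.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if MOUNT_POINT appears in the mounts
# table with file system type "lustre".
# Usage: is_lustre_mounted MOUNT_POINT [MOUNTS_FILE]
is_lustre_mounted() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    # /proc/mounts columns: device, mount point, fs type, options, dump, pass
    awk -v mp="$mount_point" \
        '$2 == mp && $3 == "lustre" { found = 1 } END { exit !found }' \
        "$mounts_file"
}

if is_lustre_mounted /lustre-fs; then
    echo "/lustre-fs is a Lustre mount"
else
    echo "/lustre-fs is not mounted as Lustre" >&2
fi
```

A check like this could be placed in front of the SAMBA service start, so that an unmounted file system is noticed before clients write into the local directory by mistake.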
Installing SAMBA service
Next step will be to install SAMBA server packages.
This can be achieved:
- On Alma Linux 8.5, RedHat 7.9, RedHat 8.8 and CentOS 7.9 with the command
sudo yum install -y samba
- On Ubuntu 20.04 and Ubuntu 22.04 with the command
sudo apt-get install -y samba
As a next step, let’s create the smb.conf configuration file in /etc/samba/smb.conf with the following content:
[global]
workgroup = SAMBA
security = user
passdb backend = tdbsam
; Required only for SAMBA versions older than 4.17
ea support = off
[lustre-fs]
comment = Lustre FS
browseable = no
create mask = 0700
directory mask = 0700
valid users=azureuser
read only = No
path = /lustre-fs
Full documentation of the options used in the file can be found in the smb.conf man page, which can be accessed with the command:
man smb.conf
The configuration above is a very basic one, which sets the SAMBA server to:
- Operate in standalone mode with user-level security, so that the client needs to provide a valid username and password
- Use TDB (Trivial Database) to store user passwords locally
- Create a SAMBA share, visible as lustre-fs, exporting the /lustre-fs path. For this share we also specify that:
  - The share won’t be visible in Network explorer; it is accessible only through its direct path
  - New files and directories are created by default with mask 0700
  - Only azureuser is authorized to connect. However, please be aware that this is an authorization at share connection level. In the directory tree, files and folders still follow the standard Linux permissions assigned to them.
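When more than one share is needed, the share section can be generated instead of typed by hand. The sketch below prints a section with the same options as the lustre-fs share above; the function name render_share and its three positional arguments are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a share section with the same options used
# for the lustre-fs share in this article.
# Usage: render_share SHARE_NAME EXPORT_PATH VALID_USERS
render_share() {
    cat <<EOF
[$1]
comment = Lustre FS
browseable = no
create mask = 0700
directory mask = 0700
valid users = $3
read only = No
path = $2
EOF
}

# Reproduce the section shown above
render_share lustre-fs /lustre-fs azureuser
```

The generated text can be appended to /etc/samba/smb.conf, and the resulting file can be checked with testparm -s before the service is started.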
After setting the configuration in the /etc/samba/smb.conf, let’s define the azureuser SAMBA password with the command
sudo smbpasswd -a azureuser
After having completed these steps, two additional steps are required on the server:
- Checking the status of SELinux
- Checking the status of the firewall
SELinux
By default, SELinux will prevent the SAMBA server from exporting the file system correctly. If your IT Security policy allows it, you can switch SELinux to Permissive mode, where monitoring and logging are still active to a certain degree but restrictions are not enforced. This can be achieved with the command:
sudo setenforce 0
If your IT Policy requires SELinux to remain in Enforcing mode, it is necessary to enable SAMBA operations. Quite good documentation can be found in the man pages of samba_selinux. To access them from a terminal, just type man samba_selinux. This requires, if not already installed, the package selinux-policy-doc:
sudo yum install -y selinux-policy-doc #(AlmaLinux, CentOS, RedHat)
sudo apt-get install -y selinux-policy-doc #(Ubuntu)
For example, on AlmaLinux, CentOS, RedHat, in order to allow SAMBA to export any file and folder in read/write mode, just use the command:
sudo setsebool -P samba_export_all_rw 1
Firewall
Depending again on the specific requirements of your IT Security policies, you may need to keep the firewall service enabled on the SAMBA server VM. In this scenario, it is required that you whitelist SAMBA server in the firewall:
- On AlmaLinux, CentOS, RedHat:
sudo firewall-cmd --permanent --add-service=samba
sudo firewall-cmd --reload
- On Ubuntu:
sudo ufw allow samba
Starting the service
After having completed the configuration above, it is necessary to start the service. Use the following command to enable SAMBA at boot and start it immediately:
- On AlmaLinux, CentOS, RedHat:
sudo systemctl enable smb --now
- On Ubuntu:
sudo systemctl enable smbd --now
To check the status of the service, let’s run:
- On AlmaLinux, CentOS, RedHat:
sudo systemctl status smb
- On Ubuntu:
sudo systemctl status smbd
The output on AlmaLinux is the following, as an example:
● smb.service - Samba SMB Daemon
Loaded: loaded (/usr/lib/systemd/system/smb.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2023-06-25 18:49:13 UTC; 5s ago
Docs: man:smbd(8)
man:samba(7)
man:smb.conf(5)
Main PID: 65148 (smbd)
Status: "smbd: ready to serve connections..."
Tasks: 3 (limit: 50473)
Memory: 9.0M
CGroup: /system.slice/smb.service
├─65148 /usr/sbin/smbd --foreground --no-process-group
├─65150 /usr/sbin/smbd --foreground --no-process-group
└─65151 /usr/sbin/smbd --foreground --no-process-group
Jun 25 18:49:13 alma-linux-8-samba-server systemd[1]: Starting Samba SMB Daemon...
Jun 25 18:49:13 alma-linux-8-samba-server smbd[65148]: [2023/06/25 18:49:13.178721, 0] ../../source3/smbd/server.c:1741(main)
Jun 25 18:49:13 alma-linux-8-samba-server smbd[65148]: smbd version 4.17.5 started.
Jun 25 18:49:13 alma-linux-8-samba-server smbd[65148]: Copyright Andrew Tridgell and the Samba Team 1992-2022
Jun 25 18:49:13 alma-linux-8-samba-server systemd[1]: Started Samba SMB Daemon.
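Since the systemd unit is called smb on the RedHat family and smbd on Ubuntu, automation scripts have to pick the right name. The following is a minimal sketch that maps the ID field of /etc/os-release to the unit name; the function name samba_service_name is hypothetical, and only the distributions tested in this article are covered.

```shell
#!/bin/sh
# Hypothetical helper: map the ID field of /etc/os-release to the SAMBA
# systemd unit name used in this article.
samba_service_name() {
    case "$1" in
        almalinux|centos|rhel) echo smb ;;
        ubuntu)                echo smbd ;;
        *) echo "unsupported distribution: $1" >&2; return 1 ;;
    esac
}

# /etc/os-release is a shell-sourceable file; read ID in a subshell so that
# its variables do not leak into the current shell.
if [ -r /etc/os-release ]; then
    distro_id=$(. /etc/os-release && echo "$ID")
    samba_service_name "$distro_id" || true
fi
```

With this helper, the enable step could be written once for every distribution as sudo systemctl enable --now "$(samba_service_name "$distro_id")".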
Testing the service
After all the operations above have been completed, we are ready to test the first connection to the SMB server.
This will require a Windows client, but the SAMBA server exports can potentially be mounted also by standard Linux clients with the proper CIFS clients installed.
To connect to the SAMBA server, we need a Windows client that can reach the SAMBA server on port 445.
In our case, the client is a VM located on Azure inside the same Virtual Network. Since we have configured the share as not browseable, we need to reach it directly by typing the full path in the Windows Explorer navigation bar:
\\<IP_ADDRESS_OF_THE_SAMBA_SERVER>\lustre-fs
After hitting “Enter”, we will be prompted for credentials. Since our SAMBA server is not Active Directory joined, we need to enter the local credentials configured above.
Let’s use the username and password previously configured with smbpasswd:
If all the configuration has been successful, you should be able to access the file system. Let’s create a new text document, as a preliminary test.
As we can see, the file will be created and it will be visible in the Linux world with the correct user permissions and with the specified mask.
[root@alma-8-standalone ~]# ls -ltr /lustre-fs/
total 0
-rwx------. 1 azureuser azureuser 0 Jul 4 10:07 ThisIsATextDocument.txt
[root@alma-8-standalone ~]#
It is important to stress the fact that this will also work for those files that reside on the Azure Managed Lustre File System that have been moved to Azure Blob.
Let’s perform this operation by creating, on the Linux side, a 1 GiB file full of zeros in /lustre-fs:
dd if=/dev/zero of=/lustre-fs/test-file bs=512k count=2048
The output will be the following:
[root@alma-8-standalone lustre-fs]# dd if=/dev/zero of=/lustre-fs/test-file bs=512k count=2048
2048+0 records in
2048+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.25165 s, 477 MB/s
[root@alma-8-standalone lustre-fs]# ls -ltrh
total 1.1G
-rwx------. 1 azureuser azureuser 0 Jul 4 10:07 ThisIsATextDocument.txt
-rw-r--r--. 1 root root 1.0G Jul 4 10:13 test-file
[root@alma-8-standalone lustre-fs]#
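The size of the test file is simply the dd block size multiplied by the block count. The short sketch below reproduces that arithmetic and shows how the count could be derived for a different target size; dd_block_count is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: number of dd blocks needed to write SIZE_GIB
# gibibytes when each block is BS_KIB kibibytes.
# Usage: dd_block_count SIZE_GIB BS_KIB
dd_block_count() {
    size_gib="$1"
    bs_kib="$2"
    echo $(( size_gib * 1024 * 1024 / bs_kib ))
}

# Bytes written by the command above: 512 KiB x 2048 blocks
echo $(( 512 * 1024 * 2048 ))   # prints 1073741824, i.e. 1 GiB

dd_block_count 1 512            # prints 2048
dd_block_count 10 512           # prints 20480
```

For example, a 10 GiB test file with the same block size would need count=20480.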
After the process is complete, let’s move it to the Azure Blob storage using Lustre HSM:
lfs hsm_archive /lustre-fs/test-file
lfs hsm_release /lustre-fs/test-file
After these commands, the disk space will be released, but the file metadata will still be visible with the correct file size, both on Linux and Windows side:
[root@alma-8-standalone lustre-fs]# du -sh .
4.5K .
[root@alma-8-standalone lustre-fs]# ls -ltrh
total 512
-rwx------. 1 azureuser azureuser 0 Jul 4 10:07 ThisIsATextDocument.txt
-rw-r--r--. 1 root root 1.0G Jul 4 10:13 test-file
[root@alma-8-standalone lustre-fs]#
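The archive request is processed asynchronously, so a script may want to confirm that the copy to Azure Blob has completed before releasing the file. The state can be inspected with lfs hsm_state. The sketch below only parses a line of that output: the helper name hsm_has_flag is hypothetical, and the sample line is an assumption about the output format, which may differ between Lustre versions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FLAG (for example "archived" or
# "released") appears as a word in a line of `lfs hsm_state` output.
# Usage: hsm_has_flag STATE_LINE FLAG
hsm_has_flag() {
    state_line="$1"
    flag="$2"
    # Turn commas into spaces so that "archived," matches "archived"
    case " $(echo "$state_line" | tr ',' ' ') " in
        *" $flag "*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Sample line (assumed format) for a file that was archived and released
sample='/lustre-fs/test-file: (0x0000000d) released exists archived, archive_id:1'
if hsm_has_flag "$sample" released; then
    echo "data is in the archive only; it will be restored on first access"
fi

# On a Lustre client, one could wait for the archive copy before releasing:
#   lfs hsm_archive /lustre-fs/test-file
#   until hsm_has_flag "$(lfs hsm_state /lustre-fs/test-file)" archived; do sleep 5; done
#   lfs hsm_release /lustre-fs/test-file
```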
Now, let’s try to copy the file through the Windows client to Desktop. After a very brief I/O delay for file recovery, the data will be available again:
We can also see that now the space is effectively occupied by the file:
[root@alma-8-standalone lustre-fs]# du -sh .
1.1G .
[root@alma-8-standalone lustre-fs]# ls -ltrh
total 1.1G
-rwx------. 1 azureuser azureuser 0 Jul 4 10:07 ThisIsATextDocument.txt
-rw-r--r--. 1 root root 1.0G Jul 4 10:13 test-file
[root@alma-8-standalone lustre-fs]#
This is because Lustre HSM transparently brought back the data from Azure Blob to the Lustre filesystem OSSs.
This server can be used standalone by adding additional users, adding additional configuration options in smb.conf, or adding additional shares.
- To enable access for an existing Linux user, the user must be configured with:
smbpasswd -a <USER_NAME>
- Multiple shares can be defined in smb.conf. A share path does not have to be the mount point; paths can point to sub-folders.
- Both Linux permissions and SAMBA configuration parameter "valid users" can be used to tune access control.
Extended Attributes Support
Depending on the operating system version, you may get a different SAMBA version from the official operating system repositories. At the time of publication of this article, the following versions are installed from the repositories:
| Operating system | SAMBA version |
|---|---|
| Alma Linux 8.5 | 4.17.5 |
| CentOS 7.9 | 4.10.16 |
| RedHat 7.9 | 4.10.16 |
| RedHat 8.8 | 4.17.5 |
| Ubuntu 20.04 LTS | 4.15.13-Ubuntu |
| Ubuntu 22.04 LTS | 4.15.13-Ubuntu |
It is important to note that there is a bug in SAMBA versions older than 4.17 affecting the behavior of extended attributes on Lustre file systems. With SAMBA versions older than 4.17, it is necessary to disable EA support for Azure Managed Lustre to operate properly, and this has consequences: when a file is copied from Windows, its extended attributes will not be carried over to the new file.
For example, creating a file on Linux, setting an extended attribute on it, and then duplicating it through the SMB share produces a new file without that extended attribute.
root@ubuntu-22-standalone:/lustre-fs# touch xattr_test.file
root@ubuntu-22-standalone:/lustre-fs# setfattr -n user.attribute -v test xattr_test.file
root@ubuntu-22-standalone:/lustre-fs# ls -ltr
total 0
-rw-r--r-- 1 root root 0 Jul 4 13:36 xattr_test.file
root@ubuntu-22-standalone:/lustre-fs# getfattr xattr_test.file
# file: xattr_test.file
user.attribute
root@ubuntu-22-standalone:/lustre-fs# # Now we copy from SMB Windows client the file to xattr_test_copy.file
root@ubuntu-22-standalone:/lustre-fs# getfattr xattr_test_copy.file
# file: xattr_test_copy.file
user.DOSATTRIB
root@ubuntu-22-standalone:/lustre-fs#
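The decision on whether ea support = off is needed depends only on the installed SAMBA version, so it can be scripted. The following minimal sketch compares a version string against 4.17; the function name samba_needs_ea_off is hypothetical, and the version strings are the ones from the table above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given SAMBA version is older than
# 4.17, i.e. "ea support = off" is required according to this article.
# Usage: samba_needs_ea_off VERSION   (e.g. 4.15.13-Ubuntu)
samba_needs_ea_off() {
    ver="${1%%-*}"        # drop a distribution suffix such as "-Ubuntu"
    major="${ver%%.*}"    # text before the first dot
    rest="${ver#*.}"      # text after the first dot
    minor="${rest%%.*}"   # second numeric component
    [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -lt 17 ]; }
}

for v in 4.10.16 4.15.13-Ubuntu 4.17.5; do
    if samba_needs_ea_off "$v"; then
        echo "$v: keep 'ea support = off'"
    else
        echo "$v: 'ea support = off' is not required"
    fi
done
```

On a real server the version string could be taken from the output of smbd --version.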
Deploying a SAMBA server as Domain Member exporting AMLFS with Active Directory Authentication
In this section, we will implement the architecture described in the following diagram, where a SAMBA server acts as a Domain Member, authorizing users through Kerberos against Active Directory.
The prerequisite for this architecture is that the SMB server has a line of sight to an Active Directory Domain Controller with the possibility to join the domain through an account with adequate privileges.
Before going through the steps of this guide, it is mandatory to complete the steps for a standalone server described in the previous section “Deploying a SAMBA server exporting AMLFS with local user authentication”. An important remark should be made about the way in which the SAMBA server is joined to the domain as a Domain Member.
Active Directory join
As of today, to join a server to AD, there are two main options in the Linux world: SSSD and Winbind. The selection between the two methods should be done on the basis of specific requirements.
Dmitri Pal has written very good articles comparing the two methods: Overview of Direct Integration Options (redhat.com) and SSSD vs Winbind (redhat.com).
The choice really depends on the specific infrastructure scenario. For example, if the identity management system used on other servers in your Linux world is already SSSD, then this could also be the best choice for AD integration.
When using orchestrators like Azure CycleCloud for example, it is extremely useful for compute nodes to avoid join/removal from the domain at every iteration. In these situations, it is possible to use SSSD through LDAPS.
Another important aspect to take into consideration is that this choice will also impact the handling of the UID/GID mapping from the AD domain to the Linux world.
In general, both Winbind and SSSD can use specific Linux attributes stored in Active Directory, such as uidNumber, gidNumber and unixHomeDirectory, as the UID/GID and home directory of Linux users. At the same time, they provide several methods to derive the mapping automatically from the Active Directory SIDs, in case Active Directory doesn’t contain ad-hoc Linux attributes.
After a domain join has been completed on the VM, the Active Directory users become visible and usable for authentication in the Linux environment. Moreover, the Linux VM will become visible inside Active Directory in the target Organizational Unit.
However, depending on the SAMBA version shipped in the specific Linux distribution repository, you may be affected by the fixes for CVE-2020-25717. As described on the official SAMBA project website, several patches on top of that fix have in some situations made it necessary to have the Winbind service running for SAMBA authentication, even when the AD join is managed by SSSD.
The table below represents the possible different combinations available for AD Join mode and for user mapping that will be explored in this guide.
| AD Join software | SAMBA Security mode | SAMBA User Mapping | SMB Client authentication | ID Mapping |
|---|---|---|---|---|
| SSSD | ads | sss | Kerberos | ID Mapping handled by SSSD configuration, which includes automatic mapping or AD attributes use |
| Winbind | ads | rid | Kerberos | ID Mapping through RID algorithm |
| Winbind | ads | ad | Kerberos | No mapping, attributes from AD |
| Winbind | ads | sss (with SSSD in LDAPS) | Kerberos | ID Mapping handled by SSSD configuration in case SSSD with LDAPS is used on other Linux environments |
Let’s now explore the two available join methodologies in the following sections. In both cases, to join a Linux VM in Active Directory, the following prerequisites must be satisfied:
- A line of sight with a Domain Controller
- Root permissions on the Linux server
- An account with sufficient privileges to allow the VM to join the domain
- A Domain Controller, either Azure Active Directory Domain Services or customer managed
Joining the domain with SSSD
In this section, the AD join of the server will be handled by SSSD:
The following procedure has been created for Alma Linux 8.5, RedHat Linux 8.8 and CentOS 7.9 and for Ubuntu 20.04 and Ubuntu 22.04 using the relevant guides on Azure Learn.
RedHat 8.8, AlmaLinux 8.5 and CentOS 7.9
On RedHat based distributions, the following packages will require installation:
sudo yum install -y adcli realmd sssd krb5-workstation krb5-libs oddjob oddjob-mkhomedir samba-common-tools samba-winbind samba-winbind-clients
After installation is complete, join will be done through the command:
sudo realm join --verbose <DOMAIN_NAME> -U <USER>@<DOMAIN_NAME> --membership-software=samba --client-software=sssd
Ubuntu 22.04 and Ubuntu 20.04
On Ubuntu based distributions, the following packages will require installation:
sudo apt-get -y install krb5-user samba sssd sssd-tools libnss-sss libpam-sss ntp ntpdate realmd adcli
As mentioned in Azure Learn, it may be necessary to disable rdns in /etc/krb5.conf, adding in the [libdefaults] section:
rdns=false
After installation is complete, join will be done through the command:
sudo realm join --verbose <DOMAIN_NAME> -U <USER>@<DOMAIN_NAME> --membership-software=samba --client-software=sssd
Joining a specific OU in AD
Using --computer-ou, it is possible to place the SAMBA server in a specific organizational unit inside Active Directory. For example:
sudo realm join LUSTRE.LAB -U user@LUSTRE.LAB -v --computer-ou=OU=Ubuntu,OU=SambaServers,DC=lustre,DC=lab --membership-software=samba --client-software=sssd
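The value passed to --computer-ou is a distinguished name whose DC components follow from the domain name. The sketch below derives them from the domain used in this article; the function name domain_to_base_dn is a hypothetical helper, and the OU names are the ones from the example above.

```shell
#!/bin/sh
# Hypothetical helper: turn a DNS domain name such as LUSTRE.LAB into the
# matching base DN "DC=lustre,DC=lab".
domain_to_base_dn() {
    printf '%s\n' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | sed 's/\./,DC=/g; s/^/DC=/'
}

domain="LUSTRE.LAB"
computer_ou="OU=Ubuntu,OU=SambaServers,$(domain_to_base_dn "$domain")"
echo "$computer_ou"   # prints OU=Ubuntu,OU=SambaServers,DC=lustre,DC=lab
```

The resulting string can then be passed as --computer-ou="$computer_ou" to realm join.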
Testing users resolution
At the end of the procedure, the VM on Linux should be able to resolve users inside the Active Directory domain and at the same time it should be visible in the Domain Controller list.
If the --computer-ou option is used, the computer object will be placed in the specified OU in Active Directory.
After the procedure is completed, it should be possible to resolve AD users inside the Linux system. For example, for a domain user for which the sAMAccountName is demo.user1 in AD, we can see how it is resolved in Linux world.
[azureuser@alma-linux-8-samba-server-ssd ~]$ id demo.user1@lustre.lab
uid=1589201103(demo.user1@lustre.lab) gid=1589200513(domain users@lustre.lab) groups=1589200513(domain users@lustre.lab)
[azureuser@alma-linux-8-samba-server-ssd ~]$
It is important to notice that the mapping above in terms of UID and GID has been performed automatically by SSSD from SID using a proper algorithm thanks to the parameter ldap_id_mapping = True in /etc/sssd/sssd.conf.
If your Active Directory users already contain Linux attributes, you can disable the automatic mapping with ldap_id_mapping = False.
Fine tuning of the configuration of SSSD can be performed according to what's documented in man sssd.conf.
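The automatically generated IDs shown above can be read as a per-domain base value plus the RID, which is the last component of the object SID. The sketch below illustrates that arithmetic under two assumptions: that SSSD's default algorithmic mapping is in use, and that the base for this domain is 1589200000, a value inferred here from the gid 1589200513 of "domain users", whose well-known RID is 513. The helper names are hypothetical.

```shell
#!/bin/sh
# Assumption: with ldap_id_mapping = True, the UID/GID is computed as
# "base of the ID slice assigned to the domain" + RID.
# SLICE_BASE is inferred from the output in this article, not read from SSSD.
SLICE_BASE=1589200000

# Hypothetical helpers for converting between RIDs and mapped IDs
rid_from_id() { echo $(( $1 - SLICE_BASE )); }
id_from_rid() { echo $(( SLICE_BASE + $1 )); }

rid_from_id 1589201103   # prints 1103 (demo.user1 in the output above)
id_from_rid 513          # prints 1589200513 (domain users)
```

Because the base depends on the domain SID, the same user gets the same UID on every server that uses this mapping with the same SSSD range settings.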
SAMBA Configuration
After this change, you will be ready to change the SMB configuration above adding the parameters required for AD authentication:
[global]
; DOMAIN_NAME as returned by net getdomainsid
workgroup = <DOMAIN_NAME>
security = ads
passdb backend = tdbsam
ea support = off
; NETBIOS name as in ldap_sasl_authid parameter in /etc/sssd/sssd.conf or from net getlocalsid, truncated to 15 characters
netbios name = <COMPUTER_NETBIOS_NAME>
kerberos method = secrets and keytab
; REALM name as contained in realm list command
realm = <REALM_NAME>
; Keep this range large enough to include system local accounts
idmap config * : range = 1000-9000
idmap config * : backend = tdb
; keep this range to match what SSSD mapping or Active Directory parameters require
idmap config <DOMAIN_NAME> : range = 10000-29999999999
idmap config <DOMAIN_NAME> : backend = sss
winbind use default domain = no
[lustre-fs]
comment = Lustre FS
browseable = no
create mask = 0700
directory mask = 0700
valid users=LUSTRELAB\azureuser
read only = No
path = /lustre-fs
It is worth noting that in smb.conf, even in the case of a domain member, it is necessary to define the default domain (*). As described in the RedHat documentation, this default domain is still used for local SAMBA groups and users.
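The ID ranges of the default domain and of the AD domain must not overlap, otherwise the same UID could be handed out by two backends. The following minimal sketch checks two ranges for overlap; the function name ranges_overlap is hypothetical, and the numbers are the ones used in the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the inclusive ranges LOW1-HIGH1 and
# LOW2-HIGH2 share at least one ID.
# Usage: ranges_overlap LOW1 HIGH1 LOW2 HIGH2
ranges_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Default domain (*) versus the AD domain range from smb.conf above
if ranges_overlap 1000 9000 10000 29999999999; then
    echo "idmap ranges overlap: adjust smb.conf" >&2
else
    echo "idmap ranges are disjoint"
fi
```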
As mentioned above, if we now try to start SMB, we will get the following error on the first connection:
==> /var/log/samba/log.smbd <==
[2023/07/01 13:50:03.610988, 0] ../../source3/auth/auth_winbind.c:120(check_winbind_security)
check_winbind_security: winbindd not running - but required as domain member: NT_STATUS_NO_LOGON_SERVERS
Because of this, it is important to install winbind and enable the service:
- On Alma Linux 8.5, RedHat 7.9, RedHat 8.8 and CentOS 7.9 with the command:
sudo yum install -y samba-winbind
sudo systemctl enable winbind --now
sudo systemctl restart smb
- On Ubuntu 20.04 and Ubuntu 22.04 with the command:
sudo apt-get install -y winbind
sudo systemctl enable winbind --now
sudo systemctl restart smbd
Testing SAMBA share
After these steps have been completed, it should be possible to access the SMB server using the standard UNC path from a Windows client:
\\<IP_ADDRESS_OF_THE_SAMBA_SERVER>\lustre-fs
If the connection is done from a machine which is Active Directory joined, the authentication should happen without requiring a password provided the logged-in user is allowed to have access to the share:
When creating any file on the disk, it will be attributed to the correct user by SSSD mapping.
root@ubuntu-22-sssd:/lustre-fs# ls -ltr
total 0
-rwx------ 1 azureuser@lustre.lab domain users@lustre.lab 0 Jul 4 20:43 ThisIsATestDocument.txt
root@ubuntu-22-sssd:/lustre-fs#
At the same time, trying to connect as another user will cause an error, since that user is not in the list of valid users in smb.conf. Restrictions can of course be applied at share access level by using Active Directory groups in the valid users parameter.
Switching off LDAP user mapping
A last remark about user mapping: in this setup it is governed entirely by the SSSD configuration file.
Let’s go back to our case. Among the users in the LUSTRE.LAB Active Directory, consider the following two:
- demo.user1 -> No Linux attribute in AD
- demo.user2 -> Linux attributes in AD with UID 20000 and GID 20000
In /etc/sssd/sssd.conf, if we keep ldap_id_mapping = True we will get:
root@ubuntu-22-sssd:/lustre-fs# id LUSTRELAB\\demo.user1
uid=1589201103(demo.user1@lustre.lab) gid=1589200513(domain users@lustre.lab) groups=1589200513(domain users@lustre.lab)
root@ubuntu-22-sssd:/lustre-fs# id LUSTRELAB\\demo.user2
uid=1589201606(demo.user2@lustre.lab) gid=1589200513(domain users@lustre.lab) groups=1589200513(domain users@lustre.lab)
root@ubuntu-22-sssd:/lustre-fs#
These UIDs and GIDs are generated by the SSSD algorithm from the object SIDs. Let’s now switch to ldap_id_mapping = False, restart SSSD, and clear the SSSD cache:
root@ubuntu-22-sssd:/lustre-fs# systemctl restart sssd
root@ubuntu-22-sssd:/lustre-fs# sss_cache -EUG
root@ubuntu-22-sssd:/lustre-fs# id LUSTRELAB\\demo.user1
id: ‘LUSTRELAB\\demo.user1’: no such user
root@ubuntu-22-sssd:/lustre-fs# id LUSTRELAB\\demo.user2
uid=20000(demo.user2@lustre.lab) gid=20000(domain users@lustre.lab) groups=20000(domain users@lustre.lab)
root@ubuntu-22-sssd:/lustre-fs#
With this setting, users without Linux attributes set in AD are not visible to the system and cannot access the SMB share. Users with the correct Linux attributes are resolved with those attributes.
This is propagated to the SMB layer, which will deny access to users that do not have a mapping:
On the other hand, connecting to the SMB share as demo.user2:
We will be able to access the disk and the files will be created with the correct UID/GID mapping from Active Directory:
Creating a file with the new user, we can see how the AD set UID/GID will be enforced:
root@ubuntu-22-sssd:/lustre-fs# ls -ln
total 0
-rwx------ 1 20000 20000 0 Jul 4 20:50 Test_demo.user2.txt
-rwx------ 1 1589200500 1589200513 0 Jul 4 20:43 ThisIsATestDocument.txt
root@ubuntu-22-sssd:/lustre-fs# ls -l
total 0
-rwx------ 1 demo.user2@lustre.lab domain users@lustre.lab 0 Jul 4 20:50 Test_demo.user2.txt
-rwx------ 1 1589200500 1589200513 0 Jul 4 20:43 ThisIsATestDocument.txt
root@ubuntu-22-sssd:/lustre-fs#
Joining the domain with Winbind
In this section, the configuration of the server will be done with Winbind:
The following procedure has been created for Alma Linux 8.5, RedHat Linux 8.8 and CentOS 7.9 and for Ubuntu 20.04 and Ubuntu 22.04 using the relevant guides in the RedHat documentation and on the Ubuntu Wiki.
RedHat 8.8, AlmaLinux 8.5 and CentOS 7.9
On RedHat based distributions, the following packages will require installation:
sudo yum install -y adcli realmd sssd krb5-workstation krb5-libs oddjob oddjob-mkhomedir samba-common-tools samba-winbind samba-winbind-clients
After installation is complete, join will be done through the command:
sudo realm join --verbose <DOMAIN_NAME> -U <USER>@<DOMAIN_NAME> --membership-software=samba --client-software=winbind
For AlmaLinux, if SELinux is in Enforcing mode, you may need to explicitly whitelist some SAMBA components.
If you get errors when accessing the share, you may see in /var/log/secure an output like the following:
Jul 7 22:20:34 alma8-winbind setroubleshoot[64877]: SELinux is preventing /usr/libexec/samba/rpcd_lsad from using the setgid capability. For complete SELinux messages run: sealert -l a11c80ed-fdbd-4823-9855-fffcd21eb92d
In this case it is necessary to allow the operation of samba-dcerpcd and rpcd_lsad:
ausearch -c 'samba-dcerpcd' --raw | audit2allow -M allow-samba-dcerpcd
semodule -X 300 -i allow-samba-dcerpcd.pp
ausearch -c 'rpcd_lsad' --raw | audit2allow -M allow-samba-rpcd_lsad
semodule -X 300 -i allow-samba-rpcd_lsad.pp
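Other SAMBA components may be blocked in the same way. As a minimal sketch, the setroubleshoot lines in /var/log/secure can be summarized to list which executables were denied. The function name list_selinux_denied is hypothetical, and the sketch assumes the log messages follow the format of the sample line shown above:

```shell
#!/usr/bin/env bash
# list_selinux_denied is a hypothetical helper name, not part of any package.
# It reads a log file and prints the unique basenames of the executables
# that setroubleshoot reports as blocked by SELinux.
# Assumption: messages look like "SELinux is preventing <path> from ...".
list_selinux_denied() {
    local logfile="${1:-/var/log/secure}"
    # Capture the path between "SELinux is preventing " and " from ",
    # strip the directory part, then de-duplicate.
    sed -n 's/.*SELinux is preventing \([^ ]*\) from .*/\1/p' "$logfile" \
        | sed 's#.*/##' \
        | sort -u
}
```

Each name printed by `list_selinux_denied /var/log/secure` is a candidate for the ausearch / audit2allow / semodule sequence shown above. It is advisable to review the generated policy before loading it.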
Ubuntu 22.04 and Ubuntu 20.04
On Ubuntu based distributions, the following packages will require installation:
sudo apt-get -y install samba winbind libnss-winbind libpam-winbind krb5-user realmd
As mentioned in Azure Learn, it may be necessary to disable rdns in /etc/krb5.conf adding in the [libdefaults] section:
rdns=false
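The edit can also be scripted. The following is a minimal sketch, assuming a krb5.conf that already contains a [libdefaults] header; the function name set_rdns_false is hypothetical. It leaves the file untouched if any rdns setting is already present, so an existing value is never overwritten:

```shell
#!/usr/bin/env bash
# set_rdns_false is a hypothetical helper name.
# It adds "rdns=false" right after the [libdefaults] header of a krb5.conf
# file, unless an rdns setting is already present anywhere in the file.
# If the file has no [libdefaults] header, nothing is added.
set_rdns_false() {
    local conf="${1:-/etc/krb5.conf}"
    if grep -Eq '^[[:space:]]*rdns[[:space:]]*=' "$conf"; then
        return 0    # already configured, leave the file untouched
    fi
    local tmp
    tmp=$(mktemp) || return 1
    # Copy every line; after the section header, emit the new setting.
    # Writing back with cat keeps the original file's owner and mode.
    awk '{ print } /^\[libdefaults\]/ { print "    rdns=false" }' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    local rc=$?
    rm -f "$tmp"
    return $rc
}
```

On the server this would be run as root, for example `set_rdns_false /etc/krb5.conf`, ideally after taking a backup copy of the file.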
After installation is complete, join the domain with the following command:
sudo realm join --verbose <DOMAIN_NAME> -U <USER>@<DOMAIN_NAME> --membership-software=samba --client-software=winbind
After this step is complete, it is important to add winbind to the passwd, group and shadow entries in /etc/nsswitch.conf:
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.
passwd: files systemd winbind
group: files systemd winbind
shadow: files winbind
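As a sketch of how this change could be applied idempotently, the function below appends winbind to the three entries only when it is missing. The name add_winbind_to_nsswitch is hypothetical, and the sketch assumes GNU sed (for the -i option) and that each entry already lists at least one source:

```shell
#!/usr/bin/env bash
# add_winbind_to_nsswitch is a hypothetical helper name.
# Appends "winbind" to the passwd, group and shadow lines of an
# nsswitch.conf file when it is not already listed there.
add_winbind_to_nsswitch() {
    local conf="${1:-/etc/nsswitch.conf}"
    local db
    for db in passwd group shadow; do
        # Skip databases whose line already mentions winbind as a source.
        if grep -Eq "^${db}:.*[[:space:]]winbind([[:space:]]|\$)" "$conf"; then
            continue
        fi
        # Append winbind after the last existing source on that line.
        sed -i "s/^\(${db}:.*[^[:space:]]\)[[:space:]]*\$/\1 winbind/" "$conf"
    done
}
```

Running `add_winbind_to_nsswitch /etc/nsswitch.conf` a second time changes nothing, which makes it safe to include in a provisioning script.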
Joining a specific OU in AD
As in the case of SSSD, using --computer-ou, it is possible to place the SAMBA server in a specific organizational unit inside Active Directory. For example:
sudo realm join LUSTRE.LAB -U user@LUSTRE.LAB -v --computer-ou=OU=Ubuntu,OU=SambaServers,DC=lustre,DC=lab --membership-software=samba --client-software=winbind
SAMBA and WINBIND configuration
After joining the domain, you can extend the SMB configuration shown earlier with the parameters required for AD authentication. Please note that performing a join with realm and using winbind will already create part of this structure. You may decide to add the missing parts or to create a brand-new file. Let’s start with the first configuration, which uses RID mapping:
[global]
; DOMAIN_NAME as returned by net getdomainsid
workgroup = <DOMAIN_NAME>
security = ads
passdb backend = tdbsam
; Following parameter is necessary for SMB versions <4.17
ea support = off
; NETBIOS name from net getlocalsid, truncated to 15 characters
netbios name = <COMPUTER_NETBIOS_NAME>
kerberos method = secrets and keytab
; REALM name as contained in realm list command
realm = <REALM_NAME>
; Keep this range large enough to include system local accounts
idmap config * : range = 10000-999999
idmap config * : backend = tdb
; keep this range to match what RID mapping or Active Directory parameters require
idmap config <DOMAIN_NAME> : range = 2000000-2999999
idmap config <DOMAIN_NAME> : backend = rid
winbind use default domain = no
winbind refresh tickets = yes
winbind offline logon = yes
winbind enum groups = no
winbind enum users = no
[lustre-fs]
comment = Lustre FS
browseable = no
create mask = 0700
directory mask = 0700
valid users=LUSTRELAB\azureuser
read only = No
path = /lustre-fs
It is worth noting that in smb.conf, even for a domain member, it is necessary to define a default domain (*). As described in the RedHat documentation, this default domain will still be used for local SAMBA groups and users.
After having created the file, let’s restart both SAMBA and Winbind:
- For AlmaLinux 8, RedHat 8.8 and CentOS 7.9:
systemctl restart winbind smb
- For Ubuntu 20.04 and 22.04:
sudo systemctl restart winbind smbd
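Since the SMB unit name differs between the two distribution families, a provisioning script can select it from the ID field of /etc/os-release. This is a minimal sketch covering only the distributions listed above; the function name samba_service_units is hypothetical:

```shell
#!/usr/bin/env bash
# samba_service_units is a hypothetical helper name.
# Prints the systemd units to restart for a given os-release ID value,
# following the two cases listed above.
samba_service_units() {
    case "$1" in
        almalinux|rhel|centos) echo "winbind smb" ;;
        ubuntu)                echo "winbind smbd" ;;
        *) echo "unsupported distribution: $1" >&2; return 1 ;;
    esac
}

# Example (requires root and a systemd host):
#   . /etc/os-release
#   systemctl restart $(samba_service_units "$ID")
```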
Testing users resolution
After starting Winbind and SMB, users should be successfully resolved in the Linux domain:
[root@almalinux-8-samba-winbind azureuser]# id LUSTRELAB\\demo.user1
uid=11103(LUSTRELAB\demo.user1) gid=10513(LUSTRELAB\domain users) groups=10513(LUSTRELAB\domain users),11103(LUSTRELAB\demo.user1),10001(BUILTIN\users)
[root@almalinux-8-samba-winbind azureuser]# id LUSTRELAB\\demo.user2
uid=11606(LUSTRELAB\demo.user2) gid=10513(LUSTRELAB\domain users) groups=10513(LUSTRELAB\domain users),11606(LUSTRELAB\demo.user2),10001(BUILTIN\users)
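When several accounts need to be verified, the same check can be looped. The sketch below relies only on the standard id command, which goes through NSS and therefore through Winbind once it is listed in /etc/nsswitch.conf. The function name check_users_resolve is hypothetical:

```shell
#!/usr/bin/env bash
# check_users_resolve is a hypothetical helper name.
# Prints one line per account and returns non-zero if any of them cannot be
# resolved through NSS (which includes Winbind once nsswitch.conf lists it).
check_users_resolve() {
    local user uid rc=0
    for user in "$@"; do
        if uid=$(id -u "$user" 2>/dev/null); then
            printf 'OK       %s -> uid %s\n' "$user" "$uid"
        else
            printf 'MISSING  %s\n' "$user"
            rc=1
        fi
    done
    return $rc
}
```

For example, `check_users_resolve 'LUSTRELAB\demo.user1' 'LUSTRELAB\demo.user2'` reports both test users used in this article. The single quotes keep the shell from interpreting the backslash.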
Testing SAMBA share
This configuration allows an AD-joined Windows client to access the shared Lustre folder, provided the AD account in use is authorized on that folder. In the configuration above, only LUSTRELAB\azureuser is allowed to access the share:
Let’s try now to write a TXT file:
This will be done using the right permissions, UID and GID:
[root@almalinux-8-samba-winbind lustre-fs]# ls -ltr
total 0
-rwx------. 1 LUSTRELAB\azureuser LUSTRELAB\domain users 0 Jul 3 18:22 TestWriteOnSAMBA.txt
[root@almalinux-8-samba-winbind lustre-fs]#
It is worth noting that access with any other user will be denied because of the valid users directive in /etc/samba/smb.conf.
Switching ID Mapping logic
In the case above, RID mapping has been used: Winbind automatically calculates the UID/GID from the SID of each AD object. This is convenient because it avoids adding a large number of attributes to AD, but the resulting IDs are consistent only within the same domain and across VMs that use the same Winbind range configuration.
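To make the dependency on the range explicit, the calculation documented for the idmap_rid backend (ID = RID - BASE_RID + LOW_RANGE_ID, with BASE_RID defaulting to 0) can be reproduced in a few lines. This is only an illustrative sketch: the function name rid_to_unix_id is hypothetical and the RID 1103 used in the example is an assumed value, while 513 is the well-known RID of Domain Users:

```shell
#!/usr/bin/env bash
# rid_to_unix_id is a hypothetical helper name.
# Applies the formula documented for the idmap_rid backend:
#   ID = RID - BASE_RID + LOW_RANGE_ID
# BASE_RID defaults to 0. IDs above the configured range are rejected.
rid_to_unix_id() {
    local low="$1" high="$2" rid="$3" base_rid="${4:-0}"
    local id=$(( rid - base_rid + low ))
    if [ "$id" -gt "$high" ]; then
        echo "RID $rid falls outside the range $low-$high" >&2
        return 1
    fi
    echo "$id"
}

# With the domain range 2000000-2999999 from the configuration above:
#   rid_to_unix_id 2000000 2999999 1103   # prints 2001103
#   rid_to_unix_id 2000000 2999999 513    # prints 2000513
```

Two servers configured with different low range values would therefore assign different UIDs to the same AD user, which is why the range has to be kept identical on every VM that mounts the same filesystem.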
If mapping is switched to “AD” in the configuration file...
idmap config <DOMAIN_NAME> : range = 10000-29999999999
idmap config <DOMAIN_NAME> : backend = ad
... and the services are restarted, Winbind will exclusively enforce mapping using Linux attributes already present in Active Directory, ignoring all the users without those attributes.
In our case, only demo.user2 has a uidNumber and a gidNumber assigned in AD:
[root@almalinux-8-samba-winbind lustre-fs]# systemctl stop smb winbind
[root@almalinux-8-samba-winbind lustre-fs]# net cache flush
[root@almalinux-8-samba-winbind lustre-fs]# systemctl start smb winbind
[root@almalinux-8-samba-winbind lustre-fs]# id LUSTRELAB\\azureuser
id: ‘LUSTRELAB\\azureuser’: no such user
[root@almalinux-8-samba-winbind lustre-fs]# id LUSTRELAB\\demo.user1
id: ‘LUSTRELAB\\demo.user1’: no such user
[root@almalinux-8-samba-winbind lustre-fs]# id LUSTRELAB\\demo.user2
uid=20000(LUSTRELAB\demo.user2) gid=20000(LUSTRELAB\domain users) groups=20000(LUSTRELAB\domain users),10001(BUILTIN\users)
[root@almalinux-8-samba-winbind lustre-fs]#
The last interesting option is to set the ID mapping backend to sss. This may be useful where the common enterprise practice on Linux is to manage AD users through LDAPS instead of an Active Directory join. Using SSSD with LDAPS is extremely powerful in scenarios where an Active Directory join is too expensive or poses security/monitoring concerns.
For example, in the case of the compute nodes of an Azure CycleCloud cluster, the continuous dynamic creation/destruction of nodes in VM ScaleSets will require:
- Each VM to have access during provisioning to credentials with join rights to an AD domain
- Continuous join/leave of nodes inside Active Directory
Using SSSD through LDAPS allows us to avoid AD join and also to use read-only service accounts for LDAP bind.
However, when it comes to SAMBA, the server must be AD joined to properly handle Kerberos authentication.
It is important to highlight that in order to make SAMBA work appropriately with an AD join and an ID mapping handled by SSSD, it is necessary to add the following entry to the [sssd] section of /etc/sssd/sssd.conf.
re_expression = (?P<domain>[^\\]*?)\\?(?P<name>[^\\]+$)
This parameter allows us to make the Winbind name format the default format recognized by SSSD. After changing the configuration, restart the service with:
sudo systemctl restart sssd
After this change, resolution should become available using the Winbind naming pattern:
[azureuser@alma-linux-8-samba-server-ssd ~]$ id LUSTRELAB\\azureuser
uid=1589200500(azureuser@lustre.lab) gid=1589200513(domain users@lustre.lab) groups=1589200513(domain users@lustre.lab)
[azureuser@alma-linux-8-samba-server-ssd ~]$
The Winbind configuration above, however, can be configured to leverage ID mapping to SSSD by setting the backend to sss and configuring SSSD on the same nodes with LDAPS using exactly the same configuration adopted in other VMs:
idmap config <DOMAIN_NAME> : range = 10000-29999999999
idmap config <DOMAIN_NAME> : backend = sss
This allows us to have the same configuration for ID mapping on all the nodes, both the SAMBA server and other enterprise systems, without the need to handle it differently on the SAMBA server because of Winbind.
Debugging Active Directory join
In case any issue arises during the use of SSSD, Winbind, or SAMBA AD join configuration, it is possible to increase the verbosity of the logs and to live monitor them to identify any error.
- For SSSD, it is possible to add the following parameter to the [sssd] and [domain/<DOMAIN_NAME>] blocks in /etc/sssd/sssd.conf:
debug_level = 5
- For Winbind, it is possible to add the following parameter to the [global] section in /etc/samba/smb.conf:
log level = 5
After this, a restart of the services is required. With the log level increased, it is possible to monitor the log files while performing authentication tests:
- For SSSD:
tail -f /var/log/sssd/*
- For SAMBA:
tail -f /var/log/samba/*
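At higher log levels the output becomes very verbose, so it can help to filter the logs after a failed test. The following is a minimal sketch; the function name scan_auth_logs and the keyword list are assumptions and may need to be adapted to the messages actually produced on your system:

```shell
#!/usr/bin/env bash
# scan_auth_logs is a hypothetical helper name.
# Prints lines mentioning common failure keywords from every file in the
# given log directories, each prefixed with the name of the file.
# Directories that do not exist are skipped silently.
scan_auth_logs() {
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        grep -rHiE 'fail|denied|error' "$dir" 2>/dev/null
    done
    return 0
}
```

For example, `sudo bash -c '. ./scan.sh; scan_auth_logs /var/log/sssd /var/log/samba'` would cover both services, assuming the function has been saved in a file named scan.sh.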
Next Steps
Learn more about how to use Azure Managed Lustre and its various supported features from our documentation.
Learn more about HPC and AI solutions
- Learn more about Azure Managed Lustre
- Read our Azure Managed Lustre GA Launch blog
- Visit our Azure HPC hub for more technical content developed for HPC
- Learn about our Azure HPC
- Learn more about Azure AI Infrastructure