How to troubleshoot blobfuse2 issues
Published Apr 10 2024 10:58 AM
Microsoft

BlobFuse2 is a virtual file system driver for Azure Blob Storage that lets you access containers and blobs through the Linux file system. It is a significant improvement over BlobFuse v1 and is generally available for all major Linux distributions.

 

In this article, we will talk about various common troubleshooting scenarios that are seen while using blobfuse2.  

 

We assume that you have already mounted blobfuse2 on your Linux VM. For the blobfuse2 installation and mount process, refer to the links below:

 

  1. azure-storage-fuse/MIGRATION.md at main · Azure/azure-storage-fuse (github.com) 
  2. How to mount an Azure Blob Storage container on Linux with BlobFuse2 - Azure Storage | Microsoft Lea... 
  3. How to configure settings for BlobFuse2 - Azure Storage | Microsoft Learn 

 
There is also a blobfuse2 troubleshooting guide that covers some of the common issues faced before or after mounting blobfuse2. Reference link: azure-storage-fuse/TSG.md at main · Azure/azure-storage-fuse · GitHub
 

In this blog, we will be talking about some other scenarios that could be helpful in resolving blobfuse2 issues.  

 

Symptom scenario 1:  
You are using blobfuse2 to transfer data from your Linux system to a storage account. You have successfully mounted blobfuse2 and made changes to a blob. The same blob is also being updated through other channels (such as the Portal, REST API, Storage Explorer, or code). Even though the blob is updated from these other sources, you still see a stale copy of the blob when you access it from the Linux system.
 

Cause: 
This is likely caused by file-level caching or kernel-level caching, which serves a stale copy of the blob.

 
Steps for mitigation:  
File caching plays an important role in data integrity. Blobfuse2 supports read and write operations, but continuous synchronization of data written to storage through other APIs or other blobfuse2 mounts isn't guaranteed. For data integrity, we do not recommend that multiple sources modify the same blob. Please note that this is how blobfuse2 is designed, and it is documented publicly as well. Reference link: What is BlobFuse? - BlobFuse2 - Azure Storage | Microsoft Learn
 
For this issue, you can explore the following options:

  1. Check the file-cache timeout parameters in the config file, because a long timeout keeps serving the locally cached copy.
  2. In the config file, under the "libfuse" section, add "disable-writeback-cache: true".
  3. Set the timeout to 0 in the libfuse section and remove attr_cache from the config file.
  4. Drop the kernel-level caches using the command: sysctl -w vm.drop_caches=3

Word of caution: Please note that this command drops all kernel-level caches.
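The config changes from the options above could look roughly like the fragment below. This is a minimal sketch, not a verified configuration: the three `*-expiration-sec` key names are our assumption for what "timeout" in the libfuse section refers to, `timeout-sec` under `file_cache` is our assumption for the file-cache timeout, and `FRAGMENT` is a hypothetical scratch path. Please confirm the key names against baseConfig.yaml in the azure-storage-fuse repository before using them.

```shell
#!/bin/sh
# Sketch: write a config fragment that relaxes the caching layers discussed above.
# FRAGMENT is a hypothetical scratch file; merge its content into your real config.yaml by hand.
FRAGMENT="${FRAGMENT:-$(mktemp)}"

cat > "$FRAGMENT" <<'EOF'
libfuse:
  # Assumed key names for the libfuse timeouts; 0 asks the kernel not to cache attributes/entries.
  attribute-expiration-sec: 0
  entry-expiration-sec: 0
  negative-entry-expiration-sec: 0
  disable-writeback-cache: true

file_cache:
  # Assumed key name for the file-cache timeout.
  timeout-sec: 0

# attr_cache is intentionally absent: remove both its section and its entry in "components".
EOF

echo "Wrote fragment to $FRAGMENT"
# Optional, system-wide, requires root: sudo sysctl -w vm.drop_caches=3
```

The fragment is written to a scratch file rather than to the live config so that it can be reviewed first; a remount is needed for any config change to take effect.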

Symptom scenario 2:
You are unable to persist the blobfuse2 mount point on your Linux machine.   

 
Cause:  
After the Linux machine restarts, the blobfuse2 mount point is unmounted unless it has been configured to persist across reboots.

 
Resolution: 

Perform the following steps to persist the mount point:

  1. Make sure the fuse package is installed (e.g., yum install fuse3 / apt-get install fuse3).
  2. Update the config.yaml file with your preferred configuration.
  3. Edit /etc/fstab to add a blobfuse2 entry.

Add the following line to use mount.sh:

 

 

/<path_to_blobfuse2_mount.sh_file>/mount.sh </path/to/desired/mountpoint> fuse defaults,_netdev 0 0
 

Or add the following line to run without mount.sh:

blobfuse2 /home/azureuser/mntblobfuse fuse defaults,_netdev,--config-file=/home/azureuser/config.yaml,allow_other 0 0

 

For more information regarding the blobfuse2 persistent mount point, you can refer to the link: https://github.com/Azure/azure-storage-fuse/wiki/Blobfuse2-Installation#persisting-mount 
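As a small illustration of the fstab entry that runs without mount.sh, the sketch below builds the line from two variables and prints it for review instead of writing to /etc/fstab. The variable names are ours, and the default paths are simply the sample paths used in this article.

```shell
#!/bin/sh
# Sketch: build the fstab entry from variables and print it for review.
MOUNT_POINT="${MOUNT_POINT:-/home/azureuser/mntblobfuse}"
CONFIG_FILE="${CONFIG_FILE:-/home/azureuser/config.yaml}"

# fstab fields are separated by whitespace, so the options field must not contain spaces.
FSTAB_LINE="blobfuse2 ${MOUNT_POINT} fuse defaults,_netdev,--config-file=${CONFIG_FILE},allow_other 0 0"

echo "$FSTAB_LINE"
# After appending the line to /etc/fstab, "sudo mount -a" mounts all fstab entries,
# which lets you test the entry without rebooting.
```

Printing the line first makes it easy to spot a stray space in the options field, which would shift the remaining fstab fields.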

 

Symptom scenario 3:  
You are uploading a large amount of data (nearly 1 TB) to the storage account using blobfuse2 in file cache mode. However, the cache fills up quickly and you receive an out-of-disk-space error.

 

Cause:  

By design, when file cache is used, blobfuse2 first caches the data on the temporary path that you have configured and then uploads it to the storage account in parallel.

The cache therefore needs enough space to hold all open files. For example, to upload 1 TB of data, the cache needs 1 TB or more of free space until the data is committed to the storage account.
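To make the sizing rule concrete, here is a minimal sketch of a pre-flight check that compares the size of the data to upload with the free space on the disk backing the cache path. The function name `cache_has_room` and both directory arguments are hypothetical; only the standard `du`, `df`, and `awk` tools are used.

```shell
#!/bin/sh
# Sketch: succeed only if the cache disk has at least as much free space as the source data needs.
# Usage: cache_has_room <source_dir> <cache_dir>
cache_has_room() {
    src_dir="$1"
    cache_dir="$2"
    # du -sk: total size of the source data in KiB.
    need_kb=$(du -sk "$src_dir" | awk '{print $1}')
    # df -Pk: POSIX output format; column 4 of the second line is the available KiB.
    free_kb=$(df -Pk "$cache_dir" | awk 'NR==2 {print $4}')
    echo "need=${need_kb}KiB free=${free_kb}KiB"
    [ "$free_kb" -ge "$need_kb" ]
}
```

For example, `cache_has_room /data/to/upload /mnt/blobfusetmp || echo "cache disk is too small"` (both paths are placeholders) reports the two numbers and flags the case where the upload would not fit.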

Resolution:  

You can mitigate this issue with the following options:

 

  1. Use --file-cache-timeout=0 while mounting so that files are not kept in the local cache longer than necessary. Please refer to the article GitHub - Azure/azure-storage-fuse: A virtual file system adapter for Azure Blob storage for more information.

  2. You can explore stream mode in blobfuse2. Additionally, there is a newer feature called block cache. Block cache is optimized for in-memory caching and is the evolution of streaming, with prefetching and much higher performance, so it is now recommended over streaming. Along with memory caching, block cache can also use disk to cache a few blocks if block-cache-path is provided under the block_cache component.

    Word of caution: The caching types can’t coexist with each other, so draft your config file accordingly.
     

Symptom scenario 4: 
You are observing performance degradation while transferring data to the Storage Account using stream mode compared to file-cache mode.  

 
Cause:  

Write operations in blobfuse2 stream mode are slower because data is uploaded serially in small blocks. When file-cache is enabled, data is uploaded in parallel and hence takes less time. On average, blobfuse2 stream performance is therefore lower than file-cache performance.
 

Resolution:  

For this issue, you can explore the following options:

  1. Tweak the block size in the config file and monitor the effect on throughput.

  2. Use an SSD: As documented, an SSD provides a low-latency buffer for blobfuse2. You can refer to the article here: https://learn.microsoft.com/en-us/azure/storage/blobs/blobfuse2-how-to-deploy#use-an-ssd . Whether to use a standard or premium SSD is left to the user. The SSD types available in Azure are listed here: https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types

  3. Use block caching, as it resolves many of the performance issues seen in streaming. You can refer to the config details here: https://github.com/Azure/azure-storage-fuse/blob/main/setup/baseConfig.yaml#L82
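A block cache configuration could look roughly like the fragment below. This is a sketch under assumptions: the key names reflect our reading of baseConfig.yaml, the numeric values are purely illustrative and not tuning recommendations, and `BLOCK_CACHE_FRAGMENT` is a hypothetical scratch path. Please verify everything against the linked baseConfig.yaml.

```shell
#!/bin/sh
# Sketch: write an illustrative block cache fragment to a scratch file for review.
BLOCK_CACHE_FRAGMENT="${BLOCK_CACHE_FRAGMENT:-$(mktemp)}"

cat > "$BLOCK_CACHE_FRAGMENT" <<'EOF'
components:
  - libfuse
  # block_cache takes the place of file_cache or stream; the caching types cannot be combined.
  - block_cache
  - attr_cache
  - azstorage

block_cache:
  # Illustrative values only; adjust to the memory available on the VM.
  block-size-mb: 16
  mem-size-mb: 4096
  prefetch: 12
  parallelism: 128
EOF

echo "Wrote fragment to $BLOCK_CACHE_FRAGMENT"
```

The block size is the setting referred to in option 1, so a reasonable approach is to change it in small steps and compare transfer times after each remount.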

 

Symptom scenario 5:  
You are trying to mount the storage account on a Linux VM using MSI/SPN for authentication. However, you are getting the following error message while mounting the storage account: 'failed to initialize new pipeline [failed to get credential]'.   
 

Cause: 
This error typically occurs when the permissions granted to the MSI/SPN are incorrect, or when the Application ID/Client ID/Object ID is not specified correctly.
 

Resolution:  
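For this issue, you can explore the following option:

  1. Please ensure that the MSI/SPN has the required storage access role. The MSI/SPN should have the Storage Blob Data Contributor role, and the Application ID/Client ID/Object ID in the config file should match the identity that holds the role.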
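If you assign the role with the Azure CLI, the command takes roughly the shape shown below. The sketch only prints the command so it can be reviewed before being run; every ID and name in it is a placeholder that we made up for illustration, and the scope shown is a storage-account-level scope.

```shell
#!/bin/sh
# Sketch: print (do not run) the Azure CLI command that grants the role to the identity.
# All values below are placeholders; replace them with your own.
ASSIGNEE_ID="${ASSIGNEE_ID:-00000000-0000-0000-0000-000000000000}"
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-11111111-1111-1111-1111-111111111111}"
RESOURCE_GROUP="${RESOURCE_GROUP:-my-resource-group}"
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-mystorageaccount}"

# Resource ID of the storage account, used as the scope of the role assignment.
SCOPE="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Storage/storageAccounts/${STORAGE_ACCOUNT}"

ROLE_CMD="az role assignment create --assignee \"${ASSIGNEE_ID}\" --role \"Storage Blob Data Contributor\" --scope \"${SCOPE}\""
echo "$ROLE_CMD"
```

Once the assignment exists, retry the mount with the same Client ID or Object ID that was used as the assignee.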

  

Symptom scenario 6: 

You have mounted blobfuse2 on your Linux machine, but the mount point is not accessible to users other than the root user.

 

Cause:

The mount point is not correctly configured to be accessed by other users.  

 

Resolution: 
Because you want the mount point to be accessible to a user or Microsoft Entra ID group other than the root user, you need to pass that user’s ID and group ID in the mount command with the flags below:

 

 

-o uid=USERID -o gid=GROUPID

 
Additionally, you need to set the parameter “allow-other: true” in the config file for this to work.
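As an illustration, the numeric IDs for the flags can be looked up with the standard `id` command. The sketch below resolves them for a target user and prints the resulting mount command for review rather than running it. `TARGET_USER` is a hypothetical variable that defaults to the current user, and the mount point and config path are the sample paths from this article.

```shell
#!/bin/sh
# Sketch: resolve the numeric uid/gid of the target user and print the mount command.
TARGET_USER="${TARGET_USER:-$(id -un)}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/testcon}"
CONFIG_FILE="${CONFIG_FILE:-/mnt/config.yaml}"

TARGET_UID=$(id -u "$TARGET_USER")   # numeric user ID
TARGET_GID=$(id -g "$TARGET_USER")   # numeric primary group ID

MOUNT_CMD="sudo blobfuse2 mount ${MOUNT_POINT} --config-file=${CONFIG_FILE} -o uid=${TARGET_UID} -o gid=${TARGET_GID}"
echo "$MOUNT_CMD"
```

Running it as `TARGET_USER=someuser sh script.sh` (the script name is a placeholder) prints the command for that user instead of the current one.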
 
If you want the mount point to be accessible to the user who is logged in, ensure that no UID or GID flags are passed in the mount command. Additionally, you can remove sudo from the mount command. For example, the mount command can be as follows:

linuxuser@blofuse2demo:/> blobfuse2 mount /mnt/testcon --config-file=/mnt/config.yaml 

 
Hope the article helps in troubleshooting the blobfuse2 issues! 
