Using Trident to Automate Azure NetApp Files from OpenShift
Published May 21 2021

Dynamic storage provisioning with NetApp Trident for ARO with ANF

 

This post was written in collaboration with Rizul Khanna.

 

Azure Red Hat OpenShift (ARO) is a managed OpenShift platform that runs a Kubernetes-powered container engine to deploy and manage applications in an agile way. OpenShift leverages Kubernetes technologies to provide the infrastructure required to run workloads, including the necessary storage for stateful applications.

There are multiple options that can be used with Azure Red Hat OpenShift to provide storage, Azure NetApp Files (ANF) being one of them. ANF-backed volumes provide multiple advantages, for example:

  • Kubernetes “ReadWriteMany” access mode, meaning that the same volume can be mounted simultaneously by multiple application instances
  • Dynamic allocation of volumes to different performance classes, without having to unmount the volumes
  • No impact on the maximum number of disks or the maximum IOPS limits of Azure Virtual Machines (Azure Disks do count against these limits)

With the Trident software component, additional functionality is available for all Kubernetes and OpenShift clusters using Azure NetApp Files: volumes can be dynamically created in Azure as they are demanded by applications. Instead of having to go to the Azure portal to manage the volumes in ANF accounts, Trident takes care of the lifecycle of the ANF volumes: it creates them when they are required, and deletes them when they are not. Operational complexity is reduced, and application deployments are simplified.

The following diagram summarizes the overall architecture of Trident:

 

[Diagram: overall architecture of Trident with OpenShift and Azure NetApp Files]

 

In essence, an OpenShift pod mounts a persistent volume (PV) through a persistent volume claim (PVC). The PVC is the Kubernetes representation of an ANF volume: when a PVC is created, it references a storage class (SC) that tells Trident how to create the volume in Azure. When the PVC is deleted, Trident deletes the volume from the ANF account.

In this blog we will cover some of the new features of Trident version 21.7.1 (the latest at the time of this writing), such as service level selectors and the ability to specify the list of capacity pools where the NetApp volumes will be provisioned.

ANF and Trident backend

As a first step, the Azure NetApp Files account and its capacity pools need to be created. In this example we will work with a single account and three capacity pools, which appear like this in the Azure portal:

[Screenshot: the ANF account with the Standard, Premium and Ultra capacity pools in the Azure portal]
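For reference, here is a minimal Azure CLI sketch of how a similar layout could be created (the $rg resource group variable and the pool size are assumptions; ANF capacity pools have a 4 TiB minimum):

# Assumed variables: $rg (resource group), $anf_name (ANF account), $anf_location (region)
az netappfiles account create --resource-group $rg --name $anf_name --location $anf_location

# One 4 TiB capacity pool per service level, named as referenced later in the Trident backend
for sku in Standard Premium Ultra
do
  az netappfiles pool create --resource-group $rg --account-name $anf_name \
      --name "${anf_name}-${sku}" --location $anf_location \
      --service-level $sku --size 4
done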

The first thing to do after installing Trident in the OpenShift cluster (with the command “tridentctl install”) is creating the backend, which instructs Trident which account and capacity pools to use:

 

{
  "backendName": "$anf_name",
  "version": 1,
  "storageDriverName": "azure-netapp-files",
  "subscriptionID": "$subscription_id",
  "tenantID": "$tenant_id",
  "clientID": "$sp_app_id",
  "clientSecret": "$sp_app_secret",
  "location": "$anf_location",
  "serviceLevel": "$anf_sku",
  "virtualNetwork": "$vnet_name",
  "subnet": "$anf_subnet_name",
  "nfsMountOptions": "vers=3,proto=tcp,timeo=600",
  "limitVolumeSize": "4Ti",
  "capacityPools": [ "${anf_name}-Standard", "${anf_name}-Premium", "${anf_name}-Ultra" ],
  "defaults": {
    "exportRule": "0.0.0.0/0",
    "size": "200Gi"
  },
 "storage": [{
      "labels": {
        "performance": "gold"
      },
      "serviceLevel": "Ultra"
    },
    {
      "labels": {
        "performance": "silver"
      },
      "serviceLevel": "Premium"
    },
    {
      "labels": {
        "performance": "bronze"
      },
      "serviceLevel": "Standard"
    }
  ]
}

 

There are a few things to notice:

  • The “storage” section at the end maps performance labels to ANF service levels, allowing Trident to choose the right capacity pool depending on the performance requested for the persistent volume in Kubernetes.
  • Another new feature is the possibility of specifying the list of ANF capacity pools that Trident will choose from when creating new volumes. Otherwise, Trident would pick any capacity pool in your Azure subscription that matches the required service level. In this case we only have 3 pools, but if you have more, it is good practice to define which pools your OpenShift volumes should be provisioned from.
  • The service principal referenced by “clientID” and “clientSecret” must have permissions in Azure to modify the ANF capacity pools, so that Trident can manage the volumes.
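Assuming the JSON above is saved as backend.json, and that Trident was installed in the “trident” namespace, the backend can be registered and verified with tridentctl:

# Expand the shell variables in the template (envsubst is just one way of doing this)
envsubst < backend.json > backend-resolved.json

# Register the backend with Trident, and verify that it shows up as online
tridentctl create backend -f backend-resolved.json -n trident
tridentctl get backend -n trident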

Storage Classes and Persistent Volume Claims

After the Trident backend has been created, storage classes can be created. A storage class is the Kubernetes resource that instructs the control plane how to dynamically provision persistent volumes as they are claimed by applications:

 

# Storage classes
perf_tiers=(bronze silver gold)           # To create a SC per tier
for perf_tier in "${perf_tiers[@]}"
do
  cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: anf-$perf_tier
provisioner: csi.trident.netapp.io
parameters:
  backendType: "azure-netapp-files"
  fsType: "nfs"
  selector: "performance=$perf_tier"  # Matching labels in the backends...
allowVolumeExpansion: true            # To allow volume resizing. This parameter is optional
mountOptions:
  - nconnect=16
EOF
done

 

There are some things to note in the previous code:

  • It is essentially a “for” loop in bash that creates three different storage classes (bronze, silver and gold)
  • For each storage class, the “csi.trident.netapp.io” Trident provisioner is specified. Note that this version is based on CSI, rather than on the old in-tree drivers
  • In the parameters, a “selector” matches the labels defined in the backend (“performance=bronze”, “performance=silver” and “performance=gold”), so when a PVC is created with a certain storage class, Trident knows which backend storage tier to use
  • The option “allowVolumeExpansion” enables increasing volume sizes without having to unmount the NFS shares (see the sketch after this list)
  • In the mount options, “nconnect=16” is specified for best performance, increasing the number of TCP connections that the NFS client opens to the server
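For instance, once a PVC such as the “anf-gold” one created below exists, its size can be increased in place with a patch like the following (a minimal sketch; the new size is arbitrary):

# Grow the PVC; Trident resizes the underlying ANF volume without unmounting it
kubectl patch pvc anf-gold -p '{"spec":{"resources":{"requests":{"storage":"5000Gi"}}}}'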

After creating the storage classes, they can be inspected with the kubectl command:

 

❯ kubectl get storageclass
NAME                        PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
anf-bronze                  csi.trident.netapp.io      Delete          Immediate              true                   88s
anf-gold                    csi.trident.netapp.io      Delete          Immediate              true                   80s
anf-silver                  csi.trident.netapp.io      Delete          Immediate              true                   84s
managed-premium (default)   kubernetes.io/azure-disk   Delete          WaitForFirstConsumer   true                   60m

 

In ARO, the storage classes are visible from the console as well:

 

[Screenshot: the storage classes in the OpenShift console]

 

After creating the storage classes, Persistent Volume Claims that reference them can be deployed. For example, this bash loop creates three PVCs, one per storage tier (gold/silver/bronze). Each PVC refers to the corresponding storage class (“anf-gold”, “anf-silver” or “anf-bronze”) created previously:

 

# PVCs
perf_tiers=(gold silver bronze)           # To create a PVC per tier
for perf_tier in "${perf_tiers[@]}"
do
  cat <<EOF | kubectl apply -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: anf-$perf_tier
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 4000Gi
  storageClassName: anf-$perf_tier
EOF
done

 

Again, some things to notice:

  • The PVCs request 4,000 GiB each: since ANF throughput scales with the provisioned volume size, this large size gives good performance, which we will look at further down
  • The PVCs are created with the “ReadWriteMany” access mode (ANF and Trident support “ReadWriteOnce” too)

At this point, the volumes will be created in the Azure NetApp Files account. You can inspect them via the CLI (“az netappfiles volume list”) or the Azure portal.
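For example, this sketch lists the volumes that Trident created in the Ultra pool, reusing the variables from before (the JMESPath field names are assumptions about the volume schema):

# List the ANF volumes in the Ultra capacity pool, with their name and throughput limit
az netappfiles volume list --resource-group $rg --account-name $anf_name \
    --pool-name "${anf_name}-Ultra" \
    --query '[].{name:name, throughputMibps:throughputMibps}' -o table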

 

[Screenshot: the ANF volumes created by Trident, in the Azure portal]

 

And of course, the PVCs are visible in Kubernetes and the OpenShift console:

 

❯ k get pvc
NAME         STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
anf-bronze   Pending                                      anf-bronze     2m36s
anf-gold     Pending                                      anf-gold       2m38s
anf-silver   Pending                                      anf-silver     2m37s
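Note that the PVCs show as “Pending” while Trident provisions the corresponding ANF volumes, which takes a few minutes; they then transition to “Bound”. You can follow the process with kubectl, or ask Trident directly:

# Watch the PVCs until they become Bound
kubectl get pvc --watch

# Alternatively, inspect the volumes from Trident's point of view
tridentctl get volume -n trident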

 

 

[Screenshot: the PVCs in the OpenShift console]

Using the PVCs

Once the PVCs are available, Kubernetes pods can mount persistent volumes based on them. The following code deploys a sample application (a Python-based API that I often use for testing) that leverages one of the PVCs created earlier, in this example the one in the gold class:

 

# Deployment & Service
perf_tier=gold
name=api-$perf_tier
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $name
  labels:
    app: $name
    deploymethod: trident
spec:
  replicas: 2
  selector:
    matchLabels:
      app: $name
  template:
    metadata:
      labels:
        app: $name
        deploymethod: trident
    spec:
      containers:
      - name: $name
        image: erjosito/sqlapi:1.0
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: disk01
          mountPath: /mnt/disk
      volumes:
      - name: disk01
        persistentVolumeClaim:
          claimName: anf-$perf_tier
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: $name
  name: $name
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: $name
  type: LoadBalancer
EOF
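Once the pods are running, a quick check can confirm that both replicas mounted the same NFS volume (the pod name lookup below is reused in the performance tests later):

# Verify that the ANF volume is NFS-mounted inside the first pod of the deployment
pod_name=$(kubectl get pod -o json -l app=api-${perf_tier} | jq -r '.items[0].metadata.name')
kubectl exec $pod_name -- df -h /mnt/disk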

 

Performance

As the previous Azure screenshot of the volumes shows, the volume in the gold class should support 512 MiB/s.
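That figure comes from the ANF quality-of-service model: throughput scales linearly with the provisioned volume size, at 128 MiB/s per TiB for the Ultra service level, so a quota rounded up to 4 TiB yields 4 × 128 = 512 MiB/s. If we install fio in the pod and run a quick test against the mounted volume, we can see the performance that Azure NetApp Files offers: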

 

 

pod_name=$(kubectl get pod -o json -l app=api-${perf_tier} | jq -r '.items[0].metadata.name')
kubectl exec $pod_name -- /bin/bash -c "apt update && apt install -y fio ioping --fix-missing"
kubectl exec $pod_name -- /bin/bash -c "cd /mnt/disk && fio --name=64kseqreads --rw=read --direct=1 --ioengine=libaio --bs=64k --numjobs=16 --iodepth=1 --size=1G --runtime=60 --group_reporting --fallocate=none"
[...]
   READ: bw=516MiB/s (542MB/s), 516MiB/s-516MiB/s (542MB/s-542MB/s), io=16.0GiB (17.2GB), run=31723-31723msec

 

 

We can measure the storage latency as well, which is excellent (sub-millisecond) for an NFS-mounted volume (consider that Azure Disks targets “single-digit millisecond latency for most I/O operations”):

 

 

❯ kubectl exec $pod_name -- /bin/bash -c "cd /mnt/disk && ioping -c 20 ."
[...]
--- . (nfs 192.168.0.196:/anf-1788fdae0a034ae28e9c85ef126bc1db) ioping statistics ---
19 requests completed in 21.2 ms, 76 KiB read, 897 iops, 3.50 MiB/s
generated 20 requests in 19.0 s, 80 KiB, 1 iops, 4.21 KiB/s
min/avg/max/mdev = 779.7 us / 1.11 ms / 4.25 ms / 746.0 us

 

Conclusion

Azure NetApp Files is a great way to provide persistent storage to Kubernetes and OpenShift workloads. Trident adds important functionality, such as managing the whole lifecycle of ANF volumes through the Kubernetes APIs, dynamically creating and deleting them as demanded by the applications running on the OpenShift cluster.
