Startups at Microsoft

From zero to hero: Mastering storage in Azure Kubernetes Service (AKS)

rmmartins
Mar 26, 2025

In today’s cloud-native era, managing persistent storage is a critical component of any production-grade Kubernetes deployment. Azure Kubernetes Service (AKS) provides a robust and scalable platform for orchestrating containerized applications. However, the challenge lies in selecting and configuring the right storage solution for stateful applications such as databases and content management systems. 

This guide is designed to take you from a storage novice to an expert in AKS storage management. You’ll learn essential storage concepts, evaluate various storage classes, and implement best practices tailored for AKS. To make these concepts tangible, we’ll walk you through hands-on labs where we deploy a sample WordPress and MySQL application in a dedicated namespace. In the process, you’ll see how to provision dynamic persistent volumes, benchmark performance using fio, and set up Velero for comprehensive backup and disaster recovery. Whether you’re looking to optimize performance or ensure data resilience, this guide has you covered. 

1. Understanding storage in AKS

1.1 Why storage matters in Kubernetes 

Kubernetes pods are ephemeral and may be recreated at any time. For stateful applications—such as databases or content management systems—data persistence is essential. AKS supports several storage backends, letting you choose the best option for your workload. Whether you need high-performance storage for databases or shared file systems for applications, understanding these options ensures your data remains durable and accessible. 

1.2 Storage options overview 

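| Storage Type            | Persistence | Performance                           | Use Case                                                                                                    |
|-------------------------|-------------|---------------------------------------|-------------------------------------------------------------------------------------------------------------|
| Ephemeral OS Disk       | No          | High                                  | Stateless apps, caching                                                                                     |
| Azure Managed Disks     | Yes         | High                                  | Databases, stateful applications                                                                            |
| Azure Files             | Yes         | Medium                                | Shared file systems, SMB/NFS storage                                                                        |
| Azure NetApp Files      | Yes         | Very High                             | Enterprise-grade workloads                                                                                  |
| Azure Blob Storage      | Yes         | Variable                              | Data lakes, backup, archiving                                                                               |
| Azure Container Storage | Yes         | Variable; optimized for container use | Container-native block storage for databases, streaming, caching, messaging, and other stateful applications |
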
Consider both performance and cost. Premium SSD-backed disks (managed-csi-premium) offer high IOPS and low latency for databases, whereas Azure Files can be more cost-effective for shared access needs.

2. Choosing the right storage class for AKS 

What is a StorageClass and its relation to storage options? 

A StorageClass is a Kubernetes resource that defines a "class" or policy for dynamically provisioning persistent storage. It specifies parameters—such as the type of disk (e.g., SSD, HDD), performance characteristics, replication policies, and other configuration options—used by the underlying storage provider. 

How it relates to storage options: 

  • Storage options refer to the various types of storage available in your environment, such as Azure Managed Disks, Azure Files, Azure Blob Storage, etc. Each storage option has distinct characteristics in terms of performance, cost, and use case. 
  • A StorageClass acts as a bridge between your Kubernetes applications and these underlying storage options. When you create a PersistentVolumeClaim (PVC), you reference a StorageClass, which instructs Kubernetes on how to dynamically provision a PersistentVolume (PV) based on the specified parameters and the chosen storage option. 

In essence, while storage options represent the physical or managed storage resources available, a StorageClass defines the rules for how that storage is allocated and managed within Kubernetes. This abstraction allows developers to simply request storage with a particular performance or cost profile, without needing to know the details of the underlying storage infrastructure. 

AKS offers several predefined storage classes tailored for different needs: 

  • managed-csi: Standard SSD managed disks for general-purpose workloads. 
  • managed-csi-premium: Premium SSD-backed disks for high-performance requirements. 
  • azurefile-csi: Azure Files with SMB/NFS support for shared file systems. 
  • azurefile-csi-premium: Premium Azure Files (via CSI) for workloads that require higher throughput and performance. 
  • azureblob-nfs-premium: Premium Azure Blob Storage using NFS for scenarios that require file-system access. 
  • azureblob-fuse-premium: Premium Azure Blob Storage mounted using Blobfuse for specialized workloads. 

When selecting a storage class, consider your application’s IOPS, latency, and capacity needs. For instance, a database may require the low latency and high IOPS offered by premium SSDs (managed-csi-premium), while a file server or shared resource might be better served by Azure Files. Understanding these trade-offs is key to optimizing both performance and cost. 
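
If none of the predefined classes fits, you can define your own. The following is a minimal sketch rather than part of the original walkthrough: the class name managed-csi-premium-retain and the file name are hypothetical, and the parameters assume the Azure Disk CSI driver (disk.csi.azure.com) that backs the managed-csi classes. It keeps the disk when the PVC is deleted and allows volume expansion.

```shell
# List the classes already present in the cluster (requires cluster access):
#   kubectl get storageclass

# Write a custom StorageClass manifest to a local file.
# The name "managed-csi-premium-retain" is a hypothetical example.
cat <<'EOF' > storageclass-premium-retain.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi-premium-retain
provisioner: disk.csi.azure.com        # Azure Disk CSI driver
parameters:
  skuName: Premium_LRS                 # Premium SSD, locally redundant
reclaimPolicy: Retain                  # keep the disk after the PVC is deleted
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF

# Apply it once you are happy with the settings:
#   kubectl apply -f storageclass-premium-retain.yaml
```

A PVC would then request this class through storageClassName, exactly as the labs below do with the predefined classes.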

3. Hands-on labs 

In this section, we deploy MySQL and WordPress within a dedicated namespace called wordpress. This exercise demonstrates practical usage of two AKS storage options (Azure Managed Disks for MySQL and Azure Files for the WordPress content) while teaching key storage concepts such as dynamic PV binding, PVC provisioning via StatefulSet, and automated database initialization. 

Overview diagram 

[Diagram: how MySQL and WordPress interact in AKS]

Lab 1: Setting up the AKS cluster and deploying MySQL

Step 1: Create an AKS cluster 

First, create your resource group and AKS cluster: 

az group create --name MyResourceGroup --location eastus 
az aks create \
  --resource-group MyResourceGroup \
  --name MyAKSCluster \
  --node-count 3 \
  --enable-addons monitoring \
  --generate-ssh-keys

After the cluster is created, retrieve its credentials: 

az aks get-credentials --resource-group MyResourceGroup --name MyAKSCluster 

Step 2: Create the "wordpress" namespace 

Now that your cluster is ready, create a dedicated namespace for your WordPress and MySQL deployments: 

kubectl create namespace wordpress 

Verify the namespace: 

kubectl get namespaces 

(Optional step: Manual PVC creation for MySQL) 

Note: When using a StatefulSet with a volumeClaimTemplate (as shown in Step 5), Kubernetes automatically creates a PVC for each pod (e.g., mysql-storage-mysql-0). This is the recommended approach for stateful applications. Manual PVC creation for MySQL is unnecessary and may lead to redundant resources. 

This step is shown here only for demonstration and can be skipped. 

cat <<EOF | kubectl apply -f - 
apiVersion: v1 
kind: PersistentVolumeClaim 
metadata: 
  name: mysql-pvc 
  namespace: wordpress 
spec: 
  accessModes: 
    - ReadWriteOnce 
  resources: 
    requests: 
      storage: 10Gi 
  storageClassName: managed-csi-premium 
EOF

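Verify the PVC (if created). Because managed-csi-premium uses WaitForFirstConsumer volume binding, the claim shows as Pending until a pod mounts it: 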

kubectl get pvc mysql-pvc -n wordpress 

Step 3: Create a headless service for MySQL 

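A headless service (clusterIP: None) gives each StatefulSet pod a stable DNS name, such as mysql-0.mysql.wordpress.svc.cluster.local, instead of a single load-balanced virtual IP: 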

cat <<EOF | kubectl apply -f - 
apiVersion: v1 
kind: Service 
metadata: 
  name: mysql 
  namespace: wordpress 
spec: 
  clusterIP: None 
  selector: 
    app: mysql 
  ports: 
    - port: 3306 
      targetPort: 3306 
EOF

 Verify the service: 

kubectl get svc -n wordpress mysql 

Step 4: Prepare MySQL storage (subPath and automatic database initialization) 

Dynamically provisioned PVs (e.g., using ext4) include a default lost+found directory, which could cause MySQL to consider the data directory non-empty. We use the subPath option to mount a specific subdirectory (e.g., mysql-data) and avoid this issue. Additionally, we automate database creation so you don't have to log in manually. 

Automate database initialization: 

cat <<EOF | kubectl apply -f - 
apiVersion: v1 
kind: ConfigMap 
metadata: 
  name: mysql-initdb 
  namespace: wordpress 
data: 
  init-wordpress.sql: | 
    CREATE DATABASE IF NOT EXISTS wordpress; 
EOF

Verify the ConfigMap:

kubectl get configmaps -n wordpress 

Step 5: Deploy MySQL using a StatefulSet with a volumeClaimTemplate 

This approach automatically creates a unique PVC for the MySQL pod (e.g., mysql-storage-mysql-0), ensuring consistency and eliminating the need for a separate manual PVC. 

cat <<EOF | kubectl apply -f - 
apiVersion: apps/v1 
kind: StatefulSet 
metadata: 
  name: mysql 
  namespace: wordpress 
spec: 
  serviceName: mysql 
  replicas: 1 
  selector: 
    matchLabels: 
      app: mysql 
  template: 
    metadata: 
      labels: 
        app: mysql 
    spec: 
      containers: 
      - name: mysql 
        image: mysql:5.7 
        env: 
        - name: MYSQL_ROOT_PASSWORD 
          value: "password123" 
        volumeMounts: 
        - name: mysql-storage 
          mountPath: /var/lib/mysql 
          subPath: mysql-data  # Mount only this subdirectory to avoid 'lost+found' 
        - name: initdb 
          mountPath: /docker-entrypoint-initdb.d 
      volumes: 
      - name: initdb 
        configMap: 
          name: mysql-initdb 
  volumeClaimTemplates: 
  - metadata: 
      name: mysql-storage 
    spec: 
      accessModes: [ "ReadWriteOnce" ] 
      resources: 
        requests: 
          storage: 10Gi 
      storageClassName: managed-csi-premium 
EOF

 Verify the StatefulSet: 

kubectl get statefulset mysql -n wordpress 

Verify the created pods: 

kubectl get pods -n wordpress -l app=mysql 

Verify the PVCs created by the volumeClaimTemplate: 

kubectl get pvc -n wordpress 

Finally, check the logs of the mysql-0 pod to confirm successful initialization (including automatic database creation): 

kubectl logs -n wordpress mysql-0 
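
Rather than reading the logs by eye, you can wait for the rollout and query MySQL directly. This is a sketch under the assumptions of this lab: the StatefulSet is named mysql, its pod is mysql-0, and the root password is available inside the container as MYSQL_ROOT_PASSWORD. The function name check_mysql_ready is made up for illustration.

```shell
# check_mysql_ready: hypothetical helper that blocks until the MySQL
# StatefulSet has rolled out, then lists databases from inside the pod.
#   $1 = namespace (defaults to wordpress)
check_mysql_ready() {
  local ns="${1:-wordpress}"

  # Wait (up to 5 minutes) for the StatefulSet rollout to finish.
  kubectl rollout status statefulset/mysql -n "$ns" --timeout=300s || return 1

  # Run the mysql client inside the pod. The password is read from the
  # container's own environment, so it is not typed on the local command line.
  kubectl exec mysql-0 -n "$ns" -- \
    sh -c 'mysql -uroot -p"$MYSQL_ROOT_PASSWORD" -e "SHOW DATABASES;"'
}
```

Calling check_mysql_ready wordpress should list wordpress among the databases if the init script ran. The StatefulSet defines no readiness probe, so the query may need a retry while MySQL is still initializing.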


Lab 2: Deploying WordPress and connecting to MySQL
 
Step 1: Create a separate PVC for WordPress files 
We use a separate PVC for WordPress files to keep application data distinct from database storage: 

cat <<EOF | kubectl apply -f - 
apiVersion: v1 
kind: PersistentVolumeClaim 
metadata: 
  name: wordpress-pvc 
  namespace: wordpress 
spec: 
  accessModes: 
    - ReadWriteOnce 
  resources: 
    requests: 
      storage: 10Gi 
  storageClassName: azurefile-csi 
EOF

Verify the PVC:

kubectl get pvc wordpress-pvc -n wordpress

 

Step 2: Create secrets for WordPress credentials 

Store the required credentials securely: 

cat <<EOF | kubectl apply -f - 
apiVersion: v1 
kind: Secret 
metadata: 
  name: wordpress-secrets 
  namespace: wordpress 
type: Opaque 
data: 
  WORDPRESS_DB_USER: cm9vdA==              # "root" in base64 
  WORDPRESS_DB_PASSWORD: cGFzc3dvcmQxMjM=  # "password123" in base64 
EOF
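
The values under data must be base64-encoded, and a stray trailing newline in the encoded value is a common cause of login failures. As a quick sanity check (a sketch assuming a base64 utility such as the GNU coreutils one is on your PATH), encode with printf rather than echo so that no newline sneaks in:

```shell
# Encode the credentials without a trailing newline.
user_b64=$(printf '%s' 'root' | base64)
pass_b64=$(printf '%s' 'password123' | base64)
echo "WORDPRESS_DB_USER: $user_b64"
echo "WORDPRESS_DB_PASSWORD: $pass_b64"

# Decode again to confirm the round trip.
printf '%s' "$user_b64" | base64 --decode; echo
```

The encoded strings should match the ones in the manifest above. Kubernetes also accepts plain-text values under stringData instead of data and encodes them for you.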

Verify the secrets: 

kubectl get secret wordpress-secrets -n wordpress 

Step 3: Deploy WordPress 

Deploy WordPress using a Deployment that connects to MySQL: 

cat <<EOF | kubectl apply -f - 
apiVersion: apps/v1 
kind: Deployment 
metadata: 
  name: wordpress 
  namespace: wordpress 
spec: 
  replicas: 1 
  selector: 
    matchLabels: 
      app: wordpress 
  template: 
    metadata: 
      labels: 
        app: wordpress 
    spec: 
      containers: 
      - name: wordpress 
        image: wordpress:latest 
        env: 
        - name: WORDPRESS_DB_HOST 
          value: mysql 
        - name: WORDPRESS_DB_USER 
          valueFrom: 
            secretKeyRef: 
              name: wordpress-secrets 
              key: WORDPRESS_DB_USER 
        - name: WORDPRESS_DB_PASSWORD 
          valueFrom: 
            secretKeyRef: 
              name: wordpress-secrets 
              key: WORDPRESS_DB_PASSWORD 
        volumeMounts: 
        - name: wp-storage 
          mountPath: /var/www/html 
      volumes: 
      - name: wp-storage 
        persistentVolumeClaim: 
          claimName: wordpress-pvc 
EOF

Verify the WordPress deployment: 

kubectl get deployments wordpress -n wordpress 

Verify the WordPress pods:  

kubectl get pods -n wordpress 

Step 4: Expose WordPress externally 

Create a LoadBalancer service to expose WordPress externally: 

cat <<EOF | kubectl apply -f - 
apiVersion: v1 
kind: Service 
metadata: 
  name: wordpress-service 
  namespace: wordpress 
spec: 
  selector: 
    app: wordpress 
  ports: 
    - protocol: TCP 
      port: 80 
      targetPort: 80 
  type: LoadBalancer 
EOF

Verify the service: 

kubectl get svc wordpress-service -n wordpress 

To extract just the external IP, run: 

kubectl get svc wordpress-service -n wordpress \ 
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}{"\n"}' 

Open the external IP in your browser to complete the WordPress setup. With our automated initialization, the wordpress database is created automatically in MySQL, so no manual intervention is required. 
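
The load balancer can take a minute or two to receive a public IP, and until then the jsonpath query prints an empty line. A small polling loop avoids re-running the command by hand. This is a sketch with a made-up function name (wait_for_external_ip) and arbitrary retry settings:

```shell
# wait_for_external_ip: hypothetical helper that polls a Service until the
# load balancer reports an ingress IP, then prints it.
#   $1 = service, $2 = namespace, $3 = attempts (default 30), $4 = delay in s (default 10)
wait_for_external_ip() {
  local svc="$1" ns="$2" tries="${3:-30}" delay="${4:-10}" ip=""
  local i=0
  while [ "$i" -lt "$tries" ]; do
    ip=$(kubectl get svc "$svc" -n "$ns" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no external IP after $tries attempts" >&2
  return 1
}
```

For this lab the call would be wait_for_external_ip wordpress-service wordpress.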

4. Performance benchmarking for storage classes 

To measure IOPS and throughput, deploy a temporary pod with fio: 

kubectl run storage-test -n wordpress --rm -it --image=debian -- bash 

Inside the pod, run: 

apt-get update && apt-get install -y fio 
fio --name=randwrite --ioengine=libaio --rw=randwrite --bs=4k --size=1G --numjobs=4 --runtime=60 --group_reporting 
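
Note that a pod started this way has no PersistentVolume attached, so fio writes into the container's own filesystem on the node rather than onto a volume from a particular storage class. To benchmark a specific class, one option is to mount a PVC and point fio at it. The manifest below is a sketch: the names fio-pvc and fio-test, the 20Gi size, and the /data mount path are arbitrary choices, not part of the original lab.

```shell
# Write a PVC plus a long-running Debian pod that mounts it at /data.
cat <<'EOF' > fio-test.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fio-pvc
  namespace: wordpress
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: managed-csi-premium   # swap in the class under test
  resources:
    requests:
      storage: 20Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: fio-test
  namespace: wordpress
spec:
  containers:
  - name: fio
    image: debian
    command: [ "sleep", "infinity" ]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: fio-pvc
EOF

# Then, against a live cluster:
#   kubectl apply -f fio-test.yaml
#   kubectl exec -it fio-test -n wordpress -- bash
#   # inside the pod, after installing fio:
#   fio --name=randwrite --directory=/data --direct=1 --ioengine=libaio \
#       --rw=randwrite --bs=4k --size=1G --numjobs=4 --runtime=60 --group_reporting
#   kubectl delete -f fio-test.yaml
```

Here --directory places the test files on the mounted volume and --direct=1 bypasses the page cache, so the results reflect the disk rather than memory.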

Example fio output explanation 

Interpretation: 

1. IOPS (Input/Output Operations Per Second) 

  • IOPS = 8913 
    On average, this setup handled about 8,900 random 4K write operations per second. This is a solid indicator of how many small, random writes the storage can handle in parallel. 

2. Bandwidth (Throughput) 

  • BW = 34.8 MiB/s (≈36.5 MB/s) 
    Over the 60-second test, the system wrote roughly 34.8 MiB of data per second. This corresponds well to the IOPS figure when considering each write is 4K in size. 

3. Latency 

  • slat (submission latency): ~446 µs on average. 
    This is the time it takes fio to submit the I/O request to the kernel or I/O subsystem. 
  • clat (completion latency): ~1.59 µs on average (but with some large outliers). 
    This is the time from when the request is submitted to when it’s completed by the storage. 
  • lat (total latency): ~448 µs on average. 
    The overall latency (submission + completion) is under half a millisecond on average, which is decent for a cloud-based block storage scenario. However, there are a few high spikes, as seen by the maximum lat of ~123 ms. 

4. Percentiles 

  • The 99th percentile latencies are important for understanding worst-case performance: 
    • 99.00th = 9024 ns (~9 µs) 
    • 99.90th = 26496 ns (~26 µs) 
    • 99.95th = 64256 ns (~64 µs) 
    • 99.99th = 692224 ns (~692 µs) 
  • Most I/Os complete quickly, but we do see occasional outliers (in the hundreds of microseconds to over a millisecond). This can happen in bursty or cloud storage environments. 

5. CPU Usage 

  • usr=0.48%, sys=1.43% 
    The CPU overhead is relatively low, indicating that the storage performance (rather than CPU resources) is the primary bottleneck. 

6. IO Depth 

  • iodepth=1 for each job (4 jobs total). 
    This means fio is issuing only one I/O request at a time per job. The results might differ if you increase the iodepth to allow more in-flight requests. 

7. Summary 

  • Overall IOPS: ~8,900 
  • Throughput: ~34.8 MiB/s 
  • Average Latency: ~448 µs, with occasional spikes 
  • CPU usage is low, suggesting storage is the limiting factor rather than compute. 

These metrics suggest that the AKS storage (backed by Azure Managed Disks in this scenario) can handle ~8,900 random 4K writes per second at an average latency of under half a millisecond—a respectable performance for many stateful applications. These values provide a benchmark to compare against your workload's expected performance.  
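
The figures above can be cross-checked with a little arithmetic: bandwidth should equal IOPS times block size, and with four jobs at iodepth 1, IOPS should be roughly the number of in-flight requests divided by the average latency. A quick sketch using awk, with the values copied from the interpretation above:

```shell
iops=8913        # reported IOPS
bs_kib=4         # block size in KiB (--bs=4k)
jobs=4           # --numjobs=4, each at iodepth 1
lat_us=448       # average total latency in microseconds

# Bandwidth in MiB/s = IOPS * block size (KiB) / 1024
bw_mib=$(LC_ALL=C awk -v i="$iops" -v b="$bs_kib" 'BEGIN { printf "%.1f", i * b / 1024 }')

# IOPS implied by latency = in-flight requests / latency (in seconds)
est_iops=$(LC_ALL=C awk -v j="$jobs" -v l="$lat_us" 'BEGIN { printf "%.0f", j * 1000000 / l }')

echo "bandwidth: ${bw_mib} MiB/s"
echo "IOPS implied by latency: ${est_iops}"
```

Both results land within a fraction of a percent of the reported 34.8 MiB/s and roughly 8,900 IOPS, which suggests the numbers are internally consistent.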

For instance, if you plan to run a high-transaction database, you may need higher IOPS and lower latencies; in that case, consider a higher-performance storage option (e.g., Azure Ultra Disk), adding replicas, or tuning application-level caching and concurrency. Conversely, if your workload is less I/O intensive, these results suggest that your current storage configuration is sufficient. 

5. Backup & Disaster Recovery 

Reliable backup and disaster recovery are critical for production systems. Here, we cover two methods: using Velero and backing up MySQL with database dumps. 

5.1 Installing Velero for cluster backups 

Step 1: Install the Velero CLI 

Download the Velero CLI from the Velero releases page. For Linux: 

wget https://github.com/vmware-tanzu/velero/releases/download/v1.9.3/velero-v1.9.3-linux-amd64.tar.gz 
tar -xzvf velero-v1.9.3-linux-amd64.tar.gz 
sudo mv velero-v1.9.3-linux-amd64/velero /usr/local/bin/velero 
velero version

Step 2: Retrieve your Subscription ID 

Run: 

az account show --query id --output tsv 

This command returns your subscription ID.

Step 3: Create a service principal for Velero 

Replace <your-subscription-id> with your subscription ID: 

az ad sp create-for-rbac --name VeleroSP --role Contributor --scopes /subscriptions/<your-subscription-id> 

 Sample output: 

{ 
  "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", 
  "displayName": "VeleroSP", 
  "password": "your-generated-password", 
  "tenant": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy" 
} 

 Verify the service principal:  

az ad sp list --display-name VeleroSP 

Step 4: Create the Velero credentials file 

Create a file named credentials-velero with the following content (replace placeholders with actual values):

AZURE_SUBSCRIPTION_ID=<your-subscription-id> 
AZURE_TENANT_ID=<your-tenant-id> 
AZURE_CLIENT_ID=<your-appId> 
AZURE_CLIENT_SECRET=<your-password>
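
Velero stores backup data in an Azure Blob container, so the storage account and container named in the install step need to exist beforehand. The post does not show how they were created; one way, sketched below, uses the Azure CLI. The function name and its arguments are illustrative, and the SKU and account kind are assumptions you may want to change.

```shell
# create_velero_storage: hypothetical helper that creates a storage account
# and a private blob container for Velero backups.
#   $1 = resource group, $2 = storage account name, $3 = container name
create_velero_storage() {
  local rg="$1" account="$2" container="$3"

  # Geo-redundant general-purpose v2 account, HTTPS only.
  az storage account create \
    --name "$account" \
    --resource-group "$rg" \
    --sku Standard_GRS \
    --kind StorageV2 \
    --https-only true || return 1

  # Private container that will hold the backup objects.
  az storage container create \
    --name "$container" \
    --account-name "$account" \
    --public-access off
}
```

Storage account names must be 3 to 24 lowercase letters and digits and globally unique, so a placeholder like myStorageAccount has to be replaced with a valid name, for example create_velero_storage MyResourceGroup <your-storage-account> my-backup-bucket.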

Step 5: Install Velero 

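Run the following command. Here --bucket is the name of the blob container, and resourceGroup and storageAccount identify the storage account that holds it. The Azure plugin version must be compatible with the Velero version you installed, so check the plugin's compatibility matrix and adjust the tag if necessary: 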

velero install \
  --provider azure \
  --plugins velero/velero-plugin-for-microsoft-azure:v1.9.1 \
  --bucket my-backup-bucket \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=true \
  --backup-location-config resourceGroup=myResourceGroup,storageAccount=myStorageAccount

The output will show the creation of CRDs, namespace, service account, and deployment. When finished, it will state: 

Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero' to view the status. 
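
With Velero installed, a namespace-scoped backup and restore might look like the following sketch. The backup name wordpress-backup and the wrapper function names are placeholders; the velero subcommands and flags are the standard CLI ones.

```shell
# backup_wordpress: back up everything in the wordpress namespace and wait
# for the backup to complete. The backup name is a placeholder.
#   $1 = backup name (defaults to wordpress-backup)
backup_wordpress() {
  local name="${1:-wordpress-backup}"
  velero backup create "$name" --include-namespaces wordpress --wait || return 1
  velero backup describe "$name"
}

# restore_wordpress: restore the namespace from a named backup.
#   $1 = backup name (defaults to wordpress-backup)
restore_wordpress() {
  local name="${1:-wordpress-backup}"
  velero restore create --from-backup "$name" --wait
}
```

Run backup_wordpress after the labs are deployed, and use velero backup get to list existing backups before restoring.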

5.2 Backing up MySQL using database dumps 

Step 1: Get the MySQL pod name 

Retrieve the MySQL pod name with: 

kubectl get pods -n wordpress -l app=mysql -o jsonpath='{.items[0].metadata.name}' 

Step 2: Create a MySQL dump 

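Replace <mysql-pod-name> with the actual name (e.g., mysql-0). Omit the -t flag so that terminal output is not mixed into the dump, and read the root password from the container's MYSQL_ROOT_PASSWORD variable instead of an interactive prompt: 

kubectl exec <mysql-pod-name> -n wordpress -- sh -c 'exec mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" wordpress' > wordpress-backup.sql 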

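Step 3: Restore MySQL from the dump 

Copy the dump into the pod, then run the import inside the container so that the input redirection is evaluated there rather than by your local shell: 

kubectl cp wordpress-backup.sql <mysql-pod-name>:/tmp/wordpress-backup.sql -n wordpress 
kubectl exec <mysql-pod-name> -n wordpress -- sh -c 'mysql -u root -p"$MYSQL_ROOT_PASSWORD" wordpress < /tmp/wordpress-backup.sql' 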

6. Troubleshooting & Best Practices

  • Stuck PVCs: 
    If a PVC remains stuck in Terminating, first check that no pod still mounts it, since the protection finalizer exists to prevent deleting a volume that is in use. If you are sure it is safe, clear the finalizers with a merge patch, which replaces the list as a whole: 
kubectl patch pvc mysql-pvc -n wordpress --type=merge -p '{"metadata":{"finalizers":[]}}' 
  • MySQL initialization errors: 
    The error regarding a non-empty data directory is usually due to the default lost+found folder on ext4 file systems. The subPath fix ensures MySQL only sees an empty subdirectory. 
  • PVC pending state: 
    When a PVC is created, it may show as Pending while the dynamic provisioner creates and binds the underlying PV. With storage classes that use WaitForFirstConsumer volume binding, such as the AKS managed disk classes, the PVC stays Pending until a pod that uses it is scheduled. 
  • DNS resolution & network connectivity: 
    Verify that the MySQL service name resolves inside the namespace by running: 
kubectl run debug --rm -it --image=busybox -n wordpress -- nslookup mysql 
  • Credentials & Environment Variables: 
    Ensure that the database user and password passed to WordPress (WORDPRESS_DB_USER, WORDPRESS_DB_PASSWORD) match the credentials configured in MySQL (MYSQL_ROOT_PASSWORD in this lab).
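
When a PVC stays Pending or a pod cannot mount its volume, the events usually say why. The helper below is a sketch with an invented name (diagnose_pvc) that gathers the most useful output in one go:

```shell
# diagnose_pvc: hypothetical helper that prints the state of a PVC, recent
# namespace events, and the available storage classes.
#   $1 = PVC name, $2 = namespace (defaults to wordpress)
diagnose_pvc() {
  local pvc="$1" ns="${2:-wordpress}"

  # Status, bound volume, and provisioning events for the claim itself.
  kubectl describe pvc "$pvc" -n "$ns"

  # Recent events in the namespace, oldest first.
  kubectl get events -n "$ns" --sort-by=.lastTimestamp

  # Confirm the requested class exists and check its binding mode.
  kubectl get storageclass
}
```

For the MySQL volume in this lab, the call would be diagnose_pvc mysql-storage-mysql-0 wordpress.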

7. Final Thoughts and Next Steps

This blog post focused on mastering storage in AKS by exploring various storage options and best practices. The WordPress and MySQL deployment served as an exercise in using Azure Managed Disks for the database and Azure Files for the WordPress content, while teaching key storage concepts such as PVC provisioning via a StatefulSet (using volumeClaimTemplates), dynamic PV binding, the subPath workaround for lost+found, and automated database initialization through a ConfigMap. We also covered performance benchmarking using fio and walked through installing Velero for backup and disaster recovery. 

Next Steps: 

  • Experiment with advanced scaling and monitoring for both MySQL and WordPress. 
  • Implement additional security measures for production deployments. 
  • Explore other AKS storage options (like Azure Files or Blob Storage) and benchmark their performance based on your specific workload requirements. 

Happy deploying, and enjoy mastering storage in AKS! 

Updated Apr 16, 2025
Version 2.0