In today’s cloud-native era, managing persistent storage is a critical component of any production-grade Kubernetes deployment. Azure Kubernetes Service (AKS) provides a robust and scalable platform for orchestrating containerized applications. However, the challenge lies in selecting and configuring the right storage solution for stateful applications such as databases and content management systems.
This guide is designed to take you from a storage novice to an expert in AKS storage management. You’ll learn essential storage concepts, evaluate various storage classes, and implement best practices tailored for AKS. To make these concepts tangible, we’ll walk you through hands-on labs where we deploy a sample WordPress and MySQL application in a dedicated namespace. In the process, you’ll see how to provision dynamic persistent volumes, benchmark performance using fio, and set up Velero for comprehensive backup and disaster recovery. Whether you’re looking to optimize performance or ensure data resilience, this guide has you covered.
1. Understanding storage in AKS
1.1 Why storage matters in Kubernetes
Kubernetes pods are ephemeral and may be recreated at any time. For stateful applications—such as databases or content management systems—data persistence is essential. AKS supports several storage backends, letting you choose the best option for your workload. Whether you need high-performance storage for databases or shared file systems for applications, understanding these options ensures your data remains durable and accessible.
1.2 Storage options overview
| Storage Type | Persistence | Performance | Use Case |
|---|---|---|---|
| Ephemeral OS Disk | No | High | Stateless apps, caching |
| Azure Managed Disks | Yes | High | Databases, stateful applications |
| Azure Files | Yes | Medium | Shared file systems, SMB/NFS storage |
| Azure NetApp Files | Yes | Very High | Enterprise-grade workloads |
| Azure Blob Storage | Yes | Variable | Data lakes, backup, archiving |
| Azure Container Storage | Yes | Variable; optimized for container use | Container-native block storage for databases, streaming, caching, messaging, and other generic stateful applications |
Consider both performance and cost. Premium SSD-backed disks (managed-csi-premium) offer high IOPS and low latency for databases, whereas Azure Files can be more cost-effective for shared access needs.
2. Choosing the right storage class for AKS
What is a StorageClass and its relation to storage options?
A StorageClass is a Kubernetes resource that defines a "class" or policy for dynamically provisioning persistent storage. It specifies parameters—such as the type of disk (e.g., SSD, HDD), performance characteristics, replication policies, and other configuration options—used by the underlying storage provider.
How it relates to storage options:
- Storage options refer to the various types of storage available in your environment, such as Azure Managed Disks, Azure Files, Azure Blob Storage, etc. Each storage option has distinct characteristics in terms of performance, cost, and use case.
- A StorageClass acts as a bridge between your Kubernetes applications and these underlying storage options. When you create a PersistentVolumeClaim (PVC), you reference a StorageClass, which instructs Kubernetes on how to dynamically provision a PersistentVolume (PV) based on the specified parameters and the chosen storage option.
In essence, while storage options represent the physical or managed storage resources available, a StorageClass defines the rules for how that storage is allocated and managed within Kubernetes. This abstraction allows developers to simply request storage with a particular performance or cost profile, without needing to know the details of the underlying storage infrastructure.
AKS offers several predefined storage classes tailored for different needs:
- managed-csi: Standard managed disk (SSD) for general-purpose workloads.
- managed-csi-premium: Premium SSD-backed disks for high-performance requirements.
- azurefile-csi: Azure Files with SMB/NFS support for shared file systems.
- azurefile-csi-premium: Premium Azure Files (via CSI) for workloads that require higher throughput and performance.
- azureblob-nfs-premium: Premium Azure Blob Storage using NFS for scenarios that require file-system access.
- azureblob-fuse-premium: Premium Azure Blob Storage mounted using Blobfuse for specialized workloads.
When selecting a storage class, consider your application’s IOPS, latency, and capacity needs. For instance, a database may require the low latency and high IOPS offered by premium SSDs (managed-csi-premium), while a file server or shared resource might be better served by Azure Files. Understanding these trade-offs is key to optimizing both performance and cost.
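When none of the built-in classes matches a workload, a custom StorageClass can be defined. The following is a minimal sketch rather than a recommended configuration: the class name premium-retain and the file name are hypothetical, while disk.csi.azure.com is the Azure Disk CSI provisioner and skuName selects the disk SKU. The manifest is written to a local file so it can be reviewed before it is applied.

```shell
# Write a hypothetical custom StorageClass manifest to a local file.
# The class name "premium-retain" and the file name are illustrative assumptions.
cat > premium-retain-sc.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: premium-retain
provisioner: disk.csi.azure.com   # Azure Disk CSI driver
parameters:
  skuName: Premium_LRS            # Premium SSD, locally redundant
reclaimPolicy: Retain             # keep the disk after the PVC is deleted
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF

# With cluster access, the class could then be created with:
# kubectl apply -f premium-retain-sc.yaml
```

A PVC can reference the class through storageClassName once it exists. Retain is chosen here so the disk outlives the claim, which suits data you would rather delete by hand.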
3. Hands-on labs
In this section, we deploy MySQL and WordPress within a dedicated namespace called wordpress. This exercise demonstrates the practical usage of two AKS storage options (Azure Managed Disks for MySQL and Azure Files for WordPress) while teaching key storage concepts such as dynamic PV binding, PVC provisioning via StatefulSet, and automated database initialization.
Overview diagram
Below is a diagram illustrating how MySQL and WordPress interact in AKS:
Lab 1: Setting up the AKS cluster and deploying MySQL
Step 1: Create an AKS cluster
First, create your resource group and AKS cluster:
az group create --name MyResourceGroup --location eastus
az aks create \
--resource-group MyResourceGroup \
--name MyAKSCluster \
--node-count 3 \
--enable-addons monitoring \
--generate-ssh-keys
After the cluster is created, retrieve its credentials:
az aks get-credentials --resource-group MyResourceGroup --name MyAKSCluster
Step 2: Create the "wordpress" namespace
Now that your cluster is ready, create a dedicated namespace for your WordPress and MySQL deployments:
kubectl create namespace wordpress
Verify the namespace:
kubectl get namespaces
(Optional step: Manual PVC creation for MySQL)
Note: When using a StatefulSet with a volumeClaimTemplate (as shown in Step 5), Kubernetes automatically creates a PVC for each pod (e.g., mysql-storage-mysql-0). This is the recommended approach for stateful applications. Manual PVC creation for MySQL is unnecessary and may lead to redundant resources.
This step is shown here only for demonstration and can be skipped.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
  namespace: wordpress
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: managed-csi-premium
EOF
Verify the PVC (if created):
kubectl get pvc mysql-pvc -n wordpress
Step 3: Create a headless service for MySQL
A headless service ensures proper DNS resolution for MySQL pods:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: mysql
  namespace: wordpress
spec:
  clusterIP: None
  selector:
    app: mysql
  ports:
    - port: 3306
      targetPort: 3306
EOF
Verify the service:
kubectl get svc -n wordpress mysql
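Because the service is headless and the MySQL StatefulSet names it as its serviceName, each pod gets a stable DNS record of the form pod.service.namespace.svc.cluster-domain. The sketch below only assembles that name; cluster.local is the usual default cluster domain and is an assumption about your cluster.

```shell
pod=mysql-0                    # first (and only) replica of the StatefulSet
service=mysql                  # headless service created above
namespace=wordpress
cluster_domain=cluster.local   # assumption: default cluster domain

# Stable per-pod DNS name provided by the headless service
pod_fqdn="${pod}.${service}.${namespace}.svc.${cluster_domain}"
echo "$pod_fqdn"
```

From another pod in the same namespace, the short name mysql also resolves through the service, which is the value WordPress uses for WORDPRESS_DB_HOST in Lab 2.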
Step 4: Prepare MySQL initialization (subPath and automatic database creation)
Dynamically provisioned PVs (e.g., using ext4) include a default lost+found directory, which could cause MySQL to consider the data directory non-empty. We use the subPath option to mount a specific subdirectory (e.g., mysql-data) and avoid this issue. Additionally, we automate database creation so you don't have to log in manually.
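The effect can be reproduced locally without a cluster. This sketch uses a temporary directory as a stand-in for the volume root (an assumption for illustration; no real ext4 volume is involved) and compares what a data-directory check would see with and without the subdirectory.

```shell
vol=$(mktemp -d)                 # stand-in for the root of a new ext4 volume
mkdir "$vol/lost+found"          # ext4 creates this directory when formatting

# Mounting the volume root: the data directory already contains an entry.
root_entries=$(ls -A "$vol" | wc -l | tr -d ' ')

# Mounting with subPath: only the new, empty subdirectory is visible.
mkdir "$vol/mysql-data"
subpath_entries=$(ls -A "$vol/mysql-data" | wc -l | tr -d ' ')

echo "entries at volume root: $root_entries"
echo "entries under subPath:  $subpath_entries"
rm -rf "$vol"
```

The volume root reports one entry while the subdirectory reports none, which is why mounting mysql-data lets MySQL initialize a data directory it considers empty.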
Automate database initialization:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-initdb
  namespace: wordpress
data:
  init-wordpress.sql: |
    CREATE DATABASE IF NOT EXISTS wordpress;
EOF
Verify the ConfigMap:
kubectl get configmaps -n wordpress
Step 5: Deploy MySQL using a StatefulSet with a volumeClaimTemplate
This approach automatically creates a unique PVC for the MySQL pod (e.g., mysql-storage-mysql-0), ensuring consistency and eliminating the need for a separate manual PVC.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: wordpress
spec:
  serviceName: mysql
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "password123"
          volumeMounts:
            - name: mysql-storage
              mountPath: /var/lib/mysql
              subPath: mysql-data # Mount only this subdirectory to avoid 'lost+found'
            - name: initdb
              mountPath: /docker-entrypoint-initdb.d
      volumes:
        - name: initdb
          configMap:
            name: mysql-initdb
  volumeClaimTemplates:
    - metadata:
        name: mysql-storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
        storageClassName: managed-csi-premium
EOF
Verify the StatefulSet:
kubectl get statefulset mysql -n wordpress
Verify the created pods:
kubectl get pods -n wordpress -l app=mysql
Verify the PVCs created by the volumeClaimTemplate:
kubectl get pvc -n wordpress
Check the MySQL pod's logs to confirm successful initialization (including automatic database creation):
kubectl logs -n wordpress mysql-0
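The PVC created from the volumeClaimTemplate is named after the template, the StatefulSet, and the pod ordinal, in that order. As a small illustration, this loop prints the names that would be expected for a given replica count; the variable and function names exist only for this sketch.

```shell
template=mysql-storage   # volumeClaimTemplates[].metadata.name
statefulset=mysql        # StatefulSet metadata.name
replicas=1

# Print one expected PVC name per pod ordinal (0 .. replicas-1).
expected_pvc_names() {
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "${template}-${statefulset}-${i}"
    i=$((i + 1))
  done
}

pvc_names=$(expected_pvc_names)
echo "$pvc_names"
```

Scaling the StatefulSet up would add mysql-storage-mysql-1 and so on, each bound to its own disk.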
Lab 2: Deploying WordPress and connecting to MySQL
Step 1: Create a separate PVC for WordPress files
We use a separate PVC for WordPress files to keep application data distinct from database storage:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-pvc
  namespace: wordpress
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: azurefile-csi
EOF
Verify the PVC:
kubectl get pvc wordpress-pvc -n wordpress
Step 2: Create secrets for WordPress credentials
Store the required credentials securely:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: wordpress-secrets
  namespace: wordpress
type: Opaque
data:
  WORDPRESS_DB_USER: cm9vdA== # "root" in base64
  WORDPRESS_DB_PASSWORD: cGFzc3dvcmQxMjM= # "password123" in base64
EOF
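The data values in a Secret are base64-encoded, not encrypted. The following shows one way to produce and check the two values used above; printf is used instead of echo so that no trailing newline is encoded.

```shell
# Encode the values exactly as typed, without a trailing newline.
user_b64=$(printf '%s' 'root' | base64)
pass_b64=$(printf '%s' 'password123' | base64)
echo "WORDPRESS_DB_USER:     $user_b64"
echo "WORDPRESS_DB_PASSWORD: $pass_b64"

# Decode again to confirm the round trip.
user_plain=$(printf '%s' "$user_b64" | base64 --decode)
pass_plain=$(printf '%s' "$pass_b64" | base64 --decode)
echo "decoded: $user_plain / $pass_plain"
```

Since anyone who can read the Secret can decode it the same way, access to Secrets should be restricted in production clusters.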
Verify the secret:
kubectl get secret wordpress-secrets -n wordpress
Step 3: Deploy WordPress
Deploy WordPress using a Deployment that connects to MySQL:
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
  namespace: wordpress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
        - name: wordpress
          image: wordpress:latest
          env:
            - name: WORDPRESS_DB_HOST
              value: mysql
            - name: WORDPRESS_DB_USER
              valueFrom:
                secretKeyRef:
                  name: wordpress-secrets
                  key: WORDPRESS_DB_USER
            - name: WORDPRESS_DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: wordpress-secrets
                  key: WORDPRESS_DB_PASSWORD
          volumeMounts:
            - name: wp-storage
              mountPath: /var/www/html
      volumes:
        - name: wp-storage
          persistentVolumeClaim:
            claimName: wordpress-pvc
EOF
Verify the WordPress deployment:
kubectl get deployments wordpress -n wordpress
Verify the WordPress pods:
kubectl get pods -n wordpress
Step 4: Expose WordPress externally
Create a LoadBalancer service to expose WordPress externally:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: wordpress-service
  namespace: wordpress
spec:
  selector:
    app: wordpress
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer
EOF
Verify the service:
kubectl get svc wordpress-service -n wordpress
To extract just the external IP, run:
kubectl get svc wordpress-service -n wordpress \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}{"\n"}'
Open the external IP in your browser to complete the WordPress setup. With our automated initialization, the wordpress database is created automatically in MySQL, so no manual intervention is required.
4. Performance benchmarking for storage classes
To measure IOPS and throughput, deploy a temporary pod with fio:
kubectl run storage-test -n wordpress --rm -it --image=debian -- bash
Inside the pod, run:
apt-get update && apt-get install -y fio
fio --name=randwrite --ioengine=libaio --rw=randwrite --bs=4k --size=1G --numjobs=4 --runtime=60 --group_reporting
Example fio output explanation
Interpretation:
1. IOPS (Input/Output Operations Per Second)
- IOPS = 8913
On average, this setup handled about 8,900 random 4K write operations per second. This is a solid indicator of how many small, random writes the storage can handle in parallel.
2. Bandwidth (Throughput)
- BW = 34.8 MiB/s (≈36.5 MB/s)
Over the 60-second test, the system wrote roughly 34.8 MiB of data per second. This corresponds well to the IOPS figure when considering each write is 4K in size.
3. Latency
- slat (submission latency): ~446 µs on average.
This is the time it takes fio to submit the I/O request to the kernel or I/O subsystem.
- clat (completion latency): ~1.59 µs on average (but with some large outliers).
This is the time from when the request is submitted to when it’s completed by the storage.
- lat (total latency): ~448 µs on average.
The overall latency (submission + completion) is under half a millisecond on average, which is decent for a cloud-based block storage scenario. However, there are a few high spikes, as seen by the maximum lat of ~123 ms.
4. Percentiles
- The 99th percentile latencies are important for understanding worst-case performance:
- 99.00th = 9024 ns (~9 µs)
- 99.90th = 26496 ns (~26 µs)
- 99.95th = 64256 ns (~64 µs)
- 99.99th = 692224 ns (~692 µs)
- Most I/Os complete quickly, but we do see occasional outliers (in the hundreds of microseconds to over a millisecond). This can happen in bursty or cloud storage environments.
5. CPU Usage
- usr=0.48%, sys=1.43%
The CPU overhead is relatively low, indicating that the storage performance (rather than CPU resources) is the primary bottleneck.
6. IO Depth
- iodepth=1 for each job (4 jobs total).
This means fio is issuing only one I/O request at a time per job. The results might differ if you increase the iodepth to allow more in-flight requests.
7. Summary
- Overall IOPS: ~8,900
- Throughput: ~34.8 MiB/s
- Average Latency: ~448 µs, with occasional spikes
- CPU usage is low, suggesting storage is the limiting factor rather than compute.
These metrics suggest that the AKS storage (backed by Azure Managed Disks in this scenario) can handle ~8,900 random 4K writes per second at an average latency of under half a millisecond—a respectable performance for many stateful applications. These values provide a benchmark to compare against your workload's expected performance.
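The figures above can be cross-checked with simple arithmetic: bandwidth should be roughly IOPS times block size, and with four jobs each keeping one request in flight, IOPS should be roughly the number of in-flight requests divided by the average latency. A short sketch using the values quoted above:

```shell
iops=8913          # reported IOPS
block_bytes=4096   # --bs=4k
jobs=4             # --numjobs=4
iodepth=1          # one in-flight request per job
avg_lat_us=448     # reported average total latency in microseconds

# Bandwidth = IOPS x block size, in binary (MiB) and decimal (MB) units
mib_per_s=$(awk -v i="$iops" -v b="$block_bytes" 'BEGIN { printf "%.1f", i * b / 1048576 }')
mb_per_s=$(awk -v i="$iops" -v b="$block_bytes" 'BEGIN { printf "%.1f", i * b / 1000000 }')

# IOPS estimate = in-flight requests / average latency (in seconds)
est_iops=$(awk -v j="$jobs" -v d="$iodepth" -v l="$avg_lat_us" 'BEGIN { printf "%.0f", j * d * 1000000 / l }')

echo "bandwidth: ${mib_per_s} MiB/s (${mb_per_s} MB/s)"
echo "IOPS estimated from latency: ${est_iops}"
```

Both results line up with the reported 34.8 MiB/s and roughly 8,900 IOPS, so the quoted numbers are internally consistent.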
For instance, if you plan to run a high-transaction database, you may need higher IOPS and lower latencies; in that case, consider using an even higher-performance storage option (e.g., Azure Ultra Disk), increasing replication, or tuning application-level caching and concurrency. Conversely, if your workload is less I/O intensive, these results suggest that your current storage configuration is sufficient.
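One caveat about the test above: the storage-test pod mounts no PersistentVolumeClaim, and the fio command sets neither a target directory nor --direct=1, so it exercises the container's writable layer with buffered I/O rather than a volume from a particular storage class. A possible variation is sketched below. The names fio-pvc and fio-bench, the /data mount path, and the iodepth of 16 are all assumptions for illustration; the manifest is written to a file so it can be reviewed first.

```shell
# Hypothetical PVC and pod for benchmarking one storage class.
cat > fio-bench.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fio-pvc
  namespace: wordpress
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: managed-csi-premium   # class under test
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: fio-bench
  namespace: wordpress
spec:
  restartPolicy: Never
  containers:
    - name: fio
      image: debian
      command: ["sleep", "3600"]   # keep the pod alive for interactive use
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: fio-pvc
EOF

# With cluster access:
#   kubectl apply -f fio-bench.yaml
#   kubectl exec -it fio-bench -n wordpress -- bash
# Then, inside the pod:
#   apt-get update && apt-get install -y fio
#   fio --name=randwrite --directory=/data --direct=1 --ioengine=libaio \
#       --rw=randwrite --bs=4k --size=1G --numjobs=4 --iodepth=16 \
#       --runtime=60 --time_based --group_reporting
```

Changing storageClassName and repeating the run gives a like-for-like comparison between classes. Delete the pod and the PVC when finished so the disk is not left provisioned.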
5. Backup & Disaster Recovery
Reliable backup and disaster recovery are critical for production systems. Here, we cover two methods: using Velero and backing up MySQL with database dumps.
5.1 Installing Velero for cluster backups
Step 1: Install the Velero CLI
Download the Velero CLI from the Velero releases page. For Linux:
wget https://github.com/vmware-tanzu/velero/releases/download/v1.9.3/velero-v1.9.3-linux-amd64.tar.gz
tar -xzvf velero-v1.9.3-linux-amd64.tar.gz
sudo mv velero-v1.9.3-linux-amd64/velero /usr/local/bin/velero
velero version
Step 2: Retrieve your Subscription ID
Run:
az account show --query id --output tsv
This command returns your subscription ID.
Step 3: Create a service principal for Velero
Replace <your-subscription-id> with your subscription ID:
az ad sp create-for-rbac --name VeleroSP --role Contributor --scopes /subscriptions/<your-subscription-id>
Sample output:
{
"appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"displayName": "VeleroSP",
"password": "your-generated-password",
"tenant": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"
}
Verify the service principal:
az ad sp list --display-name VeleroSP
Step 4: Create the Velero credentials file
Create a file named credentials-velero with the following content (replace placeholders with actual values):
AZURE_SUBSCRIPTION_ID=<your-subscription-id>
AZURE_TENANT_ID=<your-tenant-id>
AZURE_CLIENT_ID=<your-appId>
AZURE_CLIENT_SECRET=<your-password>
Step 5: Install Velero
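Run the following command. For Azure, the --bucket value is the name of a blob container that must already exist in the storage account, and the plugin version should match your Velero version according to the compatibility matrix in the plugin's README: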
velero install \
--provider azure \
--plugins velero/velero-plugin-for-microsoft-azure:v1.9.1 \
--bucket my-backup-bucket \
--secret-file ./credentials-velero \
--use-volume-snapshots=true \
--backup-location-config resourceGroup=myResourceGroup,storageAccount=myStorageAccount
The output will show the creation of CRDs, namespace, service account, and deployment. When finished, it will state:
Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero' to view the status.
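Once Velero reports that it is installed, a backup of the lab namespace can be taken and later restored. The sketch below wraps the relevant CLI calls in two helper functions so that nothing runs until you call them; the function names and the backup name wordpress-backup are hypothetical, and the commands assume the installation above succeeded and the backup location is reachable.

```shell
# Back up every resource in the wordpress namespace and show the result.
backup_wordpress() {
  backup_name="$1"
  velero backup create "$backup_name" --include-namespaces wordpress --wait
  velero backup describe "$backup_name"
}

# Restore the namespace from a previously created backup.
restore_wordpress() {
  backup_name="$1"
  velero restore create --from-backup "$backup_name" --wait
}

# Example usage (requires a working Velero installation):
#   backup_wordpress wordpress-backup
#   restore_wordpress wordpress-backup
```

Running velero backup get afterwards lists existing backups and their status, which is a quick way to confirm that the backup completed before relying on it.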
5.2 Backing up MySQL using database dumps
Step 1: Get the MySQL pod name
Retrieve the MySQL pod name with:
kubectl get pods -n wordpress -l app=mysql -o jsonpath='{.items[0].metadata.name}'
Step 2: Create a MySQL dump
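Replace <mysql-pod-name> with the actual name (e.g., mysql-0). The command omits -it so that no TTY output or password prompt ends up in the dump file, and it reads the password from the MYSQL_ROOT_PASSWORD variable already set in the container:
kubectl exec <mysql-pod-name> -n wordpress -- sh -c 'exec mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" wordpress' > wordpress-backup.sql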
Step 3: Restore MySQL from the dump
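Copy the dump into the pod, then run the import inside the container so that the input redirection reads the pod's copy of the file rather than a path on your workstation:
kubectl cp wordpress-backup.sql <mysql-pod-name>:/tmp/wordpress-backup.sql -n wordpress
kubectl exec <mysql-pod-name> -n wordpress -- sh -c 'mysql -u root -p"$MYSQL_ROOT_PASSWORD" wordpress < /tmp/wordpress-backup.sql'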
6. Troubleshooting & Best Practices
- Stuck PVCs:
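If a PVC remains stuck in Terminating, first confirm that no pod still mounts it, since the pvc-protection finalizer blocks deletion while the claim is in use. As a last resort, clear the finalizers with a merge patch:
kubectl patch pvc mysql-pvc -n wordpress --type=merge -p '{"metadata":{"finalizers":[]}}'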
- MySQL initialization errors:
The error regarding a non-empty data directory is usually due to the default lost+found folder on ext4 file systems. The subPath fix ensures MySQL only sees an empty subdirectory.
- PVC pending state:
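When a PVC is created, it may show as Pending while the dynamic storage provisioner creates and binds an underlying PV, which typically takes only a few moments. With storage classes that use the WaitForFirstConsumer volume binding mode (commonly the case for the AKS managed disk classes), the PVC stays Pending until a pod that uses it is scheduled; kubectl describe pvc shows the reason in its events.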
- DNS resolution & network connectivity:
Verify that WordPress resolves the MySQL service correctly by running:
kubectl run debug --rm -it --image=busybox -n wordpress -- nslookup mysql
- Credentials & Environment Variables:
Ensure that the environment variables in your WordPress deployment match those required by MySQL.
7. Final Thoughts and Next Steps
This blog post focused on mastering storage in AKS by exploring various storage options and best practices. The deployment of WordPress and MySQL served as an exercise to demonstrate Azure Managed Disks (for MySQL) and Azure Files (for WordPress) in practice, while teaching key storage concepts such as PVC provisioning via a StatefulSet (using volumeClaimTemplates), dynamic PV binding, the subPath workaround for the lost+found directory, and automated database initialization through an init script. We also covered performance benchmarking using fio and provided a step-by-step guide to installing Velero for backup and disaster recovery.
References about AKS Storage:
- Storage options for applications in AKS
- AKS storage options
- Field tips for AKS storage provisioning
- Everything you want to know about ephemeral OS disks and AKS
Next Steps:
- Experiment with advanced scaling and monitoring for both MySQL and WordPress.
- Implement additional security measures for production deployments.
- Explore other AKS storage options (like Azure Files or Blob Storage) and benchmark their performance based on your specific workload requirements.
Happy deploying, and enjoy mastering storage in AKS!
Updated Apr 16, 2025
Version 2.0
rmmartins, Microsoft
Startups at Microsoft