Protecting Artifactory with Astra Control Service
Execution hooks for PostgreSQL database
Simulating disaster and recovering the application to another cluster in a different region
We describe how to protect a multi-tier application with multiple components (like JFrog Artifactory) on Azure Kubernetes Service against disasters like the complete loss of a region with NetApp® Astra™ Control Service and Azure NetApp Files. We demonstrate how the use of pre- and post-snapshot execution hooks in Astra Control Service enables us to create application-consistent snapshots and backups across all application tiers and recover the application to a different region in case of a disaster.
Co-authors: Patric Uebele, Sayan Saha
NetApp® Astra™ Control is a solution that makes it easy to manage, protect, and move data-rich Kubernetes workloads within and across public clouds and on-premises. Astra Control provides persistent container storage that leverages NetApp’s proven and expansive storage portfolio in the public cloud and on premises, supporting Azure managed disks as storage backend options as well.
Astra Control also offers a rich set of application-aware data management functionality (like snapshot and restore, backup and restore, activity logs, and active cloning) for local data protection, disaster recovery, data audit, and mobility use cases for your modern apps. Astra Control provides complete protection of stateful Kubernetes applications by saving both data and metadata (such as deployments, config maps, services, and secrets) that constitute an application in Kubernetes. Astra Control can be managed via its user interface, accessible from any web browser, or via its powerful REST API.
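As a quick illustration of the REST API, the following sketch lists the accounts visible to an API token in order to obtain the account ID used in subsequent calls (the token variable and the use of jq for formatting are assumptions; see the Astra Control API reference for the full set of endpoints):
~# export ASTRA_API_TOKEN=<API-token-created-in-the-ACS-UI>
~# curl -s https://astra.netapp.io/accounts \
     -H "Authorization: Bearer $ASTRA_API_TOKEN" \
     -H "Accept: application/json" | jq .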
Astra Control’s ability to run execution hooks before and/or after snapshots and backups, as well as before restores, enables application-consistent snapshots and backups by quiescing the applications before snapshot creation, and also allows customized restores. The Verda open-source project hosts a variety of execution hooks for Astra Control.
Astra Control comes in two variants: Astra Control Service (ACS), the fully managed service operated by NetApp, and Astra Control Center, the self-managed software variant for on-premises and private cloud environments. In this blog, we use Astra Control Service.
To showcase Astra Control’s backup and recovery capabilities in AKS, we use JFrog Artifactory, a universal binary and artifact manager that is used in the continuous integration (CI) / continuous delivery (CD) workflow in the DevOps process. JFrog Artifactory expedites application delivery and enables faster software releases.
JFrog Artifactory has two components in its setup. One component is a database (PostgreSQL in our example) that stores the metadata about artifacts, builds, and binary packages; the other is a repository that stores the files as checksums. Both components store their data in persistent volumes backed by Azure NetApp Files.
In the following, we demonstrate how Astra Control can protect an Artifactory installation by taking application-consistent snapshots and backups, and we test the protection scheme in a disaster recovery simulation across two AKS clusters in separate regions.
We deploy Artifactory on the AKS cluster pu-aks-test-1 in the Azure region westeurope. The cluster is already managed by our ACS account, with Azure NetApp Files in the Standard service level (storage class netapp-anf-perf-standard) chosen as the default storage class. ACS also automatically installed Astra Trident as the storage provisioner for persistent volumes backed by Azure NetApp Files:
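From the command line, we can quickly verify that the Trident pods are running and that the Azure NetApp Files storage classes are present (pod names and the exact list of storage classes differ per cluster):
~# kubectl get pods -n trident
~# kubectl get storageclass
~# kubectl get storageclass netapp-anf-perf-standard -o jsonpath='{.provisioner}'   # prints csi.trident.netapp.io for Trident-provisioned classes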
To deploy the Artifactory application, we follow the instructions from JFrog, using the appropriate repository and helm chart:
~# helm repo add jfrog https://charts.jfrog.io
~# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "netapp-trident" chart repository
...Successfully got an update from the "jfrog" chart repository
...Successfully got an update from the "bitnami" chart repository
...Successfully got an update from the "azure-marketplace" chart repository
Update Complete. ⎈Happy Helming!⎈
~# helm upgrade --install artifactory --namespace artifactory jfrog/artifactory --create-namespace
Release "artifactory" does not exist. Installing it now.
NAME: artifactory
LAST DEPLOYED: Tue Dec 6 10:21:38 2022
NAMESPACE: artifactory
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Congratulations. You have just deployed JFrog Artifactory!
1. Get the Artifactory URL by running these commands:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of the service by running 'kubectl get svc --namespace artifactory -w artifactory-artifactory-nginx'
export SERVICE_IP=$(kubectl get svc --namespace artifactory artifactory-artifactory-nginx -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$SERVICE_IP/
2. Open Artifactory in your browser
Default credential for Artifactory:
user: admin
password: password
After a few minutes, all pods and services are up:
~# kubectl get all,pvc -n artifactory
NAME READY STATUS RESTARTS AGE
pod/artifactory-0 1/1 Running 0 11m
pod/artifactory-artifactory-nginx-5cb99466fd-wvh8j 1/1 Running 1 (3m13s ago) 11m
pod/artifactory-postgresql-0 1/1 Running 0 11m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/artifactory ClusterIP 10.0.245.232 <none> 8082/TCP,8081/TCP 11m
service/artifactory-artifactory-nginx LoadBalancer 10.0.46.129 20.103.196.178 80:30656/TCP,443:30734/TCP 11m
service/artifactory-postgresql ClusterIP 10.0.226.193 <none> 5432/TCP 11m
service/artifactory-postgresql-headless ClusterIP None <none> 5432/TCP 11m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/artifactory-artifactory-nginx 1/1 1 1 11m
NAME DESIRED CURRENT READY AGE
replicaset.apps/artifactory-artifactory-nginx-5cb99466fd 1 1 1 11m
NAME READY AGE
statefulset.apps/artifactory 1/1 11m
statefulset.apps/artifactory-postgresql 1/1 11m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/artifactory-volume-artifactory-0 Bound pvc-48307448-6f2f-4a3e-ace1-b7b1e9cfeed5 100Gi RWO netapp-anf-perf-standard 11m
persistentvolumeclaim/data-artifactory-postgresql-0 Bound pvc-d153c0d3-cf57-4207-8c7a-63f482c1dd89 200Gi RWO netapp-anf-perf-standard 11m
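To double-check that these volumes were provisioned by Astra Trident on Azure NetApp Files, we can look at the CSI driver of one of the persistent volumes (PV name taken from the output above):
~# kubectl get pv pvc-48307448-6f2f-4a3e-ace1-b7b1e9cfeed5 -o jsonpath='{.spec.csi.driver}'   # prints csi.trident.netapp.io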
and we can set the SERVICE_IP:
~# export SERVICE_IP=$(kubectl get svc --namespace artifactory artifactory-artifactory-nginx -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
~# echo http://$SERVICE_IP/
http://20.103.196.178/
Then we create a DNS A record for the FQDN arti1.astrarocks.pu-store.de in our test domain astrarocks.pu-store.de, pointing to the SERVICE_IP, for easier access to the Artifactory service:
~# nslookup arti1.astrarocks.pu-store.de
Server: 192.168.178.73
Address: 192.168.178.73#53
Non-authoritative answer:
Name: arti1.astrarocks.pu-store.de
Address: 20.103.196.178
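If the zone astrarocks.pu-store.de were hosted in Azure DNS, the A record could also be created with the Azure CLI; a sketch with a hypothetical resource group name:
~# az network dns record-set a add-record \
     --resource-group rg-dns-example \
     --zone-name astrarocks.pu-store.de \
     --record-set-name arti1 \
     --ipv4-address $SERVICE_IP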
Now we can connect to the Artifactory instance with the initial admin credentials created during the installation, using the FQDN arti1.astrarocks.pu-store.de and start its configuration:
After entering the Artifactory (trial) license key, we first change the admin password and then create a second user:
In the next step, let’s add a Docker and a Helm repository:
After this initial configuration of Artifactory, we can start to manage and protect it with ACS. Switching to the ACS UI and checking for discovered namespaces in Applications -> Namespaces, we see that ACS already discovered the artifactory namespace in which we deployed the Artifactory instance. We define the complete namespace as application artifactory directly from the Actions menu:
Navigating to the application details, we see that the artifactory application is not protected yet:
To ensure that snapshots and backups are created in an application-consistent way, we utilize pre- and post-snapshot hooks for the PostgreSQL database. The Verda open-source project hosts a variety of execution hooks for Astra Control, including hooks for PostgreSQL, quiescing the database before taking any snapshots and backups. For use with Astra Control, we can upload the needed hook scripts from our workstation into the Astra Control account in Accounts -> Scripts:
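The hooks follow a simple pre/post pattern: quiesce the database before the snapshot, release it afterwards. As a rough illustration only (not the Verda script itself; the "pre"/"post" argument convention, the psql credentials, and the use of backup mode are assumptions for this sketch), such a hook could look like this:
#!/bin/bash
#
# Simplified sketch of a pre-/post-snapshot execution hook for PostgreSQL.
# For real deployments, use the PostgreSQL hook from the Verda project.
# Assumptions: the hook is called with "pre" or "post" as the first argument,
# and psql can authenticate (for example via PGPASSWORD) inside the container.

ACTION=$1

case "$ACTION" in
  pre)
    # Flush dirty buffers and enter backup mode (pg_start_backup applies to PostgreSQL < 15)
    psql -U postgres -c "CHECKPOINT;"
    psql -U postgres -c "SELECT pg_start_backup('astra-snapshot', true);"
    ;;
  post)
    # Leave backup mode again after the snapshot has been taken
    psql -U postgres -c "SELECT pg_stop_backup();"
    ;;
  *)
    echo "Usage: $0 {pre|post}" >&2
    exit 1
    ;;
esac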
With the hook script for PostgreSQL uploaded to our ACS account, we can now add it to the artifactory application in the app’s details -> Execution hooks:
We start with the pre-snapshot hook:
The containers in which the hooks will be run are selected based on container image names – regular expressions can be used. To find the container image names used for PostgreSQL, use kubectl describe:
~# kubectl describe pod/artifactory-postgresql-0 -n artifactory | grep Image
Image: releases-docker.jfrog.io/bitnami/postgresql:13.4.0-debian-10-r39
Image ID: releases-docker.jfrog.io/bitnami/postgresql@sha256:abfb7efd31afc36a8b16aa077bb9dd165c4f635412affef37...
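To verify up front which containers a filter like postgresql would match, we can list the pod and container image names and apply the same regular expression with grep:
~# kubectl get pods -n artifactory -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.containers[*].image}{"\n"}{end}' | grep -E 'postgresql'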
And then add the post-snapshot hook for PostgreSQL:
Finally, we check the container image matches and confirm that the hooks will be executed in the PostgreSQL containers:
Check the ACS documentation, the Verda documentation, and this blog post to learn more about execution hooks in Astra Control.
With execution hooks for PostgreSQL configured, we can now begin to protect the artifactory application with snapshots and backups. Let’s first take an on-demand snapshot from the Data protection tab in the app’s details to test the proper execution of the hooks:
We accept the default snapshot name and start the snapshot creation:
As Azure NetApp Files' snapshots are based on the proven ONTAP snapshot technology, the snapshot creation is fast and efficient. In Astra Control’s Activity view, we can follow the steps of the snapshot creation and confirm that the pre- and post-snapshot hooks were executed correctly:
To protect the application on a regular basis, we now configure a protection policy in the application’s Data protection tab, where we can also see the just created on-demand snapshot listed:
We configure a protection schedule with an hourly snapshot on the 30th minute, keeping the last four snapshots, and a daily backup at 12:00 UTC, keeping one backup:
Once the first scheduled backup and snapshot are created, the application’s protection status in Astra Control changes to Fully protected, as it’s now protected with snapshots and backups regularly:
In the next step, we want to test the recovery of the Artifactory platform after a simulated disaster. To simulate the complete loss of the cluster pu-aks-test-1 hosting the artifactory application, we delete the cluster and its resources using a little script:
~# ./AKS_compute.sh delete pu-aks-test-1 westeurope rg-patricu-westeu
./AKS_compute.sh: Checking Azure login
./AKS_compute.sh: Getting AKS credentials
Merged "pu-aks-test-1" as current context in /root/.kube/config
NAME STATUS ROLES AGE VERSION
aks-nodepool1-33509899-vmss000000 Ready agent 23h v1.23.12
aks-nodepool1-33509899-vmss000001 Ready agent 23h v1.23.12
aks-nodepool1-33509899-vmss000002 Ready agent 23h v1.23.12
./AKS_compute.sh: Getting node resource group ...MC_rg-patricu-westeu_pu-aks-test-1_westeurope
./AKS_compute.sh: Getting vnet name, please be patient
./AKS_compute.sh: vnet = aks-vnet-18022328
./AKS_compute.sh: Deleting non-system namespaces
./AKS_compute.sh: Deleting namespace artifactory
namespace "artifactory" deleted
./AKS_compute.sh: Wait until all PVs are deleted, please be patient
./AKS_compute.sh: Waiting for 2 PVs to be deleted for 1 min....
No resources found
./AKS_compute.sh: Waiting for 0 PVs to be deleted for 2 min....
Setting ANF subnet name ANF_SUBNET=anf-sso-subnet-pu-pu-aks-test-1
./AKS_compute.sh: Deleting subnet anf-sso-subnet-pu-pu-aks-test-1 from ANF
./AKS_compute.sh: Deleting AKS cluster pu-aks-test-1 in resource group rg-patricu-westeu
./AKS_compute.sh: Cleaning up
./AKS_compute.sh: Deleting context pu-aks-test-1
warning: this removed your active context, use "kubectl config use-context" to select a different one
deleted context pu-aks-test-1 from /root/.kube/config
./AKS_compute.sh: Deleting cluster pu-aks-test-1
deleted cluster pu-aks-test-1 from /root/.kube/config
After a short while, ACS detects that both the application and the cluster are no longer reachable:
And ACS puts both the cluster and the application in state Unavailable:
Because the snapshots are stored locally on the cluster and hence are no longer available after deleting all the cluster resources, the application’s protection status is now Partially protected:
The backups, however, are stored in object storage, and we can add buckets with a very high level of redundancy to Astra Control (see the ACS documentation and this blog post for instructions on adding additional buckets to Astra Control for storing your backups). The backups therefore remain available even after the loss of a region, and we can recover the application in such a scenario from an existing backup, as we’ll show further down.
To recover the application from our simulated loss of a complete Azure region, we bring up a new AKS cluster pu-aks-test-2 in the Azure region northeurope and add it to ACS. Since we’re working with the same Azure subscription in our example, we can use the existing Service Principal in Clusters -> Add to discover and manage the newly deployed AKS cluster:
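Creating the new cluster itself can be done with the Azure CLI, for example; a minimal sketch (the resource group name, node count, and VM size shown here are assumptions, and the virtual network setup needed for Azure NetApp Files is omitted):
~# az aks create \
     --resource-group rg-patricu-northeu \
     --name pu-aks-test-2 \
     --location northeurope \
     --node-count 3 \
     --node-vm-size Standard_DS3_v2 \
     --generate-ssh-keys
~# az aks get-credentials --resource-group rg-patricu-northeu --name pu-aks-test-2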
We selected the same default storage class (netapp-anf-perf-standard) as in the original cluster:
Astra Control now manages the new cluster in the northeurope region:
Now we can initiate the restore of the artifactory application from backup to pu-aks-test-2 in region northeurope from the scheduled backup. From the Data protection tab in the application’s details, we can initiate the restore directly from the Actions menu next to the backup:
To restore to a different cluster, we select Restore to new namespace, choose the destination cluster pu-aks-test-2 from the dropdown menu, and enter the namespace name for the restore; we simply reuse the namespace name artifactory:
Next, we confirm the restore source:
After reviewing the restore information, we can start the restore process:
With kubectl, we can follow the creation of the artifactory namespace on the destination cluster:
~# kubectl config use-context pu-aks-test-2
Switched to context "pu-aks-test-2".
~# kubectl get ns
NAME STATUS AGE
artifactory Active 2m15s
default Active 3h44m
kube-node-lease Active 3h44m
kube-public Active 3h44m
kube-system Active 3h44m
trident Active 5m48s
and can see Astra Control’s restore processes:
~# kubectl get all,pvc -n artifactory
NAME READY STATUS RESTARTS AGE
pod/r-artifactory-volume-artifactory-0-fgswh 0/1 Pending 0 2m35s
pod/r-data-artifactory-postgresql-0-rk5c5 0/1 Pending 0 2m34s
NAME COMPLETIONS DURATION AGE
job.batch/r-artifactory-volume-artifactory-0 0/1 2m35s 2m35s
job.batch/r-data-artifactory-postgresql-0 0/1 2m34s 2m34s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/artifactory-volume-artifactory-0 Pending netapp-anf-perf-standard 2m36s
persistentvolumeclaim/data-artifactory-postgresql-0 Pending netapp-anf-perf-standard 2m36s
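To follow the data movement without polling manually, we can wait for the two restore jobs to complete (job names taken from the output above; the timeout is an arbitrary choice):
~# kubectl wait --for=condition=complete job/r-artifactory-volume-artifactory-0 job/r-data-artifactory-postgresql-0 -n artifactory --timeout=30m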
Once the data transfer from the backup finishes, Astra Control recreates the rest of the application resources, and the pods and services come up:
~# kubectl get all,pvc -n artifactory
NAME READY STATUS RESTARTS AGE
pod/artifactory-0 1/1 Running 0 3m40s
pod/artifactory-artifactory-nginx-5cb99466fd-h6zsl 1/1 Running 0 3m38s
pod/artifactory-postgresql-0 1/1 Running 0 3m40s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/artifactory ClusterIP 10.0.3.17 <none> 8082/TCP,8081/TCP 3m37s
service/artifactory-artifactory-nginx LoadBalancer 10.0.98.142 20.166.200.14 80:30947/TCP,443:32154/TCP 3m35s
service/artifactory-postgresql ClusterIP 10.0.27.131 <none> 5432/TCP 3m38s
service/artifactory-postgresql-headless ClusterIP None <none> 5432/TCP 3m38s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/artifactory-artifactory-nginx 1/1 1 1 3m38s
NAME DESIRED CURRENT READY AGE
replicaset.apps/artifactory-artifactory-nginx-5cb99466fd 1 1 1 3m38s
NAME READY AGE
statefulset.apps/artifactory 1/1 3m41s
statefulset.apps/artifactory-postgresql 1/1 3m40s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/artifactory-volume-artifactory-0 Bound pvc-d01ef1a7-dd30-4d46-85ec-49ec3eeb49b7 100Gi RWO netapp-anf-perf-standard 9m25s
persistentvolumeclaim/data-artifactory-postgresql-0 Bound pvc-fdf18583-fc43-4daf-963b-63a8153319fa 200Gi RWO netapp-anf-perf-standard 9m25s
As the LoadBalancer service will come up with a new external IP, we reconfigure the FQDN arti1.astrarocks.pu-store.de with the new public IP address in our domain service:
~ % nslookup arti1.astrarocks.pu-store.de
Server: 192.168.178.73
Address: 192.168.178.73#53
Non-authoritative answer:
Name: arti1.astrarocks.pu-store.de
Address: 20.166.200.14
Now we can log in again to the restored Artifactory service, running on the AKS cluster pu-aks-test-2 in the region northeurope:
Logging in with the user created during the Artifactory configuration works, and the Docker and Helm repositories are available:
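Besides the UI login, a quick call to Artifactory’s system ping REST endpoint confirms that the restored instance is healthy; it should simply return OK:
~# curl -s http://arti1.astrarocks.pu-store.de/artifactory/api/system/ping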
In this article, we described how to make JFrog Artifactory running on AKS with persistent storage on Azure NetApp Files resilient to disasters, enabling business continuity for the platform. NetApp® Astra™ Control makes it easy to protect business-critical AKS workloads (stateful and stateless) with just a few clicks. Get started with Astra Control Service today with a free plan.