Disaster protection for JFrog Artifactory in AKS with Astra Control Service and Azure NetApp Files
Published Feb 06 2023 11:55 AM

Table of Contents

Abstract

Introduction

Scenario

Installing Artifactory

Protecting Artifactory with Astra Control Service

Execution hooks for PostgreSQL database

Protecting the application

Simulating a disaster and recovering the application to another cluster in a different region

Summary

Additional information 

 

Abstract

We describe how to protect a multi-tier application with multiple components (like JFrog Artifactory) on Azure Kubernetes Service against disasters like the complete loss of a region with NetApp® Astra™ Control Service and Azure NetApp Files. We demonstrate how the use of pre- and post-snapshot execution hooks in Astra Control Service enables us to create application-consistent snapshots and backups across all application tiers and recover the application to a different region in case of a disaster.

 

Co-authors: Patric Uebele, Sayan Saha

 

Introduction

NetApp® Astra™ Control is a solution that makes it easy to manage, protect, and move data-rich Kubernetes workloads within and across public clouds and on-premises. Astra Control provides persistent container storage that leverages NetApp’s proven and expansive storage portfolio in the public cloud and on premises, supporting Azure managed disks as storage backend options as well.

 

Astra Control also offers a rich set of application-aware data management functionality (like snapshot and restore, backup and restore, activity logs, and active cloning) for local data protection, disaster recovery, data audit, and mobility use cases for your modern apps. Astra Control provides complete protection of stateful Kubernetes applications by saving both the data and the metadata (such as deployments, config maps, services, and secrets) that constitute an application in Kubernetes. Astra Control can be managed via its user interface, accessed by any web browser, or via its powerful REST API.

 

Astra Control can run execution hooks before and/or after snapshots and backups, and after restores. This enables application-consistent snapshots and backups by quiescing the application before the snapshot is created, as well as customized restores. The Verda open-source project hosts a variety of execution hooks for Astra Control.

 

Astra Control comes in two variants:

 

  1. Astra Control Service (ACS) – A fully managed application-aware data management service that supports Azure Kubernetes Service (AKS), Azure Disk Storage, and Azure NetApp Files (ANF).
  2. Astra Control Center (ACC) – application-aware data management for on-premises Kubernetes clusters, delivered as a customer-managed Kubernetes application from NetApp.

 

To showcase Astra Control’s backup and recovery capabilities in AKS, we use JFrog Artifactory, a universal binary and artifact manager that is used in the continuous integration (CI) / continuous delivery (CD) workflow in the DevOps process. JFrog Artifactory expedites application delivery and enables faster software releases.

JFrog Artifactory has two components in its setup. One component is a database (PostgreSQL in our example) that stores the metadata information about artifacts, builds, and binary packages, and the other component is a repository that stores the files by their checksums. Both components store their data in persistent volumes backed by Azure NetApp Files.

 

Scenario

In the following, we demonstrate how Astra Control can protect an Artifactory installation by taking application-consistent snapshots and backups, and we test the protection scheme in a disaster recovery simulation across two AKS clusters in separate regions.

 

Installing Artifactory

We deploy Artifactory on the AKS cluster pu-aks-test-1 in the Azure region westeurope. The cluster is already managed by our ACS account, with Azure NetApp Files at the Standard service level (storage class netapp-anf-perf-standard) chosen as the default storage class. ACS also automatically installed Astra Trident as the storage provisioner for persistent volumes backed by Azure NetApp Files:

 

GeertVanTeylingen_0-1671567778269.png

GeertVanTeylingen_0-1671568485818.png
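
The default storage class can also be checked from the command line. The following is a minimal sketch that is not part of the original walkthrough: the helper name default_storage_class is made up for illustration, and it assumes the usual kubectl table layout, in which the default class is marked with (default) after its name.

```shell
# Hypothetical helper (not from the original article): print the name of the
# default storage class from `kubectl get storageclass` output read on stdin.
# Assumption: kubectl lists the default class as "<name> (default)" in the
# NAME column, so "(default)" shows up as the second whitespace-separated field.
default_storage_class() {
  awk '$2 == "(default)" { print $1 }'
}

# Usage against a live cluster:
#   kubectl get storageclass | default_storage_class
```

Reading from stdin keeps the helper independent of the cluster, so it can be tried on saved kubectl output as well.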

To deploy the Artifactory application, we follow the instructions from JFrog, using the appropriate repository and helm chart:

~# helm repo add jfrog https://charts.jfrog.io
~# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "netapp-trident" chart repository
...Successfully got an update from the "jfrog" chart repository
...Successfully got an update from the "bitnami" chart repository
...Successfully got an update from the "azure-marketplace" chart repository
Update Complete. ⎈Happy Helming!⎈


~# helm upgrade --install artifactory --namespace artifactory jfrog/artifactory --create-namespace
Release "artifactory" does not exist. Installing it now.
NAME: artifactory
LAST DEPLOYED: Tue Dec  6 10:21:38 2022
NAMESPACE: artifactory
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Congratulations. You have just deployed JFrog Artifactory!

1. Get the Artifactory URL by running these commands:
   NOTE: It may take a few minutes for the LoadBalancer IP to be available.
         You can watch the status of the service by running 'kubectl get svc --namespace artifactory -w artifactory-artifactory-nginx'
   export SERVICE_IP=$(kubectl get svc --namespace artifactory artifactory-artifactory-nginx -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
   echo http://$SERVICE_IP/

2. Open Artifactory in your browser
   Default credential for Artifactory:
   user: admin
   password: password

 

After some minutes, all pods and services are up:

~# kubectl get all,pvc -n artifactory
NAME                                                 READY   STATUS    RESTARTS        AGE
pod/artifactory-0                                    1/1     Running   0               11m
pod/artifactory-artifactory-nginx-5cb99466fd-wvh8j   1/1     Running   1 (3m13s ago)   11m
pod/artifactory-postgresql-0                         1/1     Running   0               11m

NAME                                      TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)                      AGE
service/artifactory                       ClusterIP      10.0.245.232   <none>           8082/TCP,8081/TCP            11m
service/artifactory-artifactory-nginx     LoadBalancer   10.0.46.129    20.103.196.178   80:30656/TCP,443:30734/TCP   11m
service/artifactory-postgresql            ClusterIP      10.0.226.193   <none>           5432/TCP                     11m
service/artifactory-postgresql-headless   ClusterIP      None           <none>           5432/TCP                     11m

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/artifactory-artifactory-nginx   1/1     1            1           11m

NAME                                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/artifactory-artifactory-nginx-5cb99466fd   1         1         1       11m

NAME                                      READY   AGE
statefulset.apps/artifactory              1/1     11m
statefulset.apps/artifactory-postgresql   1/1     11m

NAME                                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
persistentvolumeclaim/artifactory-volume-artifactory-0   Bound    pvc-48307448-6f2f-4a3e-ace1-b7b1e9cfeed5   100Gi      RWO            netapp-anf-perf-standard   11m
persistentvolumeclaim/data-artifactory-postgresql-0      Bound    pvc-d153c0d3-cf57-4207-8c7a-63f482c1dd89   200Gi      RWO            netapp-anf-perf-standard   11m
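
Instead of polling kubectl get by hand, the rollout can be awaited in a script. This is a sketch under assumptions: the function name wait_for_artifactory and the 10-minute timeout are arbitrary choices, and the resource names are the ones shown in the output above.

```shell
# Hypothetical helper: block until the two StatefulSets and the nginx
# Deployment of the release have finished rolling out, or fail on the first
# resource that does not become ready within the timeout.
wait_for_artifactory() {
  ns=${1:-artifactory}
  for res in statefulset/artifactory-postgresql \
             statefulset/artifactory \
             deployment/artifactory-artifactory-nginx; do
    kubectl rollout status "$res" --namespace "$ns" --timeout=10m || return 1
  done
}

# Usage:
#   wait_for_artifactory artifactory && echo "Artifactory is rolled out"
```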

 

With all pods and services up, we can set SERVICE_IP:

~# export SERVICE_IP=$(kubectl get svc --namespace artifactory artifactory-artifactory-nginx -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
~# echo http://$SERVICE_IP/
http://20.103.196.178/

 

Then we create a DNS record that points the FQDN arti1.astrarocks.pu-store.de in our test domain astrarocks.pu-store.de to the SERVICE_IP, for easier access to the Artifactory service:

~# nslookup arti1.astrarocks.pu-store.de
Server:           192.168.178.73
Address:    192.168.178.73#53
 
Non-authoritative answer:
Name: arti1.astrarocks.pu-store.de
Address: 20.103.196.178
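
To compare the DNS answer with SERVICE_IP in a script, the address can be extracted from the nslookup output. The helper below is a sketch, not part of the original setup: the name resolved_ip is hypothetical, and it assumes the output layout shown above, which other nslookup implementations may not share.

```shell
# Hypothetical helper: print the first answer address from nslookup output
# read on stdin. The resolver's own "Address:" line comes before the "Name:"
# line, so only "Address" lines after "Name:" are considered.
resolved_ip() {
  awk '/^Name:/ { seen = 1; next } seen && /^Address/ { print $2; exit }'
}

# Usage:
#   [ "$(nslookup arti1.astrarocks.pu-store.de | resolved_ip)" = "$SERVICE_IP" ] \
#     && echo "DNS record points to the LoadBalancer IP"
```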

                                                           

Now we can connect to the Artifactory instance with the initial admin credentials created during the installation, using the FQDN arti1.astrarocks.pu-store.de and start its configuration:

GeertVanTeylingen_1-1671568508505.png

After entering the Artifactory (trial) license key, we first change the admin password and then create a second user:

GeertVanTeylingen_2-1671568543218.png

GeertVanTeylingen_3-1671568576724.png

In the next step, let’s add a Docker and a Helm repository:

GeertVanTeylingen_4-1671568617017.png

GeertVanTeylingen_5-1671568659373.png

GeertVanTeylingen_6-1671568675894.png

Protecting Artifactory with Astra Control Service

After this initial configuration of Artifactory, we can start to manage and protect it with ACS. Switching to the ACS UI and checking for discovered namespaces in Applications -> Namespaces, we see that ACS already discovered the artifactory namespace in which we deployed the Artifactory instance. We define the complete namespace as application artifactory directly from the Actions menu:

GeertVanTeylingen_7-1671568701287.png

GeertVanTeylingen_8-1671568730915.png

Navigating to the application details, we see that the artifactory application is not protected yet:

 

GeertVanTeylingen_9-1671568751208.png

 

Execution hooks for PostgreSQL database

To ensure that snapshots and backups are created in an application-consistent way, we utilize pre- and post-snapshot hooks for the PostgreSQL database. The Verda open-source project hosts a variety of execution hooks for Astra Control, including hooks for PostgreSQL, quiescing the database before taking any snapshots and backups. For use with Astra Control, we can upload the needed hook scripts from our workstation into the Astra Control account in Accounts -> Scripts:

GeertVanTeylingen_10-1671568797899.png

 

GeertVanTeylingen_11-1671568815588.png
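
To illustrate the general pattern behind a pre- and post-snapshot hook, here is a hypothetical skeleton. It is not the Verda script. The function name hook_sql, the pre/post argument convention, the backup label, and the choice of PostgreSQL's backup mode as the quiescing mechanism are all assumptions made for this sketch. It also assumes PostgreSQL 13 or 14, where pg_start_backup() and pg_stop_backup() exist; PostgreSQL 15 renamed these functions and removed exclusive mode.

```shell
#!/bin/sh
# Hypothetical hook skeleton, NOT the Verda PostgreSQL hook.
# Assumption: the hook is invoked with "pre" before and "post" after a snapshot.
# Assumption: PostgreSQL 14 or earlier (pg_start_backup/pg_stop_backup).

# Print the SQL statement for the given hook stage.
hook_sql() {
  case "$1" in
    # Force a checkpoint and put the database into backup mode.
    pre)  echo "SELECT pg_start_backup('astra-snapshot', true);" ;;
    # Leave backup mode once the snapshot has been taken.
    post) echo "SELECT pg_stop_backup();" ;;
    *)    echo "usage: hook_sql pre|post" >&2; return 1 ;;
  esac
}

# Usage inside the database container (the user name is an assumption):
#   hook_sql pre | psql -U postgres
```

Separating the statement selection from the psql call keeps the stage logic easy to check without a running database.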

 

With the hook script for PostgreSQL uploaded to our ACS account, we can now add it to the artifactory application in the app’s details -> Execution hooks:

 

GeertVanTeylingen_12-1671568865080.png

 

We start with the pre-snapshot hook:

 

GeertVanTeylingen_13-1671568896847.png

 

The containers in which the hooks will be run are selected based on container image names – regular expressions can be used. To find the container image names used for PostgreSQL, use kubectl describe:

~# kubectl describe pod/artifactory-postgresql-0 -n artifactory | grep Image
    Image:          releases-docker.jfrog.io/bitnami/postgresql:13.4.0-debian-10-r39
Image ID:       releases-docker.jfrog.io/bitnami/postgresql@sha256:abfb7efd31afc36a8b16aa077bb9dd165c4f635412affef37...
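
Before entering a regular expression in the UI, it can be tried locally against the image name. This is only an approximation: the helper image_matches is hypothetical, and grep -E may differ in details from the regular expression dialect Astra Control uses.

```shell
# Hypothetical helper: succeed if the extended regular expression $1 matches
# anywhere in the container image name $2.
image_matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Usage, listing the pod's images with a JSONPath query:
#   for img in $(kubectl get pod artifactory-postgresql-0 -n artifactory \
#       -o jsonpath='{.spec.containers[*].image}'); do
#     image_matches 'bitnami/postgresql' "$img" && echo "regex matches: $img"
#   done
```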

And then add the post-snapshot hook for PostgreSQL:

 

GeertVanTeylingen_14-1671568979355.png

 

Finally, we check the container image matches and confirm that the hooks will be executed in the PostgreSQL containers:

 

GeertVanTeylingen_15-1671569006543.png

 

Check the ACS documentation, the Verda documentation, and this blog post to learn more about execution hooks in Astra Control.

 

Protecting the application

With execution hooks for PostgreSQL configured, we can now begin to protect the artifactory application with snapshots and backups. Let’s first take an on-demand snapshot from the Data protection tab in the app’s details to test the proper execution of the hooks:

 

GeertVanTeylingen_16-1671569045502.png

 

We accept the default snapshot name and start the snapshot creation:

 

GeertVanTeylingen_17-1671569076557.png

 

As Azure NetApp Files' snapshots are based on the proven ONTAP snapshot technology, the snapshot creation is fast and efficient. In Astra Control’s Activity view, we can follow the steps of the snapshot creation and confirm that the pre- and post-snapshot hooks were executed correctly:

 

GeertVanTeylingen_18-1671569127231.png

 

To protect the application on a regular basis, we now configure a protection policy in the application’s Data protection tab, where we can also see the on-demand snapshot we just created:

 

GeertVanTeylingen_19-1671569201290.png

 

We configure a protection schedule with an hourly snapshot on the 30th minute, keeping the last four snapshots, and a daily backup at 12:00 UTC, keeping one backup:

 

GeertVanTeylingen_20-1671569230836.png
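
The backup schedule bounds how much data can be lost when only backups survive: with one backup per day, the latest backup is up to 24 hours old. As a small worked example, the sketch below computes the age of the latest backup at a given time. The function name is hypothetical, and it assumes the backup completes at exactly 12:00 UTC, which ignores the time the backup itself takes.

```shell
# Hypothetical helper: minutes elapsed since the most recent daily 12:00 UTC
# backup, given the current UTC hour and minute as plain integers
# (no leading zeros, for example "9 5" rather than "09 05").
minutes_since_last_backup() {
  now=$(( $1 * 60 + $2 ))      # minutes since midnight UTC
  backup=$(( 12 * 60 ))        # scheduled backup time in minutes
  echo $(( (now - backup + 1440) % 1440 ))
}

# Usage:
#   minutes_since_last_backup 13 30
```

At 13:30 UTC the function prints 90, and at 11:59 UTC it prints 1439, the worst case just before the next backup.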

 

Once the first scheduled backup and snapshot are created, the application’s protection status in Astra Control changes to Fully protected, as it’s now protected with snapshots and backups regularly:

 

GeertVanTeylingen_21-1671569260169.png

 

Simulating a disaster and recovering the application to another cluster in a different region

In the next step, we want to test the recovery of the Artifactory platform after a simulated disaster. To simulate the complete loss of the cluster pu-aks-test-1 hosting the artifactory application, we delete the cluster and its resources using a small script:

~# ./AKS_compute.sh delete pu-aks-test-1 westeurope rg-patricu-westeu
./AKS_compute.sh: Checking Azure login
./AKS_compute.sh: Getting AKS credentials
Merged "pu-aks-test-1" as current context in /root/.kube/config
NAME                                STATUS   ROLES   AGE   VERSION
aks-nodepool1-33509899-vmss000000   Ready    agent   23h   v1.23.12
aks-nodepool1-33509899-vmss000001   Ready    agent   23h   v1.23.12
aks-nodepool1-33509899-vmss000002   Ready    agent   23h   v1.23.12
./AKS_compute.sh: Getting node resource group ...MC_rg-patricu-westeu_pu-aks-test-1_westeurope
./AKS_compute.sh: Getting vnet name, please be patient
./AKS_compute.sh: vnet = aks-vnet-18022328
./AKS_compute.sh: Deleting non-system namespaces
./AKS_compute.sh: Deleting namespace artifactory
namespace "artifactory" deleted
./AKS_compute.sh: Wait until all PVs are deleted, please be patient
./AKS_compute.sh: Waiting for 2 PVs to be deleted for 1 min....
No resources found
./AKS_compute.sh: Waiting for 0 PVs to be deleted for 2 min....
Setting ANF subnet name ANF_SUBNET=anf-sso-subnet-pu-pu-aks-test-1
./AKS_compute.sh: Deleting subnet anf-sso-subnet-pu-pu-aks-test-1 from ANF
./AKS_compute.sh: Deleting AKS cluster pu-aks-test-1 in resource group rg-patricu-westeu
./AKS_compute.sh: Cleaning up
./AKS_compute.sh: Deleting context pu-aks-test-1
warning: this removed your active context, use "kubectl config use-context" to select a different one
deleted context pu-aks-test-1 from /root/.kube/config
./AKS_compute.sh: Deleting cluster pu-aks-test-1
deleted cluster pu-aks-test-1 from /root/.kube/config
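
The AKS_compute.sh script itself is not shown in this article. As an illustration of the "wait until all PVs are deleted" step in its log, a polling loop could look like the following sketch. The function names, the default timeout, and the polling interval are assumptions, and this is not the author's implementation.

```shell
# Hypothetical sketch, not the author's AKS_compute.sh.
# Count the PersistentVolumes left in the cluster. When none exist, kubectl
# prints "No resources found" to stderr, so stdout is empty and the count is 0.
count_pvs() {
  kubectl get pv --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# Poll until no PVs remain; give up after timeout_s seconds (default 600),
# checking every interval_s seconds (default 10).
wait_for_pv_deletion() {
  timeout_s=${1:-600}
  interval_s=${2:-10}
  waited=0
  while [ "$(count_pvs)" -gt 0 ]; do
    if [ "$waited" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep "$interval_s"
    waited=$((waited + interval_s))
  done
  return 0
}

# Usage:
#   wait_for_pv_deletion 600 10 || echo "PVs still present after 10 minutes" >&2
```

Waiting for the volumes to disappear before removing the delegated subnet and the cluster avoids leaving orphaned storage resources behind.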

 

ACS will detect that both the application and the cluster are not reachable anymore after a short while:

 

GeertVanTeylingen_22-1671569364556.png

 

And ACS puts both the cluster and the application in state Unavailable:

 

GeertVanTeylingen_23-1671569401396.png

 

As the snapshots are stored locally and hence are not available anymore after deleting all the cluster resources, the application protection status is now Partially protected:

 

GeertVanTeylingen_24-1671569430332.png

 

The backups, in contrast, are stored in object storage, and we can add buckets with a very high level of redundancy to Astra Control (see the ACS documentation and this blog post for instructions on how to add additional buckets to Astra Control for storing your backups). The backups therefore remain available even after the loss of a region, and we can recover the application from an existing backup in such a scenario, as we’ll show further down.

 

To recover the application from our simulated loss of a complete Azure region, we bring up a new AKS cluster pu-aks-test-2 in the Azure region northeurope and add it to ACS. As in our example we’re working with the same Azure subscription, we can use the existing Service Principal in Clusters -> Add to discover and manage the newly deployed AKS cluster:

 

GeertVanTeylingen_25-1671569460521.png

 

GeertVanTeylingen_26-1671569502962.png

 

We selected the same default storage class (netapp-anf-perf-standard) as in the original cluster:

 

GeertVanTeylingen_27-1671569526966.png

 

Astra Control now manages the new cluster in the northeurope region:

 

GeertVanTeylingen_28-1671569572715.png

 

Now we can restore the artifactory application from the scheduled backup to pu-aks-test-2 in region northeurope. From the Data protection tab in the application’s details, we can initiate the restore directly from the Actions menu next to the backup:

 

GeertVanTeylingen_29-1671569603407.png

 

To be able to restore to a different cluster, select Restore to new namespace, select the destination cluster pu-aks-test-2 from the dropdown menu and enter the namespace name for the restore – we’re simply using the same namespace artifactory:

 

GeertVanTeylingen_30-1671569635314.png

 

Next, we confirm the restore source:

 

GeertVanTeylingen_31-1671569664843.png

 

After reviewing the restore information, we can start the restore process:

 

GeertVanTeylingen_32-1671569706768.png

 

With kubectl, we can follow the creation of the artifactory namespace on the destination cluster:

~# kubectl config use-context pu-aks-test-2
Switched to context "pu-aks-test-2".

 

~# kubectl get ns
NAME              STATUS   AGE
artifactory       Active   2m15s
default           Active   3h44m
kube-node-lease   Active   3h44m
kube-public       Active   3h44m
kube-system       Active   3h44m
trident           Active   5m48s

 

and can see Astra Control’s restore processes:

~# kubectl get all,pvc -n artifactory
NAME                                           READY   STATUS    RESTARTS   AGE
pod/r-artifactory-volume-artifactory-0-fgswh   0/1     Pending   0          2m35s
pod/r-data-artifactory-postgresql-0-rk5c5      0/1     Pending   0          2m34s

NAME                                           COMPLETIONS   DURATION   AGE
job.batch/r-artifactory-volume-artifactory-0   0/1           2m35s      2m35s
job.batch/r-data-artifactory-postgresql-0      0/1           2m34s      2m34s

NAME                                                     STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS               AGE
persistentvolumeclaim/artifactory-volume-artifactory-0   Pending                                      netapp-anf-perf-standard   2m36s
persistentvolumeclaim/data-artifactory-postgresql-0      Pending                                      netapp-anf-perf-standard   2m36s

 

Once the data transfer from the backup finishes, Astra Control recreates the rest of the application resources, and the pods and services will come up.

~# kubectl get all,pvc -n artifactory
NAME                                                 READY   STATUS    RESTARTS   AGE
pod/artifactory-0                                    1/1     Running   0          3m40s
pod/artifactory-artifactory-nginx-5cb99466fd-h6zsl   1/1     Running   0          3m38s
pod/artifactory-postgresql-0                         1/1     Running   0          3m40s

NAME                                      TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)                      AGE
service/artifactory                       ClusterIP      10.0.3.17     <none>          8082/TCP,8081/TCP            3m37s
service/artifactory-artifactory-nginx     LoadBalancer   10.0.98.142   20.166.200.14   80:30947/TCP,443:32154/TCP   3m35s
service/artifactory-postgresql            ClusterIP      10.0.27.131   <none>          5432/TCP                     3m38s
service/artifactory-postgresql-headless   ClusterIP      None          <none>          5432/TCP                     3m38s

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/artifactory-artifactory-nginx   1/1     1            1           3m38s

NAME                                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/artifactory-artifactory-nginx-5cb99466fd   1         1         1       3m38s

NAME                                      READY   AGE
statefulset.apps/artifactory              1/1     3m41s
statefulset.apps/artifactory-postgresql   1/1     3m40s

NAME                                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
persistentvolumeclaim/artifactory-volume-artifactory-0   Bound    pvc-d01ef1a7-dd30-4d46-85ec-49ec3eeb49b7   100Gi      RWO            netapp-anf-perf-standard   9m25s
persistentvolumeclaim/data-artifactory-postgresql-0      Bound    pvc-fdf18583-fc43-4daf-963b-63a8153319fa   200Gi      RWO            netapp-anf-perf-standard   9m25s
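
The transition from Pending to Bound shown in the two listings above can also be checked in a script. The helper below is a sketch with a hypothetical name; it assumes the column layout of kubectl get pvc --no-headers, where the second column holds the claim status.

```shell
# Hypothetical helper: read `kubectl get pvc --no-headers` output on stdin and
# succeed only if there is at least one claim and every claim is Bound.
all_pvcs_bound() {
  awk 'BEGIN { n = 0; bad = 0 }
       { n++; if ($2 != "Bound") bad++ }
       END { exit (n > 0 && bad == 0) ? 0 : 1 }'
}

# Usage:
#   kubectl get pvc -n artifactory --no-headers | all_pvcs_bound \
#     && echo "all volumes are bound"
```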

 

As the LoadBalancer service will come up with a new external IP, we reconfigure the FQDN arti1.astrarocks.pu-store.de with the new public IP address in our domain service:

~ % nslookup arti1.astrarocks.pu-store.de
Server:        192.168.178.73
Address:       192.168.178.73#53

Non-authoritative answer:
Name:   arti1.astrarocks.pu-store.de
Address: 20.166.200.14

 

Now we can log in again to the restored Artifactory service, which runs on the AKS cluster pu-aks-test-2 in the region northeurope:

GeertVanTeylingen_33-1671569919314.png

Logging in with the user created during the Artifactory configuration works, and the Docker and Helm repositories are available:

GeertVanTeylingen_34-1671569944945.png
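
In addition to the manual login, a scripted smoke test can confirm that the restored service responds. The sketch below uses Artifactory's system ping REST endpoint, which answers with the text OK when the service is healthy. The function name and the 10-second timeout are assumptions, and the URL path assumes the nginx service proxies /artifactory/ as in the default chart setup.

```shell
# Hypothetical helper: succeed if the Artifactory instance behind base URL $1
# answers its system ping endpoint with "OK".
artifactory_is_up() {
  body=$(curl --silent --fail --max-time 10 "$1/artifactory/api/system/ping") || return 1
  [ "$body" = "OK" ]
}

# Usage:
#   artifactory_is_up http://arti1.astrarocks.pu-store.de && echo "Artifactory responds"
```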

 

Summary

 

In this article we described how we can make JFrog Artifactory running on AKS using Azure Disk Storage and Azure NetApp Files resilient to disasters, enabling us to provide business continuity for the platform. NetApp® Astra™ Control makes it easy to protect business-critical AKS workloads (stateful and stateless) with just a few clicks. Get started with Astra Control Service today with a free plan.

 

Additional information

  1. https://docs.netapp.com/us-en/astra-control-service/index.html
  2. https://www.jfrog.com/confluence/display/JFROG/Installing+the+JFrog+Platform+Using+Helm+Chart
  3. https://docs.netapp.com/us-en/astra-control-service/use/manage-buckets.html
  4. https://cloud.netapp.com/blog/astra-blg-astra-uses-azure-buckets-to-protect-your-kubernetes-data
  5. https://docs.netapp.com/us-en/astra-control-service/use/manage-app-execution-hooks.html
  6. https://github.com/NetApp/Verda
  7. https://techcommunity.microsoft.com/t5/azure-architecture-blog/protecting-mongodb-on-aks-anf-with-as...
  8. https://cloud.netapp.com/astra-register

Last update: Dec 20 2022 01:04 PM