Use Harbor in Azure and Multi-Cloud Environments
In today's rapidly evolving cloud landscape, managing container images across multiple regions of a single cloud or across multiple cloud providers can be complex and challenging. Harbor, an open-source, cloud-native registry, gives customers a unified and efficient approach to container image management across cloud environments such as Azure, AWS, and GCP.
Harbor was created to address the growing need for a single source of truth in container image management. Initially developed by VMware, it has gained popularity within the cloud-native community due to its powerful features and ease of use. As an open-source project, Harbor benefits from the collective efforts of a vibrant community, ensuring continuous improvements and innovation.
Harbor's significance to the cloud-native ecosystem is further amplified by its acceptance as a graduated member project of the Cloud Native Computing Foundation (CNCF). This prestigious recognition highlights the project's maturity and adherence to industry best practices.
One of the key advantages of adopting Harbor is its robust security capabilities. The registry incorporates features like vulnerability scanning and access controls, ensuring that container images are thoroughly examined for vulnerabilities with Trivy before deployment. Trivy is a popular open-source security scanner that is reliable, fast, and easy to use. Furthermore, Harbor allows customers to enforce consistent security policies and granular access controls, granting only authorized users or teams rights to container images. This enhanced security posture allows customers to maintain control over their container images in a multi-cloud environment, mitigating potential security risks.
This article provides a comprehensive guide on deploying Harbor on Azure Kubernetes Service (AKS) in a reliable and secure manner, leveraging various configurations and network topologies. It also explains how to replicate container images across multiple registries and cloud environments, integrating Harbor with Azure Container Registry (ACR). Additionally, it discusses the integration of Harbor into a consistent GitOps and DevOps CI/CD strategy using Argo CD, Flux CD, and Azure DevOps. Please find the companion code and Visio diagram in this GitHub repository.
Harbor Services
When installing Harbor, you will encounter seven services that play different roles in the registry:
- Core: The Core service is the main component of Harbor. It manages the overall functionality and operation of the Harbor registry. This service runs in its own pod.
- Portal: The Portal service provides the user interface for interacting with Harbor. It allows users to browse, search, and manage container images and projects. The Portal service runs in its own pod.
- Job Service: The Job Service manages and executes background tasks in Harbor, such as garbage collection, replication, and scanning. It ensures that these tasks run in the background without affecting the responsiveness of the registry. The Job Service runs in its own pod.
- Registry: The Registry service handles the storage and retrieval of container images. It is compatible with the Docker Registry HTTP API V2 specification. The Registry service runs in its own pod.
- Redis: Redis is an in-memory data structure store that acts as a key-value cache for Harbor. It improves the performance of the registry by keeping frequently accessed data in memory, reducing the need for disk I/O operations. The Redis service runs in its own pod.
- Database: The Database service stores and manages Harbor metadata and configuration, such as users, projects, repositories, access control policies, and system settings. Harbor uses PostgreSQL as its database. The Database service ensures the persistence and integrity of the registry data.
- Trivy: Trivy is a vulnerability scanner integrated with Harbor. It scans container images for security vulnerabilities during the image push process. Trivy runs in a separate pod and interacts with the Database and Registry services, adding an additional layer of security to the registry.
Each service operates within its own deployment, which can consist of one or multiple pods, depending on the specified deployment configuration. For more information on Harbor installation, see the following resources:
- Test Harbor with the Demo Server
- Harbor Compatibility List
- Harbor Installation Prerequisites
- Download the Harbor Installer
- Configure HTTPS Access to Harbor
- Configure Internal TLS communication between Harbor Component
- Configure the Harbor YML File
- Run the Installer Script
- Deploying Harbor with High Availability via Helm
- Deploy Harbor with the Quick Installation Script
- Troubleshooting Harbor Installation
- Reconfigure Harbor and Manage the Harbor Lifecycle
- Customize the Harbor Token Service
- Configure Harbor User Settings at the Command Line
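After installing Harbor, you can verify that each service runs as its own Kubernetes workload with one or more pods. A quick check, assuming the chart was installed in the harbor namespace used by the scripts later in this article:
# List the workloads and pods that the Harbor Helm chart creates for each service
kubectl get deployments,statefulsets,pods --namespace harbor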
Deploying Harbor on AKS with Helm
You can reliably deploy Harbor on Azure Kubernetes Service (AKS) using Helm. One option is to use the official Harbor Helm chart, which is also available on Artifact Hub. For detailed instructions on customizing the deployment, refer to the Harbor Helm chart GitHub repository. Alternatively, you can use Helm charts maintained by the technical community or third-party providers such as Bitnami, which offer additional features and customization options.
This article offers a collection of scripts that deploy Harbor on AKS via Helm, with alternative configurations for seamless container image management in your AKS environment. Before executing any script, customize the variables in the 00-variables.sh file. This file is sourced by all the scripts and contains the following variables:
#!/bin/bash
# Azure Subscription and Tenant
subscriptionId=$(az account show --query id --output tsv)
subscriptionName=$(az account show --query name --output tsv)
tenantId=$(az account show --query tenantId --output tsv)
harborResourceGroupName="MikiRG"
location="eastus2"
# Harbor
harborNamespace="harbor"
harborRepoName="harbor"
harborChartName="harbor"
harborReleaseName="harbor"
harborRepoUrl="https://helm.goharbor.io"
harborTlsSecretName="harbor-tls"
harborIngressClassName="nginx"
harborExposeType="ingress"
harborRegistryStorageClass="managed-csi-premium"
harborAdminUsername="admin"
harborAdminPassword="admin"
# Store size
registrySize="256Gi"
jobserviceJobLogSize="8Gi"
databaseSize="8Gi"
redisSize="8Gi"
trivySize="8Gi"
# Replica Count
portalReplicaCount=1
coreReaplicaCount=1
jobserviceReplicaCount=1
registryReplicaCount=1
trivyReplicaCount=1
# Certificate Manager
certManagerNamespace="cert-manager"
certManagerRepoName="jetstack"
certManagerRepoUrl="https://charts.jetstack.io"
certManagerChartName="cert-manager"
certManagerReleaseName="cert-manager"
certManagerClusterIssuer="letsencrypt-nginx"
certManagerAcmeChallengeType="http01"
email="paolos@contoso.com"
clusterIssuer="letsencrypt-nginx"
template="cluster-issuer.yml"
# NGINX Ingress Controller
nginxNamespace="ingress-basic"
nginxRepoName="ingress-nginx"
nginxRepoUrl="https://kubernetes.github.io/ingress-nginx"
nginxChartName="ingress-nginx"
nginxReleaseName="nginx-ingress"
nginxIngressClassName="nginx"
nginxReplicaCount=3
# Azure DNS Zone
dnsZoneName="contoso.com"
dnsZoneResourceGroupName="dnsresourcegroup"
harborSubdomain="harbor"
harborHostname="${harborSubdomain,,}.${dnsZoneName,,}"
sampleSubdomain="flaskapp"
sampleHostname="${sampleSubdomain,,}.${dnsZoneName,,}"
# Virtual Network
virtualNetworkName="MikiAksVnet"
virtualNetworkAddressPrefix="10.0.0.0/8"
# Subnets
postgreSqlSubnetName="SqlSubnet"
postgreSqlSubnetPrefix="10.244.0.0/24"
defaultSubnetName="VmSubnet"
defaultSubnetPrefix="10.243.1.0/24"
# Private DNS Zones
postgreSqlPrivateDnsZoneName="harbor.postgres.database.azure.com"
postgreSqlPrivateDnsZoneVirtualNetworkLinkName="LinkTo${virtualNetworkName}"
redisPrivateDnsZoneName="privatelink.redis.cache.windows.net"
redisPrivateDnsZoneVirtualNetworkLinkName="LinkTo${virtualNetworkName}"
# Azure Database for PostgreSQL flexible server
postgreSqlServerName="MikiPostgreSQLServer"
postgreSqlActiveDirectoryAuth="Enabled"
postgreSqlAdminUsername="azadmin"
postgreSqlAdminPassword="P@ssw0rd1234"
postgreSqlSkuName="Standard_D2s_v3"
postgreSqlStorageSize=1024
postgreSqlVersion=15
postgreSqlDatabaseName="registry"
# Azure Cache for Redis
redisCacheName="MikiRedisCache"
redisCacheSku="Standard"
redisCacheVmSize="C1"
redisCachePrivateEndpointName="RedisCachePrivateEndpoint"
redisCachePrivateDnsZoneGroupName="RedisCache"
# Azure Container Registry
acrName="TanAcr"
acrResourceGroupName="TanRG"
acrSku="Standard"
acrUsername="harbor"
acrScopeMapName="harbor-scope-map"
repositories=("chat" "doc")
moreRepositories=("harbor/flaskapp")
importProject="private"
importRepositories=("ubuntu:latest" "python:latest")
# Image
imageName="flaskapp"
tag="1.0"
harborProject="private"
imagePullPolicy="IfNotPresent" # Always, Never, IfNotPresent
# Others
dockerEmail="paolos@contoso.com"
# Kubernetes
sampleNamespace="flaskapp"
managedCsiPremiumZrsStorageClassName="managed-csi-premium-zrs"
azureFilePremiumZrsStorageClassName="azurefile-csi-premium-zrs"
This article presents three different deployment options:
- Deploy Harbor with Managed Disks: This is a basic installation option where all Harbor services utilize an Azure Managed Disk as a data repository. Each service deployment consists of a single pod. This deployment is suitable for small setups.
- Deploy Harbor with Managed PostgreSQL and Redis: This deployment option creates and utilizes an Azure Database for PostgreSQL flexible server as the data repository for the database service. It also employs an Azure Cache for Redis, ideally configured with zone redundancy, as a cache for the redis service. This deployment is more robust than the previous option as it leverages the resiliency features provided by Azure managed services.
- Deploy Harbor across Availability Zones: This deployment option ensures intra-region resiliency by spreading multiple replicas of each service deployment across availability zones in a zone-redundant AKS cluster. Additionally, services utilize Zone Redundant Storage (ZRS) for persistent volumes.
Choose the deployment style that best fits your constraints and requirements in terms of reliability, configuration, performance, and cost. You can customize your own deployment configuration by selecting different values during the Helm chart deployment and by combining the approaches mentioned above.
Deploy Harbor with Managed Disks
You can use the 03-deploy-harbor-with-managed-disks.sh
script to deploy Harbor to your Azure Kubernetes Service (AKS) cluster with a configuration illustrated in the following picture.
As shown in the diagram, the Redis, Trivy, Job Service, Registry, and Database services utilize an Azure Managed Disk as a persistent volume for data storage. These managed disks are created by the AKS cluster identity within the node resource group, which houses the resources associated with the cluster. The Azure Disks CSI Driver is a CSI specification-compliant driver used by Azure Kubernetes Service (AKS) to manage the lifecycle of Azure managed disks. The Container Storage Interface (CSI) is a standard for exposing arbitrary block and file storage systems to containerized workloads on Kubernetes. For more information, see Use the Azure Disk CSI Driver.
#!/bin/bash
# Variables
source ./00-variables.sh
# Check if the Harbor repository is not already added
result=$(helm repo list | grep $harborRepoName | awk '{print $1}')
if [[ -n $result ]]; then
echo "[$harborRepoName] Helm repo already exists"
else
# Add the Harbor repository
echo "Adding [$harborRepoName] Helm repo..."
helm repo add $harborRepoName $harborRepoUrl
fi
# Update your local Helm chart repository cache
echo 'Updating Helm repos...'
helm repo update
# Use Helm to deploy Harbor
result=$(helm list -n $harborNamespace | grep $harborReleaseName | awk '{print $1}')
if [[ -n $result ]]; then
echo "[$harborReleaseName] release already exists in the [$harborNamespace] namespace"
# Upgrade Harbor
echo "Upgrading [$harborReleaseName] release in the [$harborNamespace] namespace..."
helm upgrade $harborReleaseName $harborRepoName/$harborChartName \
--install \
--create-namespace \
--namespace $harborNamespace \
--set externalURL=https://$harborHostname \
--set harborAdminPassword=$harborAdminPassword \
--set expose.type=$harborExposeType \
--set expose.tls.certSource=secret \
--set expose.tls.secret.secretName=$harborTlsSecretName \
--set expose.ingress.hosts.core=$harborHostname \
--set expose.ingress.className=$nginxIngressClassName \
--set expose.ingress.annotations."cert-manager\.io/cluster-issuer"=$certManagerClusterIssuer \
--set expose.ingress.annotations."cert-manager\.io/acme-challenge-type"=$certManagerAcmeChallengeType \
--set persistence.enabled=true \
--set persistence.persistentVolumeClaim.registry.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.jobservice.jobLog.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.database.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.redis.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.trivy.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.registry.size=$registrySize \
--set persistence.persistentVolumeClaim.jobservice.jobLog.size=$jobserviceJobLogSize \
--set persistence.persistentVolumeClaim.database.size=$databaseSize \
--set persistence.persistentVolumeClaim.redis.size=$redisSize \
--set persistence.persistentVolumeClaim.trivy.size=$trivySize \
--set portal.replicas=$portalReplicaCount \
--set core.replicas=$coreReaplicaCount \
--set jobservice.replicas=$jobserviceReplicaCount \
--set registry.replicas=$registryReplicaCount \
--set trivy.replicas=$trivyReplicaCount
else
# Install Harbor
echo "Installing [$harborReleaseName] release in the [$harborNamespace] namespace..."
helm install $harborReleaseName $harborRepoName/$harborChartName \
--create-namespace \
--namespace $harborNamespace \
--set externalURL=https://$harborHostname \
--set harborAdminPassword=$harborAdminPassword \
--set expose.type=$harborExposeType \
--set expose.tls.certSource=secret \
--set expose.tls.secret.secretName=$harborTlsSecretName \
--set expose.ingress.hosts.core=$harborHostname \
--set expose.ingress.className=$nginxIngressClassName \
--set expose.ingress.annotations."cert-manager\.io/cluster-issuer"=$certManagerClusterIssuer \
--set expose.ingress.annotations."cert-manager\.io/acme-challenge-type"=$certManagerAcmeChallengeType \
--set persistence.enabled=true \
--set persistence.persistentVolumeClaim.registry.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.jobservice.jobLog.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.database.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.redis.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.trivy.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.registry.size=$registrySize \
--set persistence.persistentVolumeClaim.jobservice.jobLog.size=$jobserviceJobLogSize \
--set persistence.persistentVolumeClaim.database.size=$databaseSize \
--set persistence.persistentVolumeClaim.redis.size=$redisSize \
--set persistence.persistentVolumeClaim.trivy.size=$trivySize \
--set portal.replicas=$portalReplicaCount \
--set core.replicas=$coreReaplicaCount \
--set jobservice.replicas=$jobserviceReplicaCount \
--set registry.replicas=$registryReplicaCount \
--set trivy.replicas=$trivyReplicaCount
fi
# Get values
helm get values $harborReleaseName --namespace $harborNamespace
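After the script completes, you can verify that each service obtained a persistent volume backed by an Azure managed disk and that all pods are running. A minimal check, assuming the harbor namespace and release name defined in 00-variables.sh:
# Verify that the persistent volume claims are bound to managed disks
kubectl get pvc --namespace harbor
# Check that all Harbor pods are up and running
kubectl get pods --namespace harbor --output wide
# Review the status of the Helm release
helm status harbor --namespace harbor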
The official Harbor Helm Chart provides the flexibility to define the storage class for provisioning persistent volumes used by each service. You can use one of the following built-in storage classes:
- managed-csi: Uses Azure Standard SSD locally redundant storage (LRS) to create a managed disk.
- managed-csi-premium: Uses Azure Premium LRS to create a managed disk.
The built-in storage classes are suitable for most common scenarios. In some cases, you might want your own storage class customized with your own parameters. You can see an example of custom storage classes in the third deployment option, which uses both Azure Files and Azure Managed Disks as persistent volumes for the Harbor services.
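You can list the storage classes available in your cluster, including any custom ones you create, before choosing a value for the harborRegistryStorageClass variable:
# List the built-in and custom storage classes available in the AKS cluster
kubectl get storageclass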
Deploy Harbor with Managed PostgreSQL and Redis
The Helm chart offers the flexibility to deploy Harbor with a configuration where the Database and Redis services utilize external services:
- External PostgreSQL: Set database.type to external and provide the necessary details in the database.external section. Ensure that an empty database is pre-created, as Harbor will automatically create the required tables during startup.
- External Redis: Set redis.type to external and fill in the relevant information in the redis.external section.
This feature can be leveraged to deploy Harbor with a resilient setup. For instance, the Database service can be configured to use an Azure Database for PostgreSQL flexible server, deployed in a dedicated subnet within the same virtual network as the AKS cluster. Similarly, the Redis service can be set up to utilize Azure Cache for Redis, ideally configured with zone redundancy for enhanced intra-region resiliency and passive geo-redundancy for cross-region business continuity.
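If you prefer a values file over --set arguments, the external database and cache settings used by the script below can be sketched as a Helm values override like the following. The placeholder values map to the variables defined in 00-variables.sh and retrieved by the script; double-check the key names against the values.yaml of your chart version, since some chart versions name the database key coreDatabase instead of database.
# external-services-values.yaml (illustrative): external PostgreSQL and Redis for Harbor
database:
  type: external
  external:
    host: "<postgreSqlFullyQualifiedDomainName>"
    port: "5432"
    username: "azadmin"
    password: "<postgreSqlAdminPassword>"
    database: "registry"
    sslmode: "require"
redis:
  type: external
  external:
    addr: "<redisCacheHostName>:<redisCachePort>"
    password: "<redisCachePrimaryKey>"
You would then pass this file to helm install or helm upgrade with the -f option instead of the corresponding --set flags.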
The 04-deploy-harbor-with-managed-services.sh
script facilitates the creation of an Azure Database for PostgreSQL flexible server within the same virtual network as your AKS cluster via Azure CLI. It then provisions an Azure Cache for Redis and, finally, deploys Harbor using the Helm chart. The diagram below represents the deployment architecture.
In this setup, the Trivy, Job Service, and Registry services utilize an Azure Managed Disk as a persistent volume for data storage. These managed disks are created by the AKS cluster identity within the node resource group, which contains the resources associated with the cluster. The Database service is configured to use the Azure Database for PostgreSQL flexible server provisioned by the script, while the Redis service is configured to utilize the newly created Azure Cache for Redis.
#!/bin/bash
# Variables
source ./00-variables.sh
# Check if the resource group already exists
echo "Checking if [$harborResourceGroupName] resource group actually exists in the [$subscriptionName] subscription..."
az group show --name $harborResourceGroupName &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$harborResourceGroupName] resource group actually exists in the [$subscriptionName] subscription"
echo "Creating [$harborResourceGroupName] resource group in the [$subscriptionName] subscription..."
# Create the resource group
az group create --name $harborResourceGroupName --location $location 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$harborResourceGroupName] resource group successfully created in the [$subscriptionName] subscription"
else
echo "Failed to create [$harborResourceGroupName] resource group in the [$subscriptionName] subscription"
exit
fi
else
echo "[$harborResourceGroupName] resource group already exists in the [$subscriptionName] subscription"
fi
# Check if the server virtual network already exists
echo "Checking if [$virtualNetworkName] virtual network actually exists in the [$harborResourceGroupName] resource group..."
az network vnet show \
--name $virtualNetworkName \
--only-show-errors \
--resource-group $harborResourceGroupName &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$virtualNetworkName] virtual network actually exists in the [$harborResourceGroupName] resource group"
echo "Creating [$virtualNetworkName] virtual network in the [$harborResourceGroupName] resource group..."
# Create the server virtual network
az network vnet create \
--name $virtualNetworkName \
--resource-group $harborResourceGroupName \
--location $location \
--address-prefixes $virtualNetworkAddressPrefix \
--subnet-name $defaultSubnetName \
--subnet-prefix $defaultSubnetPrefix \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$virtualNetworkName] virtual network successfully created in the [$harborResourceGroupName] resource group"
else
echo "Failed to create [$virtualNetworkName] virtual network in the [$harborResourceGroupName] resource group"
exit
fi
else
echo "[$virtualNetworkName] virtual network already exists in the [$harborResourceGroupName] resource group"
fi
# Check if the subnet for Azure Database for PostgreSQL flexible server already exists
echo "Checking if [$postgreSqlSubnetName] backend subnet actually exists in the [$virtualNetworkName] virtual network..."
az network vnet subnet show \
--name $postgreSqlSubnetName \
--vnet-name $virtualNetworkName \
--resource-group $harborResourceGroupName \
--only-show-errors &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$postgreSqlSubnetName] backend subnet actually exists in the [$virtualNetworkName] virtual network"
echo "Creating [$postgreSqlSubnetName] backend subnet in the [$virtualNetworkName] virtual network..."
# Create the backend subnet
az network vnet subnet create \
--name $postgreSqlSubnetName \
--vnet-name $virtualNetworkName \
--resource-group $harborResourceGroupName \
--address-prefix $postgreSqlSubnetPrefix \
--delegations "Microsoft.DBforPostgreSQL/flexibleServers" \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$postgreSqlSubnetName] backend subnet successfully created in the [$virtualNetworkName] virtual network"
else
echo "Failed to create [$postgreSqlSubnetName] backend subnet in the [$virtualNetworkName] virtual network"
exit
fi
else
echo "[$postgreSqlSubnetName] backend subnet already exists in the [$virtualNetworkName] virtual network"
fi
# Retrieve the resource id of the subnet
echo "Retrieving [$postgreSqlSubnetName] backend subnet resource id in the [$virtualNetworkName] virtual network..."
postgreSqlSubnetResourceId=$(az network vnet subnet show \
--name $postgreSqlSubnetName \
--vnet-name $virtualNetworkName \
--resource-group $harborResourceGroupName \
--query id \
--output tsv \
--only-show-errors)
if [[ -z $postgreSqlSubnetResourceId ]]; then
echo "Failed to retrieve [$postgreSqlSubnetName] backend subnet resource id in the [$virtualNetworkName] virtual network"
exit
fi
# Check if the private DNS Zone already exists
az network private-dns zone show \
--name $postgreSqlPrivateDnsZoneName \
--resource-group $harborResourceGroupName \
--only-show-errors &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$postgreSqlPrivateDnsZoneName] private DNS zone actually exists in the [$harborResourceGroupName] resource group"
echo "Creating [$postgreSqlPrivateDnsZoneName] private DNS zone in the [$harborResourceGroupName] resource group..."
# Create the private DNS Zone
az network private-dns zone create \
--name $postgreSqlPrivateDnsZoneName \
--resource-group $harborResourceGroupName \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$postgreSqlPrivateDnsZoneName] private DNS zone successfully created in the [$harborResourceGroupName] resource group"
else
echo "Failed to create [$postgreSqlPrivateDnsZoneName] private DNS zone in the [$harborResourceGroupName] resource group"
exit
fi
else
echo "[$postgreSqlPrivateDnsZoneName] private DNS zone already exists in the [$harborResourceGroupName] resource group"
fi
# Check if the private DNS Zone virtual network link already exists
az network private-dns link vnet show \
--name $postgreSqlPrivateDnsZoneVirtualNetworkLinkName \
--resource-group $harborResourceGroupName \
--zone-name $postgreSqlPrivateDnsZoneName \
--only-show-errors &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$postgreSqlPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link actually exists in the [$harborResourceGroupName] resource group"
# Retrieve the client virtual network resource id
virtualNetworkResourceId=$(az network vnet show \
--name $virtualNetworkName \
--resource-group $harborResourceGroupName \
--only-show-errors \
--query id \
--output tsv)
if [[ -z $virtualNetworkResourceId ]]; then
echo "Failed to retrieve [$virtualNetworkName] client virtual network resource id in the [$harborResourceGroupName] resource group"
exit
fi
echo "Creating [$postgreSqlPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link in the [$harborResourceGroupName] resource group..."
# Create the private DNS Zone virtual network link
az network private-dns link vnet create \
--name $postgreSqlPrivateDnsZoneVirtualNetworkLinkName \
--resource-group $harborResourceGroupName \
--zone-name $postgreSqlPrivateDnsZoneName \
--virtual-network $virtualNetworkResourceId \
--registration-enabled false \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$postgreSqlPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link successfully created in the [$harborResourceGroupName] resource group"
else
echo "Failed to create [$postgreSqlPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link in the [$harborResourceGroupName] resource group"
exit
fi
else
echo "[$postgreSqlPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link already exists in the [$harborResourceGroupName] resource group"
fi
# Retrieve the resource id of the private DNS Zone
echo "Retrieving [$postgreSqlPrivateDnsZoneName] private DNS zone resource id in the [$harborResourceGroupName] resource group..."
postgreSqlPrivateDnsZoneResourceId=$(az network private-dns zone show \
--name $postgreSqlPrivateDnsZoneName \
--resource-group $harborResourceGroupName \
--query id \
--output tsv \
--only-show-errors)
# Check if Azure Database for PostgreSQL flexible server exists
echo "Checking if [$postgreSqlServerName] Azure Database for PostgreSQL flexible server actually exists in the [$harborResourceGroupName] resource group..."
az postgres flexible-server show \
--name ${postgreSqlServerName,,} \
--resource-group $harborResourceGroupName \
--only-show-errors &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$postgreSqlServerName] Azure Database for PostgreSQL flexible server exists in the [$harborResourceGroupName] resource group"
echo "Creating [$postgreSqlServerName] Azure Database for PostgreSQL flexible server in the [$harborResourceGroupName] resource group..."
# Create Azure Database for PostgreSQL flexible server
az postgres flexible-server create \
--name ${postgreSqlServerName,,} \
--resource-group $harborResourceGroupName \
--location $location \
--active-directory-auth $postgreSqlActiveDirectoryAuth \
--sku-name $postgreSqlSkuName \
--storage-size $postgreSqlStorageSize \
--version $postgreSqlVersion \
--admin-user $postgreSqlAdminUsername \
--admin-password $postgreSqlAdminPassword \
--database-name $postgreSqlDatabaseName \
--subnet $postgreSqlSubnetResourceId \
--private-dns-zone $postgreSqlPrivateDnsZoneResourceId \
--tags "Environment=Test" "Project=Harbor" "Department=IT" \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$postgreSqlServerName] Azure Database for PostgreSQL flexible server successfully created in the [$harborResourceGroupName] resource group"
else
echo "Failed to create [$postgreSqlServerName] Azure Database for PostgreSQL flexible server in the [$harborResourceGroupName] resource group"
exit
fi
fi
# Retrieve the fullyQualifiedDomainName of the Azure Database for PostgreSQL flexible server
echo "Retrieving the fullyQualifiedDomainName of the [$postgreSqlServerName] Azure Database for PostgreSQL flexible server in the [$harborResourceGroupName] resource group..."
postgreSqlFullyQualifiedDomainName=$(az postgres flexible-server show \
--name ${postgreSqlServerName,,} \
--resource-group $harborResourceGroupName \
--query fullyQualifiedDomainName \
--output tsv \
--only-show-errors)
if [[ -n $postgreSqlFullyQualifiedDomainName ]]; then
echo "[$postgreSqlFullyQualifiedDomainName] fullyQualifiedDomainName for the [$postgreSqlServerName] Azure Database for PostgreSQL flexible server successfully retrieved"
else
echo "Failed to retrieve the fullyQualifiedDomainName of the [$postgreSqlServerName] Azure Database for PostgreSQL flexible server in the [$harborResourceGroupName] resource group"
exit
fi
# Check if the Azure Cache for Redis already exists
echo "Checking if [$redisCacheName] Azure Cache for Redis actually exists in the [$harborResourceGroupName] resource group..."
az redis show \
--name $redisCacheName \
--resource-group $harborResourceGroupName \
--only-show-errors &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$redisCacheName] Azure Cache for Redis actually exists in the [$harborResourceGroupName] resource group"
echo "Creating [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group..."
# Create the Azure Cache for Redis
az redis create \
--name $redisCacheName \
--resource-group $harborResourceGroupName \
--enable-non-ssl-port \
--location $location \
--sku $redisCacheSku \
--mi-system-assigned \
--vm-size $redisCacheVmSize \
--tags "Environment=Test" "Project=Harbor" "Department=IT" \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$redisCacheName] Azure Cache for Redis successfully created in the [$harborResourceGroupName] resource group"
else
echo "Failed to create [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group"
exit
fi
else
echo "[$redisCacheName] Azure Cache for Redis already exists in the [$harborResourceGroupName] resource group"
fi
# Retrieve the resource id of the Azure Cache for Redis
echo "Retrieving the resource id of the [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group..."
redisCacheResourceId=$(az redis show \
--name $redisCacheName \
--resource-group $harborResourceGroupName \
--query id \
--output tsv \
--only-show-errors)
if [[ -n $redisCacheResourceId ]]; then
echo "[$redisCacheResourceId] resource id for the [$redisCacheName] Azure Cache for Redis successfully retrieved"
else
echo "Failed to retrieve the resource id of the [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group"
exit
fi
# Retrieve the hostname of the Azure Cache for Redis
echo "Retrieving the hostname of the [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group..."
redisCacheHostName=$(az redis show \
--name $redisCacheName \
--resource-group $harborResourceGroupName \
--query hostName \
--output tsv \
--only-show-errors)
if [[ -n $redisCacheHostName ]]; then
echo "[$redisCacheHostName] hostname for the [$redisCacheName] Azure Cache for Redis successfully retrieved"
else
echo "Failed to retrieve the hostname of the [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group"
exit
fi
# Retrieve the port of the Azure Cache for Redis
echo "Retrieving the port of the [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group..."
redisCachePort=$(az redis show \
--name $redisCacheName \
--resource-group $harborResourceGroupName \
--query port \
--output tsv \
--only-show-errors)
if [[ -n $redisCachePort ]]; then
echo "[$redisCachePort] port for the [$redisCacheName] Azure Cache for Redis successfully retrieved"
else
echo "Failed to retrieve the port of the [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group"
exit
fi
# Retrieve the primary key of the Azure Cache for Redis
echo "Retrieving the primary key of the [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group..."
redisCachePrimaryKey=$(az redis list-keys \
--name $redisCacheName \
--resource-group $harborResourceGroupName \
--query primaryKey \
--output tsv \
--only-show-errors)
if [[ -n $redisCachePrimaryKey ]]; then
echo "[$redisCachePrimaryKey] primary key for the [$redisCacheName] Azure Cache for Redis successfully retrieved"
else
echo "Failed to retrieve the primary key of the [$redisCacheName] Azure Cache for Redis in the [$harborResourceGroupName] resource group"
exit
fi
# Check if the private DNS Zone already exists
az network private-dns zone show \
--name $redisPrivateDnsZoneName \
--resource-group $harborResourceGroupName \
--only-show-errors &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$redisPrivateDnsZoneName] private DNS zone actually exists in the [$harborResourceGroupName] resource group"
echo "Creating [$redisPrivateDnsZoneName] private DNS zone in the [$harborResourceGroupName] resource group..."
# Create the private DNS Zone
az network private-dns zone create \
--name $redisPrivateDnsZoneName \
--resource-group $harborResourceGroupName \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$redisPrivateDnsZoneName] private DNS zone successfully created in the [$harborResourceGroupName] resource group"
else
echo "Failed to create [$redisPrivateDnsZoneName] private DNS zone in the [$harborResourceGroupName] resource group"
exit
fi
else
echo "[$redisPrivateDnsZoneName] private DNS zone already exists in the [$harborResourceGroupName] resource group"
fi
# Check if the private DNS Zone virtual network link already exists
az network private-dns link vnet show \
--name $redisPrivateDnsZoneVirtualNetworkLinkName \
--resource-group $harborResourceGroupName \
--zone-name $redisPrivateDnsZoneName \
--only-show-errors &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$redisPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link actually exists in the [$harborResourceGroupName] resource group"
# Retrieve the client virtual network resource id
virtualNetworkResourceId=$(az network vnet show \
--name $virtualNetworkName \
--resource-group $harborResourceGroupName \
--only-show-errors \
--query id \
--output tsv)
if [[ -z $virtualNetworkResourceId ]]; then
echo "Failed to retrieve [$virtualNetworkName] client virtual network resource id in the [$harborResourceGroupName] resource group"
exit
fi
echo "Creating [$redisPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link in the [$harborResourceGroupName] resource group..."
# Create the private DNS Zone virtual network link
az network private-dns link vnet create \
--name $redisPrivateDnsZoneVirtualNetworkLinkName \
--resource-group $harborResourceGroupName \
--zone-name $redisPrivateDnsZoneName \
--virtual-network $virtualNetworkResourceId \
--registration-enabled false \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$redisPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link successfully created in the [$harborResourceGroupName] resource group"
else
echo "Failed to create [$redisPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link in the [$harborResourceGroupName] resource group"
exit
fi
else
echo "[$redisPrivateDnsZoneVirtualNetworkLinkName] private DNS zone virtual network link already exists in the [$harborResourceGroupName] resource group"
fi
# Retrieve the resource id of the private DNS Zone
echo "Retrieving [$redisPrivateDnsZoneName] private DNS zone resource id in the [$harborResourceGroupName] resource group..."
redisPrivateDnsZoneResourceId=$(az network private-dns zone show \
--name $redisPrivateDnsZoneName \
--resource-group $harborResourceGroupName \
--query id \
--output tsv \
--only-show-errors)
# Check if the private endpoint for the Azure Cache for Redis resource already exists
privateEndpointId=$(az network private-endpoint list \
--resource-group $harborResourceGroupName \
--only-show-errors \
--query "[?name=='$redisCachePrivateEndpointName'].id" \
--output tsv)
if [[ -z $privateEndpointId ]]; then
echo "Private endpoint [$redisCachePrivateEndpointName] does not exist"
echo "Creating a private endpoint for the [$redisCacheName] Azure Cache for Redis..."
# Create a private endpoint for the Azure Cache for Redis resource
az network private-endpoint create \
--name $redisCachePrivateEndpointName \
--resource-group $harborResourceGroupName \
--vnet-name $virtualNetworkName \
--subnet $defaultSubnetName \
--group-id redisCache \
--private-connection-resource-id $redisCacheResourceId \
--connection-name "${redisCacheName}Connection" \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "Private endpoint successfully created for the [$redisCacheName] Azure Cache for Redis"
else
echo "Failed to create a private endpoint for the [$redisCacheName] Azure Cache for Redis"
exit
fi
else
echo "Private endpoint [$redisCachePrivateEndpointName] already exists"
fi
# Check if the Private DNS Zone Group already exists
echo "Checking if [$redisCachePrivateDnsZoneGroupName] Private DNS Zone Group actually exists in the [$redisCachePrivateEndpointName] private endpoint..."
groupResourceId=$(az network private-endpoint dns-zone-group show \
--name $redisCachePrivateDnsZoneGroupName \
--endpoint-name $redisCachePrivateEndpointName \
--resource-group $harborResourceGroupName \
--only-show-errors \
--query id \
--output tsv)
if [[ -z $groupResourceId ]]; then
echo "No [$redisCachePrivateDnsZoneGroupName] Private DNS Zone Group actually exists in the [$redisCachePrivateEndpointName] private endpoint"
echo "Creating [$redisCachePrivateDnsZoneGroupName] Private DNS Zone Group in the [$redisCachePrivateEndpointName] private endpoint..."
# Create the Private DNS Zone Group
az network private-endpoint dns-zone-group create \
--name $redisCachePrivateDnsZoneGroupName \
--endpoint-name $redisCachePrivateEndpointName \
--resource-group $harborResourceGroupName \
--private-dns-zone $redisPrivateDnsZoneResourceId \
--zone-name $redisPrivateDnsZoneName \
--only-show-errors 1>/dev/null
if [[ $? == 0 ]]; then
echo "[$redisCachePrivateDnsZoneGroupName] Private DNS Zone Group successfully created in the [$redisCachePrivateEndpointName] private endpoint"
else
echo "Failed to create [$redisCachePrivateDnsZoneGroupName] Private DNS Zone Group in the [$redisCachePrivateEndpointName] private endpoint"
exit
fi
else
echo "[$redisCachePrivateDnsZoneGroupName] Private DNS Zone Group already exists in the [$redisCachePrivateEndpointName] private endpoint"
fi
# Check if the Harbor repository is not already added
result=$(helm repo list | grep $harborRepoName | awk '{print $1}')
if [[ -n $result ]]; then
echo "[$harborRepoName] Helm repo already exists"
else
# Add the Harbor repository
echo "Adding [$harborRepoName] Helm repo..."
helm repo add $harborRepoName $harborRepoUrl
fi
# Update your local Helm chart repository cache
echo 'Updating Helm repos...'
helm repo update
# Use Helm to deploy Harbor
result=$(helm list -n $harborNamespace | grep $harborReleaseName | awk '{print $1}')
if [[ -n $result ]]; then
echo "[$harborReleaseName] release already exists in the [$harborNamespace] namespace"
# Upgrade Harbor
echo "Upgrading [$harborReleaseName] release in the [$harborNamespace] namespace..."
helm upgrade $harborReleaseName $harborRepoName/$harborChartName \
--install \
--create-namespace \
--namespace $harborNamespace \
--set externalURL=https://$harborHostname \
--set harborAdminPassword=$harborAdminPassword \
--set expose.type=$harborExposeType \
--set expose.tls.certSource=secret \
--set expose.tls.secret.secretName=$harborTlsSecretName \
--set expose.ingress.hosts.core=$harborHostname \
--set expose.ingress.className=$nginxIngressClassName \
--set expose.ingress.annotations."cert-manager\.io/cluster-issuer"=$certManagerClusterIssuer \
--set expose.ingress.annotations."cert-manager\.io/acme-challenge-type"=$certManagerAcmeChallengeType \
--set database.type=external \
--set database.external.host=$postgreSqlFullyQualifiedDomainName \
--set database.external.port=5432 \
--set database.external.username=$postgreSqlAdminUsername \
--set database.external.password=$postgreSqlAdminPassword \
--set database.external.database=$postgreSqlDatabaseName \
--set database.external.sslmode=require \
--set redis.type=external \
--set redis.external.addr=$redisCacheHostName:$redisCachePort \
--set redis.external.password=$redisCachePrimaryKey \
--set persistence.enabled=true \
--set persistence.persistentVolumeClaim.registry.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.jobservice.jobLog.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.trivy.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.registry.size=$registrySize \
--set persistence.persistentVolumeClaim.jobservice.jobLog.size=$jobserviceJobLogSize \
--set persistence.persistentVolumeClaim.trivy.size=$trivySize \
--set portal.replicas=$portalReplicaCount \
--set core.replicas=$coreReaplicaCount \
--set jobservice.replicas=$jobserviceReplicaCount \
--set registry.replicas=$registryReplicaCount \
--set trivy.replicas=$trivyReplicaCount
else
# Install Harbor
echo "Installing [$harborReleaseName] release in the [$harborNamespace] namespace..."
helm install $harborReleaseName $harborRepoName/$harborChartName \
--create-namespace \
--namespace $harborNamespace \
--set externalURL=https://$harborHostname \
--set harborAdminPassword=$harborAdminPassword \
--set expose.type=$harborExposeType \
--set expose.tls.certSource=secret \
--set expose.tls.secret.secretName=$harborTlsSecretName \
--set expose.ingress.hosts.core=$harborHostname \
--set expose.ingress.className=$nginxIngressClassName \
--set expose.ingress.annotations."cert-manager\.io/cluster-issuer"=$certManagerClusterIssuer \
--set expose.ingress.annotations."cert-manager\.io/acme-challenge-type"=$certManagerAcmeChallengeType \
--set database.type=external \
--set database.external.host=$postgreSqlFullyQualifiedDomainName \
--set database.external.port=5432 \
--set database.external.username=$postgreSqlAdminUsername \
--set database.external.password=$postgreSqlAdminPassword \
--set database.external.database=$postgreSqlDatabaseName \
--set database.external.sslmode=require \
--set redis.type=external \
--set redis.external.addr=$redisCacheHostName:$redisCachePort \
--set redis.external.password=$redisCachePrimaryKey \
--set persistence.enabled=true \
--set persistence.persistentVolumeClaim.registry.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.jobservice.jobLog.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.trivy.storageClass=$harborRegistryStorageClass \
--set persistence.persistentVolumeClaim.registry.size=$registrySize \
--set persistence.persistentVolumeClaim.jobservice.jobLog.size=$jobserviceJobLogSize \
--set persistence.persistentVolumeClaim.trivy.size=$trivySize \
--set portal.replicas=$portalReplicaCount \
--set core.replicas=$coreReaplicaCount \
--set jobservice.replicas=$jobserviceReplicaCount \
--set registry.replicas=$registryReplicaCount \
--set trivy.replicas=$trivyReplicaCount
fi
# Get values
helm get values $harborReleaseName --namespace $harborNamespace
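When the release uses external PostgreSQL and Redis, the chart should not create the internal database and redis workloads, and only the registry, job service, and Trivy persistent volume claims remain. A quick check after the deployment completes, assuming the variables from 00-variables.sh are still set in your shell:
# With an external database and cache, no harbor-database or harbor-redis pods are expected
kubectl get pods --namespace $harborNamespace
# Only the registry, jobservice, and trivy persistent volume claims should be listed
kubectl get pvc --namespace $harborNamespace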
The picture below depicts an AKS cluster setup where Harbor is integrated with Azure Database for PostgreSQL flexible server and Azure Cache for Redis. To ensure private and secure connectivity, the Redis service connects to the managed cache through an Azure Private Endpoint, enhancing the overall security and reliability of the Harbor deployment. Likewise, the Database service reaches the Azure Database for PostgreSQL flexible server via a private IP address, thanks to the delegated subnet and private DNS zone created by the script.
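To confirm that both managed services are reached over private IP addresses, you can resolve their host names from a pod inside the cluster. A minimal check, assuming the variables populated by the script above are still set in your shell:
# The cache and database host names should resolve to private IP addresses from within the virtual network
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup $redisCacheHostName
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup $postgreSqlFullyQualifiedDomainName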
Deploy Harbor across Availability Zones
If you want to deploy Harbor across the Azure availability zones within a region, you need to create multiple replicas of each service and run at least one replica in each availability zone. Availability zones are a high-availability offering that protects your applications and data from datacenter failures. Zones are unique physical locations within an Azure region, and each zone includes one or more datacenters equipped with independent power, cooling, and networking. To ensure resiliency, there is always more than one zone in every zone-enabled region. For more information, see What are availability zones in Azure? in the Azure documentation.
Create an AKS cluster across availability zones
When creating a cluster using the az aks create command, the --zones
parameter allows you to specify the availability zones for deploying agent nodes. However, it's important to note that this parameter does not control the deployment of managed control plane components. These components are automatically distributed across all available zones in the region during cluster deployment.
Here's an example that demonstrates creating an AKS cluster named myAKSCluster in the resource group named myResourceGroup, with a total of three nodes: one in zone 1, another in zone 2, and the third in zone 3. For more information, see Create an AKS cluster across availability zones.
az group create --name myResourceGroup --location eastus2
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--generate-ssh-keys \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--node-count 3 \
--zones 1 2 3
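Once the cluster is created, you can confirm that the agent nodes are spread across the three zones by inspecting the node labels:
# List the zone label assigned to each agent node
kubectl describe nodes | grep -e "Name:" -e "topology.kubernetes.io/zone"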
When a single node pool spans three availability zones, it is important to use Zone Redundant Storage for managed disks (ZRS) for persistent volumes. If a pod replica attaches a persistent volume in one availability zone and is then rescheduled to another availability zone, the pod cannot reattach a managed disk that uses locally redundant storage (LRS). To prevent this issue, configure persistent volume claims to use a storage class that is set up to use ZRS managed disks. By doing so, you ensure the persistence and availability of your data across availability zones. For more information on persistent volume claims, refer to the official Kubernetes documentation.
To effectively use the cluster autoscaler with node pools spanning multiple zones and take advantage of zone-related scheduling features such as volume topology-aware scheduling, we recommend creating a separate node pool for each availability zone. Additionally, enabling the balance-similar-node-groups setting in the autoscaler profile ensures that the autoscaler can scale up while keeping the node pools balanced in size. This approach allows for optimal scaling and resource distribution within your AKS cluster. When using the AKS cluster autoscaler with node pools spanning multiple availability zones, keep the following considerations in mind:
- For node pools that use Azure Storage for persistent volumes, it is advisable to create a separate node pool for each zone, because persistent volumes cannot be used across zones unless they are backed by Zone Redundant Storage for managed disks (ZRS). This way, each new virtual machine in the node pool can attach its respective persistent volumes.
- If multiple node pools are created, one per zone, it is recommended to enable the balance-similar-node-groups setting in the autoscaler profile, as shown in the example after this list. This feature helps the autoscaler identify similar node pools and keep the number of nodes balanced across them.
- If you are not using persistent volumes, the cluster autoscaler works without any issues with node pools that span multiple availability zones.
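For example, the balance-similar-node-groups setting can be enabled on an existing cluster through the cluster autoscaler profile. A sketch using the cluster and resource group names from the earlier example:
# Enable balanced scaling across similar node pools via the cluster autoscaler profile
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --cluster-autoscaler-profile balance-similar-node-groups=true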
Spread Pods across Zones using Pod Topology Spread Constraints
When deploying pods to an AKS cluster that spans multiple availability zones, it is essential to ensure optimal distribution and resilience. To achieve this, you can use the Pod Topology Spread Constraints Kubernetes feature, which gives you granular control over how pods are spread across failure domains such as regions, availability zones, and nodes. Specifically, you can create constraints that spread pod replicas across availability zones, as well as across different nodes within a single availability zone. Spreading pod replicas across availability zones keeps your application available even if an entire zone goes down, while distributing pods across different nodes within a zone minimizes the impact of node failures or maintenance activities. Together, these constraints maximize the resilience and availability of your applications, optimize resource utilization, and minimize downtime across multiple availability zones and nodes.
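The 05-deploy-harbor-via-helm-across-azs.sh script shown later sets these constraints for every Harbor service through --set flags. The same configuration can be sketched as a Helm values override; the excerpt below covers only the portal service and mirrors the zone and hostname constraints used in the script:
# topology-spread-values.yaml (excerpt): spread portal replicas across zones and nodes
portal:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
Passing a file like this to helm upgrade with the -f option is equivalent to the corresponding --set flags in the script.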
Custom ZRS Storage Classes
The official Harbor Helm Chart provides flexibility in defining storage classes for provisioning persistent volumes used by each service. There are built-in storage classes available for managed disks and Azure Files:
- For managed disks:
  - managed-csi: Uses Azure Standard SSD locally redundant storage (LRS) to create a managed disk.
  - managed-csi-premium: Uses Azure Premium LRS to create a managed disk.
- For Azure Files:
  - azurefile-csi: Uses Azure Standard Storage to create an Azure file share.
  - azurefile-csi-premium: Uses Azure Premium Storage to create an Azure file share.
While these built-in storage classes are suitable for most scenarios, they use the Standard_LRS and Premium_LRS SKUs, which employ Locally Redundant Storage (LRS). However, to schedule pods and their persistent volumes across availability zones in a zone-redundant AKS cluster, you need to use Zone Redundant Storage for managed disks (ZRS) for persistent volumes.
Zone Redundant Storage (ZRS) synchronously replicates your Azure managed disk across three availability zones within your selected region. Each availability zone is a separate physical location with independent power, cooling, and networking. With ZRS disks, you benefit from at least 99.9999999999% (12 9's) of durability over a year and the ability to recover from failures in availability zones. In case a zone goes down, a ZRS disk can be attached to a virtual machine (VM) in a different zone.
To create a custom storage class using StandardSSD_ZRS or Premium_ZRS managed disks, you can use the following example:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi-premium-zrs
provisioner: disk.csi.azure.com
parameters:
  skuname: Premium_ZRS
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
For more information on the parameters for the Azure Disk CSI Driver, refer to the Azure Disk CSI Driver Parameters documentation.
Similarly, you can create a storage class using the Azure Files CSI Driver with the Standard_ZRS, Standard_RAGZRS, and Premium_ZRS storage options, ensuring that data copies are stored across different zones within a region:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-csi-premium-zrs
mountOptions:
  - mfsymlinks
  - actimeo=30
parameters:
  skuName: Premium_ZRS
  enableLargeFileShares: "true"
provisioner: file.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
For more information about the parameters for the Azure Files CSI Driver, refer to the Azure File CSI Driver Parameters documentation.
Harbor with ZRS Azure Files and Managed Disks
The official Harbor Helm Chart provides flexibility in defining Pod Topology Spread Constraints for each service. The 05-deploy-harbor-via-helm-across-azs.sh script deploys Harbor to a zone-redundant cluster using Zone Redundant Storage (ZRS) for both managed disks and Azure Files. The diagram below represents the deployment architecture.
In this setup, the Trivy, Job Service, and Registry services utilize an Azure File Share as the persistent volume for storing data. The AKS cluster identity creates a storage account with the kind and SKU specified in the storage class within the node resource group associated with the cluster. It then creates a file share for each persistent volume claim, with the size specified for each service. Note that the minimum size for a premium file share is 100 GiB. For detailed information, see Azure Files scalability and performance targets. The Database and Redis services, on the other hand, utilize an Azure Managed Disk as the persistent volume for storing data.
#!/bin/bash
# Variables
source ./00-variables.sh
# Check if the managed-csi-premium-zrs storage class exists
result=$(kubectl get storageclass -o jsonpath="{.items[?(@.metadata.name=='$managedCsiPremiumZrsStorageClassName')].metadata.name}")
if [[ -n $result ]]; then
echo "$managedCsiPremiumZrsStorageClassName storage class already exists in the cluster"
else
echo "$managedCsiPremiumZrsStorageClassName storage class does not exist in the cluster"
echo "creating $managedCsiPremiumZrsStorageClassName storage class in the cluster..."
kubectl apply -f ./managed-csi-premium-zrs.yml
fi
# Check if the azurefile-csi-premium-zrs storage class exists
result=$(kubectl get storageclass -o jsonpath="{.items[?(@.metadata.name=='$azureFilePremiumZrsStorageClassName')].metadata.name}")
if [[ -n $result ]]; then
echo "$azureFilePremiumZrsStorageClassName storage class already exists in the cluster"
else
echo "$azureFilePremiumZrsStorageClassName storage class does not exist in the cluster"
echo "creating $azureFilePremiumZrsStorageClassName storage class in the cluster..."
kubectl apply -f ./azurefile-csi-premium-zrs.yml
fi
# Check if the Harbor repository is not already added
result=$(helm repo list | grep $harborRepoName | awk '{print $1}')
if [[ -n $result ]]; then
echo "[$harborRepoName] Helm repo already exists"
else
# Add the Harbor repository
echo "Adding [$harborRepoName] Helm repo..."
helm repo add $harborRepoName $harborRepoUrl
fi
# Update your local Helm chart repository cache
echo 'Updating Helm repos...'
helm repo update
# Use Helm to deploy Harbor
result=$(helm list -n $harborNamespace | grep $harborReleaseName | awk '{print $1}')
if [[ -n $result ]]; then
echo "[$harborReleaseName] release already exists in the [$harborNamespace] namespace"
# Upgrade Harbor
echo "Upgrading [$harborReleaseName] release in the [$harborNamespace] namespace..."
helm upgrade $harborReleaseName $harborRepoName/$harborChartName \
--install \
--create-namespace \
--namespace $harborNamespace \
--set externalURL=https://$harborHostname \
--set harborAdminPassword=$harborAdminPassword \
--set expose.type=$harborExposeType \
--set expose.tls.certSource=secret \
--set expose.tls.secret.secretName=$harborTlsSecretName \
--set expose.ingress.hosts.core=$harborHostname \
--set expose.ingress.className=$nginxIngressClassName \
--set expose.ingress.annotations."cert-manager\.io/cluster-issuer"=$certManagerClusterIssuer \
--set expose.ingress.annotations."cert-manager\.io/acme-challenge-type"=$certManagerAcmeChallengeType \
--set persistence.enabled=true \
--set persistence.persistentVolumeClaim.registry.storageClass=$azureFilePremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.registry.accessMode=ReadWriteMany \
--set persistence.persistentVolumeClaim.registry.size=$registrySize \
--set persistence.persistentVolumeClaim.jobservice.jobLog.storageClass=$azureFilePremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.jobservice.accessMode=ReadWriteMany \
--set persistence.persistentVolumeClaim.jobservice.jobLog.size=$jobserviceJobLogSize \
--set persistence.persistentVolumeClaim.database.storageClass=$managedCsiPremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.database.size=$databaseSize \
--set persistence.persistentVolumeClaim.redis.storageClass=$managedCsiPremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.redis.size=$redisSize \
--set persistence.persistentVolumeClaim.trivy.storageClass=$azureFilePremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.trivy.accessMode=ReadWriteMany \
--set persistence.persistentVolumeClaim.trivy.size=$trivySize \
--set portal.replicas=$portalReplicaCount \
--set core.replicas=$coreReaplicaCount \
--set jobservice.replicas=$jobserviceReplicaCount \
--set registry.replicas=$registryReplicaCount \
--set trivy.replicas=$trivyReplicaCount \
--set portal.topologySpreadConstraints[0].maxSkew=1 \
--set portal.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set portal.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set portal.topologySpreadConstraints[1].maxSkew=1 \
--set portal.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set portal.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set core.topologySpreadConstraints[0].maxSkew=1 \
--set core.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set core.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set core.topologySpreadConstraints[1].maxSkew=1 \
--set core.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set core.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set jobservice.topologySpreadConstraints[0].maxSkew=1 \
--set jobservice.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set jobservice.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set jobservice.topologySpreadConstraints[1].maxSkew=1 \
--set jobservice.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set jobservice.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set registry.topologySpreadConstraints[0].maxSkew=1 \
--set registry.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set registry.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set registry.topologySpreadConstraints[1].maxSkew=1 \
--set registry.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set registry.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set trivy.topologySpreadConstraints[0].maxSkew=1 \
--set trivy.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set trivy.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set trivy.topologySpreadConstraints[1].maxSkew=1 \
--set trivy.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set trivy.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set exporter.topologySpreadConstraints[0].maxSkew=1 \
--set exporter.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set exporter.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set exporter.topologySpreadConstraints[1].maxSkew=1 \
--set exporter.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set exporter.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set persistence.persistentVolumeClaim.registry.size=$registrySize \
--set persistence.persistentVolumeClaim.jobservice.size=$jobserviceJobLogSize \
--set persistence.persistentVolumeClaim.database.size=$databaseSize \
--set persistence.persistentVolumeClaim.redis.size=$redisSize \
--set persistence.persistentVolumeClaim.trivy.size=$trivySize
else
# Install Harbor
echo "Installing [$harborReleaseName] release in the [$harborNamespace] namespace..."
helm install $harborReleaseName $harborRepoName/$harborChartName \
--create-namespace \
--namespace $harborNamespace \
--set externalURL=https://$harborHostname \
--set harborAdminPassword=$harborAdminPassword \
--set expose.type=$harborExposeType \
--set expose.tls.certSource=secret \
--set expose.tls.secret.secretName=$harborTlsSecretName \
--set expose.ingress.hosts.core=$harborHostname \
--set expose.ingress.className=$nginxIngressClassName \
--set expose.ingress.annotations."cert-manager\.io/cluster-issuer"=$certManagerClusterIssuer \
--set expose.ingress.annotations."cert-manager\.io/acme-challenge-type"=$certManagerAcmeChallengeType \
--set persistence.enabled=true \
--set persistence.persistentVolumeClaim.registry.storageClass=$azureFilePremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.registry.accessMode=ReadWriteMany \
--set persistence.persistentVolumeClaim.registry.size=$registrySize \
--set persistence.persistentVolumeClaim.jobservice.jobLog.storageClass=$azureFilePremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.jobservice.accessMode=ReadWriteMany \
--set persistence.persistentVolumeClaim.jobservice.jobLog.size=$jobserviceJobLogSize \
--set persistence.persistentVolumeClaim.database.storageClass=$managedCsiPremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.database.size=$databaseSize \
--set persistence.persistentVolumeClaim.redis.storageClass=$managedCsiPremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.redis.size=$redisSize \
--set persistence.persistentVolumeClaim.trivy.storageClass=$azureFilePremiumZrsStorageClassName \
--set persistence.persistentVolumeClaim.trivy.accessMode=ReadWriteMany \
--set persistence.persistentVolumeClaim.trivy.size=$trivySize \
--set portal.replicas=$portalReplicaCount \
--set core.replicas=$coreReaplicaCount \
--set jobservice.replicas=$jobserviceReplicaCount \
--set registry.replicas=$registryReplicaCount \
--set trivy.replicas=$trivyReplicaCount \
--set portal.topologySpreadConstraints[0].maxSkew=1 \
--set portal.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set portal.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set portal.topologySpreadConstraints[1].maxSkew=1 \
--set portal.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set portal.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set core.topologySpreadConstraints[0].maxSkew=1 \
--set core.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set core.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set core.topologySpreadConstraints[1].maxSkew=1 \
--set core.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set core.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set jobservice.topologySpreadConstraints[0].maxSkew=1 \
--set jobservice.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set jobservice.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set jobservice.topologySpreadConstraints[1].maxSkew=1 \
--set jobservice.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set jobservice.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set registry.topologySpreadConstraints[0].maxSkew=1 \
--set registry.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set registry.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set registry.topologySpreadConstraints[1].maxSkew=1 \
--set registry.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set registry.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set trivy.topologySpreadConstraints[0].maxSkew=1 \
--set trivy.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set trivy.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set trivy.topologySpreadConstraints[1].maxSkew=1 \
--set trivy.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set trivy.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set exporter.topologySpreadConstraints[0].maxSkew=1 \
--set exporter.topologySpreadConstraints[0].whenUnsatisfiable=ScheduleAnyway \
--set exporter.topologySpreadConstraints[0].topologyKey=topology.kubernetes.io/zone \
--set exporter.topologySpreadConstraints[1].maxSkew=1 \
--set exporter.topologySpreadConstraints[1].whenUnsatisfiable=ScheduleAnyway \
--set exporter.topologySpreadConstraints[1].topologyKey=kubernetes.io/hostname \
--set persistence.persistentVolumeClaim.registry.size=$registrySize \
--set persistence.persistentVolumeClaim.jobservice.size=$jobserviceJobLogSize \
--set persistence.persistentVolumeClaim.database.size=$databaseSize \
--set persistence.persistentVolumeClaim.redis.size=$redisSize \
--set persistence.persistentVolumeClaim.trivy.size=$trivySize
fi
# Get values
helm get values $harborReleaseName --namespace $harborNamespace
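After the chart is installed or upgraded, you may want to quickly verify that the release is healthy. A minimal check, reusing the namespace variable defined in the 00-variables.sh file sourced above:
# Verify that all the Harbor pods are up and running
kubectl get pods --namespace $harborNamespace
# Verify that the persistent volume claims are bound to the expected storage classes
kubectl get pvc --namespace $harborNamespace
# Inspect the ingress created by the Harbor chart
kubectl get ingress --namespace $harborNamespace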
Network Topologies in Azure
In a multi-region scenario where multiple AKS clusters are deployed, it is crucial to consider the network topology to optimize performance, cost, and security. Relying on a single Harbor instance located in a single AKS cluster within a region can lead to latency issues, increased bandwidth usage, and higher network costs. Moreover, such a setup creates a single point of failure and does not provide the necessary business continuity.
To address these concerns, it is recommended to establish a distributed environment with a main Harbor instance serving as the central source of truth for publishing images and artifacts. Multiple peripheral Harbor instances can be deployed to create copies of these container images and artifacts via pull or push replication rules. This architecture enhances reliability and ensures that the artifacts are available across different regions.
To achieve this, the AKS clusters hosting the main Harbor instance and the Harbor replica instances can be connected using various networking approaches. These include:
Public IP Addresses
: Exposing the Harbor instances via the public internet. By configuring the Harbor instances with public IP addresses and exposing them directly to the internet, workloads from different AKS clusters can communicate with each other. However, it's important to note that this option may introduce potential security risks and can be less optimal in terms of latency, bandwidth usage, and network costs compared to other network topologies.Virtual Network Peering
: Connecting the virtual networks hosting the AKS clusters using Azure Virtual Network peering. This allows communication over the Azure backbone network, reducing latency and enhancing security.Azure Private Link
: Exposing the Harbor instances via Azure Private Link, enabling communication between the AKS clusters privately, without the need for public IP addresses or traversing the public internet. This approach ensures secure and efficient connectivity.
By adopting a distributed Harbor architecture with appropriate network topologies, you can enhance reliability, reduce latency, and improve the overall performance of your multi-region environment. It also mitigates the risk of a single point of failure and provides the necessary business continuity for your CI/CD or GitOps workflows.
Harbor Instances communicating via Public IP Address
Harbor instances in different AKS clusters can be exposed via public IP addresses. This network topology involves utilizing an ingress controller, such as the NGINX Ingress controller or the Application Gateway for Containers, along with the public load balancer of each AKS cluster.
The following architecture diagram illustrates this network topology:
When a Harbor instance running on an AKS cluster in one region communicates with another Harbor instance hosted by another AKS cluster, in the same or a different region, via a public IP address, the communication typically occurs over the public internet. In this scenario, the communication between the two AKS clusters traverses the external network, utilizing public IP addresses and potentially going through various network providers and routers before reaching the destination cluster.
It's important to note that while Microsoft owns and operates a robust global network backbone, called the Azure backbone network, which provides fast and reliable connectivity within the Azure ecosystem, communication between AKS clusters across regions through public LoadBalancer
Kubernetes services typically does not utilize this internal backbone network. To ensure secure and reliable communication between AKS clusters across regions, you might consider using alternative methods such as Azure Virtual Network peering or Azure Private Link Service connections. These options provide private network connections over the Azure backbone network, offering lower latency and enhanced security compared to traversing the public internet.
Harbor Instances communicating via Private IP Address
You can use a network topology using private IP addresses to let two or more Harbor instances located in different AKS clusters communicate with each other over a private and secure channel. This approach involves configuring ingress controllers to utilize the internal Azure Standard Load Balancer specific to each AKS cluster. The network architecture diagram below illustrates this configuration.
To establish this network topology, the virtual networks hosting the AKS clusters need to be connected using Azure Virtual Network peering. Virtual Network peering allows for seamless connection of two or more Azure Virtual Networks, enabling them to function as a single network for connectivity purposes. Traffic between virtual machines in peered virtual networks is routed exclusively through the Microsoft backbone network, ensuring secure and efficient communication.
In this network topology, a global Azure Private DNS Zone acts as a reliable and secure DNS service, managing domain name resolution across multiple virtual networks. By associating Private DNS zones with the virtual networks hosting the AKS clusters, you can utilize A records to resolve the hostname of the Harbor instances to their corresponding private IP addresses assigned to the ingress controllers.
Implementing Harbor communication via private IP addresses offers enhanced security and performance compared to the public internet. By leveraging Azure Virtual Network peering and Azure Private DNS zones, you can establish a resilient network infrastructure for seamless communication between Harbor instances in different AKS clusters.
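The following Azure CLI sketch illustrates the building blocks of this topology. The resource group, virtual network, zone, and IP address values are hypothetical placeholders; the A record should point to the private IP address assigned to the ingress controller of the AKS cluster hosting the Harbor instance, and the peering must be repeated in the opposite direction (use the full resource ID for the remote virtual network if it lives in another resource group or subscription).
#!/bin/bash
# Hypothetical names and values: replace them with the ones of your environment
resourceGroupName="HarborRG"
firstVnetName="AksVnetOne"
secondVnetName="AksVnetTwo"
privateDnsZoneName="contoso.com"
harborIngressPrivateIp="10.240.0.10"
# Peer the two virtual networks hosting the AKS clusters (repeat in the opposite direction)
az network vnet peering create \
  --name "$firstVnetName-to-$secondVnetName" \
  --resource-group $resourceGroupName \
  --vnet-name $firstVnetName \
  --remote-vnet $secondVnetName \
  --allow-vnet-access true
# Create a global Azure Private DNS zone and link it to both virtual networks
az network private-dns zone create \
  --resource-group $resourceGroupName \
  --name $privateDnsZoneName
for vnetName in $firstVnetName $secondVnetName; do
  az network private-dns link vnet create \
    --resource-group $resourceGroupName \
    --zone-name $privateDnsZoneName \
    --name "$vnetName-link" \
    --virtual-network $vnetName \
    --registration-enabled false
done
# Resolve the Harbor hostname to the private IP address of the ingress controller
az network private-dns record-set a add-record \
  --resource-group $resourceGroupName \
  --zone-name $privateDnsZoneName \
  --record-set-name harbor \
  --ipv4-address $harborIngressPrivateIp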
Harbor Instances communicating via Azure Private Link
To establish secure and private communication between Harbor instances in different AKS clusters, you can configure them to communicate via Azure Private Link Service connections. The diagram below illustrates this network topology.
In this network topology, each Harbor instance is exposed through an Azure Private Link Service, utilizing the internal Azure Standard Load Balancer specific to the respective AKS cluster. This setup eliminates the need for Azure Virtual Network peering to connect the virtual networks hosting the AKS clusters. Instead, each Harbor instance uses an Azure Private Endpoint within the AKS virtual network to establish secure connectivity to the Private Link Service of another Harbor instance.
To enable DNS resolution across multiple virtual networks, a global Azure Private DNS Zone is utilized. This DNS zone provides a reliable and secure DNS service to manage and resolve domain names. By associating the Private DNS zones with the virtual networks hosting the AKS clusters, you can utilize an A record to resolve the name of a remote Harbor Private Link Service to the private IP address of the respective local Private Endpoint.
The use of Azure Private Link ensures secure and direct communication between Harbor instances, keeping the traffic within the Azure backbone network. This topology enhances security and performance, providing a reliable and efficient network infrastructure for the communication between Harbor instances in different AKS clusters.
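As a sketch of the Private Link approach, assuming a Private Link Service has already been created in the remote cluster's virtual network in front of the internal load balancer of its ingress controller, and using hypothetical resource names:
#!/bin/bash
# Hypothetical names and values: replace them with the ones of your environment
resourceGroupName="HarborRG"
vnetName="AksVnetOne"
subnetName="PrivateEndpointSubnet"
privateDnsZoneName="contoso.com"
# Resource ID of the Private Link Service exposing the remote Harbor instance
privateLinkServiceId="/subscriptions/<subscription-id>/resourceGroups/<remote-rg>/providers/Microsoft.Network/privateLinkServices/<pls-name>"
# Create a private endpoint in the local AKS virtual network
az network private-endpoint create \
  --name harbor-replica-pe \
  --resource-group $resourceGroupName \
  --vnet-name $vnetName \
  --subnet $subnetName \
  --private-connection-resource-id "$privateLinkServiceId" \
  --connection-name harbor-replica-connection
# Retrieve the private IP address assigned to the private endpoint network interface
nicId=$(az network private-endpoint show \
  --name harbor-replica-pe \
  --resource-group $resourceGroupName \
  --query "networkInterfaces[0].id" \
  --output tsv)
privateIp=$(az network nic show \
  --ids $nicId \
  --query "ipConfigurations[0].privateIPAddress" \
  --output tsv)
# Resolve the remote Harbor hostname to the private endpoint IP in the global Private DNS zone
az network private-dns record-set a add-record \
  --resource-group $resourceGroupName \
  --zone-name $privateDnsZoneName \
  --record-set-name harbor-replica \
  --ipv4-address $privateIp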
Working with Harbor
In Harbor, a project represents a collection of container images. Before pushing images to Harbor, a project must be created. Role-Based Access Control (RBAC) is applied to projects, ensuring that only users with specific roles can perform certain operations.
Harbor supports two types of projects:
Public
: This type allows any user to pull images from the project, making it convenient for sharing repositories with others.Private
: Only project members can pull images from a private project, enhancing privacy and access control.
By default, when deploying Harbor, a public project named library
is created. However, you can create additional projects and assign users to them, enabling them to push and pull image repositories. User access to images is determined by their roles within the project.
The following roles are available in Harbor:
Limited Guest
: Limited Guests have restricted read privileges. They can only pull images and are unable to push, view logs, or see other project members. This role is useful for granting limited access to users from different organizations.Guest
: Guests have read-only privileges for a specified project. They can pull and retag images but cannot push.Developer
: Developers have read and write privileges for a project.Maintainer
: Maintainers have elevated permissions beyond developers. They can scan images, view replication jobs, and delete images and Helm charts.ProjectAdmin
: When creating a new project, the creator is assigned theProjectAdmin
role. In addition to read-write privileges, ProjectAdmins have management capabilities such as adding and removing members and initiating vulnerability scans.
The image below illustrates the RBAC roles in Harbor:
By assigning appropriate roles to users in different projects, you can control their access to image repositories and manage their permissions effectively. For more information on Projects, see the official Harbor documentation.
Create a New Project
Assuming that you have successfully deployed Harbor to your Azure Kubernetes Service (AKS) cluster and exposed it either inside or outside the virtual network using an ingress controller, such as the NGINX Ingress controller or the Application Gateway for Containers, log in using the administrator credentials. Once logged in, you will see the Projects
page.
You can proceed as follows to create a new Project
:
- Log in to the Harbor interface with an account that has Harbor system administrator privileges.
- Go to
Projects
and clickNew Project
. - Provide a name for the project, for example
private
. - You have the option to make the project
Public
by checking the corresponding checkbox. If the project is set toPublic
, any user can pull images from it. If you leave the project asPrivate
, only members of the project can pull images. - Enable the
Proxy Cache
option if you want this project to function as a pull-through cache for a specific target registry instance. Note that Harbor can only act as a proxy for DockerHub, Docker Registry, Harbor itself, AWS ECR, Azure ACR, Google GCR, GitHub GHCR, and JFrog Artifactory registries. - Click
OK
.
Once the project is created, you can explore various sections such as the summary, repositories, helm charts, members, labels, scanner, P2P preheat, policy, robot accounts, logs, and configuration using the navigation tabs provided. These sections allow you to manage and configure your projects in Harbor, providing granular control over access, scanning, caching, and more.
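If you prefer automation over the portal, you can create the same project through the Harbor REST API. This is a minimal sketch that assumes the harbor.contoso.com hostname and the admin credentials used elsewhere in this article:
# Create a private project named "private" via the Harbor REST API
curl -u "admin:$harborAdminPassword" \
  -X POST "https://harbor.contoso.com/api/v2.0/projects" \
  -H "Content-Type: application/json" \
  -d '{"project_name": "private", "metadata": {"public": "false"}}'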
Create a New User
Proceed as follows to create a new user:
- Log in to the Harbor interface with an account that has Harbor system administrator privileges.
- Under
Administration
, go toUsers
. - Click
New User
. - Enter information about the new user.
- The username must be unique in the Harbor system
- The email address must be unique in the Harbor system
- The password must contain at least 8 characters with 1 lowercase letter, 1 uppercase letter and 1 numeric character
If users forget their password, they need to ask the administrator to reset their password. Optionally, you can select a user, then click the Set as Admin
option to assign the administrator role to the user.
You can proceed as follows to assign a user a role on a Project
:
- Go to
Projects
and select a project. - Select the
Members
tab.
- Click
User
. - Specify the name of an existing user and select a role. For more information, see User Permissions By Role.
- Click
OK
.
Depending on the selected role, the user will be able to see or edit the repositories in the project, pull or push images, and so on. On public projects, all users can view the list of repositories, images, image vulnerabilities, Helm charts, and Helm chart versions; pull images; retag images (push permission is required on the destination image); and download Helm charts and Helm chart versions.
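Users and project members can also be managed through the Harbor REST API. The sketch below is illustrative: the user name, password, and role ID are hypothetical, and depending on the Harbor version you may need the X-Is-Resource-Name: true header to reference the project by name rather than by ID.
# Create a new Harbor user (the password must meet the complexity rules described above)
curl -u "admin:$harborAdminPassword" \
  -X POST "https://harbor.contoso.com/api/v2.0/users" \
  -H "Content-Type: application/json" \
  -d '{"username": "paolos", "email": "paolos@contoso.com", "realname": "Paolo S", "password": "ChangeMe1!"}'
# Add the user to the private project as a Developer (role_id 2)
curl -u "admin:$harborAdminPassword" \
  -X POST "https://harbor.contoso.com/api/v2.0/projects/private/members" \
  -H "Content-Type: application/json" \
  -H "X-Is-Resource-Name: true" \
  -d '{"role_id": 2, "member_user": {"username": "paolos"}}'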
Working with Images in Harbor
In this section, you will learn how to efficiently pull and push container images to Harbor.
Pushing Images to Harbor
For simplicity, let's assume that the Harbor registry is publicly exposed at harbor.contoso.com
. Follow these steps to publish a container image to Harbor:
- Run the following command to build the container image:
# Build the docker image
docker build -t flaskapp:1.0 -f ./app/Dockerfile ./app
- To access a private project, you need to sign in using the credentials of a Harbor user who has been assigned a role with the
Push Image
permission for the specific library containing the image.
# Login to Harbor
docker login harbor.contoso.com -u <your-harbor-username> -p <your-harbor-password>
- Tag the Docker image using the format
harbor-hostname/harbor-project/image-name:image-tag
# Tag the docker image
docker tag flaskapp:1.0 harbor.contoso.com/private/flaskapp:1.0
- Push the image.
# Push the docker image
docker push harbor.contoso.com/private/flaskapp:1.0
To push Windows images to your Harbor instance, you must also configure your Docker daemon with the allow-nondistributable-artifacts
setting for the registry. For more information, see Pushing Windows Images.
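As a hedged example, the Docker daemon configuration for the registry hostname used in this article might look like the snippet below. Note that this sketch overwrites /etc/docker/daemon.json; if the file already contains other settings, merge the key into it instead.
# Allow pushing Windows (foreign/nondistributable) layers to the Harbor registry
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "allow-nondistributable-artifacts": ["harbor.contoso.com"]
}
EOF
# Restart the Docker daemon to apply the change
sudo systemctl restart docker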
Follow these steps to copy the push command for an image in Harbor:
- Log in to the Harbor interface with an account that has Harbor system administrator privileges.
- Navigate to the
Projects
section and click on the link to the project that contains the image. - Locate the
Push Command
option. - Copy the push command in the desired format to the clipboard.
These steps will enable you to easily access and use the push command for the image in Harbor.
Pulling Images from Harbor
Follow these steps to pull an image from Harbor:
- To access a private project, you need to sign in using the credentials of a Harbor user who has been assigned a role with the
Pull Image
permission for the specific library containing the image.
# Login to Harbor
docker login harbor.contoso.com -u <your-harbor-username> -p <your-harbor-password>
- Pull the image from Harbor:
# Pull the docker image from Harbor
docker pull harbor.contoso.com/private/flaskapp:1.0
If you don't know the exact tag of the container image to pull, you can proceed as follows:
- Log in to the Harbor interface with an account that has Harbor system administrator privileges.
- Navigate to the
Projects
section and click on the link to the project that contains the image. - Click the container image link.
- Select a container image version.
- Click
Copy Pull Command
. - Click
Docker
orPodman
to copy the pull command in the desired format to the clipboard.
Using an Image from Harbor
In order to deploy a workload to your Azure Kubernetes Service (AKS) cluster that uses a container image from a private project in Harbor, you need to follow these steps. If you are not familiar with pulling images from a private registry, you can refer to the Kubernetes documentation for more information.
Log in to Harbor
On your laptop, you must authenticate with Harbor in order to pull an image from a private library. Use the docker
tool to log in to Harbor. For more information, see Docker ID accounts. You need to sign in using the credentials of a Harbor user who has been assigned a role with the Pull Image
permission for the specific library containing the image.
# Login to Harbor
docker login harbor.contoso.com -u <your-harbor-username> -p <your-harbor-password>
The login process creates or updates a config.json
file that contains an authorization token. For more information, see Kubernetes Interpretation of config.json. You can view the contents of the config.json
file by running the following command:
cat ~/.docker/config.json
The output contains a section similar to this:
{
    "auths": {
        "harbor.contoso.com": {
            "auth": "c3R...zE2"
        }
    }
}
Note:
If you use a Docker credentials store, you won't see that auth
entry but a credsStore
entry with the name of the store as value. In that case, you can create a secret directly. See Create a Secret by providing credentials on the command line.
Create a Secret based on existing credentials
A Kubernetes cluster uses the Secret of kubernetes.io/dockerconfigjson
type to authenticate with a private container registry to pull a container image. If you already ran docker login
, you can copy that credential into Kubernetes:
kubectl create secret generic regcred \
--from-file=.dockerconfigjson=<path/to/.docker/config.json> \
--type=kubernetes.io/dockerconfigjson
If you need more control (for example, to set a namespace or a label on the new secret) then you can customize the secret as follows before storing it.
- Set the name of the data item to
.dockerconfigjson
. - Use
base64
to encode the Docker configuration file and then paste that string, unbroken, as the value for the field data[".dockerconfigjson"]
. - Set
type
tokubernetes.io/dockerconfigjson
.
Here is an example:
apiVersion: v1
kind: Secret
metadata:
  name: myregistrykey
  namespace: awesomeapps
data:
  .dockerconfigjson: UmVhbGx5IHJlYWxseSByZWVlZWVlZWVlZWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWxsbGxsbGxsbGxsbGxsbGxsbGxsbGxsbGxsbGxsbGx5eXl5eXl5eXl5eXl5eXl5eXl5eSBsbGxsbGxsbGxsbGxsbG9vb29vb29vb29vb29vb29vb29vb29vb29vb25ubm5ubm5ubm5ubm5ubm5ubm5ubm5ubmdnZ2dnZ2dnZ2dnZ2dnZ2dnZ2cgYXV0aCBrZXlzCg==
type: kubernetes.io/dockerconfigjson
If you get the error message error: no objects passed to create
, it may mean the base64 encoded string is invalid. If you get an error message like Secret "myregistrykey" is invalid: data[.dockerconfigjson]: invalid value ...
, it means the base64 encoded string in the data was successfully decoded, but could not be parsed as a .docker/config.json
file.
Create a Secret by providing credentials on the command line
Create this Secret, naming it regcred
:
kubectl create secret docker-registry regcred \
--namespace <your-workload-namespace> \
--docker-server=<your-harbor-hostname> \
--docker-username=<your-harbor-username> \
--docker-password=<your-harbor-password> \
--docker-email=<your-email>
where:
<your-workload-namespace>
is the workload namespace.<your-harbor-hostname>
is your hostname of your Harbor registry, for exampleharbor.contoso.com
.<your-harbor-username>
is your Harbor username.<your-harbor-password>
is your Harbor password.<your-email>
is your Harbor user email.
You have successfully set your Harbor credentials in the cluster as a secret called regcred
.
Note:
Typing secrets on the command line may store them in your shell history unprotected, and those secrets might also be visible to other users on your PC during the time that kubectl
is running.
Inspecting the regcred
Secret
To understand the contents of the regcred
secret you created, start by viewing the Secret in YAML format:
kubectl get secret regcred --namespace <your-workload-namespace> --output yaml
The output is similar to this:
apiVersion: v1
data:
  .dockerconfigjson: eyJhdXRocyI6eyJ0YW5oYXJib3IuYmFib3NiaXJkL...=
kind: Secret
metadata:
  creationTimestamp: "2023-11-24T10:17:42Z"
  name: regcred
  namespace: flaskapp
  resourceVersion: "73013805"
  uid: 42514bad-c471-4083-bf74-b38a1e8010b9
type: kubernetes.io/dockerconfigjson
The value of the .dockerconfigjson
field is a base64 representation of your Harbor credentials in Docker format. To understand what is in the .dockerconfigjson
field, convert the secret data to a readable format:
kubectl get secret regcred --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
The output is similar to this:
{"auths":{"harbor.contoso.com":{"username":"paolos","password":"xxxxxxxxxx","email":"paolos@contoso.com","auth":"YWRta...aW4="}}}
To understand what is in the auth
field, convert the base64-encoded data to a readable format:
echo "c3R...zE2" | base64 --decode
The output, username and password concatenated with a :
, is similar to this:
paolos:xxxxxxxxxxx
Notice that the Secret data contains the authorization token similar to your local ~/.docker/config.json
file. You have successfully set your Harbor credentials as a Secret called regcred
in the cluster.
Create a Deployment that uses a Harbor image
Below you can see the YAML manifest of a Kubernetes deployment that pulls a container image from a private library of a Harbor registry using the regcred
secret.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flaskapp
  labels:
    app: flaskapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: flaskapp
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: flaskapp
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: flaskapp
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: flaskapp
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
      - name: flaskapp
        image: harbor.contoso.com/private/flaskapp:1.0
        imagePullPolicy: Always
        resources:
          requests:
            memory: "32Mi"
            cpu: "100m"
          limits:
            memory: "64Mi"
            cpu: "200m"
        ports:
        - containerPort: 8888
        livenessProbe:
          httpGet:
            path: /
            port: 8888
          failureThreshold: 1
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 5
        readinessProbe:
          httpGet:
            path: /
            port: 8888
          failureThreshold: 1
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 5
        startupProbe:
          httpGet:
            path: /
            port: 8888
          failureThreshold: 1
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 5
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: SERVICE_NAME
          value: flaskapp
      imagePullSecrets:
      - name: regcred
To pull the image from a private registry, Kubernetes needs credentials. The imagePullSecrets
field in the YAML manifest specifies that Kubernetes should get the credentials from a secret named regcred
. Create a Kubernetes deployment that uses the regcred
secret located in the same namespace, and verify that its pods are running:
kubectl apply -f deployment.yaml --namespace <your-workload-namespace>
In case the pods or your deployment fail to start with the status ImagePullBackOff
, view the Pod events:
kubectl describe pod flaskapp-7f5df9c48d-dhn67
If you then see an event with the reason set to FailedToRetrieveImagePullSecret
, Kubernetes can't find a secret with the given name, regcred
in this example. If you specify that a pod needs image pull credentials, the kubelet checks that it can access that secret before attempting to pull the image. Make sure that the secret you have specified exists, and that its name is spelled properly.
Events:
... Reason ... Message
------ -------
... FailedToRetrieveImagePullSecret ... Unable to retrieve some image pull secrets (<regcred>); attempting to pull the image may not succeed.
Create a Registry Endpoint in Harbor
Harbor registries can be used to replicate images with other Harbor instances or external container registries. Replication enables synchronization of container images between multiple registries, providing redundancy, disaster recovery, and load balancing. Here's how you can use Harbor to replicate images:
Create Replication Endpoints
: You start by creating replication endpoints in Harbor. These endpoints define the external registries or other Harbor instances that you want to replicate images with. You can specify the type of registry (Harbor or non-Harbor) and provide the necessary authentication details.Configure Replication Policies
: After creating the endpoints, you can define replication policies in Harbor. These policies determine which projects and repositories are replicated to the specified endpoints. You can choose to replicate all repositories or specific ones based on certain criteria, such as project name, repository name, or image tags.Schedule Replication
: Once the replication policies are set up, you can schedule when and how often replication should occur. You can configure the frequency, timing, and other parameters for replication jobs. Harbor allows you to set up one-time replication or recurring ones at regular intervals.Monitor Replication Status
: Harbor provides monitoring and logging capabilities to track the replication process. You can view the replication status, track any errors or warnings, and ensure that the images are successfully replicated to the specified endpoints. These insights help in troubleshooting and maintaining the replication process.
By leveraging Harbor's replication feature, you can establish a resilient and distributed infrastructure for your container images. This enables seamless collaboration and sharing of images across different teams, locations, or cloud environments. It also allows you to integrate Harbor with external registries like Docker Hub, ECR, Azure Container Registry, etc., facilitating a streamlined and efficient workflow for container image management.
To replicate image repositories from one instance of Harbor to another Harbor or non-Harbor registry, you need to create replication endpoints. Here's how:
-
Go to
Registries
and click the+ New Endpoint
button. -
For
Provider
, select the type of registry to set up as a replication endpoint from the drop-down menu. The supported non-Harbor registries are:- Docker Hub
- Docker registry
- AWS Elastic Container Registry
- Azure Container Registry (ACR)
- Ali Cloud Container Registry
- Google Container Registry
- Google Artifact Registry
- Huawei SWR
- Artifact Hub (Support added in v2.0.4)
- Gitlab
- Quay.io
- Jfrog Artifactory
-
Enter a suitable name and description for the new replication endpoint.
-
Enter the full URL of the registry to set up as a replication endpoint. For example, to replicate to another Harbor instance, enter
https://harbor_instance_address:443
. Make sure the registry exists and is running before creating the endpoint. -
Enter the Access ID and Access Secret for the endpoint registry instance. Use an account with the appropriate privileges on that registry, or an account with write permission on the corresponding project in a Harbor registry. Note:
- Azure ACR adapter should use the username and password of an ACR token. For more information on how to create tokens and scope maps to manage access to specific repositories in your Azure Container Registry (ACR), see Create a token with repository-scoped permissions.
- AWS ECR adapters should use an authentication token. For more information, see Private registry authentication.
- Google GCR adapters should use the entire JSON key generated in the service account. The namespace should start with the project ID. For more information, see Create and delete service account keys.
-
Optionally, select the
Verify Remote Cert
check box. Deselect it if the remote registry uses a self-signed or untrusted certificate. -
Click
Test Connection
. -
If the connection test is successful, click
OK
.
You can list, add, edit, and delete registries under Administration
-> Registries
. Only registries that are not referenced by any rules can be deleted.
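Registry endpoints can also be created programmatically with the Harbor REST API. The following sketch creates an endpoint for an Azure Container Registry; the endpoint name, URL, and credentials are hypothetical, and the access key and secret correspond to the ACR token described later in this article:
# Create a replication endpoint that points to an Azure Container Registry
curl -u "admin:$harborAdminPassword" \
  -X POST "https://harbor.contoso.com/api/v2.0/registries" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "contoso-acr",
        "type": "azure-acr",
        "url": "https://contoso.azurecr.io",
        "credential": {
          "type": "basic",
          "access_key": "harbor",
          "access_secret": "<acr-token-password>"
        },
        "insecure": false
      }'
# List the configured registry endpoints
curl -u "admin:$harborAdminPassword" "https://harbor.contoso.com/api/v2.0/registries"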
Create a Registry Endpoint to another Harbor Instance
If you want to create a registry endpoint to another Harbor instance, you can follow these steps:
- Go to
Registries
and click the+ New Endpoint
button. - From the
Provider
dropdown, selectHarbor
. - Provide a suitable name and description for the new replication endpoint.
- Enter the full URL of the Harbor registry that you want to set up as the replication endpoint. For example, if you want to replicate to another Harbor instance, enter
https://harbor_instance_address:443
. Make sure the registry is already running before creating the endpoint. - Enter the username and password of a user in the remote Harbor registry with the necessary privileges.
- If needed, select the
Verify Remote Cert
checkbox. You can deselect it if the remote registry uses a self-signed or untrusted certificate. - Click on
Test Connection
to verify the connection. - If the connection test is successful, click
OK
.
Create a Registry Endpoint to Docker Hub
Proceed as follows to create a registry endpoint to Docker Hub:
- Go to
Registries
and click the+ New Endpoint
button. - From the
Provider
dropdown, selectDocker Hub
. - Provide a suitable name and description for the new replication endpoint.
- Enter the username and password of a Docker Hub user with the required privileges. If you only plan to use this registry endpoint for pulling public images from Docker Hub, you can leave these fields blank.
- Click on
Test Connection
to verify the connection. - If the connection test is successful, click
OK
.
Create a Registry Endpoint to Azure Container Registry (ACR)
Before creating a registry endpoint for your Azure Container Registry, you need to create a token and a scope map to let Harbor access a subset of repositories in your container registry. By creating tokens, a registry owner can provide users or services with scoped, time-limited access to repositories to pull or push images or perform other actions.
A token
along with a generated password lets the user authenticate with the registry. You can set an expiration date for a token password, or disable a token at any time. After authenticating with a token, Harbor can perform one or more actions
scoped to one or more repositories.
Action | Description | Example |
---|---|---|
content/delete | Remove data from the repository | Delete a repository or a manifest |
content/read | Read data from the repository | Pull an artifact |
content/write | Write data to the repository | Use with content/read to push an artifact |
metadata/read | Read metadata from the repository | List tags or manifests |
metadata/write | Write metadata to the repository | Enable or disable read, write, or delete operations |
A scope map
groups the repository permissions you apply to a token and can reapply to other tokens. Every token is associated with a single scope map. With a scope map, you can:
- Configure multiple tokens with identical permissions to a set of repositories.
- Update token permissions when you add or remove repository actions in the scope map, or apply a different scope map.
Azure Container Registry also provides several system-defined scope maps you can apply when creating tokens. The permissions of system-defined scope maps apply to all repositories in your registry. The individual actions
correspond to the limit of Repositories per scope map. The following image shows the relationship between tokens and scope maps.
You can use the following script to create a scope map and a token.
#!/bin/bash
# Variables
acrName="TanAcr"
acrResourceGroupName="TanRG"
acrScopeMapName="harbor-scope-map"
acrUsername="harbor"
repositories=("chat" "doc")
# Check if the Azure Container Registry scope map already exists
echo "Checking if [$acrScopeMapName] scope map actually exists in the [$acrName] container registry..."
az acr scope-map show \
--name $acrScopeMapName \
--registry $acrName \
--resource-group $acrResourceGroupName &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$acrScopeMapName] scope map actually exists in the [$acrName] container registry"
echo "Creating [$acrScopeMapName] scope map in the [$acrName] container registry..."
# Create the Azure Container Registry scope map
az acr scope-map create \
--name $acrScopeMapName \
--registry $acrName \
--resource-group $acrResourceGroupName \
--description "Scope map for Harbor" &>/dev/null
if [[ $? == 0 ]]; then
echo "[$acrScopeMapName] scope map successfully created in the [$acrName] container registry"
else
echo "Failed to create [$acrScopeMapName] scope map in the [$acrName] container registry"
exit
fi
# Add the repositories to the Azure Container Registry scope map
for repository in ${repositories[@]}; do
echo "Adding [$repository] repository to the [$acrScopeMapName] scope map in the [$acrName] container registry..."
az acr scope-map update \
--name $acrScopeMapName \
--registry $acrName \
--resource-group $acrResourceGroupName \
--add-repository $repository content/read content/write metadata/read metadata/write &>/dev/null
if [[ $? == 0 ]]; then
echo "[$repository] repository successfully added to the [$acrScopeMapName] scope map in the [$acrName] container registry"
else
echo "Failed to add [$repository] repository to the [$acrScopeMapName] scope map in the [$acrName] container registry"
exit
fi
done
else
echo "[$acrScopeMapName] scope map already exists in the [$acrName] container registry"
fi
# Check if the token already exists
echo "Checking if [$acrUsername] token actually exists in the [$acrName] container registry..."
az acr token show \
--name $acrUsername \
--registry $acrName \
--resource-group $acrResourceGroupName &>/dev/null
if [[ $? != 0 ]]; then
echo "No [$acrUsername] token actually exists in the [$acrName] container registry"
echo "Creating [$acrUsername] token in the [$acrName] container registry..."
# Create the token
password=$(az acr token create \
--name $acrUsername \
--registry $acrName \
--resource-group $acrResourceGroupName \
--scope-map $acrScopeMapName \
--query "credentials.passwords[?name=='password1'].value" \
--output tsv \
--only-show-errors)
if [[ $? == 0 ]]; then
echo "[$acrUsername] token successfully created in the [$acrName] container registry"
else
echo "Failed to create [$acrUsername] token in the [$acrName] container registry"
exit
fi
else
echo "[$acrUsername] token already exists in the [$acrName] container registry"
fi
# Displaying the token
echo "Username: $acrUsername"
echo "Token: $password"
Once you have created a token that lets Harbor access a subset of ACR repositories, you can proceed as follows to create a registry endpoint to the Azure Container Registry resource:
- Go to
Registries
and click the+ New Endpoint
button. - From the
Provider
dropdown, selectAzure ACR
. - Provide a suitable name and description for the new replication endpoint.
- Enter
https://<acr-login-server>
, for example https://contoso.azurecr.io
- Enter the username and password of an ACR token with the necessary privileges on the repositories.
- If needed, select the
Verify Remote Cert
checkbox. You can deselect it if the remote registry uses a self-signed or untrusted certificate. - Click on
Test Connection
to verify the connection. - If the connection test is successful, click
OK
.
Create a Replication Rule
Once you have created a registry endpoint, you can create one or more replication rules as follows:
- Log in to the Harbor interface with an account that has Harbor system administrator privileges.
- Expand the
Administration
section and selectReplications
.
- Click
New Replication Rule
. - Provide a name and description for the replication rule.
- Choose between
Push-based
orPull-based
replication, depending on the desired direction, to or from the remote registry.
-
If you are setting up a Pull-based replication rule, use the
Source Registry
drop-down menu to select from the configured replication endpoints. -
Use the
Source resource filter
options to filter the artifacts to be replicated (*):Name
: Enter an artifact name or fragment to replicate resources with a specific name.Tag
: Enter a tag name or fragment to replicate resources with a particular tag. You can also specify matching/excluding for this filter.Label
: Select from available labels to replicate resources with a specific label. You can also specify matching/excluding for this filter.Resource
: Choose whether to replicate images, artifacts, or both.
-
If you are setting up a Push-based replication rule, use the
Destination Registry
drop-down menu to select from the configured replication endpoints. -
For
Destination Namespace
, enter the name of the namespace where you want to replicate resources. If left empty, resources will be placed in the same namespace as in the source registry. -
Use the
Destination Flattening
drop-down to choose how you want Harbor to handle the image hierarchy when replicating images. Select the desired option based on your preference.Flatten All Levels
: Remove all hierarchy from the replicated image.No Flattening
: Maintain the same hierarchy when replicating.Flattening 1 level
: Remove one level from the image hierarchy.Flattening 2 levels
: Remove two levels from the image hierarchy.Flattening 3 levels
: Remove three levels from the image hierarchy.
- Use the
Trigger Mode
drop-down to select when and how the replication rule runs.Manual
: Replicate the resources manually when needed. Deletion operations are not replicated.Scheduled
: Replicate the resources periodically by defining a cron job. Deletion operations are not replicated.Event Based
: Replicate resources immediately when a new resource is pushed or retagged. If you select theDelete remote resources when locally deleted
option, artifacts deleted locally will also be deleted from the replication destination.
- Optionally, set the maximum network bandwidth for each replication task using the
Bandwidth
option. Please take into account the number of concurrent executions, keeping in mind that the default value is 10 for each job-service pod. The bandwidth unit is kilobytes per second (-1 stands for unlimited bandwidth). Limiting the bandwidth too much and stopping the replication job might result in a long delay before the job worker can run a new job. - Optionally, select the
Override
checkbox to force replicated resources to replace resources at the destination with the same name. - Optionally, select the
Copy by chunk
checkbox to enable copying the artifact blobs by chunks. This feature is currently supported only when the source and destination registries are both Harbor. For other registry types, you can enable it manually by calling the Harbor API. The official support for copy by chunk between Harbor and other registry types has not been verified. The default chunk size is 10MB, but you can override it by setting theREPLICATION_CHUNK_SIZE
environment variable in the jobservice. For example, to set a chunk size of 10MB, you would setREPLICATION_CHUNK_SIZE=10485760
. - Click
Save
to create the replication rule.
(*) Note that certain patterns are supported for name and tag filters, such as * , ** , ? , and {alt1,…} . For examples of how these patterns match, see Creating a Replication Rule in the Harbor documentation.
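Replication rules can be created through the Harbor REST API as well. The sketch below defines a manually triggered, pull-based policy; the registry ID is hypothetical (look it up with GET /api/v2.0/registries), and the filter values mirror the examples used in the following sections:
# Create a pull-based, manually triggered replication policy
curl -u "admin:$harborAdminPassword" \
  -X POST "https://harbor.contoso.com/api/v2.0/replication/policies" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "pull-flaskapp",
        "src_registry": { "id": 1 },
        "dest_namespace": "private",
        "filters": [
          { "type": "name", "value": "private/flaskapp" },
          { "type": "tag",  "value": "**" }
        ],
        "trigger": { "type": "manual" },
        "override": true,
        "enabled": true
      }'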
Create a Pull Replication Rule from Docker Hub
You can proceed as follows to create a replication rule to pull an official image, for example the latest version of the ubuntu image, from Docker Hub:
-
Log in to the Harbor interface with a Harbor system administrator account.
-
Go to the
Administration
section and selectReplications
. -
Click
New Replication Rule
. -
Provide a name and description for the rule.
-
Choose
Pull-based
replication. -
Select
Docker Hub
as theSource Registry
. -
Use the
Source resource filter
options to filter the artifacts to be replicated:Name
: Enter an artifact name or fragment to replicate resources with a specific name. You must add library
if you want to replicate the official artifacts of Docker Hub
. For example, library/ubuntu
matches the official ubuntu image. As noted in the previous section, you can also use wildcard characters in the name and tag.Tag
: Enter a tag name or fragment to replicate resources with a particular tag, for example latest
.
-
For
Destination Namespace
, enter the name of the project where you want to replicate resources, for example private
. -
Use the
Destination Flattening
drop-down to choose how you want Harbor to handle the image hierarchy when replicating images. Select the desired option based on your preference.Flatten All Levels
: Remove all hierarchy from the replicated image.No Flattening
: Maintain the same hierarchy when replicating.Flattening 1 level
: Remove one level from the image hierarchy.Flattening 2 levels
: Remove two levels from the image hierarchy.Flattening 3 levels
: Remove three levels from the image hierarchy.
-
Use the
Trigger Mode
drop-down to select when and how the replication rule runs.Manual
: Replicate the resources manually when needed. Deletion operations are not replicated.Scheduled
: Replicate the resources periodically by defining a cron job. Deletion operations are not replicated.Event Based
: Replicate resources immediately when a new resource is pushed or retagged. If you select theDelete remote resources when locally deleted
option, artifacts deleted locally will also be deleted from the replication destination.
-
Optionally, set the maximum network bandwidth for each replication task.
-
Optionally, enable the
Override
checkbox to replace replicated resources with the same name at the destination. -
Optionally, enable the
Copy by chunk
checkbox to copy artifact blobs in chunks (only supported for Harbor-to-Harbor replication). -
Click
Save
to create the replication rule.
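Once the rule has run, either manually or on its configured trigger, you can verify the result by pulling the replicated image from your Harbor instance. The command below assumes that No Flattening was selected, so the library namespace is preserved in the destination project:
# Pull the replicated official ubuntu image from the local Harbor instance
docker pull harbor.contoso.com/private/library/ubuntu:latest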
Create a Pull Replication Rule from another Harbor Instance
You can proceed as follows to create a replication rule to pull a container image from another Harbor instance:
-
Log in to the Harbor interface with a Harbor system administrator account.
-
Go to the
Administration
section and selectReplications
. -
Click
New Replication Rule
. -
Provide a name and description for the rule.
-
Choose
Pull-based
replication. -
Select the other Harbor instance as the
Source Registry
. -
Use the
Source resource filter
options to filter the artifacts to be replicated:Name
: Enter an artifact name or fragment to replicate resources with a specific name, for exampleprivate/flaskapp
, whereprivate
is the name of the project containing the image in the source Harbor.Tag
: Enter a tag name or fragment to replicate resources with a particular tag, for example*
to pull all the versions.Label
: Select from available labels to replicate resources with a specific label. You can also specify matching/excluding for this filter.Resource
: Choose whether to replicate images, artifacts, or both.
-
For
Destination Namespace
, enter the name of the project where you want to replicate resources, for example private
. -
Use the
Destination Flattening
drop-down to choose how you want Harbor to handle the image hierarchy when replicating images. Select the desired option based on your preference.Flatten All Levels
: Remove all hierarchy from the replicated image.No Flattening
: Maintain the same hierarchy when replicating.Flattening 1 level
: Remove one level from the image hierarchy.Flattening 2 levels
: Remove two levels from the image hierarchy.Flattening 3 levels
: Remove three levels from the image hierarchy.
-
Use the
Trigger Mode
drop-down to select when and how the replication rule runs.Manual
: Replicate the resources manually when needed. Deletion operations are not replicated.Scheduled
: Replicate the resources periodically by defining a cron job. Deletion operations are not replicated.Event Based
: Replicate resources immediately when a new resource is pushed or retagged. If you select theDelete remote resources when locally deleted
option, artifacts deleted locally will also be deleted from the replication destination.
-
Optionally, set the maximum network bandwidth for each replication task.
-
Optionally, enable the
Override
checkbox to replace replicated resources with the same name at the destination. -
Optionally, enable the
Copy by chunk
checkbox to copy artifact blobs in chunks (only supported for Harbor-to-Harbor replication). -
Click
Save
to create the replication rule.
Create a Pull Replication Rule from Azure Container Registry (ACR)
You can proceed as follows to create a replication rule to pull a container image from an Azure Container Registry (ACR) resource:
- Log in to the Harbor interface with a Harbor system administrator account.
- Go to the
Administration
section and selectReplications
. - Click
New Replication Rule
. - Provide a name and description for the rule.
- Choose
Pull-based
replication. - Select the Azure Container Registry (ACR) as the
Source Registry
. - Use the
Source resource filter
options to filter the artifacts to be replicated:
Name
: Enter an artifact name or fragment to replicate resources with a specific name, for exampleprivate/flaskapp
, whereprivate
is the name of the project containing the image in the source Harbor.Tag
: Enter a tag name or fragment to replicate resources with a particular tag, for example**
to pull all the versions.
- For
Destination Namespace
, enter the name of the project where you want to replicate resources, for example private
. - Use the
Destination Flattening
drop-down to choose how you want Harbor to handle the image hierarchy when replicating images. Select the desired option based on your preference, for exampleNo Flattening
to maintain the same hierarchy when replicating the images. - Use the
Trigger Mode
drop-down to select when and how the replication rule runs. - Optionally, set the maximum network bandwidth for each replication task.
- Optionally, enable the
Override
checkbox to replace replicated resources with the same name at the destination. - Optionally, enable the
Copy by chunk
checkbox to copy artifact blobs in chunks (only supported for Harbor-to-Harbor replication). - Click
Save
to create the replication rule.
Create a Push Replication Rule to Azure Container Registry (ACR)
You can proceed as follows to create a replication rule to push a container image to an Azure Container Registry (ACR) resource:
-
Log in to the Harbor interface with an account that has Harbor system administrator privileges.
-
Expand the
Administration
section and selectReplications
. -
Click
New Replication Rule
. -
Provide a name and description for the replication rule.
-
Choose
Push-based
replication. -
Use the
Source resource filter
options to filter the artifacts to be replicated:Name
: Enter an artifact name or fragment to replicate resources with a specific name, for exampleprivate/flaskapp
, whereprivate
is the name of the project containing the image in the source Harbor.Tag
: Enter a tag name or fragment to replicate resources with a particular tag, for example**
to push all the versions.Label
: Select from available labels to replicate resources with a specific label. You can also specify matching/excluding for this filter.Resource
: Choose whether to replicate images, artifacts, or both.
-
Use the
Destination Registry
drop-down menu to select the destination Azure Container Registry (ACR). -
For
Destination Namespace
, enter the name of the ACR repository where you want to replicate the artifacts. -
Use the
Destination Flattening
drop-down to choose how you want Harbor to handle the image hierarchy when replicating images. Select the desired option based on your preference.Flatten All Levels
: Remove all hierarchy from the replicated image.No Flattening
: Maintain the same hierarchy when replicating.Flattening 1 level
: Remove one level from the image hierarchy.Flattening 2 levels
: Remove two levels from the image hierarchy.Flattening 3 levels
: Remove three levels from the image hierarchy.
-
Use the
Trigger Mode
drop-down to select when and how the replication rule runs.Manual
: Replicate the resources manually when needed. Deletion operations are not replicated.Scheduled
: Replicate the resources periodically by defining a cron job. Deletion operations are not replicated.Event Based
: Replicate resources immediately when a new resource is pushed or retagged. If you select theDelete remote resources when locally deleted
option, artifacts deleted locally will also be deleted from the replication destination.
-
Optionally, set the maximum network bandwidth for each replication task using the
Bandwidth
option. Please take into account the number of concurrent executions, keeping in mind that the default value is 10 for each job-service pod. The bandwidth unit is kilobytes per second (-1 stands for unlimited bandwidth). Limiting the bandwidth too much and stopping the replication job might result in a long delay before the job worker can run a new job.
Optionally, select the
Override
checkbox to force replicated resources to replace resources at the destination with the same name. -
Optionally, select the
Copy by chunk
checkbox to enable copying the artifact blobs by chunks. This feature is currently supported only when the source and destination registries are both Harbor. For other registry types, you can enable it manually by calling the Harbor API. The official support for copy by chunk between Harbor and other registry types has not been verified. The default chunk size is 10MB, but you can override it by setting theREPLICATION_CHUNK_SIZE
environment variable in the jobservice. For example, to set a chunk size of 10MB, you would setREPLICATION_CHUNK_SIZE=10485760
. -
Click
Save
to create the replication rule.
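As a hedged example of the chunk-size override mentioned above, the following command sets the `REPLICATION_CHUNK_SIZE` environment variable on the jobservice. The deployment name `harbor-jobservice` and the `harbor` namespace are assumptions based on a default Helm-based installation; adjust them to match your deployment.

```bash
# Set a 10 MB chunk size for copy-by-chunk replication on the jobservice.
# Deployment and namespace names are assumptions; verify them with "kubectl get deploy -n harbor".
kubectl set env deployment/harbor-jobservice \
  REPLICATION_CHUNK_SIZE=10485760 \
  --namespace harbor
```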
Import Images into Azure Container Registry (ACR) from Harbor
As shown in the following script, you can use the `az acr import` command to import container images from Harbor into a repository named `harbor` in your Azure Container Registry (ACR). For more information, see Import container images to a container registry.
#!/bin/bash
# Variables
acrName="TanAcr"
acrResourceGroupName="TanRG"
importProject="private"
importRepositories=("ubuntu:latest" "python:latest")
harborHostname="harbor.contoso.com"
harborAdminUsername="admin"
harborAdminPassword="Passw@rd1!"
# Import images from Harbor into Azure Container Registry
for repository in "${importRepositories[@]}"; do
  echo "Importing [$repository] repository from the [$harborHostname] Harbor into the [$acrName] container registry..."
  az acr import \
    --name "$acrName" \
    --resource-group "$acrResourceGroupName" \
    --source "$harborHostname/$importProject/$repository" \
    --image "harbor/$repository" \
    --username "$harborAdminUsername" \
    --password "$harborAdminPassword" \
    --force &>/dev/null
  if [[ $? == 0 ]]; then
    echo "[$repository] repository successfully imported from the [$harborHostname] Harbor into the [$acrName] container registry"
  else
    echo "Failed to import [$repository] repository from the [$harborHostname] Harbor into the [$acrName] container registry"
    exit 1
  fi
done
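To verify that the import succeeded, you can list the tags of one of the imported repositories. The registry and repository names below match the variables used in the script above; replace them as needed.

```bash
# List the tags imported into the harbor/ubuntu repository of the TanAcr registry.
az acr repository show-tags \
  --name TanAcr \
  --repository harbor/ubuntu \
  --output table
```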
Use Trivy to scan an Image for Vulnerabilities
Harbor provides static analysis of vulnerabilities in images through the open source project Trivy. To use Trivy, you must have enabled it when you installed your Harbor instance (by appending the `--with-trivy` installation option). For information about installing Harbor with Trivy, see Run the Installer Script.
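If you deployed Harbor on AKS with the official Helm chart rather than the installer script, Trivy is controlled through the chart values instead of the `--with-trivy` flag. The following is a minimal sketch; the release name, namespace, and chart repository reflect common defaults, but verify them against your installation.

```bash
# Enable the built-in Trivy scanner on an existing Helm-based Harbor release.
# Release name and namespace are assumptions; --reuse-values keeps existing settings.
helm repo add harbor https://helm.goharbor.io
helm repo update
helm upgrade harbor harbor/harbor \
  --namespace harbor \
  --reuse-values \
  --set trivy.enabled=true
```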
If you upgrade to the current version (v2.0) from a version >= v1.10 that already had a system default scanner configured, that scanner remains the system default scanner.
You can also connect Harbor to your own instance of Trivy or to other additional vulnerability scanners through Harbor’s embedded interrogation service. These scanners can be configured in the Harbor UI at any time after installation.
It might be necessary to connect Harbor to other scanners for corporate compliance reasons or because your organization already uses a particular scanner. Different scanners also use different vulnerability databases, capture different CVE sets, and apply different severity thresholds. By connecting Harbor to more than one vulnerability scanner, you broaden the scope of your protection against vulnerabilities. For the list of additional scanners that are currently supported, see the Harbor Compatibility List.
You can manually initiate scanning on a particular image or on all images in Harbor. You can also set a policy to automatically scan all of the images at specific intervals. Vulnerability scans of Cosign signatures are not supported.
You can also export the scan results for an image using the Harbor API endpoint `/projects/{project_name}/repositories/{repository_name}/artifacts/{reference}/additions/vulnerabilities`. For more information about using this endpoint, see the Harbor Swagger file.
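As a hedged illustration, the following call retrieves the vulnerability report for the `private/flaskapp:latest` artifact used earlier in this article. The hostname and credentials are placeholders, and the `/api/v2.0` prefix assumes the Harbor v2 REST API.

```bash
# Export the Trivy vulnerability report for an artifact as JSON.
# Hostname, credentials, project, repository, and tag are placeholders.
curl -s -u "admin:<your-password>" \
  "https://harbor.contoso.com/api/v2.0/projects/private/repositories/flaskapp/artifacts/latest/additions/vulnerabilities" \
  | jq .
```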
To use Trivy to manually scan a container image from the Harbor portal, follow these steps:
- Navigate to the Harbor portal and log in to your account.
- Locate and open the **Project** holding the container image that you want to scan.
- Click the name or tag of the image to access its details page.
- Select a release, then click **Scan** to trigger a scan operation.
- When the scan operation is complete, you can view the results of the vulnerability scan in the Harbor portal. Look for the **Vulnerabilities** cell and hover your mouse over it. A histogram appears, providing a visual representation of the vulnerabilities found.
- If you click the link to open the release, you can see a detailed report of the vulnerabilities found by Trivy.
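The same scan can also be triggered programmatically. The sketch below assumes the Harbor v2 REST API and reuses the placeholder hostname, credentials, project, repository, and tag from earlier; consult the Harbor Swagger file for the authoritative definition of this endpoint.

```bash
# Trigger a vulnerability scan of the private/flaskapp:latest artifact.
# All values are placeholders; the endpoint is described in the Harbor Swagger file.
curl -s -X POST -u "admin:<your-password>" \
  "https://harbor.contoso.com/api/v2.0/projects/private/repositories/flaskapp/artifacts/latest/scan"
```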
Configure Harbor Replication Rules using Terraform
The Terraform Harbor Provider can be used to configure an instance of Harbor. The provider needs to be configured with the proper credentials before it can be used.
Resources
The Terraform Harbor Provider supports the following resources:
- Resource: harbor_configuration
- Resource: harbor_config_system
- Resource: harbor_config_email
- Resource: harbor_garbage_collection
- Resource: harbor_purge_audit_log
- Resource: harbor_immutable_tag_rule
- Resource: harbor_interrogation_services
- Resource: harbor_label
- Resource: harbor_project_member_group
- Resource: harbor_project_member_user
- Resource: harbor_project
- Resource: harbor_registry
- Resource: harbor_replication
- Resource: harbor_retention_policy
- Resource: harbor_robot_account
- Resource: harbor_tasks
- Resource: harbor_user
Authentication
You can configure the Terraform Harbor Provider with credentials as follows:
provider "harbor" {
  url      = "https://harbor.aceme_corpartion.com"
  username = "insert_admin_username_here"
  password = "insert_password_here"
}
Alternatively, these environment variables can be used to set the provider config values:
- `HARBOR_URL`
- `HARBOR_USERNAME`
- `HARBOR_PASSWORD`
- `HARBOR_IGNORE_CERT`
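For example, exporting the following variables in the shell that runs Terraform lets the provider pick up its configuration from the environment instead of the provider block. The values are placeholders.

```bash
# Placeholder values; export these before running terraform plan/apply so the
# Harbor provider can read its configuration from the environment.
export HARBOR_URL="https://harbor.contoso.com"
export HARBOR_USERNAME="admin"
export HARBOR_PASSWORD="<insert_password_here>"
export HARBOR_IGNORE_CERT="false"
```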
Argument Reference
The following arguments are supported:
- `url` - (Required) The URL of Harbor.
- `username` - (Required) The username used to access Harbor.
- `password` - (Required) The password used to access Harbor.
- `insecure` - (Optional) Choose to ignore certificate errors.
- `api_version` - (Optional) Choose which version of the API to use, 1 or 2. The default is 2.
Links
For more information, see the following resources:
- Terraform Harbor Provider
- Harbor Terraform Provider GitHub repository
- Harbor Terraform Provider Documentation
- Harbor Multi Datacenter With Replicated Artifacts
Using Harbor in CI/CD
Harbor can serve as a centralized hub and a single source of truth for hosting container images and OCI artifacts, such as Helm charts, in a CI/CD workflow. Whether you are leveraging Azure DevOps or GitOps technologies like Argo CD and Flux CD, Harbor offers seamless integration and ensures the reliability and consistency of your deployment process.
GitOps and DevOps
GitOps and DevOps are two popular methodologies for managing software development and deployment processes.
- **GitOps** is an approach where the desired state of infrastructure and applications is declared and version-controlled in a Git repository. Tools such as Argo CD and Flux continuously monitor the repository and automatically synchronize the actual state of the system with the desired state defined in the repository. This ensures a declarative and auditable approach to software delivery and deployment.
- **DevOps**, on the other hand, is a set of practices that combines software development and IT operations. It emphasizes collaboration, automation, and continuous delivery of software changes. DevOps methodologies promote shorter development cycles, faster feedback loops, and improved collaboration between development and operations teams.
Harbor Integration in CI/CD Workflows
Harbor integrates with Azure DevOps and with GitOps technologies like Argo CD and Flux, enabling smooth workflows for managing container images. By using Harbor as a container registry, you can store and version your container images in a secure and centralized location. This ensures consistent access to the required images across different stages of your CI/CD pipeline. With Azure DevOps, you can configure your build pipelines to push the container images to Harbor (see the example at the end of this section). By making Harbor the authoritative source for your container images, you can ensure that all stages of your pipeline use the same trusted images.
Similarly, when utilizing GitOps technologies like Argo CD or Flux, you can configure those tools to pull the container images from Harbor for deployment. This ensures that the deployed environments match the desired state defined in your Git repository. By leveraging Harbor's capabilities as a single source of truth for container images, you can enhance the reliability, traceability, and security of your CI/CD workflows, regardless of whether you are using Azure DevOps or GitOps methodologies.
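The following is an illustrative sketch of the registry-related steps a CI pipeline might run to publish an image to Harbor. The hostname, project, image name, tag, and robot account are placeholders; in a real pipeline the credentials would come from a secret store or pipeline variable.

```bash
# Authenticate against Harbor with a robot account (placeholder name and secret),
# then build and push the application image to the private project.
echo '<robot-account-secret>' | docker login harbor.contoso.com \
  --username 'robot$ci-pipeline' --password-stdin
docker build --tag harbor.contoso.com/private/flaskapp:1.0.0 .
docker push harbor.contoso.com/private/flaskapp:1.0.0
```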
Using Harbor with Argo CD
The following diagram shows how Harbor integrates with Argo CD:
In this scenario, we establish a pull-based DevOps pipeline to deploy an application using Argo CD, Harbor, and GitHub Actions. The pipeline is designed to pull YAML manifests or Helm charts from a Git repository and container images from the main Harbor registry. Alternatively, it can also fetch container images from an Azure Container Registry (ACR) or a secondary Harbor instance that is synchronized with the main Harbor. Let's take a look at the data flow:
- The application code is developed using a popular integrated development environment (IDE) like Visual Studio Code.
- The code is then committed to a Git repository hosted on GitHub or any other distributed version control system.
- The GitHub Actions CI pipeline is configured to build a container image from the application code and push it to a Harbor registry used as a single source of truth.
- As part of the continuous integration process, GitHub Actions updates a Kubernetes YAML manifest or Helm chart with the latest version of the container image, based on the image version in Harbor.
- The Argo CD operator monitors the Git repository for configuration changes. It automatically detects any changes and pulls the updated YAML manifests or Helm charts, deploying the new application release onto the AKS cluster.
- Argo CD, acting as the GitOps operator, leverages the YAML manifest or Helm charts to deploy the web application to an AKS cluster.
- During the deployment process, the container images are pulled directly from the primary Harbor instance. Alternatively, the container images can be pulled from a Harbor replica or an Azure Container Registry (ACR). If the Harbor project is private, the AKS cluster needs an image pull secret for the registry (see the example after this list).
- The main Harbor instance also pushes container images to an Azure Container Registry (ACR) for wider distribution and consumption.
- Alternatively, Azure Container Registry (ACR) can be configured to import the container images from Harbor, ensuring they are readily available in the registry.
- In a distributed Harbor setup, the main Harbor instance pushes container images to a replica Harbor instance.
- Alternatively, the secondary Harbor instance pulls container images directly from the main Harbor instance to maintain consistency.
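As a minimal sketch of the pull-secret requirement mentioned above, the following command creates a Kubernetes image pull secret for a private Harbor project; the hostname, robot account, secret, and namespace are placeholders.

```bash
# Create an image pull secret so pods in the flaskapp namespace can pull images
# from the private Harbor project; all names and credentials are placeholders.
kubectl create secret docker-registry harbor-pull-secret \
  --docker-server=harbor.contoso.com \
  --docker-username='robot$argocd' \
  --docker-password='<robot-account-secret>' \
  --namespace flaskapp
```

Reference the secret from the `imagePullSecrets` field of the pod specification, or attach it to the namespace's default service account, so that workloads deployed by Argo CD can authenticate against Harbor.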
Using Harbor with Flux
The following diagram shows how Harbor integrates with Flux:
In this scenario, we establish a pull-based DevOps pipeline to deploy an application using Flux, Harbor, and GitHub Actions. The pipeline is designed to pull YAML manifests or Helm charts from a Git repository and container images from the main Harbor registry. Alternatively, it can also fetch container images from an Azure Container Registry (ACR) or a secondary Harbor instance that is synchronized with the main Harbor. Let's take a look at the data flow:
- The application code is developed using a popular integrated development environment (IDE) like Visual Studio Code.
- The code is then committed to a Git repository hosted on GitHub or any other distributed version control system.
- The GitHub Actions CI pipeline is configured to build a container image from the application code and push it to a Harbor registry used as a single source of truth.
- As part of the continuous integration process, GitHub Actions updates a Kubernetes YAML manifest or Helm chart with the latest version of the container image, based on the image version in Harbor.
- The Flux operator monitors the Git repository for configuration changes. It automatically detects any changes and pulls the updated YAML manifests or Helm charts, deploying the new application release onto the AKS cluster.
- Flux, acting as the GitOps operator, leverages the YAML manifest or Helm charts to deploy the web application to an AKS cluster.
- During the deployment process, the container images are directly pulled from the primary Harbor instance. Alternatively, the container images can be pulled from a Harbor replica, or an Azure Container Registry (ACR).
- The main Harbor instance also pushes container images to an Azure Container Registry (ACR) for wider distribution and consumption.
- Alternatively, Azure Container Registry (ACR) can be configured to import the container images from Harbor, ensuring they are readily available in the registry.
- In a distributed Harbor setup, the main Harbor instance pushes container images to a replica Harbor instance.
- Alternatively, the secondary Harbor instance pulls container images directly from the main Harbor instance to maintain consistency.
Using Harbor with Azure DevOps
The following diagram shows how Harbor integrates with Azure DevOps:
In this scenario, we establish a push-based DevOps pipeline to deploy an application using Azure DevOps and Harbor. The CI/CD pipelines are designed to deploy YAML manifests or Helm charts from a Git repository and pull container images from the main Harbor registry. Alternatively, the YAML manifests and Helm charts can be configured to fetch container images from an Azure Container Registry (ACR) or a secondary Harbor instance that is synchronized with the main Harbor. Let's take a look at the data flow:
- The application code is developed using a popular integrated development environment (IDE) like Visual Studio Code.
- The code is then committed to a Git repository hosted on GitHub or any other distributed version control system.
- The Azure DevOps CI pipeline is configured to build a container image from the application code and push it to a Harbor registry used as a single source of truth.
- The successful completion of the CI pipeline triggers the execution of the CD pipeline.
- The CD pipeline updates a Kubernetes YAML manifest or Helm chart with the latest version of the container image, based on the image version in Harbor.
- The CD pipeline deploys the new application release to the AKS cluster using YAML manifests or Helm charts.
- During the deployment process, the container images are directly pulled from the primary Harbor instance. Alternatively, the container images can be pulled from a Harbor replica, or an Azure Container Registry (ACR).
- The main Harbor instance also pushes container images to an Azure Container Registry (ACR) for wider distribution and consumption.
- Alternatively, Azure Container Registry (ACR) can be configured to import the container images from Harbor, ensuring they are readily available in the registry.
- In a distributed Harbor setup, the main Harbor instance pushes container images to a replica Harbor instance.
- Alternatively, the secondary Harbor instance pulls container images directly from the main Harbor instance to maintain consistency.
Using Harbor in a Multi-Cloud Scenario
In a multi-cloud scenario, a primary Harbor instance can serve as the central repository and single source of truth for container images and OCI artifacts, including Helm charts. To achieve this, replication rules can be utilized to export these artifacts to one or more Harbor replica instances hosted in separate cloud platforms.
To ensure secure and private communication channels across different cloud environments, Virtual Private Network (VPN) connections can be established between the cloud platforms. This facilitates the secure transfer of artifacts between the Harbor instances. The main CI pipeline of the DevOps system can be configured to push container images to the primary Harbor instance. From there, these artifacts can be replicated to cloud-specific Harbor replica instances hosted on managed Kubernetes clusters in one or more regions for best performance and resiliency. The Harbor replica instances can, in turn, be configured to replicate OCI artifacts to one or more managed container registries such as Azure Container Registry (ACR), Amazon Elastic Container Registry (ECR), or Google Container Registry (GCR).
When deploying applications on cloud platforms like Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS), or Google Kubernetes Engine (GKE), YAML manifests or Helm charts can pull the required container images either from local Harbor installations or from cloud-specific managed container registries. By leveraging the primary Harbor instance as the single source of truth, organizations can ensure consistency and reliability in their multi-cloud CI/CD pipelines, while taking advantage of cloud-specific container registry capabilities. This approach enhances flexibility, security, and portability across different cloud environments.
Conclusion
With Harbor as their container registry, customers enjoy the benefits of consistent, secure, and streamlined container image management. The advantages of using Harbor extend beyond a single cloud provider, providing customers with the flexibility to leverage multiple cloud environments without sacrificing operational efficiency or security. By adopting Harbor, customers can confidently navigate their multi-cloud Kubernetes infrastructure while reaping the rewards of simplified operations and enhanced security. To learn more about Harbor and its capabilities, visit the Harbor website and explore the vast resources available within the CNCF community.
Acknowledgements
I would like to express my gratitude to Michael C. Bazarewsky and Shankar Ramachandran for reviewing the article and for providing valuable suggestions. Thank you for your time and expertise.