This article describes the procedure to prepare SAP HANA scale-up high availability with a disaster recovery (DR) region, running on Red Hat Enterprise Linux on Azure. The procedure described in this article was initially tested by Red Hat and Microsoft engineering and should be used as a basis for corresponding pilot implementations. In its current state, the solution is still unverified, as mentioned in the Red Hat article, and must be considered experimental. It is highly recommended to engage Red Hat and SAP consulting services before implementing this solution in your organization. Carefully read the disclaimer from Red Hat about this procedure before you proceed with the setup on Azure.
In the figure below, the primary database on hanadb1 in the production region replicates data changes synchronously to hanadb2 in the same region. The primary database on hanadb1 also replicates data changes asynchronously to hanadb3 in another region. The secondary node hanadb3 is, in turn, the source system for a further secondary database on hanadb4, located in the same region as hanadb3. For more information on this system replication setup, see SAP HANA Multi-target System Replication | SAP Help Portal.
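For orientation, the replication topology from the figure can be summarized as follows (the site names SITE1 through SITE4 match the names used in the commands later in this article):
# SITE1 (hanadb1, primary region)  --sync-->   SITE2 (hanadb2, primary region)
# SITE1 (hanadb1, primary region)  --async-->  SITE3 (hanadb3, DR region)
# SITE3 (hanadb3, DR region)       --sync-->   SITE4 (hanadb4, DR region)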
To configure the above disaster recovery setup of SAP HANA scale-up with a high availability cluster, you need to configure two independent clusters, one in each region. In the primary region, configure SAP HANA scale-up high availability between hanadb1 and hanadb2 as documented in High availability of SAP HANA on Azure VMs on RHEL or High availability of SAP HANA Scale-up with ANF on RHEL, depending on your storage type (local or NFS). Similarly, in the secondary or DR region, set up another independent SAP HANA scale-up cluster between hanadb3 and hanadb4 following the same documentation. You then configure SAP HANA multi-target system replication and the cluster resources in the DR region as described in the configuration steps section below.
The cluster in the DR region is ready to run, but its services are stopped. The automatic handling of the cluster resources (such as SAPHana_HN1_03-Clone, g_ip_HN1_03, hanadbx_nfs, and so on) in the DR region is configured, but the resources are placed in unmanaged mode. When the primary region goes down and failover to the DR region is initiated, you need to manually take over hanadb3 as the new primary. HANA system replication between hanadb3 and hanadb4 should be active before you start the cluster services in the DR region. After starting the cluster services, you can put the resources back in managed mode.
The configuration of the SAP HANA scale-up high availability cluster in the DR region looks identical to the primary region, but you should understand some key differences and take the following points into account.
NOTE: In the case of cluster-managed mounts, it is recommended to add the file system entries to /etc/fstab with the 'noauto' option. The 'noauto' option prevents the file system from being mounted automatically after a reboot, so the mounts are handled either manually or by the cluster.
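For example, an /etc/fstab entry for a cluster-managed NFS file system could look like the following sketch. The NFS server address, export path, mount point, and mount options shown here are placeholders; use the values that match your environment.
# Example /etc/fstab entry (placeholder address, path, and options) - 'noauto' prevents mounting at boot
10.32.2.4:/hanadb3-data-mnt00001 /hana/data/HN1/mnt00001 nfs rw,nfsvers=4.1,hard,timeo=600,noauto 0 0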
After configuring an independent SAP HANA scale-up high availability cluster in the DR region, the following steps need to be performed on the DR region cluster. These steps are applicable only in the DR region.
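# Prevent cluster services from starting automatically at boot on all nodes in the DR region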
pcs cluster disable --all
# Place the HANA resource in unmanaged mode
pcs resource unmanage SAPHana_HN1_03-Clone
# Place the virtual IP group resource (contains the virtual IP and probe port) in unmanaged mode
pcs resource unmanage g_ip_HN1_03
If the HANA file systems are on NFS mounts, put the file system resources in the DR region into unmanaged state as well. This step is applicable only when NFS file systems are used for HANA.
# Place the file system group resources in unmanaged mode
pcs resource unmanage hanadb3_nfs
pcs resource unmanage hanadb4_nfs
# Stop the cluster after placing the resources in unmanaged mode
pcs cluster stop --all
NOTE: While the resources are in unmanaged state and the cluster is stopped, it is highly recommended to start the cluster on a regular basis in the secondary region to ensure that the cluster services come up. As the resources are already in unmanaged state, they remain in that state after the cluster is started. You can stop the cluster again after it comes up and the services are running.
pcs status --full
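As a sketch of such a periodic check (the resources remain unmanaged the whole time), the sequence could look like this:
# Start cluster services on the DR nodes, verify that they come up, then stop them again
pcs cluster start --all
pcs status --full
pcs cluster stop --all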
Establish system replication from node hanadb1 in the primary region to node hanadb3 in the DR region.
# Execute command using <hanasid>adm on hanadb3 and hanadb4 (HANA must be stopped before the nodes are registered as secondaries)
sapcontrol -nr 03 -function StopSystem HDB
# Copy keys from hanadb1 to hanadb3
scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT hn1adm@hanadb3:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY hn1adm@hanadb3:/usr/sap/HN1/SYS/global/security/rsecssfs/key/
# Copy keys from hanadb1 to hanadb4
scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT hn1adm@hanadb4:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY hn1adm@hanadb4:/usr/sap/HN1/SYS/global/security/rsecssfs/key/
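# Execute command using <hanasid>adm on hanadb3 to register it as an asynchronous secondary of hanadb1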
hdbnsutil -sr_register --remoteHost=hanadb1 --remoteInstance=03 --replicationMode=async --operationMode=logreplay --name=SITE3
# Execute command using <hanasid>adm on hanadb3
sapcontrol -nr 03 -function StartSystem HDB
# Execute command using <hanasid>adm on hanadb3
hdbnsutil -sr_enable --name=SITE3
# Execute command using <hanasid>adm on hanadb4
hdbnsutil -sr_register --remoteHost=hanadb3 --remoteInstance=03 --replicationMode=sync --operationMode=logreplay --name=SITE4
# Execute command using <hanasid>adm on hanadb4
sapcontrol -nr 03 -function StartSystem HDB
# Execute command using <hanasid>adm on the primary node (hanadb1) in the primary region
python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py
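Optionally, you can also cross-check the replication topology with hdbnsutil. A minimal check, run as <hanasid>adm on hanadb1, could look like this:
# Show the local system replication state, mode, and connected sites
hdbnsutil -sr_state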
When the primary region goes down and the business decides to fail over to the DR region, follow the steps below to take over hanadb3 in the DR region as the new primary.
# Execute command using <hanasid>adm on hanadb3
hdbnsutil -sr_takeover --suspendPrimary
# Execute command using <hanasid>adm on the new primary node (hanadb3) in the DR region
python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py
# Start the cluster in the DR region
pcs cluster start --all
pcs status --full
# If there are any failed resources after starting the cluster, clean them up
pcs resource cleanup SAPHana_HN1_03-Clone
# ONLY APPLICABLE if you are using NFS mounts for the HANA file systems
pcs resource manage hanadb3_nfs
pcs resource manage hanadb4_nfs
# Place the HANA and virtual IP group resources in managed mode
pcs resource manage SAPHana_HN1_03-Clone
pcs resource manage g_ip_HN1_03
# Execute the command on a node in the DR region cluster, i.e. hanadb3 or hanadb4
pcs cluster enable --all
Important: After the DR region becomes the new primary for the HANA database, you need to update the database connections of all clients to point to the new hostname.
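As an illustration only, a client that stores its connection data in the SAP HANA secure user store could be repointed as shown below; the key name, port, user, and password are placeholders and depend on your instance number and database setup.
# Repoint an existing hdbuserstore key to the new primary hanadb3 (placeholder key, port, user, and password)
hdbuserstore SET DEFAULT hanadb3:30315 <username> <password>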
After failover to the DR region, you want the former primary region to become your new secondary region. Follow the steps below only in the former primary region.
# Execute command on the former primary node, i.e. hanadb1
pcs cluster disable --all
# Place the HANA resource in unmanaged mode
pcs resource unmanage SAPHana_HN1_03-Clone
# Place the virtual IP group resource (contains the virtual IP and probe port) in unmanaged mode
pcs resource unmanage g_ip_HN1_03
If the HANA file systems are on NFS mounts, put the file system resources in the former primary region into unmanaged state as well. This step is applicable only when NFS file systems are used for HANA.
# Place the file system group resources in unmanaged mode
pcs resource unmanage hanadb1_nfs
pcs resource unmanage hanadb2_nfs
# Stop the cluster after placing the resources in unmanaged mode
pcs cluster stop --all
# Execute command using <hanasid>adm on hanadb1 and hanadb2 (HANA must be stopped before the nodes are re-registered as secondaries)
sapcontrol -nr 03 -function StopSystem HDB
# Execute command using <hanasid>adm on hanadb1
hdbnsutil -sr_cleanup --force
# Copy keys from hanadb3 to hanadb1
scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT hn1adm@hanadb1:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY hn1adm@hanadb1:/usr/sap/HN1/SYS/global/security/rsecssfs/key/
# Copy keys from hanadb3 to hanadb2
scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT hn1adm@hanadb2:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY hn1adm@hanadb2:/usr/sap/HN1/SYS/global/security/rsecssfs/key/
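# Execute command using <hanasid>adm on hanadb1 to register it as an asynchronous secondary of hanadb3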
hdbnsutil -sr_register --remoteHost=hanadb3 --remoteInstance=03 --replicationMode=async --operationMode=logreplay --name=SITE1
# Execute command using <hanasid>adm on hanadb1
sapcontrol -nr 03 -function StartSystem HDB
# Execute command using <hanasid>adm on hanadb1
hdbnsutil -sr_enable --name=SITE1
# Execute command using <hanasid>adm on hanadb2
hdbnsutil -sr_register --remoteHost=hanadb1 --remoteInstance=03 --replicationMode=sync --operationMode=logreplay --name=SITE2
# Execute command using <hanasid>adm on hanadb2
sapcontrol -nr 03 -function StartSystem HDB
# Execute command using <hanasid>adm on the current primary node (hanadb3) in the DR region
python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py
NOTE: While the resources are in unmanaged state and the cluster is stopped, it is highly recommended to start the cluster on a regular basis in the new secondary region (the former primary region) to ensure that the cluster services come up. As the resources are already in unmanaged state, they remain in that state after the cluster is started. You can stop the cluster again after it comes up and the services are running.