Running SAP Applications on the Microsoft Platform


Implementing SAP HANA scale-up high availability on Disaster Region

Microsoft

Jun 10, 2022

DISCLAIMER

This article describes the procedure to prepare SAP HANA scale-up high availability in a disaster recovery (DR) region running on Red Hat Enterprise Linux on Azure. The procedure was initially tested by Red Hat and Microsoft engineering and should be used as a basis for setting up corresponding pilot implementations. In its current state, the solution is still unverified, as mentioned in the Red Hat article, and must be considered experimental. It is highly recommended to engage Red Hat and SAP consulting services before implementing this solution in your organization. Carefully read the disclaimer from Red Hat about this procedure before you proceed with the setup on Azure.

Solution overview

In the figure below, the primary database on hanadb1 in the production region replicates data changes synchronously to hanadb2 in the same region. The primary database on hanadb1 also replicates data changes asynchronously to hanadb3 in another region. The secondary node hanadb3 is in turn the source system for a further secondary database on hanadb4, located in the same region as hanadb3. For more information on this system replication setup, see SAP HANA Multi-target System Replication | SAP Help Portal.

To configure the above disaster recovery setup of SAP HANA scale-up with a high availability cluster, you need two independent clusters, one in each region. In the primary region, configure SAP HANA scale-up high availability between hanadb1 and hanadb2 as documented in High availability of SAP HANA on Azure VMs on RHEL or High availability of SAP HANA Scale-up with ANF on RHEL, depending on your storage type (local or NFS). Similarly, in the secondary (DR) region, set up another independent SAP HANA scale-up cluster between hanadb3 and hanadb4 following the same document. You then configure SAP HANA multi-target system replication and the cluster resources in the DR region as described in the configuration steps section below.

The cluster in the DR region is ready to run, but its services are stopped. Automatic handling of the cluster resources (such as SAPHana_HN1_03-Clone, g_ip_HN1_03, hanadbx_nfs) in the DR region is configured, but the resources are placed in unmanaged mode. When the primary region goes down and failover to the DR region is initiated, you need to manually take over hanadb3 as the new primary. HANA system replication between hanadb3 and hanadb4 should be active before you start the cluster service in the DR region. After starting the cluster services, you can put the resources in managed mode.

Key points on the setup

The configuration of the SAP HANA scale-up high availability cluster in the DR region looks identical to the one in the primary region, but you should understand some key differences and take the following points into account.

SAP HANA scale-up high availability setup on DR region looks identical to the sites in primary region except for the hostname.
To establish HANA system replication between primary and DR region, the communication between nodes from primary to DR region should be open, and vice-versa.
HANA system replication is established as described in overview section and should always be active across all sites.
There is no automatic failover from the production region to the DR region. Failover to the DR region is manual.
After manual failover to the DR region, all cluster functions must be manually activated again.
Sites running in the primary and DR regions need to be kept in sync in terms of changes and patch levels. For example, if you change a constraint in the primary region cluster, the same change needs to be made in the DR region cluster.
The database connection on all clients should point to the virtual IP of the primary region. It needs to be changed after manual failover to the DR region.
Automatic start of cluster services on VM boot must stay enabled in the primary region, but you need to disable it in the DR region.

Configuration steps

Pre-requisite: Set up HANA clusters on primary and DR region

Configure two independent SAP HANA scale-up high availability clusters, one in each region, as documented in High availability of SAP HANA on Azure VMs on RHEL.
Ensure that the SAP HANA System ID (SID) and instance number are the same in both regions; only the hostnames differ.
If your HANA file systems are NFS mounts, configure the two independent SAP HANA scale-up high availability clusters, one in each region, as documented in High availability of SAP HANA Scale-up with ANF on RHEL. Also maintain the NFS file system entries in /etc/fstab.

NOTE: For cluster-managed mounts, it is recommended to add the file system entry to /etc/fstab with the 'noauto' option. The 'noauto' option prevents the file system from being mounted automatically after a reboot, so mounts are handled either manually or by the cluster.
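As a rough illustration of the note above, the following sketch prints an /etc/fstab line that carries the 'noauto' option. The helper name fstab_noauto_entry, the NFS server address 10.0.0.4, the export name and the remaining mount options are assumptions made for illustration only; take the real values from your own NFS or ANF setup.

```shell
# Sketch: build an /etc/fstab line for a cluster-managed NFS mount.
# fstab_noauto_entry is a hypothetical helper, not part of any SAP or Red Hat tooling.
fstab_noauto_entry() {
    local export_path="$1"   # NFS export, e.g. 10.0.0.4:/hanadb3-shared (placeholder)
    local mountpoint="$2"    # local mount point, e.g. /hana/shared
    # Assumed example options; 'noauto' is the option the note refers to.
    local options="rw,nfsvers=4.1,hard,timeo=600,noauto,_netdev"
    printf '%s %s nfs %s 0 0\n' "$export_path" "$mountpoint" "$options"
}

# Print the line for review instead of appending it to /etc/fstab directly.
fstab_noauto_entry "10.0.0.4:/hanadb3-shared" "/hana/shared"
```

Printing the entry first makes it easy to compare the line with the options used by the cluster's Filesystem resource before touching /etc/fstab.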

Configure cluster resources in DR region

After configuring an independent SAP HANA scale-up high availability cluster in the DR region, perform the following steps on the DR region cluster. These steps apply only to the DR region.

Disable automatic start of cluster service on VM boot in DR region cluster.
```
pcs cluster disable --all
```

Put the resources in unmanaged mode in DR region cluster.

# Place HANA resource in unmanaged mode
pcs resource unmanage SAPHana_HN1_03-Clone
# Place virtual IP group (contains virtual IP and probe port) resource in unmanaged mode
pcs resource unmanage g_ip_HN1_03

If the HANA file systems are on NFS mounts, put the filesystem resources in the DR region into unmanaged state. This step applies only when NFS file systems are used for HANA.

# Place filesystem group resources in unmanaged mode
pcs resource unmanage hanadb3_nfs
pcs resource unmanage hanadb4_nfs

Stop the cluster.

# Stop the cluster after placing resources in unmanaged mode
pcs cluster stop --all
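The DR preparation steps above can be collected into one ordered sequence. The sketch below is only an illustration: dr_prepare is a hypothetical wrapper, and its "dry-run" default merely prints the commands so they can be reviewed before anything is changed. The resource names are the ones used in this article and will differ in your cluster.

```shell
# Sketch: print (default) or run the DR-region preparation steps in order.
# dr_prepare is a hypothetical wrapper around the pcs commands shown above.
dr_prepare() {
    local use_nfs="$1"           # "yes" when HANA file systems are NFS mounts
    local mode="${2:-dry-run}"   # "dry-run" prints commands, "execute" runs them
    local prefix="echo"
    if [ "$mode" = "execute" ]; then prefix=""; fi

    # Disable cluster autostart on VM boot, then unmanage the resources.
    $prefix pcs cluster disable --all || return 1
    $prefix pcs resource unmanage SAPHana_HN1_03-Clone || return 1
    $prefix pcs resource unmanage g_ip_HN1_03 || return 1
    if [ "$use_nfs" = "yes" ]; then
        $prefix pcs resource unmanage hanadb3_nfs || return 1
        $prefix pcs resource unmanage hanadb4_nfs || return 1
    fi
    # Stop the cluster only after all resources are unmanaged.
    $prefix pcs cluster stop --all || return 1
}

# Dry run for an NFS-based setup: prints the commands without running them.
dr_prepare yes
```

Calling `dr_prepare yes execute` on a DR cluster node would run the same commands and stop at the first failure.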

NOTE: While the resources are in unmanaged state and the cluster is stopped, it is highly recommended to start the cluster in the secondary region on a regular basis to ensure that the cluster service comes up. Because the resources are already unmanaged, they can remain in that state after the cluster starts. Once the cluster is up and its services are running, you can stop it again.

pcs status --full

Establish system replication from primary to DR region

Establish system replication from node hanadb1 in primary region to the node hanadb3 in DR region.

Stop HANA database on hanadb3 and hanadb4

# Execute command using <hanasid>adm on hanadb3 and hanadb4
sapcontrol -nr 03 -function StopSystem HDB

Copy keys from hanadb1 in primary region to hanadb3 and hanadb4 of DR region.

# Copy keys from hanadb1 to hanadb3 (sidadm stands for your <hanasid>adm user)
scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT sidadm@hanadb3:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY sidadm@hanadb3:/usr/sap/HN1/SYS/global/security/rsecssfs/key/

# Copy keys from hanadb1 to hanadb4
scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT sidadm@hanadb4:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY sidadm@hanadb4:/usr/sap/HN1/SYS/global/security/rsecssfs/key/
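Because the same two files are copied to every target node, the scp commands can be generated in a loop. This is a minimal sketch under the assumption that the paths follow the pattern shown above; ssfs_copy_commands is a hypothetical helper that only prints the commands, and sidadm remains the article's placeholder for the <hanasid>adm user.

```shell
# Sketch: emit the scp commands that copy the SSFS data and key files
# to each target host. ssfs_copy_commands is a hypothetical helper.
ssfs_copy_commands() {
    local sid="$1"      # HANA SID, e.g. HN1
    local os_user="$2"  # OS user on the target, e.g. sidadm (placeholder)
    shift 2
    local base="/usr/sap/${sid}/SYS/global/security/rsecssfs"
    local host
    for host in "$@"; do
        echo "scp ${base}/data/SSFS_${sid}.DAT ${os_user}@${host}:${base}/data/"
        echo "scp ${base}/key/SSFS_${sid}.KEY ${os_user}@${host}:${base}/key/"
    done
}

# Print the commands to run on hanadb1 for both DR nodes.
ssfs_copy_commands HN1 sidadm hanadb3 hanadb4
```

The same helper can be reused later with hanadb1 and hanadb2 as targets when the former primary is registered as the new secondary.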

Register hanadb3 as a secondary of hanadb1 in asynchronous replication mode. Log in as <hanasid>adm on hanadb3.

hdbnsutil -sr_register --remoteHost=hanadb1 --remoteInstance=03 --replicationMode=async --operationMode=logreplay --name=SITE3

Start HANA database on hanadb3.

# Execute command using <hanasid>adm
sapcontrol -nr 03 -function StartSystem HDB

Enable system replication on hanadb3, then register hanadb4 as a secondary of hanadb3 in synchronous replication mode.

# Execute command using <hanasid>adm on hanadb3
hdbnsutil -sr_enable --name=SITE3

# Execute command using <hanasid>adm on hanadb4
hdbnsutil -sr_register --remoteHost=hanadb3 --remoteInstance=03 --replicationMode=sync --operationMode=logreplay --name=SITE4
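The two registrations differ only in the source host, the replication mode and the site name. The sketch below makes that explicit by building the command from parameters; sr_register_cmd is a hypothetical helper that prints the command rather than running it, and the default instance number 03 is taken from this article.

```shell
# Sketch: build the hdbnsutil registration command from its varying parts.
# sr_register_cmd is a hypothetical helper; it prints, it does not execute.
sr_register_cmd() {
    local remote_host="$1"     # source system to replicate from
    local mode="$2"            # replication mode: sync or async
    local site="$3"            # site name of the node being registered
    local instance="${4:-03}"  # instance number, 03 in this article
    echo "hdbnsutil -sr_register --remoteHost=${remote_host} --remoteInstance=${instance} --replicationMode=${mode} --operationMode=logreplay --name=${site}"
}

sr_register_cmd hanadb1 async SITE3   # command to run on hanadb3
sr_register_cmd hanadb3 sync SITE4    # command to run on hanadb4
```

Keeping the cross-region link asynchronous and the in-region link synchronous mirrors the topology described in the solution overview.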

Start HANA database on hanadb4.

# Execute command using <hanasid>adm
sapcontrol -nr 03 -function StartSystem HDB

After establishing system replication between primary and DR region, check the system replication on hanadb1 in primary region.

# Execute command using <hanasid>adm in primary node (hanadb1) in primary region
python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py
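Besides its printed table, systemReplicationStatus.py also reports the overall state through its exit code. The following sketch maps the exit codes documented by SAP (10 to 15) to readable names; sr_status_name is a hypothetical helper, and the mapping should be verified against the documentation for your HANA revision.

```shell
# Sketch: translate the exit code of systemReplicationStatus.py into a status word.
# sr_status_name is a hypothetical helper; codes follow the SAP documentation.
sr_status_name() {
    case "$1" in
        10) echo "no system replication" ;;
        11) echo "error" ;;
        12) echo "unknown" ;;
        13) echo "initializing" ;;
        14) echo "syncing" ;;
        15) echo "active" ;;
        *)  echo "unexpected exit code $1" ;;
    esac
}

# Usage as <hanasid>adm on hanadb1 (commented out because it needs a HANA host):
# python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py >/dev/null
# sr_status_name "$?"
sr_status_name 15
```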

Failover to DR region

When the primary region goes down and the business decides to fail over to the DR region, follow the steps below to take over hanadb3 in the DR region as the new primary.

Perform a takeover on hanadb3.

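# Execute command using <hanasid>adm on hanadb3
# The primary region is down in this scenario, so a plain takeover is used;
# --suspendPrimary (takeover with handshake) needs a reachable primary.
hdbnsutil -sr_takeover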

Check the system replication status on hanadb3. HANA system replication between hanadb3 and hanadb4 should be active after the takeover.

# Execute command using <hanasid>adm in new primary node (hanadb3) in disaster region
python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py
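Since the cluster should only be started once replication between hanadb3 and hanadb4 is active, the check can be repeated until it succeeds. The sketch below is one possible way to do that, assuming that the status script exits with 15 when all services are ACTIVE; wait_for_active is a hypothetical helper, and the attempt count and delay are arbitrary example values.

```shell
# Sketch: re-run a check command until it exits with 15 (ACTIVE) or attempts run out.
# wait_for_active is a hypothetical helper, not an SAP or Pacemaker tool.
wait_for_active() {
    local attempts="$1"   # maximum number of checks
    local delay="$2"      # seconds to wait between checks
    shift 2               # remaining arguments: the check command itself
    local i=1
    local rc
    while [ "$i" -le "$attempts" ]; do
        rc=0
        "$@" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq 15 ]; then
            return 0      # replication reported as active
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1              # never became active within the given attempts
}

# Usage as <hanasid>adm on hanadb3 (commented out because it needs a HANA host):
# wait_for_active 30 10 python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py \
#     && echo "replication active, cluster can be started"
```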

Start the cluster in DR region.

# Start the cluster in disaster region
pcs cluster start --all

Check the status of the cluster. The resources should still be in unmanaged mode.
```
pcs status --full
```
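To confirm quickly that the resources are still unmanaged, the status output can be filtered. This is a minimal sketch that assumes the status output marks such resources with the word "unmanaged", which is how Pacemaker typically reports them; count_unmanaged is a hypothetical helper that reads the status text from standard input.

```shell
# Sketch: count the lines of a status report that carry the "unmanaged" marker.
# count_unmanaged is a hypothetical helper; it reads the report from stdin.
count_unmanaged() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # function's exit status at 0 even when nothing matches.
    grep -ci 'unmanaged' || true
}

# Usage on a DR cluster node (commented out because it needs a running cluster):
# pcs status --full | count_unmanaged
```

A result of 0 after starting the cluster would suggest that the resources are already managed and should be reviewed before you continue.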

Clean up the resources and place them in managed mode.

# If any resources failed after the cluster started, clean them up.
pcs resource cleanup SAPHana_HN1_03-Clone

# Only applicable if you are using NFS mounts for the HANA file systems
pcs resource manage hanadb3_nfs
pcs resource manage hanadb4_nfs

# Place HANA and virtual IP group resources in managed mode
pcs resource manage SAPHana_HN1_03-Clone
pcs resource manage g_ip_HN1_03

Enable cluster service to start on VM boot.

# Execute command in disaster region cluster i.e. hanadb3 or hanadb4.
pcs cluster enable --all

Important: After the DR region becomes the new primary for the HANA database, you need to update the database connection on all clients to point to the DR region (its virtual IP or hostname).

Configure former primary as new secondary site

After failover to the DR region, you want the former primary region to become your new secondary region. Follow the steps below only in the former primary region.

Disable cluster service to start on VM boot.

# Execute command in former primary node i.e. hanadb1
pcs cluster disable --all

Put the resources in unmanaged mode.

# Place HANA resource in unmanaged mode
pcs resource unmanage SAPHana_HN1_03-Clone

# Place virtual IP group (contains virtual IP and probe port) resource in unmanaged mode
pcs resource unmanage g_ip_HN1_03

If the HANA file systems are on NFS mounts, put the filesystem resources on the former primary site into unmanaged state. This step applies only when NFS file systems are used for HANA.

# Place filesystem group resources in unmanaged mode
pcs resource unmanage hanadb1_nfs
pcs resource unmanage hanadb2_nfs

Stop the cluster.

# Stop the cluster after placing resources in unmanaged mode
pcs cluster stop --all

Stop HANA database on hanadb1 and hanadb2, if running.

# Execute command using <hanasid>adm on hanadb1 and hanadb2
sapcontrol -nr 03 -function StopSystem HDB

Clean up SAP HANA replication setup on former primary.

# Execute command using <hanasid>adm
hdbnsutil -sr_cleanup --force

Copy keys from hanadb3 in new primary to hanadb1 and hanadb2.

# Copy keys from hanadb3 to hanadb1
scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT sidadm@hanadb1:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY sidadm@hanadb1:/usr/sap/HN1/SYS/global/security/rsecssfs/key/

# Copy keys from hanadb3 to hanadb2
scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT sidadm@hanadb2:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY sidadm@hanadb2:/usr/sap/HN1/SYS/global/security/rsecssfs/key/

Register hanadb1 as secondary of hanadb3 in asynchronous replication mode. Log in as <hanasid>adm in hanadb1.

hdbnsutil -sr_register --remoteHost=hanadb3 --remoteInstance=03 --replicationMode=async --operationMode=logreplay --name=SITE1

Start HANA database on hanadb1.

# Execute command using <hanasid>adm
sapcontrol -nr 03 -function StartSystem HDB

Enable system replication on hanadb1, then register hanadb2 as a secondary of hanadb1 in synchronous replication mode.

# Execute command using <hanasid>adm on hanadb1
hdbnsutil -sr_enable --name=SITE1

# Execute command using <hanasid>adm on hanadb2
hdbnsutil -sr_register --remoteHost=hanadb1 --remoteInstance=03 --replicationMode=sync --operationMode=logreplay --name=SITE2

Start HANA database on hanadb2.

# Execute command using <hanasid>adm
sapcontrol -nr 03 -function StartSystem HDB

After establishing system replication between the new primary and former primary regions, check the system replication status on hanadb3 in the new primary region.

# Execute command using <hanasid>adm on the new primary node (hanadb3) in the new primary region
python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py

Updated Jun 09, 2022

Version 1.0

Disaster Recovery

red hat

SAP High Availability

sap on azure

dennispadia

Microsoft
