SAP ASCS HA Cluster (in Linux OS) failover to DR region using Azure Site Recovery

Overview

This blog provides guidance on the steps to perform during a failover of SAP ASCS/ERS HA VMs in a Linux cluster to the DR region in Azure using Azure Site Recovery (ASR). It details the changes required in the DR environment to re-configure the Pacemaker cluster so that the ASCS/ERS HA environment starts with the Azure Fence Agent as the STONITH device. The steps cover both SUSE Linux and RHEL. In a SUSE Pacemaker cluster, an SBD device can be used for fencing instead of the Azure Fence Agent, but that requires additional VMs, and its DR setup needs additional changes that are not covered in this blog.

 

Note: The specific procedures described have been exercised with these OS releases:
• OS release #1: SUSE Linux 12 SP5
• OS release #2: RHEL 8.1
Please note that the procedures described have not been coordinated with the OS providers and therefore might not work completely with your specific implementation or with future OS releases. As a result, you should test the procedures thoroughly in your environment.

Also note that the procedure as described works only with the Azure Fence Agent and not with iSCSI SBD devices.

 

Disaster Recovery Architecture for SAP ASCS HA Cluster

The SAP ASCS/ERS HA cluster design in the primary and DR regions in Azure is shown in the diagram below and can be used as a reference architecture for SAP HA and DR setups in Azure. A highly available NFS file share is used for the common SAP file systems. Azure Site Recovery (ASR) is the recommended way to replicate the VMs across regions for the DR setup. An NFS file share must be available in the respective region to start the SAP ASCS/ERS application services and should be synchronized between regions so that the latest data is available.

 

Picture1.jpg

Preparations

  • Configure ASR for both nodes of ASCS in the primary region.
  1. Deploy the Resource Group, VNet, Subnet and Recovery Vault in the secondary region.
  2. Click on 'Disaster recovery' for the ASCS/ERS VMs and select the DR region (e.g. West US 2).
  3. In advanced settings, select the target Resource Group, VNet, Recovery Vault, Availability Set (if needed), PPG (if needed) and the disks to be included.

Picture2.jpg

  4. Review the settings and start the replication.
  • Check that ASR replication is at 100% and healthy.

Picture3.jpg

Picture4.jpg

 

  • Deploy an Azure ILB for ASCS & ERS in the DR region.

Define the frontend IPs, backend pools, probe ports and load-balancing rules as shown in the table below. The frontend IPs will be different in the DR region; the probe ports can be the same as in the primary-region ASCS/ERS cluster. An Azure CLI sketch follows the table.

| Front-end IP | Backend pool | Health probe port | Load balancing rule |
| --- | --- | --- | --- |
| 172.10.0.45 (ASCS virtual IP - HA) | azshafsascs1 and azshafsascs2 | 64300 | Enable HA ports, enable floating IP, idle timeout 30 minutes |
| 172.10.0.46 (ERS virtual IP - HA) | azshafsascs1 and azshafsascs2 | 64302 | Enable HA ports, enable floating IP, idle timeout 30 minutes |
| 173.30.0.45 (ASCS virtual IP - DR) | azshafsascs1-test and azshafsascs2-test | 64300 | Enable HA ports, enable floating IP, idle timeout 30 minutes |
| 173.30.0.46 (ERS virtual IP - DR) | azshafsascs1-test and azshafsascs2-test | 64302 | Enable HA ports, enable floating IP, idle timeout 30 minutes |
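As an illustration, the DR load balancer for ASCS could be created with the Azure CLI roughly as follows. This is a minimal sketch: the resource group, load balancer, VNet and subnet names are hypothetical placeholders, and only the ASCS frontend is shown (repeat the probe and rule for ERS with 173.30.0.46 and probe port 64302).

# Hypothetical names (rg-sap-dr, lb-sap-dr, vnet-sap-dr, snet-sap); adjust to your landscape
az network lb create --resource-group rg-sap-dr --name lb-sap-dr --sku Standard \
    --vnet-name vnet-sap-dr --subnet snet-sap \
    --frontend-ip-name fe-ascs --private-ip-address 173.30.0.45 \
    --backend-pool-name bp-ascs

# Health probe on the same port as in the primary-region cluster
az network lb probe create --resource-group rg-sap-dr --lb-name lb-sap-dr \
    --name hp-ascs --protocol Tcp --port 64300

# HA-ports rule (protocol All, ports 0) with floating IP and a 30-minute idle timeout
az network lb rule create --resource-group rg-sap-dr --lb-name lb-sap-dr \
    --name rule-ascs --protocol All --frontend-port 0 --backend-port 0 \
    --frontend-ip-name fe-ascs --backend-pool-name bp-ascs \
    --probe-name hp-ascs --floating-ip true --idle-timeout 30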

  • NFS file share synchronization

The NFS file shares for 'sapmnt', 'trans' and 'usr/sap' must be synchronized with the primary region and available/mounted in the DR region. The new location/path of the NFS shares needs to be updated in '/etc/fstab' and in the cluster configuration on the DR ASCS VMs.

Note: One option for the NFS file share is Azure Files NFS. As ASR can't replicate NFS sources, one method of replication is to copy the data to a locally attached disk on the ASCS/ERS VMs using a cron job (for frequent interval copies) so that ASR can replicate the data to the DR region. Detailed steps are described in the Appendix.

 

ASCS/ERS DR Failover

The following items are prefixed with either [A - DR] (applicable to all nodes of DR ASCS/ERS), [1-DR] (only applicable to node 1 of DR ASCS/ERS) or [2-DR] (only applicable to node 2 of DR ASCS/ERS).

  1. Perform the 'Failover' or 'Test Failover' of the ASCS/ERS cluster VMs to the DR region using ASR.
 
  2. [A - DR] Update the IP addresses of the VMs and the virtual IPs either in AD/DNS or in the 'hosts' file (see the sketch below).
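If the 'hosts' file is used, the entries on the DR nodes could look like this minimal sketch; the virtual hostnames are hypothetical, and the addresses are the DR-side IPs used throughout this example:

# /etc/hosts on both DR nodes (hypothetical virtual hostnames)
173.30.0.45   sapascsvh      # ASCS virtual IP in DR
173.30.0.46   sapersvh       # ERS virtual IP in DR
173.30.0.61   azshafsascs1   # node 1 physical IP in DR
173.30.0.62   azshafsascs2   # node 2 physical IP in DR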
  3. [A - DR] Mount the NFS file systems for 'sapmnt', 'trans' and 'SYS'. The mounting process depends on the NFS share type (ANF / Azure Files NFS, in preview as of February 2021).
  4. [A - DR] Ensure that the contents of the 'sapmnt', 'trans' and 'SYS' file systems are synchronized from the primary region.
  5. [A - DR] Update the VMs' physical IP addresses in /etc/corosync/corosync.conf:

nodelist {
        node {
                ring0_addr: 173.30.0.61
                nodeid: 1
        }
        node {
                ring0_addr: 173.30.0.62
                nodeid: 2
        }
}

Note: This step is only required on SUSE Linux.

  6. [A - DR] Start the Pacemaker cluster using the command:
    • For SUSE Linux

systemctl start pacemaker

    • For RHEL

pcs cluster start

           

  7. [A - DR] Put the cluster in maintenance mode.
    • For SUSE Linux

sudo crm configure property maintenance-mode="true"

    • For RHEL

sudo pcs property set maintenance-mode=true

 

  8. [1-DR] Update the Pacemaker configuration and save the changes.

For SUSE Linux: The properties of the resources can be changed in the GUI tool 'Hawk' (https://<hostname>:7630/) or with "crm configure edit" (use 'vi' editor commands to update the content).

For RHEL: The properties of the resources can be changed using the 'PCSD Web UI' (https://<hostname>:2224/). Once you start the pcs Web UI, click on '+ Add Existing' and enter the hostname of the cluster to see the properties.

Update the following items (a command-line sketch follows the screenshots below):

  • Fileshare location of ‘ASCS’ and ‘ERS’.

Picture6.jpg

  • Probe port numbers of the ILB for ASCS and ERS (if different probe port numbers are used in the primary and DR regions).
  • Frontend IP (virtual IP) defined in the ILB for ASCS and ERS.

 

Picture7.jpg
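For reference, the same changes can also be made from the command line. This is a minimal sketch with hypothetical resource names (rsc_ip_T01_ASCS for the virtual IP, rsc_nc_T01_ASCS for the azure-lb probe listener, fs_T01_ASCS for the file system); check the actual names in your cluster with 'crm configure show' or 'pcs resource config'.

# SUSE Linux (crmsh) - hypothetical resource names
sudo crm resource param rsc_ip_T01_ASCS set ip 173.30.0.45
sudo crm resource param rsc_nc_T01_ASCS set port 64300
sudo crm resource param fs_T01_ASCS set device "<DR NFS share path>"

# RHEL (pcs) - hypothetical resource names
sudo pcs resource update rsc_ip_T01_ASCS ip=173.30.0.45
sudo pcs resource update rsc_nc_T01_ASCS port=64300
sudo pcs resource update fs_T01_ASCS device="<DR NFS share path>"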

 
  • Azure Fence Agent.
    • We can reuse the Azure Fence Agent service principal created for the ASCS/ERS cluster in the primary region, or optionally create a new one.
    • Assign the custom role to the service principal for the DR VMs as per the link (see the sketch below).
    • Update the Azure Fence Agent details (new resource group) in the cluster configuration.

Picture8.jpg
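As a sketch, the role assignment could be done with the Azure CLI as follows; 'Linux Fence Agent Role' is the custom role name used in the Microsoft documentation, and the app ID, subscription ID and resource names are placeholders.

# Repeat for each DR cluster VM (placeholder IDs and names)
az role assignment create --assignee <fence-agent-app-id> \
    --role "Linux Fence Agent Role" \
    --scope "/subscriptions/<subscription-id>/resourceGroups/<dr-resource-group>/providers/Microsoft.Compute/virtualMachines/<vm-name>"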

Note: The Azure Fence Agent requires outbound connectivity to public endpoints, as documented (along with possible solutions) in 'Public endpoint connectivity for VMs using standard ILB'.

 

Note: When performing a 'Test Failover' in ASR, the VM name created in the DR region is suffixed with '-test', but the hostname at the operating system level stays the same as on the primary-region VMs. Since the VM name doesn't match the node name (hostname) in the cluster, we need to add the parameter 'pcmk_host_map' to the Azure Fence Agent configuration in Pacemaker to map hostname to VM name. This ensures the VMs can be fenced during cluster testing.

Picture9.jpg
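A minimal sketch of this update, assuming the STONITH resource is named rsc_st_azure (check the actual name in your cluster) and using the '-test' VM names from the table above:

# SUSE Linux (crmsh): edit the fencing resource and set, inside 'params':
#   resourceGroup="<dr-test-resource-group>"
#   pcmk_host_map="azshafsascs1:azshafsascs1-test;azshafsascs2:azshafsascs2-test"
sudo crm configure edit rsc_st_azure

# RHEL (pcs)
sudo pcs stonith update rsc_st_azure resourceGroup="<dr-test-resource-group>" \
    pcmk_host_map="azshafsascs1:azshafsascs1-test;azshafsascs2:azshafsascs2-test"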

 
  9. [A - DR] Ensure that the contents of the 'ASCS<nr>' and 'ERS<nr>' file systems are synchronized with the data from the primary region.
  10. [1-DR] Remove the maintenance mode and clean up the cluster resources (if required).
    • For SUSE Linux

sudo crm configure property maintenance-mode="false"

    • For RHEL

sudo pcs property set maintenance-mode=false

  11. Check the cluster status, for example:
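Either of the standard status commands shows whether all resources have started on the DR nodes.

# SUSE Linux
sudo crm status

# RHEL
sudo pcs status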

Picture10.jpg

 

  12. Continue with the DR activation tasks for the DB and application servers.
  13. Perform the DR validation tasks and cluster testing in the DR environment.
  14. Once the DR test is completed, perform 'Cleanup test failover' in ASR for both ASCS/ERS VMs.

 

 

 

 

Appendix

This section describes the steps to synchronize Azure Files NFS between the primary and secondary regions. This method is one of several possible ways to achieve data synchronization.

To set up an ASCS/ERS cluster with Azure Files NFS (in public preview as of February 2021), please refer to the blog.

High level steps

  • Attach and mount Azure premium disks to the ASCS/ERS VMs in the primary region.
  • Regularly copy the NFS share data/files onto the Azure disk using a cron job script.
  • ASR is able to copy the Azure disk to the DR region. Ensure this disk is included in the ASR replication.
  • During DR activation, once the VMs are available, mount the Azure Files NFS from the DR region.
  • Copy the data/files from the local disks to the Azure Files NFS mount points.

Detailed Steps:

The steps below are provided as a reference, assuming SAP SID 'T01', ASCS system number '00' and ERS system number '02'.

In the Primary Region

  1. [A] Add an Azure premium disk to both VMs of the ASCS/ERS cluster and mount the file system (e.g. /sapfoldercopy).
  2. [A] Create folders in the file system:

sudo mkdir -p /sapfoldercopy/T01ASCS00
sudo mkdir -p /sapfoldercopy/T01ERS02
sudo mkdir -p /sapfoldercopy/sapmntT01
sudo mkdir -p /sapfoldercopy/trans
sudo mkdir -p /sapfoldercopy/usrsapT01
sudo chown <sid>adm:sapsys /sapfoldercopy/*

  3. [A] Create a shell script to copy the data from the NFS file share to the local Azure disk:

>>vi copy_sap_folders.sh

#!/bin/bash
# Copy the shared SAP file systems to the locally attached disk so ASR can replicate them.
cp -p -u -R /sapmnt/T01/ /sapfoldercopy/sapmntT01/
cp -p -u -R /usr/sap/trans/ /sapfoldercopy/trans/
cp -p -u -R /usr/sap/T01/ /sapfoldercopy/usrsapT01/

# If the ERS instance directory is populated, ERS is running on this node: copy its
# data and set aside the stale ASCS copy ('ls -l' prints a 'total' line, so more
# than one line of output means the directory is not empty).
erscount="$(ls -l /usr/sap/T01/ERS02/ | wc -l)"
if [[ $erscount -gt 1 ]]
then
        cp -p -R /usr/sap/T01/ERS02/ /sapfoldercopy/T01ERS02/
        mv /sapfoldercopy/T01ASCS00/ASCS00 /sapfoldercopy/T01ASCS00/ASCS00_old
fi

# Likewise, if the ASCS instance directory is populated, ASCS is running on this
# node: copy its data and set aside the stale ERS copy.
ascscount="$(ls -l /usr/sap/T01/ASCS00/ | wc -l)"
if [[ $ascscount -gt 1 ]]
then
        cp -p -R /usr/sap/T01/ASCS00/ /sapfoldercopy/T01ASCS00/
        mv /sapfoldercopy/T01ERS02/ERS02 /sapfoldercopy/T01ERS02/ERS02_old
fi

Note: Comment out the copy of 'sapmnt', 'trans' and 'usrsapT01' on one of the VMs, as the contents are the same on both VMs.

  4. [A] Ensure the script has the right ownership and permissions:

chown <sid>adm:sapsys copy_sap_folders.sh
chmod 755 copy_sap_folders.sh

  5. [A] Schedule the cron job for user <sid>adm:

>>crontab -e

# run the copy script at minutes 15, 30, 45 and 59 of every hour
15,30,45,59 * * * * /home/t01adm/copy_sap_folders.sh

In the Secondary Region, during DR activation or DR testing

  1. [A - DR] Update the /etc/fstab file to mount the Azure Files NFS shares in the secondary region.

Picture11.jpg
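As an illustration (the screenshot above shows the actual entries), a DR-side fstab entry for an Azure Files NFS share might look like the following; the storage account name 'sapnfsdr' and share names are hypothetical placeholders, and the mount options follow the Azure Files NFS 4.1 documentation.

# /etc/fstab on the DR nodes (hypothetical storage account 'sapnfsdr')
sapnfsdr.file.core.windows.net:/sapnfsdr/sapmntt01  /sapmnt/T01     nfs  vers=4,minorversion=1,sec=sys  0  0
sapnfsdr.file.core.windows.net:/sapnfsdr/transt01   /usr/sap/trans  nfs  vers=4,minorversion=1,sec=sys  0  0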

 

>> mount -a

  2. [1-DR] Update the cluster configuration with the Azure Files locations for the ASCS00 and ERS02 folders. Details are described in the main section of this document.
  3. [A - DR] Copy the contents from the local Azure disk file system (/sapfoldercopy) to the respective Azure Files NFS file system paths, for example:
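A minimal sketch of the copy-back, following the folder layout created by the script above (note that 'cp -R /sapmnt/T01/ /sapfoldercopy/sapmntT01/' placed the data one level down, e.g. under /sapfoldercopy/sapmntT01/T01):

# Copy the replicated data from the local disk back onto the DR NFS mounts
sudo cp -p -R /sapfoldercopy/sapmntT01/T01/. /sapmnt/T01/
sudo cp -p -R /sapfoldercopy/trans/trans/. /usr/sap/trans/
sudo cp -p -R /sapfoldercopy/usrsapT01/T01/. /usr/sap/T01/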