
Running SAP Applications on the Microsoft Platform

DR scenario for SAP Windows application servers with the focus on ASCS/ERS using Azure Files SMB

SteffenMueller
Apr 10, 2023

Management Summary

 

A disaster exists if the primary site (region) becomes entirely unavailable due to a major outage. Within a region, the system is protected against single points of failure by WSFC and other redundancy measures.

In this example the focus is on the ASCS/ERS functionality. In a second Azure region, a second SAP system is built that acts, for example, as a QAS system. The virtual machines making up this system can be temporarily repurposed for the PRD DR system. This is a very cost-effective measure, since no machines sit idle.

Both systems are installed with virtual hostnames. Following SAP Note 1564275, CNAMEs should not be used to implement virtual hostnames; therefore, a second IP address is added to each virtual machine acting as an SAP application server. This does not apply to the WSFC ASCS cluster role, where a CNAME is used.

In a disaster, the SAP system running on the DR site is shut down. The DNS entries are adjusted to reflect the virtual hostnames of the PRD system, and the CNAME for the ASCS cluster role is repointed. Additionally, the SAPGLOBALHOST entry in the DEFAULT.PFL profile needs to be changed from the PRD site to the DR site.

To ensure the consistency of the DR system, OS patches and SAP configuration changes should be applied at the same time they are made on the PRD site. These are mainly changes at the file system level, such as profiles; changes stored within the database are replicated through the database. This approach preserves all the DNS names used in the SAP PRD system, avoiding the need for adjustments within the SAP system itself: printer definitions, logon groups, external interfaces, etc. work immediately after the system comes online.
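To illustrate the SAPGLOBALHOST change described above, here is a minimal, hedged Python sketch that repoints the parameter in profile text. The function name and the sample profile content are illustrative assumptions, not taken from a real system; in practice the change is made by editing DEFAULT.PFL directly.

```python
# Minimal sketch: repoint SAPGLOBALHOST in DEFAULT.PFL text from the
# PRD host to the DR host. Hostnames below mirror this blog's example;
# the helper itself is a hypothetical illustration.
import re

def repoint_sapglobalhost(profile_text: str, dr_host: str) -> str:
    """Replace the value of SAPGLOBALHOST, keeping all other lines."""
    return re.sub(r"(?m)^(SAPGLOBALHOST\s*=\s*)\S+",
                  lambda m: m.group(1) + dr_host,
                  profile_text)

profile = "SAPSYSTEMNAME = TS6\nSAPGLOBALHOST = ts6ascseast\n"
print(repoint_sapglobalhost(profile, "ts6ascswest"))
# -> SAPSYSTEMNAME = TS6
#    SAPGLOBALHOST = ts6ascswest
```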

This solution supports all Windows-based application server scenarios. The underlying database can be SAP HANA, Microsoft SQL Server, or any other SAP-certified database that supports data replication.

The same scenario could also be used for Windows-based systems using ANF (Azure NetApp Files).

The clustered Enqueue Replication Server 2 (ERS2) can only be installed for SAP HANA-based systems that are installed using SWPM2. For installations based on SWPM1, the cluster functionality can be retrofitted according to SAP Note 263928. Be careful: after the ERS2 conversion has been done, the current SWPM1 version does not allow additional application server installations.

 

Prerequisites 

 

For the DR site to be consistent, apply OS patches and changes to the SAP system configuration to both sites at the same time. Make sure that the SAP kernel repository is in sync on both SAPGLOBALHOSTs (robocopy). This also applies to any other information you want to preserve, such as SAP job logs. Changes stored within the DB are taken care of by system replication.
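As a portable illustration of this scheduled sync (on Windows this is a robocopy job between the two Azure Files shares, as shown in the activation steps later in this post), the following Python sketch mirrors one directory tree into another. The paths in the comment are hypothetical stand-ins; this is not the production mechanism, only a sketch of the mirroring semantics.

```python
# Portable sketch of the scheduled sync job. In production this would
# be a robocopy /MIR task between the two SMB shares; shutil stands in
# here purely for illustration.
import shutil
from pathlib import Path

def mirror(src: Path, dst: Path) -> None:
    """Mirror src into dst: dst ends up as an exact copy of src."""
    if dst.exists():
        shutil.rmtree(dst)         # drop stale target content (like /MIR)
    shutil.copytree(src, dst)      # recursive copy (like /S /E)

# Hypothetical usage against local stand-ins for the two shares:
# mirror(Path("ts6smbeast/sapmnt/TS6/sys/global"),
#        Path("ts6smbwest/sapmnt/TS6/sys/global"))
```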

Perform a standard SAP installation based on the High Availability scenario. For the Windows Failover Cluster installation, use a file share cluster (FSC) as described in High availability for SAP NetWeaver on Azure VMs on Windows with Azure Files Premium SMB for SAP applications.

Installing the PRD (East) and QAS (West) systems results in the following DNS/hosts file entries.

 

These are the DNS/hosts file entries during normal operation.

 

The 10.19.x.x VNET is the production site.

 

VM                              IP address      Alias/CNAME

mswinnode43              10.19.0.52       #VM cluster node 1

mswinnode44              10.19.0.53       #VM cluster node 2

ts6ascseast                 10.19.0.54       ts6ascs

ts6erseast                    10.19.0.55

mswinapp01                10.19.0.56       #Application server 1

mswinapp02                10.19.0.57       #Application server 2

mswinapp03                10.19.0.58       #Application server 3

mswinapp04                10.19.0.59       #Application server 4

mswinapp05                10.19.0.60       #Application server 5

ts6app01                     10.19.0.71       #Virt. hostname 1 second IP address for 10.19.0.56

ts6app02                     10.19.0.72       #Virt. hostname 2 second IP address for 10.19.0.57

ts6app03                     10.19.0.73       #Virt. hostname 3 second IP address for 10.19.0.58

ts6app04                     10.19.0.74       #Virt. hostname 4 second IP address for 10.19.0.59

ts6app05                     10.19.0.75       #Virt. hostname 5 second IP address for 10.19.0.60

 

The 10.20.x.x VNET is the DR site.

 

ts6ascswest                 10.20.0.40       #CNAME ts6ascs must be activated here in case of DR.

ts6erswest                   10.20.0.41

mswinnode53              10.20.0.52       #VM cluster node 1                                       

mswinnode54              10.20.0.53       #VM cluster node 2

ts7ascswest                10.20.0.54       ts7ascs

ts7erswest                   10.20.0.55

mswinapp61                10.20.0.101     #Application server 1

mswinapp62                10.20.0.102     #Application server 2

mswinapp63                10.20.0.103     #Application server 3

mswinapp64                10.20.0.104     #Application server 4

mswinapp65                10.20.0.105     #Application server 5

ts7app01                     10.20.0.111     #Virt. hostname 1 second IP address for 10.20.0.101

ts7app02                     10.20.0.112     #Virt. hostname 2 second IP address for 10.20.0.102

ts7app03                     10.20.0.113     #Virt. hostname 3 second IP address for 10.20.0.103

ts7app04                     10.20.0.114     #Virt. hostname 4 second IP address for 10.20.0.104

ts7app05                     10.20.0.115     #Virt. hostname 5 second IP address for 10.20.0.105

ts6app01                     10.20.0.121     #Placeholder entry for speedy address change

ts6app02                     10.20.0.122     #Placeholder entry for speedy address change

ts6app03                     10.20.0.123     #Placeholder entry for speedy address change

ts6app04                     10.20.0.124     #Placeholder entry for speedy address change

ts6app05                     10.20.0.125     #Placeholder entry for speedy address change
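The placeholder entries above exist so that, during DR activation, only IP addresses change and no new DNS records have to be created under time pressure. The following hedged Python sketch states that invariant; the data literals mirror the tables above, while the check itself is purely illustrative.

```python
# The DR hosts table pre-creates a placeholder record for every PRD
# virtual hostname, so failover only changes addresses instead of
# creating records. This sketch verifies that invariant.
prd_virtual = [f"ts6app{n:02d}" for n in range(1, 6)]
dr_placeholders = {f"ts6app{n:02d}": f"10.20.0.{120 + n}"
                   for n in range(1, 6)}

missing = [h for h in prd_virtual if h not in dr_placeholders]
assert not missing, f"DR site lacks placeholders for: {missing}"
print("all PRD virtual hostnames have DR placeholder entries")
```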

 

Activating DR scenario

 

DNS/hosts file entries during the active DR scenario.

 

The 10.19.x.x VNET is the production site.

 

VM                              IP address        Alias/CNAME

mswinnode43              10.19.0.52       #VM cluster node 1

mswinnode44              10.19.0.53       #VM cluster node 2

ts6ascseast                 10.19.0.54      

ts6erseast                    10.19.0.55

mswinapp01                10.19.0.56       #Application server 1

mswinapp02                10.19.0.57       #Application server 2

mswinapp03                10.19.0.58       #Application server 3

mswinapp04                10.19.0.59       #Application server 4

mswinapp05                10.19.0.60       #Application server 5

ts6app01                     10.19.0.71       #Virt. hostname 1 second IP address for 10.19.0.56

ts6app02                     10.19.0.72       #Virt. hostname 2 second IP address for 10.19.0.57

ts6app03                     10.19.0.73       #Virt. hostname 3 second IP address for 10.19.0.58

ts6app04                     10.19.0.74       #Virt. hostname 4 second IP address for 10.19.0.59

ts6app05                     10.19.0.75       #Virt. hostname 5 second IP address for 10.19.0.60

 

The 10.20.x.x VNET is the DR site.

 

ts6ascswest                10.20.0.40       ts6ascs          #CNAME ts6ascs is activated here during DR.

ts6erswest                   10.20.0.41

mswinnode53              10.20.0.52       #VM cluster node 1                                       

mswinnode54              10.20.0.53       #VM cluster node 2

ts7ascswest                 10.20.0.54      

ts7erswest                   10.20.0.55

mswinapp61                10.20.0.101     #Application server 1

mswinapp62                10.20.0.102     #Application server 2

mswinapp63                10.20.0.103     #Application server 3

mswinapp64                10.20.0.104     #Application server 4

mswinapp65                10.20.0.105     #Application server 5

ts6app01                     10.20.0.111     #Virt. hostname 1 second IP address for 10.20.0.101

ts6app02                     10.20.0.112     #Virt. hostname 2 second IP address for 10.20.0.102

ts6app03                     10.20.0.113     #Virt. hostname 3 second IP address for 10.20.0.103

ts6app04                     10.20.0.114     #Virt. hostname 4 second IP address for 10.20.0.104

ts6app05                     10.20.0.115     #Virt. hostname 5 second IP address for 10.20.0.105

 

 

  1. Shut down the QAS SAP TS7 system on the DR site.
    Stop the application servers and the two cluster roles.
  2. Make the DNS changes.
    Replace the CNAME of the PRD ASCS cluster role with ts6ascs. Change the DNS/hosts file entries from ts7appxx to ts6appxx by switching the ts6appxx addresses to 10.20.0.111 – 10.20.0.115. Delete the 10.20.0.121 – 10.20.0.125 placeholder entries.
  3. Change SAPGLOBALHOST.
    Verify that the SAPGLOBALHOST entry in DEFAULT.PFL points to the DR site (ts6ascswest).
  4. Make sure that SAPDBHOST points to the replicated database on the DR site.
  5. Use robocopy (ideally already scheduled to run on a regular basis) to copy the desired files (SAP kernel repository, job logs, etc.) from ts6smbeast.file.core.windows.net to ts6smbwest.file.core.windows.net.
    The following command is an example for copying the SAP job logs:
    robocopy \\ts6smbeast.file.core.windows.net\sapmnt\TS6\sys\global    \\ts6smbwest.file.core.windows.net\sapmnt\TS6\sys\global /s /e /mir
    The full documentation of the robocopy command can be found at robocopy | Microsoft Learn.
  6. ILBs
    Because each cluster role is unique, no changes to any ILB are required.
  7. Start SAP PRD on the DR site.
    Since the system uses the same hostnames as on the primary site, all SAP system-specific settings remain intact. No configuration changes to logon groups, printer definitions, interfaces, etc. are required.
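The hosts flip in step 2 above can be sketched as follows. This hedged Python example rewrites an in-memory hosts table: the entries mirror the tables earlier in this post, while the function itself is a hypothetical illustration of the address switch, not a tool used in the actual procedure.

```python
# Sketch of the step-2 hosts flip on the DR site: drop the ts7appxx
# virtual hostnames and repoint ts6appxx from the placeholder
# addresses (10.20.0.12x) to the freed second IPs (10.20.0.11x).
def activate_dr(hosts: dict) -> dict:
    flipped = {}
    for name, ip in hosts.items():
        if name.startswith("ts7app"):
            continue                      # TS7 virtual hostnames go away
        if name.startswith("ts6app"):
            last = int(ip.rsplit(".", 1)[1])
            ip = f"10.20.0.{last - 10}"   # .121-.125 -> .111-.115
        flipped[name] = ip
    return flipped

dr_hosts = {"ts7app01": "10.20.0.111",
            "ts6app01": "10.20.0.121",
            "mswinapp61": "10.20.0.101"}
print(activate_dr(dr_hosts))
# -> {'ts6app01': '10.20.0.111', 'mswinapp61': '10.20.0.101'}
```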

 

Failback

 

  1. Stop PRD TS6 on the DR site.
  2. Make sure the DB is synced back to the primary site (East).
  3. Copy log files etc. back from ts6smbwest.file.core.windows.net to ts6smbeast.file.core.windows.net.
  4. Revert the DNS/hosts file entries to the pre-failover settings.
  5. Start PRD on the primary site (East).
  6. Start QAS on the DR site (West).

 

Appendix

 

Assign multiple IP addresses to virtual machines using the Azure portal

 



 

 

Updated Apr 06, 2023
Version 1.0

2 Comments

  • Dear PavloS2260!

    Thank you for taking the time to read my blog and for your feedback! The idea behind the blog is a disaster recovery scenario with Windows application servers where no system is idle. Therefore, we have two SAP systems to start with: TS6 and TS7. TS6 is the production system (PRD) and TS7 is the quality assurance system (QAS). The production database is synchronized, depending on the DB, by AlwaysOn in the case of MS SQL Server and by HSR in the case of SAP HANA. A copy of the database can be found on the DR site, as shown in the first diagram.

    I want to point out that failing over to a DR site is a judgment call, since changes impacting connected SAP systems might be involved; establishing those changes may take longer than waiting for the situation to be resolved on the primary side. The individual sites are also protected by local HA measures like clustering the databases and ASCS/SCS. This is not shown in the first diagram, which would have made the diagram too complex; I should have at least mentioned it in the description.

    As you can see in the screenshots of the DEFAULT.PFL, the name of the SAPDBHOST was changed accordingly in order to point to the copy of the primary DB. During an active DR scenario, the QAS system is of course not available. A rename of the SID is not necessary, since both systems are already installed; adapting the profiles shown will suffice. For the synchronization of the SAP executables and log files, Windows robocopy is used as described. All the screenshots used in the blog are based on MS Windows with SQL Server.

  • PavloS2260
    Copper Contributor

    This is a great idea, thank you for posting this. However, the second system cannot act as QA without any system being idle: according to the diagram we have DB sync going on, which means that for the second system to function as QA it would need a separate DB or no sync at all (which defeats the purpose of the DR).

     

    In addition, in order to leverage DR properly you have to keep the same SIDs to avoid repopulating XML on the system and renaming integrations that use the SID. Alternatively, change SIDs during DR, which is quite cumbersome on Windows since the SID is tied to AD-based logins (domain\SIDadm, domain\sapServiceSID), although it is possible if the RTO is quite substantial.

     

    All the above to clarify that this design can 100% be used as DR quite effectively; however, using this configuration for PRD + QA (DR) will not quite work unless the RTO allows for a system rename.