Microsoft Mission Critical Blog

MSL correction from clone to multistate HANA DB Cluster SUSE activation

AnuradhaKarnam
Oct 28, 2025

Introduction:

SAP HANA system replication involves configuring one primary node and at least one secondary node. Any changes made to the data on the primary node are replicated to the secondary node synchronously. This ensures that we have a consistent and up-to-date backup, which is crucial for maintaining the integrity and availability of our data.

Problem Description:

An Azure VM was in a degraded state, which caused a major outage because the SAP cluster was unable to start. The node health score (-1000000) did not reset automatically after the VM was redeployed and persisted until manual intervention.

If your cluster nodes run SLES 12 or later with the classic SAPHanaSR resource agents, define the HANA resource as a multi-state (ms) resource as shown below. Note that a promotable clone is not supported with these agents.

Replace <placeholders> with your instance number and HANA system ID.

sudo crm configure primitive rsc_SAPHana_<HANA SID>_HDB<instance number> ocf:suse:SAPHana \
  operations \$id="rsc_sap_<HANA SID>_HDB<instance number>-operations" \
  op start interval="0" timeout="3600" \
  op stop interval="0" timeout="3600" \
  op promote interval="0" timeout="3600" \
  op monitor interval="60" role="Master" timeout="700" \
  op monitor interval="61" role="Slave" timeout="700" \
  params SID="<HANA SID>" InstanceNumber="<instance number>" PREFER_SITE_TAKEOVER="true" \
  DUPLICATE_PRIMARY_TIMEOUT="7200" AUTOMATED_REGISTER="false"

sudo crm configure ms msl_SAPHana_<HANA SID>_HDB<instance number> rsc_SAPHana_<HANA SID>_HDB<instance number> \
  meta notify="true" clone-max="2" clone-node-max="1" \
  target-role="Started" interleave="true"

sudo crm resource meta msl_SAPHana_<HANA SID>_HDB<instance number> set priority 100
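
To make the placeholder substitution concrete, here is a minimal shell sketch that derives the primitive and ms resource names from a SID and an instance number. The helper name hana_names and the sample values HN1 and 03 are illustrative assumptions, not part of the original procedure.

```shell
# Print the primitive and ms resource names for a HANA SID and instance
# number, following the rsc_SAPHana_<SID>_HDB<nr> / msl_SAPHana_<SID>_HDB<nr>
# naming pattern. hana_names is a hypothetical helper for illustration.
hana_names() {
    sid=$1
    inst=$2
    echo "rsc_SAPHana_${sid}_HDB${inst} msl_SAPHana_${sid}_HDB${inst}"
}

# Example with assumed values (SID HN1, instance 03):
hana_names HN1 03
# prints: rsc_SAPHana_HN1_HDB03 msl_SAPHana_HN1_HDB03
```

Generating the names in one place helps avoid the kind of mismatch between the primitive name and the wrapper name that makes constraints point at a resource that does not exist.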

Cutover steps: The cutover consists of pre-steps, execution steps, post-validation steps, and a rollback plan.

The pre-steps are the preparations and checks to complete before the main execution, confirming that the cluster is healthy and ready for the change. The execution steps are the core actions; follow them exactly and in order to avoid issues. The post-validation steps verify the results and confirm that everything works as expected. The rollback plan restores the previous configuration if needed.

Pre-Steps:

Check cluster status:

  • crm status
  • crm configure show
  • SAPHanaSR-showAttr

Ensure no pending operations or failed resources:

  • crm_mon -1

Confirm replication is healthy:

  • hdbnsutil -sr_state
  • SAPHanaSR-showAttr

Back up current configuration:

  • crm configure show > /root/cluster_config_backup.txt
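
As part of the pre-steps, it can help to confirm which wrapper type the cluster currently uses for the HANA resource. The following is a minimal sketch, assuming the crmsh text format in which a wrapper definition starts with "ms <name>" or "clone <name>"; the function name wrapper_type and the resource name in the usage line are hypothetical.

```shell
# Read `crm configure show` output on stdin and print the wrapper type
# ("ms" or "clone") of the resource named in $1, or "none" if no such
# wrapper is defined. wrapper_type is a hypothetical helper.
wrapper_type() {
    awk -v name="$1" '
        ($1 == "ms" || $1 == "clone") && $2 == name { print $1; found = 1; exit }
        END { if (!found) print "none" }
    '
}

# Usage on a cluster node (resource name is an assumed example):
#   crm configure show | wrapper_type msl_SAPHana_HN1_HDB03
```

If this reports "clone" while the classic SAPHanaSR agents are installed, the correction described in the execution steps applies.
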
Execution Steps:

Enable maintenance mode:

  • sudo crm configure property maintenance-mode=true

Delete the incorrect clone resource:

  • crm configure delete msl_SAPHana_<SID>_HDB<instance>

Recreate as a multi-state (ms) resource:

  • sudo crm configure ms msl_SAPHana_<SID>_HDB<instance> rsc_SAPHana_<SID>_HDB<instance> meta notify="true" clone-max="2" clone-node-max="1" target-role="Started" interleave="true" maintenance="true"
  • sudo crm resource meta msl_SAPHana_<SID>_HDB<instance> set priority 100

Disable cluster-wide maintenance mode (the ms resource stays in maintenance):

  • crm configure property maintenance-mode=false

Refresh the resource and disable resource maintenance:

  • sudo crm resource refresh msl_SAPHana_<SID>_HDB<instance>
  • Wait 10 seconds.
  • Check that the HSR status matches across SAPHanaSR-showAttr, crm_mon -A -1, and hdbnsutil -sr_state.
  • sudo crm resource maintenance msl_SAPHana_<SID>_HDB<instance> off

Post-Validation Steps:

  • crm status
  • crm configure show
  • SAPHanaSR-showAttr
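
The execution steps above can be rehearsed as a dry run before the change window. The sketch below only prints the commands unless DRY_RUN=0 is set. The function names run, cutover_convert, and cutover_release and the sample values HN1 and 03 are assumptions for illustration. The split into two functions reflects the manual HSR status check that has to happen before resource maintenance is switched off.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 only on a cluster node and only after review.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Phase 1: replace the clone wrapper with an ms wrapper of the same name.
# Arguments: SID and instance number. Stops at the first failing command.
cutover_convert() {
    msl="msl_SAPHana_${1}_HDB${2}"
    rsc="rsc_SAPHana_${1}_HDB${2}"
    run crm configure property maintenance-mode=true &&
    run crm configure delete "$msl" &&
    run crm configure ms "$msl" "$rsc" \
        meta notify=true clone-max=2 clone-node-max=1 \
        target-role=Started interleave=true maintenance=true &&
    run crm resource meta "$msl" set priority 100 &&
    run crm configure property maintenance-mode=false &&
    run crm resource refresh "$msl"
}

# Phase 2: run only after the HSR status has been checked manually.
cutover_release() {
    run crm resource maintenance "msl_SAPHana_${1}_HDB${2}" off
}
```

With the default DRY_RUN=1, calling cutover_convert HN1 03 prints the six commands in order, each prefixed with "+", so the sequence can be reviewed with the team before anything touches the cluster.
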

Rollback Plan:

Enable maintenance mode:

  • crm configure property maintenance-mode=true
  • sudo crm resource maintenance msl_SAPHana_<SID>_HDB<instance> on

Restore configuration from backup:

  • crm configure load update /root/cluster_config_backup.txt

Recreate the previous clone configuration if needed:

  • crm configure clone msl_SAPHana_<SID>_HDB<instance> rsc_SAPHana_<SID>_HDB<instance> meta notify=true clone-max=2 clone-node-max=1 target-role=Started interleave=true promotable=true

Disable maintenance and refresh resources:

  • crm configure property maintenance-mode=false
  • sudo crm resource refresh msl_SAPHana_<SID>_HDB<instance>
  • Wait 10 seconds.
  • sudo crm resource maintenance msl_SAPHana_<SID>_HDB<instance> off

Perform the steps below during the actual execution. Each task description is followed by the responsible team:

Pre Step: Submit a CAB request for approval

Basis

Perform Pre-checks

· Check cluster status (SBD, Pacemaker, and Corosync services, SBD messages, iSCSI, constraints):
crm status
crm configure show
SAPHanaSR-showAttr
· Ensure no pending operations or failed resources:
crm_mon -R1 -Af -1
· Confirm replication is healthy:
hdbnsutil -sr_state
· Back up current configuration (pre-change):
crm configure show > /hana/shared/SID/dbcluster_backup_prechange.txt
crm configure show | sed -n '/primitive rsc_SAPHana_SID_HD/,/^$/p'
crm configure show | sed -n '/clone msl_SAPHana_SID_HD/,/^$/p'

Basis

Execution

Get go-ahead from the leadership team

Basis

Step 0 – Put cluster into maintenance mode

Basis

crm resource maintenance g_ip_SID_HD on

Basis

# Back up current configuration once the cluster, msl, and g_ip are in maintenance
crm configure show > /hana/shared/SID/dbcluster_backup_prehealth.txt

Basis

Step 1 – (If not already done) clear Node 1 health and ensure topology/azure-events are running on both nodes (this avoids scheduler surprises when we re-manage)

Basis

# Execute on m1vms* (it can be executed on any node)
crm_attribute -N vm** -n '#health-azure' -v 0
crm_attribute --node vm** --delete --name "azure-events-az_curNodeState"
crm_attribute --node vm** --delete --name "azure-events-az_pendingEventIDs"

SOPS

crm resource cleanup health-azure-events-cln
crm resource cleanup cln_SAPHanaTopology_SID_HD

Basis
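
The health reset in Step 1 repeats the same three crm_attribute calls per node, so it can be sketched as a small loop. This is a dry-run sketch under assumptions: the function names run and clear_azure_health and the node names in the usage note are hypothetical, and the attribute names are the ones used in the step above.

```shell
# Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Reset the Azure health attribute and drop stale azure-events state
# for every node name passed as an argument.
clear_azure_health() {
    for node in "$@"; do
        run crm_attribute -N "$node" -n '#health-azure' -v 0
        run crm_attribute --node "$node" --delete --name azure-events-az_curNodeState
        run crm_attribute --node "$node" --delete --name azure-events-az_pendingEventIDs
    done
}
```

For example, clear_azure_health node1 node2 prints six commands in dry-run mode. The commands are deliberately not chained, on the assumption that a delete of an attribute that is already absent should not stop the remaining resets.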

# Back up current configuration once the health correction is complete and the msl correction is still pending
crm configure show > /hana/shared/SID/dbcluster_backup_premsl.txt

Basis

Step 2 – Convert the wrapper inside a single atomic transaction
We delete the promotable clone wrapper only (not the primitive), then create the ms wrapper with the same name msl_SAPHana_SID_HD so existing colocation/order constraints that reference the name keep working.

Basis

# Remove the promotable clone wrapper (keeps the rsc_SAPHana_SID_HD primitive intact)
crm configure delete msl_SAPHana_SID_HD

Basis

# Recreate as multi-state (ms) for classic agents
sudo crm configure ms msl_SAPHana_SID_HD rsc_SAPHana_SID_HD meta notify="true" clone-max="2" clone-node-max="1" target-role="Started" interleave="true" maintenance="true"

Basis

sudo crm resource meta msl_SAPHana_SID_HD set priority 100

Basis

Step 3 – Re‑enable cluster management of IP and HANA

Basis

Prechecks by MSFT, SUSE Teams

MSFT/SUSE

Precheck by BASIS Team

Basis

crm configure property maintenance-mode=false
crm resource refresh msl_SAPHana_SID_HD
wait 10 seconds
crm resource maintenance msl_SAPHana_SID_HD off
crm resource maintenance g_ip_SID_HD off

Basis

Validation

Basis

crm_mon -R1 -Af -1
crm status
crm configure show
SAPHanaSR-showAttr

Basis

Rollback Plan

Enable maintenance mode:

Basis

crm configure property maintenance-mode=true
crm resource maintenance msl_SAPHana_SID_HD on
crm resource maintenance g_ip_SID_HD on

Basis

Restore configuration from backup: decide which state to revert to and use the corresponding backup file

Basis

crm configure load update /hana/shared/SID/dbcluster_backup_<prechange|prehealth|premsl>.txt

Basis
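
Because three backups are taken at different points (pre-change, after maintenance is enabled, and after the health correction), picking the right file during a rollback is easy to get wrong under pressure. A minimal sketch of a selector follows; the function name backup_file is hypothetical, and the paths follow the backup commands used earlier in this runbook.

```shell
# Print the backup file path for a given SID and rollback target state.
# Valid states match the three backups taken during the change:
#   prechange  before any change
#   prehealth  cluster in maintenance, before the health correction
#   premsl     health corrected, before the msl correction
backup_file() {
    sid=$1
    state=$2
    case "$state" in
        prechange|prehealth|premsl)
            echo "/hana/shared/${sid}/dbcluster_backup_${state}.txt"
            ;;
        *)
            echo "unknown rollback state: $state" >&2
            return 1
            ;;
    esac
}

# Usage on a cluster node (SID HN1 is an assumed example):
#   crm configure load update "$(backup_file HN1 premsl)"
```

Rejecting unknown states keeps a typo from turning into a load of the wrong or a non-existent file.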

Recreate the previous clone configuration if needed:

Basis

crm configure clone msl_SAPHana_SID_HD rsc_SAPHana_SID_HD meta notify=true clone-max=2 clone-node-max=1 target-role=Started interleave=true promotable=true maintenance="true"

Basis

Disable maintenance and refresh resources:

Basis

crm configure property maintenance-mode=false
crm resource refresh msl_SAPHana_SID_HD
wait 10 seconds
crm resource maintenance msl_SAPHana_SID_HD off
crm resource maintenance g_ip_SID_HD off

Basis

Important Points:

1. Are there known version-specific considerations when migrating from clone to ms?

If you are using SAPHanaSR, please ensure you are using 'ms'. On the other hand, if you are working with SAPHanaSR-angi, you should use 'clone'.

2. Does this change apply across the board on SUSE OS and/or Pacemaker versions?

There are three different sets of HANA resource agents and SRHook scripts: two older ones and one newer one.

The packages for the older ones are:

SAPHanaSR which is for Scale-Up HANA clusters.

SAPHanaSR-ScaleOut which is for Scale-Out HANA clusters.

The package for the new one is:

SAPHanaSR-angi which is for both Scale-up and Scale-out clusters. (angi stands for "advanced next generation interface").

When using the older SAPHanaSR or SAPHanaSR-ScaleOut resource agents and SRHook scripts, SUSE only supports the multi-state (ms) clone type for the SAPHana (scale-up) or SAPHanaController (scale-out) resource. The older resource agents and scripts are supported on all Service Packs of SLES for SAP 12 and 15.

When using the newer SAPHanaSR-angi resource agents and scripts, SUSE only supports the regular clone type for the SAPHanaController resource (scale-up AND scale-out) with the "promotable=true" meta-attribute set on the clone. The newer "angi" resource agents and scripts are supported on SLES for SAP 15 SP5 and higher and on SLES for SAP 16 when it is released later this year. 

So, with SLES for SAP 15 SP5 and higher, you can use either the older or the newer resource agents and scripts. For all Service Packs of SLES for SAP 12 and Service Packs of SLES for SAP 15 prior to SP5, you must use the older resource agents and scripts. Starting with SLES for SAP 16, you must use the new angi resource agents and scripts.
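
The support rules above can be summarized as a small lookup. This is only a sketch of the rules as stated in this post, not an official SUSE support matrix; the function name hana_ra_choice and its output strings are hypothetical.

```shell
# Print which resource-agent generation (and wrapper type) applies for a
# given SLES for SAP major version and Service Pack, per the rules above.
# Arguments: major version (e.g. 15) and Service Pack number (default 0).
hana_ra_choice() {
    major=$1
    sp=${2:-0}
    if [ "$major" -ge 16 ]; then
        echo "angi (clone, promotable=true)"
    elif [ "$major" -eq 15 ] && [ "$sp" -ge 5 ]; then
        echo "classic (ms) or angi (clone, promotable=true)"
    elif [ "$major" -eq 12 ] || [ "$major" -eq 15 ]; then
        echo "classic (ms)"
    else
        echo "not covered: SLES for SAP $major SP$sp" >&2
        return 1
    fi
}

hana_ra_choice 15 4   # prints: classic (ms)
hana_ra_choice 15 5   # prints: classic (ms) or angi (clone, promotable=true)
hana_ra_choice 16     # prints: angi (clone, promotable=true)
```

In the incident described here, the cluster combined the classic agents with a promotable clone, which is the one combination this lookup never returns.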

Installing the new SAPHanaSR-angi package will automatically uninstall the older SAPHanaSR or SAPHanaSR-ScaleOut packages if they are already installed. SUSE has published a blog on how to migrate from the older resource agents and scripts to the newer ones; see the SUSE link in the references.

Conclusion:

Ensure that system replication is set up and active. This is crucial to avoid business disruptions during critical operational hours. Following the steps above corrects the cluster architecture, which strengthens both business continuity and the overall resilience of the systems.

Reference links:

High availability for SAP HANA on Azure VMs on SLES | Microsoft Learn

https://www.suse.com/c/how-to-upgrade-to-saphanasr-angi/

Updated Oct 28, 2025
Version 1.0