Microsoft Mission Critical Blog
Deep dive into Pacemaker cluster for Azure SAP systems optimization

AnuradhaKarnam
Sep 17, 2025

 

Introduction:  

Pacemaker is an open-source cluster resource manager that, together with Corosync, provides high availability (HA) for SAP systems running on Azure. It monitors cluster resources such as SAP HANA and SAP Central Services, coordinates fencing of failed nodes, and fails resources over to a healthy node.

A correctly configured and regularly tested Pacemaker cluster lets organizations detect failures quickly and respond to issues promptly. This post covers the challenges customers commonly face with these clusters and how automated testing helps address them.

Current customer challenges: 

  • Configuration: Misconfigurations are common when customers do not follow the up-to-date HA setup guidance on learn.microsoft.com, and they lead to failover issues. 
  • Testing: Manual testing leaves failover scenarios untested and allows configuration drift. 
  • Expertise: Limited expertise in HA tools complicates troubleshooting. 

Key Use Cases for SAP HA Testing Automation: 

The following use cases describe the testing and validation procedures needed to keep SAP HA clusters at a consistently high standard.

  • First, automate validation on new OS versions. This confirms that the Pacemaker cluster configuration still works with the latest OS releases and lets us address compatibility issues promptly. 
  • Next, run loop tests on a regular cadence. Repeated runs catch regressions early and help keep customer systems robust and reliable over time. 
  • Finally, validate high availability (HA) configurations against the documented SAP on Azure best practices. This helps ensure effective failover and quick recovery, minimizing downtime. 
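
The loop-test idea from the list above can be sketched in a few lines of Python. This is a minimal illustration, not the framework's implementation: the names `run_loop`, `find_regressions`, and `LoopResult` are hypothetical, and each check is assumed to be a callable that returns `True` when the cluster behaves as expected. In practice a scheduler or pipeline would trigger each iteration rather than a plain loop.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LoopResult:
    iteration: int
    failures: List[str]  # names of the checks that failed in this iteration


def run_loop(checks: Dict[str, Callable[[], bool]], iterations: int) -> List[LoopResult]:
    """Run every check once per iteration and record which ones failed."""
    results = []
    for i in range(1, iterations + 1):
        failed = [name for name, check in checks.items() if not check()]
        results.append(LoopResult(iteration=i, failures=failed))
    return results


def find_regressions(results: List[LoopResult]) -> Dict[str, int]:
    """Return each check that passed in one iteration and failed in the next,
    mapped to the iteration where that first happened."""
    regressions: Dict[str, int] = {}
    previous = None
    for result in results:
        current = set(result.failures)
        if previous is not None:
            for name in sorted(current - previous):
                regressions.setdefault(name, result.iteration)
        previous = current
    return regressions


if __name__ == "__main__":
    # Hypothetical checks: one always passes, one starts failing on run 3.
    outcomes = iter([True, True, False])
    demo_checks = {
        "cluster_online": lambda: True,
        "failover_completes": lambda: next(outcomes),
    }
    print(find_regressions(run_loop(demo_checks, iterations=3)))
```

Comparing each run with the previous one, rather than only reporting failures, is what separates a regression from a check that was never passing in the first place.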

SAP Testing Automation Framework (Public Preview): 

The recommended approach for validating Pacemaker configurations in SAP HANA clusters is the SAP Deployment Automation Framework (SDAF) and its High Availability Testing Framework. 

This framework includes a comprehensive set of automated test cases designed to validate cluster behavior under various scenarios such as primary node crashes, manual resource migrations, and service failures. Additionally, it rigorously checks OS versions, Azure roles for fencing, SAP parameters, and Pacemaker/Corosync configurations to ensure everything is set up correctly. 

Low-level administrative commands capture the cluster's current values, which are then validated against best practices, with particular focus on constraints and meta-attributes. This validation helps ensure that clusters are reliable, resilient, and aligned with industry standards. 
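
To make the idea of validating captured values concrete, here is a minimal Python sketch that reads meta-attributes from a fragment of Pacemaker's cluster configuration XML (the kind of output `cibadmin --query` returns) and compares them with expected values. The sample fragment, the resource ID `cln_SAPHana`, and the values in `EXPECTED_META` are illustrative assumptions only. They are not the framework's rule set and not official guidance.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Illustrative configuration fragment (assumption); real input would be
# captured from the cluster itself.
SAMPLE_CIB = """
<resources>
  <clone id="cln_SAPHana">
    <meta_attributes id="cln_SAPHana-meta">
      <nvpair id="m1" name="clone-max" value="2"/>
      <nvpair id="m2" name="notify" value="false"/>
    </meta_attributes>
  </clone>
</resources>
"""

# Hypothetical expectations, for illustration only.
EXPECTED_META = {
    "cln_SAPHana": {"clone-max": "2", "notify": "true", "interleave": "true"},
}


def captured_meta(cib_xml: str) -> Dict[str, Dict[str, str]]:
    """Collect meta-attribute name/value pairs per resource ID."""
    root = ET.fromstring(cib_xml.strip())
    captured: Dict[str, Dict[str, str]] = {}
    for element in root.iter():
        # Only direct <meta_attributes> children belong to this element.
        for meta in element.findall("meta_attributes"):
            attrs = captured.setdefault(element.get("id"), {})
            for nvpair in meta.findall("nvpair"):
                attrs[nvpair.get("name")] = nvpair.get("value")
    return captured


def find_mismatches(captured: Dict[str, Dict[str, str]],
                    expected: Dict[str, Dict[str, str]]) -> List[str]:
    """List every expected meta-attribute that is missing or different."""
    problems = []
    for resource_id, attrs in expected.items():
        actual = captured.get(resource_id, {})
        for name, want in attrs.items():
            got = actual.get(name)
            if got != want:
                problems.append(
                    f"{resource_id}: {name} expected {want!r}, found {got!r}"
                )
    return problems


if __name__ == "__main__":
    for problem in find_mismatches(captured_meta(SAMPLE_CIB), EXPECTED_META):
        print(problem)
```

Keeping the expected values in plain data, separate from the parsing code, makes it easier to update the rules when the published guidance changes.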

 

 

SAP System High Availability on Azure: 

SAP HANA Scale-up: [figure]

SAP Central Services: [figure]

 

 

Support Matrix: 

Linux Distribution: 

| Distribution | Supported Release |
|---|---|
| SUSE Linux Enterprise Server (SLES) | 15 SP4, 15 SP5, 15 SP6 |
| Red Hat Enterprise Linux (RHEL) | 8.8, 8.10, 9.2, 9.4 |
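
A pre-flight check against the distribution matrix above could look like the following Python sketch. The release values come from the table; the names `SUPPORT_MATRIX` and `is_supported` are hypothetical and not part of the framework.

```python
# Supported releases, copied from the Linux distribution table.
SUPPORT_MATRIX = {
    "SLES": {"15 SP4", "15 SP5", "15 SP6"},
    "RHEL": {"8.8", "8.10", "9.2", "9.4"},
}


def is_supported(distribution: str, release: str) -> bool:
    """Return True if the distribution/release pair appears in the matrix."""
    releases = SUPPORT_MATRIX.get(distribution.strip().upper(), set())
    return release.strip() in releases


if __name__ == "__main__":
    print(is_supported("RHEL", "8.10"))
    print(is_supported("SLES", "15 SP3"))
```

Releases are compared as strings on purpose: treating them as numbers would make 8.10 indistinguishable from 8.1.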

High Availability Configuration Patterns: 

| Component | Type | Cluster Type | Storage |
|---|---|---|---|
| SAP Central Services | ENSA1 or ENSA2 | Azure Fencing Agent | Azure Files or ANF |
| SAP Central Services | ENSA1 or ENSA2 | iSCSI (SBD device) | Azure Files or ANF |
| SAP HANA | Scale-up | Azure Fencing Agent | Azure Managed Disk or ANF |
| SAP HANA | Scale-up | iSCSI (SBD device) | Azure Managed Disk or ANF |

 

High Availability Test Scenarios: 

| Test Type | Database Tier (HANA) | Central Services |
|---|---|---|
| Configuration Checks | HA Resource Parameter Validation; Azure Load Balancer Configuration | HA Resource Parameter Validation; SAPControl; Azure Load Balancer Configuration |
| Failover Tests | HANA Resource Migration; Primary Node Crash | ASCS Resource Migration; ASCS Node Crash |
| Process & Services | Index Server Crash; Node Kill; Kill SBD Service | Message Server; Enqueue Server; Enqueue Replication Server; SAPStartSRV process |
| Network Tests | Block network | Block network |
| Infrastructure | Virtual machine crash; Freeze file system (storage) | Manual Restart; HA Failover to Node |

 

Reference links:  

SLES:

  • Set up Pacemaker on SUSE Linux Enterprise Server (SLES) in Azure | Microsoft Learn 
  • Troubleshoot startup issues in a SUSE Pacemaker cluster - Virtual Machines | Microsoft Learn 

RHEL:

  • Set up Pacemaker on RHEL in Azure | Microsoft Learn 
  • Troubleshoot Azure fence agent issues in an RHEL Pacemaker cluster - Virtual Machines | Microsoft Learn 

STAF:

  • GitHub - Azure/sap-automation-qa: This is the repository supporting the quality assurance for SAP systems running on Azure. 

Conclusion:  

This tool is designed to streamline the high availability deployment of SAP systems on Azure by reducing potential misconfigurations and minimizing manual effort. Because the framework performs multiple failovers sequentially to validate cluster behavior, running it directly on production systems is not recommended. It is intended for new high availability deployments that are not yet live and for non-business-critical systems.  
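
One way to enforce that recommendation in automation is a guard that refuses to start failover tests against a live system. The following Python sketch is an assumption about how such a guard might look: the tier names, the `is_live` flag, and the `ensure_safe_target` function are hypothetical and would need to match however your landscape labels its systems.

```python
class ProductionSystemError(RuntimeError):
    """Raised when failover tests are pointed at a system that must not be disrupted."""


# Hypothetical tier labels considered safe for disruptive tests.
NON_PRODUCTION_TIERS = {"sandbox", "dev", "qa", "pre-production"}


def ensure_safe_target(system_id: str, tier: str, is_live: bool) -> None:
    """Raise unless the target is a non-live system in a non-production tier."""
    if is_live or tier.strip().lower() not in NON_PRODUCTION_TIERS:
        raise ProductionSystemError(
            f"Refusing to run failover tests on {system_id} "
            f"(tier={tier!r}, live={is_live})"
        )


if __name__ == "__main__":
    ensure_safe_target("X00", "qa", is_live=False)  # passes silently
    try:
        ensure_safe_target("P00", "production", is_live=True)
    except ProductionSystemError as error:
        print(error)
```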

Updated Sep 17, 2025
Version 2.0