SAP on Azure
Deep dive into Pacemaker cluster for Azure SAP systems optimization
Introduction:
Pacemaker is the open-source cluster resource manager that provides high availability for SAP systems running on Linux in Azure. It monitors cluster resources such as SAP HANA and SAP Central Services and moves them to a healthy node when a node or service fails. A correctly configured and regularly tested Pacemaker cluster makes it easier to keep track of the systems and to respond promptly to any issues that arise.

Current customer challenges:
Configuration: Misconfigurations are common when customers don’t follow the up-to-date HA setup guidance on learn.microsoft.com, and they lead to failover issues.
Testing: Manual testing leaves failover scenarios untested and allows configuration drift. Limited expertise in HA tools complicates troubleshooting.

Key Use Cases for SAP HA Testing Automation:
New OS versions: Automate validation on new OS versions so that the Pacemaker cluster configuration stays up to date and works with the latest OS releases, and compatibility issues are addressed promptly.
Loop tests: Run loop tests on a regular cadence to catch regressions early and keep customer systems robust and reliable over time.
Best-practice validation: Validate high availability (HA) configurations against the documented SAP on Azure best practices to ensure effective failover and quick recovery, minimizing downtime and maximizing system uptime.
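The configuration-drift and best-practice checks described above can be sketched in Python as a comparison of captured cluster properties against an expected baseline. This is a minimal illustration under stated assumptions, not how any particular framework implements it: the function name `find_drift` is hypothetical, and the baseline values are placeholders for the example. Real expected values should be taken from the current guidance on learn.microsoft.com.

```python
from typing import Dict, List

# Illustrative baseline only (an assumption for this sketch); take real
# expected values from the current SAP on Azure HA documentation.
EXPECTED_PROPERTIES: Dict[str, str] = {
    "stonith-enabled": "true",
    "maintenance-mode": "false",
}


def find_drift(expected: Dict[str, str], captured: Dict[str, str]) -> List[str]:
    """Return one finding per property that is missing or differs from the baseline."""
    findings: List[str] = []
    for name, want in sorted(expected.items()):
        got = captured.get(name)
        if got is None:
            findings.append(f"{name}: missing (expected {want})")
        elif got != want:
            findings.append(f"{name}: found {got}, expected {want}")
    return findings


if __name__ == "__main__":
    # Values as they might be captured from a cluster (hypothetical sample).
    captured = {"stonith-enabled": "false"}
    for finding in find_drift(EXPECTED_PROPERTIES, captured):
        print(finding)
```

Running a check like this on a regular cadence, or after each OS update, turns drift into an explicit list of findings rather than something discovered during a failed failover.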
Adhering to these best practices significantly enhances HA capabilities.

SAP Testing Automation Framework (Public Preview):
The recommended approach for validating Pacemaker configurations in SAP HANA clusters is the SAP Deployment Automation Framework (SDAF) and its High Availability Testing Framework. This framework includes a comprehensive set of automated test cases designed to validate cluster behavior under various scenarios such as primary node crashes, manual resource migrations, and service failures. It also checks OS versions, Azure roles for fencing, SAP parameters, and Pacemaker/Corosync configurations to confirm that everything is set up correctly. Low-level administrative commands are used to validate the captured values against best practices, with a particular focus on constraints and meta-attributes. This validation helps ensure that clusters are reliable, resilient, and aligned with industry standards.

SAP System High Availability on Azure:
SAP HANA Scale-up:
SAP Central Services:

Support Matrix:
Linux Distribution:

| Distribution | Supported Release |
|---|---|
| SUSE Linux Enterprise Server (SLES) | 15 SP4, 15 SP5, 15 SP6 |
| Red Hat Enterprise Linux (RHEL) | 8.8, 8.10, 9.2, 9.4 |

High Availability Configuration Patterns:

| Component | Type | Cluster Type | Storage |
|---|---|---|---|
| SAP Central Services | ENSA1 or ENSA2 | Azure Fencing Agent | Azure Files or ANF |
| SAP Central Services | ENSA1 or ENSA2 | iSCSI (SBD device) | Azure Files or ANF |
| SAP HANA | Scale-up | Azure Fencing Agent | Azure Managed Disk or ANF |
| SAP HANA | Scale-up | iSCSI (SBD device) | Azure Managed Disk or ANF |

High Availability Test Scenarios:

| Test Type | Database Tier (HANA) | Central Services |
|---|---|---|
| Configuration Checks | HA Resource Parameter Validation, Azure Load Balancer Configuration | HA Resource Parameter Validation, SAPControl, Azure Load Balancer Configuration |
| Failover Tests | HANA Resource Migration, Primary Node Crash | ASCS Resource Migration, ASCS Node Crash |
| Process & Services | Index Server Crash, Node Kill, Kill SBD Service | Message Server, Enqueue Server, Enqueue Replication Server, SAPStartSRV process |
| Network Tests | Block network | Block network |
| Infrastructure | Virtual machine crash, Freeze file system (storage) | Manual Restart, HA Failover to Node |

Reference links:
SLES:
Set up Pacemaker on SUSE Linux Enterprise Server (SLES) in Azure | Microsoft Learn
Troubleshoot startup issues in a SUSE Pacemaker cluster - Virtual Machines | Microsoft Learn
RHEL:
Set up Pacemaker on RHEL in Azure | Microsoft Learn
Troubleshoot Azure fence agent issues in an RHEL Pacemaker cluster - Virtual Machines | Microsoft Learn
STAF:
GitHub - Azure/sap-automation-qa: This is the repository supporting the quality assurance for SAP systems running on Azure.

Conclusion:
The SAP Testing Automation Framework is designed to streamline the high availability deployment of SAP systems on Azure by reducing potential misconfigurations and minimizing manual effort. Because the framework performs multiple failovers sequentially to validate cluster behavior, running it directly on production systems is not recommended. It is intended for new high availability deployments that are not yet live and for non-business-critical systems.
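The support matrix and the production caution above can be combined into a simple pre-flight check before any failover test is started. The Python sketch below is only an illustration: the function name `preflight` and the `is_production` flag are hypothetical, and the release sets simply restate the support matrix from this post.

```python
from typing import Dict, List, Set

# Supported releases as listed in the support matrix of this post.
SUPPORT_MATRIX: Dict[str, Set[str]] = {
    "SLES": {"15 SP4", "15 SP5", "15 SP6"},
    "RHEL": {"8.8", "8.10", "9.2", "9.4"},
}


def preflight(distribution: str, release: str, is_production: bool) -> List[str]:
    """Return the reasons (if any) why failover tests should not be started."""
    blockers: List[str] = []
    if is_production:
        # The tests trigger multiple sequential failovers.
        blockers.append("system is in production; failover tests are not recommended")
    supported = SUPPORT_MATRIX.get(distribution.strip().upper(), set())
    if release.strip() not in supported:
        blockers.append(f"{distribution} {release} is not in the support matrix")
    return blockers


if __name__ == "__main__":
    for reason in preflight("RHEL", "9.0", is_production=True):
        print("BLOCKED:", reason)
```

An empty result means neither condition applies; any returned reason should stop the run before the first failover is triggered.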