sap on azure
3 TopicsDeep dive into Pacemaker cluster for Azure SAP systems optimization
Introduction: Azure Pacemaker offers a centralized management platform that streamlines the process of monitoring and maintaining pacemakers. With this innovative service, you can ensure the safety and well-being of your systems through automated alerts and comprehensive management tools. By leveraging Azure Pacemaker, organizations can experience enhanced efficiency and peace of mind knowing that their pacemakers are being managed optimally. The centralized platform simplifies the management process, making it easier to keep track of devices and promptly respond to any issues that may arise. Current customer challenges: Configuration: Common misconfigurations occur when customers don’t follow up-to-date HA setup guidance from learn.microsoft.com, leading to failover issues. Testing: Manual testing causes untested failover scenarios and config drift. Limited expertise in HA tools complicates troubleshooting. Key Use Cases for SAP HA Testing Automation: I wanted to discuss some important updates regarding our testing and validation procedures to ensure that we continue to maintain the highest standards in our work. First off, we need to automate the validation process on new OS versions. This will help us ensure that the Pacemaker cluster configuration remains up-to-date and functions smoothly with the latest OS releases. By doing this, we can promptly address any compatibility issues that might arise. Next, we should implement loop tests to run on a regular cadence. These tests will enable us to catch regressions early and ensure that our customer systems remain robust and reliable over time. It's essential to have continuous monitoring in place to maintain optimal performance. Furthermore, we must validate our high availability (HA) configurations according to the documented SAP on Azure best practices. This will ensure effective failover and quick recovery, minimize downtime and maximize system uptime. Adhering to these best practices will significantly enhance our HA capabilities. SAP Testing Automation Framework (Public Preview): The most recommended approach for validating Pacemaker configurations in SAP HANA clusters, which is through the SAP Deployment Automation Framework (SDAF) and its High Availability Testing Framework. This framework includes a comprehensive set of automated test cases designed to validate cluster behavior under various scenarios such as primary node crashes, manual resource migrations, and service failures. Additionally, it rigorously checks OS versions, Azure roles for fencing, SAP parameters, and Pacemaker/Corosync configurations to ensure everything is set up correctly. Low-level administrative commands are employed to validate the captured values against best practices, particularly focusing on constraints and meta-attributes. This thorough validation process ensures that our clusters are reliable, resilient, and adhering to industry standards. SAP System High Availability on Azure: SAP HANA Scale-UP: SAP Central Services: Support Matrix: Linux Distribution: Distribution Supported Release SUSE Linux Enterprise Server (SLES) 15 SP4, 15 SP5, 15 SP6 Red Hat Enterprise Linux (RHEL) 8.8, 8.10, 9.2, 9.4 High Availability Configuration Patterns: Component Type Cluster Type Storage SAP Central Services ENSA1 or ENSA2 Azure Fencing Agent Azure Files or ANF SAP Central Services ENSA1 or ENSA2 ISCSI (SBD device) Azure Files or ANF SAP HANA Scale-up Azure Fencing Agent Azure Managed Disk or ANF SAP HANA Scale-up ISCSI (SBD device) Azure Managed Disk or ANF High Availability Tests scenarios: Test Type Database Tier (HANA) Central Services Configuration Checks HA Resource Parmeter Validation Azure Load Balancer Configuration HA Resource Parmeter Validation SAPControl Azure Load Balancer Configuration Failover Tests HANA Resource Migration Primary Node Crash ASCS Resource Migration ASCS Node Crash Process & Services Index Server Crash Node Kill Kill SBD Service Message Server Enqueue Server Enqueue Replication Server SAPStartSRV process Network Tests Block network Block network Infrastructure Virtual machine crash Freeze file system (storage) Manual Restart HA Failover to Node Reference links: SLES: Set up Pacemaker on SUSE Linux Enterprise Server (SLES) in Azure | Microsoft Learn Troubleshoot startup issues in a SUSE Pacemaker cluster - Virtual Machines | Microsoft Learn RHEL: Set up Pacemaker on RHEL in Azure | Microsoft Learn Troubleshoot Azure fence agent issues in an RHEL Pacemaker cluster - Virtual Machines | Microsoft Learn STAF: GitHub - Azure/sap-automation-qa: This is the repository supporting the quality assurance for SAP systems running on Azure. Conclusion: This innovative tool is designed to significantly streamline and enhance the high availability deployment of SAP systems on Azure by reducing potential misconfigurations and minimizing manual effort. Please note that since this framework performs multiple failovers sequentially to validate the cluster behavior. It is not recommended to be run on production systems directly. It is intended for use in new high availability deployments that are not yet live / non-business critical systems.Gen1 to Gen2 Azure VM Upgrade in Rolling Fashion
Introduction: Azure offers Trusted Launch. This seamless solution is designed to significantly enhance the security of our Generation 2 virtual machines (VMs), providing robust protection against advanced and persistent attack techniques. Trusted Launch is composed of several coordinated infrastructure technologies, each of which can be enabled independently. These technologies work in harmony to create multiple layers of defense, ensuring our virtual machines remain secure against sophisticated threats. With Trusted Launch, we can confidently improve our security posture and safeguard our VMs from potential vulnerabilities. Upgrading of Azure VMs from Generation 1 (Gen1) to Generation 2 (Gen2) involves several steps to ensure a smooth transition without data loss or disruptions. Rolling fashion upgrade process: First and foremost, it is crucial to have a complete backup of virtual machines before starting the upgrade. This step is essential to protect valuable data in case of any unforeseen issues that may arise during the process. Having a backup will give you peace of mind and ensure that data is safe and secure. It is crucial to perform any new process or implementation in pre-production systems first. This step is vital to ensure that we can identify and resolve any potential issues before moving to the production environment. By doing so, we can maintain the integrity and stability of our systems, ultimately serving our customers better. Please run the pre-validation steps before you bring down the VM. SSH into VM: Connect to the Gen1 Linux VM. Identify Boot Device with sudo : bootDevice=$(echo "/dev/$(lsblk -no pkname $(df /boot | awk 'NR==2 {print $1}'))") Check Partition Type (must return 'gpt'): sudo blkid $bootDevice -o value -s PTTYPE Validate EFI System Partition (e.g., /dev/sda2 or 3): sudo fdisk -l $bootDevice | grep EFI | awk '{print $1}' Check EFI Mountpoint (/boot/efi must be in /etc/fstab): sudo grep -qs '/boot/efi' /etc/fstab && echo '/boot/efi present in /etc/fstab' || echo '/boot/efi missing /boot/efi present in /etc/fstab' Example: Once the complete backup is in place and the pre-validation steps are completed, we will need the SAP Basis team to proceed with stopping the application. As part of our planned procedure, once the application has been taken down, the Unix team will proceed to shut down the operating system on the ERS servers. Azure team to follow below steps and perform the Gen upgrade on the selected approved servers: Example: Example: Example: Start the VM: Start-AzVM -ResourceGroupName myResourceGroup -Name myVm (Or) Start from Azure Portal Login into Azure Portal to check the VM Generation is successfully changed to V2. Example: Unix team to validate OS on approved servers. SAP Basis team to generate a new license key based on the new hardware to apply and start the application. Unix team to perform failover of ASCS cluster. SAP Basis team to stop the application server. Unix team to shutdown OS on ERS for selected VM’s and validate the OS. SAP Basis team to apply the new Hardware key and start the application. Unix team to perform failover of ASCS cluster. Azure team to work on capacity analysis to find the path forward for hosting Mv2 VMs on the same PPG group. Once successfully completed test rollback on at least one app server for rollback planning. Here are the other methods to achieve this: Method 1: Using Trusted Launch Direct Upgrade Prerequisites Check: Ensure your subscription is onboarded to preview feature Gen1ToTLMigrationPreview under Microsoft. Compute namespace. The VM should be configured with Trusted launch supported size family and OS version. Also have a successful backup in place. Update Guest OS Volume: Update guest OS volume with GPT Disk layout and EFI system partition. Use PowerShell-based orchestration script for MBR2GPT validation and conversion. Enable Trusted Launch:Deallocate the VM using Stop-AzVM. Enable Trusted launch by setting -SecurityType to TrustedLaunch using Update-AzVM command. Stop-AzVM -ResourceGroupName myResourceGroup -Name myVm, Update-AzVM -ResourceGroupName myResourceGroup -VMName myVm -SecurityType TrustedLaunch -EnableSecureBoot $true -EnableVtpm $true Validate and Start VM:Validate the security profile in the updated VM configuration. Start the VM and verify that you can sign in using RDP or SSH. Method 2: Using Azure Backup Verify Backup Data: Ensure you have valid and up-to-date backups of your Gen1 VMs, including both OS disks and data disks. Verify that the backups are successfully completed and can be restored. Create Gen2 VMs: Create new Gen2 VMs with the desired specifications and configuration. There's no need to start them initially, just have them created and ready for when we need them. Restore VM Backups: In the Azure Portal, go to the Azure Backup service. Select "Recovery Services vaults" from the left-hand menu, and then select your existing backup vault that contains the backups of the Gen1 VMs. Inside the recovery services vault, go to the "Backup Items" section and select the VM you want to restore. Initiate a restore operation for the VM. During the restore process, choose the target resource group and the target VM (which should be the newly created Gen2 VM). Restore OS Disk: Choose to restore the OS disk of the Gen1 VM to the newly created Gen2 VM. Azure Backup will restore the OS disk to the new VM, effectively migrating it to Generation 2. Restore Data Disks: Once the OS disk is restored and the Gen2 VM is operational, proceed to restore the data disks. Repeat the restore process for each data disk, attaching them to the Gen2 VM as needed. Verify and Test: Verify that the Gen2 VM is functioning correctly, and all data is intact. Test thoroughly to ensure all applications and services are running as expected. Decommission Gen1 VMs (Optional): Once the migration is successful, and you have verified that the Gen2 VMs are working correctly please decommission the original Gen1 VMs. Important Notes: Before proceeding with any production migration, thoroughly test this process in a non-production environment to ensure its success and identify any potential issues. Make sure you have a backup of critical data and configurations before attempting any migration. While this approach focuses on using Azure Backup for restoring the VMs, there are other migration strategies available that may better suit your specific scenario. Evaluate them based on your requirements and constraints. Remember, migrating VMs between generations involves changes in the underlying virtual hardware, so thorough testing and planning are essential to ensure a smooth transition without data loss or disruptions. Why Generation2 upgrade without Trusted launch is not supported? Trusted Launch provides foundational compute security for our VMs at no additional cost, which means we can enhance our security posture without incurring extra expenses. Moreover, Trusted Launch VMs are largely on par with Generation 2 VMs in terms of features and performance. This means that upgrading to Generation 2 VMs without enabling Trusted Launch does not provide any added benefits. Unsupported Gen1 VM configurations: Gen1 to Trusted launch VM upgrade is NOT supported if Gen1 VM is configured with below options: Operating system: Windows Server 2016, Azure Linux, Debian, and any other operating system not listed under Trusted launch supported operating system (OS) version. Ref link: Trusted Launch for Azure VMs - Azure Virtual Machines | Microsoft Learn VM size: Gen1 VM configured with VM size not listed under Trusted launch supported size families. Ref link: Trusted Launch for Azure VMs - Azure Virtual Machines | Microsoft Learn Azure Backup: Gen1 VM configured with Azure Backup using Standard policy. As workaround, migrate Gen1 VM backups from Standard to Enhanced policy. Ref link: Move VM backup - standard to enhanced policy in Azure Backup - Azure Backup | Microsoft Learn Conclusion: We will enhance Azure virtual machines by transitioning from Gen1 to Gen2. By implementing these approaches, we can seamlessly unlock improved security and performance of systems. This transition will not only bolster our security measures but also significantly enhance the overall performance, ensuring our operations run more smoothly and efficiently. Let us make this upgrade to ensure virtual machines are more robust and capable of handling future demands. Ref links: Upgrade Gen1 VMs to Trusted launch - Azure Virtual Machines | Microsoft Learn GitHub - Azure/Gen1-Trustedlaunch: aka.ms/Gen1ToTLUpgrade Enable Trusted launch on existing Gen2 VMs - Azure Virtual Machines | Microsoft Learn