Running SAP Applications on the Microsoft Platform
Azure SAP Zone Resource Agent (Public Preview) — Technical Deep Dive (Part 2)

sanoopt (Microsoft)
Apr 28, 2026

This is Part 2 of a two-part series. Part 1 covers the concepts and features; this post covers the technical details: architecture, setup, configuration, and troubleshooting. Public preview: Recommended for non-production use only while in preview.

Overview

In Part 1, we discussed why keeping the SAP application tier aligned with the HANA primary zone matters for latency-sensitive workloads, and how the Azure SAP Zone Resource Agent automates this alignment after failovers. In this post, we get into the specifics: how the agent is structured, what it needs to run, and how to set it up in your Pacemaker cluster.

The azure-sap-zone resource agent is a Pacemaker resource agent designed to manage the alignment of SAP application Azure Virtual Machines (VMs) with the primary HANA Azure VM. This agent ensures that SAP application servers are started in the same Azure availability zone as the HANA primary VM to maintain high availability and optimal performance.

Key Benefits

  • Reduced Latency: Minimizes cross-zone network latency between application and database tiers
  • High Availability: Maintains SAP system availability during failover scenarios
  • Automated Management: Automatically handles VM and SAP instance lifecycle during zone transitions
  • Cost Optimization: With stop_vms=true, application VMs in the inactive zone can be stopped and deallocated, which can reduce compute cost in pay-as-you-go models

Architecture

The following diagram illustrates how the resource agent manages SAP application server alignment with the primary HANA database across Azure availability zones:

Key Components:

  • HANA Cluster: Primary and secondary HANA VMs are deployed in separate availability zones with System Replication configured
  • Pacemaker Cluster: Runs across both zones with the azure-sap-zone resource agent deployed on both nodes. The SAP application server VMs are not Pacemaker cluster members - they are managed remotely by the agent via Azure APIs.
  • Application Servers: Identical sets of SAP application VMs deployed in both availability zones
  • Azure Management API: Used by the resource agent to control VM lifecycle and execute remote commands
  • Managed Identity: Provides authentication for Azure API operations

Current State (Zone 1 Primary):

  • Zone 1: HANA Primary is active, SAP application servers are ACTIVE
  • Zone 2: HANA Secondary is in standby, SAP application servers are STANDBY (VMs may be running but SAP instances deactivated, or VMs stopped based on stop_vms parameter)

Failover Scenario (Zone 2 becomes Primary):

  1. HANA failover occurs from Zone 1 to Zone 2
  2. Pacemaker detects the failover and the azure-sap-zone resource agent triggers
  3. Agent starts VMs and SAP instances in Zone 2 (same zone as new primary HANA)
  4. Agent stops/deactivates SAP instances and optionally stops VMs in Zone 1
  5. Result: Zone 2 becomes the active zone with both HANA Primary and active SAP application servers

How It Works

Background

In Azure deployments of SAP systems with scale-up HANA configurations, optimizing latency between SAP application servers and the HANA database server can significantly enhance performance. In typical zonal deployments, primary and secondary HANA servers are located in different availability zones, with SAP application servers distributed across these zones. In certain Azure regions, cross-zonal latency may be higher, affecting performance for processes involving significant data transfer between application and database tiers.

Solution Overview

This resource agent addresses latency concerns by placing critical application servers in the same availability zone as the primary HANA database server. The solution provisions identical SAP application server VMs in both availability zones, with only one set active at any given time.

Execution Workflow

During a database failover, the resource agent executes the following phases in sequence:

  1. start_vms_in_same_zone: Initiates virtual machines in the same zone as the primary HANA VM
  2. wait_for_vms_in_same_zone_to_start: Waits for VMs in the same zone to start successfully
  3. start_sap_in_same_zone: Starts SAP instances in the same zone (parallel execution supported)
  4. wait_for_sap_in_same_zone_to_start: Waits for SAP instances in the same zone to start successfully
  5. stop_sap_in_diff_zone: Stops or deactivates SAP instances in different zones (behavior depends on stop_vms parameter)
  6. wait_for_sap_in_diff_zone_to_stop: Waits for SAP instances in different zones to shut down (skipped when stop_vms=false)
  7. stop_vms_in_diff_zone: Stops VMs in different zones (skipped when stop_vms=false)
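
The phase order above, including the two phases that are skipped when stop_vms=false, can be sketched as follows. This is an illustration only, not the agent's source code; the phase names come from the list above, while phases_to_run and the two constants are hypothetical names.

```python
from typing import List

# Phase names as listed above; the constants and function are illustrative.
ALL_PHASES = [
    "start_vms_in_same_zone",
    "wait_for_vms_in_same_zone_to_start",
    "start_sap_in_same_zone",
    "wait_for_sap_in_same_zone_to_start",
    "stop_sap_in_diff_zone",
    "wait_for_sap_in_diff_zone_to_stop",
    "stop_vms_in_diff_zone",
]

# Phases the text describes as skipped when stop_vms=false.
STOP_VMS_ONLY = {"wait_for_sap_in_diff_zone_to_stop", "stop_vms_in_diff_zone"}


def phases_to_run(stop_vms: bool) -> List[str]:
    """Return the ordered phases for one alignment cycle."""
    return [p for p in ALL_PHASES if stop_vms or p not in STOP_VMS_ONLY]


if __name__ == "__main__":
    for stop_vms in (False, True):
        print(f"stop_vms={stop_vms}: {phases_to_run(stop_vms)}")
```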

The resource agent supports both SAPHanaSR and SAPHanaSR-angi (A Next Generation Interface) resource agents for HANA state detection.

Each phase includes built-in timeout management (controlled by the wait_time parameter) and retry logic. The stop_vms parameter determines whether the agent fully stops and deallocates VMs or just deactivates SAP instances in the non-primary zone.
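
To make the retry behavior concrete, here is a minimal sketch of how retry_count and retry_wait could interact. It assumes retry_count is the number of retries after the first attempt, which the post does not state; call_with_retry is a hypothetical helper, not a function from the agent.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(operation: Callable[[], T],
                    retry_count: int = 3,
                    retry_wait: float = 20.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation; on an exception, wait retry_wait seconds and try again.

    Assumption: retry_count is the number of retries after the first attempt.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= retry_count:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(retry_wait)
```

The sleep argument is injectable so the helper can be exercised without real waiting.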

Cluster Attributes

The resource agent uses the following cluster node attributes to track execution state:

| Attribute | Description |
| --- | --- |
| azure_sap_zone_current_phase | Stores the current phase of execution |
| azure_sap_zone_phase_start_time | Records the start time for each phase (used for timeout detection) |
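
As a rough illustration of how a recorded phase start time supports timeout detection, the check below compares elapsed time against wait_time. It assumes the start time is stored as seconds since the epoch, which the post does not specify; phase_timed_out is a hypothetical name.

```python
import time
from typing import Optional


def phase_timed_out(phase_start_time: float,
                    wait_time: int = 600,
                    now: Optional[float] = None) -> bool:
    """Return True when the current phase has run longer than wait_time seconds.

    Assumption: phase_start_time is expressed in seconds since the epoch.
    """
    current = time.time() if now is None else now
    return (current - phase_start_time) > wait_time
```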

 

Configuration Parameters

The following cluster resource parameters configure the resource agent's behavior:

| Name | Description | Type | Default | Required | Example |
| --- | --- | --- | --- | --- | --- |
| sid | SAP System ID (SID) name | string | - | | S4H |
| hana_sid | HANA System ID (if different from SAP SID) | string | sid value | | HDB |
| hana_vm_zones | Mapping of HANA VM name to logical zone group (optional; for non-zonal/PPG scenarios) | string | - | | hanavm1:1,hanavm2:2 |
| verbose | Enable verbose logging | boolean | false | | true |
| soft_shutdown_timeout | Soft shutdown timeout (seconds). Used as the timeout argument for SAP stop operations when stop_vms=true | integer | 600 | | 600 |
| app_vm_names | Comma-separated list of SAP application server VM names | string | - | ✗* | sapapp01,sapapp02,sapapp03,sapapp04 |
| app_vm_name_pattern | Regex pattern to identify SAP application server VM names | string | - | ✗* | sapapp.* |
| resource_group | Azure resource group for SAP application servers | string | HANA VMs RG | | sap-app-rg |
| hana_resource | Name of the HANA resource in Pacemaker cluster | string | - | | rsc_SAPHana_S4H_HDB00 |
| client_id | Client ID of user-assigned managed identity (optional for system identity) | string | - | | a1b2c3d4-e5f6-7890-abcd-ef1234567890 |
| stop_vms | Stop VMs in different zones (true) or just deactivate SAP instances (false) | boolean | false | | false |
| wait_before_stop_sap | Wait time before stopping SAP instances in different zones (seconds) | integer | 300 | | 300 |
| wait_time | Wait time for phases to complete (seconds) | integer | 600 | | 600 |
| retry_count | Azure API retry count | integer | 3 | | 3 |
| retry_wait | Wait time between retries (seconds) | integer | 20 | | 20 |
| app_vm_zones | Mapping of app VM name to logical zone group (optional; for non-zonal/PPG scenarios) | string | - | ✗* | sapapp01:1,sapapp02:1,sapapp03:2,sapapp04:2 |
 

Note: Provide at least one of app_vm_names, app_vm_name_pattern, or app_vm_zones. If multiple are specified, the effective VM list is the union of:

  • app_vm_names (explicit list),
  • VMs matching app_vm_name_pattern (when app_vm_names is not provided), and
  • VM names present in app_vm_zones.

If app_vm_zones is provided but neither app_vm_names nor app_vm_name_pattern is set, the agent treats app_vm_zones as the authoritative source of application VM names.

If both app_vm_names and app_vm_name_pattern are set, app_vm_names is used (pattern matching is skipped). app_vm_zones is still merged in.

app_vm_zones is a supplemental mapping primarily intended for non-zonal/PPG scenarios; it can be used just for the subset of VMs that have no Azure zone metadata.
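
The precedence rules above can be summarized in a short sketch. It is not the agent's implementation: effective_app_vms and discovered_vms are hypothetical names, and matching the pattern from the start of the VM name (re.match) is an assumption.

```python
import re
from typing import Dict, Iterable, List, Optional


def effective_app_vms(discovered_vms: Iterable[str],
                      app_vm_names: Optional[str] = None,
                      app_vm_name_pattern: Optional[str] = None,
                      app_vm_zones: Optional[Dict[str, str]] = None) -> List[str]:
    """Combine the three parameters into one VM list, following the rules above."""
    names = set()
    if app_vm_names:
        # An explicit list wins; pattern matching is skipped.
        names.update(n.strip() for n in app_vm_names.split(",") if n.strip())
    elif app_vm_name_pattern:
        # Assumption: the pattern is matched from the start of the VM name.
        rx = re.compile(app_vm_name_pattern)
        names.update(vm for vm in discovered_vms if rx.match(vm))
    if app_vm_zones:
        # Names in app_vm_zones are always merged in.
        names.update(app_vm_zones)
    return sorted(names)
```

Here discovered_vms stands in for the VM names found in the resource group, and app_vm_zones is passed as an already parsed name-to-group mapping.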

Non-zonal/PPG Note: In proximity placement group (PPG) or other non-zonal deployments, Azure VM metadata and ARM VM properties may not include an Availability Zone. In that case:

  • Set hana_vm_zones to map each HANA VM name to a logical group label (e.g. hanavm1:1,hanavm2:2).
  • Set app_vm_zones to map each SAP application VM (or just the subset missing zone metadata) to a logical group label. These are logical labels used for alignment, not necessarily Azure Availability Zones.

Warning (zone/group mappings): Be very careful when setting hana_vm_zones and app_vm_zones (sometimes referred to as “hana zone” / “app VM zone” parameters). In deployments where Azure zone metadata is unavailable, these mappings fully determine which application VMs are considered “same group” vs “different group”.

If the grouping is wrong, the agent can take action on the wrong servers:

  • With stop_vms=false, it may deactivate (make passive) SAP instances on the wrong app VMs.
  • With stop_vms=true, it may soft-shutdown SAP and stop/deallocate the wrong app VMs.

Double-check the VM name → group assignments and keep them consistent across the HANA and app tiers.
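
One way to review a mapping before applying it is to list which application VMs would count as "same group" and "different group" for each possible HANA primary. The helper below is a hypothetical pre-check to run by hand, not part of the agent; both mappings are passed as dictionaries.

```python
from typing import Dict, List, Tuple


def split_by_group(primary_hana_vm: str,
                   hana_vm_zones: Dict[str, str],
                   app_vm_zones: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (same_group, different_group) app VMs for the given HANA primary."""
    primary_group = hana_vm_zones[primary_hana_vm]
    # A group label used by app VMs but by no HANA VM is almost certainly a typo.
    unknown = set(app_vm_zones.values()) - set(hana_vm_zones.values())
    if unknown:
        raise ValueError(f"app VM groups not used by any HANA VM: {sorted(unknown)}")
    same = sorted(vm for vm, g in app_vm_zones.items() if g == primary_group)
    diff = sorted(vm for vm, g in app_vm_zones.items() if g != primary_group)
    return same, diff


if __name__ == "__main__":
    hana = {"hanavm1": "1", "hanavm2": "2"}
    app = {"sapapp01": "1", "sapapp02": "1", "sapapp03": "2", "sapapp04": "2"}
    for primary in hana:
        print(primary, split_by_group(primary, hana, app))
```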

Parameter interactions and practical notes

  • hana_sid is used when the HANA Pacemaker attributes are named using a different SID than the SAP application SID.
  • When stop_vms=true, the agent:
    • waits wait_before_stop_sap seconds before initiating shutdown (to reduce churn during rapid failovers),
    • calls sapcontrol -function Stop <soft_shutdown_timeout> on the "different-zone" app VMs (soft shutdown with a configurable timeout),
    • waits for process dispstatus to become GRAY (stopped) before deallocating VMs.
  • When stop_vms=false, the agent:
    • calls sapcontrol -function ABAPSetServerInactive on the "different-zone" app VMs,
    • leaves the SAP instances running but in inactive/passive mode, and
    • does not stop/deallocate the Azure VMs.
  • Timeouts:
    • Most phases use wait_time.
    • The stop/wait-for-stop window effectively needs to cover both wait_time and soft_shutdown_timeout.

app_vm_zones format

Use a comma-separated mapping: vm_name:group.

Example: app_vm_zones="sapapp01:1,sapapp02:1,sapapp03:2,sapapp04:2"
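
For reference, a mapping string in this format can be parsed as shown below. parse_zone_mapping is a hypothetical helper written for this post; the strictness about malformed or conflicting entries is a design choice of the sketch, not documented agent behavior.

```python
from typing import Dict


def parse_zone_mapping(value: str) -> Dict[str, str]:
    """Parse 'vm_name:group,vm_name:group' into a {vm_name: group} dictionary."""
    mapping: Dict[str, str] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        name, sep, group = item.partition(":")
        name, group = name.strip(), group.strip()
        if not sep or not name or not group:
            raise ValueError(f"expected vm_name:group, got {item!r}")
        if name in mapping and mapping[name] != group:
            raise ValueError(f"{name} is mapped to two different groups")
        mapping[name] = group
    return mapping
```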

Start-time validation

If you provide app_vm_zones or hana_vm_zones in a deployment where Azure zone metadata is available, the agent validates on every start that the provided values match Azure. It will fail to start if they do not match.
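
The validation described above amounts to comparing the configured group of each VM with the zone Azure reports for it. The following sketch shows that comparison under the assumption that VMs without zone metadata are skipped; zone_mismatches is a hypothetical name.

```python
from typing import Dict, Optional


def zone_mismatches(provided: Dict[str, str],
                    azure_zones: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return {vm: message} for VMs whose configured group differs from Azure's zone.

    VMs for which Azure reports no zone (None or absent) are not checked.
    """
    problems: Dict[str, str] = {}
    for vm, group in provided.items():
        actual = azure_zones.get(vm)
        if actual is not None and actual != group:
            problems[vm] = f"configured {group}, Azure reports {actual}"
    return problems
```

An empty result means the configured mapping is consistent with whatever zone metadata Azure exposes.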

Prerequisites

Topology requirement (critical)

This solution assumes you have two equivalent sets of SAP application server VMs, one set placed/aligned with each HANA VM zone (or logical group in non-zonal/PPG deployments). Only one set is expected to be active at a time.

  • Zonal deployments: provision the same application server capacity in each Availability Zone used by the HANA primary/secondary VMs.
  • Non-zonal / PPG deployments: provision two equivalent application server sets and map them consistently using hana_vm_zones and app_vm_zones.

“Equivalent/identical” here means the VMs are prepared to run the same SAP application workload (same SAP installation/SID/instance layout and configuration as applicable for your landscape), so the agent can start SAP on the “same-zone” set and deactivate/stop SAP on the “different-zone” set during failover.

SAP workload routing/groups (required)

To ensure workloads continue seamlessly when the active application-server set switches zones/groups, configure your SAP group/routing settings to include both application server sets as appropriate for your landscape, including:

  • SAP logon groups: SMLG
  • RFC server groups: RZ12
  • Background/batch server groups: SM61
  • Spool server groups: SPAD
  • Update configuration/groups: SM14

System Requirements

Operating System Support

  • SUSE Linux Enterprise Server (SLES): 15 SP5 and above

Network Requirements

  • HANA VMs must have outbound access to Azure API endpoints
  • Required for VM management operations (start, stop, execute commands)

Azure Linux VM Agent

  • Must be installed on all SAP application server VMs
  • Pre-installed on Azure Marketplace images
  • Manual installation required for custom/non-Marketplace images
  • Installation Guide

Python Environment

  • Python 3.x installed on HANA cluster nodes
  • Required Python packages: requests (all other imports are Python standard library)

Verification Command:

python3 -c 'import os, sys, time, subprocess, re, requests, shlex, random; from typing import Dict, List, Optional'

 

HANA Resource Agent Compatibility

  • SAPHanaSR: Traditional SAP HANA System Replication resource agent
  • SAPHanaSR-angi: SAP HANA System Replication A Next Generation Interface resource agent

The azure-sap-zone resource agent automatically detects which HANA resource agent is in use and adapts accordingly.

Azure Permissions

The resource agent requires either a user-assigned managed identity (via client_id) or a system-assigned managed identity with specific Azure permissions:

Required Azure Role Actions

{
    "permissions": [
        {
            "actions": [
                "Microsoft.Compute/*/read",
                "Microsoft.Compute/virtualMachines/start/action",
                "Microsoft.Compute/virtualMachines/restart/action",
                "Microsoft.Compute/virtualMachines/powerOff/action",
                "Microsoft.Compute/virtualMachines/deallocate/action",
                "Microsoft.Compute/virtualMachines/runCommand/action",
                "Microsoft.Compute/virtualMachines/runCommands/read",
                "Microsoft.Compute/virtualMachines/runCommands/write"
            ],
            "notActions": [],
            "dataActions": [],
            "notDataActions": []
        }
    ]
}
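
If you maintain your own custom role, a quick offline check that it still covers the actions listed above can catch drift. The sketch below is a simplification: it treats * in a granted action as matching any characters, compares case-insensitively, and considers a wildcard in the required list covered only by an equal or broader granted pattern. missing_actions is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# The actions from the role definition above.
REQUIRED_ACTIONS = [
    "Microsoft.Compute/*/read",
    "Microsoft.Compute/virtualMachines/start/action",
    "Microsoft.Compute/virtualMachines/restart/action",
    "Microsoft.Compute/virtualMachines/powerOff/action",
    "Microsoft.Compute/virtualMachines/deallocate/action",
    "Microsoft.Compute/virtualMachines/runCommand/action",
    "Microsoft.Compute/virtualMachines/runCommands/read",
    "Microsoft.Compute/virtualMachines/runCommands/write",
]


def missing_actions(granted: Iterable[str],
                    required: Iterable[str] = REQUIRED_ACTIONS) -> List[str]:
    """Return the required actions not covered by any granted action or wildcard."""
    granted_lc = [g.lower() for g in granted]
    return [r for r in required
            if not any(fnmatchcase(r.lower(), g) for g in granted_lc)]
```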

Identity Assignment Requirements

  • User-assigned managed identity must be assigned to both HANA servers
  • Identity must have Virtual Machine Contributor role (or custom role with above actions)
  • Role assignment scope: SAP application server VMs' resource group (recommended) or individual VMs
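
To check from a HANA node that the identity can obtain a token, you can query the Azure Instance Metadata Service (IMDS), the standard token endpoint for managed identities on Azure VMs. This sketch assumes the agent authenticates the same way, which the post does not spell out; it uses only the standard library, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, Optional, Tuple

IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(client_id: Optional[str] = None) -> Tuple[str, Dict[str, str]]:
    """Build the IMDS URL and headers for an Azure Resource Manager token."""
    params = {
        "api-version": "2018-02-01",
        "resource": "https://management.azure.com/",
    }
    if client_id:
        # Only needed for a user-assigned managed identity.
        params["client_id"] = client_id
    return f"{IMDS_TOKEN_URL}?{urllib.parse.urlencode(params)}", {"Metadata": "true"}


def fetch_access_token(client_id: Optional[str] = None, timeout: float = 10.0) -> str:
    """Request a token from IMDS; this only works when run on an Azure VM."""
    url, headers = build_token_request(client_id)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["access_token"]
```

Calling fetch_access_token() with no argument exercises the system-assigned identity; passing the client ID exercises the user-assigned one. An HTTP error here usually points to a missing identity assignment on the VM.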

Limitations

  • Supported SAP Systems: ABAP systems on HANA scale-up only
  • Not Supported: SAP Java, HANA scale-out, multi-SID environments

 

Installation

Step 1: Azure Configuration

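Configure Azure resources using Azure CLI. Install Azure CLI if not already available. The script below calls Azure CLI from PowerShell and uses ForEach-Object -Parallel, which requires PowerShell 7 or later.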

PowerShell Script for Azure Setup

# Define parameters - Update these values for your environment
$subscriptionId = "Your-Subscription-ID"
$hanaResourceGroup = "HANA-VMs-Resource-Group"
$hanaVMNames = @("hana-vm1", "hana-vm2")
$managedIdentityName = "sap-azure-zone-alignment"
$customAzureRole = "Azure SAP Zone Alignment"

# Resource group scope assignment (recommended)
$sapAppResourceGroup = "SAP-Application-Servers-Resource-Group"

# Alternative: Direct VM assignment
$sapAppVMNames = @("sap-app1", "sap-app2", "sap-app3", "sap-app4")

# Login to Azure
az login

# Verify Azure Linux Agent and run-command capability on application servers
$sapAppVMNames | ForEach-Object -ThrottleLimit $sapAppVMNames.Count -Parallel {
    $vmName = $_
    $result = az vm run-command invoke `
        --resource-group $using:sapAppResourceGroup `
        --name $vmName `
        --command-id RunShellScript `
        --scripts "systemctl is-active waagent" `
        --output json 2>&1 | ConvertFrom-Json
    $msg = $result.value[0].message
    if ($msg -match '\[stdout\]\s*active') {
        Write-Host "[$vmName] OK - waagent active, run-command working"
    } else {
        Write-Host "[$vmName] FAIL - unexpected output: $msg"
    }
}

# Create custom Azure role
$roleDefinition = @{
    Name = $customAzureRole
    IsCustom = $true
    Description = "Custom Azure role for azure-sap-zone pacemaker resource agent"
    Actions = @(
        "Microsoft.Compute/*/read",
        "Microsoft.Compute/virtualMachines/start/action",
        "Microsoft.Compute/virtualMachines/restart/action",
        "Microsoft.Compute/virtualMachines/powerOff/action",
        "Microsoft.Compute/virtualMachines/deallocate/action",
        "Microsoft.Compute/virtualMachines/runCommand/action",
        "Microsoft.Compute/virtualMachines/runCommands/read",
        "Microsoft.Compute/virtualMachines/runCommands/write"
    )
    NotActions = @()
    AssignableScopes = @("/subscriptions/$subscriptionId")
} | ConvertTo-Json -Depth 3

$roleDefinition | Out-File -FilePath "$env:TEMP\az-role.json" -Encoding utf8
az role definition create --role-definition "$env:TEMP\az-role.json"

# Recommendation: Use a user-assigned managed identity for authentication. 
# System-assigned managed identities are also supported; if you choose this option, 
# ensure that system-assigned managed identity is enabled on both HANA VMs and that 
# the required roles (listed below) are assigned to each system identity.

# Create user-assigned managed identity
$managedIdentityResourceId = az identity create `
    --resource-group $hanaResourceGroup `
    --name $managedIdentityName `
    --query id --output tsv

# Assign managed identity to HANA VMs
foreach ($vmName in $hanaVMNames) {
    az vm identity assign `
        --resource-group $hanaResourceGroup `
        --name $vmName `
        --identities $managedIdentityResourceId
}

# Alternative: Enable system-assigned managed identity (uncomment if preferred)
# foreach ($vmName in $hanaVMNames) {
#     az vm identity assign `
#         --resource-group $hanaResourceGroup `
#         --name $vmName
# }

# Assign role to managed identity (resource group scope)
$managedIdentityPrincipalId = az identity show `
    --resource-group $hanaResourceGroup `
    --name $managedIdentityName `
    --query principalId --output tsv

az role assignment create `
    --assignee-object-id $managedIdentityPrincipalId `
    --assignee-principal-type ServicePrincipal `
    --role $customAzureRole `
    --scope "/subscriptions/$subscriptionId/resourceGroups/$sapAppResourceGroup"

# Display the client ID (needed for cluster configuration)
Write-Host "Managed Identity Client ID:"
az identity show `
    --resource-group $hanaResourceGroup `
    --name $managedIdentityName `
    --query clientId --output tsv

Step 2: Install Resource Agent

Download and Install on Both HANA Cluster Nodes

# Download the resource agent script
curl -o azure-sap-zone.in https://raw.githubusercontent.com/ClusterLabs/resource-agents/refs/heads/main/heartbeat/azure-sap-zone.in

# Create the resource agent file
sudo cp azure-sap-zone.in /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone

# Convert line endings first so the sed pattern below can match the end of line 1
sudo dos2unix /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone

# Update the interpreter line
# Note: the downloaded file typically starts with the placeholder `#!@PYTHON@ -tt`.
# Replace it with your actual python3 path.
PYTHON3_PATH="$(command -v python3)"
echo "python3: ${PYTHON3_PATH}"
# Bash note: `!` triggers history expansion inside double-quotes, so use this quoting form.
sudo sed -i '1 s|^#!@PYTHON@ -tt$|#!'"${PYTHON3_PATH}"' -tt|' /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone

# If you need to force a specific interpreter path, you can also do:
# sudo sed -i '1 s|^#!@PYTHON@ -tt$|#!/usr/bin/python3 -tt|' /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone

# Confirm the placeholder was replaced, then set permissions
head -n 1 /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone
sudo chmod +x /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone

# Copy to secondary node (alternative: repeat above steps manually)
sudo scp /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone <secondary-hana-vm>:/usr/lib/ocf/resource.d/heartbeat/

 

Configuration

Configuration Options

The resource agent provides two distinct behaviors for application servers in the different zone/group.

Option 1: Deactivate (passive mode), stop_vms=false

  • What happens to SAP: SAP stays running, but the agent calls sapcontrol -function ABAPSetServerInactive to set the instance inactive/passive
  • Do the “different-zone” servers take new users/jobs/sessions? No. The server is kept out of service for new workload (e.g., new user logons, new batch/background work, and other new sessions)
  • What happens to the Azure VMs: VMs stay running
  • Typical trade-off: Fastest to make active again, but no Azure compute cost savings for the inactive zone because the VMs keep running

Option 2: Soft shutdown + stop/deallocate, stop_vms=true

  • What happens to SAP: Agent calls sapcontrol -function Stop <soft_shutdown_timeout> (graceful stop with a configurable timeout) and waits until the instance is stopped (dispstatus=GRAY)
  • Do the “different-zone” servers take new users/jobs/sessions? No. During shutdown the instance is not available for new workload/sessions
  • What happens to the Azure VMs: After shutdown, VMs are stopped and deallocated
  • Typical trade-off: Slower to re-activate (VM boot + SAP start), but can save costs in pay-as-you-go models by deallocating the inactive-zone VMs

Notes:

  • soft_shutdown_timeout controls how long SAP is given to stop gracefully.
  • stop_vms=true is the only mode where the agent will stop/deallocate VMs.

Note on capacity: When using stop_vms=true, deallocated VMs are not guaranteed to have capacity available when restarted. Consider using On-Demand Capacity Reservations (ODCR) or Capacity Reservation Groups to ensure VM sizes remain available in both zones. The resource agent does not manage capacity reservations — this is an infrastructure planning consideration.

Cluster Configuration Examples

The configuration has two parts:

  1. Create the primitive resource — choose one of the examples below (A–F) based on your deployment pattern.
  2. Create the clone and order constraint — this is required regardless of which example you use (see Step 2 below).

Note on monitor interval: the resource agent advertises a default monitor interval of 300 seconds in its meta-data. The examples below use a shorter interval (e.g. 10s) to detect failovers quickly; choose an interval appropriate for your environment.

Step 1: Create the primitive resource

Choose the example that matches your deployment:

Example A: Zonal deployment (system-assigned managed identity) + explicit VM list

Use this when Azure Availability Zones are present and you want to provide an explicit list of application VMs.

SLES (crmsh):

sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
    params sid=<SAP_SID> \
           hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
           app_vm_names=<app_vm1,app_vm2,app_vm3,app_vm4> \
           stop_vms=false \
           wait_time=600 \
           verbose=true \
    meta failure-timeout=120s \
    op start start-delay=60s interval=0s timeout=360s \
    op monitor interval=10s timeout=360s \
    op stop timeout=10s interval=0s on-fail=ignore

Example B: Zonal deployment (user-assigned managed identity) + VM name pattern

Use this when Azure Availability Zones are present and you want the agent to discover application VMs by name.

SLES (crmsh):

sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
    params sid=<SAP_SID> \
           hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
           app_vm_name_pattern=<REGEX_OR_PREFIX_PATTERN> \
           client_id=<MANAGED_IDENTITY_CLIENT_ID> \
           stop_vms=false \
           wait_time=600 \
           verbose=true \
    meta failure-timeout=120s \
    op start start-delay=60s interval=0s timeout=360s \
    op monitor interval=10s timeout=360s \
    op stop timeout=10s interval=0s on-fail=ignore

Example C: Non-zonal / PPG deployment (logical grouping)

Use this when Azure zone metadata is missing (for example, proximity placement group or other non-zonal deployments).

Key points:

  • Set hana_vm_zones to map each HANA VM name to a logical group label (for example hanavm1:1,hanavm2:2).
  • Set app_vm_zones to map each application VM name (or just the subset missing zone metadata) to a logical group label.
  • If Azure later reports real zone data for those VMs, the agent validates on every start that your mapping matches Azure and fails if it does not.

SLES (crmsh):

sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
    params sid=<SAP_SID> \
           hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
           hana_vm_zones="<hana_vm1>:1,<hana_vm2>:2" \
           app_vm_zones="sapapp01:1,sapapp02:1,sapapp03:2,sapapp04:2" \
           stop_vms=false \
           wait_time=600 \
           verbose=true \
    meta failure-timeout=120s \
    op start start-delay=60s interval=0s timeout=360s \
    op monitor interval=10s timeout=360s \
    op stop timeout=10s interval=0s on-fail=ignore

Example D: Mixed deployment (mostly zonal, a few VMs missing zone metadata)

Use this when most application VMs have Azure zone metadata, but a small subset does not. Provide the full VM list via app_vm_names (or discovery via app_vm_name_pattern), and provide app_vm_zones only for the VMs that are missing zone metadata.

SLES (crmsh):

sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
    params sid=<SAP_SID> \
           hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
           app_vm_names=<app_vm1,app_vm2,app_vm3,...> \
           app_vm_zones="<nonzonal_vm_a>:1,<nonzonal_vm_b>:2" \
           stop_vms=false \
           wait_time=600 \
           verbose=true

Example E: Zonal deployment with stop_vms=true (shutdown + deallocate different-zone VMs)

Use this when you want maximum cost optimization by shutting down and deallocating the application VMs in the non-primary zone.

SLES (crmsh):

sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
    params sid=<SAP_SID> \
           hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
           app_vm_names=<app_vm1,app_vm2,app_vm3,app_vm4> \
           stop_vms=true \
           wait_before_stop_sap=300 \
           soft_shutdown_timeout=600 \
           wait_time=600 \
           verbose=true \
    op start start-delay=60s interval=0s timeout=360s \
    op monitor interval=10s timeout=360s \
    op stop timeout=10s interval=0s on-fail=ignore

Example F: HANA SID differs from SAP SID (hana_sid)

Use this when the HANA cluster uses a different SID, so the HANA Pacemaker attributes are named hana_<hana_sid>_*.

SLES (crmsh):

sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
    params sid=<SAP_SID> \
           hana_sid=<HANA_SID> \
           hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
           app_vm_names=<app_vm1,app_vm2,app_vm3,app_vm4> \
           stop_vms=false \
           verbose=true

Step 2: Create clone and order constraint

After creating the primitive resource using any of the examples above, run the following commands to create the clone resource and order constraint. This is required for all deployment patterns.

SLES (crmsh):

# Create clone resource (runs on both nodes)
sudo crm configure clone cln_azure-sap-zone azure-sap-zone \
    meta clone-node-max=1 target-role=Started interleave=true

# Create order constraint (start after HANA resource)
sudo crm configure order ord_azure-sap-zone Mandatory: <HANA_CLONE_RESOURCE> cln_azure-sap-zone symmetrical=false

Usage

After installation and configuration, the resource agent will automatically:

  1. Monitor HANA primary location: Detects which availability zone hosts the current HANA primary
  2. Manage application servers: Starts/stops or activates/deactivates SAP application servers based on zone alignment
  3. Handle failover scenarios: Automatically adjusts during HANA failover events

Manual Operations

Enable Verbose Logging

# SLES
sudo crm_resource --resource azure-sap-zone --set-parameter verbose --parameter-value true

 

Resource Management

Put the resource in maintenance mode (for planned maintenance):

While the resource is in maintenance mode, Pacemaker treats it as unmanaged and does not run operations (start, stop, monitor) on it, which prevents the agent from taking action during planned maintenance.

# SLES — enable maintenance mode
sudo crm resource maintenance cln_azure-sap-zone on

Resume resource management:

# SLES — disable maintenance mode
sudo crm resource maintenance cln_azure-sap-zone off

 

Validating the Setup with a Test Failover

Once the resource agent is installed and configured, we recommend running through a test failover on a non-production system to confirm everything works end to end. The steps below walk you through the before, during, and after of a validation cycle.

Step 1: Verify the resource agent is running

Before triggering a failover, confirm the resource agent is healthy and the cluster sees it on both nodes:

# Check overall cluster status
sudo crm status

# Verify the azure-sap-zone clone is started on both nodes
sudo crm resource show cln_azure-sap-zone

You should see the clone resource running on both HANA cluster nodes.

Step 2: Check the initial state

Record the current state so you can compare after the failover:

# Which node is the HANA primary?
sudo crm status | grep -i "Masters\|Promoted"

# What zone/phase does the resource agent report?
sudo crm_attribute --name azure_sap_zone_current_phase --query --quiet --node $(hostname)

# Check application server VM power state (from Azure CLI, if available)
az vm list -g <SAP-App-Resource-Group> -d --query "[].{Name:name, PowerState:powerState, Zone:zones[0]}" -o table

At this point the phase should be all_phases_completed (if the agent has already aligned once) or Started / no_action_required depending on which node you are on.

Step 3: Enable verbose logging (recommended)

Turn on verbose logging before the failover so you can trace every phase in detail:

sudo crm_resource --resource azure-sap-zone --set-parameter verbose --parameter-value true

Step 4: Trigger a test HANA failover

Important: Only perform this on a test/non-production system.

You can trigger a controlled HANA takeover using standard Pacemaker commands. The exact method depends on your HANA resource agent:

# Option A: Migrate the HANA primary to the secondary node
sudo crm resource move <HANA_CLONE_RESOURCE> <target-node> force

# After the move completes, clear the location constraint so Pacemaker can manage normally
sudo crm resource clear <HANA_CLONE_RESOURCE>

Alternatively, if your runbook uses sr_takeover or SAPHanaSR tools, follow your existing takeover procedure. The key point is that the HANA primary ends up on the other node/zone.

Step 5: Monitor the resource agent's progress

After the failover, the resource agent on the new primary node will detect the zone change and begin executing its phases. You can watch it in real time:

# Watch the phase attribute update (run on the new primary node)
watch -n 5 'crm_attribute --name azure_sap_zone_current_phase --query --quiet --node $(hostname)'

You should see the phase progress through:

  1. start_vms_in_same_zone
  2. wait_for_vms_in_same_zone_to_start
  3. start_sap_in_same_zone
  4. wait_for_sap_in_same_zone_to_start
  5. stop_sap_in_diff_zone
  6. wait_for_sap_in_diff_zone_to_stop (only when stop_vms=true)
  7. stop_vms_in_diff_zone (only when stop_vms=true)
  8. all_phases_completed
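
If you prefer a scripted check over watch, the polling loop below waits until the phase attribute reaches all_phases_completed or a deadline passes. read_phase wraps the crm_attribute call shown above; wait_for_phase, its defaults, and the 30-minute timeout are assumptions made for this sketch.

```python
import subprocess
import time
from typing import Callable


def read_phase(node: str) -> str:
    """Read the phase attribute with the crm_attribute call shown above."""
    result = subprocess.run(
        ["crm_attribute", "--name", "azure_sap_zone_current_phase",
         "--query", "--quiet", "--node", node],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def wait_for_phase(reader: Callable[[], str],
                   target: str = "all_phases_completed",
                   timeout: float = 1800.0,
                   poll: float = 5.0,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll reader() until it returns target; return False if timeout elapses."""
    deadline = clock() + timeout
    seen = None
    while True:
        phase = reader()
        if phase != seen:
            print(f"phase: {phase}")  # log each transition once
            seen = phase
        if phase == target:
            return True
        if clock() >= deadline:
            return False
        sleep(poll)
```

On the new primary node this could be called as wait_for_phase(lambda: read_phase("hanavm2")), where hanavm2 is a placeholder node name.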

Step 6: Validate the outcome

Once the phase reaches all_phases_completed, verify that the application tier has been aligned correctly.

Check application VMs in the same zone as the new HANA primary:

# Verify VMs are running
az vm list -g <SAP-App-Resource-Group> -d --query "[].{Name:name, PowerState:powerState, Zone:zones[0]}" -o table

# Verify SAP instances are active (GREEN) — run on a same-zone app VM
sapcontrol -nr <instance_number> -function GetProcessList

 

All SAP processes on the same-zone VMs should show dispstatus: GREEN.
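
When checking several app VMs, it can help to evaluate the GetProcessList output programmatically. The parser below assumes the usual comma-separated layout with a header line starting with name, description, dispstatus, and that process descriptions contain no commas; process_status and all_green are hypothetical helpers.

```python
from typing import Dict


def process_status(output: str) -> Dict[str, str]:
    """Map each process name to its dispstatus from GetProcessList output."""
    statuses: Dict[str, str] = {}
    in_table = False
    for line in output.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if fields[:3] == ["name", "description", "dispstatus"]:
            in_table = True  # data rows follow the header line
            continue
        if in_table and len(fields) >= 3:
            statuses[fields[0]] = fields[2]
    return statuses


def all_green(output: str) -> bool:
    """True only when at least one process is listed and every one is GREEN."""
    statuses = process_status(output)
    return bool(statuses) and all(s == "GREEN" for s in statuses.values())
```

The same process_status helper can confirm a stopped instance by checking that every value is GRAY.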

Check application VMs in the different zone:

  • If stop_vms=false: the VMs should still be running, but SAP instances should be in inactive/passive mode. You can verify this by checking logon groups (SMLG) or the server's active status.
  • If stop_vms=true: the VMs should be stopped/deallocated in the Azure portal or via az vm list.
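
The VM-level part of this check can be automated against the JSON form of the az vm list query shown earlier (use -o json instead of -o table). The sketch assumes the Azure CLI power state strings "VM running" and "VM deallocated" and that every VM in the output is an application server; alignment_problems is a hypothetical helper, and it does not verify the SAP instance state.

```python
import json
from typing import Any, Dict, List


def load_rows(az_json: str) -> List[Dict[str, Any]]:
    """Parse the JSON printed by the az vm list query (keys Name, PowerState, Zone)."""
    return json.loads(az_json)


def alignment_problems(vm_rows: List[Dict[str, Any]],
                       primary_zone: str,
                       stop_vms: bool) -> List[str]:
    """List VMs whose power state does not match the expected post-failover state."""
    problems = []
    for row in vm_rows:
        name, state, zone = row["Name"], row["PowerState"], row.get("Zone")
        if zone == primary_zone:
            expected = "VM running"
        else:
            # Different-zone VMs are only deallocated when stop_vms=true.
            expected = "VM deallocated" if stop_vms else "VM running"
        if state != expected:
            problems.append(f"{name} (zone {zone}): expected '{expected}', found '{state}'")
    return problems
```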

Step 7: Review the logs

Check the Pacemaker log to confirm all phases executed without errors:

# View all agent activity
sudo grep -i 'azure-sap-zone' /var/log/pacemaker/pacemaker.log | tail -50

# Filter to only INFO/WARNING/ERROR messages (skip routine monitor noise)
sudo grep -iE 'azure-sap-zone.*(INFO|WARNING|ERROR):' /var/log/pacemaker/pacemaker.log | grep -v -iE "All phases|monitor: Started"

Example output from the first command (the filtered command omits the monitor: Started and All phases lines):

Apr 23 10:15:32 hanavm1 azure-sap-zone INFO: monitor: Started
Apr 23 10:15:32 hanavm1 azure-sap-zone INFO: Executing phase: start_vms_in_same_zone
Apr 23 10:15:35 hanavm1 azure-sap-zone INFO: Executing phase: wait_for_vms_in_same_zone_to_start
Apr 23 10:15:45 hanavm1 azure-sap-zone INFO: All VMs are started
Apr 23 10:15:45 hanavm1 azure-sap-zone INFO: Executing phase: start_sap_in_same_zone
Apr 23 10:15:45 hanavm1 azure-sap-zone INFO: Starting SAP on VMs: ['sapapp01', 'sapapp02']
Apr 23 10:16:02 hanavm1 azure-sap-zone INFO: Executing phase: wait_for_sap_in_same_zone_to_start
Apr 23 10:16:15 hanavm1 azure-sap-zone INFO: All SAP instances are started
Apr 23 10:16:15 hanavm1 azure-sap-zone INFO: Executing phase: stop_sap_in_diff_zone
Apr 23 10:16:15 hanavm1 azure-sap-zone INFO: Setting SAP instances to passive mode on VMs: ['sapapp03', 'sapapp04']
Apr 23 10:16:20 hanavm1 azure-sap-zone INFO: All phases have been executed successfully
Apr 23 10:16:20 hanavm1 azure-sap-zone INFO: monitor: Finished

Look for:

  • Phase transitions: confirm each phase started and completed in order
  • No errors: no ERROR or FAIL messages
  • Timing: note how long the full cycle took — this is your expected failover alignment time
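
The three checks above can be approximated with a short awk pass over the agent's log lines. This is a minimal sketch that assumes the line layout shown in the example output (month, day, HH:MM:SS, host, then the message); alignment_summary is a hypothetical helper name. It measures from the first "Executing phase:" line to the "All phases have been executed successfully" line and counts lines containing ERROR or FAIL.

```shell
#!/bin/sh
# Hypothetical helper. Reads agent log lines on stdin in the layout of the
# example output above: "Mon DD HH:MM:SS host azure-sap-zone LEVEL: message".
# Prints "duration_seconds=<n> errors=<n>".
# Exits 1 if the run did not complete or if any ERROR/FAIL line was seen.
alignment_summary() {
    awk '
        # Convert HH:MM:SS to seconds since midnight.
        function secs(t,   p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
        /ERROR|FAIL/                                 { errors++ }
        /Executing phase:/ && !started               { start = secs($3); started = 1 }
        /All phases have been executed successfully/ { finish = secs($3); finished = 1 }
        END {
            if (!started || !finished) { print "incomplete run"; exit 1 }
            d = finish - start
            if (d < 0) d += 86400    # the run crossed midnight
            printf "duration_seconds=%d errors=%d\n", d, errors
            exit (errors > 0)
        }
    '
}
```

Example usage: sudo grep -i 'azure-sap-zone' /var/log/pacemaker/pacemaker.log | tail -50 | alignment_summary. For the example output above this reports a duration of 48 seconds (10:15:32 to 10:16:20) with no errors. If the log contains more than one alignment run, narrow the input to a single run first.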

Step 8: Clean up

After validation, you can disable verbose logging to reduce log volume:

sudo crm_resource --resource azure-sap-zone --set-parameter verbose --parameter-value false

If you triggered the failover using crm resource move, make sure the location constraint was cleared (Step 4) so Pacemaker can manage resources normally going forward.

Troubleshooting

Common Issues

1. Authentication Problems

  • Verify managed identity is assigned to HANA VMs
  • Check Azure role assignments
  • Ensure proper permissions on target application server VMs

2. Network Connectivity

  • Validate outbound access to Azure API endpoints
  • Check firewall rules and network security groups

3. Azure Linux Agent Issues

  • Verify agent status: systemctl status waagent
  • Check agent logs: /var/log/waagent.log
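
For the authentication and connectivity checks above, a quick test from a HANA VM is to request a token from the Azure Instance Metadata Service (IMDS) and then try to reach the Azure Resource Manager endpoint. The sketch below is one way to do that and is not taken from the agent itself: the helper names are hypothetical, and it assumes curl is installed and that the agent's Azure API calls go to management.azure.com.

```shell
#!/bin/sh
# Hypothetical helpers for checking the managed identity from a HANA VM.

# Build the IMDS token URL for the Azure Resource Manager resource.
# $1 = optional client ID of a user-assigned identity (omit for system-assigned).
imds_token_url() {
    itu_url="http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F"
    if [ -n "${1:-}" ]; then
        itu_url="${itu_url}&client_id=$1"
    fi
    echo "$itu_url"
}

# Request a token and discard it. A non-zero exit means IMDS was unreachable
# or returned an HTTP error (for example, no identity assigned to the VM).
check_managed_identity() {
    curl --silent --fail --max-time 10 -H "Metadata: true" \
        "$(imds_token_url "${1:-}")" >/dev/null
}

# Any HTTP response (even an error status) shows the endpoint is reachable.
check_arm_reachable() {
    curl --silent --max-time 10 --output /dev/null https://management.azure.com/
}
```

Example usage: check_managed_identity <client_id> && check_arm_reachable && echo "identity and outbound access look OK". A successful token request only shows that the identity is assigned; it does not confirm the role assignments on the application server VMs.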

Log Analysis

View Resource Agent Logs

sudo grep -i 'azure-sap-zone' /var/log/pacemaker/pacemaker.log

Common Log Patterns

  • Phase transitions: Look for "current_phase" changes
  • API errors: Search for "Azure API" or "HTTP" error codes
  • Timeout issues: Check for "timeout" or "wait_time exceeded"
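
To get a quick count of how often each pattern appears, the searches above can be combined into a single pass. This is a minimal sketch: summarize_log_patterns is a hypothetical helper, matching is case-insensitive, and the search strings are the ones listed above. The exact wording of the agent's log messages may differ.

```shell
#!/bin/sh
# Hypothetical helper. Reads agent log lines on stdin and counts how many
# match each of the pattern groups listed above (case-insensitive).
summarize_log_patterns() {
    awk '
        { line = tolower($0) }
        line ~ /current_phase/              { phase++ }
        line ~ /azure api|http/             { api++ }
        line ~ /timeout|wait_time exceeded/ { timeouts++ }
        END { printf "phase_transitions=%d api_http=%d timeouts=%d\n", phase, api, timeouts }
    '
}
```

Example usage: sudo grep -i 'azure-sap-zone' /var/log/pacemaker/pacemaker.log | summarize_log_patterns. Non-zero api_http or timeouts counts tell you which grep to run next; they are not a diagnosis.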

Performance Monitoring

Monitor the following metrics:

  • Phase execution times: Should complete within configured wait_time
  • API response times: Azure API calls should complete in under 30 seconds
  • VM startup times: Application server boot time affects total failover duration

FAQ

Q: Can I use this with SAP Java systems?
A: No. This resource agent currently supports only SAP ABAP systems on HANA scale-up configurations.

Q: What HANA resource agents are supported?
A: The agent supports both SAPHanaSR and SAPHanaSR-angi (A Next Generation Interface) resource agents and automatically detects which one is in use.

Q: What happens if an application server VM fails to start?
A: The resource agent will retry based on the retry_count parameter and eventually fail the phase if the VM doesn't start within the wait_time.

Q: Can I run this in a multi-SID environment?
A: No, multi-SID environments are not currently supported.

Q: Can I use different SIDs for SAP and HANA?
A: Yes, use the hana_sid parameter if your HANA SID differs from your SAP SID.

Q: Can I use system-assigned managed identity instead of user-assigned?
A: Yes, simply omit the client_id parameter and ensure system-assigned managed identity is enabled on both HANA VMs with appropriate permissions.

Important Notes

Resource State Management: Upon completion, the cluster attribute azure_sap_zone_current_phase is set to all_phases_completed. The resource agent will not take further action until restarted.

Maintenance Operations: When performing maintenance on application servers in different zones (e.g., patching), put the resource in maintenance mode to prevent the agent from taking action:

# SLES
sudo crm resource maintenance cln_azure-sap-zone on

Zone Alignment: This solution requires identical SAP application server VMs in both availability zones. Only one set should be active at any time.

Wrapping up

With the information in this post, you should have what you need to evaluate the Azure SAP Zone Resource Agent in your environment - from setting up the managed identity and permissions, to installing the agent, configuring the cluster, and troubleshooting common issues. If you haven't already, we recommend reading Part 1 for an introduction to the concepts and features behind this solution.

We welcome your feedback during this public preview. If you encounter issues or have suggestions, please file them via GitHub Issues on the ClusterLabs resource-agents repository.

Public preview expectations

During public preview:

  • This solution is provided as a Public Preview for evaluation and feedback.
  • It is not covered by a formal support commitment.
  • The design, configuration, and behaviors may evolve based on learnings.

Because of that, we recommend using this in non-production environments while it is in preview.

If you're interested in piloting the preview, your feedback will help shape what becomes generally available and supported.

Disclaimer

This post describes a public preview capability. It is shared for informational purposes only and is subject to change. It is not a substitute for your organization's validation, testing, and operational readiness reviews.

Updated Apr 23, 2026
Version 1.0