This is Part 2 of a two-part series. Part 1 covers the concepts and features; this post covers the technical details: architecture, setup, configuration, and troubleshooting. Note: This feature is in public preview and is recommended for non-production use only.
Overview
In Part 1, we discussed why keeping the SAP application tier aligned with the HANA primary zone matters for latency-sensitive workloads, and how the Azure SAP Zone Resource Agent automates this alignment after failovers. In this post, we get into the specifics: how the agent is structured, what it needs to run, and how to set it up in your Pacemaker cluster.
The azure-sap-zone resource agent is a Pacemaker resource agent designed to manage the alignment of SAP application Azure Virtual Machines (VMs) with the primary HANA Azure VM. This agent ensures that SAP application servers are started in the same Azure availability zone as the HANA primary VM to maintain high availability and optimal performance.
Key Benefits
- Reduced Latency: Minimizes cross-zone network latency between application and database tiers
- High Availability: Maintains SAP system availability during failover scenarios
- Automated Management: Automatically handles VM and SAP instance lifecycle during zone transitions
- Cost Optimization: Can deallocate the standby-zone VMs (with stop_vms=true) so you only pay for the active application server set
Architecture
The following diagram illustrates how the resource agent manages SAP application server alignment with the primary HANA database across Azure availability zones:
Key Components:
- HANA Cluster: Primary and secondary HANA VMs are deployed in separate availability zones with System Replication configured
- Pacemaker Cluster: Runs across both zones with the azure-sap-zone resource agent deployed on both nodes. The SAP application server VMs are not Pacemaker cluster members - they are managed remotely by the agent via Azure APIs.
- Application Servers: Identical sets of SAP application VMs deployed in both availability zones
- Azure Management API: Used by the resource agent to control VM lifecycle and execute remote commands
- Managed Identity: Provides authentication for Azure API operations
Current State (Zone 1 Primary):
- Zone 1: HANA Primary is active, SAP application servers are ACTIVE
- Zone 2: HANA Secondary is in standby, SAP application servers are STANDBY (VMs may be running but SAP instances deactivated, or VMs stopped based on stop_vms parameter)
Failover Scenario (Zone 2 becomes Primary):
- HANA failover occurs from Zone 1 to Zone 2
- Pacemaker detects the failover and the azure-sap-zone resource agent triggers
- Agent starts VMs and SAP instances in Zone 2 (same zone as new primary HANA)
- Agent stops/deactivates SAP instances and optionally stops VMs in Zone 1
- Result: Zone 2 becomes the active zone with both HANA Primary and active SAP application servers
How It Works
Background
In Azure deployments of SAP systems with scale-up HANA configurations, optimizing latency between SAP application servers and the HANA database server can significantly enhance performance. In typical zonal deployments, primary and secondary HANA servers are located in different availability zones, with SAP application servers distributed across these zones. In certain Azure regions, cross-zonal latency may be higher, affecting performance for processes involving significant data transfer between application and database tiers.
Solution Overview
This resource agent addresses latency concerns by placing critical application servers in the same availability zone as the primary HANA database server. The solution provisions identical SAP application server VMs in both availability zones, with only one set active at any given time.
Execution Workflow
During a database failover, the resource agent executes the following phases in sequence:
- start_vms_in_same_zone: Initiates virtual machines in the same zone as the primary HANA VM
- wait_for_vms_in_same_zone_to_start: Waits for VMs in the same zone to start successfully
- start_sap_in_same_zone: Starts SAP instances in the same zone (parallel execution supported)
- wait_for_sap_in_same_zone_to_start: Waits for SAP instances in the same zone to start successfully
- stop_sap_in_diff_zone: Stops or deactivates SAP instances in different zones (behavior depends on stop_vms parameter)
- wait_for_sap_in_diff_zone_to_stop: Waits for SAP instances in different zones to shut down (skipped when stop_vms=false)
- stop_vms_in_diff_zone: Stops VMs in different zones (skipped when stop_vms=false)
The resource agent supports both SAPHanaSR and SAPHanaSR-angi (A Next Generation Interface) resource agents for HANA state detection.
Each phase includes built-in timeout management (controlled by the wait_time parameter) and retry logic. The stop_vms parameter determines whether the agent fully stops and deallocates VMs or just deactivates SAP instances in the non-primary zone.
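The phase sequence and its skip rules can be sketched as a simple list filter. This is an illustrative sketch, not the agent's actual source; the phase names come from the workflow above and the skip logic follows the documented stop_vms behavior.

```python
# Illustrative sketch of the failover phase sequence (not the agent's code).
# Phase names match the execution workflow above; the skip rules follow
# the documented stop_vms behavior.

ALL_PHASES = [
    "start_vms_in_same_zone",
    "wait_for_vms_in_same_zone_to_start",
    "start_sap_in_same_zone",
    "wait_for_sap_in_same_zone_to_start",
    "stop_sap_in_diff_zone",
    "wait_for_sap_in_diff_zone_to_stop",   # skipped when stop_vms=false
    "stop_vms_in_diff_zone",               # skipped when stop_vms=false
]

def phases_to_run(stop_vms: bool) -> list:
    """Return the phases executed for one failover, honoring stop_vms."""
    if stop_vms:
        return list(ALL_PHASES)
    skipped = {"wait_for_sap_in_diff_zone_to_stop", "stop_vms_in_diff_zone"}
    return [p for p in ALL_PHASES if p not in skipped]

print(phases_to_run(False)[-1])  # stop_sap_in_diff_zone
```

With stop_vms=false the sequence ends after deactivating the different-zone SAP instances; with stop_vms=true the agent additionally waits for the stop and then deallocates the VMs.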
Cluster Attributes
The resource agent uses the following cluster node attributes to track execution state:
| Attribute | Description |
|---|---|
| azure_sap_zone_current_phase | Stores the current phase of execution |
| azure_sap_zone_phase_start_time | Records the start time for each phase (used for timeout detection) |
Configuration Parameters
The following cluster resource parameters configure the resource agent's behavior:
| Name | Description | Type | Default | Required | Example |
|---|---|---|---|---|---|
| sid | SAP System ID (SID) name | string | - | ✓ | S4H |
| hana_sid | HANA System ID (if different from SAP SID) | string | sid value | ✗ | HDB |
| hana_vm_zones | Mapping of HANA VM name to logical zone group (optional; for non-zonal/PPG scenarios) | string | - | ✗ | hanavm1:1,hanavm2:2 |
| verbose | Enable verbose logging | boolean | false | ✗ | true |
| soft_shutdown_timeout | Soft shutdown timeout (seconds). Used as the timeout argument for SAP stop operations when stop_vms=true | integer | 600 | ✗ | 600 |
| app_vm_names | Comma-separated list of SAP application server VM names | string | - | ✗* | sapapp01,sapapp02,sapapp03,sapapp04 |
| app_vm_name_pattern | Regex pattern to identify SAP application server VM names | string | - | ✗* | sapapp.* |
| resource_group | Azure resource group for SAP application servers | string | HANA VMs RG | ✗ | sap-app-rg |
| hana_resource | Name of the HANA resource in Pacemaker cluster | string | - | ✓ | rsc_SAPHana_S4H_HDB00 |
| client_id | Client ID of user-assigned managed identity (optional for system identity) | string | - | ✗ | a1b2c3d4-e5f6-7890-abcd-ef1234567890 |
| stop_vms | Stop VMs in different zones (true) or just deactivate SAP instances (false) | boolean | false | ✗ | false |
| wait_before_stop_sap | Wait time before stopping SAP instances in different zones (seconds) | integer | 300 | ✗ | 300 |
| wait_time | Wait time for phases to complete (seconds) | integer | 600 | ✗ | 600 |
| retry_count | Azure API retry count | integer | 3 | ✗ | 3 |
| retry_wait | Wait time between retries (seconds) | integer | 20 | ✗ | 20 |
| app_vm_zones | Mapping of app VM name to logical zone group (optional; for non-zonal/PPG scenarios) | string | - | ✗* | sapapp01:1,sapapp02:1,sapapp03:2,sapapp04:2 |
Note: Provide at least one of app_vm_names, app_vm_name_pattern, or app_vm_zones. The effective VM list is resolved as follows:
- If app_vm_names is set, it is used as the explicit list and app_vm_name_pattern is ignored.
- Otherwise, if app_vm_name_pattern is set, VMs whose names match the pattern are discovered.
- Any VM names present in app_vm_zones are always merged into the resulting list.
If app_vm_zones is provided but neither app_vm_names nor app_vm_name_pattern is set, the agent treats app_vm_zones as the authoritative source of application VM names.
app_vm_zones is a supplemental mapping primarily intended for non-zonal/PPG scenarios; it can be used just for the subset of VMs that have no Azure zone metadata.
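The resolution rules above can be sketched in a few lines. This is an illustrative sketch of the precedence described, not the agent's source; discovered_vms stands in for the VM names the agent would get from the Azure API.

```python
# Sketch of effective app-VM list resolution (illustrative, not agent code):
# explicit names win over the pattern; app_vm_zones names are always merged.
import re

def effective_vm_list(app_vm_names=None, app_vm_name_pattern=None,
                      app_vm_zones=None, discovered_vms=()):
    """discovered_vms stands in for VM names returned by the Azure API."""
    vms = set()
    if app_vm_names:
        vms.update(name.strip() for name in app_vm_names.split(","))
    elif app_vm_name_pattern:  # pattern is only used without an explicit list
        pattern = re.compile(app_vm_name_pattern)
        vms.update(v for v in discovered_vms if pattern.fullmatch(v))
    if app_vm_zones:  # mapping names are always merged in
        vms.update(entry.split(":")[0].strip()
                   for entry in app_vm_zones.split(","))
    return sorted(vms)

print(effective_vm_list(app_vm_name_pattern="sapapp.*",
                        discovered_vms=["sapapp01", "sapapp02", "otherdb1"]))
# ['sapapp01', 'sapapp02']
```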
Non-zonal/PPG Note: In proximity placement group (PPG) or other non-zonal deployments, Azure VM metadata and ARM VM properties may not include an Availability Zone. In that case:
- Set hana_vm_zones to map each HANA VM name to a logical group label (e.g. hanavm1:1,hanavm2:2).
- Set app_vm_zones to map each SAP application VM (or just the subset missing zone metadata) to a logical group label. These are logical labels used for alignment, not necessarily Azure Availability Zones.
Warning (zone/group mappings): Be very careful when setting hana_vm_zones and app_vm_zones (sometimes referred to as “hana zone” / “app VM zone” parameters). In deployments where Azure zone metadata is unavailable, these mappings fully determine which application VMs are considered “same group” vs “different group”.
If the grouping is wrong, the agent can take action on the wrong servers:
- With stop_vms=false, it may deactivate (make passive) SAP instances on the wrong app VMs.
- With stop_vms=true, it may soft-shutdown SAP and stop/deallocate the wrong app VMs.
Double-check the VM name → group assignments and keep them consistent across the HANA and app tiers.
Parameter interactions and practical notes
- hana_sid is used when the HANA Pacemaker attributes are named using a different SID than the SAP application SID.
- When stop_vms=true, the agent:
- waits wait_before_stop_sap seconds before initiating shutdown (to reduce churn during rapid failovers),
- calls sapcontrol -function Stop <soft_shutdown_timeout> on the "different-zone" app VMs (soft shutdown with a configurable timeout),
- waits for process dispstatus to become GRAY (stopped) before deallocating VMs.
- When stop_vms=false, the agent:
- calls sapcontrol -function ABAPSetServerInactive on the "different-zone" app VMs,
- leaves the SAP instances running but in inactive/passive mode, and
- does not stop/deallocate the Azure VMs.
- Timeouts:
- Most phases use wait_time.
- The stop/wait-for-stop window effectively needs to cover both wait_time and soft_shutdown_timeout.
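The two stop behaviors boil down to which sapcontrol call the agent issues on the different-zone app VMs. The sketch below only builds the command strings named above; the instance number 00 is an illustrative assumption, not something the source specifies.

```python
# Sketch: which sapcontrol call is issued on "different-zone" app VMs,
# per the stop_vms behavior described above. Instance number 00 is
# illustrative; real instance numbers depend on your landscape.

def stop_command(stop_vms: bool, instance_nr: str = "00",
                 soft_shutdown_timeout: int = 600) -> str:
    if stop_vms:
        # Graceful stop with a soft timeout; the agent then waits for
        # dispstatus GRAY before deallocating the VM.
        return f"sapcontrol -nr {instance_nr} -function Stop {soft_shutdown_timeout}"
    # Keep the instance running but passive (no new users/jobs/sessions).
    return f"sapcontrol -nr {instance_nr} -function ABAPSetServerInactive"

print(stop_command(True))   # sapcontrol -nr 00 -function Stop 600
print(stop_command(False))  # sapcontrol -nr 00 -function ABAPSetServerInactive
```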
app_vm_zones format
Use a comma-separated mapping: vm_name:group.
Example: app_vm_zones="sapapp01:1,sapapp02:1,sapapp03:2,sapapp04:2"
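Given the warning above about wrong groupings, it helps to see exactly how a mapping splits VMs into "same group" and "different group" sets. A minimal sketch (illustrative; the real agent's parsing may differ):

```python
# Minimal sketch: parse an app_vm_zones mapping and split VMs into the
# "same group" / "different group" sets for a given primary group label.

def parse_zone_map(mapping: str) -> dict:
    pairs = (entry.split(":", 1) for entry in mapping.split(","))
    return {vm.strip(): group.strip() for vm, group in pairs}

def split_by_group(zone_map: dict, primary_group: str):
    same = sorted(vm for vm, g in zone_map.items() if g == primary_group)
    diff = sorted(vm for vm, g in zone_map.items() if g != primary_group)
    return same, diff

zone_map = parse_zone_map("sapapp01:1,sapapp02:1,sapapp03:2,sapapp04:2")
same, diff = split_by_group(zone_map, primary_group="2")
print(same)  # ['sapapp03', 'sapapp04'] -> these get started
print(diff)  # ['sapapp01', 'sapapp02'] -> these get deactivated/stopped
```

Swapping a single label in the mapping swaps VMs between these two sets, which is exactly why the warning above asks you to double-check the assignments.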
Start-time validation
If you provide app_vm_zones or hana_vm_zones in a deployment where Azure zone metadata is available, the agent validates on every start that the provided values match Azure. It will fail to start if they do not match.
Prerequisites
Topology requirement (critical)
This solution assumes you have two equivalent sets of SAP application server VMs, one set placed/aligned with each HANA VM zone (or logical group in non-zonal/PPG deployments). Only one set is expected to be active at a time.
- Zonal deployments: provision the same application server capacity in each Availability Zone used by the HANA primary/secondary VMs.
- Non-zonal / PPG deployments: provision two equivalent application server sets and map them consistently using hana_vm_zones and app_vm_zones.
“Equivalent/identical” here means the VMs are prepared to run the same SAP application workload (same SAP installation/SID/instance layout and configuration as applicable for your landscape), so the agent can start SAP on the “same-zone” set and deactivate/stop SAP on the “different-zone” set during failover.
SAP workload routing/groups (required)
To ensure workloads continue seamlessly when the active application-server set switches zones/groups, configure your SAP group/routing settings to include both application server sets as appropriate for your landscape, including:
- SAP logon groups: SMLG
- RFC server groups: RZ12
- Background/batch server groups: SM61
- Spool server groups: SPAD
- Update configuration/groups: SM14
System Requirements
Operating System Support
- SUSE Linux Enterprise Server (SLES): 15 SP5 and above
Network Requirements
- HANA VMs must have outbound access to Azure API endpoints
- Required for VM management operations (start, stop, execute commands)
Azure Linux VM Agent
- Must be installed on all SAP application server VMs
- Pre-installed on Azure Marketplace images
- Manual installation required for custom/non-Marketplace images
- For manual installation, see the Azure Linux VM Agent installation guide in the Azure documentation
Python Environment
- Python 3.x installed on HANA cluster nodes
- Required Python packages: requests (all other imports are Python standard library)
Verification Command:
python3 -c 'import os, sys, time, subprocess, re, requests, shlex, random; from typing import Dict, List, Optional'
HANA Resource Agent Compatibility
- SAPHanaSR: Traditional SAP HANA System Replication resource agent
- SAPHanaSR-angi: SAP HANA System Replication A Next Generation Interface resource agent
The azure-sap-zone resource agent automatically detects which HANA resource agent is in use and adapts accordingly.
Azure Permissions
The resource agent requires either a user-assigned managed identity (via client_id) or a system-assigned managed identity with specific Azure permissions:
Required Azure Role Actions
{
"permissions": [
{
"actions": [
"Microsoft.Compute/*/read",
"Microsoft.Compute/virtualMachines/start/action",
"Microsoft.Compute/virtualMachines/restart/action",
"Microsoft.Compute/virtualMachines/powerOff/action",
"Microsoft.Compute/virtualMachines/deallocate/action",
"Microsoft.Compute/virtualMachines/runCommand/action",
"Microsoft.Compute/virtualMachines/runCommands/read",
"Microsoft.Compute/virtualMachines/runCommands/write"
],
"notActions": [],
"dataActions": [],
"notDataActions": []
}
]
}
Identity Assignment Requirements
- User-assigned managed identity must be assigned to both HANA servers
- Identity must have Virtual Machine Contributor role (or custom role with above actions)
- Role assignment scope: SAP application server VMs' resource group (recommended) or individual VMs
Limitations
- Supported SAP Systems: ABAP systems on HANA scale-up only
- Not Supported: SAP JAVA, HANA scale-out, multi-SID environments
Installation
Step 1: Azure Configuration
Configure Azure resources using Azure CLI. Install Azure CLI if not already available.
PowerShell Script for Azure Setup
# Define parameters - Update these values for your environment
$subscriptionId = "Your-Subscription-ID"
$hanaResourceGroup = "HANA-VMs-Resource-Group"
$hanaVMNames = @("hana-vm1", "hana-vm2")
$managedIdentityName = "sap-azure-zone-alignment"
$customAzureRole = "Azure SAP Zone Alignment"
# Resource group scope assignment (recommended)
$sapAppResourceGroup = "SAP-Application-Servers-Resource-Group"
# Alternative: Direct VM assignment
$sapAppVMNames = @("sap-app1", "sap-app2", "sap-app3", "sap-app4")
# Login to Azure
az login
# Verify Azure Linux Agent and run-command capability on application servers
$sapAppVMNames | ForEach-Object -ThrottleLimit $sapAppVMNames.Count -Parallel {
$vmName = $_
$result = az vm run-command invoke `
--resource-group $using:sapAppResourceGroup `
--name $vmName `
--command-id RunShellScript `
--scripts "systemctl is-active waagent" `
--output json 2>&1 | ConvertFrom-Json
$msg = $result.value[0].message
if ($msg -match '\[stdout\]\s*active') {
Write-Host "[$vmName] OK - waagent active, run-command working"
} else {
Write-Host "[$vmName] FAIL - unexpected output: $msg"
}
}
# Create custom Azure role
$roleDefinition = @{
Name = $customAzureRole
IsCustom = $true
Description = "Custom Azure role for sap-azure-zone pacemaker resource agent"
Actions = @(
"Microsoft.Compute/*/read",
"Microsoft.Compute/virtualMachines/start/action",
"Microsoft.Compute/virtualMachines/restart/action",
"Microsoft.Compute/virtualMachines/powerOff/action",
"Microsoft.Compute/virtualMachines/deallocate/action",
"Microsoft.Compute/virtualMachines/runCommand/action",
"Microsoft.Compute/virtualMachines/runCommands/read",
"Microsoft.Compute/virtualMachines/runCommands/write"
)
NotActions = @()
AssignableScopes = @("/subscriptions/$subscriptionId")
} | ConvertTo-Json -Depth 3
$roleDefinition | Out-File -FilePath "$env:TEMP\az-role.json" -Encoding utf8
az role definition create --role-definition "$env:TEMP\az-role.json"
# Recommendation: Use a user-assigned managed identity for authentication.
# System-assigned managed identities are also supported; if you choose this option,
# ensure that system-assigned managed identity is enabled on both HANA VMs and that
# the required roles (listed below) are assigned to each system identity.
# Create user-assigned managed identity
$managedIdentityResourceId = az identity create `
--resource-group $hanaResourceGroup `
--name $managedIdentityName `
--query id --output tsv
# Assign managed identity to HANA VMs
foreach ($vmName in $hanaVMNames) {
az vm identity assign `
--resource-group $hanaResourceGroup `
--name $vmName `
--identities $managedIdentityResourceId
}
# Alternative: Enable system-assigned managed identity (uncomment if preferred)
# foreach ($vmName in $hanaVMNames) {
# az vm identity assign `
# --resource-group $hanaResourceGroup `
# --name $vmName
# }
# Assign role to managed identity (resource group scope)
$managedIdentityPrincipalId = az identity show `
--resource-group $hanaResourceGroup `
--name $managedIdentityName `
--query principalId --output tsv
az role assignment create `
--assignee-object-id $managedIdentityPrincipalId `
--assignee-principal-type ServicePrincipal `
--role $customAzureRole `
--scope "/subscriptions/$subscriptionId/resourceGroups/$sapAppResourceGroup"
# Display the client ID (needed for cluster configuration)
Write-Host "Managed Identity Client ID:"
az identity show `
--resource-group $hanaResourceGroup `
--name $managedIdentityName `
--query clientId --output tsv
Step 2: Install Resource Agent
Download and Install on Both HANA Cluster Nodes
# Download the resource agent script
curl -o azure-sap-zone.in https://raw.githubusercontent.com/ClusterLabs/resource-agents/refs/heads/main/heartbeat/azure-sap-zone.in
# Create the resource agent file
sudo cp azure-sap-zone.in /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone
# Update the interpreter line
# Note: the downloaded file typically starts with the placeholder `#!@PYTHON@ -tt`.
# Replace it with your actual python3 path.
PYTHON3_PATH="$(command -v python3)"
echo "python3: ${PYTHON3_PATH}"
# Bash note: `!` triggers history expansion inside double-quotes, so use this quoting form.
sudo sed -i '1 s|^#!@PYTHON@ -tt$|#!'"${PYTHON3_PATH}"' -tt|' /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone
# If you need to force a specific interpreter path, you can also do:
# sudo sed -i '1 s|^#!@PYTHON@ -tt$|#!/usr/bin/python3 -tt|' /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone
# Convert line endings and set permissions
sudo dos2unix /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone
sudo chmod +x /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone
# Copy to secondary node (alternative: repeat above steps manually)
sudo scp /usr/lib/ocf/resource.d/heartbeat/azure-sap-zone <secondary-hana-vm>:/usr/lib/ocf/resource.d/heartbeat/
Configuration
Configuration Options
The resource agent provides two distinct behaviors for application servers in the different zone/group.
| Option | Setting | What happens to SAP | Do the “different-zone” servers take new users/jobs/sessions? | What happens to the Azure VMs | Typical trade-off |
|---|---|---|---|---|---|
| 1) Deactivate (Passive mode) | stop_vms=false | SAP stays running, but the agent calls sapcontrol -function ABAPSetServerInactive to set the instance inactive/passive | No — the server is kept out of service for new workload (e.g., new user logons, new batch/background work, and other new sessions) | VMs stay running | Fastest to make active again, but no Azure compute cost savings for the inactive zone because the VMs keep running |
| 2) Soft shutdown + stop/deallocate | stop_vms=true | Agent calls sapcontrol -function Stop <soft_shutdown_timeout> (graceful stop with a configurable timeout) and waits until the instance is stopped (dispstatus=GRAY) | No — during shutdown the instance is not available for new workload/sessions | After shutdown, VMs are stopped and deallocated | Slower to re-activate (VM boot + SAP start), but can save costs in pay-as-you-go models by deallocating the inactive-zone VMs |
Notes:
- soft_shutdown_timeout controls how long SAP is given to stop gracefully.
- stop_vms=true is the only mode where the agent will stop/deallocate VMs.
Note on capacity: When using stop_vms=true, deallocated VMs are not guaranteed to have capacity available when restarted. Consider using On-Demand Capacity Reservations (ODCR) or Capacity Reservation Groups to ensure VM sizes remain available in both zones. The resource agent does not manage capacity reservations — this is an infrastructure planning consideration.
Cluster Configuration Examples
The configuration has two parts:
- Create the primitive resource — choose one of the examples below (A–F) based on your deployment pattern.
- Create the clone and order constraint — this is required regardless of which example you use (see Step 2 below).
Note on monitor interval: the resource agent advertises a default monitor interval of 300 seconds in its meta-data. The examples below use a shorter interval (e.g. 10s) to detect failovers quickly; choose an interval appropriate for your environment.
Step 1: Create the primitive resource
Choose the example that matches your deployment:
Example A: Zonal deployment (system-assigned managed identity) + explicit VM list
Use this when Azure Availability Zones are present and you want to provide an explicit list of application VMs.
SLES (crmsh):
sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
params sid=<SAP_SID> \
hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
app_vm_names=<app_vm1,app_vm2,app_vm3,app_vm4> \
stop_vms=false \
wait_time=600 \
verbose=true \
meta failure-timeout=120s \
op start start-delay=60s interval=0s timeout=360s \
op monitor interval=10s timeout=360s \
op stop timeout=10s interval=0s on-fail=ignore
Example B: Zonal deployment (user-assigned managed identity) + VM name pattern
Use this when Azure Availability Zones are present and you want the agent to discover application VMs by name.
SLES (crmsh):
sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
params sid=<SAP_SID> \
hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
app_vm_name_pattern=<REGEX_OR_PREFIX_PATTERN> \
client_id=<MANAGED_IDENTITY_CLIENT_ID> \
stop_vms=false \
wait_time=600 \
verbose=true \
meta failure-timeout=120s \
op start start-delay=60s interval=0s timeout=360s \
op monitor interval=10s timeout=360s \
op stop timeout=10s interval=0s on-fail=ignore
Example C: Non-zonal / PPG deployment (logical grouping)
Use this when Azure zone metadata is missing (for example, proximity placement group or other non-zonal deployments).
Key points:
- Set hana_vm_zones to map each HANA VM name to a logical group label (for example hanavm1:1,hanavm2:2).
- Set app_vm_zones to map each application VM name (or just the subset missing zone metadata) to a logical group label.
- If Azure later reports real zone data for those VMs, the agent validates on every start that your mapping matches Azure and fails if it does not.
SLES (crmsh):
sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
params sid=<SAP_SID> \
hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
hana_vm_zones="<hana_vm1>:1,<hana_vm2>:2" \
app_vm_zones="sapapp01:1,sapapp02:1,sapapp03:2,sapapp04:2" \
stop_vms=false \
wait_time=600 \
verbose=true \
meta failure-timeout=120s \
op start start-delay=60s interval=0s timeout=360s \
op monitor interval=10s timeout=360s \
op stop timeout=10s interval=0s on-fail=ignore
Example D: Mixed deployment (mostly zonal, a few VMs missing zone metadata)
Use this when most application VMs have Azure zone metadata, but a small subset does not. Provide the full VM list via app_vm_names (or discovery via app_vm_name_pattern), and provide app_vm_zones only for the VMs that are missing zone metadata.
SLES (crmsh):
sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
params sid=<SAP_SID> \
hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
app_vm_names=<app_vm1,app_vm2,app_vm3,...> \
app_vm_zones="<nonzonal_vm_a>:1,<nonzonal_vm_b>:2" \
stop_vms=false \
wait_time=600 \
verbose=true
Example E: Zonal deployment with stop_vms=true (shutdown + deallocate different-zone VMs)
Use this when you want maximum cost optimization by shutting down and deallocating the application VMs in the non-primary zone.
SLES (crmsh):
sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
params sid=<SAP_SID> \
hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
app_vm_names=<app_vm1,app_vm2,app_vm3,app_vm4> \
stop_vms=true \
wait_before_stop_sap=300 \
soft_shutdown_timeout=600 \
wait_time=600 \
verbose=true \
op start start-delay=60s interval=0s timeout=360s \
op monitor interval=10s timeout=360s \
op stop timeout=10s interval=0s on-fail=ignore
Example F: HANA SID differs from SAP SID (hana_sid)
Use this when the HANA cluster uses a different SID, so the HANA Pacemaker attributes are named hana_<hana_sid>_*.
SLES (crmsh):
sudo crm configure primitive azure-sap-zone ocf:heartbeat:azure-sap-zone \
params sid=<SAP_SID> \
hana_sid=<HANA_SID> \
hana_resource=<HANA_CLUSTER_RESOURCE_NAME> \
app_vm_names=<app_vm1,app_vm2,app_vm3,app_vm4> \
stop_vms=false \
verbose=true
Step 2: Create clone and order constraint
After creating the primitive resource using any of the examples above, run the following commands to create the clone resource and order constraint. This is required for all deployment patterns.
SLES (crmsh):
# Create clone resource (runs on both nodes)
sudo crm configure clone cln_azure-sap-zone azure-sap-zone \
meta clone-node-max=1 target-role=Started interleave=true
# Create order constraint (start after HANA resource)
sudo crm configure order ord_azure-sap-zone Mandatory: <HANA_CLONE_RESOURCE> cln_azure-sap-zone symmetrical=false
Usage
After installation and configuration, the resource agent will automatically:
- Monitor HANA primary location: Detects which availability zone hosts the current HANA primary
- Manage application servers: Starts/stops or activates/deactivates SAP application servers based on zone alignment
- Handle failover scenarios: Automatically adjusts during HANA failover events
Manual Operations
Enable Verbose Logging
# SLES
sudo crm_resource --resource azure-sap-zone --set-parameter verbose --parameter-value true
Resource Management
Put the resource into maintenance mode:
Putting the resource into maintenance mode stops Pacemaker from running any operations (start, stop, monitor) on it, which prevents the agent from taking action during planned maintenance.
# SLES — enable maintenance mode
sudo crm resource maintenance cln_azure-sap-zone on
Resume resource management:
# SLES — disable maintenance mode
sudo crm resource maintenance cln_azure-sap-zone off
Validating the Setup with a Test Failover
Once the resource agent is installed and configured, we recommend running through a test failover on a non-production system to confirm everything works end to end. The steps below walk you through the before, during, and after of a validation cycle.
Step 1: Verify the resource agent is running
Before triggering a failover, confirm the resource agent is healthy and the cluster sees it on both nodes:
# Check overall cluster status
sudo crm status
# Verify the azure-sap-zone clone is started on both nodes
sudo crm resource show cln_azure-sap-zone
You should see the clone resource running on both HANA cluster nodes.
Step 2: Check the initial state
Record the current state so you can compare after the failover:
# Which node is the HANA primary?
sudo crm status | grep -i "Masters\|Promoted"
# What zone/phase does the resource agent report?
sudo crm_attribute --name azure_sap_zone_current_phase --query --quiet --node $(hostname)
# Check application server VM power state (from Azure CLI, if available)
az vm list -g <SAP-App-Resource-Group> -d --query "[].{Name:name, PowerState:powerState, Zone:zones[0]}" -o table
At this point the phase should be all_phases_completed (if the agent has already aligned once) or Started / no_action_required depending on which node you are on.
Step 3: Enable verbose logging (recommended)
Turn on verbose logging before the failover so you can trace every phase in detail:
sudo crm_resource --resource azure-sap-zone --set-parameter verbose --parameter-value true
Step 4: Trigger a test HANA failover
Important: Only perform this on a test/non-production system.
You can trigger a controlled HANA takeover using standard Pacemaker commands. The exact method depends on your HANA resource agent:
# Option A: Migrate the HANA primary to the secondary node
sudo crm resource move <HANA_CLONE_RESOURCE> <target-node> force
# After the move completes, clear the location constraint so Pacemaker can manage normally
sudo crm resource clear <HANA_CLONE_RESOURCE>
Alternatively, if your runbook uses sr_takeover or SAPHanaSR tools, follow your existing takeover procedure. The key point is that the HANA primary ends up on the other node/zone.
Step 5: Monitor the resource agent's progress
After the failover, the resource agent on the new primary node will detect the zone change and begin executing its phases. You can watch it in real time:
# Watch the phase attribute update (run on the new primary node)
watch -n 5 'crm_attribute --name azure_sap_zone_current_phase --query --quiet --node $(hostname)'
You should see the phase progress through:
- start_vms_in_same_zone
- wait_for_vms_in_same_zone_to_start
- start_sap_in_same_zone
- wait_for_sap_in_same_zone_to_start
- stop_sap_in_diff_zone
- wait_for_sap_in_diff_zone_to_stop (only when stop_vms=true)
- stop_vms_in_diff_zone (only when stop_vms=true)
- all_phases_completed
Step 6: Validate the outcome
Once the phase reaches all_phases_completed, verify that the application tier has been aligned correctly.
Check application VMs in the same zone as the new HANA primary:
# Verify VMs are running
az vm list -g <SAP-App-Resource-Group> -d --query "[].{Name:name, PowerState:powerState, Zone:zones[0]}" -o table
# Verify SAP instances are active (GREEN) — run on a same-zone app VM
sapcontrol -nr <instance_number> -function GetProcessList
All SAP processes on the same-zone VMs should show dispstatus: GREEN.
Check application VMs in the different zone:
- If stop_vms=false: the VMs should still be running, but SAP instances should be in inactive/passive mode. You can verify this by checking logon groups (SMLG) or the server's active status.
- If stop_vms=true: the VMs should be stopped/deallocated in the Azure portal or via az vm list.
Step 7: Review the logs
Check the Pacemaker log to confirm all phases executed without errors:
# View all agent activity
sudo grep -i 'azure-sap-zone' /var/log/pacemaker/pacemaker.log | tail -50
# Filter to only INFO/WARNING/ERROR messages (skip routine monitor noise)
sudo grep -iE 'azure-sap-zone.*(INFO|WARNING|ERROR):' /var/log/pacemaker/pacemaker.log | grep -v -iE "All phases|monitor: Started"
Example output (from the unfiltered command):
Apr 23 10:15:32 hanavm1 azure-sap-zone INFO: monitor: Started
Apr 23 10:15:32 hanavm1 azure-sap-zone INFO: Executing phase: start_vms_in_same_zone
Apr 23 10:15:35 hanavm1 azure-sap-zone INFO: Executing phase: wait_for_vms_in_same_zone_to_start
Apr 23 10:15:45 hanavm1 azure-sap-zone INFO: All VMs are started
Apr 23 10:15:45 hanavm1 azure-sap-zone INFO: Executing phase: start_sap_in_same_zone
Apr 23 10:15:45 hanavm1 azure-sap-zone INFO: Starting SAP on VMs: ['sapapp01', 'sapapp02']
Apr 23 10:16:02 hanavm1 azure-sap-zone INFO: Executing phase: wait_for_sap_in_same_zone_to_start
Apr 23 10:16:15 hanavm1 azure-sap-zone INFO: All SAP instances are started
Apr 23 10:16:15 hanavm1 azure-sap-zone INFO: Executing phase: stop_sap_in_diff_zone
Apr 23 10:16:15 hanavm1 azure-sap-zone INFO: Setting SAP instances to passive mode on VMs: ['sapapp03', 'sapapp04']
Apr 23 10:16:20 hanavm1 azure-sap-zone INFO: All phases have been executed successfully
Apr 23 10:16:20 hanavm1 azure-sap-zone INFO: monitor: Finished
Look for:
- Phase transitions: confirm each phase started and completed in order
- No errors: no ERROR or FAIL messages
- Timing: note how long the full cycle took — this is your expected failover alignment time
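To turn the timing check into a number, a small sketch that computes the seconds from the first "monitor: Started" line to the "All phases have been executed successfully" line (assuming syslog-style timestamps like "Apr 23 10:15:32" and that both events fall on the same day):

```shell
# Sketch: compute end-to-end alignment time in seconds from agent log lines.
phase_duration() {
  awk '
    # Convert an HH:MM:SS timestamp (field 3) to seconds since midnight.
    function secs(t,  a) { split(t, a, ":"); return a[1] * 3600 + a[2] * 60 + a[3] }
    /monitor: Started/ && !start_set { start = secs($3); start_set = 1 }
    /All phases have been executed successfully/ { end = secs($3); end_set = 1 }
    END { if (start_set && end_set) print end - start }
  '
}
```

Fed the filtered log from the grep command above, this prints the duration, e.g. `sudo grep -i 'azure-sap-zone' /var/log/pacemaker/pacemaker.log | phase_duration`.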
Step 8: Clean up
After validation, you can disable verbose logging to reduce log volume:
sudo crm_resource --resource azure-sap-zone --set-parameter verbose --parameter-value false
If you triggered the failover using crm resource move, make sure the location constraint was cleared (Step 4) so Pacemaker can manage resources normally going forward.
Troubleshooting
Common Issues
1. Authentication Problems
- Verify managed identity is assigned to HANA VMs
- Check Azure role assignments
- Ensure proper permissions on target application server VMs
2. Network Connectivity
- Validate outbound access to Azure API endpoints
- Check firewall rules and network security groups
3. Azure Linux Agent Issues
- Verify agent status: systemctl status waagent
- Check agent logs: /var/log/waagent.log
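The checks above can be sketched as a few commands (placeholders in angle brackets are illustrative, not literal names):

```shell
# 1. Confirm the managed identity is assigned to the HANA VM
az vm identity show -g <HANA-Resource-Group> -n <hana-vm-name>

# 2. Review the identity's Azure role assignments
az role assignment list --assignee <identity-principal-id> -o table

# 3. Confirm outbound reachability to the Azure management endpoint
#    (expect an HTTP status such as 401 without credentials; 000 means no connectivity)
curl -s -o /dev/null -w '%{http_code}\n' --max-time 10 https://management.azure.com/
```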
Log Analysis
View Resource Agent Logs
sudo grep -i 'azure-sap-zone' /var/log/pacemaker/pacemaker.log
Common Log Patterns
- Phase transitions: Look for "current_phase" changes
- API errors: Search for "Azure API" or "HTTP" error codes
- Timeout issues: Check for "timeout" or "wait_time exceeded"
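These patterns can be combined into a single pass over the log; a minimal sketch:

```shell
# Sketch: surface likely problem lines (errors, failures, timeouts,
# Azure API/HTTP issues) from agent log output in one grep.
error_scan() {
  grep -iE 'azure-sap-zone.*(ERROR|FAIL|timeout|wait_time exceeded|Azure API|HTTP)'
}
```

For example: `sudo cat /var/log/pacemaker/pacemaker.log | error_scan`.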
Performance Monitoring
Monitor the following metrics:
- Phase execution times: Should complete within the configured wait_time
- API response times: Azure API calls should be < 30 seconds
- VM startup times: Application server boot time affects total failover duration
FAQ
Q: Can I use this with SAP Java systems? A: No, this resource agent currently only supports SAP ABAP systems on HANA scale-up configurations.
Q: What HANA resource agents are supported? A: The agent supports both SAPHanaSR and SAPHanaSR-angi (Advanced Next Generation Interface) resource agents and automatically detects which one is in use.
Q: What happens if an application server VM fails to start? A: The resource agent will retry based on the retry_count parameter and eventually fail the phase if the VM doesn't start within the wait_time.
Q: Can I run this in a multi-SID environment? A: No, multi-SID environments are not currently supported.
Q: Can I use different SIDs for SAP and HANA? A: Yes, use the hana_sid parameter if your HANA SID differs from your SAP SID.
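A minimal sketch of setting that parameter, assuming a hypothetical HANA SID of HDB and using the same crm_resource syntax shown earlier:

```shell
# Sketch: set hana_sid when the HANA SID differs from the SAP SID.
# "HDB" is a hypothetical value; substitute your actual HANA SID.
sudo crm_resource --resource azure-sap-zone --set-parameter hana_sid --parameter-value HDB
```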
Q: Can I use system-assigned managed identity instead of user-assigned? A: Yes, simply omit the client_id parameter and ensure system-assigned managed identity is enabled on both HANA VMs with appropriate permissions.
Important Notes
Resource State Management
Upon completion, the cluster attribute azure_sap_zone_current_phase is set to all_phases_completed. The resource agent will not take further action until restarted.
Maintenance Operations
When performing maintenance on application servers in different zones (e.g., patching), put the resource in maintenance mode to prevent the agent from taking action:
# SLES
sudo crm resource maintenance cln_azure-sap-zone on
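Once maintenance is finished, take the resource out of maintenance mode again so the agent resumes managing zone alignment:

```shell
# SLES: re-enable the agent after maintenance is complete
sudo crm resource maintenance cln_azure-sap-zone off
```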
Zone Alignment
This solution requires identical SAP application server VMs in both availability zones. Only one set should be active at any time.
Wrapping up
With the information in this post, you should have what you need to evaluate the Azure SAP Zone Resource Agent in your environment - from setting up the managed identity and permissions, to installing the agent, configuring the cluster, and troubleshooting common issues. If you haven't already, we recommend reading Part 1 for an introduction to the concepts and features behind this solution.
We welcome your feedback during this public preview. If you encounter issues or have suggestions, please file them via GitHub Issues on the ClusterLabs resource-agents repository.
Public preview expectations
During public preview:
- This solution is provided as a Public Preview for evaluation and feedback.
- It is not covered by a formal support commitment.
- The design, configuration, and behaviors may evolve based on learnings.
Because of that, we recommend using this in non-production environments while it is in preview.
If you're interested in piloting the preview, your feedback will help shape what becomes generally available and supported.
Disclaimer
This post describes a public preview capability. It is shared for informational purposes only and is subject to change. It is not a substitute for your organization's validation, testing, and operational readiness reviews.