
Running SAP Applications on the Microsoft Platform

New Azure Monitor for SAP Solutions HA Cluster features

Ross Sponholtz
Apr 14, 2021

Introduction

The Azure Monitor for SAP solutions (AMS) team has announced new capabilities for AMS, including monitoring SAP NetWeaver metrics, OS metrics and enhanced High Availability cluster visualization. This blog post is an overview of the recent changes for the High Availability cluster provider for AMS and associated cluster workbook views.

The most important changes for the HA provider are in the workbook visualizations. You can now see:

  • Location constraints that are left by "crm resource move" and "crm resource migrate" commands. These will change the operation of the cluster, and it's useful to be reminded if they have been left in the cluster configuration.
  • Historical node view. You can now see, for configurable time periods, whether a cluster node was up and whether it was the "designated coordinator" for the cluster.
  • Historical resource view. You can now see the failcount over time for individual cluster resources.

Prerequisites

To use AMS monitoring for your HA clusters, there are some requirements for your environment:

  • One or more clusters of monitored Azure VMs or Azure Large Instances.
  • The OS for the cluster nodes must currently be SLES 12 or SLES 15. Other OS options are in development.
  • The Pacemaker cluster installation should be completed. Instructions for this are in "Setting up Pacemaker on SLES in Azure - Azure Virtual Machines".
  • The monitored instances must be reachable over the network that AMS is deployed to. The HA provider retrieves monitoring data from the monitored instances with HTTP requests, so HTTP access to the exporter port (9664 by default) must be allowed.

The AMS team has tested the AMS HA provider with several different types of cluster-managed applications, including:

  • SUSE Linux Network File System (NFS)
  • SAP HANA
  • IBM Db2
  • SAP NetWeaver Central Services

Other managed applications and services should work with the AMS HA provider, but are not officially supported.

Setting up the HA Provider

The process for setting up the HA provider hasn't changed, but here is a walkthrough covering your cluster, AMS, and the HA cluster providers:

  • Create your HA cluster in Azure, using the instructions linked above.
  • Install the Prometheus ha_cluster_exporter on each of the cluster nodes, following the instructions here. For each instance, log on to the machine as root and install the exporter with the zypper package manager:
zypper install prometheus-ha_cluster_exporter

After the exporter is installed, enable and start it. Enabling it means it is started automatically on future reboots of the instance, and the --now option also starts it immediately:

systemctl --now enable prometheus-ha_cluster_exporter

After this is done, it is useful to test that the cluster exporter is actually working. From another machine on the same network, you can test this using the Linux "curl" command (using the proper machine name). For example:

testuser@linuxjumpbox:~> curl http://hana1:9664/metrics
# HELP ha_cluster_corosync_member_votes How many votes each member node has contributed with to the current quorum
# TYPE ha_cluster_corosync_member_votes gauge
ha_cluster_corosync_member_votes{local="false",node="hana2",node_id="2"} 1
ha_cluster_corosync_member_votes{local="true",node="hana1",node_id="1"} 1
# HELP ha_cluster_corosync_quorate Whether or not the cluster is quorate
# TYPE ha_cluster_corosync_quorate gauge
ha_cluster_corosync_quorate 1
...
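
If you want to script this check rather than read the output by eye, here is a minimal sketch that reads the exporter output on standard input and looks at the ha_cluster_corosync_quorate gauge shown above. The function names quorate_value and is_quorate are my own for illustration and are not part of the exporter or AMS.

```shell
#!/usr/bin/env bash

# Read Prometheus exposition text on stdin and print the value of the
# ha_cluster_corosync_quorate gauge (1 = quorate, 0 = not quorate).
# "# HELP" and "# TYPE" lines are skipped because their first field is "#".
quorate_value() {
  awk '$1 == "ha_cluster_corosync_quorate" { print $2; found = 1 }
       END { if (!found) exit 1 }'
}

# Succeed only when the gauge is present and equals 1.
is_quorate() {
  [ "$(quorate_value)" = "1" ]
}
```

From the jump box you could then run something like: curl -s http://hana1:9664/metrics | is_quorate && echo "cluster is quorate"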
 

You should configure an HA cluster provider for each node of the cluster using the following information:

  • Type - High-availability cluster (Pacemaker)
  • Name - a unique name you give the cluster provider. I use a pattern of "ha-nodename" for this.
  • Prometheus Endpoint - this is the same as the URL you used to test the cluster exporter above, and usually will be http://nodeipaddr:9664/metrics
  • SID - this is a three-character abbreviation that identifies the cluster - if this is an SAP instance, you should make this the same as the SAP SID
  • Hostname - this is the hostname for the monitored node
  • Cluster - this is the name of the monitored cluster - you can find this out by doing the following on a cluster instance:
hana1:~ # crm config show | grep cluster-name
cluster-name=hacluster \
  • When finished, select Add provider
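
When you have several nodes to register, it can help to generate the field values rather than type them. The sketch below assumes the "ha-nodename" naming pattern and the default port 9664 described above. The function names provider_fields and cluster_name are hypothetical helpers, not AMS commands, and the values still have to be entered in the portal by hand.

```shell
#!/usr/bin/env bash

# Print the provider fields for one cluster node.
# $1 = node hostname, $2 = node IP address (both supplied by you).
provider_fields() {
  local node="$1" ip="$2"
  printf 'Name=ha-%s\n' "$node"
  printf 'Prometheus Endpoint=http://%s:9664/metrics\n' "$ip"
  printf 'Hostname=%s\n' "$node"
}

# Read the output of "crm config show" on stdin and print the value of
# the cluster-name property, without the trailing line continuation.
cluster_name() {
  sed -n 's/.*cluster-name=\([^ ]*\).*/\1/p'
}
```

For example, provider_fields hana1 10.0.0.5 prints the three values for node hana1 (the IP address here is only a placeholder), and crm config show | cluster_name prints the cluster name on a cluster node.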
 

 

As an example, here is a subscription with three clusters - one for NFS, one for SAP HANA, and one for SAP central services. The provider view looks like this:

 

Overview of the new views

Continuing with the example of the resource group with three clusters, here is the overall HA cluster status:

 

Here, you can tell that there are three clusters, but only one of them (the NFS cluster) is in a healthy state. The H10 cluster is in maintenance mode, which means the cluster is not managing cluster resources. The s40 cluster has had errors in the resource state.

Cli location constraints

When you select one of the cluster hexagons, you will see more information on that specific cluster. First, there is information on cli-ban or cli-prefer location constraints in the cluster configuration. These constraints are created by commands such as crm resource move or crm resource migrate. This is what you will see in the HA workbook if there are no such constraints:

 

If there are any of these constraints, you will see the constraint names:

 

After the resource movement has completed, you should remove these constraints using the "crm configure delete <constraint name>" command. If you do not, they will continue to affect the expected operation of the cluster.
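
You can also look for leftover constraints from the command line. This is a minimal sketch that assumes the constraints appear in the "crm configure show" output as lines beginning with "location" followed by an ID that starts with cli-ban or cli-prefer. The function name cli_constraints is my own, and the sketch only lists the names, it does not delete anything.

```shell
#!/usr/bin/env bash

# Read "crm configure show" output on stdin and print the IDs of location
# constraints whose ID starts with cli-ban or cli-prefer.
cli_constraints() {
  awk '$1 == "location" && $2 ~ /^cli-(ban|prefer)/ { print $2 }'
}
```

On a cluster node, crm configure show | cli_constraints prints one constraint name per line, and each name can then be reviewed and passed to crm configure delete.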

Node status over time

In the cluster view, you will see the current node status for each node in the cluster, and you will now also see the historical status of a particular node at the bottom:

 

The time range and node are selectable in this view. This is useful to see when a particular node went offline from the standpoint of the cluster. It will also indicate which of the nodes is the cluster's "designated coordinator".

Resource status over time

The cluster view will show the current resource status for the cluster managed resources, and there is now a historical view of the failure counts for a selected resource:

 

Again, the time range and resource are selectable in this view. It shows the failure count and failure threshold for the selected resource. If the failure count reaches the threshold, the resource will be moved to another node. Any errors should be investigated and resolved if possible.
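
The same comparison can be sketched against the raw exporter output. This example assumes the exporter publishes the values as ha_cluster_pacemaker_fail_count and ha_cluster_pacemaker_migration_threshold with a "resource" label; those names are from my recollection of the exporter and should be checked against your own /metrics output. It also assumes that a threshold of 0 means no threshold is in effect. The function names are hypothetical.

```shell
#!/usr/bin/env bash

# $1 = metric name, $2 = resource name. Reads exposition text on stdin and
# prints the value of every sample of that metric for that resource.
metric_for_resource() {
  awk -v metric="$1" -v rsc="resource=\"$2\"" '
    index($0, metric "{") == 1 && index($0, rsc) > 0 { print $NF }'
}

# $1 = fail count, $2 = threshold. Succeeds when a non-zero threshold
# has been reached (assumption: a threshold of 0 means "disabled").
reached_threshold() {
  local count="$1" threshold="$2"
  awk -v c="$count" -v t="$threshold" 'BEGIN { exit !(t > 0 && c >= t) }'
}
```

A typical use would be to save the curl output to a file, extract both values for one resource with metric_for_resource, and pass them to reached_threshold.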

Resources

Here are some additional resource links for learning about Azure Monitor for SAP Solutions and providers for other monitoring information:

Feedback form:

Summary

We hope you will find the new visualizations helpful, and please let us know if you have ideas for any new features for Azure Monitor for SAP solutions.
