Advanced Monitoring for PostgreSQL Using Log Analytics
Published Jan 22, 2024

Monitoring your database resources is critical to detecting issues and spotting opportunities for improvement. When it comes to Azure Database for PostgreSQL Flexible Server, the monitoring story can be a bit overwhelming, with multiple services that seem to offer similar solutions. Two services come up most often: Azure Monitor Metrics and Azure Log Analytics. In this blog post, we are going to look at these two services and understand the scenarios in which you would use each. Let us start by looking at what these services do.

 

Azure Monitor metrics

 

Azure Monitor is a comprehensive solution for collecting, analyzing, and alerting on time-series metrics for services like Azure Postgres. It helps you understand how the service is performing and proactively identify issues. There are over 60 Azure Postgres monitoring metrics available, covering various aspects of the database including resource utilization, database activity, replication, saturation, and connections. Overall, Azure Monitor is a powerful tool for the collection, visualization, and alerting on time-series metrics. To learn more, refer to Azure Monitor Metrics overview.

 

Azure Log Analytics (LA)

 

Azure Log Analytics (LA) is a service for monitoring cloud resources and applications. It allows you to collect and aggregate logs generated by Azure resources in your cloud and on-premises environments. Further, Log Analytics provides advanced analytics, monitoring, and alerting on log data. Log Analytics collects and stores data from various log sources and lets you query it using a custom language called Kusto Query Language (KQL).

 

While Azure Monitor metrics focuses on time-series data such as performance or resource usage metrics, Azure Log Analytics offers more complex querying and log data analysis. Log Analytics is especially useful for long-term trend analysis and for combining data from multiple sources to gain deeper insight into operations and performance across your application stack. To learn more, refer to Overview of Log Analytics in Azure Monitor.
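To make the "multiple sources" point concrete, here is a minimal sketch of a query that touches more than one data source in the same workspace. It assumes your diagnostic settings send both metrics (the AzureMetrics table) and PostgreSQL server logs (the AzureDiagnostics table) to that workspace; if only one is configured, you will see records for just that table.

// Count records ingested in the last hour from both the metrics and the logs tables
union AzureMetrics, AzureDiagnostics
| where TimeGenerated > ago(1h)
| summarize Records = count() by Type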

 

Combining Metrics with Log Analytics

 

For single-instance monitoring and troubleshooting, Azure Monitor metrics is the go-to tool. Say you are investigating a slowness issue on a database server and want to look at its CPU utilization: you can do this through Azure Monitor in the portal and generate charts that show what is happening right now. You can also pin the chart to an Azure dashboard and use the portal to set up alerts on metrics for that specific resource. When debugging a specific issue on a small number of instances, Azure Monitor is great and will do exactly what is needed.

 

[Image: media1_cpu.png – CPU utilization chart for a single server in Azure Monitor metrics]

 

However, for complex monitoring tasks such as monitoring a fleet of Azure PostgreSQL instances, Azure Log Analytics offers better capabilities. This becomes invaluable when managing large server fleets, for example analyzing CPU utilization across 100+ servers. With Log Analytics, metrics data from all servers can be consolidated into a single Log Analytics workspace. This aggregation enables comprehensive visualization and alerting on logs using the Kusto Query Language (KQL), simplifying what would otherwise be the daunting task of analyzing each server individually. This approach highlights Log Analytics' strength in handling large-scale, detailed data analysis.

Similar to Azure Monitor, Log Analytics (LA) charts can be pinned to the Azure dashboard for easy access and visualization. With LA, a single alert suffices for monitoring CPU usage across multiple servers, triggering when any server exceeds the set threshold. This reduces the need for individual alerts per resource. Integrating Azure Monitor metrics into Log Analytics enhances functionality, offering broader and more detailed monitoring capabilities:

  • Monitoring at scale – Consolidate log entries from multiple Azure subscriptions and tenants.
  • Long-term analysis – Log Analytics offers retention for up to two years.
  • Metrics correlation – You can query multiple different metrics and display them together to look for correlations (see the example query after this list).
  • Complex queries – Log Analytics leverages the Kusto Query Language, which can be used to run complex queries over large data series.
  • Query other data – Azure Monitor is focused on time-series metrics, whereas with Log Analytics you can collect any sort of log data, including custom logs.
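As an illustrative sketch of the metrics correlation point above, the following query charts cpu_percent and memory_percent side by side. It assumes both metrics are being exported to the workspace via diagnostic settings; swap in any other pair of supported metrics as needed.

// Chart average CPU and memory utilization together to look for correlation
AzureMetrics
| where TimeGenerated > ago(1h)
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where MetricName in ("cpu_percent", "memory_percent")
| summarize AvgValue = avg(Average) by MetricName, bin(TimeGenerated, 5m)
| render timechart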

 

Combining Azure Monitor metrics and Azure Log Analytics offers a comprehensive monitoring solution. Azure Monitor provides real-time insights for individual instances, while Log Analytics offers depth across multiple instances. Azure Monitor metrics data is stored in the AzureMetrics table, which holds metrics for all Azure resources. Once metrics are enabled in the diagnostic settings for a resource, its data lands in this table. In the example below, I use a Kusto query on the AzureMetrics table to visualize cpu_percent across a fleet of Azure Postgres Flexible Server instances.

Create a Log Query Alert for Monitoring CPU metrics


Metrics data in a Log Analytics workspace is stored in a table called AzureMetrics, which you can retrieve with a log query in Log Analytics. The structure of this table varies depending on the category of the log entry. For a description of the table properties, see the AzureMetrics data reference. For a complete list of metrics available for Azure Postgres, refer to Supported Metrics.

For example, to view a count of metric records for each resource, use the following Kusto query.

 

AzureMetrics | summarize count() by Resource

 

 

To retrieve all metrics for a specific Azure Postgres resource:

 

AzureMetrics | where Resource == "VARUND-QPI-DEMO"

 


To set up a log query alert for monitoring CPU usage across a fleet of Azure PostgreSQL servers, you start by selecting a prebuilt log query or writing one in Kusto Query Language (like the ones above). This query should leverage the cpu_percent metric from your Azure resources.

 

Sample Kusto Query

 

// KQL to visualize average cpu percent for multiple Flexible Server instances across multiple regions

AzureMetrics
| where TimeGenerated > ago(30m) //Filter for the last 30 minutes
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where MetricName == "cpu_percent"
| where ResourceId contains "FLEXIBLESERVERS" //Filter for ResourceId containing "FLEXIBLESERVERS"
| extend LocalTimeGenerated = datetime_utc_to_local(TimeGenerated, "US/Central") // Convert to local time
| project LocalTimeGenerated, Resource, Average
| summarize AvgCpuUsage = avg(Average) by LocalTimeGenerated, Resource
| render timechart

 

 

Output

[Image: media2_memory.png – time chart of average cpu_percent per Flexible Server instance]

 

After verifying the query's results, proceed to create an alert rule in the Azure portal. Key steps involve:

  1. Configuring the condition based on the log query results.
  2. Setting the alert logic with an appropriate threshold for CPU usage (e.g., alert if CPU usage is greater than 60%); see the example query after this list.
  3. Choosing an action group for notifications (email, SMS, etc.).
  4. Finalizing alert details such as name, description, and severity.
  5. When an alert fires, it sends notifications to its action groups. You can also view the alert in the Azure portal.
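For step 2, one way to express the alert logic is to bake the threshold into the query itself and alert when the query returns any rows. The sketch below assumes a 60% threshold evaluated over the last 15 minutes; adjust both to suit your environment.

// Return any Flexible Server whose average CPU over the last 15 minutes exceeds 60%
AzureMetrics
| where TimeGenerated > ago(15m)
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where MetricName == "cpu_percent"
| summarize AvgCpuUsage = avg(Average) by Resource
| where AvgCpuUsage > 60

With this shape, the alert rule condition can simply be "number of results greater than 0", so a single rule covers every server reporting to the workspace.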

 

[Image: media4_la_alert.png – fired log query alert shown in the Azure portal]


For detailed steps, refer to the Tutorial: Create a log query alert for an Azure resource.

Some things to consider when using Log Analytics

While Log Analytics offers comprehensive data analysis, it is important to note some key considerations:

  • Data ingestion and alert delay: The average latency to ingest log data is between 1 and 5 minutes. The specific latency for any given data will vary depending on several factors, which can delay alert triggering compared to Azure Monitor's near real-time metric alerts. You can measure the actual latency in your workspace with the query shown after this list. For details, refer to Log data ingestion time.
  • Additional costs: Azure Log Analytics cost is based on two plans: Basic Logs, currently free for search but with future billing to be announced, and Analytics Logs, offering Pay-As-You-Go (first 5 GB/month free) and Commitment Tiers (discounted rates starting at 100 GB/day with a 31-day commitment period, with overage billed at the tier rate). For details, refer to Azure Monitor Logs pricing.
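If you want to see how much of that ingestion delay applies to your own workspace, the following query is a simple sketch that compares each record's ingestion time with its TimeGenerated timestamp using the built-in ingestion_time() function.

// Measure ingestion latency for PostgreSQL metric records over the last hour
AzureMetrics
| where TimeGenerated > ago(1h)
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| extend IngestionDelay = ingestion_time() - TimeGenerated
| summarize AvgDelay = avg(IngestionDelay), MaxDelay = max(IngestionDelay) by Resource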


Log Analytics data ingestion strategies

 

So hopefully it is now clear that Azure Monitor is the service that aggregates time-series metrics from Azure resources, and Log Analytics is the service that aggregates logs and can be used when you want to monitor multiple Azure Postgres resources. Given this, the question is: how do we get that data into Log Analytics? Fortunately, Azure Monitor comes with options for exporting metrics data. For PostgreSQL Flexible Server, you can export logs and metrics data to the following destinations:

  1. A storage account.
  2. An Event Hub.
  3. A Log Analytics Workspace.

 

This can be configured through the portal, under the Diagnostic settings tab for the resource you want to configure. Here you choose which of the three destinations you want to send the data to, and then what data you want to send. This will usually include options for both logs and metrics, and often the metrics option will simply be “All metrics”, which, as the name suggests, sends all metrics that are available for that service. Below are the options for Azure Database for PostgreSQL Flexible Server.

 

[Image: media3_diagnostic_settings.png – diagnostic settings destination options for Azure Database for PostgreSQL Flexible Server]
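Once diagnostic settings are in place, a quick sanity check like the sketch below can confirm that metrics are actually flowing into the workspace. It simply counts the distinct metrics and the most recent record received per server over the last hour.

// Confirm which Flexible Servers are sending metrics to this workspace
AzureMetrics
| where TimeGenerated > ago(1h)
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| summarize MetricCount = dcount(MetricName), LastReceived = max(TimeGenerated) by Resource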

 

It is important to note that a single Log Analytics workspace can monitor resources across multiple subscriptions as long as they are under the same tenant. For example, if you have an Azure PostgreSQL database in subscription A and a Log Analytics workspace in subscription B, you can send the logs and metrics from that PostgreSQL database to the Log Analytics workspace.
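As a rough illustration of that cross-subscription view, the query below counts how many Flexible Server instances are reporting metrics from each subscription; it relies on the SubscriptionId column that the AzureMetrics table carries on every record.

// Count Flexible Server instances reporting metrics, grouped by subscription
AzureMetrics
| where TimeGenerated > ago(1h)
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| summarize Servers = dcount(Resource) by SubscriptionId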

 

Conclusion


For large enterprises, the combination of Azure Monitor and Log Analytics forms a robust solution for monitoring Azure Database for PostgreSQL Flexible Server ‘at scale.’ This integrated approach is critical for monitoring a vast array of Azure resources through a single pane of glass, ensuring not just immediate issue resolution but also facilitating long-term, strategic oversight across the entire enterprise infrastructure. This strategy enhances both performance and reliability, aligning with the needs of large-scale, complex deployments.
