If you are facing Performance issues with Azure Database for PostgreSQL, these can often be resolved by doing basic troubleshooting steps to understand the cause of the bottleneck and then explore options to fix it.
Server Level Troubleshooting
1- Resource Utilization:
It's important to make sure that you are not maxing out any of your resource limits, reaching high levels of resource utilization will cause a performance issue this can be avoided by monitoring server's resource utilization on the Azure Portal.
In Azure Database for PostgreSQL resource limits determined by the service tier that you are using, number of vCores and Storage size that you provision. All this information is documented here and here, you can navigate to your Azure Database for PostgreSQL and use the Metrics blade to check if you are maxing out any of your resource limits.
If your workload is maxing out any of these server metrics please consider scaling up resources to accommodate the workload and review top queries consumers and tune them.
2-Make sure you are applying Application side best practices:
a) Connection pooling: This significantly reduces connection latency by reusing existing connections and enables higher database throughput (transactions per second) on the server.
Typically, application side pooling opens a bunch of connections which sit idle at the database. Idle applications consume 10MB of memory per connection and can also create contention for establishing new connections and slow down queries. You can learn much more about this here.
Essentially if you see a high number of idle connections, it would be recommended to put a PostgreSQL specific connection pooler in place (server side connection pooler). We recommend leveraging pgbouncer for this. You can find steps to configure and install pgbouncer here.
b) Accelerated Networking: enables single root I/O virtualization (SR-IOV) to a VM, greatly improving its networking performance. This high-performance path bypasses the host from the datapath, reducing latency, jitter, and CPU utilization, for use with the most demanding network workloads on supported VM types.
c) Improve Performance of Read Intensive Workloads on Azure DB for PostgreSQL using Query Caching: Pgpool-II Query Caching, Redis cache , Memcached are the popular external query caching solutions read more about query caching here.
d) Application and Database should be in the same Data Center: make sure your application is in the same data center as your database server to avoid overhead network latency.
e) CPU exhaustion: Single-threaded applications can result in CPU exhaustion of one CPU while the other CPUs are under-utilized. Consider parallelizing your workload to take advantage of all the vCores available.
f) TCP_NODELAY: TCP_NODELAY is a client-side setting that should be considered on a client Virtual Machine (VM). Applications that benefit from the TCP_NODELAY option typically tend to do smaller infrequent writes and are particularly sensitive to latency. As an example, latency can be reduced from 15-40 ms to 2-3 ms with this setting.
3- Leverage ANALYZE command ANALYZE collects statistics about the contents of tables in the database, and stores the results in the pg_statistic system catalog. Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries. You can read more about this here.
With no parameter, ANALYZE examines every table in the current database. With a parameter, ANALYZE examines only that table. It is further possible to give a list of column names, in which case only the statistics for those columns are collected.
4- Overall Slowness
a) Consider disabling pg_stat_statements, this can be done by visiting the server parameters blade on the Azure Portal and set "pg_stat_statements.track" to NONE. Some customers workload have seen up to a 50 percent performance improvement with disabling this. The tradeoff you make when you disable pg_stat_statements is the inability to troubleshoot performance issues.
b) Network latency: Check the network latency between the client and the database service instance. You can check the network latency by running simple query as ‘SELECT 1’.
c) Consider disabling the following server parameters to reduce the overhead on the database side. Those server parameters are helpful to understand your workload but proven to cause performance degradation when enabled. - log_duration = OFF - log_min_duration_statement = -1 - log_statement_stats = OFF
d) Checkpoint: Periodic flush of the cached data to the disk to ensure the durability. This is an IO intense job and cause some performance impact. For larger databases, please consider changing the following two server parameters:
- bgwriter_delay = 20
- checkpoint_completion_target = 0.2
e) Autovacuum: PostgreSQL uses multiversion concurrency control (MVCC) to allow greater database concurrency. Every update results in an insert and delete, and every delete results in rows being soft-marked for deletion. Soft-marking identifies dead tuples that will be purged later. To carry out these tasks, PostgreSQL runs a vacuum job. More information can be found here.
"autovacuum_*" server parameters needs to be tuned to be more effective and less invasive.
Query Level Troubleshooting
Use Microsoft Intelligent Performance tools to explore queries that are considered top consumers and also learn about missing indexes along with recommendation and full analysis for your workload. Intelligent Performance for Azure Database for PostgreSQL includes:
1) Query Store: Query Store feature in Azure Database for PostgreSQL provides a way to track query performance over time. Query Store simplifies performance troubleshooting by helping you quickly find the longest running and most resource-intensive queries. Query Store automatically captures a history of queries and runtime statistics, and it retains them for your review. It separates data by time windows so that you can see database usage patterns. Data for all users, databases, and queries is stored in a database named azure_sys in the Azure Database for PostgreSQL instance. Learn more about Query store here.
2) Query Performance Insight Query Performance Insight helps you to quickly identify what your longest running queries are, how they change over time, and what waits are affecting them.
3) Performance recommendations Performance Recommendations feature analyses your databases to create customized suggestions for improved performance. To produce the recommendations, the analysis looks at various database characteristics including schema. Enable Query Store on your server to fully utilize the Performance Recommendations feature. After implementing any performance recommendation, you should test performance to evaluate the impact of those changes.