A Practical Guide to Understanding Memory Efficiency
PostgreSQL performance is often dictated not just by query design or indexing strategy, but by how effectively the database leverages memory. At the heart of this memory usage lies shared_buffers—PostgreSQL’s primary buffer cache. Understanding how well this cache is utilized can make the difference between a system that scales smoothly and one that struggles under load.
In this post, we’ll walk you through a practical, data-driven approach to analyzing PostgreSQL buffer cache behavior using native statistics and the pg_buffercache extension. The goal is to answer a few critical questions:
- Is the current shared_buffers configuration sufficient?
- Are high-value tables and indexes actually being served from memory?
- Is PostgreSQL spending too much time going to disk when it shouldn’t?
By the end, you’ll have a repeatable methodology to assess cache efficiency and make informed tuning decisions.
Why Buffer Cache Analysis Matters
PostgreSQL relies heavily on its buffer cache to minimize disk I/O. Every time a query needs a data or index page, PostgreSQL first checks whether that page already exists in shared_buffers. If it does, the page is served directly from memory—fast and efficient. If not, PostgreSQL must fetch it from disk (or the OS page cache), which is significantly slower.
While metrics like query latency and IOPS can tell you that performance is degraded, buffer cache analysis helps explain why. It allows you to:
- Validate whether frequently accessed objects stay hot in cache
- Identify cache pollution caused by large, low-value tables
- Determine whether increasing shared_buffers would provide real benefits or just waste memory
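Before diving into the cache itself, it helps to record the current configuration as a baseline. A minimal check, assuming you can run SQL against the target server:

SHOW shared_buffers;
SHOW block_size;

The block size (8 kB by default) matters later, because pg_buffercache reports usage in blocks rather than bytes.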
Inspecting Shared Buffers with pg_buffercache
The pg_buffercache extension provides a real-time view into PostgreSQL’s shared buffers. Unlike cumulative statistics, it shows what is in memory right now—which relations are cached, how many blocks they occupy, and how frequently those buffers are reused.
Enabling the Extension
pg_buffercache is not enabled by default and requires superuser privileges:
CREATE EXTENSION pg_buffercache;
Once enabled, you can directly query the contents of shared buffers across databases, tables, and indexes.
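As a quick sanity check after enabling the extension, you can count how many buffers are in use versus free. This is a sketch assuming PostgreSQL 9.4 or later (for the FILTER clause); unused buffers have a NULL relfilenode:

SELECT count(*) FILTER (WHERE relfilenode IS NOT NULL) AS used_buffers,
       count(*) FILTER (WHERE relfilenode IS NULL) AS free_buffers,
       count(*) AS total_buffers
FROM pg_buffercache;

Many free buffers shortly after startup is normal; persistently free buffers on a busy system suggest shared_buffers exceeds the working set.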
Analyzing Cache Distribution
Understanding where your shared buffers are being consumed is the first step toward meaningful tuning.
Database-Level Cache Distribution
This query shows how shared buffers are distributed across databases in the server:
SELECT CASE
           WHEN c.reldatabase IS NULL THEN '(unused)'
           WHEN c.reldatabase = 0 THEN '(shared objects)'
           ELSE d.datname
       END AS database,
       count(*) AS cached_blocks
FROM pg_buffercache AS c
LEFT JOIN pg_database AS d ON c.reldatabase = d.oid
WHERE d.datname IS NULL
   OR d.datname NOT LIKE 'template%'
GROUP BY d.datname, c.reldatabase
ORDER BY d.datname, c.reldatabase;
This is particularly useful in multi-database environments where one workload may be evicting cache pages needed by another.
Table and Index-Level Cache Consumption
To understand which relations dominate the cache, the following query breaks buffer usage down by tables and indexes:
SELECT c.relname, c.relkind, count(*) AS buffers
FROM pg_database AS a, pg_buffercache AS b, pg_class AS c
WHERE c.relfilenode = b.relfilenode
AND b.reldatabase = a.oid
AND a.datname = current_database()
GROUP BY 1, 2
ORDER BY 3 DESC, 1;
This helps answer an important question: Are your most business-critical tables and indexes actually resident in memory, or are they constantly being evicted?
If large, rarely used tables consume a disproportionate share of buffers, it may indicate cache churn or the need for workload isolation.
Understanding Buffer Usage Count (Hot vs Cold Data)
Each buffer in shared memory carries a usage count, which PostgreSQL's clock-sweep replacement algorithm increments on access (up to a maximum of 5) and decrements while scanning for eviction candidates. Higher values indicate hotter data.
SELECT c.relname, c.relkind, usagecount, count(*) AS buffers
FROM pg_database AS a, pg_buffercache AS b, pg_class AS c
WHERE c.relfilenode = b.relfilenode
AND b.reldatabase = a.oid
AND a.datname = current_database()
GROUP BY 1, 2, 3
ORDER BY 3 DESC, 1;
A healthy system typically shows a meaningful number of buffers at the higher usage counts (4–5, with 5 being the maximum), indicating frequently reused data that benefits from caching.
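To see the overall distribution rather than per-relation detail, a simple histogram across all buffers works well:

SELECT usagecount,
       count(*) AS buffers,
       round(100.0 * count(*) / (SELECT count(*) FROM pg_buffercache), 1) AS pct
FROM pg_buffercache
GROUP BY usagecount
ORDER BY usagecount;

Unused buffers appear with a NULL usage count, so this also doubles as a quick check on free space in the cache.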
Buffer Cache Percentages: Putting Numbers in Context
Raw buffer counts are useful, but percentages make interpretation easier. The following query shows:
- How much of shared_buffers each relation occupies
- What percentage of the relation itself is cached
SELECT c.relname,
pg_size_pretty(count(*) * 8192) AS buffered,
round(100.0 * count(*) / (SELECT setting FROM pg_settings WHERE name='shared_buffers')::integer, 1) AS buffers_percent,
round(100.0 * count(*) * 8192 / NULLIF(pg_relation_size(c.oid), 0), 1) AS percent_of_relation
FROM pg_class c
JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
JOIN pg_database d ON b.reldatabase = d.oid AND d.datname = current_database()
GROUP BY c.oid, c.relname
ORDER BY 3 DESC
LIMIT 10;
This view is especially powerful when validating whether performance-critical objects are adequately cached relative to their size.
Complementing Cache Views with I/O Statistics
While pg_buffercache shows the current state of memory, I/O statistics reveal long-term trends. PostgreSQL exposes these via pg_statio_user_tables and pg_statio_user_indexes.
Table Heap Hit Ratios
SELECT relname,
heap_blks_hit::numeric / (heap_blks_hit + heap_blks_read) AS hit_pct,
heap_blks_hit,
heap_blks_read
FROM pg_catalog.pg_statio_user_tables
WHERE (heap_blks_hit + heap_blks_read) > 0
ORDER BY hit_pct;
Hit ratios close to 1 indicate that table data is largely served from memory rather than disk. Note that a "hit" here means the page was found in shared_buffers; a "read" may still be satisfied by the OS page cache rather than physical disk.
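For a server-wide view of the same signal, pg_stat_database aggregates hits and reads per database. A sketch, with NULLIF guarding against division by zero on an idle database:

SELECT datname,
       blks_hit,
       blks_read,
       round(100.0 * blks_hit / NULLIF(blks_hit + blks_read, 0), 2) AS hit_pct
FROM pg_stat_database
WHERE datname = current_database();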
Index Hit Ratios
SELECT indexrelname,
idx_blks_hit::numeric / (idx_blks_hit + idx_blks_read) AS hit_pct,
idx_blks_hit,
idx_blks_read
FROM pg_catalog.pg_statio_user_indexes
WHERE (idx_blks_hit + idx_blks_read) > 0
ORDER BY hit_pct;
Poor index hit ratios often point to insufficient cache or inefficient query patterns that bypass indexes.
Including TOAST and Index Reads
For large objects, TOAST activity can significantly impact I/O. This query provides a more holistic view:
SELECT *,
(heap_blks_read + toast_blks_read + tidx_blks_read) AS total_blocks_read,
(heap_blks_hit + toast_blks_hit + tidx_blks_hit) AS total_blocks_hit
FROM pg_catalog.pg_statio_user_tables;
This helps identify tables whose TOAST data is frequently read from disk and may benefit from better caching or query rewrites.
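A focused variant computes the TOAST hit ratio directly, skipping tables with no TOAST activity:

SELECT relname,
       toast_blks_hit,
       toast_blks_read,
       round(100.0 * toast_blks_hit / NULLIF(toast_blks_hit + toast_blks_read, 0), 1) AS toast_hit_pct
FROM pg_catalog.pg_statio_user_tables
WHERE (toast_blks_hit + toast_blks_read) > 0
ORDER BY toast_hit_pct;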
How to Interpret the Results
When reviewing buffer cache and I/O metrics, keep the following guidelines in mind:
- Validate cache residency of critical objects: If business-critical tables and indexes occupy a meaningful share of shared_buffers, your cache sizing is likely reasonable.
- Correlate buffer data with hit ratios: High hit ratios in pg_statio_user_tables and pg_statio_user_indexes confirm effective caching. Persistently low ratios may justify increasing shared_buffers.
- Analyze usage count distribution: A healthy number of buffers with higher usage counts indicates hot data benefiting from cache reuse.
- Avoid over-tuning: If most buffers have low usage counts but hit ratios remain high, increasing shared_buffers further may not yield measurable gains.
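Because the pg_statio counters are cumulative since the last statistics reset, ratios can be skewed by long-past activity. One approach is to reset the counters and measure over a bounded window (note that this clears all statistics for the current database):

SELECT pg_stat_reset();

Then re-run the hit-ratio queries after a representative period of production traffic.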
Conclusion
Buffer cache analysis bridges the gap between theory and reality in PostgreSQL performance tuning. By combining real-time cache inspection with long-term I/O statistics, you gain a clear picture of how memory is actually used—and whether changes to shared_buffers will deliver tangible benefits.
Rather than tuning memory blindly, this approach lets you optimize with confidence, grounded in data that reflects your real workload.