Azure Database for PostgreSQL Blog

PostgreSQL 18 Vacuuming Improvements Explained

Shashikant Shakya
Oct 17, 2025

Introduction

The PostgreSQL 18 release brings one of the most significant collections of VACUUM and ANALYZE improvements in years: faster heap scans through asynchronous I/O (AIO), on-the-fly autovacuum scaling, earlier triggers for large tables, safer tail shrinking, and more robust observability.

At a Glance: What’s New

  • VACUUM runs faster by overlapping reads with processing via asynchronous I/O (AIO).
  • Autovacuum throughput is adjustable without restarts: set a startup ceiling at autovacuum_worker_slots and tune autovacuum_max_workers on the fly.
  • Autovacuum can trigger earlier on big tables using autovacuum_vacuum_max_threshold (hard cap on dead tuples).
  • Normal VACUUM can “eager-freeze” all-visible pages to amortize anti-wraparound work; control with vacuum_max_eager_freeze_failure_rate.
  • VACUUM and ANALYZE recurse into child tables by default; use the ONLY modifier to target just the parent.
  • Tail shrinking is explicit and predictable using the server setting vacuum_truncate or per-command TRUNCATE.
  • New visibility: per-table cumulative times, explicit cost-delay timing (track_cost_delay_timing).

Deep Dive: Changes That Matter

1) Vacuum is faster with asynchronous I/O (AIO)

A new asynchronous I/O subsystem lets VACUUM queue up reads and continue working asynchronously while data is fetched, reducing heap-scan wait time on cache misses and smoothing throughput during large table passes.

How to enable & verify:

  • Controlled by the io_method server parameter
  • SHOW io_method; ('worker' by default)
  • SHOW io_workers; (default 3)
  • Increase workers gradually if VACUUM remains I/O-bound: io_workers = 6; (test on non-prod first)

Note: Raise workers gradually (e.g., 3 → 6 → 12) and monitor server storage metrics and VACUUM durations.
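As a minimal sketch, the AIO settings can be inspected and the worker pool raised without a restart (io_workers is reloadable; changing io_method itself requires a restart; the value 6 is illustrative):

```sql
-- Inspect the current asynchronous I/O configuration
SHOW io_method;   -- 'worker' by default in PostgreSQL 18
SHOW io_workers;  -- 3 by default

-- Raise the I/O worker pool; io_workers can be reloaded without a restart
ALTER SYSTEM SET io_workers = 6;
SELECT pg_reload_conf();
```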

2) Adjust autovacuum workers on the fly

Autovacuum scaling is now much more flexible. Previously, changing autovacuum_max_workers required a server restart, which made time-based tuning and urgent bloat cleanup painful. PostgreSQL 18 separates two concepts:

  • autovacuum_worker_slots: The total number of slots reserved at startup (server restart required).
  • autovacuum_max_workers: The number of workers allowed to run concurrently (reloadable on the fly).

If you set autovacuum_max_workers higher than the reserved slots, PostgreSQL will cap the number and log a warning.

Note: When you scale up workers, consider enlarging the shared cost budget so that per-worker share does not collapse. Either raise autovacuum_vacuum_cost_limit or reduce autovacuum_vacuum_cost_delay.
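A sketch of the two-step setup: reserve slots once at startup, then scale workers and the shared cost budget with a reload (the specific values here are illustrative, not recommendations):

```sql
-- postgresql.conf (restart required): reserve headroom at startup
-- autovacuum_worker_slots = 16

-- Later, scale concurrency and the cost budget without a restart
ALTER SYSTEM SET autovacuum_max_workers = 8;
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;  -- keep the per-worker share healthy
SELECT pg_reload_conf();
```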

3) Force earlier autovacuum on very large tables

Extremely large tables can accumulate a significant number of dead tuples while staying below the classic trigger threshold.

threshold = reltuples * autovacuum_vacuum_scale_factor + autovacuum_vacuum_threshold 

To fix this, the new parameter autovacuum_vacuum_max_threshold adds a hard cap on dead tuples (default: 100,000,000). Once a table accumulates more dead rows than the cap, autovacuum runs immediately even if the scale-factor formula says “wait.” This prevents bloat buildup between runs on very large relations. Set it to -1 to disable the cap.
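To see why the classic formula alone is not enough, this illustrative query computes the classic trigger threshold from the current server settings (it ignores per-table storage-parameter overrides); with the defaults (scale factor 0.2, threshold 50), a one-billion-row table waits for roughly 200 million dead tuples:

```sql
-- Classic trigger threshold per table, from current server-wide settings
SELECT relname,
       reltuples::bigint AS estimated_rows,
       (reltuples * current_setting('autovacuum_vacuum_scale_factor')::float8
        + current_setting('autovacuum_vacuum_threshold')::float8)::bigint AS classic_threshold
FROM pg_class
WHERE relkind = 'r' AND reltuples > 0
ORDER BY reltuples DESC
LIMIT 5;
```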

Examples:

  • Server-wide cap (example: 50M dead tuples): autovacuum_vacuum_max_threshold = 50000000
  • Per-table override for a hot relation: ALTER TABLE orders SET (autovacuum_vacuum_max_threshold = 20000000);
  • Disable on a table if needed: ALTER TABLE staging SET (autovacuum_vacuum_max_threshold = -1);

Note: Prefer per-table caps (e.g., 10–50M) for a few giant tables instead of lowering the server-wide cap too far.
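To decide where a per-table cap is worth setting, current dead-tuple counts can be checked from the statistics views:

```sql
-- Tables with the most dead tuples right now
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```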

4) Eager freezing during normal vacuum

On very large tables, “normal” VACUUM often skips all-visible pages and leaves a backlog of pages that are all-visible but not all-frozen. When an anti-wraparound (aggressive) VACUUM finally triggers, it must scan all of those pages, causing long, I/O-heavy maintenance runs that compete for resources with your foreground workload.

With the new eager freezing, normal VACUUM can proactively freeze tuples on all-visible pages, spreading the freeze workload across routine runs and shortening future anti-wraparound passes on very large tables.

Examples:

  • Server-wide: be a bit more eager: vacuum_max_eager_freeze_failure_rate = 0.05
  • Per-table: even more eager on a hot relation: ALTER TABLE events SET (vacuum_max_eager_freeze_failure_rate = 0.10);
  • Disable for churny tables: ALTER TABLE temp_staging SET (vacuum_max_eager_freeze_failure_rate = 0.0);

Note: Increase if anti-wraparound VACUUMs on big tables are too long or spike I/O. Decrease if routine VACUUMs become noticeably longer without observable benefit.
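To gauge the all-visible-but-not-all-frozen backlog that eager freezing works down, the pg_visibility contrib extension can summarize the visibility map (public.events here stands in for any large table):

```sql
-- Requires the pg_visibility contrib extension
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- Pages marked all-visible vs. all-frozen; a large gap means a freeze backlog
SELECT * FROM pg_visibility_map_summary('public.events');
```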

5) Vacuum/analyze: recurse by default; use ONLY modifier when needed

PostgreSQL 18 simplifies maintenance for partitioned tables. For inheritance-based partitioning, VACUUM and ANALYZE now include child tables by default, removing the need for manual recursion; use the ONLY keyword to process just the parent. Declarative partitioning already behaved this way, but the new ability to run ANALYZE ONLY on a partitioned parent is a key improvement: it refreshes the parent’s statistics without scanning all partitions, ideal when partition stats are already current but accurate parent-level estimates are needed for better query planning.

Examples:

  • Inheritance-based partitioning: 
    • Recurse by default (new behavior)
      • VACUUM VERBOSE public.legacy_parent; -- vacuum parent + child tables
      • ANALYZE public.legacy_parent; -- analyze parent + child tables
    • Restrict to the parent only
      • VACUUM VERBOSE ONLY public.legacy_parent; -- old behavior
      • ANALYZE ONLY public.legacy_parent; -- old behavior
  • Declarative partitioning: 
    • Refresh just the parent’s stats (skip partitions)
      • ANALYZE ONLY public.sales_partitioned;
    • VACUUM ONLY on a partitioned parent warns and does no work
      • VACUUM VERBOSE ONLY public.sales_partitioned;

Note: For large partitioned tables, schedule parent-only ANALYZE during peak hours and full recursive runs off-peak.

6) Tail shrinking made explicit (vacuum_truncate)

Autovacuum’s tail shrink step (removing empty pages from the end of a table) can take an ACCESS EXCLUSIVE lock that is replayed on hot standbys, briefly stalling read operations.

To make this more predictable, PostgreSQL 18 adds a top-level switch, vacuum_truncate, so you can disable automatic truncation and perform this operation only when you choose.

Examples:

  • Prefer explicit control: disable automatic tail-shrinks: vacuum_truncate = off
  • Later, shrink the tail of a large log table in a window: VACUUM (VERBOSE, TRUNCATE ON) public.pgbench_history;
  • Or keep truncation off for specific hot tables: ALTER TABLE public.orders SET (vacuum_truncate = off);

Note: If replicas serve heavy read operations, keep automatic truncation off and run explicit truncation within your maintenance windows.

7) Easier to see vacuum/analyze overhead

  • Per-table time counters: cumulative total_vacuum_time / total_autovacuum_time columns in pg_stat_all_tables (and its user/sys variants) expose where maintenance time is spent, which is great for finding your top offenders.
  • Throttling timing: turn on track_cost_delay_timing to record actual sleep time from cost-based delays in pg_stat_progress_vacuum and in VERBOSE output, making it easy to distinguish “we waited” from “we hit disk.” (Off by default to avoid timing overhead.)
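Assuming track_cost_delay_timing has been enabled, a running VACUUM’s accumulated delay can then be read from the progress view:

```sql
-- Record time spent sleeping in cost-based delays (off by default)
ALTER SYSTEM SET track_cost_delay_timing = on;
SELECT pg_reload_conf();

-- While a VACUUM is running: how much of its time was spent sleeping?
SELECT pid, relid::regclass AS table_name, phase, delay_time
FROM pg_stat_progress_vacuum;
```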

Example: Top tables by total maintenance time:

SELECT relname,
       total_vacuum_time,
       total_autovacuum_time
FROM pg_stat_user_tables
ORDER BY (COALESCE(total_vacuum_time, 0)
        + COALESCE(total_autovacuum_time, 0)) DESC NULLS LAST
LIMIT 10;

Conclusion

These improvements make your maintenance operations faster, more predictable, and easier to tune. Get started by checking your AIO settings, right-sizing autovacuum concurrency and cost budgets, setting hard caps for your biggest tables, and adopting explicit tail shrinking if you have hot standbys. With better observability, you can quickly confirm whether VACUUM is throttled by cost-based delays or genuinely storage-bound, and adjust accordingly.

Before applying changes in production, ensure you have validated these settings in a staging environment, monitor pg_stat_progress_vacuum, pg_stat_io, and table-level counters, and iterate gradually. The net effect is smoother autovacuum behavior, shorter aggressive VACUUMs, and fewer unexpected locks.
