First published on MSDN on Jun 01, 2012
FlushCache is the SQL Server routine that performs the checkpoint operation. The following message is output to the SQL Server error log when trace flag ( 3504 ) is enabled.
2012-05-30 02:01:56.31 spid14s FlushCache: cleaned up 216539 bufs with 154471 writes in 69071 ms (avoided 11796 new dirty bufs) for db 6:0
2012-05-30 02:01:56.31 spid14s average throughput: 24.49 MB/sec, I/O saturation: 68365, context switches 80348
2012-05-30 02:01:56.31 spid14s last target outstanding: 1560, avgWriteLatency 15
Prior to SQL Server 2012 the trace flag had to be enabled in order to output the information to the SQL Server error log. ( Trace flag was the only way to obtain the output. ) |
|
SQL Server 2012 adds an additional condition ( is long checkpoint ) to the logic. If the trace flag is enabled or the checkpoint 'TRUE == IsLong' the message is added to the SQL Server error log. |
Is Long Checkpoint: A long checkpoint is defined as a 'FlushCache / checkpoint' operation on a database that has exceeded the configured recovery interval. |
If your server does not have the trace flag enabled ( use dbcc tracestatus(-1) to check ) the message is indicating that the checkpoint process, for the indicated database, exceeded the configured recovery interval. If this is the case you should review your I/O capabilities as well as the checkpoint and recovery interval targets. |
Not meeting the recovery interval target means that recovery from a crash could exceeded operational goals. |
Update (Nov 2019): more detail on some of the numbers in the error message using the following example:
* 4253201 bufs – buf is short for "buffer" and indicates a database page, therefore in this example FlushCache had to write 4253201 pages * 8KB = 33881608 KB = 32.3GB
* with 269274 writes – how many I/O operations were issued in this FlushCache request. SQL Server will try to group pages into larger I/O requests: from the number of writes and bufs we can calculate the average I/O size -- 33881608KB / 269274 writes = 126.76 KB per I/O
* in 73938 ms – how much time it took to complete FlushCache. Note: typically you won’t see this message with a time lower than 60 seconds
* average throughput: 449.41 MB/sec – this is self-explanatory, but you can also infer the throughput dividing the amount of I/O performed by the total elapsed time 33087.5MB / 73.938 seconds = 447.5MB/s (slightly different due to rounding)
* I/O saturation: 202742 – the maximum amount of pending I/Os seen during FlushCache, in this case 202742 out of the total 269274
* context switches 320164 – how many times the task running Checkpoint/FlushCache yielded the CPU to other SQL tasks inside SQL SOS Scheduler, this is useful in CPU bottleneck scenarios
* last target outstanding: 11218 – target of queued outstanding I/Os which would cause FlushCache task to work more aggressively to reach
* avgWriteLatency 25 – average response time of the I/Os issues in milliseconds
As you can see in the above example, in a very short interval (60 seconds by default) a lot of data in the database was changed, turning pages in the buffer pool from clean to dirty. The resulting FlushCache made the I/O subsystem work very hard to keep up with the I/O – 32.3GB of changes in about 74 seconds.
Bob Dorr - Principal SQL Server Escalation Engineer