Recently I have fielded several questions related to I/O affinity. Allow me to outline the behavior and clear up misconceptions.
When I/O affinity is enabled, SQL Server creates a hidden scheduler for the process as well as a special lazy writer thread for each of the bits set in the I/O affinity mask. The only job of the special lazy writer(s) is to watch the read and write request lists and to issue and complete the I/O requests. These special lazy writers are not responsible for any memory targets or other behavior that a normal lazy writer handles.
For example, if you set the I/O affinity mask to 0x3 (bits 00000011) then SQL Server creates a hidden scheduler for a Log Writer and two (2) special lazy writers.
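To make the bit arithmetic concrete, here is a short Python sketch of how the set bits in a mask map to CPUs. This is purely an illustration of the mask format, not SQL Server internals:

```python
def io_affinity_cpus(mask: int) -> list[int]:
    """Return the CPU ids whose bits are set in an affinity mask."""
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]

# 0x3 == 0b00000011: CPUs 0 and 1 each get a special lazy writer.
print(io_affinity_cpus(0x3))
```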
When a worker issues an I/O request (read or write), the request is packaged and placed on the read or write list of its parent special lazy writer. The parent special lazy writer is determined by a MOD operation: the current scheduler id MOD the number of special lazy writers.
Take the example of two (2) I/O affinity workers on an 8 proc computer. The following parent I/O affinity assignments occur.

Scheduler id    Parent I/O Affinity Scheduler
0               0
1               1
2               0
3               1
4               0
5               1
6               0
7               1
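The MOD-based parent assignment can be sketched in Python. This is a model of the behavior described above (two special lazy writers, eight schedulers), not SQL Server code:

```python
def parent_io_scheduler(scheduler_id: int, io_scheduler_count: int) -> int:
    """Model of the MOD assignment: scheduler id MOD special lazy writer count."""
    return scheduler_id % io_scheduler_count

# Two I/O affinity workers on an 8 proc computer.
assignments = {s: parent_io_scheduler(s, 2) for s in range(8)}
print(assignments)  # even schedulers -> writer 0, odd schedulers -> writer 1
```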
Do not use overlapping I/O affinity mask bits and scheduler affinity mask bits. If you attempt this using sp_configure, a warning is generated. Management Studio currently does not present this warning.
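Detecting an overlap is a simple bitwise AND of the two masks. The check below is an illustration of the arithmetic, not a SQL Server API:

```python
def overlapping_cpus(affinity_mask: int, io_affinity_mask: int) -> int:
    """Return a mask of CPUs claimed by both the CPU and I/O affinity masks."""
    return affinity_mask & io_affinity_mask

# CPU affinity on CPUs 0-3 (0x0F), I/O affinity on CPUs 2-3 (0x0C):
# bits 2 and 3 overlap, which is the misconfiguration to avoid.
print(bin(overlapping_cpus(0x0F, 0x0C)))
```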
The goal of I/O affinity is to offload the I/O workload onto dedicated scheduler(s), for systems where the I/O work requires enough CPU cycles to consume an entire CPU.
Placing normal schedulers and I/O affinity schedulers on the same CPU causes resource competition and degrades performance.
SQL Server attempts to keep a single active worker per scheduler. When you create an I/O affinity scheduler and normal schedulers on the same CPU, contention occurs.
SQL Server 2005 I/O affinity is not completely NUMA aware. On a hardware NUMA instance of SQL Server, a normal lazy writer is created for each node. This lazy writer is responsible for maintaining memory parameters for the associated node. SQL Server 2005 has also been enhanced for hardware NUMA awareness. When checkpoint executes under hardware NUMA, the writes are not issued by checkpoint itself. Instead, the I/O requests are distributed to the per-node lazy writers, allowing checkpoint to scale better and keeping the I/O requests local to the associated memory nodes.
Enabling I/O affinity when hardware NUMA is in place redirects the I/O requests to the associated I/O affinity scheduler. The redirection is based on the MOD operation and results in remote memory accesses, which should be avoided on NUMA instances. There is no way to set an I/O affinity mask on a multi-node NUMA installation that allows the I/O requests to remain local to the node. Currently, I/O affinity and hardware NUMA don't mix.
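To see why the MOD redirection breaks node locality, consider a toy two-node layout. The CPU-to-node mapping and writer placement below are hypothetical, chosen only to demonstrate the effect:

```python
# Toy model: CPUs 0-3 on NUMA node 0, CPUs 4-7 on node 1, with two special
# lazy writers on CPUs 0 and 1 (both node 0). The MOD assignment sends I/O
# from every node 1 scheduler to a node 0 writer -> remote memory access.
node_of_cpu = {cpu: cpu // 4 for cpu in range(8)}   # 2 nodes, 4 CPUs each
io_writer_cpu = {0: 0, 1: 1}                        # writer id -> CPU (node 0)

remote = [cpu for cpu in range(8)
          if node_of_cpu[io_writer_cpu[cpu % 2]] != node_of_cpu[cpu]]
print(remote)  # schedulers whose I/O lands on a remote node
```

No matter how the two writer bits are placed, some schedulers on one node must cross to the other, which is why I/O affinity and hardware NUMA don't mix.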