Blog Post

SQL Server Blog
7 MIN READ

HOW IT WORKS: SQL Server Scheduler Affinity

BobDorr's avatar
BobDorr
Icon for Microsoft rankMicrosoft
Feb 05, 2022
Moved from: bobsql.com

 

SQL Server uses 3 types of affinity to control where the SQL Server worker threads execute.  Before explaining the different scheduler affinity types let me clarify some terminology.

 

Node Types

SQL Server makes a specific distinction between scheduling and memory nodes.

 

Scheduling nodes:        sys.dm_os_nodes

Memory nodes:            sys.dm_os_memory_nodes

 

A scheduling node is a used to group a set of SQLOS schedulers.  The scheduling node must :

-       Remain within a single memory node.

-       Can be configured to use a subset of the CPUs presented by the OS from the same memory node.

 

For example: A memory node with 64 CPUs is a complete, Operating System, scheduler group.  SQL Server may choose to divide the memory node allowing for better partitioning and performance.  The Soft Numa feature may take the 64 CPUs and create 8 scheduler nodes, each managing 8 CPUs or 4 scheduler nodes managing 16 CPUs, etc.  The decision is performance driven.  

 

A memory node represents the memory associated with a group of CPUs from the physical hardware.   SQL Server aligns schedulers and other partitioned structures with the memory node to reduce access to remote, NUMA node memory when possible.  A memory node may have 1 or more scheduling nodes, but a scheduling node can only be assigned to a single memory node. 


Online vs Offline Schedulers

 

The SQL Server schedulers can be online or offline. 

 

Online

An online scheduler IS accepting requests.

Offline

An offline scheduler is NOT accepting requests. 

A scheduler dynamically may transition from online to offline.  The scheduler’s workers are not moved to online schedulers until the current task completes.  While, draining workers from offline scheduler’s the worker’s affinity mask is adjusted to use any online scheduler’s CPU, allowing SQL Server to stop using the associated offline CPU and complete the active tasks.  Then migrate the workers to online schedulers.


This allows you to scale up or down the SQL Server schedulers without restarting the SQL Server.

 

Online vs Offline Schedulers

 

The SQL Server schedulers can be visible or hidden.

 

Visible

The scheduler is accepting external requests.  (Usually queries submitted from TDS based clients.)

Hidden

A hidden scheduler only accepts internal requests. 

 

For example: When performing a backup, the striped file activities are handled by tasks on hidden scheduler.

 

Windows Job Objects

In SQL Azure the SQL Server may be executed under the control of a Windows Job Object.   The job object can be configured to limit the resource usage on the system, including which nodes and CPUs the processes started under its control may utilize.

 

For example: A smaller sized, SQL Azure database, using 2 CPUs, could be assigned to CPU 2 and 3 on the host machine.  For this example, assume the host machine is single node, 8 CPU system.  The affinity mask for the job would be 00001100 indicating CPU 2 and3 are to be used for processes running in this job context.  The sys.dm_os_schedulers would show 8 schedulers but only schedulers 2 and 3 are set to visible, online and the other scheduler are set to visible, offline.  If the job configuration is changed to use CPU 0 and 1 ( 00000011) and SQL Server brings scheduler 0 and 1 online and takes scheduler 2 and 3 offline.

 

Types of Scheduling Affinity

 

Refer to the SetThreadAffinityMask /  SetThreadGroupAffinity APIs for more details as well as Processor Groups.

 

Hint: select * from sys.dm_os_sys_info

 

NO_AFFINITY (AGNOSTIC)
Applies To: SQL Azure Database Only

The workers assigned to a scheduler are allowed to use any online CPU presented and the workers are not affinitized (mask remains 0.) 

 

NO_AFFINITY only creates online schedulers. By only creating the online schedulers the overhead of the offline schedulers (~8MB per scheduler) is eliminated.

 

By not affinitizing the workers to specific CPUs the job object can be reconfigured to use different CPUs.  The workers and schedulers are not impacted because the Operating System simply schedules execution on the configured CPUs.  The scheduler ids remain 0, 1, .. even though the workers are scheduled on other CPUs by the OS.

For example: 2 CPU Slo with Job Object configured to use CPU 4 and 5.  The sys.dm_os_schedulers are 0 and 1 with NO_AFFINITY set to allow workers to run on any presented CPU from the job object.

If the job object is reconfigured to use CPU 1 and 2 the sys.dm_os_schedulers data remains the same and CPU resource assignment is handled by the OS.

Note: To change from NO_AFFINITY requires SQL Server restart.

Currently (Aug 2021): Slos less than 40 vcores leverage NO_AFFINITY.

MANUAL

Applies To: SQL Azure Database & SQL Server Box

Manual affinity is also termed hard affinity.

 

Manual affinity instructs SQL Server to set the worker’s CPU affinity to only the CPU designated for the scheduler.  The Operating System is instructed to only schedule the worker on the specific CPU.

 

When manual affinity is used both online and offline schedulers are created.  This allows dynamic configurations to bring schedulers on or offline. 

 

For example: SQL Server is using MANUAL affinity on a 3 CPU system with only Scheduler 0 online.  Scheduler’s 1 and 2 are created and set offline.  Incoming requests are assigned to scheduler 0 and the worker affinity is set so only CPU 1 is used for scheduler 0, CPU2 for scheduler 1 and so forth. 

If the MANUAL affinity is changed to use scheduler’s 1 and 2.  Incoming requests are assigned to either scheduler 1 or 2 but scheduler 1 worker’s can only execute on CPU 2 and scheduler 2 worker’s can only execute on CPU 3.

MANUAL affinity provides the best, top end performance (for benchmarks by utilizing L1 caches) but is susceptible to noisy, CPU neighbors.  From this example assume MANUAL affinity is set to all 3 schedulers are online.  Workload is being assigned to all 3 CPUs when a high priority thread from an outside process (SQLAgent for example) starts running on CPU 2.  MANUAL affinity requires the workers on Scheduler 1 to remain on CPU 2.  The noisy neighbor competes for CPU 2 resources and can affect the worker throughput on scheduler 1.

 

Note: Changing to AUTO or adjusting the online schedulers can take place without restart of SQL Server.

AUTO
Applies To: SQL Azure Database & SQL Server Box

Auto affinity is the middle ground between NO_AFFINITY and MANUAL affinity and the most used and recommend configuration mode.

 

-       Auto affinity sets the worker’s affinity to any, online scheduler within the same scheduling node.

-       Auto affinity sets the worker’s preferred CPU hint to the scheduler assigned CPU.  The preferred CPU tells the Operating System to schedule the worker on the desired CPU unless the ideal CPU resources are deemed limited, in which case allow the Operating System dispatcher to schedule the worker on any CPU associated with the worker’s affinity setting.

 

For example: A memory node with 4 CPUs is split into 2 scheduling nodes.  The affinity mask for scheduler 0 would be 0011 to use CPU 1 or 2 and its preferred CPU set to 0001 to use CPU 1.  Whereas scheduler 2 would have the same affinity mask 0011 with a preferred CPU set to 0010 to use CPU 2.

Allowing the Operating System to schedule the worker on alternate CPUs can eliminate the impact of a noisy neighbor.  As discussed in the MANUAL affinity section a noisy neighbor could consume CPU resources and impact work associated with the scheduler.  When the SQL Server is configured for AUTO scheduling the OS can detect a busy CPU and use the alternative CPUs associated with the workers affinity.

 

Note: Changing to AUTO or adjusting the online schedulers can take place without restart of SQL Server.

 

Currently (Aug 2021): Slos greater than (and equal) to 40 vcores use auto affinity.

 

Trace Flag 8002

Enabling trace flag 8002 allows you to set a subset of CPUs online (MANUAL mode) but use AUTO affinity assignments.

 

For example, the affinity mode is MANUAL and set to 0110 to online scheduler’s 1 and 2.   Without trace flag 8002 MANUAL affinity would be a one-to-one assignment:

Scheduler 1 = CPU 2
Scheduler 2 = CPU 3

With the 8002 trace flag enabled the affinity mask is treated like auto for the online schedulers.


Scheduler 1 = 0110 with preferred 0010

Scheduler 2 = 0110 with preferred 0100

 

Note: Currently the dm_os_threads, affinity column does not account for -T8002 enablement.

 

Affinity Masks

 

The affinity mask values can be viewed using the sys.dm_os_threads and sys.dm_os_nodes DMVs and altered using the ALTER SERVER CONFIGURATION statement.

 

The dm_os_threads DMV indicates the affinity of the thread.   For example, an affinity mask of 0011 indicates the thread is allowed to be scheduled on CPU 1 and 2. 

select affinity, s.scheduler_id, s.status, * from sys.dm_os_threads t

    inner join sys.dm_os_schedulers s on t.scheduler_address = s.scheduler_address

 

For online schedulers (see the section discussing offline schedulers for affinity mask behaviors):

 

-      Auto Affinity: The thread affinity should match the online scheduler mask (underneath Operating System can schedule the thread on any core of the NUMA node)

-       Manual Affinity: The thread affinity should be a single CPU.

-       NO_AFFINITY: The value should be 0 to indicate that affinity has not been set on the thread and the OS/Job controls the scheduling behavior.  (Note: Currently a bug in the DMV incorrectly reports an affinity mask value)

 

The dm_os_nodes DMV indicates the affinity set for the scheduling node.  This indicates the online and offline scheduler CPUs associated with the scheduling node.

 

select * from sys.dm_os_nodes

 

For example: An 8 CPU system has a node mask of (0x255 = 11111111).  The affinity is configured as MANAUL to use CPUs 1 and 2.  The dm_os_threads, affinity indicates masks of (Scheduler 1, 0x2 0010) and (Scheduler 2, 0x4 0100) and the other schedulers are offline in MANUAL mode.  Enabling -T8002 for the similiar configuration.

 

Hidden schedulers

 

Hidden schedulers are used for background processing such as backups and the affinity settings are treated as AUTO mode.  Any CPU for the assigned scheduling node can be used by the hidden scheduler workers.

 

Which requests are using which scheduler / CPU?

 

You can use queries like the following example to cross reference executing queries to the scheduler/CPU being used.

 

select t.affinity,s.scheduler_id, * from sys.dm_exec_requests r

       inner join sys.dm_os_workers w on w.task_address = r.task_address

       inner join sys.dm_os_threads t on t.worker_address = w.worker_address

       inner join sys.dm_os_schedulers s on s.scheduler_address = t.scheduler_address

 

 

 

Updated Feb 05, 2022
Version 2.0
No CommentsBe the first to comment