Strange Sch-S / Sch-M Deadlock on Machines with 16 or More Schedulers

Published Jan 15 2019 02:37 PM
Microsoft
First published on MSDN on Aug 31, 2012
Since it took me several days to track down this bug, and I did learn a couple of new things along the way, I thought I would share some of my work.

16 or More CPUs

When a system presents SQL Server with 16 or more CPUs, and you are using a high-end SQL Server SKU, SQL Server enables lock partitioning. (Lock partitioning can be disabled with the startup parameter -T1229, which sets trace flag 1229.)

Lock Partitioning

Lock partitioning optimizes the locking structures by adding per-scheduler structures and actions. This design has similarities to Sub/Super Latching ( http://blogs.msdn.com/b/psssql/archive/2009/01/28/hot-it-works-sql-server-superlatch-ing-sub-latches.aspx )

As a quick overview, a query that needs a shared lock only has to acquire it on its local partition. A query that needs an exclusive lock acquires the lock on every partition, always progressing from partition 0 to n to avoid deadlocks. This lets SQL Server use the local partition when appropriate and improves scalability on larger systems.
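To make the acquisition rules concrete, here is a minimal Python sketch of a partitioned lock. It is a toy model, not SQL Server's implementation: the class name `PartitionedLock`, the method names, and the choice of 16 partitions are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field

PARTITIONS = 16  # assumption: one partition per scheduler on a 16-CPU system


@dataclass
class PartitionedLock:
    """Toy model of one lock resource split into per-scheduler partitions."""
    shared: list = field(default_factory=lambda: [set() for _ in range(PARTITIONS)])
    exclusive: list = field(default_factory=lambda: [None] * PARTITIONS)

    def acquire_shared(self, owner, local_partition):
        # A shared request only touches the caller's local partition.
        holder = self.exclusive[local_partition]
        if holder is not None and holder != owner:
            return False  # would block behind the exclusive holder
        self.shared[local_partition].add(owner)
        return True

    def acquire_exclusive(self, owner):
        # An exclusive request walks every partition in ascending order (0 to n).
        for p in range(PARTITIONS):
            if self.shared[p] - {owner} or self.exclusive[p] not in (None, owner):
                return p  # blocked at partition p; lower partitions stay held
            self.exclusive[p] = owner
        return None  # acquired on all partitions


lock = PartitionedLock()
reader_ok = lock.acquire_shared("reader", 13)   # touches partition 13 only
blocked_at = lock.acquire_exclusive("writer")   # takes 0..12, stops at 13
print(reader_ok, blocked_at)
```

In this model the shared request costs one partition, while the exclusive request pays for all of them, and the fixed 0 to n order means two exclusive requesters can never hold partitions the other one needs.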



Deadlock from Shared Lock on a Different Partition - What?

The problem I was presented with was the following deadlock output. (It was produced with trace flags 1222 and 3605, which add deadlock information to the error log. You could get similar information using the trace events.)

objectlock lockPartition=8 objid=1765581328 subresource=FULL dbid=8 objectname=Test id=lock47b821a00 mode=Sch-M associatedObjectId=1765581328

Notice the partition is 8 and the mode held is Sch-M.

owner-list

owner id=process46c276188 mode=Sch-M

The process id is the address of the task that owns the lock; it can be mapped to sys.dm_os_tasks.

waiter-list

waiter id=process47b07dc38 mode=Sch-S requestType=wait

This waiter closes the deadlock cycle: the second process is waiting for a Sch-S lock on partition 8.

Note: The waiter list is usually printed in ascending order based on how the victims will be selected, which is usually based on work investment.

objectlock lockPartition=13 objid=1765581328 subresource=FULL dbid=8 objectname=Test id=lock47b821f80 mode=Sch-S associatedObjectId=1765581328

Partition 13 shows that the process waiting on partition 8 already holds a Sch-S lock on the same object and is attempting a second acquire on partition 8.

owner-list

owner id=process47b07dc38 mode=Sch-S

Owner of the Sch-S lock.

waiter-list

waiter id=process46c276188 mode=Sch-M requestType=wait

This is the blocked process attempting to acquire the Sch-M lock. This is expected, because a Sch-M request has to acquire the lock on all partitions.
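If you want to check a similar trace yourself, the wait-for relationships can be pulled out mechanically. The following Python sketch parses the output lines shown above (commentary removed) and lists who waits for whom on which partition. The function names are hypothetical, and the parser only handles the line shapes that appear in this post, not the full trace flag 1222 format.

```python
import re

# The deadlock output from this post, without the commentary lines.
DEADLOCK_TEXT = """\
objectlock lockPartition=8 objid=1765581328 subresource=FULL dbid=8 objectname=Test id=lock47b821a00 mode=Sch-M associatedObjectId=1765581328
owner-list
owner id=process46c276188 mode=Sch-M
waiter-list
waiter id=process47b07dc38 mode=Sch-S requestType=wait
objectlock lockPartition=13 objid=1765581328 subresource=FULL dbid=8 objectname=Test id=lock47b821f80 mode=Sch-S associatedObjectId=1765581328
owner-list
owner id=process47b07dc38 mode=Sch-S
waiter-list
waiter id=process46c276188 mode=Sch-M requestType=wait
"""


def parse_resources(text):
    """Return one dict per objectlock: its partition, owners, and waiters."""
    resources = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("objectlock"):
            partition = int(re.search(r"lockPartition=(\d+)", line).group(1))
            resources.append({"partition": partition, "owners": [], "waiters": []})
        elif line.startswith("owner id="):
            resources[-1]["owners"].append(re.search(r"id=(\S+)", line).group(1))
        elif line.startswith("waiter id="):
            resources[-1]["waiters"].append(re.search(r"id=(\S+)", line).group(1))
    return resources


def wait_for_edges(resources):
    """Each waiter waits for every owner of the same partition's lock."""
    return {
        (waiter, owner, res["partition"])
        for res in resources
        for waiter in res["waiters"]
        for owner in res["owners"]
    }


resources = parse_resources(DEADLOCK_TEXT)
edges = wait_for_edges(resources)
for waiter, owner, partition in sorted(edges):
    print(f"{waiter} waits for {owner} on partition {partition}")
```

The two edges point at each other, which is the cycle. The interesting part is that process47b07dc38 appears as a Sch-S owner on partition 13 and as a Sch-S waiter on partition 8.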




Under a rare condition SQL Server may not associate the proper lock partition with the lock request, leading to additional locking overhead or possible deadlocks.   This bug does not expose any locking problems that would lead to data integrity issues. This is a very small window during compile, before a user transaction is started.

The problem is that, with lock partitioning, the Sch-S lock should be acquired only on the local partition associated with the transaction. However, the same process is attempting to acquire the Sch-S lock on two different partitions, which leads to the deadlock. Why?

  • The lock partition hint is stored with the connection object (see sys.dm_exec_sessions; to be more precise, it is the internal physical connection object).
  • SQL Server assigns new batches to one of the active schedulers on the same NUMA node based on active task load for the schedulers.


In this case the login took place on scheduler 8 and the lock partition hint was cached. When the batch was processed it was assigned to scheduler 13 and a second partition became involved, triggering the unexpected behavior.
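The sequence can be replayed in a small Python sketch. This is only a model under stated assumptions: the session names `reader` and `writer`, the 16 partitions, and the exact interleaving of the three requests are mine. The trace only shows the final state, in which Sch-S is held on partition 13 and requested on partition 8.

```python
PARTITIONS = 16  # assumption: partitions numbered 0..15

sch_s = {p: set() for p in range(PARTITIONS)}  # Sch-S holders per partition
sch_m = {p: None for p in range(PARTITIONS)}   # Sch-M holder per partition


def request_sch_s(session, partition):
    """Grant Sch-S on one partition unless another session holds Sch-M there."""
    if sch_m[partition] not in (None, session):
        return ("wait", partition, sch_m[partition])
    sch_s[partition].add(session)
    return ("granted", partition, None)


def request_sch_m(session):
    """Acquire Sch-M on partitions 0..n in order; stop at the first conflict.

    Simplification: the model assumes a single Sch-M requester.
    """
    for p in range(PARTITIONS):
        others = sch_s[p] - {session}
        if others:
            return ("wait", p, sorted(others)[0])
        sch_m[p] = session
    return ("granted", None, None)


cached_hint = 8       # partition hint cached when the login ran on scheduler 8
batch_scheduler = 13  # scheduler that later picks up the batch

step1 = request_sch_s("reader", batch_scheduler)  # Sch-S lands on partition 13
step2 = request_sch_m("writer")                   # holds 0..12, blocks at 13
step3 = request_sch_s("reader", cached_hint)      # second Sch-S, on partition 8

for step in (step1, step2, step3):
    print(step)
```

After the third request the reader waits for the writer on partition 8 while the writer waits for the reader on partition 13, which matches the two objectlock entries in the deadlock output. If both Sch-S requests had used the same partition, the writer would simply have waited and no cycle would have formed.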

Bob Dorr - Principal SQL Server Escalation Engineer
