Another improved functionality in SQL Server 2008 R2
I am posting this on behalf of Juergen Thomas who has been with SQL Server PM team from 12+ years and is an expert in SAP.
Juergen>> In this article I’d like to talk about another improvement we made to SQL Server 2008 R2. The improvement pretty much is not noticed since it is buried deep into the SQL Server Engine. There also is nothing to tune about it. It is about the hash key algorithm which is used by SQL Server in its Lock Manager. So let’s step and explain what SQL Server does and what impact this change in SQL Server 2008 R2 has.
How does the locking work in SQL Server?
Unlike some other database vendors, there is a logical component to SQL Server’s Lock Manager. SQL Server uses a lockhash value to represent a lock on the lock structure in the SQL Server Lock Manager instead of using the physical description to a row, page or a table. The lockhash value then is kept in memory. This design was driven by major considerations like:
· No locking information to be stored on the page containing the resource. This eliminates additional IO or any space penalty on the page due to locking.
· Since the key to a row could be as large as 900 bytes, using the real key values would have inflicted larger memory consumption. Especially with applications running long transactions and holding hundreds of thousands of locks. Therefore one needed to seek for a possibility to have a lock value which would not exceed a few bytes and fixed in size for better memory management
The solution to this problem was found when designing SQL server 7.0 in 1996 and 1997 by using the key of the row and apply a hash algorithm to it which then results in a 6 byte long lockhash value. This value is stored as resource description. Added to this is HoBT ID (B-Tree ID). If an another row in the same B-Tree needs to be locked, the hash value for the key of the row gets calculated and then compared to the hash values already stored as granted or waiting locks in order to see whether a lock on this row already exists. This mechanism worked sufficiently well for many years
Issues appearing on the horizon
Using a hash algorithm to calculate a value out of keys does have one small disadvantage:
· Depending upon the # of rows, structure of the primary key, the data distribution and the complexity of the hashing algorithm, one can get hash collisions. For example, one calculated lockhash value can lock more than one row within a B-Tree.
How can we see what the hash key value for a lock held on a key is?
Let’s demonstrate this with an example.
CREATE TABLE test
(a VARCHAR(3) NOT NULL, b VARCHAR(8) NOT NULL, c VARCHAR(5) NOT NULL,
d integer NOT NULL)
CREATE UNIQUE CLUSTERED INDEX ucl ON test(a,b,c)
INSERT test VALUES('150','00001082','00345',1)
Now let’s perform this query:
select resource_type,resource_database_id, resource_description, resource_associated_entity_id,request_mode, request_type, request_status from sys.dm_tran_locks where resource_type = 'KEY'
The result will look like:
As resource_description you can see the value our hash key algorithm did calculate out of the input of the three key values ‘150’, ‘00001082’ and ‘00345’. The column ‘resource_associated_entity_id’ is the id of the B-Tree. It also finds usage in the system views sys.partitions and sys.allocation_units and can be used to get to the object_id or the name of the table the B-Tree belongs to.
Now let’s insert another row into the very same table with another Query Window. We’ll try to insert with this command:
INSERT test VALUES('150','00024855','00012',4)
To our surprise the insert of this row seems to be blocked and doesn’t come back (please note that the transaction of the first Query Window with the first row inserted still is open). Executing the same query against sys.dm_tran_locks gives us this result: