Issue
An issue was brought to our attention recently where an azure SQL DB was throwing TempDB related errors although the customer felt that the TempDB usage never came close to the value published in the official Microsoft document. Here’s the error the customer had complained about:
Error
Here is a more detailed error text :
The database 'tempdb' has reached its size quota. Partition or delete data, drop indexes, or consult the documentation for possible resolutions.'. Possible failure reasons: Problems with the query, 'ResultSet' property not set correctly, parameters not set correctly, or connection not established correctly.
WorkAround/Fix:
The customer wanted to know the TempDB allocated to them. While the official documentation suggested that the SLO had 1.2 TB of Temp DB, In reality they always received a TempDB full error after 64GB.
I asked their technical team to run the following command to check the current Temp DB space allocated:
SELECT FileName = df.name,
current_file_size_MB = df.size*1.0/128,
max_size = CASE df.max_size
WHEN 0 THEN 'Autogrowth is off.'
WHEN -1 THEN 'Autogrowth is on.'
ELSE 'Log file grows to a maximum size of 2 TB.'
END,
growth_value =
CASE
WHEN df.growth = 0 THEN df.growth
WHEN df.growth > 0 AND df.is_percent_growth = 0 THEN df.growth*1.0/128.0
WHEN df.growth > 0 AND df.is_percent_growth = 1 THEN df.growth
END,
growth_increment_unit =
CASE
WHEN df.growth = 0 THEN 'Size is fixed.'
WHEN df.growth > 0 AND df.is_percent_growth = 0 THEN 'Growth value is MB.'
WHEN df.growth > 0 AND df.is_percent_growth = 1 THEN 'Growth value is a percentage.'
END
FROM tempdb.sys.database_files AS df;
GO
The Output that was shared, verified that the TempDB allocated was indeed 1.2TB as described in the public documentation against the DB SLO. Here’s the output:
Now the next step was to check what was the free space available in the TempDB at that point because per the customer, they were not having any heavy-duty jobs/queries running on the DB at that time. I asked them to execute the query below:
-- Determining the amount of free space in tempdb
SELECT SUM(unallocated_extent_page_count) AS [free pages],
(SUM(unallocated_extent_page_count)*1.0/128) AS [free space in MB]
FROM tempdb.sys.dm_db_file_space_usage;
-- Determining the amount of space used by the version store
SELECT SUM(version_store_reserved_page_count) AS [version store pages used],
(SUM(version_store_reserved_page_count)*1.0/128) AS [version store space in MB]
FROM tempdb.sys.dm_db_file_space_usage;
-- Determining the amount of space used by internal objects
SELECT SUM(internal_object_reserved_page_count) AS [internal object pages used],
(SUM(internal_object_reserved_page_count)*1.0/128) AS [internal object space in MB]
FROM tempdb.sys.dm_db_file_space_usage;
-- Determining the amount of space used by user objects
SELECT SUM(user_object_reserved_page_count) AS [user object pages used],
(SUM(user_object_reserved_page_count)*1.0/128) AS [user object space in MB]
FROM tempdb.sys.dm_db_file_space_usage;
The output shared by the team below surprised even the customer as they didn’t expect to see this in their DB. Here’s the output:
Now the next question from the team was what was occupying >95% space on their TempDB. While I pointed out a specific section from the official documentation, I had to send them a query to get some material evidence around this. Here’s the public documentation states around User object Pages Used section in the output above:
user_object_reserved_page_count - Total number of pages allocated from uniform extents for user objects in the database. Unused pages from an allocated extent are included in the count.
You can use the total_pages column in the sys.allocation_units catalog view to return the reserved page count of each allocation unit in the user object. However, note that the total_pages column includes IAM pages.
The following objects are included in the user object page counters:
- User-defined tables and indexes
- System tables and indexes
- Global temporary tables and indexes
- Local temporary tables and indexes
- Table variables
- Tables returned in the table-valued functions
After that I shared the query below to help them investigate more into their TempDB:
SELECT
OBJECT_NAME(p.object_id) AS TableName,
au.*
FROM
tempdb.sys.allocation_units au
JOIN
tempdb.sys.partitions p ON au.container_id = p.partition_id
JOIN
tempdb.sys.objects o ON p.object_id = o.object_id
WHERE
au.type_desc = 'IN_ROW_DATA'; -- Optional: Add condition based on allocation unit type
Here is the output that we received and this explained the entire scenario to the customer:
The output above indicated that the 3rd party application that was connecting to the Azure SQL DB, was creating a lot of Global Temp Objects or objects that seem to persist beyond the session lifetime inside the Temp DB. The sum of the space occupied by those objects was a little over 1 TB, thereby explaining the 64 GB Temp DB space left for the rest of the queries. The customer also suspected that a new module of the 3rd party vendor could be either creating permanent objects inside the Temp DB or objects that are different from the usual Temp objects. After the discussion, the customer started a separate conversation with their vendor to address the issue.
They however had a complete understanding of the issue by the end of the troubleshooting session.
References
SORT_IN_TEMPDB Option For Indexes - SQL Server | Microsoft Learn
tempdb database - SQL Server | Microsoft Learn
Index Disk Space Example - SQL Server | Microsoft Learn
Azure SQL DB and TEMPDB usage tracking - Microsoft Community Hub