Lesson Learned #482: Identifying Potential Duplicate Statistics
Published Apr 01 2024 11:41 AM 1,071 Views
Microsoft

# Lesson Learned #482: Identifying Potential Duplicate Statistics

Some time ago, we encountered a support case where a customer experienced significant delays in updating auto-created and user-created statistics. I would like to share the insights gained from this experience, including the underlying causes of the issue and the potential solutions we identified and implemented to address the problem effectively.

## The Scenario

Consider the following table:

``````CREATE TABLE EmployeesStats (
EmployeeID INT PRIMARY KEY IDENTITY(1,1),
LastName VARCHAR(255),
FirstName VARCHAR(255),
Department VARCHAR(255)
);
``````

We added rows to this table and rebuilt the clustered index:

``````-- Insert sample data
BEGIN TRANSACTION;
DECLARE @i int = 0;
WHILE @i < 1000000
BEGIN
INSERT INTO EmployeesStats (LastName, FirstName, Department)
VALUES (CONCAT('LastName', @i), CONCAT('FirstName', @i), CASE WHEN @i % 5 = 0 THEN 'Engineering' ELSE 'Marketing' END);
SET @i = @i + 1;
END;
COMMIT TRANSACTION;

ALTER INDEX [PK__Employee__7AD04FF149345346] ON EmployeesStats rebuild
``````

To obtain the total number of rows filtered by the Engineering department, we executed the following T-SQL:

``````select Department, COUNT(*)
FROM EmployeesStats WHERE Department = 'Engineering'
group by Department ``````

With auto-create statistics enabled in our database, SQL Server created an auto-created statistic using the Department column for this table.

Following this, we found that the performance was not optimal, and upon reviewing the execution plan, we discovered that creating an index on the Department column would improve performance by approximately 82%. Consequently, the index was created:

``````-- Creating a nonclustered index
CREATE NONCLUSTERED INDEX [Index_Departament]
ON [dbo].[EmployeesStats] ([Department]);
``````

At this point, we observed the following:

• Two statistics were present: _WA_Sys_00000004_09C5A64A and Index_Departament.
• The execution plan would utilize and update (if needed, with auto-update statistics enabled) the Index_Departament statistics.
• The _WA_Sys_00000004_09C5A64A statistic would never be used.
• However, when our customer used sp_updatestats or another tool that required updating all auto/user-created statistics, this statistic would be recalculated, which could be time-consuming depending on the number of rows.

To address this, we advised our customer to check for potentially duplicated statistics and to review each one to determine whether it could be deleted. As an example, we provided the following T-SQL to identify all statistics for this table where a column is involved in any statistics, allowing for later review and decision-making on whether to retain or delete them.

It's important to note that manual review is often the best approach to deciding on the maintenance of statistics, as a column may be used in multiple definitions. Additionally, it's crucial to test any changes in a test environment before implementation to avoid potential performance issues:

``````WITH StatsColumns AS (
SELECT
s.object_id,
s.stats_id,
s.name AS StatisticName,
s.auto_created,
s.user_created,
c.name AS ColumnName,
sc.stats_column_id
FROM
sys.stats s
INNER JOIN
sys.stats_columns sc ON s.stats_id = sc.stats_id AND s.object_id = sc.object_id
INNER JOIN
sys.columns c ON sc.column_id = c.column_id AND sc.object_id = c.object_id
),
DuplicatedStats AS (
SELECT
sc1.object_id,
sc1.ColumnName,
STRING_AGG(sc1.StatisticName, ', ') WITHIN GROUP (ORDER BY sc1.StatisticName) AS StatisticsNames,
COUNT(*) AS NumberOfDuplicatedStats
FROM
StatsColumns sc1
GROUP BY
sc1.object_id,
sc1.ColumnName
HAVING
COUNT(*) > 1
)
SELECT
OBJECT_NAME(ds.object_id) AS TableName,
ds.ColumnName,
ds.StatisticsNames,
ds.NumberOfDuplicatedStats
FROM
DuplicatedStats ds
WHERE
ds.object_id = OBJECT_ID('EmployeesStats')
ORDER BY
TableName,
ds.ColumnName;
``````

Here's an example of the output

Enjoy!

1 Comment
Co-Authors
Version history
Last update:
‎Apr 01 2024 11:36 AM
Updated by: