Forum Discussion
Assistance Requested for SQL Execution Efficiency Issue
The main difference between these two queries is the inclusion of a UNION operator in the subquery of sql1,
A UNION operator performs always a sort operator to remove duplicates.
In you subquery + EXISTS duplicates don't matter, so change it to UNION ALL to avoid a sort.
See UNION (Transact-SQL) - SQL Server | Microsoft Learn
- UNION ALL - Includes duplicates.
- UNION - Excludes duplicates.
- GooTenJan 19, 2024Copper ContributorI have tried using UNION ALL, as well as changing the query to use EXISTS or JOIN, but the results were the same. I have added screenshots of the execution plans above. Could you please take a look and help me understand?
- LainRobertsonJan 19, 2024Silver Contributor
Hi, Chen.
The actual execution plans confirm that you should have a NONCLUSTERED index on the [ledger] column - in both tables. Half the long-running time is spent running the table scans on the two GLAccount tables.
Unfortunately, while we can see that you have some indexes on each table, we can't see what columns are included in those index definitions. However, it seems safe to say that none of them apply to the [ledger] column given it features as a suggestion - the green text near the top - in the second execution plan.
So, there's two things I'd be looking to do here:
- Create a new NONCLUSTERED index on the 2022 and 2023 tables;
- Re-work the query to use JOINs (not required but my personal preference for this scenario).
Index creation
CREATE NONCLUSTERED INDEX IX_admin_F2022_ledger ON F2022 (ledger) INCLUDE (makerid); GO CREATE NONCLUSTERED INDEX IX_admin_F2023_ledger ON F2023 (ledger) INCLUDE (makerid); GOJOIN-based query
SELECT DISTINCT [User].[id] AS [ID] , [User].[code] AS [Code] , [User].[name_CHS] AS [Name] , [User].[usergroup] AS [UserGroup] , [User].[sysorgid] AS [SysOrgId] , [User].[tenantid] AS [TenantId] , [User].[seclevel] AS [SecLevel] , [User].[usertype] AS [UserType] , [User].[note] AS [Note] , [User].[LastLoginTime] AS [LastLoginTime] , [User].[OrgIdPath] AS [OrgIdPath] FROM [User] LEFT OUTER JOIN [F2022] ON [User].[id] = [F2022].makerid LEFT OUTER JOIN [F2023] ON [User].[id] = [F2023].makerid WHERE ( [F2022].[ledger] = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb' AND [F2022].[makerid] IS NOT NULL ) OR ( [F2023].[ledger] = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb' AND [F2023].[makerid] IS NOT NULL ) ORDER BY [User].[id] OPTION (RECOMPILE);Note: I didn't include the TOP 20 as I'm curious how it performs across the whole data set, however, you can easily add the "TOP 20" statement immediately below the "DISTINCT" statement if you like.
Once you've created those indexes, you can see how your original queries perform, and optionally how the suggested example I've supplied above performs (I've left the OPTION (RECOMPILE) for testing purposes).
Cheers,
Lain