Assistance Requested for SQL Execution Efficiency Issue

Question

I am experiencing a significant difference in execution times between two SQL queries. The first query (sql1) takes about 30 seconds to execute, while the second query (sql2) completes in just 0.1 seconds.

sql1:

SELECT
    TOP 20 ROW_NUMBER() OVER (ORDER BY User.code ASC) AS rowno,
    User.id AS ID,
    User.code AS Code,
    User.name_CHS AS Name,
    User.usergroup AS UserGroup,
    User.sysorgid AS SysOrgId,
    User.tenantid AS TenantId,
    User.seclevel AS SecLevel,
    User.usertype AS UserType,
    User.note AS Note,
    User.LastLoginTime AS LastLoginTime,
    User.OrgIdPath AS OrgIdPath
FROM
    User
WHERE
    User.id IN (
        SELECT DISTINCT makerid FROM f2022 WHERE ledger = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb'
        UNION
        SELECT DISTINCT makerid FROM f2023 WHERE ledger = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb'
    )
ORDER BY
    User.code ASC;

sql2:

SELECT
    TOP 20 ROW_NUMBER() OVER (ORDER BY User.code ASC) AS rowno,
    User.id AS ID,
    User.code AS Code,
    User.name_CHS AS Name,
    User.usergroup AS UserGroup,
    User.sysorgid AS SysOrgId,
    User.tenantid AS TenantId,
    User.seclevel AS SecLevel,
    User.usertype AS UserType,
    User.note AS Note,
    User.LastLoginTime AS LastLoginTime,
    User.OrgIdPath AS OrgIdPath
FROM
    User
WHERE
    User.id IN (
        SELECT DISTINCT makerid FROM f2022 WHERE ledger = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb'
    )
ORDER BY
    User.code ASC;

The main difference between these two queries is the inclusion of a UNION operator in the subquery of sql1, whereas sql2 does not have this. Additionally, the data volume in both f2022 and f2023 is roughly equivalent.

I am seeking assistance to understand why there is such a large discrepancy in execution times and how I might optimize the first query. Thank you in advance for your help.

lainrobertson · Answer

GooTen

Hi, Chen.

Assuming these queries are against two tables contained in the same database:

If you run the second query but change the table from [f2022] to [f2023], does it run fast or slow the first time around?

If it's slow the first time around, check that you have an index (NONCLUSTERED would almost certainly be the preferable type) on the [makerid] column of the [f2023] table.

It's pure guesswork on my part, but it sounds like your [f2022] table has an index on the [makerid] column while your [f2023] table does not.

Cheers,
Lain
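
One way to carry out the index check suggested above is to list the indexes on both tables from the catalog views. This is a minimal sketch, assuming the tables are named f2022 and f2023 as in the question and live in the current database; the column aliases are only illustrative.

```sql
-- List every index on f2022 and f2023 together with its columns.
-- Key columns have key_ordinal > 0; INCLUDE columns have is_included_column = 1.
SELECT
    t.name                AS table_name,
    i.name                AS index_name,
    i.type_desc           AS index_type,
    c.name                AS column_name,
    ic.key_ordinal        AS key_ordinal,
    ic.is_included_column AS is_included_column
FROM sys.indexes AS i
JOIN sys.tables AS t
    ON t.object_id = i.object_id
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE t.name IN (N'f2022', N'f2023')
ORDER BY t.name, i.index_id, ic.key_ordinal;
```

A table that returns no rows here has no indexes at all (it is stored as a heap), which would be consistent with plans that show only table scans.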

gooten · Answer

LainRobertson

Thank you for your response.

Even after switching the tables in the first query, the execution remains fast.

In both f2022 and f2023, there are no indexes on makerid.

I can upload screenshots of the execution plans for these queries, but they are in Chinese. I'm not sure if you'll be able to understand them.

sql1: [execution plan screenshot]

sql2: [execution plan screenshot]

lainrobertson · Answer

GooTen

Hi, Chen.

These seem to be the estimated execution plans. Are you able to include the actual execution plans instead?

Here's the button from SSMS for including actual execution plans: [screenshot]

The actual execution plan shows how much real time (in milliseconds) was spent at each stage: [screenshot]

Display an Actual Execution Plan - SQL Server | Microsoft Learn

If you can also add the following two lines to the end of each query, after the ORDER BY clause, only while you're capturing the actual execution plans (remove them again afterwards), that would be appreciated:

OPTION
	(RECOMPILE);

Also, I have to correct myself, as I wasn't paying proper attention to what I had written about the indexes.

In my previous reply, I incorrectly asked about an index on [makerid] when I'd meant to refer to the [ledger] column.

Similarly, is the [id] column in the [User] table indexed?

Your execution plans do involve a lot of table scanning; however, this may not be an issue. It just means it might be worth looking into, and the actual execution plan will help quantify that.

Cheers,
Lain
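
As a sketch of where the hint goes: OPTION (RECOMPILE) is a query-level hint, so it follows the ORDER BY clause and must come before the semicolon that ends the statement. The example below uses sql1 with the select list shortened for brevity; [User] is bracketed because USER is a reserved word in T-SQL. The SET STATISTICS lines are an optional addition that was not requested in the thread.

```sql
-- Optional: report elapsed time and logical reads on the Messages tab.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT TOP 20
    ROW_NUMBER() OVER (ORDER BY [User].code ASC) AS rowno,
    [User].id   AS ID,
    [User].code AS Code
FROM
    [User]
WHERE
    [User].id IN (
        SELECT DISTINCT makerid FROM f2022 WHERE ledger = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb'
        UNION
        SELECT DISTINCT makerid FROM f2023 WHERE ledger = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb'
    )
ORDER BY
    [User].code ASC
OPTION
    (RECOMPILE);  -- diagnostic only: remove again after capturing the plan

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

With the hint in place, the plan is compiled fresh for this run rather than reused from the plan cache, so the captured actual plan reflects the current statistics and literal values.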

olafhelper · Answer

"The main difference between these two queries is the inclusion of a UNION operator in the subquery of sql1"

A UNION operator always has to remove duplicates, which typically costs an extra sort (or hash) operation.

In your subquery used with IN, duplicates don't matter, so change it to UNION ALL to avoid that step.

See UNION (Transact-SQL) - SQL Server | Microsoft Learn

UNION ALL - Includes duplicates.
UNION - Excludes duplicates.
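
Applied to sql1, the suggestion above would look roughly like the following. This is a sketch based on the query as posted, not a tested fix. The inner DISTINCT keywords are also dropped here, since IN only tests membership and returns the same rows with or without duplicates in the subquery; [User] is bracketed because USER is a reserved word in T-SQL. Whether this removes the 30-second delay depends on what the actual execution plan shows.

```sql
SELECT TOP 20
    ROW_NUMBER() OVER (ORDER BY [User].code ASC) AS rowno,
    [User].id            AS ID,
    [User].code          AS Code,
    [User].name_CHS      AS Name,
    [User].usergroup     AS UserGroup,
    [User].sysorgid      AS SysOrgId,
    [User].tenantid      AS TenantId,
    [User].seclevel      AS SecLevel,
    [User].usertype      AS UserType,
    [User].note          AS Note,
    [User].LastLoginTime AS LastLoginTime,
    [User].OrgIdPath     AS OrgIdPath
FROM
    [User]
WHERE
    [User].id IN (
        -- UNION ALL concatenates the two result sets without de-duplicating them.
        SELECT makerid FROM f2022 WHERE ledger = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb'
        UNION ALL
        SELECT makerid FROM f2023 WHERE ledger = 'b4cb0d26-dd2f-4ae4-bb7a-861ba9dc2fbb'
    )
ORDER BY
    [User].code ASC;
```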

gooten · Answer

The first image is of the execution plan for sql1 without OPTION (RECOMPILE).

The second image is of the execution plan for sql2 without OPTION (RECOMPILE).

The third image is of the execution plan for sql1 with OPTION (RECOMPILE).

The fourth image is of the execution plan for sql2 with OPTION (RECOMPILE).

Both f2022 and f2023 have the same indexes and table structure.
