In this article, I want to highlight some information and examples on how to enhance linked server queries, by using Inner Join of type Remote and OpenQuery:
Remote Join:
when you have an inner join between local and remote table (a linked server) , you can force the join operation to be on the remote server.
If you have two tables , one local small table (10 rows in my example testdest table) and one huge linked server table (sourctbl table with about 76 million rows)
on the Managed instance, I executed the select query with inner join , the left table is the small local table:
select testdest.rowid , sourcetbl.createdate from testdest inner join [mylinkedserver].mydatabase.[dbo].[sourcetbl] sourcetbl /*76915552 rows*/
on testdest.rowid = sourcetbl.rowid
and here is the execution plan:
as you see, the estimated execution plan here does not show too much about what is happening on the remote server , it shows only that remote query cost is 76%.
the execution time is more than 3.5 minutes:
SQL Server parse and compile time:
CPU time = 31 ms, elapsed time = 597 ms.
(802264 rows affected)
Table 'testdest'. Scan count 76915552, logical reads 153831104, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 203750 ms, elapsed time = 217270 ms.
I executed the same query but with Remote hint:
select testdest.rowid , sourcetbl.createdate from testdest inner remote join [mylinkedserver].mydatabase.[dbo].[sourcetbl] sourcetbl /*76915552 rows*/
on testdest.rowid = sourcetbl.rowid
the Remote Join execution plan:
the estimated execution plan shows that the query optimizer is using index scan and the remote query cost here is 97%, but the actual execution time is much lower, about 2~3 seconds:
Warning: The join order has been enforced because a local join hint is used.
SQL Server parse and compile time:
CPU time = 16 ms, elapsed time = 593 ms.
(802264 rows affected)
Table 'testdest'. Scan count 1, logical reads 2, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 937 ms, elapsed time = 2875 ms.
the idea here is that the join is performed on the remote server , and the number of network packets that are transferred between the instances back and forth are much lower.
if you compare between the inner join and remote inner join
Inner Join:
Inner Remote Join:
you will notice that this execution plan is using index scan and the first one was using seek, but here the index scan is for only 10 rows and was only once , also the rows retrieved from the remote server is only 802264 (the result set of the query).
* it is a important to create a covering index on remote server as well, rowID in our example here.
* Remote hint is working only for Inner Join not outer joins.
OpenQuery:
Here also I will give you examples of how to work with remote queries using the OPENQUERY statement.
select top 1000 example:
select top 1000 * from [mylinkedserver].mydatabase.[dbo].[sourcetbl]
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 629 ms.
(1000 rows affected)
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 1812 ms.
the execution plan:
the openquery statement:
select * from openquery([mylinkedserver], 'select top 1000 * mydatabase.[dbo].[sourcetbl]’) sourcetbl
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 146 ms.
(1000 rows affected)
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 888 ms.
openquery execution plan:
"where" example:
select * from [mylinkedserver].mydatabase.[dbo].[sourcetbl] where rowid =50
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 657 ms.
(50155 rows affected)
SQL Server Execution Times:
CPU time = 1078 ms, elapsed time = 10950 ms.
2nd try
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(50155 rows affected)
SQL Server Execution Times:
CPU time = 984 ms, elapsed time = 9002 ms.
"Where" with OpenQuery example:
select * from openquery([mylinkedserver], 'select * from mydatabase.[dbo].[sourcetbl] where rowid =50') sourcetbl
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 149 ms.
(50155 rows affected)
SQL Server Execution Times:
CPU time = 1094 ms, elapsed time = 2009 ms.
2nd try
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(50155 rows affected)
SQL Server Execution Times:
CPU time = 1094 ms, elapsed time = 2111 ms.
with Openquery, the evaluation of the query and part of the query processing “of where clause or top rows for example” will be on the remote server, put in consideration also the load on both servers and which one of them is the production and which one of them has more valuable resources.
for more information about Remote Joins and Openquery:
Remote Inner Join:
OpenQuery:
https://docs.microsoft.com/en-us/sql/t-sql/functions/openquery-transact-sql?view=sql-server-ver15
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.