First published on MSDN on Mar 23, 2008
So far we had discussed ( http://blogs.msdn.com/sqlserverstorageengine/archive/2008/03/23/minimal-logging-changes-in-sql-server-2008-part-2.aspx ) how minimal logging changes impact when you are moving data from one table to another table. Now let us look at how does this change more conventional bulk import. As you will see, conventional bulk import takes advantage on these changes as well but these changes are only useful when importing into a btree as conventional bulk import already provides minimal logging for heaps. Here are the series of scenarios that I tried
(1) Inserting into an HEAP. No changes in this behavior
begin tran
-- this is optimized load
-- (1) show that individual rows are not logged
-- (2) show the BU lock
bulk insert t_heap
from 'C:\sql-server-test\minimal-logging\t_source-dat.dat'
with (TABLOCK)
Logging:: Minimal Logging
-- if we don't specify TABLOCK, it leads to full logging. This is same what we always had
-- (1) without TABLOCK, no minmal logging.
bulk insert t_heap
from 'C:\sql-server-test\minimal-logging\t_source-dat.dat'
LOGGING: Fully logged
(2) Insert into BTREE
a) Into empty BTREE
create table t_ci (c1 int, c2 int, c3 char(100), c4 char(1000))
go
create clustered index ci on t_ci(c1)
go
begin tran
-- this is optimized load
-- (1) show that individual rows are not logged
-- (2) Lock: X lock on the table
bulk insert t_ci
from 'C:\sql-server-test\minimal-logging\t_source-dat.dat'
with (TABLOCK, datafiletype = 'char')
-- Now, what if don't specify TABLOCK? This is a change from earlier versions. and this is also optimized load. You will need to use TF-610
-- this functionality
-- (1) we get minimal logging
-- (2) IX: at table level
-- (3) you can use this to do parallel load as long as you are
-- (4) importing into disjoint ranges
bulk insert t_ci
from 'C:\sql-server-test\minimal-logging\t_source-dat.dat'
with (datafiletype = 'char')
Here are the top 10 log records. Note, in this case, only the range lock is taken
b) Into non-empty btree (when inserting into an increasing range)
-- Inserting rows with disjoint range, we still get minimal logging
-- if you are inserting into a btree but the range you are inserting overlaps partially with the existing data,
-- the logging will go between minimal and full logging. The rule is that you will get minimal logging
-- only when you allocate a new page.
bulk insert t_ci
from 'C:\sql-server-test\minimal-logging\t_newrange-dat.dat'
with (datafiletype = 'char')
Here are the top 10 log records. Note, in this case, only the range lock is taken. Since the SQL Server does not take X lock on the table, you can do parallel bulk import with minimal logging into a btree.
c) Into a table with clustered and non-clustered indexes
create table t_ci_nci (c1 int, c2 int, c3 char(100), c4 char(1000))
go
create clustered index ci on t_ci_nci(c1)
go
create index nci on t_ci_nci(c2, c3)
go
begin tran
-- this is optimized load
-- (1) show that individual rows are not logged
-- (2) Lock: IX lock on the table
bulk insert t_ci_nci
from 'C:\sql-server-test\minimal-logging\t_source-dat.dat'
with (datafiletype = 'char')
Query Plan:: Notice NCI and CI data is being sorted
Logged Records:: For the clustered index
Logged Records:: For the non-clustered index. Note, that index key is atleast > 100 bytes, so clearly it is minimally logged
As you can see, with the minimal logging changes in SQL2008, you have lot more options to bulk import the data with minimal logging. These changes are available on SQL2008/RTM.
Thanks
Sunil
Updated Mar 23, 2019
Version 2.0Sunil_Agarwal
Microsoft
Joined March 06, 2019
SQL Server Blog
Follow this blog board to get notified when there's new activity