Blog Post

Azure Database Support Blog
1 MIN READ

Lesson Learned #350: Improve bulk insert process in a ColumnStore Index Table

Jose_Manuel_Jurado's avatar
May 15, 2023

We used to receive questions from customers that they asked what is the best batch size when we are using bulk insert process against a column store index table.

 

In this post I would like to share with you a recomendation about it. 

 

The recomendation about in columnstore index is more than 102400 rows as you could see in this article - Columnstore indexes: Overview - SQL Server | Microsoft Learn

 

When you are adding less than this number the information will go to the DeltaStore and after it will be updated the to the ColumnStore Index partitions using a backgroung process. 

 

However, using more than this number 102,400 this information will saved directly to ColumnStore RowGroups affected. If an insert operation contains more than 1,048,576 rows, it will be subdivided into small batches of 1048576 and the remaining less than 102,400 will be saved on Delta Store. 

 

DROP TABLE IF EXISTS Fact.[SaleColumnStoreIndexBulkInsert]
CREATE TABLE [Fact].[SaleColumnStoreIndexBulkInsert](
	[Sale Key] [bigint] IDENTITY(1,1) NOT NULL,
	[City Key] [int] NOT NULL,
	[Customer Key] [int] NOT NULL,
	[Quantity] [int] NOT NULL,
	 index [SaleColumnStoreIndexBulkInsert_CI] CLUSTERED COLUMNSTORE
) 

INSERT INTO [Fact].[SaleColumnStoreIndexBulkInsert]
           ([City Key]
           ,[Customer Key]
           ,[Quantity])
     SELECT TOP 102000
           [City Key]
           ,[Customer Key]
           ,[Quantity]
		   FROM [FACT].[SaleColumnStoreIndex]

 

Using less the number of 102,400 you could see how the data is going to directly to DeltaStore. 

 

select * from sys.column_store_row_groups where not delta_store_hobt_id is null
select * from sys.internal_partitions

 

 

Enjoy!

 

Published May 15, 2023
Version 1.0
No CommentsBe the first to comment