Part 2 - Azure SQL DB Hyperscale Table Partitioning - Best Practices & Recommendations
Published Jan 31 2023 09:40 AM
Microsoft

Introduction

Implementing table partitioning on an exceptionally large table in Azure SQL Database Hyperscale is not trivial, due to the large data movement operations involved and the potential downtime needed to accomplish them efficiently. To help customers implement partitioning on these large tables, this 2-part article goes over the details.

Part 2 of this blog presents different scenarios for implementing table partitioning on a large existing table, and aims to help you determine the best sequence of operations to accomplish the task, along with the advantages and disadvantages to consider in each scenario. This is the second of the two parts; read Part 1 first.

 

Partitioning in place

The scenario of partitioning in place consists of changing the existing table's physical storage structure from an un-partitioned table defined on a filegroup into a partitioned table defined on a partition scheme that maps a partition function to the single PRIMARY filegroup. SQL Server Management Studio (SSMS) includes a Create Partition Wizard with a series of steps that helps partition an existing table in place: right-click an un-partitioned table and select Storage -> Create Partition from the drop-down menu. The T-SQL script generated by this wizard can be executed on a Hyperscale database to partition in place, and can also be used in the next scenario to create a new partitioned table.

Partitioning in place is not generally the best option for several reasons detailed below. Consider partitioning into a new table structure after reading all the pros and cons of each scenario.

 

Pros and cons

Pros:
  • The movement of data among partitions happens in place inside the same table, making this method consume less space, as there is no duplication of tables.
Cons:
  • Data movement inside the existing table limits the options to stop and roll back the partitioning operation if desired.
  • During the data movement, necessary locks are held on the pages of the table that could block other activities that need to occur in the table concurrently, making it difficult to do online. The degree of interference with other online activities has to be tested, which makes the endeavor more costly and time-consuming.
  • The resulting in-place partitioned table can end up with a high level of fragmentation, which will have to be addressed separately.
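If needed, that fragmentation can later be addressed with an index rebuild over the same partition scheme; a minimal sketch for the Person table used below (ONLINE = ON is supported in Azure SQL Database):

```sql
-- Rebuild all partitions of the clustered index; ONLINE = ON keeps the table available
ALTER INDEX [PK_Person_BusinessEntityID] ON [Person]
REBUILD PARTITION = ALL WITH (ONLINE = ON);
```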

Step by step

The general sequence of steps to partition a clustered table in place is:

  1. Create the partition function.
  2. Create the partition scheme.
  3. Drop all FK table constraints.
  4. Drop all table indexes.
  5. Drop clustered index if it exists.
  6. Create clustered index on partition scheme.
  7. Create all previous FK table constraints.
  8. Create all previous indexes aligned to the table using the partition scheme.

The following is a simple example of partitioning the Person table below by the BusinessEntityID column in place:

 

-- Table Person
CREATE TABLE [Person](
	[BusinessEntityID] [int] NOT NULL,
	[PersonType] [nchar](2) NOT NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NOT NULL,
	[Suffix] [nvarchar](10) NULL,
	[EmailPromotion] [int] NOT NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL,
	[ModifiedDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Person_BusinessEntityID] PRIMARY KEY CLUSTERED 
(	[BusinessEntityID] ASC ) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Person_LastName_FirstName_MiddleName] ON [Person]
(
	[LastName] ASC,
	[FirstName] ASC,
	[MiddleName] ASC
) ON [PRIMARY]

-- Insert data
DECLARE @i int = 1
WHILE @i<20000
BEGIN
	INSERT INTO Person values (@i,'A','Sr','John','William','Smith','PhD',34,NEWID(),GETDATE())
	SELECT @i = @i+1
END

-- Partitioning of Person
BEGIN TRANSACTION

CREATE PARTITION FUNCTION [PF](int) AS RANGE LEFT FOR VALUES (N'5000', N'10000', N'15000')
CREATE PARTITION SCHEME [PS] AS PARTITION [PF] ALL TO ([PRIMARY])
DROP INDEX [IX_Person_LastName_FirstName_MiddleName] ON [Person]

ALTER TABLE [Person] DROP CONSTRAINT [PK_Person_BusinessEntityID] WITH ( ONLINE = OFF )

ALTER TABLE [Person] ADD  CONSTRAINT [PK_Person_BusinessEntityID] PRIMARY KEY CLUSTERED 
(
	[BusinessEntityID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PS]([BusinessEntityID])

CREATE NONCLUSTERED INDEX [IX_Person_LastName_FirstName_MiddleName] ON [Person]
(
	[LastName] ASC,
	[FirstName] ASC,
	[MiddleName] ASC,
	[BusinessEntityID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PS]([BusinessEntityID])
GO
COMMIT TRANSACTION

 

After you finish executing each code example, you can run the following query to obtain information about how the tables are partitioned:

 

SELECT object_name(so.object_id) AS ObjectName,
total_pages / 128. AS SpaceUsed_MB,
p.partition_id,
p.object_id,
p.index_id,
p.partition_number,
p.rows,
p.data_compression_desc
FROM sys.partitions AS p
JOIN sys.allocation_units AS au ON p.partition_id = au.container_id
JOIN sys.objects as so ON p.object_id = so.object_id
WHERE so.type = 'U'
	AND object_name(so.object_id) LIKE '%Person%'
ORDER BY ObjectName;

 

Partitioning into a new table

The scenario of partitioning into a new table structure consists of copying all the rows from the original monolithic clustered table into a completely new table that from the get-go has been created as partitioned. A simple table rename is used at the end to formally promote the partitioned table as the final authoritative table.

 

It is important to determine whether this partitioning process can be done entirely inside a maintenance window, as the process becomes somewhat simpler if that is the case. A good way to estimate, in minutes, how long the process would take during a maintenance window (assuming no other workloads are running in the database) is to add the size of the monolithic table plus all its indexes in GB and divide it by 4 GB/min:

 

Table + all indexes size in GB / (4 GB/min) = approximate time in minutes

 

This divisor is obtained by taking the maximum log throughput in SQL DB Hyperscale of 100 MB/s (as of December 2022) and applying an arbitrary 33% reduction factor. Note that if the service tier used by your Hyperscale database provides a log throughput lower than 100 MB/s, the 4 GB/min divisor has to be reduced proportionally, which increases the estimated time.
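As a worked example of the arithmetic (the table and index sizes below are purely illustrative):

```sql
-- Hypothetical sizes: 1,200 GB table + 300 GB of indexes
DECLARE @table_size_gb decimal(10,2) = 1200;
DECLARE @index_size_gb decimal(10,2) = 300;
DECLARE @throughput_gb_min decimal(10,2) = 4; -- ~100 MB/s log rate minus ~33%

SELECT (@table_size_gb + @index_size_gb) / @throughput_gb_min AS EstimatedMinutes;
-- (1200 + 300) / 4 = 375 minutes, about 6.25 hours
```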

The actual time to complete the data copy will vary and has to be validated in an equally sized test environment to increase confidence of the actual time it will take in production.

During the data copy process, it is important to insert the rows into the new partitioned table in cluster key order, to increase performance by taking advantage of read-ahead optimizations.

 

Partitioning in stages if a single stage cannot fit inside a maintenance window

If the partitioning process is too long to fit in a maintenance window, the recommendation is to implement it in two stages:

  • First stage - Copy historical data during business hours: The assumption here is this historical data is not currently used in everyday transactions and copying it during business hours will not cause disruptions to current day business transactions.
  • Second stage - Copy current data during a short maintenance window: You will start a maintenance window by stopping the workload and copying the most recent data to reach 100% identical data between source and destination tables.

To distinguish data that is considered historical from data that is considered current, a column is required that indicates the last time each row was updated. For this purpose, and throughout this document, a column called ModifiedDate is used in the Person table. This column will be used to implement the first and second stages above. While this covers most customer scenarios, you may need a different way to divide the data between historical and current.

 

Pros and cons

Pros:
  • This option keeps the original table in place so it can be used by the workload, while the copy of historical data is happening in parallel.
  • As the original table stays in its original state, it is possible to stop and continue later or roll back altogether the data copy, making the whole partitioning operation more manageable.
  • The new partitioned table can be used for testing then dropped and created again with a different partitioning configuration to assist in finding the best partitioning strategy for your table and workload.
Cons:
  • Some downtime is required for the final data synchronization and rename. This downtime is likely to be shorter than the downtime required for the in-place option; its duration will depend on the strategy and speed used to copy data from the original table into the new partitioned table, and on how large a final synchronization is needed before the table is renamed.
  • Higher complexity in the data synchronization compared with in-place partitioning.

 

Step by step

The general sequence of steps to partition into a new table structure is:

  1. Create the partition function.
  2. Create the partition scheme.
  3. Create the new partitioned table on the partition scheme.
  4. Copy all rows from the original table into the new partitioned table using an INSERT SELECT statement in batches, following the cluster key order (the example below uses batches of 10 million rows).
  5. Create the same indexes present in the original table into the new partitioned table.
    1. Decide what indexes must be aligned or not aligned to the base partitioned table.
  6. Create the same FK constraints in the original table into the new partitioned table.
  7. Stop the workload that utilizes the original table.
  8. Do a final data synchronization between original table and new partitioned table using a MERGE statement.
    1. If you can afford a longer downtime, you can stop the workload after step #3 above and copy all the rows entirely using the INSERT SELECT without the need of a final sync with a MERGE statement, which simplifies the job but increases the downtime required.
  9. Validate the data, indexes and constraints are identical in both tables.
  10. Do the table rename so the original table takes a new name, and the new partitioned table is renamed with the original table name.

The following is a simple example of partitioning the Person table below by the BusinessEntityID column into a new table during a downtime window, in batches of 10 million rows:

 

-- Table Person
CREATE TABLE [Person](
	[BusinessEntityID] [int] NOT NULL,
	[PersonType] [nchar](2) NOT NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NOT NULL,
	[Suffix] [nvarchar](10) NULL,
	[EmailPromotion] [int] NOT NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL,
	[ModifiedDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Person_BusinessEntityID] PRIMARY KEY CLUSTERED 
(	[BusinessEntityID] ASC ) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Person_LastName_FirstName_MiddleName] ON [Person]
(
	[LastName] ASC,
	[FirstName] ASC,
	[MiddleName] ASC
) ON [PRIMARY]

-- Insert data
DECLARE @i int = 1
WHILE @i<20000
BEGIN
	INSERT INTO Person values (@i,'A','Sr','John','William','Smith','PhD',34,NEWID(),GETDATE())
	SELECT @i = @i+1
END

-- Partitioning of Person
CREATE PARTITION FUNCTION [PF](int) AS RANGE LEFT FOR VALUES (N'5000', N'10000', N'15000')

CREATE PARTITION SCHEME [PS] AS PARTITION [PF] ALL TO ([PRIMARY])

-- Person_partitioned table creation
CREATE TABLE [Person_partitioned](
	[BusinessEntityID] [int] NOT NULL,
	[PersonType] [nchar](2) NOT NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NOT NULL,
	[Suffix] [nvarchar](10) NULL,
	[EmailPromotion] [int] NOT NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL,
	[ModifiedDate] [datetime] NOT NULL,
CONSTRAINT [PK_Person_partitioned_BusinessEntityID] PRIMARY KEY CLUSTERED 
(
	[BusinessEntityID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PS] ([BusinessEntityID])
) ON [PS] ([BusinessEntityID])
GO

-- Rows copy block
SET NOCOUNT ON;
CREATE TABLE #pk_tracker (pk bigint);
CREATE INDEX ix_pk on #pk_tracker (pk);
 
DECLARE @batch_maximum_counter bigint;
DECLARE @maximum_counter bigint;
DECLARE @message nvarchar(max);
 
SET @batch_maximum_counter = 0;
SELECT @maximum_counter = MAX(BusinessEntityID) from [Person];
SET @message = CAST(CONVERT(VARCHAR(26),GETDATE(),109) as varchar) + ': starting first batch';
RAISERROR (@message, 0, 1) WITH NOWAIT

WHILE (@batch_maximum_counter <= @maximum_counter)
BEGIN
       INSERT INTO #pk_tracker
       SELECT TOP 10000000 BusinessEntityID
       FROM [Person]
       WHERE BusinessEntityID > @batch_maximum_counter
       ORDER BY BusinessEntityID ASC;

       -- Exit the loop when there are no more rows to copy
       IF @@ROWCOUNT = 0 BREAK;
 
       INSERT INTO [Person_partitioned]
       SELECT * FROM [Person]
       WHERE BusinessEntityID IN (SELECT * FROM #pk_tracker);
 
       SELECT @batch_maximum_counter = max(pk) from #pk_tracker;
 
       SET @message = CAST(CONVERT(VARCHAR(26),GETDATE(),109) as varchar) + ': maximum pk from last batch is = ' + CAST(@batch_maximum_counter as varchar);
       RAISERROR (@message, 0, 1) WITH NOWAIT
 
       TRUNCATE TABLE #pk_tracker;
END

-- NC index creation
CREATE NONCLUSTERED INDEX [IX_Person_partitioned_LastName_FirstName_MiddleName] ON [Person_partitioned]
(
	[LastName] ASC,
	[FirstName] ASC,
	[MiddleName] ASC,
	[BusinessEntityID]
) ON [PS] ([BusinessEntityID])

-- Simple rowcount validation and rename
DECLARE @count1 bigint;
DECLARE @count2 bigint;
SELECT @count1 = COUNT(*) FROM Person;
SELECT @count2 = COUNT(*) FROM Person_partitioned;

IF (@count1 = @count2)
	BEGIN
		PRINT 'Row count validate passed, renaming tables';
		EXEC sp_rename 'Person', 'Person_original';
		EXEC sp_rename 'Person_partitioned', 'Person';
		PRINT 'Tables renamed';
	END
ELSE 
	PRINT 'Row count validate failed, count on the source table is ' + CAST(@count1 as varchar) + ' and count on destination partitioned table is ' + CAST(@count2 as varchar);

 

Copy historical data during business hours

One advantage of partitioning into a new table is that if there is historical data, or data that is seldom updated, in the original table, it is easier to copy it into the partitioned table during business hours, potentially at a low speed over many days, as the probability of interference and blocking with the normal workload is minimal. If additional precautions are needed, the following options can be explored:

  • Use the NOLOCK hint to reduce locking and blocking on the original table.
  • Reduce the batch size and add a WAITFOR DELAY of a few seconds or minutes at the end of the WHILE loop, to reduce how aggressively the T-SQL code copies data into the partitioned table.
  • Reduce the SELECT TOP batch size from 10,000,000 to a smaller value.
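For example, the rows copy block from earlier could be throttled along these lines (the 100,000-row batch size and 5-second delay are illustrative values to be tuned for your environment):

```sql
WHILE (@batch_maximum_counter <= @maximum_counter)
BEGIN
       INSERT INTO #pk_tracker
       SELECT TOP 100000 BusinessEntityID        -- smaller batch size
       FROM [Person] WITH (NOLOCK)               -- reduce locking on the source table
       WHERE BusinessEntityID > @batch_maximum_counter
       ORDER BY BusinessEntityID ASC;

       -- Exit the loop when there are no more rows to copy
       IF @@ROWCOUNT = 0 BREAK;

       INSERT INTO [Person_partitioned]
       SELECT * FROM [Person] WITH (NOLOCK)
       WHERE BusinessEntityID IN (SELECT * FROM #pk_tracker);

       SELECT @batch_maximum_counter = MAX(pk) FROM #pk_tracker;
       TRUNCATE TABLE #pk_tracker;

       WAITFOR DELAY '00:00:05';                 -- pause between batches to reduce pressure
END
```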

The common way to copy historical and seldom-updated data from the original table into the new partitioned table is to filter, in the INSERT SELECT statement, the rows that are considered historical or seldom updated, using an appropriate date column of your data model. For example, the following INSERT SELECT statement filters rows that have not been modified in the last 30 days by evaluating the ModifiedDate column:

 

INSERT INTO #pk_tracker
SELECT TOP 10000000 BusinessEntityID
FROM [Person]
WHERE BusinessEntityID > @batch_maximum_counter
AND ModifiedDate < GETDATE() - 30
ORDER BY BusinessEntityID ASC;

 

Perform a final data synchronization

Once most of the rows have been copied during business hours using a method like the one described above, the next step is to stop the workload that uses the source monolithic table and perform the final data synchronization between source and destination. This step copies the remaining rows that were not copied before (in the historical data copy) and updates the rows that have been modified since they were copied, once more by evaluating the ModifiedDate column. A MERGE statement can be used for this step:

 

MERGE INTO [Person_partitioned] AS tgt  
USING [Person] as src
ON tgt.BusinessEntityID = src.BusinessEntityID
WHEN MATCHED AND tgt.[ModifiedDate] <> src.[ModifiedDate] -- UPDATE row if there has been a modification
	THEN UPDATE SET [PersonType] = src.[PersonType],
		[Title] = src.[Title],
		[FirstName] = src.[FirstName],
		[MiddleName] = src.[MiddleName],
		[LastName] = src.[LastName],
		[Suffix] =src.[Suffix],
		[EmailPromotion] = src.[EmailPromotion],
		[rowguid] = src.[rowguid],
		[ModifiedDate] = src.[ModifiedDate]
WHEN NOT MATCHED BY TARGET -- INSERT if the row has never been copied before
	THEN INSERT ([BusinessEntityID], [PersonType], [Title], [FirstName], [MiddleName], [LastName], [Suffix], [EmailPromotion], [rowguid], [ModifiedDate]) VALUES (src.[BusinessEntityID], src.[PersonType], src.[Title], src.[FirstName], src.[MiddleName], src.[LastName], src.[Suffix], src.[EmailPromotion], src.[rowguid], src.[ModifiedDate]);

 

The MERGE statement above will be best served by an index on (BusinessEntityID, ModifiedDate) on both source (Person) and target tables (Person_partitioned).
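For example, such a supporting index could be created as follows (the index name is illustrative):

```sql
-- Supports the MERGE join on BusinessEntityID and the ModifiedDate comparison
CREATE NONCLUSTERED INDEX [IX_Person_BusinessEntityID_ModifiedDate]
ON [Person] ([BusinessEntityID] ASC, [ModifiedDate] ASC);
```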

There are times when too many rows participate in the inserts, updates, or deletes performed by the MERGE statement above, and tempdb could fill up completely, at which point the database returns an error similar to “The database 'tempdb' has reached its size quota”. If you run into this issue, perform these operations with a script similar to the one discussed above, using a pk_tracker table to cycle through all the rows and perform the deletes, updates, and inserts in three separate blocks of code. In other words, you do the same underlying merge, but with individual DELETE, UPDATE, and INSERT statements instead of a MERGE statement.
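A minimal sketch of that alternative is below; for brevity it shows the set-based form with only two columns in the UPDATE (use the full column list, and the #pk_tracker batching pattern from earlier, for large tables):

```sql
-- DELETE rows that no longer exist in the source
DELETE tgt
FROM [Person_partitioned] AS tgt
WHERE NOT EXISTS (SELECT 1 FROM [Person] AS src
                  WHERE src.BusinessEntityID = tgt.BusinessEntityID);

-- UPDATE rows modified since they were copied (remaining columns omitted for brevity)
UPDATE tgt
SET tgt.[LastName] = src.[LastName],
    tgt.[ModifiedDate] = src.[ModifiedDate]
FROM [Person_partitioned] AS tgt
INNER JOIN [Person] AS src ON src.BusinessEntityID = tgt.BusinessEntityID
WHERE tgt.[ModifiedDate] <> src.[ModifiedDate];

-- INSERT rows never copied before
INSERT INTO [Person_partitioned]
SELECT src.*
FROM [Person] AS src
WHERE NOT EXISTS (SELECT 1 FROM [Person_partitioned] AS tgt
                  WHERE tgt.BusinessEntityID = src.BusinessEntityID);
```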

 
Switch table names

After all data copying and validation is done, and before enabling your workload again, perform the final table rename to make the partitioned table the formal table the workload will use moving forward.

 

EXEC sp_rename 'Person', 'Person_original';
EXEC sp_rename 'Person_partitioned', 'Person';

 

Partitioning heap tables

This section explains how to partition in place a table that is implemented as a heap. The main difference from partitioning a clustered table is that for a heap, a clustered index has to be created over a partition scheme to actually shuffle the rows among the partitions, and subsequently dropped so the table becomes a heap again. Note that partitioning occurs at the time of the clustered index creation; the table then reverts to a heap while maintaining the underlying partitioning.

The following is a simple example of partitioning the Person_heap table below by the BusinessEntityID column in place:

 

-- Table Person_heap
CREATE TABLE [Person_heap](
	[BusinessEntityID] [int] NOT NULL,
	[PersonType] [nchar](2) NOT NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NOT NULL,
	[Suffix] [nvarchar](10) NULL,
	[EmailPromotion] [int] NOT NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
	[ModifiedDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Person_heap_BusinessEntityID] PRIMARY KEY NONCLUSTERED 
(
	[BusinessEntityID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Person_heap_LastName_FirstName_MiddleName] ON [Person_heap]
(
	[LastName] ASC,
	[FirstName] ASC,
	[MiddleName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

-- Insert data
DECLARE @i int = 1
WHILE @i<20000
BEGIN
	INSERT INTO Person_heap values (@i,'A','Sr','John','William','Smith','PhD',34,NEWID(),GETDATE())
	SELECT @i = @i+1
END

-- Partitioning of Person_heap
BEGIN TRANSACTION

CREATE PARTITION FUNCTION [PF](int) AS RANGE LEFT FOR VALUES (N'5000', N'10000', N'15000')

CREATE PARTITION SCHEME [PS] AS PARTITION [PF] ALL TO ([PRIMARY])

CREATE CLUSTERED INDEX [ClusteredIndex_on_PS] ON [Person_heap]
(
	[BusinessEntityID]
)WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PS]([BusinessEntityID])

DROP INDEX [ClusteredIndex_on_PS] ON [Person_heap]

COMMIT TRANSACTION

 

Note that IX_Person_heap_LastName_FirstName_MiddleName is not aligned in this example.
To partition a source heap table into a new partitioned heap table, follow the step-by-step sequence earlier in this blog, but when you create the Person_partitioned table in step #3, make sure you create it with a non-clustered PK index such as the one below:

 

CREATE TABLE [Person_partitioned](
[BusinessEntityID] [int] NOT NULL,
	[PersonType] [nchar](2) NOT NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NOT NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NOT NULL,
	[Suffix] [nvarchar](10) NULL,
	[EmailPromotion] [int] NOT NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL,
	[ModifiedDate] [datetime] NOT NULL,
CONSTRAINT [PK_Person_partitioned_BusinessEntityID] PRIMARY KEY NONCLUSTERED
(
[BusinessEntityID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PS] ([BusinessEntityID])
) ON [PS] ([BusinessEntityID])
GO

 

Sliding window table partitioning scenario

Another benefit of implementing table partitioning is that it permits implementing a sliding window data management scenario, in which the same number of partitions is kept over time: as new time periods start, new partitions are created and brought into the table while the older partitions are evicted. For example, in a scenario where 5 years of data is retained using monthly partitions, every month a new partition is created by splitting the head of the partition range, while the oldest month from 5 years ago is emptied for archival and its boundary merged out of the table.

Below we will briefly cover the two operations to accomplish this scenario.

 

SPLIT and MERGE partition

You can change the way a table or index is partitioned by splitting an existing partition into two, or by merging two partitions into one. It is recommended to always keep empty partitions at both ends of the partition range to guarantee that these operations will not incur any data movement: the partition split occurs at the head of the time range and the partition merge occurs at its tail. Avoid splitting or merging populated partitions, as doing so can be inefficient: the operation may incur actual data movement, generate as much as four times more transaction log, and cause severe locking.

Here are some examples using the partition function we have used for the Person table.

SPLIT

For this example, we are going to use the PF function that was defined earlier in this document for the Person table:

 

CREATE PARTITION FUNCTION [PF](int) AS RANGE LEFT FOR VALUES (5000, 10000, 15000)
GO

 

By running the following ALTER statement we split the partition covering the range from 10000 to 15000 into two partitions, resulting in the boundary values (5000, 10000, 12000, 15000):

 

ALTER PARTITION FUNCTION PF ()  
SPLIT RANGE (12000);

 

MERGE

Similar to the split example above, this MERGE example merges two partitions into one:

 

CREATE PARTITION FUNCTION [PF](int) AS RANGE LEFT FOR VALUES (5000, 10000, 12000, 15000)
GO

 

By running the following alter statement we go back to having the ranges (5000, 10000, 15000):

 

ALTER PARTITION FUNCTION PF ()  
MERGE RANGE (12000);
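Putting the two operations together, one cycle of a monthly sliding window might look like the following sketch. The PF_Monthly/PS_Monthly names, boundary dates, table name, and partition number are all hypothetical, and the sketch assumes a RANGE RIGHT function on a date column with empty partitions kept at both ends of the range:

```sql
-- 1. Mark the filegroup to use for the new partition (PRIMARY in Hyperscale)
ALTER PARTITION SCHEME PS_Monthly NEXT USED [PRIMARY];

-- 2. Split the empty partition at the head of the range to make room for the new month
ALTER PARTITION FUNCTION PF_Monthly()
SPLIT RANGE ('2023-02-01');

-- 3. Purge (or archive) the oldest month, then remove its now-empty boundary
TRUNCATE TABLE Orders WITH (PARTITIONS (2)); -- oldest populated partition
ALTER PARTITION FUNCTION PF_Monthly()
MERGE RANGE ('2018-02-01');
```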

 

For more information on these two partition operations, please refer to the following documentation: ALTER PARTITION FUNCTION (Transact-SQL) - SQL Server | Microsoft Learn

 

Truncate table partitions

The TRUNCATE TABLE statement allows you to truncate a specific partition or set of partitions of a table, and enjoys the same advantages as a traditional TRUNCATE TABLE. The partitions are selected by specifying the WITH (PARTITIONS (...)) option with the partition or set of partitions to operate on. TRUNCATE TABLE is the preferred method for purging all the data from a partitioned table, as using the DELETE statement would incur more overhead due to additional logging and the need to rebuild any non-aligned indexes. To truncate partitions of a partitioned table, the table and all its indexes must be aligned (partitioned on the same partition function).

By running the following TRUNCATE TABLE statement we truncate all rows from partition 1 and from partitions 3 through 4:

 

TRUNCATE TABLE Person   
WITH (PARTITIONS (1, 3 TO 4))
GO

 

For more information on TRUNCATE TABLE, please refer to the following documentation: TRUNCATE TABLE (Transact-SQL) - SQL Server | Microsoft Learn

The following query is useful to identify what partitions are available to split, merge or truncate:

 

SELECT OBJECT_SCHEMA_NAME(pstats.object_id) AS SchemaName,
       OBJECT_NAME(pstats.object_id) AS TableName,
       ps.name AS PartitionSchemeName,
       ds.name AS PartitionFilegroupName,
       pf.name AS PartitionFunctionName,
       CASE pf.boundary_value_on_right
           WHEN 0 THEN 'Range Left'
           ELSE 'Range Right'
       END AS PartitionFunctionRange,
       CASE pf.boundary_value_on_right
           WHEN 0 THEN 'Upper Boundary'
           ELSE 'Lower Boundary'
       END AS PartitionBoundary,
       prv.value AS PartitionBoundaryValue,
       c.name AS PartitionKey,
       CASE
           WHEN pf.boundary_value_on_right = 0
               THEN c.name + ' > ' + CAST(ISNULL(LAG(prv.value) OVER (PARTITION BY pstats.object_id ORDER BY pstats.object_id, pstats.partition_number), 'Infinity') AS VARCHAR(100))
                    + ' and ' + c.name + ' <= ' + CAST(ISNULL(prv.value, 'Infinity') AS VARCHAR(100))
           ELSE c.name + ' >= ' + CAST(ISNULL(prv.value, 'Infinity') AS VARCHAR(100))
                + ' and ' + c.name + ' < ' + CAST(ISNULL(LEAD(prv.value) OVER (PARTITION BY pstats.object_id ORDER BY pstats.object_id, pstats.partition_number), 'Infinity') AS VARCHAR(100))
       END AS PartitionRange,
       pstats.partition_number AS PartitionNumber,
       pstats.row_count AS PartitionRowCount,
       p.data_compression_desc AS DataCompression
FROM sys.dm_db_partition_stats AS pstats
	INNER JOIN sys.partitions AS p ON pstats.partition_id = p.partition_id
	INNER JOIN sys.destination_data_spaces AS dds ON pstats.partition_number=dds.destination_id
	INNER JOIN sys.data_spaces AS ds ON dds.data_space_id = ds.data_space_id
	INNER JOIN sys.partition_schemes AS ps ON dds.partition_scheme_id = ps.data_space_id
	INNER JOIN sys.partition_functions AS pf ON ps.function_id = pf.function_id
	INNER JOIN sys.indexes AS i ON pstats.object_id = i.object_id
	AND pstats.index_id = i.index_id
	AND dds.partition_scheme_id = i.data_space_id
	AND i.type <= 1 /* Heap or Clustered Index */
	INNER JOIN sys.index_columns AS ic ON i.index_id = ic.index_id
	AND i.object_id = ic.object_id
	AND ic.partition_ordinal > 0
	INNER JOIN sys.columns AS c ON pstats.object_id = c.object_id
	AND ic.column_id = c.column_id
	LEFT JOIN sys.partition_range_values AS prv ON pf.function_id = prv.function_id
	AND pstats.partition_number = (CASE pf.boundary_value_on_right
					  WHEN 0 THEN prv.boundary_id
					  ELSE (prv.boundary_id+1) END)
ORDER BY TableName,
         PartitionNumber;
GO

 

Example output:

[Screenshot: example output of the partition information query]

 

Feedback and suggestions

If you have feedback or suggestions for improving this data migration asset, please contact the Azure Databases SQL Customer Success Engineering Team. Thank you for your support!

Last update: Aug 23 2023