Part 1 - Azure SQL DB Hyperscale Table Partitioning - Best Practices & Recommendations
Published Jan 31 2023 09:37 AM 6,931 Views
Microsoft

Introduction

Implementing table partitioning on a table that is exceptionally large in Azure SQL Database Hyperscale is not trivial due to the large data movement operations involved, and potential downtime needed to accomplish them efficiently. To help customers implement partitioning on these large tables, this 2-part article goes over the details.

Part 1 of this blog aims to help consider table partitioning fundamentals alongside other database features and concepts, in preparation to implement table partitioning in an actual table which is covered in Part 2.

 

Table partitioning fundamentals

The fundamentals of table partitioning in Azure SQL Database Hyperscale is no different than in SQL Server. The only particularity is that in Hyperscale, there is only one filegroup (PRIMARY) and so the partition schemes have to map all partitions to it, using the ALL TO ('PRIMARY') syntax.

For more information on partitioning fundamentals, please refer to the following product documentation:

Create partitioned tables and indexes - SQL Server, Azure SQL Database, Azure SQL Managed Instance |...

 

Table partitioning and table maintenance

One of the benefits of implementing table partitioning is that it allows for easier index and statistics maintenance on large tables, by permitting to do them in one or a small subset of partitions. This reduces the scope for these maintenance operations, and in consequence more manageable and less resource consuming.

Many customers that use Azure SQL Database Hyperscale with tables larger than 1 TB in size may benefit from partitioning them to implement a maintenance plan tailored to a group of partitions, understanding and customizing the maintenance needs of each subgroup of partitions, instead of having to inevitably execute a maintenance plan for the table as a whole. One example is with index maintenance, when most of the data modifications are made in a reduced number of partitions. In this case you can perform index reorganize and index rebuild operations only on those partitions that really need it, thus making it more efficient by reducing the time and resources needed for this maintenance.

The recommendation is to partition your large tables (i.e., larger than 1 TB) so you can implement efficient table maintenance, reducing maintenance time and freeing up resources that can be used for a better workload performance.

 

For more information on index and statistics maintenance, please refer to: Optimize index maintenance to improve query performance and reduce resource consumption - SQL Server...

For a turnkey solution, you may refer to Microsoft MVP Ola Hallengren's SQL Server Index and Statistics Maintenance (hallengren.com) and use the PartitionLevel parameter.

 

Table partitioning and compression

One of the features that is commonly used with table partitioning is the possibility of using compressions for different sections of the partitioned table. For example, you may decide to page compress historical data in your large table while leaving the most recent data uncompressed. Another benefit of using partitioning and compression together is the possibility of implementing compression one partition at a time, reducing or spreading the amount of time and resources required, compared to having to compress the table as a whole.

Regardless of what are the benefits you are looking for, the recommendation is to implement a table partitioning strategy first, and then implement your compression strategy. Implementing partitioning first and compression second, will allow you to:

  1. Implement compression at the partition level.
  2. Design a compression strategy based on the type of data residing in each partition.
  3. Reduce risks by not implementing them at the same time.

 

Table partitioning and columnstore indexes

The concept of partitioning is the same whether a table has a clustered index, is a heap, or has a columnstore index. For columnstore clustered and columnstore non-clustered indexes, you use the ON option of the CREATE COLUMNSTORE INDEX statement, and the basic benefits mentioned in the previous fundamentals section apply here.

The recommendation in this scenario is that each partition has to contain a minimum of 1 million rows for optimal compression and performance of clustered columnstore tables. If the resulting number of rows per partition is less when you are designing your strategy, consider reducing the number of partitions to accommodate this minimum number of rows for optimal compression.

For more information on the implementation of columnstore indexes on partitioned tables, please read: Each table partition has its own rowgroups and delta rowgroups - SQL Server and Azure SQL index arch...

 

Table partitioning and tempdb

Under certain circumstances when implementing table partitioning, some data may have to be temporarily stored in tempdb as data goes through different stages of the data movement operation. This is true in particular, when sorting is required from a source non partitioned table to a destination partitioned table. Since tempdb space is limited, performing certain data movement operation sorting could result in filling the tempdb completely, resulting in a failure.

In order to avoid encountering such “tempdb full” failure scenarios, the recommendation is to implement the data copy from source table to destination partitioned table in batches. An example of how to do this is shown in section 3.2 called Partitioning into a new table structure.

 

Table partitioning and Index alignment

The concept of index alignment refers to an index that is physically partitioned in the same way as its corresponding base table. In other words, it is an index that is implemented on top of the same partition scheme as its base table. The main advantage of having all the indexes of a partitioned table aligned to it, is that it enables some partition switch operations to qualify as being metadata-only operations. This means that no actual data movement is performed inside the database, and hence the metadata-only operation is completed in a very short time, normally in a matter of seconds. This characteristic makes index alignment an attractive design choice.

If you are going to use table partitioning in a sliding window type of data management strategy that relies on partition switching, it is highly recommended that you keep all the indexes aligned to their base tables. For more information on index alignment, please read: Aligned index - Partitioned tables and indexes | Microsoft Learn

If your data management strategy does not use a sliding window strategy at all, then index alignment is optional and special considerations need to be taken into account when the table is designed to have a large number of partitions  (> 1000) as described here: Partitioned index operations – Partitioned tables and indexes | Microsoft Learn

 

Table partitioning and statistics

 

Statistics sampling for partitioned tables

When a partitioned index is created or rebuilt, the statistics are not created by default scanning all the rows in the table. Instead, the database uses the default sampling algorithm to generate statistics. This is not the case when the index is not partitioned, in which case by default all the rows are read for the statistics creation. This behavior was introduced to reduce the memory required when support for up to 15000 partitions was introduced. Due to this default sample used, the recommendation is to monitor your workload that use these statistics on partitioned indexes and determine if the statistics based on sampling is good enough or they need to be created with a sample rate higher than the default, potentially WITH FULLSCAN, based on your workload performance requirements.

For more information on this statistics update behavior change, please refer to: Behavior changes in statistics computation during partitioned index operations - Partitioned tables ...

 

Incremental Statistics

Implementing table partitioning opens the possibility of using what is called incremental statistics, which means the statistics are going to be computed at the partition level and then all the statistics of all partitions are be combined to generate the table level statistics. This introduces efficiencies to update statistics when data changes are only concentrated in one or a few partitions. We encourage you to study this feature and decide if it is appropriate for your environment and workload.

For more information please refer to the following documentation: Incremental Statistics - SQL Server | Microsoft Learn

Incremental Statistics behavior can be controlled in the following ways:

 

You can use the following query to retrieve information about the update statistics sample size from your database that can help you make informed decisions:

 

SELECT 
  sh.[name] as [schema_name], 
  so.[name] as [table_name], 
  pr.[rows] as [table_rows], 
  st.[name] as [stats_name], 
  st.[stats_id], 
  STRING_AGG(cl.[name], ', ') WITHIN GROUP (
    ORDER BY 
      st.[stats_id] ASC
  ) as [column_name], 
  st.[auto_created], 
  st.[is_incremental], 
  st.[is_temporary], 
  st.[filter_definition], 
  sp.[last_updated], 
  sp.[rows],
  sp.[rows_sampled], 
  CAST(
    (
      sp.[rows_sampled] * 1.00 / sp.[rows] * 1.00
    )* 100 as decimal(8, 2)
  ) as [pct_sampled], 
  sp.[persisted_sample_percent], 
  sp.[modification_counter] as [row_modification_counter], 
  CASE         WHEN cp.[compatibility_level] <= 120 THEN CAST(
    500 + (0.2 * sp.[rows]) AS integer
  )         WHEN cp.[compatibility_level] > 120 
  AND (
    500 + (0.2 * sp.[rows])
  ) < (
    SQRT(1000 * sp.[rows])
  ) THEN CAST(
    500 + (0.2 * sp.[rows]) AS integer
  )         ELSE CAST(
    SQRT(1000 * sp.[rows]) AS integer
  )       END [row_modification_threshold] 
FROM 
  sys.stats st   
  JOIN sys.stats_columns sc ON st.[object_id] = sc.[object_id] 
  AND st.[stats_id] = sc.[stats_id]   
  JOIN sys.columns cl ON sc.[object_id] = cl.[object_id] 
  AND sc.[column_id] = cl.[column_id]   
  JOIN sys.objects so ON so.[object_id] = st.[object_id]   
  JOIN sys.schemas sh ON so.[schema_id] = sh.[schema_id]   
  JOIN (
    SELECT 
      [object_id], 
      SUM([rows]) [rows] 
    FROM 
      sys.partitions         
    WHERE 
      [index_id] IN (0, 1) 
    GROUP BY 
      [object_id]
  ) pr ON so.[object_id] = pr.[object_id]   CROSS APPLY sys.dm_db_stats_properties(st.[object_id], st.[stats_id]) sp   CROSS 
  JOIN (
    SELECT 
      [compatibility_level] 
    FROM 
      sys.databases 
    WHERE 
      database_id = db_id()
  ) cp 
GROUP BY 
  sh.[name], 
  so.[name], 
  pr.[rows], 
  st.[name], 
  st.[stats_id], 
  st.[auto_created], 
  st.[is_incremental], 
  st.[is_temporary], 
  st.[filter_definition], 
  sp.[last_updated], 
  sp.[modification_counter], 
  sp.[rows], 
  sp.[rows_sampled], 
  sp.[persisted_sample_percent], 
  cp.[compatibility_level] 
ORDER BY 
  sh.[name], 
  so.[name], 
  st.[stats_id]

 

Table partitioning and Replication

There are certain guidelines that need to be strictly followed when partitioned tables are part of a replication topology. A set of properties that specify how partitioned tables and indexes should be replicated needs to be used and include properties for Publications, Articles, Subscriptions, etc. There are also requirements and limitations that are in play in this scenario such as what partition function and partition scheme commands are replicated, and in some cases, it is the responsibility of the database administrator to make table partitioning changes manually at the Subscribers.

For more information on replicating partitioned tables and indexes, please review the following documentation: Replicate Partitioned Tables and Indexes - SQL Server | Microsoft Learn

 

Table partitioning and Change data capture (CDC)

There are a few specific scenarios in which doing operations in a partitioned table can generate inconsistencies if the partitioned table is enabled for change data capture, and they are directly related to PARTITION SWITCH, MERGE and SPLIT. If you are partitioning your table exclusively to make table maintenance easier, and thus not relying on these three commands typically used in rolling window scenarios, then there is nothing to worry about.

For more information on table partitioning and Change data capture, please refer to [@allow_partition_switch = ]  in the following documentation: sys.sp_cdc_enable_table (Transact-SQL) - SQL Server | Microsoft Learn

 

Table partitioning and page servers

In the Azure SQL Database architecture, all the database storage is presented as a single filegroup named PRIMARY. In the case of the Hyperscale service tier, all the storage provisioned is presented as part of this single PRIMARY filegroup.

When implementing table partitioning in Hyperscale, the partitions are laid down on the PRIMARY filegroup and all the page servers with their data files are filled proportionally, meaning there is no way to influence how the data is divided among page servers/data files. Even though this may look like a limitation, Hyperscale can reach higher levels of table partitioning scalability when compared to the other Azure SQL Database service tiers.

This design choice data distribution characteristic should by no means deter you from evaluating table partitioning for your workloads and exploiting the benefits listed in this document.

 

Feedback and suggestions

If you have feedback or suggestions for improving this data migration asset, please contact the Azure Databases SQL Customer Success Engineering Team. Thank you for your support!

Co-Authors
Version history
Last update:
‎Aug 23 2023 09:42 AM
Updated by: