Replicated tables now in preview for Azure SQL Data Warehouse

Community Manager

SQL Data Warehouse is a fully managed, petabyte-scale cloud service for data warehousing. SQL Data Warehouse is highly elastic, enabling you to provision in minutes and scale capacity in seconds. With the public preview release of Replicated tables, the ability to reason over large amounts of data on SQL Data Warehouse at lightning fast speeds, just got faster. Replicated tables is a feature designed to speed up queries by reducing data movement. Data movement happens in distributed data warehouse systems when tables are joined or aggregated in a manner inconsistent with how they are spread across Compute nodes. Data movement is reduced with Replicated tables because a full copy of data is stored on each Compute node. As a result, queries that required data movement steps to complete now run faster with Replicated tables.

 

replicated-table

 

Prior to the availability of Replicated tables dimension tables, such as a Date dimension, were typically implemented with a round-robin distribution type. When joining to fact tables, the query plan would implement a data movement step to align the date dimension with the fact table. That extra data movement step added runtime to the overall query execution. With Replicated tables, the join happens directly on the Compute node with distributed fact data and a full local copy of the dimension data. Early adopting customers have seen some query runtimes reduce by 40% and up to a 10x reduction in the number of steps required to complete a query.

 

Learn more about it on the Azure blog.

0 Replies