When we introduced the new CE back in SQL Server 2014, several changes were done regarding the assumptions of the model. These changes were introduced because the vast potential for variations in data distribution, volume and query patterns, made some of the model assumptions not applicable.
The 4 assumptions for the pre-2014 model were:
A few of those assumptions changed in SQL Server 2014 CE, namely:
You can read more about CE changes in the white paper Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator .
For correlated columns in the 2014 CE, to derive cardinality we need to sort the filters according to their density, where the smallest density value is first. Since the density is expressed in a range between 0.xxxx and 1 the smaller values means lower density, better selectivity or more different values. We then use only the first four most selective filters to calculate the combined density, using Exponential Backoff , expressed like:
p 0 × p 1 1⁄2 × p 2 1⁄4 × p 3 1⁄8 × Tc
This hasn’t changed per se in the 2016 CE, but there is a relevant difference. While in the 2014 CE we always used the single column statistics for each of the predicate columns, in 2016 CE, if multi-column statistics over the predicate columns are available, those will be used to derive estimations with a different calculator. Allow me to demonstrate using AdventureWorksDW2014.
First we need to set the database compatibility level to 120, to force usage of the 2014 CE to see what statistics are loaded and used in SQL 2014:ALTER DATABASE [AdventureWorksDW2014] SET COMPATIBILITY_LEVEL = 120
The create a multi-column statistics object over the columns we’ll be using as predicates:CREATE STATISTICS [StatDemo] ON [dbo].[FactInternetSales] (CurrencyKey,SalesTerritoryKey)
Now create a xEvent session to track the query_optimizer_estimate_cardinality event:CREATE EVENT SESSION [XeNewCE] ON SERVER
Let’s start the session:USE AdventureWorksDW2014
Run the query and stop the session to look at the events:SELECT * FROM FactInternetSales
Now open the session file, and take a look at the calculator field. As expected, using Exponential Backoff and we see the selectivity derived for each predicate column:
If you see the stats_collection field, the derived cardinality is shown at 2285.33 which is the result of the Exponential Backoff derivation (0.112 * SQRT(0.114) * 60398):
Next, change the database compatibility level to 130, to force usage of the 2016 CE and see what statistics are loaded now.ALTER DATABASE [AdventureWorksDW2014] SET COMPATIBILITY_LEVEL = 130
Let’s start the session again:DBCC FREEPROCCACHE
Run the same query and stop the session to look at the events:SELECT * FROM FactInternetSales
We can see a different calculator in use, using a multi-column statistic we created on the predicate columns CurrencyKey and SalesTerritoryKey . Individual stats are still available if required:
And in the stats_collection field, the derived cardinality is shown at 3355.44 which is the result of simple density calculation (0.05555556 * 60398):
Just have a look at the density for the predicate columns:
The practical outcome with this query is still the same for this case, as seen in the scans chose below, but generally, more accurate estimations will allow better plans to be used.
Pedro Lopes ( @sqlpto ) - Program Manager
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.