Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

- Home
- SQL Server
- SQL Server Support Blog
- SQL Server CE: Multiple single-column statistics connected by OR

- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Email to a Friend
- Printer Friendly Page
- Report Inappropriate Content

By

Published
Sep 07 2021 07:49 AM
1,102
Views

Sep 07 2021
07:49 AM

Sep 07 2021
07:49 AM

In post SQL Server CE: Multiple single-column statistics, I discussed the selectivity when **AND** is used to join the columns in Where clause. Today, I'm going to talk about the OR operator

Let's say we have four predicates connected by OR with selectivites P0,P1,P2 and P3,

**Legacy CE**

The combined selectivity is : 1-(1-P0)*(1-P1)*(1-P2)*(1-P3).

If there are only two predicates, the formula is relative simple: (P0+P1)-(P0*P1)

**New CE**

1.The combined selectivity is : 1-(1-P0)*(1-P1)^(1/2)*(1-P2)^(1/4)*(1-P3)^(1/8)

2.P0,P1,P1 and P2 are the selectivity of each value of the column in WHERE clause and P0>P1>P2>P3.

3.We can have up to 4 predicates. If there are more than 4 columns in the where clause, it only counts the first 4, the rest of them are ignored.

Combined selectivity of four predicates: 1-(1-P0)*(1-P1)^(1/2)*(1-P2)^(1/4)*(1-P3)^(1/8), where P0>P1>P2>P3.

In practice, Please use following formula to calcualte the selectivity if the combined selectivity has more than 2 predicate.

1-

As you expected, more columns used in where clause, more estiamted rows you get.

Here are examples , adventure 2019 OLTP database is used in this example

use AdventureWorks2019

go

IF exists(select 1 from sys.tables where name='SalesOrderDetail' and schema_id=schema_id('dbo'))

drop table SalesOrderDetail

Go

select * into SalesOrderDetail from Sales.SalesOrderDetail

GO

create statistics I_ProductID on SalesOrderDetail(ProductID) with fullscan

create statistics I_SalesOrderID on SalesOrderDetail(SalesOrderID) with fullscan

Here is the T-SQL query I'm going to test

select * from SalesOrderDetail where ProductID=708 or SalesOrderID=44288

The selectivity of ProductId 708 is 3007/121317=0.02478630

dbcc show_statistics(SalesOrderDetail,I_ProductID)

The selectivity of SalesOrderID 44288 is 34/121317=0.00028025

**Legacy CE:**

Combined selectivity:(P0+P1)-(P0*P1)=(0.02478630+0.00028025)-(0.02478630*0.00028025)=0.0250596036394250

Estimated row: 0.0250596036394250*121317=3040.15, is rounded down to 3040

select * from SalesOrderDetail where ProductID=708 or SalesOrderID=44288 option(recompile,querytraceon 9481)--trace flag 9481 is used to force legacy CE

**New CE:**

Combined selectivity:1-(1-P0)*sqrt(1-P1)=1-(1-0.02478630)*(sqrt(1-0.00028025)=0.0249229613952225

Estimated row: 0.0249229613952225*121317=3023.57 is rounded up to 3024

select * from SalesOrderDetail where ProductID=708 or SalesOrderID=44288 option(recompile,querytraceon 2312)--2312 is used to force New CE

**Related trace flags**

**Trace flag 9471**: Pickup the lowest selectivity(Most rows) from all predicates, and ignore rest of them. This trace flag only works when New CE is used.

Combined selectivity: Max(P0,P1,P2,P3,P4..)

In this case, it's Max(P0,P1)*CARD=MAX(0.02478630,0.00028025)*CARD=0.02478630*CARD=0.02478630*121317=3007

select * from SalesOrderDetail where ProductID=708 or SalesOrderID=44288 option(recompile,querytraceon 2312,querytraceon 9471)---2312 is used to force New CE

**Trace flag 9472**:Switch to independent formula, the result is as same as the result of Legacy CE and this trace flag only works when New CE is used.

select * from SalesOrderDetail where ProductID=708 or SalesOrderID=44288 option(recompile,querytraceon 2312,querytraceon 9472)--force New CE

**Assessment:**

**T-SQL**

use AdventureWorks2019

go

IF exists(select 1 from sys.tables where name='SalesOrderDetail' and schema_id=schema_id('dbo'))

drop table SalesOrderDetail

go

select * into SalesOrderDetail from Sales.SalesOrderDetail

GO

--create five statistics

create statistics I_ProductID on SalesOrderDetail(ProductID) with fullscan

create statistics I_orderqty on SalesOrderDetail(orderqty) with fullscan

create statistics I_SalesOrderID on SalesOrderDetail(SalesOrderID) with fullscan

create statistics I_SpecialOfferid on SalesOrderDetail(SpecialOfferid) with fullscan

create statistics I_modifiedDate on SalesOrderDetail(modifiedDate) with fullscan

**1.Please calculate the estimated rows of following queries and run the queries to verify**

select *From SalesOrderDetail where ProductId=708 or orderqty=10 or modifiedDate='2011-05-31 00:00:00.000' or SpecialOfferid=4 or SalesOrderID=44288 option(recompile,querytraceon 9481)

go

select *From SalesOrderDetail where ProductId=708 or orderqty=10 or modifiedDate='2011-05-31 00:00:00.000' or SpecialOfferid=4 or SalesOrderID=44288 option(recompile,querytraceon 2312)

**2.If there are multi-column stats, does it make different?**

- Tags:
- ce

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Labels