First published on MSDN on Feb 28, 2016
When you query a table with clustered columnstore index, it is run in parallel unless you force it to run serially using MAXDOP or if SQL Server determines at run-time that there are not enough resources to run this query in parallel.
This is all well and good but there is one interesting aspect. The delta rowgroup is scanned using a single thread. However, if you have multiple delta rowgroups, they are scanned in parallel with one thread per delta rowgroup.
Let us look at an example
Note, I only inserted 1 million rows and not (1024*1024 = 1048576), so the inserted rows will stay in the delta rowgroup. Now, let me run the following query
select avg (convert (bigint, unitsold))
If you look at the actual execution plan below, it shows that it used 4 threads (note, the machine I am running it on has 4 cores)
But if you look at the XML plan, it shows that only one thread processed the 1000000 rows
Now, let me compress the rows in delta rowgroup using the following command
-- run this to compressed OPEN RGs
alter index t_colstor_NCCI on t_colstor reorganize with (COMPRESS_ALL_ROW_GROUPS = ON)
Now, I run the same query as before (use TF 8649 to force parallel execution if needed), and here is what I see for compressed rowgroup. Note, a single compressed rowgroup can be scanned by multiple concurrent threads.
Do you need to be concerned about that a delta rowgroup is scanned single threaded? The answer is NO for two reasons (a) most columnstore indexes have very few delta rowgroups (b) if you have multiple delta rowgroups, they can be scanned in parallel with one thread per delta rowgroup.
You should see the same behavior with CCI (clustered columnstore index) as well.