Forum Discussion
Kevin_Fleming
Microsoft
Jan 16, 2019
When NOT to use shuffle hint?
We are having a discussion here and thought I’d ask the larger group. When would you NOT want a shuffle hint when doing a join and summary? And if you always want to use it, why is it not by defaul...
Alexander Sloutsky
Microsoft
Jan 17, 2019
Full docs of shuffle join and shuffle summarize are here:
https://docs.microsoft.com/en-us/azure/kusto/query/shufflejoin
https://docs.microsoft.com/en-us/azure/kusto/query/shufflesummarize
Docs say:
"Shuffle summarize strategy can provide significant performance benefit when the 'by' clause has columns with high cardinality which may be causing the regular summarize strategy to hit query limits."
When not to use it: when the cardinality of the key is low.
For example, if you have a table 'Data' with a column Level that takes one of three values, "Error", "Info", or "Warning" (cardinality = 3), you don't want to use shuffle summarize, because it will move the data between the nodes executing the query with little to gain: there are only three groups to aggregate.
Similar logic applies to join.
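To make the cardinality argument concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions, not how the Kusto engine actually executes queries: it assumes the regular strategy aggregates locally on each node and merges the partial results on one node, while the shuffle strategy repartitions raw rows by a hash of the key before aggregating. The function names, the 4-node layout, and the `session-…` keys are all made up for illustration.

```python
import zlib
from collections import Counter

def owner(key, n_nodes):
    # Deterministic hash partitioning (a stand-in for whatever the engine uses).
    return zlib.crc32(key.encode()) % n_nodes

def regular_summarize(nodes):
    """Toy model: aggregate locally, then merge partial results on a single node."""
    partials = [Counter(rows) for rows in nodes]
    rows_moved = sum(len(p) for p in partials)   # one partial row per local group
    merged = sum(partials, Counter())
    peak_groups = len(merged)                    # every group lands on one node
    return rows_moved, peak_groups

def shuffle_summarize(nodes):
    """Toy model: repartition raw rows by key, then aggregate on the owning node."""
    n = len(nodes)
    partitions = [Counter() for _ in range(n)]
    rows_moved = 0
    for node_id, rows in enumerate(nodes):
        for key in rows:
            dest = owner(key, n)
            partitions[dest][key] += 1
            if dest != node_id:
                rows_moved += 1                  # this row crosses the network
    peak_groups = max(len(p) for p in partitions)
    return rows_moved, peak_groups

# Low cardinality: 4 nodes, each holding 1000 rows per Level value.
low = [["Error", "Info", "Warning"] * 1000 for _ in range(4)]
# High cardinality: every row has its own key (hypothetical session ids).
high = [[f"session-{node}-{i}" for i in range(3000)] for node in range(4)]

for name, data in (("low", low), ("high", high)):
    print(name, "regular:", regular_summarize(data), "shuffle:", shuffle_summarize(data))
```

Each result is a pair (rows moved between nodes, largest number of groups held by one node). In this model the Level case moves 12 partial rows with the regular strategy but 9000 raw rows with shuffle, for no reduction in per-node work worth having. With one key per row, the regular strategy ends up with all 12000 groups on a single node, while shuffle spreads them across the four nodes, which is the situation the docs describe as hitting query limits.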