Estimating Data Explorer Cluster Size

Microsoft

I have two customers who want to use Azure Data Explorer, and I am unsure how to size their clusters in order to estimate costs:

 

  1. The first customer will ingest 5 GB per day of logging and occasionally write queries.

  2. The second customer will ingest 10 TB per day and will have an application running queries.

  3. Can I have a small cluster and not worry about the SSD size? I think a small cluster would be fine for the first customer, given the low data volume. For the second customer, I think the data needs to be on the SSDs for fast query access (the last 10 days of data).

  4. In either of the above cases, I'm assuming I could have 1 PB of data in Data Explorer with a small cluster (1 TB worth of SSDs). Are the two unrelated?

Thanks,

Adam

 

1 Reply

When choosing which data gets stored as hot and which gets stored as cold, it's important to understand the scenario and the query patterns. It's also important to note that the cache policy does not make Kusto a cold storage technology.

 

For the first customer, 5 GB per day is a low enough volume that a minimum-sized cluster should do just fine to begin with.

 

Regarding the scale of the larger customer's cluster:

1. Ingestion capacity scales linearly with the number of nodes in the cluster.

2. Keeping a larger volume of "hot" data (stored on SSD) may require increasing the size of the cluster, if the compressed size of the data needed for interactive queries can't fit within the SSD capacity of the cluster.

3. Choosing the L16 option will potentially be more cost-efficient for such a data-bound cluster.
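
To make point 2 concrete, here is a rough back-of-the-envelope sketch in Python of how the hot-cache requirement could translate into a node count. It is only an illustration: the function name, the compression ratio of 7, the 1,000 GB of SSD per node, the 80% usable fraction, and the two-node minimum are all placeholder assumptions, not figures from this thread or from any Azure SKU. Substitute values measured on your own data and taken from the specification of the SKU you choose.

```python
import math


def estimate_hot_cache_nodes(daily_ingest_gb, hot_days, compression_ratio,
                             ssd_per_node_gb, usable_fraction=0.8, min_nodes=2):
    """Estimate nodes needed to keep `hot_days` of data on SSD.

    All parameters are assumptions supplied by the caller; this only
    covers the hot-cache (storage) constraint, not ingestion or query load.
    """
    # Uncompressed volume that should stay hot.
    hot_raw_gb = daily_ingest_gb * hot_days
    # Size after compression (ratio is data dependent, so measure it).
    hot_compressed_gb = hot_raw_gb / compression_ratio
    # Leave headroom on each node instead of filling the SSD completely.
    usable_per_node_gb = ssd_per_node_gb * usable_fraction
    nodes = max(min_nodes, math.ceil(hot_compressed_gb / usable_per_node_gb))
    return hot_compressed_gb, nodes


# Second customer: 10 TB/day, last 10 days hot (placeholder ratio and SSD size).
big_gb, big_nodes = estimate_hot_cache_nodes(10_000, 10, 7, 1_000)
# First customer: 5 GB/day with the same placeholder assumptions.
small_gb, small_nodes = estimate_hot_cache_nodes(5, 10, 7, 1_000)

print(f"Large customer: ~{big_gb:,.0f} GB hot, {big_nodes} nodes")
print(f"Small customer: ~{small_gb:,.1f} GB hot, {small_nodes} nodes")
```

With these placeholder numbers the small customer stays at the minimum cluster size, while the large customer's node count is driven entirely by how much compressed data has to fit on SSD. The total amount of data retained in cold storage does not enter the calculation at all, only the hot window does.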