Azure Data Factory (ADF) is a fully managed cloud-based data integration service. You can use the service to populate your Azure Data Explorer (ADX) database with data from various locations and save time when building your analytics solutions. To estimate the total time and end-to-end cost of ingesting Azure blobs into Azure Data Explorer with Azure Data Factory, use the following guidelines:
Azure Data Explorer estimated cost: use the ADX cost calculator at http://aka.ms/adx.cost to estimate the cluster size and cost.
Learn about ADF pricing. Here are some guidelines to estimate ADF cost:
The general duration equation for ADF is X MB / Y MBps / 3600 s = T h. A single copy activity copies Azure Blob data to an Azure Data Explorer cluster at around 11 MBps.
To achieve greater throughput, you can run M copy activities in parallel for an aggregated throughput of about 11 MBps * M. With a single copy activity, the duration for copying Azure Blob data to Azure Data Explorer using ADF is therefore X MB / 11 MBps / 3600 s = T h.
Cost equation for a cloud copy at $0.25 per DIU (Data Integration Unit) per hour: T h * Z DIUs * $0.25 = cost in dollars. The default is 4 DIUs.
X - Data size in MB
Y - Copy speed in MBps for the ADX copy activity
T - Copy duration in hours
Z - Number of DIUs
M - Number of copy activities running in parallel
Example of ADF costs:
A customer wants to upload one blob a day with 250 GB of CSV data.
250 GB * 1024 MB/GB / 11 MBps / 3600 s = 6.5 h
6.5 h * $0.25 * 4 DIUs = $6.5 per 250 GB.
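The arithmetic above can be wrapped in a small helper for trying other data sizes. This is only a rough sketch: the function name `estimate_adf_copy` and its parameters are hypothetical, the defaults (11 MBps per copy activity, 4 DIUs, $0.25 per DIU per hour) are the figures quoted in this article, and the handling of parallel copy activities assumes that throughput scales linearly and that each activity is billed for its own DIUs.

```python
def estimate_adf_copy(data_gb, speed_mbps=11.0, dius=4,
                      price_per_diu_hour=0.25, parallel_copies=1):
    """Return (duration_hours, cost_dollars) for a Blob-to-ADX copy.

    Hypothetical helper; defaults are the approximate figures from this
    article, not guaranteed service values.
    """
    data_mb = data_gb * 1024
    # Assumption: aggregated throughput scales linearly with the number
    # of copy activities running in parallel.
    aggregated_mbps = speed_mbps * parallel_copies
    duration_hours = data_mb / aggregated_mbps / 3600
    # Assumption: every parallel copy activity is billed for its own DIUs
    # for the time it runs.
    cost_dollars = duration_hours * dius * price_per_diu_hour * parallel_copies
    return duration_hours, cost_dollars


if __name__ == "__main__":
    hours, cost = estimate_adf_copy(250)
    print(f"Estimated duration: {hours:.1f} h, estimated cost: ${cost:.2f}")
```

For the 250 GB example, the unrounded result is slightly below the rounded 6.5 figures shown above. Orchestration and operation costs are not included in this estimate.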
Costs may include:
Additional orchestration and operation costs (usually much lower than the data movement cost).
These estimates can be affected by the following factors:
Locations of your data/ADF/ADX
Reduce cost by having source, ADF, and target cluster in the same region.
ADX cluster: number of nodes (the estimates above used a small 2-node cluster)
Number of DIUs (4 to 32 DIUs)
Additional DIUs are helpful when more than one file or blob is copied in parallel.