How to calculate the cost of copying data from Azure Blob to Azure Data Explorer with Azure Data Factory
Published Jul 08 2019 03:32 PM

Azure Data Factory (ADF) is a fully managed, cloud-based data integration service. You can use the service to populate your Azure Data Explorer (ADX) database with data from various locations and save time when building your analytics solutions. The guidelines below help you estimate the total time and end-to-end (E2E) cost of ingesting Azure Blob data into Azure Data Explorer using Azure Data Factory:

 

  • Azure Data Explorer estimated cost: use the ADX cost calculator at http://aka.ms/adx.cost to gain insight into the cluster size and cost.
  • ADF estimated cost: learn about ADF pricing. Here are some guidelines for estimating the ADF cost:
    • The general equation for ADF copy duration is X MB / Y MBps / 3600 s = T h, where X is the data size in MB, Y is the copy speed in MBps, and T is the copy duration in hours.
    • A single copy activity copies data from Azure Blob to an Azure Data Explorer cluster at around 11 MBps.
    • To achieve greater throughput, you can run M copy activities in parallel and achieve 11 MBps * M in aggregated throughput.
    • Therefore, the copy duration for copying Azure Blob to Azure Data Explorer with a single copy activity is X MB / 11 MBps / 3600 s = T h.
    • Cost equation for a cloud copy at $0.25 per DIU (Data Integration Unit) per hour: T h * Z DIUs * $0.25 = copy cost, where Z is the number of DIUs (4 by default).

 

Example of ADF costs:

A customer wants to upload one blob a day with 250 GB of CSV data.

250 GB * 1024 MB/GB / 11 MBps / 3600 s ≈ 6.5 h

6.5 h * $0.25 * 4 DIUs = $6.50 per 250 GB.
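The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the two equations in this post, not an official pricing tool: the function names and the `parallel_copies` parameter are hypothetical, and the defaults (11 MBps per copy activity, 4 DIUs, $0.25 per DIU-hour) are simply the figures quoted here, so check them against current ADF pricing before relying on the result.

```python
def copy_duration_hours(data_mb, throughput_mbps=11.0, parallel_copies=1):
    """Estimated copy duration in hours.

    Implements X MB / (Y MBps * M) / 3600 s, where M is the number of copy
    activities running in parallel (assumed to scale throughput linearly).
    """
    if data_mb < 0 or throughput_mbps <= 0 or parallel_copies < 1:
        raise ValueError("size must be >= 0, throughput > 0, parallel copies >= 1")
    return data_mb / (throughput_mbps * parallel_copies) / 3600


def copy_cost_usd(duration_hours, dius=4, price_per_diu_hour=0.25):
    """Estimated data movement cost: T h * Z DIUs * price per DIU-hour."""
    if duration_hours < 0 or dius < 1 or price_per_diu_hour < 0:
        raise ValueError("duration and price must be >= 0, DIUs >= 1")
    return duration_hours * dius * price_per_diu_hour


if __name__ == "__main__":
    data_mb = 250 * 1024  # one 250 GB blob, expressed in MB
    hours = copy_duration_hours(data_mb)
    cost = copy_cost_usd(hours)
    print(f"{hours:.2f} h, ${cost:.2f}")
```

Without intermediate rounding, the duration comes out at about 6.46 hours, which the example above rounds to 6.5 h. The sketch covers data movement only and leaves out the orchestration and retry costs.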

 

Additional costs may include:

  • Orchestration and operation costs (usually much lower than the data movement cost).
  • Failures and retries.

These numbers can be affected by the following factors:

  • Locations of your data/ADF/ADX
    • Reduce cost by having source, ADF, and target cluster in the same region.
  • ADX cluster: number of nodes (the estimate above used a small 2-node cluster)
  • Number of DIUs (4 to 32 DIUs)
    • Additional DIUs are helpful when more than one file/blob is copied in parallel
  • Data format (the estimate above used CSV)

Azure Data Factory billing Deck:

https://microsoft.sharepoint.com/:p:/t/admsteam/EbV6yJMo4NZBnoNgZUsV88QBsQqcPpv9Fk0WTdgUPfMkoA?e=grV...

2 Comments

Can anyone help me calculate DIUs (Data Integration Units) in Azure Data Factory?

In ADF, I have 60 pipelines, and each pipeline runs every 3 minutes throughout the day. The resource cost for the month is high, and the pipelines have run more than 18,000 times. In the pipeline monitor I can see the data movement in DIUs, but I cannot look that up for 18,000 runs, so I am looking for a way to calculate DIUs.

@Vasu1983 It seems that you are asking a general question about ADF, so please redirect your question to the ADF channel.

Last update: Jul 06 2021 02:38 AM