Synapse Analytics vs ADF pipeline using SQL Server vs ADF pipeline using CSV? Which is cost-effective?

Hi folks,

My task is to delete some stale data from my Dataverse table and display those deleted records in a Power BI report, and the main requirement is that this should be cost-effective.
I found 3 different ways to achieve this:

1. Dataverse table to Power BI using Azure Synapse Analytics (storing the stale data in an Azure Synapse workspace).

2. Dataverse table to Power BI using an ADF pipeline with Azure SQL Database (to store the deleted records):
In detail, we store the Dataverse table data in Azure Data Lake. Once the data lands as a CSV file in the data lake,
an ADF pipeline and data flow read that CSV file, apply a where clause using a data flow filter, store the filtered records in an Azure SQL DB, and show those records in Power BI.
3. Dataverse table to Power BI using an ADF pipeline with a CSV file (to store the deleted records):
In detail, we store the Dataverse table data in Azure Data Lake. Once the data lands as a CSV file in the data lake,
an ADF pipeline and data flow read that CSV file, apply a where clause using a data flow filter, store the filtered records in another CSV file, and show those records in Power BI. (A rough sketch of this filter step, which is the same in options 2 and 3, follows below.)

All of these approaches work, but for about 1 GB of stale table data, which one is the most cost-effective?
Could you estimate the approximate cost of each approach, or explain how to calculate the cost for each?

Thanks in advance

2 Replies
Hi @charlesmicheloo7,

which solution did you end up going with? I'd love to know what you chose.

@charlesmicheloo7, pricing estimates depend on many variables beyond just the amount of data collected from sources like Dataverse tables, but here are some estimates based on a series of assumptions I've made. If you have a Microsoft Power Platform or Microsoft 365 license that includes Azure Synapse Link for Dataverse, that option will be cheaper than adding extra components such as Data Factory and Azure SQL Database for data processing and storage.

 

  1. Synapse Link for Dataverse with an Azure Synapse workspace: if you configure Synapse Link for Dataverse and set the Append only option to Yes for your Dataverse tables, deleted records can be queried from Data Lake Storage where the isDeleted column = true. Since Synapse Link performs the data replication for you without needing to build your own ETL process in Data Factory, the only cost is Data Lake Storage, which is currently $0.02/GB/month for a storage account with hierarchical namespace enabled, locally redundant storage (LRS), and the hot access tier in most US regions. As a result, the total cost is as low as $0.02/month for 1 GB of data, analyzed directly with Power BI (a query sketch for this is included after this list).
  2. Data Factory and Azure SQL Database: this option incurs additional compute costs for the compute and activity runs used by Data Factory, plus the cost of the SQL Database purchasing model. Using the Azure integration runtime, Data Factory charges $0.25/DIU-hour for each Copy activity as well as $1 per 1,000 activity runs. Assuming a daily ETL process from Dataverse tables running with 4 DIUs and 15 minutes of copy runtime per day, this amounts to roughly $30/month ($0.25/DIU-hour * 4 DIUs * 30 days) plus $1 for up to 1,000 activity runs, so about $31/month for Data Factory ETL to land the data in the data lake. A second set of copy activities into Azure SQL Database incurs a similar cost, with an added $14.7187/month for a DTU-model Standard S0 service tier database with 250 GB of storage capacity, or an added $4.8971/month for a DTU-model Basic service tier database with only 2 GB of storage capacity. As a result, the total estimate for this approach is $76.7187/month (the Data Factory cost counted twice plus the Standard S0 database, since I assume the 2 GB offered by the Basic service tier is not sufficient).
  3. Finally, Data Factory alone with two sets of CSV files keeps the Data Factory estimate the same as approach 2 ($31/month * 2), while the storage cost from approach 1 doubles from $0.02/month to $0.04/month. The total estimate for this approach is $62.04/month (the cost sketch below reproduces this arithmetic).
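
If you go with approach 1, here is a rough sketch of what querying the deleted records could look like from Python against the Synapse serverless SQL endpoint. The server, database, and table names are placeholders, and the exact representation of the isDeleted flag may differ in your environment; treat this as an illustration, not a drop-in script.

```python
# Rough sketch of querying deleted rows for approach 1, assuming Synapse Link for
# Dataverse with "Append only" enabled so deleted rows are flagged isDeleted = true
# (per the notes above). Server, database, and table names below are placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=your-workspace-ondemand.sql.azuresynapse.net;"  # serverless SQL endpoint
    "Database=dataverse_yourenvironment;"
    "Authentication=ActiveDirectoryInteractive;"
)

# Select only the deleted (stale) records for the Power BI report; the value/casing
# of the isDeleted flag may vary depending on how the table is exposed.
rows = conn.cursor().execute(
    "SELECT * FROM dbo.account WHERE isDeleted = 'True'"
).fetchall()
print(len(rows), "deleted records")
```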
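
To make the arithmetic easy to re-run with your own assumptions, here is a small Python sketch that reproduces the estimates above. All unit prices and runtime assumptions come from this reply, reflect US-region list prices, and will vary by region and over time.

```python
# Reproduces the monthly cost estimates from this reply; unit prices and runtime
# assumptions are taken from the text above and will change with region/pricing.
DATA_GB = 1

# Approach 1: Synapse Link + Data Lake Storage (hot tier, LRS, ~$0.02/GB/month)
storage_per_gb_month = 0.02
approach_1 = DATA_GB * storage_per_gb_month                   # ~$0.02/month

# Data Factory: $0.25/DIU-hour * 4 DIUs * 30 days (the reply's formula),
# plus $1 for up to 1,000 activity runs
adf_copy = 0.25 * 4 * 30                                      # ~$30/month
adf_activity_runs = 1.00
adf_total = adf_copy + adf_activity_runs                      # ~$31/month

# Approach 2: two sets of Data Factory activities + Azure SQL DB (Standard S0)
sql_s0 = 14.7187
approach_2 = adf_total * 2 + sql_s0                           # ~$76.72/month

# Approach 3: two sets of Data Factory activities + doubled Data Lake storage
approach_3 = adf_total * 2 + 2 * storage_per_gb_month         # ~$62.04/month

print(f"Approach 1: ${approach_1:.2f}/mo, "
      f"Approach 2: ${approach_2:.2f}/mo, Approach 3: ${approach_3:.2f}/mo")
```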