mco365, thanks for the response. Just to address a couple of items:
-- ADF is just one option. You could easily implement the same logic with an Azure Function.
-- wernerzirkel, we aren't deduping in ADF, just dropping extents. That step takes less than a second, so it doesn't really add cost. I looked at materialized views (MVs) to do the dedup but wasn't comfortable that the solution would be 100% accurate. That's the only reason I didn't go that route.
-- The method being used to delete the records is 100% recommended. It tags the extents at ingestion and then drops entire extents by tag, which is very efficient.
-- Given how ADF currently exports the data (each day producing a new csv that contains month-to-date data) and the fact that corrections could have been made to previous days, I believe this is the simplest method. There are certainly other ways of doing this, and at the end of the day, as long as the correct data gets ingested, they are just as correct!
-- You can definitely put the data in Azure SQL if you prefer. I don't think that's the best analytic store for this data, but it is an option. I will say that the Azure Cost Management team has also decided that ADX is a better store for this data than Azure SQL.
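For anyone who wants to see the shape of the tag-and-drop approach described above, here is a rough sketch in Python. It only builds the command text and ingestion properties; it does not connect to a cluster. The table name `CostData`, the one-tag-per-month naming scheme, and the helper function names are all assumptions made for illustration, so the actual pipeline may do this differently.

```python
import json
from datetime import date


def month_tag(day: date) -> str:
    """Build a drop-by tag for the month `day` falls in (assumed naming scheme)."""
    return f"drop-by:{day:%Y-%m}"


def drop_extents_command(table: str, tag: str) -> str:
    """Build a management command that drops every extent in `table` carrying `tag`."""
    return f".drop extents <| .show table {table} extents where tags has '{tag}'"


def ingest_properties(tag: str) -> dict:
    """Build ingestion properties that stamp newly created extents with `tag`."""
    return {"tags": json.dumps([tag])}


# Hypothetical daily run: replace the month-to-date data for the current month.
tag = month_tag(date(2023, 3, 15))
print(drop_extents_command("CostData", tag))  # run this first to clear the old extents
print(ingest_properties(tag))                 # then ingest the new csv with the same tag
```

The idea is that each daily month-to-date file replaces the previous one for that month: drop everything tagged for the month, then ingest the new file with the same tag. Corrections to earlier days come along for free because the whole month is reloaded.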
Sorry about the long-winded reply. Just getting around to responding and wanted to answer everything at once!
I'd love to evolve this solution, so anyone interested in contributing, feel free to submit ideas via the GitHub repo!