Azure Data Explorer is an append-only database that isn’t designed to support frequent data deletion. If you accidentally ingest the same data into Azure Data Explorer more than once, the following steps can help you remove the duplicate records:
// create a table with the IDs of the extents that include the duplicate data
// narrow the result to the specific date on which the duplicates were ingested
.set ExtentsToCompress <| bla // original table name
| extend eid = extent_id()
| extend dt = ingestion_time() // one option to find the date
| where dt between (datetime(2019-01-01) .. datetime(2019-01-02)) // placeholder range; replace with the affected dates
| summarize by eid
// present extent ids
ExtentsToCompress
// ingest the distinct rows into a temp table
// reading only the affected extents improves performance
.set BlaTmp <| bla
| extend eid = extent_id()
| where eid in (ExtentsToCompress)
| project-away eid
| distinct *
// drop the extents that contain duplicate values
.drop extents <| .show table bla extents | where ExtentId in (ExtentsToCompress)
// re-ingest the distinct values
.set-or-append bla <| BlaTmp
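After re-ingesting, it can help to confirm that the duplicates are gone and then remove the helper tables. The following is a minimal sketch, assuming the same table names used in the steps above (bla, BlaTmp, ExtentsToCompress). The count comparison only detects rows that are identical in every column, and the cleanup commands are an assumption rather than part of the original procedure.

```kusto
// Compare the total row count with the distinct row count;
// the two values are equal when no exact duplicate rows remain.
print TotalRows = toscalar(bla | count), DistinctRows = toscalar(bla | distinct * | count)

// Optional cleanup (an assumption, not part of the original steps):
// remove the helper tables once the result has been verified.
.drop table BlaTmp

.drop table ExtentsToCompress
```

On a large table, `distinct *` over all rows can be expensive, so you may prefer to add the same date filter used earlier to both sides of the comparison.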
For more information on handling duplicate records at query time, see "Handle duplicate data in Azure Data Explorer".
Join us to share questions, thoughts, or ideas about Azure Data Explorer (Kusto) and receive answers from the diverse and knowledgeable Azure Data Explorer community.
Azure Data Explorer product team