Blog Post

Azure Data Explorer Blog
2 MIN READ

Update records in a Kusto Database (Public Preview)

vplauzon's avatar
vplauzon
Icon for Microsoft rankMicrosoft
Feb 20, 2024

Kusto databases, either in Azure Data Explorer or in Fabric KQL Database, are optimize for append ingestion.

 

In recent years, we've introduce the .delete command allowing you to selectively delete records.

 

Today we are introducing the .update command.  This command allows you to update records by deleting existing records and appending new ones in a single transaction.

 

This command comes with two syntaxes, a simplified syntax covering most scenarios efficiently and an expanded syntax giving you the maximum of control.

 

Here is an example of the simplified syntax:

 

.update table MyTable on Id <|
MyTable
| where Id==3
| extend Color="Orange"

This command will update all records where Id==3 by replacing the Color column value by "Orange".

 

As mentioned above, the command really does a .delete and .append in one go.  In this case, it is equivalent to those 2 commands:

 

.delete table MyTable records <|
MyTable
| where Id==3

.append MyTable <|
MyTable
| where Id==3
| extend Color="Orange"

The only exception to running those 2 commands is that the append command is run with the state of the table prior to the deletion.  Indeed, if you would run those two commands, the .append command wouldn't do anything since the records with Id==3 would have been deleted by the first command.

 

This is a good way to show how the same command would be represented using the expanded syntax:

 

.update table MyTable delete D append A <|
let D = MyTable
| where Id==3;
let A = MyTable
| where Id==3
| extend Color="Orange";

 

The expanded syntax allows you to explicitly define the delete and append queries.

 

Both syntaxes support a whatif mode where the command doesn't change the table but returns the expected changes.  We recommend always starting with a whatif mode to validate the predicates.

 

We encourage you to go through the many examples of the online documentation page to familiarize yourself with the syntax.

 

We believe this new command gives you an alternative for your data pipelines.  Many loading scenarios involve updating records.  For instance, ingesting new data in a staging table to then update the records of a main table with those new records.  This is now possible with the .update command.

 

The command is in public preview and we are looking forward for your feedback!

Updated Feb 21, 2024
Version 2.0
  • It's a completely opt-in feature:  if you do not use it, you do not pay any performance hit (e.g. we didn't slow down the indexes to accommodate for it).  It's sweet spot is to fix a handful of records as it's limited to 5M rows deleted.

  • wernerzirkel's avatar
    wernerzirkel
    Brass Contributor

    This new feature makes me happy and unhappy at the same time :smile::sad: It seems to be a shift in the basic ADX paradigm to not change data once it has been stored. Which potentially decreases performance (many queries using .update).  On the other hand I see the need for a simple update solution to fix basic requirements such as GDPR data deletion requests or fix Data Privacy issues. 

  • Is there going to be a way to prevent its use?  I am sure many companies will want to make their records do not change once they have been ingested.