(Preview) Configure ADX table retention, caching and batching policies via one-click
Published Mar 21 2022 10:50 PM
Microsoft

One-click ingestion is an Azure Data Explorer (ADX) web experience that makes data ingestion fast, intuitive, and easy: it lets you quickly create tables and ingest data from a wide range of data sources. One-click lets new ADX users get started quickly without knowing Kusto Query Language (KQL) syntax, while still helping them ramp up gradually by showing the auto-generated KQL statement for each UI interaction.

 

While one-click has primarily supported data ingestion operations, we wanted to make it easier for users to perform additional ingestion-related operations, namely viewing, creating, and updating table-level policies, from the existing experience.

 

You can now configure retention, batching, and caching policies for tables and materialized views via one-click. We have added two options under the Manage tab: Table Retention policy and Table Batching policy.

 

Anshul_Sharma_0-1647918175438.png

 

Table Retention/Caching Policy

 

The Update retention policy option lets you configure the retention and caching policies for a table or a materialized view. By default, the values are inherited from the database, and you can override them.

 

The retention policy controls the mechanism that automatically removes data from tables or materialized views. It's useful to remove data that continuously flows into a table, and whose relevance is age-based. For example, the policy can be used for a table that holds diagnostics events that may become uninteresting after two weeks.

 

The retention policy provides the following configuration options:

 

  • Recoverability:
    • Data recoverability (Enabled/Disabled) after the data was deleted.
    • If enabled, the data will be recoverable for 14 days after it's been soft-deleted.
  • Retention period (days):
    • Time span for which it's guaranteed that the data is kept available to query. The period is measured starting from the time the data was ingested.
    • When altering the soft-delete period of a table or database, the new value applies to both existing and new data.
    • Maps to SoftDeletePeriod in the retention policy object.
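
As a rough illustration of how these options translate into a management command, here is a minimal Python sketch that builds an `.alter-merge ... policy retention` statement. The helper name `retention_command` and the table name `MyTable` are hypothetical, and the statement one-click actually generates may be worded differently.

```python
def retention_command(table: str, soft_delete_days: int, recoverable: bool = True) -> str:
    """Build a retention policy command for one table.

    `table` is a placeholder name; soft_delete_days corresponds to
    SoftDeletePeriod in the retention policy object.
    """
    if soft_delete_days < 0:
        raise ValueError("soft_delete_days must be non-negative")
    recoverability = "enabled" if recoverable else "disabled"
    return (
        f".alter-merge table {table} policy retention "
        f"softdelete = {soft_delete_days}d recoverability = {recoverability}"
    )


# Hypothetical table kept queryable for one year, with recoverability on.
print(retention_command("MyTable", 365))
```

The resulting text can be pasted into the query pane and run against the target database.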

 

Azure Data Explorer stores its ingested data in reliable storage (most commonly Azure Blob Storage), away from its actual processing nodes (such as Azure Compute). To speed up queries on that data, Azure Data Explorer caches it, or parts of it, on the SSD of its processing nodes, or even in RAM. The cache policy is granular: customers can use it to differentiate between hot data (cached on the cluster) and cold data (kept only in Azure Blob Storage).

 

The caching policy provides the following configuration options:

 

  • Data (days): Time span for which the data will be available on the cluster SSD. The period is measured starting from the time the data was ingested.
  • Index (days): Time span for which the indexes will be available on the cluster SSD. The period is measured starting from the time the data was ingested.

In the example below, we set a retention period of 365 days and a data/index hot cache of 31 days, i.e. the last 365 days of data will be available for query, but only the last 31 days of data will be on the cluster SSD.

 

Anshul_Sharma_1-1647919619814.png
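
To make the 365/31 split concrete, the following Python sketch classifies data by its age since ingestion and builds a matching caching command. It is a simplified model, not the service's logic: the function names and `MyTable` are hypothetical, the boundaries are treated as exact day counts, and real clean-up after the retention period is not instantaneous.

```python
def caching_command(table: str, hot_days: int) -> str:
    """Build a caching policy command that keeps data and index hot for hot_days."""
    return f".alter table {table} policy caching hot = {hot_days}d"


def storage_tier(age_days: float, hot_days: int = 31, retention_days: int = 365) -> str:
    """Classify data by age (days since ingestion) under the example policy.

    Simplified assumption: tiers switch exactly at the configured day counts.
    """
    if age_days <= hot_days:
        return "hot"             # on the cluster SSD
    if age_days <= retention_days:
        return "cold"            # still queryable, read from blob storage
    return "past retention"      # no longer guaranteed to be available


for age in (10, 100, 400):
    print(age, storage_tier(age))
print(caching_command("MyTable", 31))
```

Queries that touch only the last 31 days stay on the hot cache, which is why the hot window is usually sized to the most common query range.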

 

Table Batching Policy

 

The Update batching policy option lets you configure the batching policy for a table or materialized view. By default, the values are inherited from the cluster or database, and you can override them.

 

Azure Data Explorer has an aggregation (batching) policy for data ingestion, designed to optimize the queued ingestion process. The default batching policy is configured to seal a batch once one of the following conditions is true for the batch: a maximum delay time of 5 minutes, a total size of 1 GB, or 1000 blobs. To reduce ingestion latency, users can adjust the policy to their needs. Please refer to "Do I need to change the batching policy" for further details.

 

The batching policy provides the following configuration options:

 

  • Number of items: Maximum number of blobs in a batch
  • Time (seconds): Maximum delay in seconds
  • Size (MB): Maximum total size of the blobs in a batch

In the example below, we configure a policy to seal a batch when it reaches 500 items, 3 minutes, or 1 GB total size, whichever comes first.

 

Anshul_Sharma_0-1647923608368.png
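
The "whichever comes first" behaviour can be sketched in a few lines of Python. This is a simplified model of the sealing rule under the example values, not the service implementation: the class `BatchingPolicy`, its field names, and `MyTable` are hypothetical, and 1 GB is assumed to mean 1024 MB.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchingPolicy:
    max_items: int = 500          # Number of items
    max_delay_seconds: int = 180  # Time (seconds): 3 minutes
    max_size_mb: int = 1024       # Size (MB): assumes 1 GB = 1024 MB

    def should_seal(self, items: int, age_seconds: float, size_mb: float) -> bool:
        """A batch is sealed as soon as any one limit is reached."""
        return (
            items >= self.max_items
            or age_seconds >= self.max_delay_seconds
            or size_mb >= self.max_size_mb
        )

    def to_command(self, table: str) -> str:
        """Build an ingestion batching policy command for a placeholder table."""
        minutes, seconds = divmod(self.max_delay_seconds, 60)
        hours, minutes = divmod(minutes, 60)
        policy = {
            "MaximumBatchingTimeSpan": f"{hours:02d}:{minutes:02d}:{seconds:02d}",
            "MaximumNumberOfItems": self.max_items,
            "MaximumRawDataSizeMB": self.max_size_mb,
        }
        return f".alter table {table} policy ingestionbatching '{json.dumps(policy)}'"


policy = BatchingPolicy()
print(policy.should_seal(items=120, age_seconds=180, size_mb=40))  # time limit reached
print(policy.to_command("MyTable"))
```

Lowering the time limit shortens end-to-end ingestion latency for low-volume tables, at the cost of producing more, smaller batches.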

 

We hope these policy options make it even easier for you to ingest your data. We look forward to your feedback and comments.

 

You’re welcome to suggest more ideas and vote for them here - https://aka.ms/adx.ideas

Last update: Mar 22 2022 12:42 AM