Azure Data Explorer Blog

Query Acceleration for Delta External Tables (Preview)

Anshul_Sharma
Microsoft
Nov 25, 2024

Turbocharge queries over delta tables

An external table is a schema entity that references data stored outside a Kusto database. Queries over external tables can be slower than queries over ingested data because of factors such as network calls to fetch data from storage and the absence of indexes. Query acceleration lets you set a policy on external delta tables that defines the number of days of data to cache for high-performance queries.

The query acceleration policy (QAP) is set on an external delta table and specifies how many days of data to cache. Behind the scenes, Kusto continuously indexes and caches the data for that period, so customers can run performant queries on top of it.

Query acceleration is supported in Azure Data Explorer (ADX) over ADLS Gen2 and Blob storage, and in Eventhouse over OneLake, ADLS Gen2, and Blob storage.

Query Acceleration policy

We are introducing a new policy to enable acceleration for delta external tables:

Syntax

.alter external table <TableName> policy query_acceleration '<Policy>'

Where:

  • <TableName> is the name of a Delta Parquet external table.
  • <Policy> is a string literal holding a JSON property bag with the following properties:
    • IsEnabled: Boolean, required. If true, query acceleration is enabled.
    • Hot: TimeSpan. The last 'N' days of data to cache.
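
The policy string is easy to get wrong by hand (quoting, the `d` suffix on the timespan), so it can help to generate the command text. The following is a minimal Python sketch, not part of any Kusto SDK: the function name `build_query_acceleration_command`, the table name `MyDeltaTable`, and the "at least one day" guard are assumptions made for illustration. It only builds the command text shown in the syntax above; it does not connect to a cluster.

```python
import json


def build_query_acceleration_command(table_name: str, hot_days: int,
                                     is_enabled: bool = True) -> str:
    """Return the text of an .alter ... policy query_acceleration command.

    Hypothetical helper: it only formats a string and sends nothing.
    """
    # Assumption for this sketch: a cache window shorter than one day is rejected.
    if hot_days < 1:
        raise ValueError("hot_days must be at least 1")

    # Property names follow the policy description above.
    policy = {"IsEnabled": is_enabled, "Hot": f"{hot_days}d"}

    # Kusto multi-line string literals are delimited by three backticks,
    # which avoids escaping the double quotes inside the JSON.
    fence = "`" * 3
    return (
        f".alter external table {table_name} policy query_acceleration "
        f"{fence}{json.dumps(policy)}{fence}"
    )


# Example: cache the last 30 days of a hypothetical table.
command = build_query_acceleration_command("MyDeltaTable", 30)
print(command)
```

The printed text can then be pasted into a query editor or passed to whichever client you already use to run management commands.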

Steps to enable Query Acceleration

  1. Create a delta external table as described in this document:
.create-or-alter external table <TableName> kind=delta ( h@'https://storageaccount.blob.core.windows.net/container;<credentials>' )

  2. Set a query acceleration policy:
.alter external table <TableName> policy query_acceleration ```{ "IsEnabled": true, "Hot": "36500d" }```

  3. Query the table:
external_table('<TableName>')

Note: Indexing and caching might take some time, depending on the volume of data and the cluster size. To monitor the progress, see Monitoring command.

Costs/Billing

Enabling Query Acceleration incurs additional costs. The accelerated data is ingested into Kusto and counts towards SSD storage, similar to native Kusto tables. You can control the amount of data to accelerate by configuring the number of days to cache.
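
To get a feel for how the number of days to cache drives the amount of accelerated data, the sketch below sums the source size of files that fall inside a given window. It is a rough illustration under stated assumptions, not a billing formula: the function name `bytes_in_hot_window`, the sample file list, and the use of a per-file timestamp to decide what is inside the window are all hypothetical, and the actual SSD footprint will differ from source file sizes because of compression and indexes.

```python
from datetime import datetime, timedelta


def bytes_in_hot_window(files, hot_days, now):
    """Sum the sizes of files whose timestamp is within the last hot_days.

    files: iterable of (timestamp, size_in_bytes) pairs (hypothetical input).
    """
    cutoff = now - timedelta(days=hot_days)
    return sum(size for timestamp, size in files if timestamp >= cutoff)


# Made-up sample data: three files of different ages.
now = datetime(2024, 11, 25)
files = [
    (datetime(2024, 11, 24), 500),  # 1 day old
    (datetime(2024, 11, 1), 300),   # 24 days old
    (datetime(2024, 9, 1), 900),    # 85 days old
]

week = bytes_in_hot_window(files, 7, now)      # only the newest file qualifies
quarter = bytes_in_hot_window(files, 90, now)  # all three files qualify
print(week, quarter)
```

Comparing a few candidate windows this way gives a first-order view of how much data each setting would pull into the cache before you commit to a policy.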

Conclusion

Query Acceleration is a powerful feature designed to enhance your querying capabilities over petabytes of data. By understanding when and how to use it, you can significantly improve the efficiency and speed of your data operations. Whether you are dealing with large datasets, complex queries, or real-time analytics, Query Acceleration provides the performance boost you need to stay ahead.

Updated Nov 26, 2024
Version 3.0