Azure Data Explorer Insights (ADX Insights) provides a unified view of your clusters' usage, performance, and health. Now, you can use the new "Ingestion" tab to monitor the status of batching ingestion operations to ADX.
In the batching ingestion process, Azure Data Explorer optimizes data ingestion for high throughput by batching incoming small chunks of data into batches based on a configurable ingestion batching policy. The batching policy allows you to set the trigger conditions for sealing a batch (data size, number of blobs, or time passed). These batches are then optimally ingested.
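To make the sealing triggers concrete, here is a minimal Python sketch of the decision described above. It is not how Azure Data Explorer implements batching: the class and function names (`BatchingPolicy`, `OpenBatch`, `seal_reason`) are hypothetical, the threshold values are illustrative only, and the order in which the triggers are checked is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BatchingPolicy:
    """Illustrative sealing thresholds (hypothetical, not read from a cluster)."""
    max_size_bytes: int
    max_blobs: int
    max_age_seconds: float


@dataclass
class OpenBatch:
    """Running totals for a batch that has not been sealed yet."""
    size_bytes: int = 0
    blob_count: int = 0
    age_seconds: float = 0.0


def seal_reason(batch: OpenBatch, policy: BatchingPolicy) -> Optional[str]:
    """Return the trigger that seals the batch, or None if it stays open."""
    if batch.size_bytes >= policy.max_size_bytes:
        return "size"
    if batch.blob_count >= policy.max_blobs:
        return "count"
    if batch.age_seconds >= policy.max_age_seconds:
        return "time"
    return None


# Example thresholds chosen for illustration only.
policy = BatchingPolicy(max_size_bytes=1024**3, max_blobs=500, max_age_seconds=300)

# A small, slow trickle of data is sealed by the time trigger.
print(seal_reason(OpenBatch(size_bytes=10_000, blob_count=3, age_seconds=301), policy))
```

With these example thresholds the call prints `time`: the batch is far below the size and blob limits, so only the elapsed time seals it. This is the same "how was the batch sealed" question that the batching views answer per table.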
In this post, you will learn how to use ADX Insights to monitor batching ingestion.
On the Azure portal, go to the ADX cluster page > "Insights" blade > "Ingestion" tab.
The numbers of successful and failed ingestions are counts of blobs that were ingested or failed to be ingested. (Ingestion is performed per blob. Event Hub and IoT Hub events are aggregated into a single blob, with multiple events per blob, which is then processed as a single source blob for ingestion.)
Succeeded ingestions - "per-table" monitoring
Failed ingestions - "per-table" monitoring
Click on the "Successful" or "Failures" tabs to drill down and see more details per database and table, including:
Table-level monitoring is based on diagnostic logs. To see table-level details, make sure to enable the ingestion diagnostic logs that match your monitoring needs, and send them to Log Analytics.
On the other two tabs, you will find information about:
In the batching ingestion process, Azure Data Explorer optimizes data ingestion for high throughput by grouping small incoming chunks of data into batches, based on a configurable ingestion batching policy. The batching policy lets you set the trigger conditions for sealing a batch for ingestion: data size, number of blobs, or time passed. Other sealing conditions, which can't be configured in the batching policy, are listed under batching types. These batches are then optimally ingested for fast query results.
Batching ingestion stages
There are four stages to batching ingestion, and there are specific components for each step:
Example:
You can monitor your data connections (per event hub or IoT hub) and track the "received data size" for each data connection.
You can also monitor the "discovery latency": the time from when data is enqueued until it is discovered by the ADX data connection. This time is spent upstream of Azure Data Explorer. Discovery latency is available only for data connections (Event Hub, IoT Hub, or Event Grid ingestion).
When data takes a long time to become ready for query, analyzing the discovery latency together with the latencies of the following stages (described in the next steps) can help you understand whether the delay occurs inside ADX or upstream of ADX.
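The comparison above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name `latency_breakdown` is hypothetical, the two inputs are values you would read manually from the discovery latency tile and from the cumulative latency reported at the Storage Engine stage, and treating them as two non-overlapping parts that add up to the total is a simplification.

```python
def latency_breakdown(discovery_s: float, in_adx_s: float) -> dict:
    """Split end-to-end latency into an upstream part and an in-ADX part.

    discovery_s: time from enqueue until the data connection discovers the data.
    in_adx_s:    time from when ADX accepts the data until it is ready for query.
    Both values are assumed to be in seconds and non-overlapping.
    """
    if discovery_s < 0 or in_adx_s < 0:
        raise ValueError("latencies must be non-negative")
    total = discovery_s + in_adx_s
    if total == 0:
        raise ValueError("total latency is zero; nothing to attribute")
    upstream_share = discovery_s / total
    return {
        "total_s": total,
        "upstream_share": upstream_share,
        # Simple majority rule, chosen for illustration only.
        "dominant": "upstream" if upstream_share > 0.5 else "adx",
    }


# Example: 4 minutes before discovery, 1 minute inside ADX.
print(latency_breakdown(discovery_s=240.0, in_adx_s=60.0))
```

In this example most of the delay (80%) is accumulated before ADX discovers the data, so the place to investigate would be upstream of the cluster rather than the batching or ingestion stages.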
This is how the latency is built up over the components:
When using Event Hub, IoT Hub, or Event Grid ingestion, it can be useful to compare the number of events arriving at the ADX data connection with the number of events passed on to the next steps of the ingestion process (in other words, events that the data connection stage processed successfully). The tiles Events Received, Events Processed, and Events Dropped allow you to make this comparison.
Data connection monitoring
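As a rough illustration of the Events Received / Events Processed / Events Dropped comparison, the sketch below derives a drop rate and the number of events not yet accounted for. The function name `connection_summary` is hypothetical, and the counts are assumed to be read from the three tiles over the same time range. Within a short time window, received events may not yet be fully processed, so a non-zero remainder is not necessarily an error.

```python
def connection_summary(received: int, processed: int, dropped: int) -> dict:
    """Compare events entering a data connection with events leaving it."""
    if min(received, processed, dropped) < 0:
        raise ValueError("event counts must be non-negative")
    return {
        # Share of received events that the data connection dropped.
        "drop_rate": dropped / received if received else 0.0,
        # Events neither processed nor dropped within the selected time range.
        "unaccounted": received - processed - dropped,
    }


# Example counts, invented for illustration.
print(connection_summary(received=1000, processed=990, dropped=10))
```

A rising drop rate points at the data connection stage, while a growing unaccounted count over a long time range suggests events are still waiting to be processed.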
The second component of the batching ingestion process is the Batching Manager, which optimizes ingestion throughput by batching data based on the ingestion batching policy.
This step allows you to monitor aspects such as:
Moreover, you can view "per-table" data: the batching duration per table, the batching size per table, and how the batches were sealed per table (as determined by the ingestion batching policy).
Batching monitoring - per DB or per table
The third and fourth components are the Ingestion Manager and the Storage Engine, respectively. In the Storage Engine view, you can see the cumulative latency per database: the time from the moment ADX accepts the data until the data is received by the Storage Engine and is available for query.
The "Amount of data processed" tile shows the number of blobs received, blobs processed (successfully), and blobs dropped. Blobs that have been processed by the Storage Engine are ready for query.
Storage Engine monitoring
Each definition described here can be found throughout the experience!
The definitions of the batching steps are hidden by default, but can be shown by using the "Show help" toggle.
In-product help and definitions
Feel free to comment on this blog post. You can also use the feedback button at the top of the "Insights" page.
More information about the metrics of the batching ingestion can be found here.
More information about other tabs of ADX Insights can be found here.