Azure Data Explorer Insights (ADX Insights) provides a unified view of your clusters' usage, performance, and health. Now you can use the new "Ingestion" tab to monitor the status of batching ingestion operations in ADX.
In the batching ingestion process, Azure Data Explorer optimizes data ingestion for high throughput by grouping incoming small chunks of data into batches based on a configurable ingestion batching policy. The batching policy allows you to set the trigger conditions for sealing a batch (data size, number of blobs, or time elapsed). These batches are then ingested optimally.
In this post, you will learn how to use ADX Insights to monitor batching ingestion.
Here are some questions you can get answers to with ADX Insights:
What is the result of my ingestion attempts? How many ingestions have succeeded or failed? (at database or table granularity)
Are there any tables that may be missing data due to ingestion errors? What exactly are the error details?
What was the amount of data processed by the ingestion pipeline?
What is the latency of the ingestion process? Did the latency build up in ADX's pipeline or upstream of ADX?
How can I better understand how batches are generated during ingestion?
For ingestion using Event Hub, Event Grid, or IoT Hub, how can I compare the number of events arriving at ADX with the number of events sent for ingestion?
A bird's-eye view of the ingestion results
On the Azure portal, go to the ADX cluster page > "Insights" blade > "Ingestion" tab.
At the top of the screen is a "traffic light" representing the number of failed and successful ingestion operations. Other indicators are the overall ingestion latency and the ingestion utilization.
The number of failed and successful ingestions is the number of blobs that were ingested or failed to be ingested. (Ingestion is performed per blob. Event Hub and IoT Hub ingestion events are aggregated into a single blob (multiple events per blob), which is then processed as a single source blob for ingestion.)
Succeeded ingestions - "per-table" monitoring
Failed ingestions - "per-table" monitoring
Click on the "Successful" or "Failures" tabs to drill down and see more details per database and table, including:
The number of successful ingestions per table, including the ingestion success rate.
The number of failed ingestions for each table, along with the status (permanent or transient), error code, and sample error text. You can use the icon to dig deeper into the logs and view more details, for example, a list of other error texts associated with a certain error code.
A time chart showing successful and failed ingestions over time.
Table-level monitoring is based on diagnostic logs. To see table-level details, make sure to enable the ingestion diagnostic logs, according to your monitoring needs, and send them to Log Analytics.
"SucceededIngestion" logs: these contain information about successfully completed ingestion operations.
"FailedIngestion" logs: these contain detailed information about failed ingestion operations, including error details.
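Once these log categories are flowing into Log Analytics, you can query them directly. A minimal sketch against the FailedIngestion table, counting recent failures per table (the column names Database, Table, ErrorCode, and FailureStatus are assumed from the standard schema; verify them in your workspace):

```kusto
// Recent ingestion failures per table, split by permanent vs. transient status
FailedIngestion
| where TimeGenerated > ago(1d)
| summarize Failures = count() by Database, Table, ErrorCode, FailureStatus
| order by Failures desc
```

A similar query against SucceededIngestion gives the success side of the same picture.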
On the other two tabs, you will find information about:
The "Total latency" (cumulative) - the time from the point at which ADX accepts the data until it is available for query.
"Ingestion utilization" - the percentage of resources actually used to ingest data, out of the total resources allocated for ingestion in the capacity policy.
Total latency by database
Visibility into the ingestion process - understand the batching stages
In the batching ingestion process, Azure Data Explorer optimizes data ingestion for high throughput by grouping incoming small chunks of data into batches based on a configurable ingestion batching policy. The batching policy allows you to set the trigger conditions for sealing a batch to be ingested (the conditions are data size, number of blobs, or time elapsed; additional sealing conditions that can't be configured in the batching policy can be found here: batching types). These batches are then ingested optimally for fast query results.
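The three configurable sealing triggers map to three properties of the ingestion batching policy, which is set with a management command. A minimal sketch (MyTable is a placeholder; adjust the values to your workload):

```kusto
// Seal a batch after 30 seconds, 500 blobs, or 1 GB of raw data - whichever is hit first
.alter table MyTable policy ingestionbatching
'{"MaximumBatchingTimeSpan": "00:00:30", "MaximumNumberOfItems": 500, "MaximumRawDataSizeMB": 1024}'

// Inspect the effective policy on the table
.show table MyTable policy ingestionbatching
```

The same policy can also be set at the database level, where it applies to all tables that don't override it.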
Batching ingestion stages
There are four stages to batching ingestion, with a specific component for each stage:
Data Connection - For Event Grid, Event Hub, and IoT Hub ingestion, there is a Data Connection that gets the data from external sources and performs initial data rearrangement.
The Batching Manager batches the received references to data chunks to optimize ingestion throughput based on a batching policy.
The Ingestion Manager sends the ingestion command to the ADX's Storage Engine.
The ADX's Storage Engine stores the ingested data, making it available for query.
You can monitor your data connections (per event hub or IoT hub) and track the "received data size" for each data connection. You can also monitor the "discovery latency" - the time from data enqueue until the data is discovered by the data connection. This time frame is upstream of Azure Data Explorer. Discovery latency is available only for data connections (Event Hub, IoT Hub, or Event Grid ingestion).
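If you route the cluster's platform metrics to Log Analytics, you can chart discovery latency yourself rather than only viewing it in the Insights tiles. A minimal sketch, assuming the standard AzureMetrics table and the ADX metric name DiscoveryLatencyInSeconds (verify both against your diagnostic settings):

```kusto
// Average discovery latency in 15-minute bins over the last day
AzureMetrics
| where MetricName == "DiscoveryLatencyInSeconds"
| where TimeGenerated > ago(1d)
| summarize AvgDiscoveryLatencySec = avg(Average) by bin(TimeGenerated, 15m)
| render timechart
```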
When you see long latency until data is ready for query, analyzing the discovery latency together with the latencies of the subsequent stages (described in the next steps) can help you determine whether the delay occurs inside ADX or upstream of it.
This is how the latency builds up across the components:
When using Event Hub, IoT Hub, or Event Grid ingestion, it can be useful to compare the number of events arriving at the ADX data connection with the number of events sent on to the next steps of the ingestion process (in other words, events processed successfully by the data connection stage). The "Events Received", "Events Processed", and "Events Dropped" tiles allow you to make this comparison.
Data connection monitoring
The second component of the batching ingestion process is the Batching Manager, which optimizes ingestion throughput by batching data based on the ingestion batching policy.
In this step, you can monitor aspects such as:
Batch seal reason - the types of reasons (triggers) that sealed the batch (a batch is sealed for ingestion when the first condition is met). The full list of possible reasons can be found here.
Batching duration - the duration of a batch from the moment it is opened until it is sealed.
Batch size - the expected uncompressed data size in a batch for ingestion.
Moreover, you can view "per-table" data: the batching duration per table, the batch size per table, and how the batches were sealed per table (as determined by the ingestion batching policy).
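The same per-table seal information can be queried if the batching diagnostic logs are routed to Log Analytics. A sketch under the assumption that they land in a table named ADXIngestionBatching with a BatchingType column holding the seal reason; both names should be verified against your workspace schema before use:

```kusto
// How batches were sealed per table over the last day (e.g. by size, item count, or time)
ADXIngestionBatching
| where TimeGenerated > ago(1d)
| summarize Batches = count() by Database, Table, BatchingType
| order by Batches desc
```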
Batching monitoring - per DB or per table
The third and fourth components are the Ingestion Manager and the Storage Engine, respectively. For the Storage Engine, you can see the cumulative latency per database - the time from the moment ADX accepts the data until the data is received by the Storage Engine and is available for query.
The "Amount of data processed" tile shows the number of blobs received, blobs processed (i.e., processed successfully), and blobs dropped. Blobs that have been processed by the Storage Engine are ready for query.
Storage Engine monitoring
No need to learn by heart
Each definition described here can be found throughout the experience! The definitions of the batching steps are hidden by default, but can be shown using the "Show help" toggle.
In-product help and definitions
Feel free to comment on this blog post. You can also use the feedback button at the top of the "Insights" page. More information about the metrics of batching ingestion can be found here. More information about the other tabs of ADX Insights can be found here.