Data Ingestion for Azure Event Hubs: Quick Guide
Use Case:
A customer needs to determine the size and throughput requirements of an event hub that will receive data exported from a Log Analytics workspace. They want to analyze the per-second data ingestion rate to understand the throughput (MB/s) needed for efficient data export.
Solution:
To ensure scalability and accommodate varying data volumes, you can enable the Auto-Inflate feature in Event Hubs. Auto-Inflate automatically increases the number of throughput units (TUs) as usage grows (it scales up only, not back down), so Event Hubs can adjust to workload demands without manual intervention.
As data volumes grow, particularly with Log Analytics workspace data export, Event Hub scaling becomes crucial. For sustained scalability, consider the Standard tier with Auto-Inflate enabled, or the Premium and Dedicated tiers, which scale with processing units (PUs) and capacity units (CUs) respectively. For more information on configuring this feature, refer to Automatically scale up Azure Event Hubs throughput units.
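Once export is running, you can also check whether the namespace is approaching its TU limits. The following is a minimal sketch, assuming the Event Hubs namespace's metrics are routed to the workspace through a diagnostic setting (the metric names come from the standard Event Hubs metric set):

AzureMetrics
| where TimeGenerated > ago(1d)
// Assumes namespace metrics are sent to this workspace via diagnostic settings
| where ResourceProvider == "MICROSOFT.EVENTHUB"
| where MetricName in ("IncomingBytes", "IncomingMessages", "ThrottledRequests")
| summarize Total = sum(Total) by MetricName, bin(TimeGenerated, 1m)

A sustained non-zero ThrottledRequests value suggests the current maximum TU setting is too low for the workload.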
Analyzing Data Before Exporting:
Before setting up data export, you can run Kusto Query Language (KQL) queries to analyze the data volume. For instance, if you plan to export SecurityEvent logs, the following query measures events per second and reports the peak percentiles:
SecurityEvent
| where TimeGenerated > ago(1d)
// Count events in one-second bins
| summarize count() by bin(TimeGenerated, 1s)
// Report peak events/second at the 90th, 98th, and 99th percentiles
| summarize percentiles(count_, 90, 98, 99)
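Since throughput is ultimately measured in MB/s, it also helps to estimate payload size per second. Here is a minimal sketch using the standard _BilledSize column (the billed ingestion size in bytes, which approximates, but may not exactly match, the exported payload size):

SecurityEvent
| where TimeGenerated > ago(1d)
// _BilledSize holds the billed ingestion size in bytes for each record
| summarize BytesPerSecond = sum(_BilledSize) by bin(TimeGenerated, 1s)
| extend MBPerSecond = BytesPerSecond / 1048576.0
| summarize percentiles(MBPerSecond, 90, 98, 99)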
Guidance on writing and running these queries is available at Use queries in Azure Monitor Log Analytics - Azure Monitor | Microsoft Learn.
Note that you need appropriate permissions on the tables you query.
Using these results, you can align your throughput settings with the documented limits and quotas for Event Hubs TUs. For detailed limits, refer to Quotas and limits - Azure Event Hubs | Microsoft Learn.
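As a rough sizing rule for the Standard tier, one TU supports up to 1 MB/s or 1,000 events per second of ingress, whichever limit is reached first. The following is a hypothetical illustration of mapping percentile results to a TU count; the peak values below are placeholders, so substitute your own query results:

// Hypothetical peaks for illustration only; replace with your measured percentiles
print EventsPerSec = 4800, MBPerSec = 3.5
| extend TUsForEvents = ceiling(EventsPerSec / 1000.0)
| extend TUsForThroughput = ceiling(MBPerSec / 1.0)
// Size for the stricter of the two ingress limits
| extend RequiredTUs = max_of(TUsForEvents, TUsForThroughput)

In this example, the event rate is the stricter limit, so the Auto-Inflate maximum should be set to at least 5 TUs.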
This approach helps ensure that your Event Hub configuration meets both current and future data ingestion demands.