Good article, I especially like the very concise and understandable explanation of what 'lag' actually is.
I just wanted to mention two things:
- If you are on the Premium or Dedicated SKU of Event Hubs, lag metrics are already available (you can enable them via Diagnostic Settings). The documentation is here: https://learn.microsoft.com/en-us/azure/event-hubs/monitor-event-hubs-reference#application-metrics-logs
- If you are on the Standard or Basic SKU and do not want to run this yourself (as explained in this article by yodobrin and demonstrated in the linked GitHub repo), you can also use a Marketplace app that we created (https://azuremarketplace.microsoft.com/en-us/marketplace/apps/huditechughaftungsbeschrnkt1673457598758.lag-metrics?tab=overview), which performs the task as a managed solution. One advantage is that it automatically finds all Event Hubs and writes lag metrics for all of them.