Benchmarking Azure Event Hubs Premium for Kafka and AMQP workloads

Published May 24 2022 06:53 AM
Microsoft

 

Azure Event Hubs is a fully managed, real-time data streaming service for your Kafka and AMQP event streaming workloads. We recently announced the general availability of Azure Event Hubs Premium, a new product tier that comes with a state-of-the-art event streaming architecture for high-end event streaming scenarios which require elastic, superior, and predictable performance.

With Premium, you get superior performance and better isolation for your event streaming workloads within a managed multitenant PaaS environment.

 

We conducted a performance benchmarking test for Azure Event Hubs Premium and analyzed how it performs with Apache Kafka and AMQP workloads, focusing on two performance metrics: end-to-end event streaming latency and event publishing latency.

 

This article summarizes the performance benchmarking methodology, the tools we used, and an analysis of the results that we obtained.

 

Benchmarking Framework

Before diving into the details of the performance benchmarking, it’s important to discuss the performance benchmarking framework that we used. We explored the open-source benchmarking tools that are available for event streaming and selected the Linux Foundation project, OpenMessaging Benchmark Framework.

 

OpenMessaging Benchmark Framework (OMB) is a suite of tools that makes it easy to benchmark distributed messaging systems in the cloud. OMB allows you to perform benchmarking for asynchronous messaging or event streaming use cases where you can specify the workload and use existing drivers of different broker implementations (such as Apache Kafka).

 

Test Setup

The performance benchmarking of Azure Event Hubs Premium was conducted on the test setup shown in the following figure. All the applications and services were deployed in the same Azure region and tests were carried out for both Kafka and AMQP using the OMB publisher and consumer APIs.

 

perf-setup.png

 

 

OMB Application

The OpenMessaging Benchmark Framework application was executed on an Azure Linux virtual machine deployed in the same region. The VM was created with the following configuration:

  • Size: Standard B4ms (4 vcpus, 16 GiB memory)
  • OS: Linux (Ubuntu 18.04)
  • Location: East US
  • Networking: Accelerated networking enabled
  • JDK: openjdk 11.0.11 2021-04-20

 

Azure Event Hubs Premium namespace

For the benchmarking test, we used an Azure Event Hubs Premium namespace with 4 Processing Units (PUs) of capacity, provisioned in the same region where we ran our OMB benchmarking client application.

 

The namespace that we used for this benchmark test is in a region that has Availability Zones (AZ) support enabled. The Event Hubs Premium namespace is replicated across Azure availability zones, which are physically separate locations within each Azure region that are tolerant to local failures. Therefore, this benchmarking test measures the performance of event streaming while events are replicated across all these availability zones. Note that AZ support is included at no additional cost in the Premium tier for all the regions that currently support AZs.

 

Kafka and Azure Event Hubs AMQP drivers for OMB

For our benchmarking test, we wanted to test event streaming workloads that use both Kafka and AMQP. For Apache Kafka workloads, we used the existing Kafka driver of OMB without any code changes. For AMQP, we built a new AMQP driver for OMB using the Azure Event Hubs Java SDK. We are planning to contribute it to the OpenMessaging Benchmark Framework project in the future. A sample implementation of the AMQP driver can be found here.

 

Latency Test

Low-latency event ingestion and consumption is increasingly becoming a key requirement of most event streaming use cases. Therefore, we conducted an end-to-end (E2E) latency test for the Event Hubs Premium tier, where we measured the time it takes a message to travel from producer to consumer through the event streaming engine or broker.

We measured end-to-end latency across different partition counts, using a moderate event streaming workload of 1 MB/s.
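
To make the end-to-end metric concrete, here is a minimal Python sketch of one common way to measure it: the producer stamps each event with its publish time, and the consumer subtracts that stamp from its receive time. This is an illustration only, not necessarily how OMB implements the measurement. It assumes producer and consumer share a clock, and the function names (make_event, e2e_latency_ms, percentile) are hypothetical.

```python
import json
import math


def make_event(payload: str, publish_ns: int) -> bytes:
    """Producer side: wrap the payload together with its publish timestamp (ns)."""
    return json.dumps({"publish_ns": publish_ns, "payload": payload}).encode("utf-8")


def e2e_latency_ms(event: bytes, receive_ns: int) -> float:
    """Consumer side: receive time minus the embedded publish time, in ms."""
    publish_ns = json.loads(event.decode("utf-8"))["publish_ns"]
    return (receive_ns - publish_ns) / 1_000_000


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latency samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

In a real run, publish_ns and receive_ns would come from a clock such as time.time_ns(), and the percentile helper would be applied to all collected samples (for example, the 50th and 99th percentiles).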

 

Apache Kafka

For the Apache Kafka workload we used the following producer and consumer configurations. To optimize the use case for the latency test, we batched events for at most 1 ms (linger.ms=1) with a batch size of 131072 bytes, which is the maximum number of bytes that will be included in a batch.

The full configuration for Kafka producer and consumer can be found below. 

 

 

 

replicationFactor: 3
topicConfig: |
  min.insync.replicas=2

commonConfig: |
  bootstrap.servers=<eventhubs-namespace-name>.servicebus.windows.net:9093
  security.protocol=SASL_SSL
  sasl.mechanism=PLAIN
  sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="xxx";
  request.timeout.ms=120000

producerConfig: |
  acks=all
  linger.ms=1
  batch.size=131072

consumerConfig: |
  auto.offset.reset=earliest
  enable.auto.commit=false
  max.partition.fetch.bytes=1048576
  fetch.max.wait.ms=10
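
A Kafka producer sends a batch when it reaches batch.size or when linger.ms elapses, whichever comes first. The following Python sketch estimates which limit applies for the producer settings above. It is a back-of-the-envelope calculation under stated assumptions: 1 MB/s is taken as 1,000,000 bytes per second, events are spread evenly across partitions, and the partition counts in the loop are hypothetical examples rather than the counts used in our tests.

```python
def bytes_per_linger_window(throughput_bytes_per_s, linger_ms, partitions):
    """Average bytes that accumulate for one partition during one linger window."""
    return throughput_bytes_per_s * linger_ms / 1000 / partitions


def batch_trigger(throughput_bytes_per_s, linger_ms, batch_size_bytes, partitions):
    """Return the setting that is expected to close a batch first."""
    pending = bytes_per_linger_window(throughput_bytes_per_s, linger_ms, partitions)
    return "batch.size" if pending >= batch_size_bytes else "linger.ms"


THROUGHPUT = 1_000_000   # assumption: 1 MB/s == 1,000,000 bytes/s
LINGER_MS = 1            # linger.ms from producerConfig
BATCH_SIZE = 131072      # batch.size from producerConfig

for partitions in (1, 10, 100):  # hypothetical partition counts
    pending = bytes_per_linger_window(THROUGHPUT, LINGER_MS, partitions)
    trigger = batch_trigger(THROUGHPUT, LINGER_MS, BATCH_SIZE, partitions)
    print(f"{partitions} partition(s): ~{pending:.0f} bytes per window, closed by {trigger}")
```

Under these assumptions, only about 1,000 bytes accumulate per millisecond, so the 1 ms linger bound closes each batch long before the 131072-byte limit is reached. That is consistent with a configuration tuned for latency rather than throughput.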

 

 

 

 

AMQP

For the AMQP workload we used the Event Hubs SDK to implement the OMB driver. To optimize the use case for the latency tests, we didn’t use batching (batch size = 1 event) and avoided frequent checkpointing, which involves committing offsets to an external checkpoint store. Other than that, we kept the default values of the Event Hubs SDK configurations for both producer and consumer implementations.
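
The effect of checkpoint frequency can be sketched in Python as follows. This is a simulation under stated assumptions, not the Event Hubs SDK API: InMemoryCheckpointStore stands in for the external checkpoint store, and CheckpointingConsumer and checkpoint_every are hypothetical names introduced for illustration.

```python
class InMemoryCheckpointStore:
    """Stand-in for an external checkpoint store; counts how often it is written."""

    def __init__(self):
        self.writes = 0
        self.offsets = {}

    def save(self, partition_id, offset):
        self.writes += 1
        self.offsets[partition_id] = offset


class CheckpointingConsumer:
    """Commits the latest offset only once every `checkpoint_every` events per partition."""

    def __init__(self, store, checkpoint_every):
        self.store = store
        self.checkpoint_every = checkpoint_every
        self._since_last = {}

    def on_event(self, partition_id, offset):
        count = self._since_last.get(partition_id, 0) + 1
        if count >= self.checkpoint_every:
            self.store.save(partition_id, offset)  # the expensive external write
            count = 0
        self._since_last[partition_id] = count


store = InMemoryCheckpointStore()
consumer = CheckpointingConsumer(store, checkpoint_every=100)
for offset in range(1000):
    consumer.on_event("0", offset)
print(store.writes)  # 10 store writes instead of 1000
```

The trade-off is that a less frequent checkpoint means more events may be reprocessed after a consumer restart, so the interval is a balance between latency and recovery cost.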

 

Latency Results

The following graph outlines the results that we obtained from the latency tests for both Kafka and AMQP. Azure Event Hubs Premium consistently delivers an end-to-end latency of ~10 ms or less for both Kafka and AMQP workloads across different partition counts.

 

latency2.png

 

In addition to the end-to-end latency, we also measured the publisher latency, which represents the time it takes to successfully publish an event to Event Hubs and receive an acknowledgement.
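
As an illustration of the publisher latency metric, the following Python sketch times a single publish call. It assumes the supplied send function blocks until the broker acknowledges the event; timed_publish, send, and clock are hypothetical names, not part of OMB or any SDK.

```python
import time


def timed_publish(send, event, clock=time.perf_counter_ns):
    """Return the publish latency in ms: time from send until acknowledgement.

    `send` is assumed to block until the event has been acknowledged.
    """
    start = clock()
    send(event)
    return (clock() - start) / 1_000_000


def fake_send(event):
    """Placeholder for a real publish call that waits for the acknowledgement."""
    time.sleep(0.002)


latency_ms = timed_publish(fake_send, b"hello")
print(f"publish latency: {latency_ms:.2f} ms")
```

The clock parameter is injectable so that the measurement logic can be verified with a deterministic clock, independent of real network timing.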

 

Let’s have a closer look at the latency numbers we obtained for each workload.

 

Latency results for streaming events with Apache Kafka in Event Hubs

The following table records the latency numbers that we obtained for streaming data using the Apache Kafka API of Event Hubs. It includes both end-to-end latency and publisher latency, and shows how latency changes with the number of partitions in an event hub (or topic).

 

Kasun_Indrasiri_0-1653350612172.png

 

Please note that we increased the consumer count of the Kafka consumer application up to the number of partitions used in the test, so that there were enough connections from the consumers to Event Hubs to prevent a bottleneck on the event consumption path.
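
The reason for matching the consumer count to the partition count can be sketched in Python. The round-robin assignment below is an illustrative assumption and not Kafka's actual partition assignor; assign_partitions is a hypothetical helper.

```python
def assign_partitions(partition_count, consumer_count):
    """Spread partitions over consumers round-robin; returns {consumer: [partitions]}."""
    assignment = {consumer: [] for consumer in range(consumer_count)}
    for partition in range(partition_count):
        assignment[partition % consumer_count].append(partition)
    return assignment


def max_partitions_per_consumer(partition_count, consumer_count):
    """Largest number of partitions that any single consumer has to read."""
    assignment = assign_partitions(partition_count, consumer_count)
    return max(len(partitions) for partitions in assignment.values())


# Hypothetical example: 8 partitions read by 2 consumers versus 8 consumers.
print(max_partitions_per_consumer(8, 2))  # each consumer reads 4 partitions
print(max_partitions_per_consumer(8, 8))  # each consumer reads 1 partition
```

With one consumer per partition, no consumer has to multiplex several partitions, which removes the consumption path as a likely source of added latency.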

 

Latency results for streaming events with AMQP in Event Hubs

We took the same latency measurements for AMQP, using the OMB driver that is implemented with the Azure Event Hubs Java SDK.

 

Kasun_Indrasiri_1-1653350679607.png

 

Unlike the Kafka test, we didn’t have to increase the number of consumer instances for the Azure Event Hubs SDK-based consumer, as it creates sufficient connections based on the number of partitions it reads from.

One key observation from these latency measurements is that we were able to observe similar low latency numbers for both Kafka and AMQP APIs of Event Hubs.

 

Latency Predictability – Kafka Workloads

To measure the predictability of the event streaming cloud services, we measured the average end-to-end latency for the same workload (1 MB/s) across multiple test iterations. For this test we also compared the latency results of Event Hubs Premium with those of an external Kafka cloud provider.

 

The intention of this test is to compare the end-to-end latency of cloud services that support Kafka workloads without manually managing Kafka clusters. Therefore, no specific service-side tuning was required for this test, and all the client-side (producer and consumer) configurations that we used in the above latency test for Kafka were used as is.

 

As you can see in the following graph, the end-to-end latency for Event Hubs Premium consistently stays below 10 ms, while the latency numbers for the external Kafka cloud provider fluctuated heavily between 20 ms and 80 ms.

 

latency-predictability.png

We used the same Kafka producer and consumer configuration across both services, and the tests were carried out multiple times to ensure consistency.
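
One simple way to quantify predictability is to look at the spread of the average latency across test iterations. The Python sketch below shows the idea. The input values are made-up placeholders chosen only to demonstrate the calculation; they are not our measured results, and summarize is a hypothetical helper.

```python
import statistics


def summarize(latencies_ms):
    """Mean and spread of per-iteration average latencies (all values in ms)."""
    return {
        "mean": statistics.mean(latencies_ms),
        "stdev": statistics.pstdev(latencies_ms),
        "min": min(latencies_ms),
        "max": max(latencies_ms),
        "range": max(latencies_ms) - min(latencies_ms),
    }


# Placeholder inputs for illustration only (NOT measured data).
steady_service = [8, 9, 8, 9]
fluctuating_service = [20, 80, 35, 60]

for name, samples in (("steady", steady_service), ("fluctuating", fluctuating_service)):
    stats = summarize(samples)
    print(f"{name}: mean={stats['mean']:.1f} ms, range={stats['range']} ms")
```

A small range and standard deviation across iterations indicate predictable latency, while a large range indicates that results depend heavily on when the test happens to run.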

 

Summary

In this article, we analyzed the details of the Azure Event Hubs Premium benchmarking test for event streaming latency. Azure Event Hubs Premium provides low-latency event streaming for both Kafka and AMQP workloads. For the workloads and configurations we tested, we were able to reach end-to-end and publishing latencies of less than 10 ms.

 

Azure Event Hubs Premium's latency is also predictable, which means it does not vary depending on the workloads of other tenants in the cloud service. In comparison with similar cloud services that offer Kafka, Event Hubs was able to consistently stream data with low latency.

 

To try out and learn more about Azure Event Hubs Premium check out the links below.

 

Last update: May 24 2022 09:27 AM