
Internet of Things Blog

Azure IoT Operations MQTT Broker: Performance Benchmarking on Throughput and Latency

davidemakenemi
Jul 02, 2025

1. Introduction

When deploying an MQTT broker in a production environment, understanding its performance characteristics is crucial. Whether you're handling IoT sensor data, real-time event streams, or enterprise messaging, knowing how the broker performs under load helps in optimizing deployments.

In this post, we evaluate the performance of the Azure IoT Operations MQTT Broker (subsequently referred to as Broker for brevity), focusing on:

  • Throughput – How many messages per second the broker can handle.
  • Latency – The time taken for messages to travel from publishers to subscribers.

All tests were conducted using MQTT QoS 1 to strike a consistent balance between reliability and throughput.

By following a structured performance testing approach, we aim to provide insights into how the Broker scales and where potential bottlenecks may arise.

👉 If you're looking for a quick summary, jump to the Key Takeaways section below.

2. Test Setup

For accurate benchmarking, we set up Standard_D4s_v5 virtual machines (VMs) to ensure consistent and efficient message handling. To replicate our performance results, use the same VM SKU and test configuration.

2.1 Infrastructure configuration

Hardware configuration

  • VM Architecture: x64
  • VM Image: Ubuntu Server 22.04 LTS - x64 Gen2
  • VM SKU: Standard_D4s_v5
  • vCPUs: 4
  • Memory: 16 GiB RAM
  • Networking: All VMs are within the same virtual network (VNet) to minimize latency and reduce external network delays

Software configuration

  • OS Flavor: Ubuntu Server 22.04 LTS
  • Version: 22.04 LTS
  • Kubernetes distribution: K3s
  • Kubernetes version: v1.28.5
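To reproduce this environment, the VMs and K3s can be stood up roughly as follows. This is a minimal sketch: the resource group, VM names, and admin user are placeholders, and the exact K3s patch build (the "+k3s1" suffix) is an assumption to verify against the published K3s releases.

    # Sketch: create one benchmark VM; repeat for the five broker nodes and the
    # client node, adjusting --name and --size as needed.
    az vm create \
      --resource-group aio-bench-rg \
      --name aio-bench-node-1 \
      --image Canonical:0001-com-ubuntu-server-jammy:22_04-lts-gen2:latest \
      --size Standard_D4s_v5 \
      --admin-username azureuser \
      --generate-ssh-keys

    # Sketch: on each broker node, install K3s pinned to the Kubernetes version
    # used in these tests (verify the exact build exists before pinning).
    curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.28.5+k3s1" sh -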

2.2 Azure IoT Operations configuration

The Azure IoT Operations configuration defined below is optimized for performance testing and MUST NOT be used in production as TLS encryption, authentication, and diagnostics pods are disabled to reduce variability.

The Broker consists of frontend replicas and backend partitions for optimized message handling. This setup is sized for a 5-node cluster, ensuring scalability and redundancy.

Broker Configuration:

  • Frontend:
    • 5 replicas
  • Backend:
    • 5 partitions
    • Redundancy factor of 2
    • 2 workers

Note: Increased redundancy doubles CPU usage and therefore reduces the CPU available for the same workload, potentially impacting overall efficiency.

  • Broker Listener: Configured with a LoadBalancer service on port 1883
  • Broker Nodes: 5 x Azure D4s_v5 VMs (4 vCPUs, 16 GiB memory, Ubuntu 22.04)
  • Client Node: 1 x Azure Standard_D16s_v5 VM (16 vCPUs, 64 GiB memory) for load testing
    • Note: A VM with at least 8 cores is recommended to prevent the client from becoming a bottleneck, as emqtt-bench (by EMQX) has high CPU consumption.

The broker configuration is available in the Azure IoT MQTT Optimization JSON file; a sketch of the equivalent resource definitions follows.
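For illustration, the cardinality described above corresponds roughly to the Broker and BrokerListener resources below. This is a sketch only: the API version, resource names, and listener layout are assumptions, so verify the field names against the CRDs installed with your Azure IoT Operations version before applying anything.

    # broker-cardinality.yaml - sketch of the test cardinality as Azure IoT
    # Operations custom resources (field names to be verified against your CRDs).
    apiVersion: mqttbroker.iotoperations.azure.com/v1
    kind: Broker
    metadata:
      name: default
    spec:
      cardinality:
        frontend:
          replicas: 5          # one frontend replica per node for connection spread
        backendChain:
          partitions: 5        # one partition per node for message throughput
          redundancyFactor: 2  # two copies of each partition; doubles CPU usage
          workers: 2           # workers per backend replica
    ---
    apiVersion: mqttbroker.iotoperations.azure.com/v1
    kind: BrokerListener
    metadata:
      name: loadbalancer-listener
    spec:
      brokerRef: default
      serviceType: LoadBalancer
      ports:
        - port: 1883           # plain MQTT; TLS is disabled here for benchmarking only

    # Apply with (namespace assumed to be azure-iot-operations):
    #   kubectl apply -n azure-iot-operations -f broker-cardinality.yaml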

3. Methodology

To evaluate the performance of the Azure IoT Operations MQTT broker, we used emqtt-bench, an open-source MQTT v5.0 benchmark tool from EMQX. For optimal performance during testing, the inflight queue should be set to at least 100.
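For example, a publisher run with the inflight window raised to 100 might look like the sketch below; the broker address, topic, client count, and publish interval are placeholders, and emqtt_bench pub --help lists the full set of options.

    # Sketch: 100 publishers sending 8 KB QoS 1 messages, each with an inflight
    # window of 100 (-F/--inflight). <broker-ip> is the listener's LoadBalancer IP.
    emqtt_bench pub -h <broker-ip> -p 1883 \
      -c 100 -i 10 \
      -t bench/%i -s 8192 -q 1 \
      -I 10 -F 100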

3.1 Client Configuration

For 5-node cluster testing, a dedicated high-performance VM is required to act as the client. This VM must be separate from the cluster to prevent resource contention, ensuring that benchmarking reflects the broker's actual optimal performance.

3.2 Understanding the Performance Metrics

  1. Maximum Throughput – Measures the highest number of messages per second the broker can process.  

Note: Optimal performance requires finding a balance—publishers should send messages fast enough to fully utilize subscribers without overwhelming them.

  2. Average Latency – The time, in milliseconds, it takes for a message to travel from a publisher to a subscriber.
  3. Message Size – Payloads of 16 B, 8 KB, and 255 KB were tested to evaluate how payload size affects throughput and latency.
  4. Data Throughput – Measures the total volume of data transmitted per second, expressed in megabytes per second (MB/sec). For example, 34,193 msg/sec with 8 KB payloads corresponds to roughly 267 MB/sec.

3.3 Test Scenarios

We tested the broker under different conditions to observe how it handles increasing workloads:

  1. Varying Publisher Rates – Analyzing throughput changes with increasing message rates.
  2. Different Payload Sizes – Measuring the impact of small (16 B), medium (8 KB), and large (255 KB) payloads.
  3. Fan-In / Balanced / Fan-Out – Comparing multiple publishers to one subscriber (fan-in) vs. one publisher to many subscribers (fan-out) vs. an equal number of publishers and subscribers (balanced).
  4. Publisher / Subscriber Configuration – Varying the number of publishers and subscribers across the three scenarios.
  5. QoS - All tests were performed using MQTT QoS 1, which ensures at least once message delivery. This strikes a balance between reliability and performance, making it more representative of real-world production scenarios where message loss is unacceptable, but the overhead of QoS 2 is not justified.

 

We measured broker efficiency using different payload sizes across different publisher-to-subscriber ratios. The Fan-In test evaluated performance with a high number of publishers sending messages to a single subscriber. The Fan-Out stress test analyzed message distribution from a limited number of publishers to many subscribers under high throughput conditions. The Balanced test simulated a mixed workload with equal publishers and subscribers.
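For reference, the three traffic shapes can be driven with emqtt-bench along the lines sketched below; topics, intervals, and the broker address are illustrative, and in each case the subscriber command is started in its own shell before the publishers.

    # Fan-In sketch: 1000 publishers -> 1 subscriber on a shared topic.
    emqtt_bench sub -h <broker-ip> -p 1883 -c 1    -t bench/fanin -q 1
    emqtt_bench pub -h <broker-ip> -p 1883 -c 1000 -t bench/fanin -q 1 -s 8192 -I 100 -F 100

    # Fan-Out sketch: 1 publisher -> 1000 subscribers on a shared topic.
    emqtt_bench sub -h <broker-ip> -p 1883 -c 1000 -t bench/fanout -q 1
    emqtt_bench pub -h <broker-ip> -p 1883 -c 1    -t bench/fanout -q 1 -s 8192 -I 1 -F 100

    # Balanced sketch: 100 publishers paired with 100 subscribers by topic
    # (%i expands to the client's sequence number).
    emqtt_bench sub -h <broker-ip> -p 1883 -c 100 -t bench/%i -q 1
    emqtt_bench pub -h <broker-ip> -p 1883 -c 100 -t bench/%i -q 1 -s 8192 -I 10 -F 100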

4. Results

Detailed Performance Metrics: Fan-In, Fan-Out, and Balanced Scenarios

 
Scenario | Configuration    | Payload Size | Max Throughput (msg/sec) | Data Throughput (MB/sec) | Average Latency (ms) | Workload Description
Fan-In   | 1000 pub 1 sub   | 16 B         | 41,352                   | 0.63                     | 124                  | High-Load Fan-In
Fan-In   | 1000 pub 1 sub   | 8 KB         | 14,439                   | 112.67                   | 26                   | High-Load Fan-In
Fan-In   | 1000 pub 1 sub   | 255 KB       | 992                      | 246.85                   | 20                   | High-Load Fan-In
Balanced | 1 pub 1 sub      | 16 B         | 50,739                   | 6.49                     | 2                    | Balanced Mixed-Load
Balanced | 1 pub 1 sub      | 8 KB         | 9,500                    | 77.8                     | 10                   | Balanced Mixed-Load
Balanced | 1 pub 1 sub      | 255 KB       | 1,314                    | 327.08                   | 540                  | Balanced Mixed-Load
Balanced | 100 pub 100 sub  | 16 B         | 279,949                  | 4.27                     | 350                  | Balanced Mixed-Load
Balanced | 100 pub 100 sub  | 8 KB         | 34,193                   | 266.95                   | 139                  | Balanced Mixed-Load
Balanced | 100 pub 100 sub  | 255 KB       | 2,871                    | 715.42                   | 2,800                | Balanced Mixed-Load
Fan-Out  | 1 pub 1000 sub   | 16 B         | 42,000                   | 0.64                     | 4                    | Large-scale Broadcast Fan-Out
Fan-Out  | 1 pub 1000 sub   | 8 KB         | 15,003                   | 117.25                   | 6                    | Large-scale Broadcast Fan-Out
Fan-Out  | 1 pub 1000 sub   | 255 KB       | 1,000                    | 249.86                   | 130                  | Large-scale Broadcast Fan-Out

5. Key Takeaways

  • Takeaway 1:  Data Throughput Scales with Payload Size. Even though the number of messages per second drops with larger payloads, data throughput (MB/sec) increases significantly. For example:
    • Fan-In at 255 KB: 246.8 MB/sec
    • Balanced at 255 KB: 715.4 MB/sec

  • Takeaway 2: The Broker Performs Best in Low-Latency Use Cases.

    When message sizes are small (e.g. 16 B, 8 KB) and the topology is lightweight (e.g. 1 pub to 1 sub), the broker achieves:

    • Avg latency as low as 1-2 ms
    • Throughput over 270,000 msg/sec (Balanced scenario at 16 B)

     Ideal low-latency use cases:

    • Real-time control systems (e.g. robotic arm commands, PLC feedback loops)
    • Smart home device synchronization
    • Autonomous vehicle telemetry coordination
    • Industrial automation events (e.g. triggers from sensors to actuators)


    For time-sensitive operations, our broker provides sub-10 ms latencies and massive message fanout capability, even under constrained payload sizes.

     

  • Takeaway 3: Fan-In Saturates Faster Than Fan-Out

    In QoS 1 tests, we observed Fan-In topologies (1000 devices → 1 endpoint) hit latency walls earlier than Fan-Out topologies (1 device → 1000 endpoints), even with similar message throughput.

    • Fan-In (8 KB): 14,439 msg/sec @ 26 ms latency
    • Fan-Out (8 KB): 15,003 msg/sec @ only 6 ms latency

     What this shows:

    • In Fan-In, the broker handles thousands of simultaneous inbound QoS 1 acknowledgments — creating coordination pressure.
    • In Fan-Out, a single publisher sends at a controlled rate, making it easier for the broker to fan out efficiently.


    We designed our broker to sustain intelligent traffic shaping and are continuing to enhance its performance under Fan-In workloads where coordination pressure is highest.

     

6. Optimization Strategies

The Azure IoT Operations MQTT broker is built to support scalable, high-throughput, and low-latency messaging. To harness its full potential across diverse workload patterns, optimization should focus on balanced resource utilization and minimizing message delivery bottlenecks.

 

  1. Maximize Throughput Without Overload

Fan-Out scenarios achieved strong throughput with consistently low latency, even under high subscriber counts. While they didn’t reach the raw message rate of Balanced workloads, their efficiency under broadcast pressure makes them ideal for scenarios requiring timely delivery to many endpoints.

Recommended Actions:

  • Batch and Compress Messages: Reduces overhead, improving payload transmission rates (see the sketch after this list).
  • Balance Publish Load: Distribute publishers evenly across broker nodes to avoid overloading a single point of ingestion.
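As a simple illustration of the first point, several readings can be combined into one payload, compressed, and published as a single QoS 1 message; the file names and topic here are hypothetical, and mosquitto_pub's -f option sends a file's contents as the message body.

    # Sketch: batch several readings, compress them, and publish the result
    # as one QoS 1 message.
    cat reading-*.json | gzip -c > batch.json.gz
    mosquitto_pub -h <broker-ip> -p 1883 -q 1 -t telemetry/batched -f batch.json.gz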

 

  2. Maintain Low Latency for Real-Time Use Cases

The broker excels in low-latency performance for small payloads and tightly coupled pub-sub pairs — as seen in 1:1 scenarios with 16 B payloads achieving 2–6 ms latency. These characteristics are crucial for real-time, control-plane workloads.

Recommended Actions:

  • Use Smaller Payloads for Time-Sensitive Ops: Critical for scenarios like robotics, actuator control, or telemetry alerting.
  • Load Balance Across Nodes: Adjust broker cardinality, including frontend replicas (for client connection distribution) and backend partitions (for message throughput scaling) to ensure even load distribution across nodes and optimal performance.
  • Enable MQTT Persistent Sessions: Minimizes reconnection overhead for frequently offline clients. With the Mosquitto CLI, persistent sessions can be enabled by setting -c and --session-expiry-interval, as sketched below.
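A minimal sketch of a persistent QoS 1 subscriber using the options named above; the spelling of the session-expiry flag varies between client versions, so verify it against your mosquitto_sub documentation.

    # Sketch: QoS 1 subscriber with a persistent session. -c disables clean
    # session; the session-expiry option below is as named in this post and
    # should be checked against your mosquitto_sub version.
    mosquitto_sub -h <broker-ip> -p 1883 -V mqttv5 -q 1 \
      -i telemetry-logger -c --session-expiry-interval 3600 \
      -t "sensors/#"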

 

  3. Optimize Deployment Scale for Workload Demands

Scalability depends on configuring the right cardinality — that is, the number of frontend replicas, backend partitions, and compute resources — to match your connection and throughput requirements.

Recommended Actions:

  • High Connection Load: Scale frontend replicas to match node count (e.g., 3 replicas for 3 nodes) to distribute client connections evenly.
  • High Message Throughput: Increase backend partitions to parallelize message processing (e.g., start with 1 partition per node and scale as needed).
  • Heavy Payload Scenarios: Allocate more memory and CPU to worker pods to avoid slowdowns from large payload serialization and transmission.
  • Backend Resiliency: Ensure redundancyFactor remains at the default of 2 (or more) so that each partition has at least two replicas, enabling failover protection without additional configuration.

For detailed guidance on these optimizations, visit our documentation: Learn more →

7. Conclusion

The Azure IoT Operations MQTT broker is engineered for high performance, scalability, and efficiency, as demonstrated through rigorous benchmarking. In high-throughput balanced configurations, it sustained up to 279,949 messages/sec with 16 B payloads, showcasing best-in-class throughput for high-volume, symmetric pub-sub workloads. For bandwidth-heavy use cases, the broker handled up to 715 MB/sec (255 KB payloads), proving its scalability for large data transfers.

The Balanced 1:1 scenario also delivered predictable low-latency performance, with average latencies as low as 2 ms, making it ideal for real-time messaging. Meanwhile, Fan-In configurations remain optimal for centralized data aggregation tasks like telemetry logging, handling tens of thousands of messages/sec with acceptable latency.

To maximize performance, we recommend key optimization strategies including load balancing, latency reduction, and workload-specific tuning. These approaches ensure efficiency at scale—whether you're managing high connection loads, scaling throughput, or handling large payloads in real-world deployments. For in-depth configuration guidance, visit our documentation.

Updated Jun 27, 2025
Version 1.0