We are excited to announce the public preview of JSON schema support in the Azure Event Hubs schema registry for Kafka applications. The Event Hubs schema registry offers a central repository for schema documents utilized by messaging-centric and event-driven applications.
By using Azure Schema Registry, both the producer and consumer applications can seamlessly exchange data without the need to handle and distribute the schema.
We are further expanding the capabilities of Azure Schema Registry in Event Hubs by adding support for new schema formats to enable schema-driven event streaming.
The Azure Event Hubs schema registry is available at no additional charge in the Standard, Premium and Dedicated SKUs.
Why JSON Schema?
JSON Schema is used to define and validate the structure of JSON data. It helps ensure consistency, completeness, and accuracy of data by providing a clear definition of the expected format.
By incorporating JSON Schema validation into the event streaming applications, developers can ensure that any data being produced or consumed adheres to the predefined schema. This helps to prevent issues such as missing fields, incorrect data types, and inconsistent data formats.
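To make the two failure modes above concrete, here is a minimal sketch in plain Java. It is only an illustration: the `OrderEventCheck` class, the `orderId` and `amount` fields, and the hand-rolled check are all hypothetical, and the check merely mimics the JSON Schema `required` and `type` keywords. It is not a JSON Schema validator and is not how the schema registry serializer works internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: mimics the JSON Schema "required" and "type" keywords
// with a hand-rolled check. This is not a JSON Schema validator.
final class OrderEventCheck {

    // Hypothetical rules for an "order" event: field name -> expected Java type.
    // Roughly what "required": ["orderId", "amount"] plus "type": "string" and
    // "type": "number" would express in a JSON Schema document.
    static final Map<String, Class<?>> REQUIRED_FIELDS = new LinkedHashMap<>();
    static {
        REQUIRED_FIELDS.put("orderId", String.class);
        REQUIRED_FIELDS.put("amount", Number.class);
    }

    // Returns one message per violated rule; an empty list means the event passes.
    static List<String> validate(Map<String, Object> event) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Class<?>> rule : REQUIRED_FIELDS.entrySet()) {
            Object value = event.get(rule.getKey());
            if (value == null) {
                errors.add("missing field: " + rule.getKey());
            } else if (!rule.getValue().isInstance(value)) {
                errors.add("wrong type for field: " + rule.getKey());
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Object> good = new HashMap<>();
        good.put("orderId", "A-1");
        good.put("amount", 12.5);
        System.out.println(validate(good));

        Map<String, Object> bad = new HashMap<>();
        bad.put("orderId", 42); // wrong type, and "amount" is missing
        System.out.println(validate(bad));
    }
}
```

In a real application this check is expressed declaratively in the JSON Schema document and enforced by the serializer, so the application code does not carry rules like these.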
Schema validation with Apache Kafka applications
In client-side schema validation use cases, you can use the Azure Event Hubs schema registry to serialize and deserialize events in client applications. In this scenario, a Kafka producer application uses a JSON schema stored in Azure Schema Registry to serialize events and publish them to a Kafka topic (event hub) in Azure Event Hubs. The Kafka consumer deserializes the events it consumes from Event Hubs using each event's schema ID and the corresponding JSON schema, which it fetches from Azure Schema Registry.
To use JSON Schema, create a new schema group in the Azure Event Hubs schema registry and select JSON Schema as the schema format. Then, under that schema group, create the JSON schemas that you plan to use for schema validation.
Kafka producer with schema validation
Azure Event Hubs Schema Registry makes it simple to enable schema validation for Kafka applications using JSON Schema. In your Kafka producer application, you only need to use the JSON Schema serializer (com.microsoft.azure.schemaregistry.kafka.json.KafkaJsonSerializer) and include the schema registry metadata to establish connectivity.
The JSON Schema serializer for the schema registry takes care of connecting to the schema registry, fetching schemas, and serializing the event when it is published through Kafka.
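The producer setup described above might look roughly like the following sketch, which only builds the configuration using `java.util.Properties`. The serializer class name comes from this post and the Kafka keys (`bootstrap.servers`, `key.serializer`, `value.serializer`) are standard. The schema registry keys `schema.registry.url` and `schema.group`, the URL format, and the `ProducerConfigSketch` class itself are assumptions for illustration, so check them against the serializer's documentation. Authentication settings for Kafka and for the schema registry are deliberately left out.

```java
import java.util.Properties;

// Sketch only: builds producer settings; it does not create a KafkaProducer.
final class ProducerConfigSketch {

    static Properties build(String namespace, String schemaGroup) {
        Properties props = new Properties();

        // Standard Kafka client settings. Event Hubs exposes its Kafka
        // endpoint on port 9093 of the namespace host.
        props.put("bootstrap.servers", namespace + ".servicebus.windows.net:9093");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // JSON Schema serializer named in this post.
        props.put("value.serializer",
                "com.microsoft.azure.schemaregistry.kafka.json.KafkaJsonSerializer");

        // Schema registry metadata. These key names and the URL format are
        // assumptions; verify them before use.
        props.put("schema.registry.url", "https://" + namespace + ".servicebus.windows.net");
        props.put("schema.group", schemaGroup);

        return props;
    }

    public static void main(String[] args) {
        // "my-namespace" and "my-json-schema-group" are placeholder values.
        Properties props = build("my-namespace", "my-json-schema-group");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With the Kafka client library and the serializer on the classpath, properties like these would be passed to the `KafkaProducer` constructor along with the security settings omitted here.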
Kafka consumer with schema validation
When you use Kafka on the consumer side of Event Hubs, you can follow the same approach: specify only the connection details of the schema registry and the deserializer class (com.microsoft.azure.schemaregistry.kafka.json.KafkaJsonDeserializer) used for deserializing events.
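A matching consumer-side sketch, again limited to building the configuration with `java.util.Properties`. The deserializer class name comes from this post and `group.id`, `key.deserializer`, and `value.deserializer` are standard Kafka keys. The `schema.registry.url` key, the URL format, and the `ConsumerConfigSketch` class are assumptions for illustration. Credentials are omitted, and the deserializer may need further settings (for example, the target Java type) that are not shown here.

```java
import java.util.Properties;

// Sketch only: builds consumer settings; it does not create a KafkaConsumer.
final class ConsumerConfigSketch {

    static Properties build(String namespace, String consumerGroup) {
        Properties props = new Properties();

        // Standard Kafka client settings for the Event Hubs Kafka endpoint.
        props.put("bootstrap.servers", namespace + ".servicebus.windows.net:9093");
        props.put("group.id", consumerGroup);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // JSON Schema deserializer named in this post.
        props.put("value.deserializer",
                "com.microsoft.azure.schemaregistry.kafka.json.KafkaJsonDeserializer");

        // Schema registry connection details. The key name and URL format
        // are assumptions; verify them before use.
        props.put("schema.registry.url", "https://" + namespace + ".servicebus.windows.net");

        return props;
    }

    public static void main(String[] args) {
        // "my-namespace" and "my-consumer-group" are placeholder values.
        Properties props = build("my-namespace", "my-consumer-group");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```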
Next steps
The addition of JSON schema support to Azure Event Hubs for Kafka applications is a significant improvement that allows for more streamlined and efficient data processing. By enabling the validation of data against a schema, users can ensure that their data is consistent and accurate, leading to better insights and decision-making.
To learn more about the Azure Event Hubs schema registry and JSON Schema support, see the following documentation:
Azure Schema Registry Concepts - Azure Event Hubs | Microsoft Learn
Client-side schema enforcement - Schema Registry - Azure Event Hubs | Microsoft Learn
Use JSON Schema with Apache Kafka applications - Azure Event Hubs | Microsoft Learn
Introduction to Apache Kafka in Event Hubs on Azure Cloud - Azure Event Hubs | Microsoft Learn