Azure Event Hubs
JSON Structure: A JSON schema language you'll love
We talk to many customers moving structured data through queues, event streams, and topics, and we see a strong desire to create more efficient and less brittle communication paths governed by rich data definitions that are well understood by all parties. Those definitions are usually shared as schema documents. While the need is great, the available schema options and related tool chains often are not. JSON Schema is popular for its relative simplicity in trivial cases, but it quickly becomes unmanageable as users employ more complex constructs. The industry has largely settled on "Draft 7," with subsequent releases seeing weak adoption. There is substantial frustration among developers who try to use JSON Schema for code generation or database mapping, scenarios it was never designed for. JSON Schema is a powerful document validation tool, but it is not a data definition language. We believe it is effectively un-toolable for anything beyond pure validation; practically all available code-generation tools agree by failing at various degrees of complexity.

Avro and Protobuf schemas are better suited to code generation, but they are tightly coupled to their respective serialization frameworks. For our own work in Microsoft Fabric, we're initially leaning on an Avro-compatible schema with a small set of modifications, but we ultimately need a richer type definition language that ideally builds on people's familiarity with JSON Schema.

This isn't just a Microsoft problem; it's an industry-wide gap. That's why we've submitted JSON Structure as a set of Internet Drafts to the IETF, aiming for formal standardization as an RFC. We want a vendor-neutral, standards-track schema language that the entire industry can adopt.

What Is JSON Structure?

JSON Structure is a modern, strictly typed data definition language that describes JSON-encoded data such that mapping to and from programming languages and databases becomes straightforward. It looks familiar: if you've written "type": "object", "properties": {...} before, you'll feel right at home. But there's a key difference: JSON Structure is designed for code generation and data interchange first, with validation as an optional layer rather than the core concern.

This means you get:

- Precise numeric types: int32, int64, decimal with precision and scale, float, double
- Rich date/time support: date, time, datetime, duration, all with clear semantics
- Extended compound types: Beyond objects and arrays, you get set, map, tuple, and choice (discriminated unions)
- Namespaces and modular imports: Organize your schemas like code
- Currency and unit annotations: Mark a decimal as USD or a double as kilograms

Here's a compact example that showcases these features. We start with the schema header and the object definition:

    {
      "$schema": "https://json-structure.org/meta/extended/v0/#",
      "$id": "https://example.com/schemas/OrderEvent.json",
      "name": "OrderEvent",
      "type": "object",
      "properties": {

Objects require a name for clean code generation. The $schema points to the JSON Structure meta-schema, and the $id provides a unique identifier for the schema itself.

Now let's define the first few properties: identifiers and a timestamp.

        "orderId": { "type": "uuid" },
        "customerId": { "type": "uuid" },
        "timestamp": { "type": "datetime" },

The native uuid type maps directly to Guid in .NET, UUID in Java, and uuid in Python. The datetime type uses RFC 3339 encoding and becomes DateTimeOffset in .NET, datetime in Python, or Date in JavaScript. No format strings, no guessing.
Next comes the order status, modeled as a discriminated union:

        "status": {
          "type": "choice",
          "choices": {
            "pending": { "type": "null" },
            "shipped": {
              "type": "object",
              "name": "ShippedInfo",
              "properties": {
                "carrier": { "type": "string" },
                "trackingId": { "type": "string" }
              }
            },
            "delivered": {
              "type": "object",
              "name": "DeliveredInfo",
              "properties": {
                "signedBy": { "type": "string" }
              }
            }
          }
        },

The choice type is a discriminated union with typed payloads per case. Each variant can carry its own structured data: shipped includes carrier and tracking information, delivered captures who signed for the package, and pending carries no payload at all. This maps to enums with associated values in Swift, sealed classes in Kotlin, or tagged unions in Rust.

For monetary values, we use precise decimals:

        "total": { "type": "decimal", "precision": 12, "scale": 2 },
        "currency": { "type": "string", "maxLength": 3 },

The decimal type with explicit precision and scale ensures exact monetary math, with no floating-point surprises. A precision of 12 with scale 2 gives you up to 10 digits before the decimal point and exactly 2 after.

Line items use an array of tuples for compact, positional data:

        "items": {
          "type": "array",
          "items": {
            "type": "tuple",
            "properties": {
              "sku": { "type": "string" },
              "quantity": { "type": "int32" },
              "unitPrice": { "type": "decimal", "precision": 10, "scale": 2 }
            },
            "tuple": ["sku", "quantity", "unitPrice"],
            "required": ["sku", "quantity", "unitPrice"]
          }
        },

Tuples are fixed-length typed sequences, ideal for time-series data or line items where position matters. The tuple array specifies the exact order: SKU at position 0, quantity at 1, unit price at 2. The int32 type maps to int in all mainstream languages.

Finally, we add extensible metadata using set and map types:

        "tags": { "type": "set", "items": { "type": "string" } },
        "metadata": { "type": "map", "values": { "type": "string" } }
      },
      "required": ["orderId", "customerId", "timestamp", "status", "total", "currency", "items"]
    }

The set type represents unordered, unique elements, perfect for tags. The map type provides string keys with typed values, ideal for extensible key-value metadata without polluting the main schema.

Here's what a valid instance of this schema looks like:

    {
      "orderId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
      "customerId": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
      "timestamp": "2025-01-15T14:30:00Z",
      "status": { "shipped": { "carrier": "Litware", "trackingId": "794644790323" } },
      "total": "129.97",
      "currency": "USD",
      "items": [
        ["SKU-1234", 2, "49.99"],
        ["SKU-5678", 1, "29.99"]
      ],
      "tags": ["priority", "gift-wrap"],
      "metadata": { "source": "web", "campaign": "summer-sale" }
    }

Notice how the choice is encoded as an object with a single key indicating the active case, {"shipped": {...}}, making it easy to parse and route. Tuples serialize as JSON arrays in the declared order. Decimals are encoded as strings to preserve precision across all platforms.

Why Does This Matter for Messaging?

When you're pushing events through Service Bus, Event Hubs, or Event Grid, schema clarity is everything. Your producers and consumers often live in different codebases, different languages, different teams. A schema that generates clean C# classes, clean Python dataclasses, and clean TypeScript interfaces from the same source is not a luxury. It's a requirement. JSON Structure's type system was designed with this polyglot reality in mind. The extended primitive types map directly to what languages actually have. A datetime is a DateTimeOffset in .NET, a datetime in Python, a Date in JavaScript. No more guessing whether that "string with format date-time" will parse correctly on the other side.
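Purely as an illustration of how such an event travels over Event Hubs, here is a hedged sketch of publishing the OrderEvent instance above with the azure-eventhub Python package. The connection string, event hub name, and the custom "schema" application property are assumptions for this example, not part of the JSON Structure specification:

    import json
    from azure.eventhub import EventHubProducerClient, EventData

    # Placeholder connection details -- substitute your own namespace and event hub.
    CONNECTION_STR = "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key>"
    EVENT_HUB_NAME = "orders"

    # The OrderEvent instance from the walkthrough above, as a plain Python dict.
    order_event = {
        "orderId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
        "customerId": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
        "timestamp": "2025-01-15T14:30:00Z",
        "status": {"shipped": {"carrier": "Litware", "trackingId": "794644790323"}},
        "total": "129.97",  # decimals travel as strings to preserve precision
        "currency": "USD",
        "items": [["SKU-1234", 2, "49.99"], ["SKU-5678", 1, "29.99"]],
        "tags": ["priority", "gift-wrap"],
        "metadata": {"source": "web", "campaign": "summer-sale"},
    }

    producer = EventHubProducerClient.from_connection_string(CONNECTION_STR, eventhub_name=EVENT_HUB_NAME)
    with producer:
        batch = producer.create_batch()
        event = EventData(json.dumps(order_event))
        # Pointing consumers at the schema's $id is one way to tell them which type to generate or deserialize.
        event.properties = {"schema": "https://example.com/schemas/OrderEvent.json"}
        batch.add(event)
        producer.send_batch(batch)

A consumer in any other language can route on that property and on the single-key choice encoding without inspecting the full payload.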
SDKs Available Now

We've built SDKs for the languages you're using today: TypeScript, Python, .NET, Java, Go, Rust, Ruby, Perl, PHP, Swift, and C. All SDKs validate both schemas and instances against schemas. A VS Code extension provides IntelliSense and inline diagnostics.

Code and Schema Generation with Structurize

Beyond validation, you often need to generate code or database schemas from your type definitions. The Structurize tool converts JSON Structure schemas into SQL DDL for various database dialects, as well as self-serializing classes for multiple programming languages. It can also convert between JSON Structure and other schema formats such as Avro, Protobuf, and JSON Schema.

Here's a simple example: a postal address schema, followed by the SQL Server table definition generated by running structurize struct2sql postaladdress.json --dialect sqlserver against it.

JSON Structure schema:

    {
      "$schema": "https://json-structure.org/meta/extended/v0/#",
      "$id": "https://example.com/schemas/PostalAddress.json",
      "name": "PostalAddress",
      "description": "A postal address for shipping or billing",
      "type": "object",
      "properties": {
        "id": { "type": "uuid", "description": "Unique identifier for the address" },
        "street": { "type": "string", "description": "Street address with house number" },
        "city": { "type": "string", "description": "City or municipality" },
        "state": { "type": "string", "description": "State, province, or region" },
        "postalCode": { "type": "string", "description": "ZIP or postal code" },
        "country": { "type": "string", "description": "ISO 3166-1 alpha-2 country code" },
        "createdAt": { "type": "datetime", "description": "When the address was created" }
      },
      "required": ["id", "street", "city", "postalCode", "country"]
    }

Generated SQL Server DDL:

    CREATE TABLE [PostalAddress] (
      [id] UNIQUEIDENTIFIER,
      [street] NVARCHAR(200),
      [city] NVARCHAR(100),
      [state] NVARCHAR(50),
      [postalCode] NVARCHAR(20),
      [country] NVARCHAR(2),
      [createdAt] DATETIME2,
      PRIMARY KEY ([id], [street], [city], [postalCode], [country])
    );
    EXEC sp_addextendedproperty 'MS_Description', 'A postal address for shipping or billing', 'SCHEMA', 'dbo', 'TABLE', 'PostalAddress';
    EXEC sp_addextendedproperty 'MS_Description', 'Unique identifier for the address', 'SCHEMA', 'dbo', 'TABLE', 'PostalAddress', 'COLUMN', 'id';
    EXEC sp_addextendedproperty 'MS_Description', 'Street address with house number', 'SCHEMA', 'dbo', 'TABLE', 'PostalAddress', 'COLUMN', 'street';
    -- ... additional column descriptions

The uuid type maps to UNIQUEIDENTIFIER, datetime becomes DATETIME2, and the schema's description fields are preserved as SQL Server extended properties. The tool supports PostgreSQL, MySQL, SQLite, and other dialects as well.
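The self-serializing classes Structurize emits are language-specific. Purely to illustrate how the type mappings above carry into code, here is a hand-written Python dataclass of the kind one might expect for PostalAddress; this is a hypothetical sketch, not verbatim generator output:

    from dataclasses import dataclass
    from datetime import datetime
    from typing import Optional
    from uuid import UUID

    @dataclass
    class PostalAddress:
        """A postal address for shipping or billing (mirrors the schema above)."""
        id: UUID                               # JSON Structure 'uuid' -> uuid.UUID
        street: str
        city: str
        postalCode: str
        country: str                           # ISO 3166-1 alpha-2 country code
        state: Optional[str] = None            # not listed in "required", so optional
        createdAt: Optional[datetime] = None   # 'datetime' -> datetime (RFC 3339 on the wire)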
Mind that all this code is provided "as-is" and is in a "draft" state, just like the specification set. Feel encouraged to provide feedback and ideas in the GitHub repos for the specifications and SDKs at https://github.com/json-structure/

Learn More

We've submitted JSON Structure as a set of Internet Drafts to the IETF, aiming for formal standardization as an RFC. This is an industry-wide issue, and we believe the solution needs to be a vendor-neutral standard. You can track the drafts at the IETF Datatracker.

- Main site: json-structure.org
- Primer: JSON Structure Primer
- Core specification: JSON Structure Core
- Extensions: Import | Validation | Alternate Names | Units | Composition
- IETF Drafts: IETF Datatracker
- GitHub: github.com/json-structure

General Availability: Large Message Support in Azure Event Hubs
Azure Event Hubs is a cloud-native service that streams millions of events per second with minimal latency, fully compatible with Apache Kafka and requiring no code changes for existing Kafka workloads.

Today, we are excited to announce the general availability of Large Message Support in Azure Event Hubs, enabling you to send and receive messages up to 20 MB on self-serve scalable Dedicated clusters, with enhanced reliability for seamless handling of large messages and greater flexibility for your data streaming solutions. This feature enables fast and reliable processing of larger, indivisible events.

Large Message Support works with both AMQP and Apache Kafka protocols, allowing you to send bigger payloads as usual without changing your client code. It is advisable to check your client configuration to ensure that timeouts and maximum message size limits are not set too low (see the sketch below).

To enable Large Message Support, simply configure your eligible Event Hubs Dedicated clusters using the Azure portal. For further details and eligibility, please visit aka.ms/largemessagesupportforeh.
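Here is the sketch referenced above. With the feature enabled on a Dedicated cluster, sending a multi-megabyte event looks the same as sending any other event; the connection string, event hub name, and 8 MB payload are placeholders, and the example assumes the azure-eventhub Python package:

    import json
    import os
    from azure.eventhub import EventHubProducerClient, EventData

    # Placeholders: a Dedicated-tier namespace with Large Message Support enabled.
    CONNECTION_STR = os.environ["EVENTHUB_CONNECTION_STRING"]
    EVENT_HUB_NAME = "large-events"

    # An indivisible ~8 MB payload, e.g. a document snapshot that cannot be split.
    large_payload = json.dumps({"snapshot": "x" * (8 * 1024 * 1024)})

    producer = EventHubProducerClient.from_connection_string(CONNECTION_STR, eventhub_name=EVENT_HUB_NAME)
    with producer:
        # create_batch() sizes the batch against the maximum the namespace allows,
        # so the single large event must fit within that limit.
        batch = producer.create_batch()
        batch.add(EventData(large_payload))
        producer.send_batch(batch)

On the Kafka protocol side, the client-side settings to review are typically the producer's max.request.size and the consumer's fetch.max.bytes / max.partition.fetch.bytes (names from the Apache Kafka Java client; other clients expose equivalent options).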
Your feedback is invaluable to us, and we look forward to hearing about your experiences.

Read more:
- Azure Event Hubs for Apache Kafka - Azure Event Hubs | Microsoft Learn
- Quickstart: Send and Receive Large Messages with Azure Event Hubs (Preview) - Azure Event Hubs | Microsoft Learn

Geo-Replication is Here! Now generally available for Event Hubs Premium & Dedicated

Today, we are thrilled to announce the General Availability of the Geo-replication feature for Azure Event Hubs, now available in both the Premium and Dedicated tiers. This milestone marks a significant enhancement to our service, providing customers with robust business continuity and disaster recovery capabilities and ensuring high availability for their mission-critical applications.

The Geo-replication feature allows you to replicate your Event Hubs data across multiple regions, either synchronously or asynchronously, ensuring that your data remains accessible in the event of maintenance activities, regional degradation, or a regional outage. With Geo-replication, you can seamlessly promote a secondary region to primary, minimizing downtime and ensuring business continuity.

[Diagram: Before failover (promotion of secondary to primary)]
[Diagram: After failover (promotion of secondary to primary)]

With general availability, we are excited to announce that Geo-replication now supports all features that are generally available in the service today, including private networking, customer-managed key encryption, Event Hubs Capture, and many more. These enhancements ensure that you can leverage the full capabilities of Event Hubs while benefiting from the added reliability of Geo-replication.

We have also increased visibility into the health and metrics of your replicas. This means you can monitor the status of your replicas more effectively and know exactly when it is appropriate to promote your secondary to primary. This added visibility ensures that you can make informed decisions and maintain the high availability of your applications.

Since the public preview announcement, several customers have tried out the Geo-replication feature and appreciate the enhanced reliability and peace of mind that come with having a robust disaster recovery solution in place.

Learn more

Learn more about geo-replication concepts and the pricing model, and try out this quickstart to learn how to set up geo-replication for your Premium and Dedicated tier namespaces. We encourage our customers to try out the Geo-replication feature and experience the benefits of turnkey business continuity and disaster recovery features firsthand. Your feedback is invaluable to us, and we look forward to hearing about your experiences.
Announcing the Event Hubs Data Explorer: a handy tool for getting started and debugging

Transform your event-driven architectures with the new Event Hubs Data Explorer! Whether you're debugging, optimizing, or just getting started, this tool offers a unified interface for producing and consuming event data, providing invaluable insights. Explore the endless possibilities with Event Hubs Data Explorer!
Announcing the General Availability of Event Hubs Data Explorer

We are excited to announce the general availability of the Event Hubs Data Explorer in the Azure portal! Ever since our preview announcement in September, we've heard customers rave about how the Event Hubs Data Explorer has already made its way into their daily workflows to onboard, debug, and review the data in their Event Hubs with very little effort.

Customer-Centric Design

We listened to your feedback and designed the Event Hubs Data Explorer to address your needs. Many customers have tried this tool and shared feedback on how it's saving them significant time and effort when it comes to viewing their Event Hubs in action and performing basic debugging tasks.

Simplified Onboarding and Debugging

The Event Hubs Data Explorer is perfect for both new and experienced users. It provides a comprehensive view of event data, making it easy to test event producers and consumers. You can quickly validate your setup with custom workloads or predefined datasets, ensuring everything is configured correctly. Debugging is now more straightforward than ever: with the ability to inspect data at specific timestamps or offsets, you can quickly identify and resolve issues and optimize your event processing workflows.

Getting Started

To start using the Event Hubs Data Explorer, navigate to your Event Hubs namespace in the Azure portal. From there, you can access the Data Explorer and begin sending and viewing events with just a few clicks. You can also check out the documentation here.

We are excited to see how you leverage the Event Hubs Data Explorer to drive innovation and efficiency in your projects. Your feedback has been instrumental in shaping this tool, and we look forward to continuing to improve our offerings based on your insights.
Introducing Kafka Support in Event Hubs emulator

Azure Event Hubs is a cloud-native data streaming service that streams millions of events per second with low latency, from any source to any destination. Compatible with Apache Kafka®, it allows you to run existing Kafka workloads without code changes.

Earlier this year, we released the Event Hubs emulator for local development, which initially supported only the AMQP protocol. We are now excited to announce Apache Kafka® protocol support in the Event Hubs emulator.

Why emulator?

Developers across the globe love emulators! While there are numerous compelling reasons to use them, here are just a few to consider:

- Optimized Development Loop: The emulator speeds up development and testing against Azure Event Hubs.
- Pre-migration Trial: Try Azure Event Hubs for Apache Kafka® using your existing Kafka applications before migrating to the cloud.
- Isolated Environment: Use the emulator for dev/test setups without network latency or cloud resource constraints.
- Cost-efficient: The emulator is free and can be run on your local machine for dev/testing.

Note: The emulator is intended only for development and testing. It should not be used for production workloads. Official support is not provided, and any issues or suggestions should be reported via GitHub.

Kickstart development with the Event Hubs emulator

The emulator is available as a Docker image on the Microsoft Artifact Registry and is platform-agnostic: it can run on Windows, macOS, and Linux. You can either use our automated scripts from the Installer repository or spin up the emulator container using the docker compose command. The producer and consumer APIs are currently compatible with the emulator; additional API support will be provided in future incremental versions.

To test Apache Kafka® applications locally with the Event Hubs emulator, visit aka.ms/devtestwithehemulator.
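To give a flavor of what local testing might look like, here is a minimal sketch using the kafka-python package. The bootstrap address, event hub (topic) name, and the absence of any authentication settings are assumptions made purely for illustration; take the actual endpoint and any required connection settings from the emulator's docker compose configuration and the documentation at aka.ms/devtestwithehemulator.

    import json
    from kafka import KafkaProducer  # pip install kafka-python

    # Assumed local endpoint; the emulator's compose file defines the actual port mapping.
    BOOTSTRAP = "localhost:9092"
    EVENT_HUB = "test-hub"  # Kafka topics map to event hubs

    producer = KafkaProducer(bootstrap_servers=BOOTSTRAP)

    # Produce a handful of test events, exactly as you would against the cloud service.
    for i in range(10):
        producer.send(EVENT_HUB, key=str(i).encode(), value=json.dumps({"sequence": i}).encode())

    producer.flush()
    producer.close()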
Learn more about Event Hubs:
- Azure Event Hubs: Data streaming platform with Kafka support - Azure Event Hubs | Microsoft Learn
- Introduction to Apache Kafka® in Event Hubs on Azure Cloud - Azure Event Hubs | Microsoft Learn

We appreciate your feedback and encourage you to share it with us. Please provide feedback or report any issues at our GitHub repository: Issues · Azure/azure-event-hubs-emulator-installer

May the Event Hubs emulator light up your test cases in green! 😊

Announcing public preview for Geo-replication for Azure Event Hubs Dedicated

Geo-replication for Azure Event Hubs Dedicated is now in public preview. Learn how to enable this feature for your Event Hubs namespaces and enjoy the benefits of high availability, disaster recovery, and regional compliance.