IoT Solutions and Azure Cosmos DB

Published Nov 20 2019 08:00 AM 5,089 Views

Here are the 2 main reasons why Azure Cosmos DB is great for in IoT applications:

  • Cosmos DB can ingest semi-structured data at extremely high rates.
  • Cosmos DB can serve indexed queries back out with extremely low latency and high availability.

For a fascinating history and in-depth technical outline of Cosmos DB, check out this blog by Dharma Shukla Founder of Azure Cosmos DB, Technical Fellow, Microsoft.

The two primary IoT use cases for Cosmos DB are storing device telemetry and managing a device catalog. Read on to learn key details and catch this Channel 9 IoT Show episode with Andrew Liu, Program Manager, Cosmos DB team:



Storing device telemetry

The first IoT use case for Cosmos DB is storing telemetry data in a way that allows for rapid access to the data for visualization, post processing and analytics.

XTO Energy, a subsidiary of ExxonMobil, uses Cosmos DB for this exact scenario. The company has 50,000 oil wells generating telemetry data like flow rate and temperature at a high throughput. With lots and lots of bytes/second coming in, XTO DB uses Cosmos DB because of its fast, indexed data store, partitioning, and automatic indexing. See the architecture diagram below showing how Cosmos DB fits in their IoT solution. Note that data comes into Azure IoT Hub. Typically, you would position a hot store (with fast, interactive queries) and cold store behind IoT Hub. In this case, Azure Cosmos DB is used for the hot store. Read more about XTO Energy use of Azure Cosmos DB and Azure IoT.


CosmosDB picture1.png


Device catalog

A second IoT scenario that benefits using Cosmos DB is managing a device catalog for critical processes. This is where the low latency and "always on" guarantees of Cosmos DB comes in. Less than 10-ms latencies for both, reads (indexed) and writes at the 99th percentile, all around the world and 99.999% availability for multi-region writes!

Azure Cosmos DB transparently replicates your data across all the Azure regions associated with your Cosmos account. Cosmos DB employs multiple layers of redundancy for your data managing partitioning for you. Learn more about Cosmos DB high availability.

A real-life scenario example is that of an automobile manufacturer running an assembly line with high availability requirements. The manufacturer drives commands to robots that manufacture the cars. In order to ensure that the assembly line never stops, the manufacturer uses Cosmos DB to ensure extremely high availability and low latency.


Getting started with Cosmos DB for IoT

There are four “must-know” aspects that IoT developers need to understand to work with Cosmos DB:

  • Partitioning: The first topic an IoT developer needs to understand is partitioning. Partitioning is how you will optimize for performance. Cosmos DB does the heavy lifting for you taking care of partition management and routing in particular. What you will need to do is choose a partition key that makes sense during design. The Partition key is what will determine how data is routed in the various DB partitions by Cosmos DB and needs to make sense in the context of your specific scenario. The Device Id is generally the “natural” partition key for IoT applications. Learn more on partitioning.



  • Request Units (RUs): The second topic an IoT developer needs to understand is Request Units. A Request Unit (RU) is the measure of throughput in Azure Cosmos DB. RUs are compute units for both performance and cost. With RUs, you can dynamically scale up and down while maintaining availability, optimizing for cost, performance and availability at the same time. RUs offer programmatic APIs for scenarios like batch jobs. Learn more about Request Units



  • Time to Live: The third topic IoT developers should understand is Time-to-Live or TTL. With TTL, Azure Cosmos DB provides the ability to delete items automatically from a container after a certain period of time. With the right understanding and use of TTL settings in Cosmos DB, and benefiting from the free deletes, you will save 50% compared to systems that charge IOPs for deletes. Learn more about Time to Live.


  • Change Feed: The final “need to know” topic is change feed. A common design pattern in IoT is to use changes to the data to trigger additional actions. Change feed support in Azure Cosmos DB works by listening to an Azure Cosmos container for any changes. It then outputs the sorted list of documents that were changed in the order in which they were modified. The change feed is a single source of truth. You will avoid consistency issues across multiple systems when using it correctly. Learn more about change feed.



Try a hands-on lab

Not convinced yet? Try a hands-on lab. In the lab, you will use Azure Cosmos DB to ingest streaming vehicle telemetry data as the entry point to a near real-time analytics pipeline built on Cosmos DB, Azure Functions, Event Hubs, Azure Databricks, Azure Storage, Azure Stream Analytics, Power BI, Azure Web Apps, and Logic Apps.



Senior Member

Given the volume of data and the 10Gb partition limit, what is the approach for keeping data sizes in check? In your example, a city with multiple oil wells would probably hit that limit rather quickly (days/weeks). Is the approach just to introduce look-back reporting constraints for hot data and leverage TTL? In very high volume IoT scenarios with business requirements around look-back this could be an issue.


Given the volume of data and the 10Gb partition limit, what is the approach for keeping data sizes in check? In your example, a city with multiple oil wells would probably hit that limit rather quickly (days/weeks). Is the approach just to introduce look-back reporting constraints for hot data and leverage TTL? In very high volume IoT scenarios with business requirements around look-back this could be an issue.


You're absolutely correct that devices generally produce data at a fast rate - which is why Cosmos DB is usually used as a landing zone for ingest.

If you were to assume a 1 KB record per device per second * 60 seconds per minute * 60 minutes per hour * 24 hours in a day => a device will produce an estimated ~2.5GB per Month. Generally speaking - people handle this 2 different ways (which of course, aren't mutually exclusive):


1st method is to to set the partition key to an artificial field called /partitionKey - and assign a composite value (e.g. Device ID + Current Month and Year). This enables the workload to write to a extremely high cardinality of values (1 discrete value per device) and thus load balancing the throughput on the system's underlying partitions. At the same time, this preserves efficiency of common query patterns in hot telemetry data without having to blindly fan-out (e.g. SELECT * FROM c WHERE c.partitionKey IN ("device123-Jan-2019", "device123-Feb-2019", "device123-March-2019"). The rest of the queries can fan-out but that's okay - as the telemetry generally follows a pareto principal and is generally heavily skewed to being a write-heavy scenario with relatively low QPS (>80% of operations/sec are writes, not reads... optimize for the 80% instead of the 20%).


2nd method is to tier data in to a hot-store (e.g. Cosmos DB) + cold-store (e.g. Blob Storage) by using a combination of TTL to automatically prune data from the hot store and change feed to replicate data from hot store => cold store. Hot store enables low latency indexed queries over recent telemetry; meanwhile cold store enables low cost commodity storage for batch processing and/or retaining data long term for regulatory/compliance reasons. The Cosmos DB team is actually working on operationalizing this pattern (since it comes up so often as a common scenario) in the form of its analytical storage. This analytical storage feature is still in a gated preview (at the time of this post) - so I'd recommend using TTL + change feed for product use-cases until analytical storage feature graduates to GA.

Senior Member

This is by far the best reply to a comment on the Internet I've ever experienced. I'd suggest building on this and moving it in to an article or reference architecture for others to refer to as well. I'm sure I'm not the only one who saw that 10GB limit and thought it looked like a deal breaker.


I still think the limit forces a look-back gap for real time reporting without beefing up RUs for fan out reads. Maybe the answer is more compute to roll up stats to report from, but it's not as clean as what I would hope for. That limit forces my collect and store strategy to be at odds with my read and report strategy.


Thanks again for the deep dive in reply to my query.


That's a great idea - will look in to this :)


Very interesting topic and it underlines the importance of thinking and working on good architectures.


In case it is useful, in the last IoT projects, we have been using a kind of mix of the two mentioned alternatives: method 2 + "rich" partition keys. Besides performance (natural and straightforward requirement), we did this to optimize costs.


Conceptually speaking, TTL is a very simple feature but extremely useful.

Version history
Last update:
‎Nov 18 2019 01:38 PM