Stream Processing Changes: #Azure #CosmosDB change feed + Apache Spark
Azure Cosmos DB is a blazing fast, globally distributed, multi-model database service. Regardless of where your customers are, they can access data stored in Azure Cosmos DB with single-digit millisecond latencies at the 99th percentile, even at a sustained high rate of ingestion. This speed makes Azure Cosmos DB useful not only as a sink for stream processing but also as a source. In a previous blog post, we explored the potential of performing real-time machine learning with Apache Spark and Azure Cosmos DB. In this article, we further explore stream processing of updates to data with the Azure Cosmos DB change feed and Apache Spark.
The Azure Cosmos DB change feed provides a sorted list of documents within an Azure Cosmos DB collection, in the order in which they were modified. You can use this feed to listen for modifications to data within the collection and perform real-time (stream) processing on updates. Changes are persisted, so they can be processed asynchronously and distributed across one or more consumers for parallel processing. Change feed is enabled at collection creation and is simple to use with the change feed processor library.
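To make the change feed idea concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Cosmos DB SDK, the change feed processor library, or the Spark connector: `ToyCollection`, `read_changes`, and `assign_consumer` are hypothetical names invented for illustration. It assumes the feed returns only the latest version of each changed document, ordered by modification, and that a consumer tracks its position with a simple integer checkpoint. Hashing the document id to pick a consumer is a stand-in for however the real service spreads work across consumers.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyCollection:
    """Hypothetical in-memory stand-in for a collection; not the Cosmos DB SDK."""

    docs: Dict[str, Tuple[int, dict]] = field(default_factory=dict)
    clock: int = 0  # logical sequence number, bumped on every write

    def upsert(self, doc_id: str, body: dict) -> None:
        """Insert or update a document and stamp it with the next sequence number."""
        self.clock += 1
        self.docs[doc_id] = (self.clock, dict(body))

    def read_changes(self, after: int = 0) -> Tuple[List[dict], int]:
        """Return documents modified after checkpoint `after`, oldest change first,
        together with the checkpoint to pass to the next call."""
        changed = sorted(
            ((seq, doc_id, body)
             for doc_id, (seq, body) in self.docs.items()
             if seq > after),
            key=lambda item: item[0],
        )
        batch = [{"id": doc_id, **body} for _, doc_id, body in changed]
        checkpoint = changed[-1][0] if changed else after
        return batch, checkpoint


def assign_consumer(doc_id: str, n_consumers: int) -> int:
    """Pick a consumer index for a document (assumption: simple id hashing)."""
    return zlib.crc32(doc_id.encode("utf-8")) % n_consumers


if __name__ == "__main__":
    coll = ToyCollection()
    coll.upsert("a", {"v": 1})
    coll.upsert("b", {"v": 1})
    coll.upsert("a", {"v": 2})  # "a" now sorts after "b" in the feed

    batch, checkpoint = coll.read_changes()
    for doc in batch:
        print(assign_consumer(doc["id"], 2), doc)

    # Later writes are picked up by resuming from the saved checkpoint.
    coll.upsert("c", {"v": 1})
    print(coll.read_changes(after=checkpoint))
```

Because the checkpoint is just a value the consumer stores, processing can stop and resume later without missing persisted changes, which is the property that makes asynchronous consumption possible.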
Read about it in the Azure blog.