In this blog we’ll discuss the concept of Structured Streaming and how a data ingestion path can be built using Azure Databricks to enable the streaming of data in near-real-time. We’ll touch on some of the analysis capabilities which can be called directly from within Databricks utilising the Text Analytics API, and also discuss how Databricks can be connected directly to Power BI for further analysis and reporting. As a final step we cover how streamed data can be sent from Databricks to Cosmos DB as the persistent storage.
Structured Streaming is a stream processing engine that lets computations on streaming data (e.g. a Twitter feed) be expressed in much the same way as batch computations on a static dataset. The Spark SQL engine runs the computation incrementally, continuously updating the result as the streaming data flows in.
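To make the "incremental" idea concrete, here is a small plain-Python toy. It is not Spark code and does not use the Structured Streaming API. It only illustrates the principle that folding each newly arrived micro-batch into a running result gives the same answer as running the batch computation over all the data at once. The function names and the sample feed are hypothetical, chosen purely for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def batch_word_count(lines: Iterable[str]) -> Counter:
    """Batch view: count words over a complete, static dataset."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def streaming_word_count(micro_batches: Iterable[List[str]]) -> Iterator[Counter]:
    """Streaming view: fold each arriving micro-batch into the running result."""
    running: Counter = Counter()
    for batch in micro_batches:
        # Only the newly arrived lines are processed; earlier ones are not re-read.
        running.update(batch_word_count(batch))
        # Emit a snapshot of the updated result after each micro-batch.
        yield Counter(running)


# Hypothetical feed, already split into the micro-batches in which it arrived.
feed = [
    ["spark streams data", "data flows in"],
    ["streams of data"],
]

snapshots = list(streaming_word_count(feed))
all_lines = [line for batch in feed for line in batch]

# The final streaming result matches the batch result over the full dataset.
print(snapshots[-1] == batch_word_count(all_lines))
```

In a real Structured Streaming job the engine handles this bookkeeping itself: the query is written as if against a static table, and the engine works out how to update the result as new rows arrive.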