This guide describes how to build near real time Power BI reports leveraging Synapse Link and SQL On-Demand. The intent is to demonstrate the simplicity of using these technologies.
Now let's start!
Ensure you have the new Synapse Workspace enabled in your subscription:
Ensure you have Synapse Link enabled at your Cosmos DB account
Create your Database and container, verify the container has the Analytical Store enabled
As a prerequisite you need to ensure you are running Cosmos DB Python SDK v4.1.0 by executing the code below in a notebook:
import azure.cosmos as cosmos
print (cosmos.__version__)
Result should be 4.1.0, if it’s below then run this command in a new cell:
pip install --force-reinstall azure-cosmos
You’ll then need to open a new notebook to get the new version taken into account and run the following code:
import azure.cosmos
from azure.cosmos.partition_key import PartitionKey
database = cosmos_client.create_database_if_not_exists('RetailDemo')
print('Database RetailDemo created')
container = database.create_container_if_not_exists(id='WebsiteData', partition_key=PartitionKey(path='/CartID'),analytical_storage_ttl=-1)
print('Container WebsiteData created')
Note that you have created in the database RetailDemo a container named WebsiteData partitioned on CartID and you enabled the Analytical Store with the the parameter "analytical_storage_ttl=-1"
Once the container is created you can check the Analytical Store is enabled by default:
Then let start to load a small sample of data, for this you just need to create a new notebook in your Cosmos DB Data Explorer as follow:
Code to run in your notebook:
%%upload --databaseName RetailDemo --containerName WebsiteData --url https://cosmosnotebooksdata.blob.core.windows.net/notebookdata/websiteData-small.json
Here is how the data look like now:
Once this first step is complete you have a container with a few items and an Analytical Store on it.
Next Step is to go to Synapse Workspace and from there to Synapse Studio, create a SQL On Demand Database and test querying the Cosmos DB Analytical Store from there.
You can either use an existing Synapse Workspace or create a new one and launch Synapse Studio directly on your Workspace.
To discover how to create a SQL On-Demand Database and start learning how it works simply use this link:
https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-sql-on-demand
Now you can create a view in the On-Demand Database with the following syntax:
Click on the Develop icon on the left side to access the SQL script and Notebooks, click on the ‘+’ sign to get a new SQL script then connected to the SQL On-Demand engine and ‘myondemanddb’ database.
Code to run in your SQL script:
CREATE VIEW CosmosDBTest
AS
SELECT
*
FROM OPENROWSET
(
'CosmosDB',
'account=cosmosdblp2;database=RetailDemo;region=northeurope;key=your_key',
WebsiteData
)
AS q1
In my my current example region is northeurope.
Once the view is created, you can run simplistic queries such as the ones below and get the results from the CosmosDB container WebsiteData:
Code to run in your SQL script:
select * from CosmosDBTest;
select country,sum(price)
from CosmosDBTest
group by country;
Note that Synapse Link take care of the JSON document flattening into a table format for you.
Now that the general mechanism is now in place so you are ready to build a Power BI report on top of this. In case you don’t have PBI Desktop already you can get it from there:
https://www.microsoft.com/en-us/download/details.aspx?id=58494
Start PBI Desktop and select the Azure SQL Database source and put the SQL On-Demand endpoint as the server name:
Do not forget to specify port 1433 (yours-ondemand.sql.azuresynapse.net,1433) and DirectQuery mode:
After giving your credentials let’s pick up the view CosmosDBTest that you created precedingly:
Here is the simple report we can build and let focus on Guinea-Bissau, the current price value is 7.5:
Let’s go back to CosmosDB and select the corresponding items where we can raise the price of the corresponding item by 100 for instance:
After a latency of around a minute the price increase is reflected on the report:
And what if you load a larger set of Data into your Cosmos DB container?
Let’s then take a larger data set and bulk load it into your container as follow:
How is this rendered in Power BI?
Almost there!
What if you publish to your Power BI Tenant?
This is it!
To wrap this up:
- I enabled the Analytical Store on a Cosmos DB container
- I created a SQL on-demand view on this container Analytical Store
- I created a Power BI report connected to the SQL on-demand database as a regular Azure SQL Database
Call to Action:
- Please try now the Cosmos DB to Spark integration by following these examples:
- Synapse/Notebooks/PySpark/Synapse Link for Cosmos DB samples at main · Azure-Samples/Synapse (github.com)