Author: @SuryaJ is a Program Manager in the Azure Synapse Customer Success Engineering (CSE) team.
Continuing the series of Demystifying Data Explorer, in this blog, let’s take a look at the various ways of retrieving, consuming and visualizing data from Azure Synapse Data Explorer:
In the previous blog, Demystifying Data Ingestion into Azure Synapse Data Explorer, ACME Chemicals used batch and streaming enabled sources to ingest logs and sensor data into Azure Synapse Data Explorer. Continuing this business scenario, here are the data retrieval and visualization needs of ACME Chemicals:
1. ACME Chemicals want to visualize their data to prepare a dashboard with the following tiles:
a. Show near real time chemical machinery(vats) telemetry (Ex: Temperature, Mixer speed)
b. Status of critical metrics in relation to time (Ex: Spin rate of a centrifuge, output rate over time)
c. Top 5 products by production volume
2. ACME Chemicals want to send alert notifications when monitored values are breaching thresholds
3. ACME Chemicals want to train machine learning models to predict chemical properties of the output products using data from past batch.
4. ACME Chemicals want to Archive data from product batch productions for regulatory compliance.
Azure Synapse Data Explorer provides multiple tools and approaches to multiple personas in the modern data ecosystem to consume data.
Persona |
Data Retrieval and Visualization Tools |
Data Engineer |
Query Tools, Data Pipeline Orchestrators, Export |
Analytics Engineer |
Query Tools, Notebook |
Data Scientist |
Notebook (Synapse, Spark, Jupyter, Azure Data Studio) |
BI User |
PowerBI, Excel, Tableau (via ODBC), Qlik (via ODBC) |
End User |
ADX Dashboard, PowerBI, Excel, Grafana, Kibana, Redash |
App Developer |
REST API, SDKs |
SQL Pros |
MS-TDS Endpoint, ODBC, JDBC, SSMS, Azure Data Studio |
Kusto Query Language (KQL) is the language for querying data stored in Azure Synapse Data Explorer. KQL was specifically designed to make ad-hoc, interactive analytics simple, powerful, intuitive and easy to read. The following tools are provided to write KQL and interact with the data.
To unlock scenarios wherein ADX is an intermediate component of a data pipeline, Azure Synapse Data Explorer has native integration with the following services:
Tech Tip: Customers can use Logic Apps/Power Automate to enable alerting/notifications on their ADX data – use Power Automate with ADX to send notifications.
Azure Synapse Data Explorer integrates with most of the industry leading BI and Data Visualization tools to enable BI Users and End Users build powerful, rich, interactive reports and dashboards.
Azure Synapse Data Explorer is also available as data source in a lot of BI tools for visualization
To learn more, read Azure Synapse Data Explorer data visualization.
ADX can natively export its data using a set of control commands. Using this method, the results of a KQL query are written to an external storage (Azure Blob Storage or Azure Data Lake Store Gen 2), or a SQL table. Export can be a good way to store data for later inspection, archiving or to make data available for downstream platforms such as spark for training Machine Learning models.
Export can be done in 2 ways:
Characteristics |
One-time Export |
Continuous Export |
How? |
||
Frequency |
Manual, on every execution |
Based on defined time period |
Supported formats |
csv, json, parquet |
csv, json, parquet |
Use cases |
One time export of historical data for storage, output of query returning more than 500k rows or 64MB of data or ad-hoc business need |
Regular, schedule-based to support Disaster Recovery scenarios, data backup, supporting downstream applications and platforms that rely on the data stored in Azure Synapse Data Explorer |
Synapse/ Apache Spark - Using kusto-spark connector, users can read from their spark notebooks from Azure Synapse Data Explorer and also write data directly to an ADX table. This capability is useful in supporting machine learning or complex data transformation scenarios that are performed using Spark notebooks.
Additionally, when using Synapse Data Explorer (i.e. Data Explorer pools in Synapse Workspace), there is a native integration between Synapse Spark Notebooks and Data Explorer pools enabling users to right click on an ADX table and select “Write DataFrame to table” to get a prebuilt spark script with table’s data added to a data frame in a Notebook cell.
Kql Magic – kqlmagic is a library that extends the capabilities of the Python kernel in Jupyter Notebooks or Azure Data Studio enabling users to write Kusto Query Language statements within the notebook. For example, users can easily combine Python and Kusto Query Language to query and visualize data using the rich Plot.ly library integrated with render commands.
To programmatically access data from Azure Synapse Data Explorer, users can use REST API or SDK of ADX.
API - Using REST API, we can
REST API supports the following endpoints and actions:
SDK: Similarly, client libraries are available in python, R, java, .Net, Go, Node js and PowerShell SDK. For more information and to learn more, review Client Libraries.
Now that we have learned tools and approaches to consume data from Azure Synapse Data Explorer, lets solve the ACME Chemicals business scenario.
Example: ACME chemicals data consumption scenario solved
1. Using native ADX Dashboards, ACME Chemicals can create a dashboard
a. Use stat visuals to showcase near real time chemical vats metrics like current temperature, and mixer speed
b. Use time chart to show spin rate, catalyst consumption, output rate over time
c. Use Pie chart to show top 5 chemicals and their distribution by production volume
2. ACME chemicals used Logic Apps to run KQL queries at 30 min intervals to check for conditions and send email alerts.
3. Using Kusto-Spark connector, ACME chemicals read the data from Data Explorer into Synapse spark notebooks and train ML models.
4. ACME Chemicals uses continuous export to export data to Azure Storage account and read this data using external table.
This blog introduces the various methods and tools to retrieve and visualize data from Azure Synapse Data Explorer. As an analytical platform, Azure Synapse Data Explorer provides a comprehensive, rich ecosystem of options that can suit varied personas with different skill sets and requirements. This ecosystem of consumption and visualization options has been built with the clear intention and purpose of meeting users where they are.
This blog also marks the end of the Demystifying Data Explorer series of blogs that began with explaining what is data explorer (Demystifying Data Explorer), followed up by how to ingest data in data explorer (Demystifying Data Ingestion in Azure Synapse Data Explorer) and concluding with this blog on how to consume and visualize the data from Data Explorer.
Our team publishes blog(s) regularly and you can find all these blogs at https://aka.ms/synapsecseblog. For deeper level of understanding of Synapse implementation best practices, please refer to our Success by Design (SBD) site at https://aka.ms/Synapse-Success-By-Design
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.