You need just 5 minutes to create a Synapse workspace and run your first Data Lake query!
Published Oct 06 2020 10:18 AM
Microsoft

Azure Synapse workspace is an easy-to-create service that lets you analyze data in Azure Data Lake storage, Azure Cosmos DB, and other data sources. You don't need to pre-provision compute, create databases, or ETL your data before you start analyzing it. You just need to provision a very lightweight workspace that includes a serverless SQL query endpoint, and you can start querying your data in the Data Lake.

 

In this article you will discover the easiest way to get started with a Synapse workspace and run your first query on the Data Lake.

Deploy workspace (ETA <5min)

An Azure Synapse workspace can be deployed using the Azure portal, Azure CLI, or deployment templates. The easiest way to create a new workspace is to use this Deploy to Azure button, which opens a preconfigured form where you can submit your deployment request:

Deploy to Azure

 

You will see a form where you enter some basic information such as subscription, region, workspace name, and username/password. You will probably need less than a minute to fill in and submit the form.

Once you enter all the data, this template deploys an Azure Data Lake storage account and a Synapse workspace. The total deployment time for the template, with a breakdown by individual component, is shown in the following figure (taken from one of my experiments):

 

JovanPop_0-1602002593915.png

In total, you need less than 4.5 minutes to create a fully functional workspace.

 

Run the query 

Once the resources are provisioned, you will see a button that leads you to the resource group and the Synapse workspace in the Azure portal. Now launch Synapse Studio by following the link in the top right corner:

JovanPop_1-1602002836394.png

 

When Synapse Studio opens, choose the New -> SQL query option and paste the following query:

 

 

select top 10 *
from openrowset(bulk 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet',
format='parquet') as a

 

 

This query reads a publicly available Parquet file that contains ECDC data about COVID-19 cases recorded worldwide. If you run this query, you will get the results from this Parquet file in a couple of seconds.
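The query above lets the serverless endpoint infer all columns from the Parquet file. If you only need a few of them, a minimal sketch with an explicit WITH clause could look like the one below. The column names and types (date_rep, cases, geo_id) are assumptions based on the published schema of this public dataset, so check them against the output of the first query before relying on them.

```sql
-- Sketch: read only selected columns from the same public Parquet file.
-- Column names and types are assumed from the public ECDC dataset schema.
SELECT TOP 10 *
FROM OPENROWSET(
        BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet',
        FORMAT = 'PARQUET'
     ) WITH (
        date_rep DATE,       -- reporting date
        cases    INT,        -- number of reported cases
        geo_id   VARCHAR(6)  -- country/territory code
     ) AS a;
```

Declaring the columns explicitly keeps the result set small and gives you control over the data types that the query returns.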

You can easily create more complex queries and visualize the results using the built-in Synapse Studio charting functionality.
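As one possible example of a slightly more complex query, the sketch below totals the reported cases per country code and returns the ten largest totals. It assumes the file exposes geo_id and cases columns, which is not shown in this article, so adjust the names if your results differ.

```sql
-- Sketch: top 10 country codes by total reported cases.
-- geo_id and cases are assumed column names from the public ECDC dataset.
SELECT TOP 10
       geo_id,
       SUM(cases) AS total_cases
FROM OPENROWSET(
        BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet',
        FORMAT = 'PARQUET'
     ) AS a
GROUP BY geo_id
ORDER BY total_cases DESC;
```

A result with one category column and one numeric column like this is a natural fit for a bar chart in the Synapse Studio results pane.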

 

In total, you will probably spend around 5 minutes getting started with Azure Synapse Analytics and analyzing your data.

 

If you want to follow the steps described in this article you can see the actions in the following video:

 
