Startups at Microsoft

Part 2 - Building a Data Lakehouse using Azure Data Explorer - The Deployment

Denise Schlesinger
Jun 07, 2023

Introduction

As businesses continue to generate massive amounts of data, the need for efficient data management solutions becomes increasingly important. This is where a data lakehouse comes in: a hybrid solution that combines the best features of a data lake and a data warehouse.

Part 1 - Building a Data Lakehouse using Azure Data Explorer

In Part 1 we explored how to build a data lakehouse using Azure Data Explorer (ADX), where data flows from Azure SQL DB through Azure Data Factory using Change Data Capture (CDC), and events flow in from Azure Event Hubs.

This article is Part 2 in the series; here we will deploy the solution using Bicep, a powerful infrastructure-as-code (IaC) tool from Microsoft. With this guide, you'll be able to create a data lakehouse that can handle large volumes of data and provide valuable insights for your business.
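If you have not used Bicep before, here is a minimal sketch of what a resource declaration looks like, using an ADX cluster and database as the example. The resource names, SKU, and API versions are illustrative assumptions, not necessarily the values used in the repo's templates.

// Minimal Bicep sketch of an ADX (Kusto) cluster and database.
// Names, SKU, and API versions are illustrative, not the repo's actual values.
param location string = resourceGroup().location

resource adxCluster 'Microsoft.Kusto/clusters@2022-02-01' = {
  name: 'adxdlhouse'                    // hypothetical cluster name
  location: location
  sku: {
    name: 'Dev(No SLA)_Standard_D11_v2' // small dev SKU, enough for a demo
    tier: 'Basic'
    capacity: 1
  }
}

resource adxDb 'Microsoft.Kusto/clusters/databases@2022-02-01' = {
  parent: adxCluster
  name: 'lakehouse'                     // hypothetical database name
  location: location
  kind: 'ReadWrite'
}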

Requirements

  • An Azure account and a logged-in user with admin permissions

Infrastructure Deployment

  • Go to GitHub and download the files from here:

https://github.com/denisa-ms/azure-data-and-ai-examples/tree/master/adx-datalakehouse

  • Go to the Azure portal and log in with a user that has administrator permissions
  • Open the Cloud Shell in the Azure portal
  • Upload the file “all.zip” from the GitHub repo using the upload file button in the Cloud Shell
  • Unzip the file by running: unzip all.zip
  • Run ./createAll.ps1

NOTE: This takes time, so be patient.

Explanation

The code creates the following entities:

Azure SQL Server

Contains an Azure SQL database with the AdventureWorks sample data.
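For illustration, here is a minimal Bicep sketch of how such a server and database can be declared; the sampleName property seeds the AdventureWorksLT sample data. The names and SKU are assumptions, not necessarily what createAll.ps1 deploys.

// Sketch: Azure SQL server plus a database seeded with the AdventureWorksLT
// sample via the built-in sampleName property. Names and SKU are illustrative.
param location string = resourceGroup().location
param sqlAdminLogin string
@secure()
param sqlAdminPassword string

resource sqlServer 'Microsoft.Sql/servers@2021-11-01' = {
  name: 'adxdlhouse-sql'               // hypothetical server name
  location: location
  properties: {
    administratorLogin: sqlAdminLogin
    administratorLoginPassword: sqlAdminPassword
  }
}

resource sqlDb 'Microsoft.Sql/servers/databases@2021-11-01' = {
  parent: sqlServer
  name: 'adventureworks'               // hypothetical database name
  location: location
  sku: {
    name: 'Basic'
    tier: 'Basic'
  }
  properties: {
    sampleName: 'AdventureWorksLT'     // seeds the AdventureWorks sample data
  }
}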

Azure Data Factory (adxdlhouse-adf)

Contains 2 data pipelines:

  • SQLToADX_orders: copies orders from the AdventureWorks sample DB in Azure SQL Server into the ADX table bronzeOrders
  • SQLToADX_products: copies products from the AdventureWorks sample DB in Azure SQL Server into the ADX table bronzeProducts
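In Bicep, the factory itself is a small resource; the pipelines, linked services, and datasets live under it as child resources. Their full copy-activity definitions are lengthy, so the sketch below shows only the factory shell; treat it as an assumption of shape, not the repo's exact template.

// Sketch: the Data Factory resource. Pipelines such as SQLToADX_orders and
// SQLToADX_products are child resources (Microsoft.DataFactory/factories/pipelines)
// whose copy-activity definitions are omitted here for brevity.
param location string = resourceGroup().location

resource dataFactory 'Microsoft.DataFactory/factories@2018-06-01' = {
  name: 'adxdlhouse-adf'
  location: location
  identity: {
    type: 'SystemAssigned' // lets ADF authenticate to Azure SQL and ADX without stored secrets
  }
}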

Azure Event Hubs

Contains a hub called “clicks-stream” that streams click events into the ADX table bronzeClicks.
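In Bicep terms, this maps to an Event Hubs namespace, the clicks-stream hub, and an ADX data connection that routes incoming events into bronzeClicks. The sketch below reuses the hypothetical cluster and database names from the earlier sketch; the namespace name, consumer group, data format, and mapping name are likewise assumptions.

// Sketch: Event Hubs namespace, the clicks-stream hub, and the ADX data
// connection that ingests its events into the bronzeClicks table.
param location string = resourceGroup().location

resource ehNamespace 'Microsoft.EventHub/namespaces@2021-11-01' = {
  name: 'adxdlhouse-ns'                // hypothetical namespace name
  location: location
  sku: {
    name: 'Standard'
    tier: 'Standard'
  }
}

resource clicksHub 'Microsoft.EventHub/namespaces/eventhubs@2021-11-01' = {
  parent: ehNamespace
  name: 'clicks-stream'
  properties: {
    partitionCount: 2
    messageRetentionInDays: 1
  }
}

// References the cluster/database sketched earlier; both are assumed to exist,
// as are the bronzeClicks table and its JSON ingestion mapping.
resource adxCluster 'Microsoft.Kusto/clusters@2022-02-01' existing = {
  name: 'adxdlhouse'
}

resource adxDb 'Microsoft.Kusto/clusters/databases@2022-02-01' existing = {
  parent: adxCluster
  name: 'lakehouse'
}

resource clicksConnection 'Microsoft.Kusto/clusters/databases/dataConnections@2022-02-01' = {
  parent: adxDb
  name: 'clicks-connection'            // hypothetical connection name
  location: location
  kind: 'EventHub'
  properties: {
    eventHubResourceId: clicksHub.id
    consumerGroup: '$Default'
    tableName: 'bronzeClicks'
    dataFormat: 'JSON'
    mappingRuleName: 'bronzeClicks_mapping' // hypothetical mapping name
  }
}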

How to Demo

To run this demo:

  1. Create all the infrastructure by following the steps in the Infrastructure Deployment section above.
  2. Run the 2 pipelines in Azure Data Factory to copy products and orders to ADX.
  3. Ingest sample click events into the bronzeClicks table in Azure Data Explorer using one-click ingestion, as follows:

Select the file: azure-data-and-ai-examples/sample events.json at master · denisa-ms/azure-data-and-ai-examples · GitHub

Click Start ingestion.

We are done!

We now have products and orders from our operational DB (Azure SQL) and events coming from a stream in Event Hubs.

In this demo I chose to add synthetic events using one-click ingestion, but you can also create events and publish them to Event Hubs, and they will be ingested into the bronzeClicks table using streaming ingestion.

I hope you enjoyed this.

Thanks,

Denise

Updated Jun 08, 2023