Part 2 - Building a Data Lakehouse using Azure Data Explorer - The Deployment
Published Jun 07 2023 12:28 PM

Introduction

As businesses continue to generate massive amounts of data, the need for efficient data management solutions becomes increasingly important. This is where a data lakehouse comes in: a hybrid solution that combines the best features of a data lake and a data warehouse.

In Part 1 - Building a Data Lakehouse using Azure Data Explorer, we explored how to build a data lakehouse using Azure Data Explorer (ADX), where data flows from Azure SQL DB using Change Data Capture (CDC) through Azure Data Factory, and events flow in from Azure Event Hubs.

This article is Part 2 in the series. Here we will deploy the solution using Bicep, an infrastructure as code (IaC) tool from Microsoft. With this guide, you'll be able to create a data lakehouse that can handle large volumes of data and provide valuable insights for your business.

Requirements

  • An Azure account and a signed-in user with administrator permissions

Infrastructure Deployment

  • Go to GitHub and download the files from this repository:

https://github.com/denisa-ms/azure-data-and-ai-examples/tree/master/adx-datalakehouse

  • Go to the Azure portal and log in with a user that has administrator permissions
  • Open Cloud Shell in the Azure portal

  • Upload the file “all.zip” from the GitHub repo by using the upload file button in Cloud Shell

  • Unzip the file by running unzip all.zip

  • Run ./createAll.ps1

NOTE: The deployment takes a while, so be patient.

Explanation

The deployment creates the following resources:

Azure SQL Server

Contains an Azure SQL database with the AdventureWorks sample data.

Azure Data Factory – (adxdlhouse-adf)

Contains two data pipelines:

  • SQLToADX_orders: copies the orders from the AdventureWorks sample DB in Azure SQL Server into the ADX table bronzeOrders
  • SQLToADX_products: copies the products from the AdventureWorks sample DB in Azure SQL Server into the ADX table bronzeProducts
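
Both pipelines can be started from the Azure Data Factory UI. As an alternative, the following is a minimal Python sketch that triggers them programmatically. It assumes the azure-identity and azure-mgmt-datafactory packages are installed. The factory and pipeline names come from this article; the environment variable names are hypothetical placeholders, and the resource group is whichever one the deployment created in your subscription.

```python
import os

# Names taken from the article; everything else below is an assumption.
FACTORY_NAME = "adxdlhouse-adf"
PIPELINE_NAMES = ["SQLToADX_orders", "SQLToADX_products"]


def start_pipelines(client, resource_group, factory_name=FACTORY_NAME,
                    pipeline_names=PIPELINE_NAMES):
    """Start each pipeline once and return {pipeline name: run id}."""
    run_ids = {}
    for name in pipeline_names:
        run = client.pipelines.create_run(resource_group, factory_name, name)
        run_ids[name] = run.run_id
    return run_ids


def main():
    # Imported lazily so the helper above can be used without the Azure SDK.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    # Hypothetical environment variable names; set them to your own values.
    subscription_id = os.environ["AZURE_SUBSCRIPTION_ID"]
    resource_group = os.environ["ADX_DLH_RESOURCE_GROUP"]

    client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)
    for name, run_id in start_pipelines(client, resource_group).items():
        print(f"{name}: started run {run_id}")
```

Calling main() only starts the runs; it does not wait for them to finish, so the run status still needs to be checked in the Data Factory monitoring view.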

 

Azure Event Hubs

Contains an event hub called “clicks-stream” that streams click events into the ADX table bronzeClicks.

 

How to Demo

To run this demo:

  1. Create all the infrastructure by following the steps in the Infrastructure Deployment section above.
  2. Run the two pipelines in Azure Data Factory to copy products and orders to ADX.
  3. Ingest sample click events into the bronzeClicks table in Azure Data Explorer, using the sample events file from the GitHub repo and one-click ingestion, as follows:

Select the sample events file:

azure-data-and-ai-examples/sample events.json at master · denisa-ms/azure-data-and-ai-examples · Git...

Click Start Ingestion.

We are done!

We now have products and orders from our operational DB (Azure SQL) and events coming from a stream in Event Hubs.
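
To confirm that all three bronze tables received data, a row count per table is usually enough. The sketch below assumes the azure-kusto-data package is installed and that you are signed in with the Azure CLI. The table names come from this article; the cluster URI and database name are not given here, so they are left as parameters to fill in from your own deployment.

```python
# Table names taken from the article.
BRONZE_TABLES = ["bronzeOrders", "bronzeProducts", "bronzeClicks"]


def count_query(table):
    """Build a KQL query that returns the number of rows in a table."""
    return f"{table} | count"


def bronze_row_counts(cluster_uri, database):
    """Return {table name: row count} for the three bronze tables."""
    # Imported lazily so count_query can be used without the Kusto SDK.
    from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

    # Assumption: authentication through an existing Azure CLI login.
    kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster_uri)
    client = KustoClient(kcsb)

    counts = {}
    for table in BRONZE_TABLES:
        response = client.execute(database, count_query(table))
        # "| count" yields a single row with a single column named Count.
        counts[table] = response.primary_results[0][0]["Count"]
    return counts
```

A count of zero for bronzeOrders or bronzeProducts suggests the corresponding pipeline has not run yet; a zero for bronzeClicks suggests the sample events were not ingested.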

In this demo I chose to add synthetic events using one-click ingestion, but you can also create events and publish them to Event Hubs. They will then be ingested into the bronzeClicks table using streaming ingestion.
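
Publishing events to the hub could look roughly like the following Python sketch, which assumes the azure-eventhub package is installed. The hub name “clicks-stream” comes from this article. The event fields are purely hypothetical placeholders: the real events must use the same fields as the sample events file in the repo so that they match the bronzeClicks table. The connection string is something you need to copy from your own Event Hubs namespace.

```python
import json
import random
import uuid
from datetime import datetime, timezone

EVENT_HUB_NAME = "clicks-stream"  # hub name from the article


def make_click_event():
    """Create one synthetic click event.

    The field names here are hypothetical; replace them with the fields
    used in the repository's sample events file.
    """
    return {
        "eventId": str(uuid.uuid4()),
        "eventType": "CLICK",
        "productId": random.randint(1, 100),
        "eventTime": datetime.now(timezone.utc).isoformat(),
    }


def publish_clicks(connection_string, events):
    """Send the given events to the event hub as a single batch."""
    # Imported lazily so make_click_event works without the Azure SDK.
    from azure.eventhub import EventData, EventHubProducerClient

    producer = EventHubProducerClient.from_connection_string(
        connection_string, eventhub_name=EVENT_HUB_NAME
    )
    with producer:
        batch = producer.create_batch()
        for event in events:
            # add() raises ValueError once the batch is full; a handful of
            # small events fits comfortably in one batch.
            batch.add(EventData(json.dumps(event)))
        producer.send_batch(batch)
```

For example, publish_clicks(connection_string, [make_click_event() for _ in range(10)]) would send ten synthetic events in one batch.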

 

I hope you enjoyed this

Thanks

Denise

Last update: Jun 08 2023 05:23 AM