Integrating Microsoft Fabric with Azure Databricks Delta Tables
Published Sep 04 2023 08:55 AM 9,005 Views
Microsoft

Introduction

Microsoft Fabric is a powerful unified analytics solution that allows users to seamlessly connect to various data sources, including Azure Databricks, and create insightful reports and visualizations without the need to move the data.

 

In this tutorial, we'll show you step-by-step how to connect to Azure Databricks generated Delta Tables and create a report in Microsoft Fabric.

By the end of this tutorial, you'll have the knowledge needed to read Azure Databricks generated Delta Tables from a Microsoft Fabric using Notebook and SQL query. You will also learn how to create a Power BI report that can help drive business decisions. So, let's get started!

 

Anu_Singh_6-1693676822491.png

 

Prerequisites

Before you connect, complete these steps:

  • An Azure Databricks workspace
  • An ADLS Gen2 account to store delta table and a parquet file
  • A Fabric workspace. For more information, see Fabric trial docs.
  • A lakehouse in Fabric workspace. For more information, see Create a lakehouse with OneLake.

 

Create Delta Table in Databricks

  1. Create a delta table from Databricks Notebook.

   In this case, I am using parquet files stored in ADLS Gen2 account to create a delta table.

 

%SQL

--create external table from parquet files stored in ADLS Gen2
CREATE TABLE fact_internet_sales
LOCATION 'abfss://demo@[yourdatalake].dfs.core.windows.net/fact_internet_sales'
AS
SELECT * FROM PARQUET.`abfss://data@[yourdatalake].dfs.core.windows.net/fact_internet_sales`;

 

Anu_Singh_1-1693605529594.png

 

 

%Python

#Read parquet files stored in ADLS Gen2
df = spark.read.format('parquet')\
    .load("abfss://data@[yourdatalake].dfs.core.windows.net/fact_internet_sales")

#Write to an external delta table
df.write \
    .format("delta") \
    .option("path", "abfss://demo@[yourdatalake].dfs.core.windows.net/fact_internet_sales") \
    .saveAsTable("fact_internet_sales")

 

Anu_Singh_2-1693605761876.png

 

 

Read the Delta Table in Fabric

Open the Fabric workspace to read, analyse and visualise the data.

 

Create shortcut to delta table

 

  1. In the lakehouse, click on ellipses (…) next to the Tables and select New Shortcut.

     Anu_Singh_1-1693606671369.png

     

  2. In the New shortcut wizard, select Azure Data Lake Storage Gen2 tile. For more information, see Create an Azure Data Lake Storage Gen2 shortcut                                                                                                                Anu_Singh_2-1693606905366.png

     

  3. Enter the storage account URL in Connection settings.

    Field

    Details

    URL

    https://StorageAccountName.dfs.core.windows.net

     Anu_Singh_3-1693607504533.png                                               

  4. Enter the connection details (sign in if required) and select Next

    In this case, I am using ‘Organization Account’ Authentication kind and hence need to sign in.

    Field

    Details

    Connection

    Existing connections for the specified storage location will appear in the drop-down. If none exist, create a new connection.

    Connection name

    The Azure Data Lake Storage Gen2 connection name.

    Authentication kind

    The supported models are: Organizational account, Account Key, Shared Access Signature (SAS), and Service principal. For more information, see ADLS shortcuts.

     Anu_Singh_4-1693607697749.png                                                                                   

  5. Enter the Shortcut Name and Sub path details and then click Create

    Field

    Details

    Shortcut Name

    Name of your shortcut

    URL

    The Azure Data Lake Storage Gen2 URL from the last page.

    Sub Path

    The directory where the delta table resides.

    Anu_Singh_5-1693607870447.png

  6. The shortcut pointing to the delta table (fact_internet_sales) created in the last section will now appear as a delta table under Tables in the Explorer pane.                                                                                                        
  7. Click on the table (fact_internet_sales) and the data in the table will show up.

     
    Anu_Singh_2-1693671634119.png

     

 

Read the data from Notebook - Lakehouse mode

The data in the table can now be queried directly from the notebook in Fabric.

 

  1. Right-click on the table or click on ellipses (…) next to the table, click Open in notebook and then New notebook.
     
     

     
    Anu_Singh_2-1693839332613.png
     

     

  2. New notebook will appear with the query automatically generated to read the data in the table.

  Select the Run Cell button or press Ctrl+Enter to execute the query and view the data.

Anu_Singh_8-1693673096446.png

 

 

Read the data using SQL - SQL Endpoint mode

The data in the table can also be queried directly using T-SQL query in Fabric.

 

  1. Browse to the SQL Endpoint created as part of Lakehouse provisioning from your workspace.
  2. After opening SQL Endpoint from the workspace, expand the database, schema and tables folder in the object Explorer to see all tables listed.

  3. Right-click on the table (fact_internet_sales) or click on ellipses (…) next to the table, click New SQL Query and then Select TOP 100 rows

     
    Anu_Singh_2-1693674552264.png

      

  4.  The script will be automatically generated and executed to show the data in the table or click on Run to execute the query.

    Anu_Singh_4-1693675947483.png

 

 

Visualise the data in Fabric using Power BI

The data in delta table can now be accessed and analysed in Power BI. You can either create a new dataset or use the default dataset created as part of lakehouse provisioning for a new report. For more information, see Direct Lake in Power BI and Microsoft Fabric

 

Using new dataset

  1. If you are in the Lakehouse mode, click on New Power BI dataset

      Anu_Singh_9-1693677766359.png

 

If you are in the SQL endpoint mode, click on New Power BI dataset from Reporting tab.

Anu_Singh_8-1693677639859.png

 

  1. In the New Dataset dialog, select the table to be included in the dataset and click Confirm.

Anu_Singh_12-1693599480206.png

 

  1. The dataset is automatically saved in the workspace, and then opens the dataset. In the web modelling experience page, click on New Report

Anu_Singh_5-1693839480214.png

 

  1. In the report authoring page, drag or select the attributes from Data pane to the left-hand side pane to be included in the visualization.

Anu_Singh_3-1693839409972.png

 

 

 

Using default dataset

  1. Select your workspace and open the default dataset.

Anu_Singh_10-1693677984415.png

 

  1. On the dataset page, click on Start from scratch to create a new report.

Anu_Singh_11-1693678126559.png

 

  1. In the report authoring page, drag or select the attributes from Data pane to the left-hand side pane to be included in the visualization.

Anu_Singh_4-1693839429999.png

 

 

 

 

Summary

In conclusion, this guide provides a seamless solution for accessing Azure Databricks generated delta tables from Microsoft Fabric and visualizing the data in Power BI without the need to move the data. By following the steps outlined in this guide, you can easily connect to your delta tables and extract valuable insights from your data. With the power of Azure Databricks and Microsoft Fabric combined, you can take your data analysis to the next level and make informed decisions that drive business success.

 

Give it a try and let me know if this was helpful.

 

3 Comments
Co-Authors
Version history
Last update:
‎Sep 04 2023 08:54 AM
Updated by: