Blog Post

FastTrack for Azure
4 MIN READ

Integrating Microsoft Fabric with Azure Databricks Delta Tables

Anu_Singh's avatar
Anu_Singh
Icon for Microsoft rankMicrosoft
Sep 04, 2023

Introduction

Microsoft Fabric is a powerful unified analytics solution that allows users to seamlessly connect to various data sources, including Azure Databricks, and create insightful reports and visualizations without the need to move the data.

 

In this tutorial, we'll show you step-by-step how to connect to Azure Databricks generated Delta Tables and create a report in Microsoft Fabric.

By the end of this tutorial, you'll have the knowledge needed to read Azure Databricks generated Delta Tables from a Microsoft Fabric using Notebook and SQL query. You will also learn how to create a Power BI report that can help drive business decisions. So, let's get started!

 

 

Prerequisites

Before you connect, complete these steps:

  • An Azure Databricks workspace
  • An ADLS Gen2 account to store delta table and a parquet file
  • A Fabric workspace. For more information, see Fabric trial docs.
  • A lakehouse in Fabric workspace. For more information, see Create a lakehouse with OneLake.

 

Create Delta Table in Databricks

  1. Create a delta table from Databricks Notebook.

   In this case, I am using parquet files stored in ADLS Gen2 account to create a delta table.

 

%SQL

--create external table from parquet files stored in ADLS Gen2
CREATE TABLE fact_internet_sales
LOCATION 'abfss://demo@[yourdatalake].dfs.core.windows.net/fact_internet_sales'
AS
SELECT * FROM PARQUET.`abfss://data@[yourdatalake].dfs.core.windows.net/fact_internet_sales`;

 

 

 

%Python

#Read parquet files stored in ADLS Gen2
df = spark.read.format('parquet')\
    .load("abfss://data@[yourdatalake].dfs.core.windows.net/fact_internet_sales")

#Write to an external delta table
df.write \
    .format("delta") \
    .option("path", "abfss://demo@[yourdatalake].dfs.core.windows.net/fact_internet_sales") \
    .saveAsTable("fact_internet_sales")

 

 

 

Read the Delta Table in Fabric

Open the Fabric workspace to read, analyse and visualise the data.

 

Create shortcut to delta table

 

  1. In the lakehouse, click on ellipses (…) next to the Tables and select New Shortcut.

     

     

  2. In the New shortcut wizard, select Azure Data Lake Storage Gen2 tile. For more information, see Create an Azure Data Lake Storage Gen2 shortcut                                                                                                                

     

  3. Enter the storage account URL in Connection settings.

    Field

    Details

    URL

    https://StorageAccountName.dfs.core.windows.net

                                                    

  4. Enter the connection details (sign in if required) and select Next

    In this case, I am using ‘Organization Account’ Authentication kind and hence need to sign in.

    Field

    Details

    Connection

    Existing connections for the specified storage location will appear in the drop-down. If none exist, create a new connection.

    Connection name

    The Azure Data Lake Storage Gen2 connection name.

    Authentication kind

    The supported models are: Organizational account, Account Key, Shared Access Signature (SAS), and Service principal. For more information, see ADLS shortcuts.

                                                                                        

  5. Enter the Shortcut Name and Sub path details and then click Create

    Field

    Details

    Shortcut Name

    Name of your shortcut

    URL

    The Azure Data Lake Storage Gen2 URL from the last page.

    Sub Path

    The directory where the delta table resides.

  6. The shortcut pointing to the delta table (fact_internet_sales) created in the last section will now appear as a delta table under Tables in the Explorer pane.                                                                                                        
  7. Click on the table (fact_internet_sales) and the data in the table will show up.

     

     

 

Read the data from Notebook - Lakehouse mode

The data in the table can now be queried directly from the notebook in Fabric.

 

  1. Right-click on the table or click on ellipses (…) next to the table, click Open in notebook and then New notebook.
     
     

     
     

     

  2. New notebook will appear with the query automatically generated to read the data in the table.

  Select the Run Cell button or press Ctrl+Enter to execute the query and view the data.

 

 

Read the data using SQL - SQL Endpoint mode

The data in the table can also be queried directly using T-SQL query in Fabric.

 

  1. Browse to the SQL Endpoint created as part of Lakehouse provisioning from your workspace.
  2. After opening SQL Endpoint from the workspace, expand the database, schema and tables folder in the object Explorer to see all tables listed.

  3. Right-click on the table (fact_internet_sales) or click on ellipses (…) next to the table, click New SQL Query and then Select TOP 100 rows

     

      

  4.  The script will be automatically generated and executed to show the data in the table or click on Run to execute the query.

    

 

 

Visualise the data in Fabric using Power BI

The data in delta table can now be accessed and analysed in Power BI. You can either create a new dataset or use the default dataset created as part of lakehouse provisioning for a new report. For more information, see Direct Lake in Power BI and Microsoft Fabric

 

Using new dataset

  1. If you are in the Lakehouse mode, click on New Power BI dataset

      

 

If you are in the SQL endpoint mode, click on New Power BI dataset from Reporting tab.

 

  1. In the New Dataset dialog, select the table to be included in the dataset and click Confirm.

 

  1. The dataset is automatically saved in the workspace, and then opens the dataset. In the web modelling experience page, click on New Report

 

  1. In the report authoring page, drag or select the attributes from Data pane to the left-hand side pane to be included in the visualization.

 

 

 

Using default dataset

  1. Select your workspace and open the default dataset.

 

  1. On the dataset page, click on Start from scratch to create a new report.

 

  1. In the report authoring page, drag or select the attributes from Data pane to the left-hand side pane to be included in the visualization.

 

 

 

 

Summary

In conclusion, this guide provides a seamless solution for accessing Azure Databricks generated delta tables from Microsoft Fabric and visualizing the data in Power BI without the need to move the data. By following the steps outlined in this guide, you can easily connect to your delta tables and extract valuable insights from your data. With the power of Azure Databricks and Microsoft Fabric combined, you can take your data analysis to the next level and make informed decisions that drive business success.

 

Give it a try and let me know if this was helpful.

 

Updated Sep 04, 2023
Version 1.0
  • GuillaumeB's avatar
    GuillaumeB
    Brass Contributor

    Anu_Singh, by doing that the Delta Tables from Databricks are not optimized for Direct Lake mode (v-Order), is there any option to apply this in the databricks notebook?

     

    Secondly, if I'm using Databricks Unity Catalog can i use OneLake to host the managed Delta Tables? Will they be recognized as Delta Table by Fabric or just files (unmanaged zone of a Fabric Lakehouse). Thanks 

  • suchintya's avatar
    suchintya
    Copper Contributor

    Anu_Singh  :: I am also looking at how to go about configuring the V-order compression in Databricks.. Any suggestions