Introduction
Microsoft Fabric is a powerful unified analytics solution that allows users to seamlessly connect to various data sources, including Azure Databricks, and create insightful reports and visualizations without the need to move the data.
In this tutorial, we'll show you step-by-step how to connect to Azure Databricks generated Delta Tables and create a report in Microsoft Fabric.
By the end of this tutorial, you'll have the knowledge needed to read Azure Databricks generated Delta Tables from a Microsoft Fabric using Notebook and SQL query. You will also learn how to create a Power BI report that can help drive business decisions. So, let's get started!
Prerequisites
Before you connect, complete these steps:
- An Azure Databricks workspace
- An ADLS Gen2 account to store delta table and a parquet file
- A Fabric workspace. For more information, see Fabric trial docs.
- A lakehouse in Fabric workspace. For more information, see Create a lakehouse with OneLake.
Create Delta Table in Databricks
- Create a delta table from Databricks Notebook.
In this case, I am using parquet files stored in ADLS Gen2 account to create a delta table.
%SQL
--create external table from parquet files stored in ADLS Gen2
CREATE TABLE fact_internet_sales
LOCATION 'abfss://demo@[yourdatalake].dfs.core.windows.net/fact_internet_sales'
AS
SELECT * FROM PARQUET.`abfss://data@[yourdatalake].dfs.core.windows.net/fact_internet_sales`;
%Python
#Read parquet files stored in ADLS Gen2
df = spark.read.format('parquet')\
.load("abfss://data@[yourdatalake].dfs.core.windows.net/fact_internet_sales")
#Write to an external delta table
df.write \
.format("delta") \
.option("path", "abfss://demo@[yourdatalake].dfs.core.windows.net/fact_internet_sales") \
.saveAsTable("fact_internet_sales")
Read the Delta Table in Fabric
Open the Fabric workspace to read, analyse and visualise the data.
Create shortcut to delta table
- In the lakehouse, click on ellipses (…) next to the Tables and select New Shortcut.
- In the New shortcut wizard, select Azure Data Lake Storage Gen2 tile. For more information, see Create an Azure Data Lake Storage Gen2 shortcut
- Enter the storage account URL in Connection settings.
Field
Details
URL
- Enter the connection details (sign in if required) and select Next.
In this case, I am using ‘Organization Account’ Authentication kind and hence need to sign in.
Field
Details
Connection
Existing connections for the specified storage location will appear in the drop-down. If none exist, create a new connection.
Connection name
The Azure Data Lake Storage Gen2 connection name.
Authentication kind
The supported models are: Organizational account, Account Key, Shared Access Signature (SAS), and Service principal. For more information, see ADLS shortcuts.
- Enter the Shortcut Name and Sub path details and then click Create.
Field
Details
Shortcut Name
Name of your shortcut
URL
The Azure Data Lake Storage Gen2 URL from the last page.
Sub Path
The directory where the delta table resides.
- The shortcut pointing to the delta table (fact_internet_sales) created in the last section will now appear as a delta table under Tables in the Explorer pane.
-
Click on the table (fact_internet_sales) and the data in the table will show up.
Read the data from Notebook - Lakehouse mode
The data in the table can now be queried directly from the notebook in Fabric.
- Right-click on the table or click on ellipses (…) next to the table, click Open in notebook and then New notebook.
- New notebook will appear with the query automatically generated to read the data in the table.
Select the Run Cell button or press Ctrl+Enter to execute the query and view the data.
Read the data using SQL - SQL Endpoint mode
The data in the table can also be queried directly using T-SQL query in Fabric.
- Browse to the SQL Endpoint created as part of Lakehouse provisioning from your workspace.
-
After opening SQL Endpoint from the workspace, expand the database, schema and tables folder in the object Explorer to see all tables listed.
-
Right-click on the table (fact_internet_sales) or click on ellipses (…) next to the table, click New SQL Query and then Select TOP 100 rows.
- The script will be automatically generated and executed to show the data in the table or click on Run to execute the query.
Visualise the data in Fabric using Power BI
The data in delta table can now be accessed and analysed in Power BI. You can either create a new dataset or use the default dataset created as part of lakehouse provisioning for a new report. For more information, see Direct Lake in Power BI and Microsoft Fabric
Using new dataset
- If you are in the Lakehouse mode, click on New Power BI dataset.
If you are in the SQL endpoint mode, click on New Power BI dataset from Reporting tab.
- In the New Dataset dialog, select the table to be included in the dataset and click Confirm.
- The dataset is automatically saved in the workspace, and then opens the dataset. In the web modelling experience page, click on New Report
- In the report authoring page, drag or select the attributes from Data pane to the left-hand side pane to be included in the visualization.
Using default dataset
- Select your workspace and open the default dataset.
- On the dataset page, click on Start from scratch to create a new report.
- In the report authoring page, drag or select the attributes from Data pane to the left-hand side pane to be included in the visualization.
Summary
In conclusion, this guide provides a seamless solution for accessing Azure Databricks generated delta tables from Microsoft Fabric and visualizing the data in Power BI without the need to move the data. By following the steps outlined in this guide, you can easily connect to your delta tables and extract valuable insights from your data. With the power of Azure Databricks and Microsoft Fabric combined, you can take your data analysis to the next level and make informed decisions that drive business success.
Give it a try and let me know if this was helpful.