Analytics on Azure Blog

Approaches to Integrating Azure Databricks with Microsoft Fabric: The Better Together Story!

Rafia_Aqil, Microsoft
Sep 12, 2025

Azure Databricks and Microsoft Fabric can be combined to create a unified and scalable analytics ecosystem. This document outlines eight distinct integration approaches, each accompanied by step-by-step implementation guidance and key design considerations. These methods are not prescriptive—your cloud architecture team can choose the integration strategy that best aligns with your organization’s governance model, workload requirements and platform preferences. Whether you prioritize centralized orchestration, direct data access, or seamless reporting, the flexibility of these options allows you to tailor the solution to your specific needs. 

  1. Direct Publish from DBSQL to Fabric

What it is: Direct Publish from Databricks SQL (DBSQL) to Power BI enables seamless integration between Databricks and Microsoft Fabric for reporting. This method leverages the native Power BI connector for Databricks SQL, allowing users to query data directly from Databricks SQL endpoints into Power BI dashboards and reports.  

Select one of the following two options to implement: 

    • Using SQL Warehouse Connection Details: Go to the Compute section in your Databricks workspace and select SQL Warehouses from the menu. Create a new SQL Warehouse or choose an existing one. Then click on Connection Details to find the Server Hostname, HTTP Path, JDBC URL and OAuth URL. 

There are many ways of bringing Databricks data into Fabric using the above details. One way is to navigate to Microsoft Fabric, select Azure Databricks in your Dataflow Gen2 pipeline, select the desired tables, add a destination (e.g., a Lakehouse or Fabric SQL database) and start reporting.  
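If you want to sanity-check those connection details outside of Fabric, the same Server Hostname and HTTP Path work with the open-source databricks-sql-connector. A minimal sketch; the hostname, HTTP path, token handling and the samples.nyctaxi schema are illustrative placeholders, not values from this article:

```python
# Hedged sketch (not part of the Fabric steps): validate the SQL Warehouse
# connection details with the open-source databricks-sql-connector.
# pip install databricks-sql-connector
from databricks import sql

SERVER_HOSTNAME = "adb-1234567890123456.7.azuredatabricks.net"  # placeholder
HTTP_PATH = "/sql/1.0/warehouses/abcdef1234567890"              # placeholder
ACCESS_TOKEN = "<personal-access-token>"  # prefer OAuth / Key Vault in production

with sql.connect(
    server_hostname=SERVER_HOSTNAME,
    http_path=HTTP_PATH,
    access_token=ACCESS_TOKEN,
) as connection:
    with connection.cursor() as cursor:
        # List tables in an example catalog/schema to confirm connectivity
        cursor.execute("SHOW TABLES IN samples.nyctaxi")
        for row in cursor.fetchall():
            print(row)
```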

    • Publishing from Catalog to Power BI: This step refers to a one-click publishing experience from the Databricks Catalog UI directly to a Microsoft Fabric Power BI workspace.  

Select “Publish to Power BI”; this will redirect you to your “Semantic Models” settings. Set up a refresh schedule and update the settings to match your organization’s requirements. 

 Navigate to your workspace, find your newly published dataset as “Semantic model” and start creating reports.  

  

 Considerations: Review the differences between Import mode and DirectQuery mode here: Use DirectQuery in Power BI Desktop - Power BI | Microsoft Learn.  

 

  2. Mirroring Azure Databricks Unity Catalog

What it is: Mirroring automates the shortcut approach by syncing an entire Databricks Unity Catalog (or selected schemas) into Fabric. Fabric creates a mirrored catalog with shortcuts for each Databricks table, so Fabric users see a ready-to-query database reflecting Databricks data. Note: To leverage this integration, you don't need any compute resources on Azure Databricks. Only Fabric capacity is required, and it is consumed only when Unity Catalog tables are accessed to read data from Fabric. 

Steps to implement: 

Prepare Databricks Unity Catalog: In Azure Databricks:  

    • Ensure Unity Catalog is enabled and your tables are in it. Enable “External Data Access” on the metastore (to allow Fabric to read the underlying data).  
    • Also, the Databricks user or service principal used for mirroring must have the “EXTERNAL USE SCHEMA” privilege and “Data Reader” access on the target schemas, and at least SELECT on the tables (a SQL sketch of these grants follows the steps below). 
    • Create Mirrored Catalog in Fabric: In a Fabric workspace, click New > Mirrored Azure Databricks Catalog. Provide your Databricks workspace details (URL, OAuth credentials or PAT). Select the Unity Catalog and specific databases/schemas to mirror. Fabric will list tables; check those you want to include (or all). Name the new Fabric catalog and create it. 

 

    • Fabric sets up Shortcuts: After creation, Fabric will automatically create OneLake shortcuts for each Databricks table you selected and bundle them as a SQL Endpoint (with the mirrored catalog name). There will also be a default Power BI dataset for these tables, ready for report building. 
    • Use and Manage: Query the mirrored tables in Fabric’s Data Warehouse or Power BI. The data stays in Databricks storage and updates in real time (Fabric queries fetch the latest data via the shortcuts). Manage user access through Fabric’s permissions (mirroring doesn’t carry over UC’s fine-grained ACLs, so restrict the Fabric catalog/tables to authorized users as needed). 
    • (Recommended) You can use shortcuts to pull in data from mirrored tables into your lakehouses/warehouses/datasets.  

Select “Microsoft OneLake” then enable the desired mirrored data from the explorer. 
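The Unity Catalog prerequisites from the first step can be applied from a Databricks notebook or SQL editor. A minimal sketch, assuming a hypothetical catalog sales_catalog, schema curated, and a service principal that Fabric authenticates as:

```python
# Hedged sketch of the Unity Catalog grants the mirroring identity needs.
# Catalog/schema names and the principal are placeholders; "External data
# access" itself is enabled on the metastore in Catalog Explorer, not via SQL.
spark.sql("""
  GRANT EXTERNAL USE SCHEMA ON SCHEMA sales_catalog.curated
  TO `fabric-mirroring-sp@contoso.com`
""")
spark.sql("""
  GRANT SELECT ON SCHEMA sales_catalog.curated
  TO `fabric-mirroring-sp@contoso.com`
""")
```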

Note: Within Databricks, “External Data Access” on the metastore is a preview feature. Also, Unity Catalog’s row-level security and privileges are not enforced on the Fabric side, so treat the Fabric copy as a separate interface and secure it accordingly. This method greatly simplifies exposing large amounts of Databricks data to Fabric, with zero data duplication. Before adopting mirroring, it's essential to understand current limitations around networking configurations, supported data types, and other platform constraints. Please review the official Microsoft documentation for the most up-to-date details: Azure Databricks Mirroring Limitations. 

At present, mirroring Azure Databricks Unity Catalog is not supported when the Databricks workspace is behind a private endpoint or has IP access restrictions. This is because Fabric must reach the Databricks workspace’s public API to establish the mirrored catalog. For customers operating in secure environments, this limitation can be a blocker. However, there are viable alternatives.  

 

How do I connect to my Azure Databricks workspaces that are behind a private endpoint? 

Connecting to Azure Databricks workspaces that are behind a private endpoint is not supported yet. We recommend evaluating the use of IP access lists in this scenario. By controlling external access to your catalog through external processing engines and combining it with IP access lists, you can prevent data exfiltration risks. To use IP access lists, you will need to add the Power BI and Power Query Online IPs to your IP access list from the list of IPs at Download Azure IP Ranges and Service Tags – Public Cloud from the Official Microsoft Download Center. 
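If you adopt the IP access list approach, the allowed ranges can be added with the Databricks IP Access Lists REST API. A minimal sketch; the workspace URL and CIDR ranges are placeholders, and IP access lists must already be enabled on the workspace:

```python
# Hedged sketch: add Power BI / Power Query Online ranges to a Databricks
# workspace IP access list via the REST API. The CIDR values are placeholders;
# use the ranges from the Azure IP Ranges and Service Tags download.
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<databricks-pat-or-entra-id-token>"

payload = {
    "label": "power-bi-and-power-query-online",
    "list_type": "ALLOW",
    "ip_addresses": ["203.0.113.0/24", "198.51.100.0/24"],  # placeholder CIDRs
}

resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/ip-access-lists",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```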

 

  3. Delta Sharing for Cross-Platform Data Exchange 

What it is: Delta Sharing is an open protocol for sharing Delta Lake data between platforms. Databricks can act as a Delta Share server and Fabric can consume shared data via its connectors. This method is useful if direct Fabric connectivity to Databricks is not possible (different orgs, strict network separation) or if you want an industry-standard way to exchange data. 

Steps to implement: 

    • Create a Share in Databricks: In Databricks Unity Catalog, define a Share and add the desired table(s) as shareable assets. For example, create a share SalesShare with the sales.curated Delta table. Then add a Recipient (a SQL sketch of this setup follows these steps): 
    • If the Fabric user is within your org, you might use a Microsoft Entra ID recipient with OIDC federation (enter their tenant ID, etc.). Otherwise, generate a token-based Recipient, which will give you a token and an endpoint URL. 
    • Copy the Delta Sharing endpoint URL/bearer token (often provided as a JSON or .share file). This contains all info Fabric needs to access the share. 
    • Set Up Fabric Dataflow/Pipeline: In Fabric, create a Dataflow Gen2 or a Data Factory pipeline to import the data: 
    • For a Dataflow: choose Delta Sharing as the data source. When prompted, provide the sharing endpoint URL and token from step 1. Select the shared table to load. 

 

 

    • For a Pipeline: use a Copy Data activity. Configure the Source as Delta Sharing (enter URL and token) and the Sink as a Fabric destination (Lakehouse table, Warehouse, etc.). 
    • Run to Import Data: Execute the dataflow or pipeline. Fabric will connect to the Databricks share and pull the data into Fabric’s OneLake (or the specified destination). The data is now materialized in Fabric (this is an ETL approach, unlike methods 1 and 2). 
    • Use/Refresh: Use the imported data in Fabric normally (reports, queries). To keep it updated, schedule the dataflow/pipeline to run at the needed intervals or triggers. If using token-based auth, note the token’s expiry and renew it as required. (Recommended) You can use shortcuts to pull in data from the Delta Share into your lakehouses/warehouses/datasets. 
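The Databricks-side share setup from the first step can also be scripted. A minimal SQL-via-notebook sketch using a token-based recipient; the catalog name main and the recipient name are assumptions, while the share and table names follow the example above:

```python
# Hedged sketch of the Databricks-side share setup (run in a notebook cell or
# the SQL editor).
spark.sql("CREATE SHARE IF NOT EXISTS SalesShare COMMENT 'Curated sales data for Fabric'")
spark.sql("ALTER SHARE SalesShare ADD TABLE main.sales.curated")  # catalog `main` assumed

# Token-based recipient (use an Entra ID / OIDC recipient for in-org sharing
# instead, as described above).
spark.sql("CREATE RECIPIENT IF NOT EXISTS fabric_recipient COMMENT 'Fabric consumer'")
spark.sql("GRANT SELECT ON SHARE SalesShare TO RECIPIENT fabric_recipient")

# The activation link for downloading the credential (.share) file appears in
# the recipient details.
display(spark.sql("DESCRIBE RECIPIENT fabric_recipient"))
```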

Considerations: Delta Sharing ensures secure, read-only access without having to set up network peering or give direct storage permissions. It’s ideal for sharing with external parties or across tenants. However, unlike shortcuts, this method creates a copy of data in Fabric (so ensure you manage the data lifecycle and storage costs). Also, schema changes in Databricks won’t automatically sync – you’d need to adjust the dataflow and share definition accordingly. For internal integrations with full control, methods 1 or 2 are usually easier, but Delta Sharing is a robust alternative when others aren’t feasible. 

 

  4. Azure Databricks Activity in Fabric Pipelines (Orchestrate Databricks from Fabric)

What it is: Fabric’s Data Factory allows running an Azure Databricks activity inside a pipeline, similar to how Azure Data Factory does. This lets you trigger Databricks notebooks, scripts, or jobs as part of a Fabric workflow. Use this to coordinate Databricks tasks with other Fabric ETL steps in one unified pipeline. 

Steps to implement: 

    • Create a Pipeline in Fabric: In your Fabric workspace, create a new data pipeline. 
    • Add the Databricks Activity: In the pipeline editor, click + and add the Azure Databricks activity. If you haven’t connected Fabric to Databricks before, click New to create a connection (linked service) for Databricks: 
    • Enter the Databricks workspace URL. 
    • Choose authentication (typically a personal access token (PAT) or Microsoft Entra ID OAuth). 
    • Provide the token or credentials, then test and create the connection. 
    • Configure Cluster Settings: In the Databricks activity, specify how to run the task. 
    • Specify the Task: On the activity, choose the run type (Notebook, Python, JAR, Databricks Job).  
    • Link in Pipeline: Connect the Databricks activity in sequence with other activities. For example, you might have: 
    • Copy data from source to lake (Fabric Copy activity) -> Run Databricks notebook to transform -> Copy results to Warehouse -> etc. Use the output of previous activities as input parameters for the notebook if needed (via pipeline variable mappings). 
    • Run and Monitor: Execute the pipeline (debug or publish and trigger). The Fabric pipeline will invoke the Databricks activity; monitor logs in Fabric for high-level status and in Databricks for detailed notebook logs. On success, downstream pipeline steps will proceed. On failure, handle it with pipeline error paths or alerts. 
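On the Databricks side, a notebook invoked by the activity can read pipeline parameters through widgets. A minimal sketch, assuming a hypothetical base parameter named source_path and a hypothetical output table:

```python
# Hedged sketch of the Databricks notebook invoked by the Fabric pipeline.
# `source_path` and the output table name are placeholders; the value is
# supplied via the activity's base parameters.
dbutils.widgets.text("source_path", "")          # created/overridden by the pipeline
source_path = dbutils.widgets.get("source_path")

df = spark.read.format("delta").load(source_path)
df_clean = df.dropDuplicates()                   # placeholder transformation

# Write the result where downstream Fabric activities (e.g., a Copy activity)
# expect to find it.
df_clean.write.mode("overwrite").format("delta").saveAsTable("main.curated.pipeline_output")
```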

Tip: The Databricks activity allows end-to-end orchestration. Leverage cluster auto-termination to save costs in Databricks. 

 

  5. Automatic Publishing to Power BI from Databricks (Power BI Tasks for Jobs)

What it is: Instead of Fabric calling Databricks, this method has Databricks call Fabric’s Power BI. Azure Databricks Workflows (Jobs) include a Power BI task type that publishes datasets or refreshes them in Fabric (Power BI service). This allows Databricks to automatically update Power BI dashboards whenever data processing is complete, achieving near real-time BI without manual refresh schedules. 

Steps to implement: 

    • Set up Power BI access in Databricks and add a Power BI task to a Databricks job: In Databricks, create or edit a Workflows job. After your data processing task(s), add a new task and select Power BI as the task type. Configure the following: 
    • Pick a Databricks SQL Warehouse (or cluster) and specify the Unity Catalog tables or views you want in Power BI. 
    • Select the Power BI connection and target Workspace. 
    • Provide a Dataset name (existing or new). If new, the task will create a dataset in the Power BI workspace; if existing, it will update it. 
    • Choose the data storage mode for the dataset: Import (push full data into Power BI) or DirectQuery (Power BI will query Databricks live). (Use Import for smaller datasets or when you want Power BI caching, DirectQuery for large or frequently changing data.) 
    • Map the Unity Catalog tables to the dataset. Optionally set up relationships or let Power BI infer them (you can refine in Power BI later if needed). 
    • Run the Databricks Job: When the job executes, the Power BI task will connect to Fabric and publish the dataset.  
    • Verify in Power BI: In the Fabric Power BI workspace, confirm the dataset appears/updates. Build or update reports to use this dataset.  
    • Automate and Iterate: Now your Databricks pipeline directly triggers BI updates. Adjust the job schedule (or trigger it via the Jobs API on data arrival, as sketched below) to ensure reports are as fresh as required.  
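For the trigger-via-API option, the job can be started with the Databricks Jobs API (jobs/run-now). A minimal sketch; the workspace URL, token, and job ID are placeholders:

```python
# Hedged sketch: trigger the Databricks job (including its Power BI task) from
# an upstream process when new data lands.
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<databricks-pat-or-entra-id-token>"
JOB_ID = 123456789  # from the job's page in the Workflows UI

resp = requests.post(
    f"{WORKSPACE_URL}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": JOB_ID},
    timeout=30,
)
resp.raise_for_status()
print("Triggered run_id:", resp.json().get("run_id"))
```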

Key benefits: This tight integration means Power BI is always in sync with Databricks. No more waiting for scheduled refresh times or manual exports. It also ensures one source of truth (the data comes straight from Unity Catalog).  

 

  6. Integrate Databricks External Tables with OneLake

What it is: This approach integrates Databricks Unity Catalog external tables with Microsoft Fabric. The integration uses the Databricks API (authenticated with a personal access token or OAuth) to sync Unity Catalog tables to a Fabric Lakehouse using shortcuts. 

Steps to implement: 

    • Get the Databricks Workspace URL: Navigate to your Databricks workspace and copy the URL from https:// through .net. 
    • Create a Token: Create a personal access token, which the Databricks API uses to authenticate when accessing the Unity Catalog tables. Navigate to your user profile at the top right corner, then Settings > User > Developer > Access tokens. 
    • Generate a new token by providing a comment (name) and selecting a token lifetime (for example, 90 days). 

      After clicking Generate, make sure to copy the token, as you won’t be able to see it again. 

    • Unity Catalog and Schema Information: Gather the Unity Catalog name and schema for which you plan to create shortcuts and integrate with a Fabric Lakehouse. In this example, the catalog is named gold, the schema is edw_gold, and the table types are External. 
    • Microsoft Fabric Lakehouse: In Microsoft Fabric, navigate to Workspaces. If you don’t already have one, create a new workspace; otherwise, select an existing workspace to proceed. Create a Lakehouse. Once created, grab the Workspace ID: copy the ID that appears after groups/............/list. 

        Lakehouse ID: Open the Lakehouse and copy the ID that appears after lakehouses/.......?experience. 

 

    • Run the sync and observe results: Add the sync step, which syncs the Unity Catalog tables to OneLake as shortcuts (a sketch of this sync logic follows the note below). 

      Note: The log lists the tables successfully synced from Unity Catalog to Fabric (external tables) and shows that Unity Catalog managed Delta tables are not supported. 
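The sync step's logic can be approximated with the two REST APIs involved: listing Unity Catalog tables from Databricks and creating OneLake shortcuts in Fabric. The sketch below is an assumption-laden outline, not the article's script; endpoint paths and the shortcut payload shape should be verified against the current Databricks REST API and OneLake Shortcuts API documentation:

```python
# Hedged sketch of the sync logic: list external tables in a Unity Catalog
# schema via the Databricks REST API, then create a OneLake shortcut per table
# via the Fabric Shortcuts API. Paths and payload shapes are assumptions.
import requests

DATABRICKS_URL = "https://adb-1234567890123456.7.azuredatabricks.net"
DATABRICKS_TOKEN = "<personal-access-token>"   # prefer OAuth + Key Vault
FABRIC_TOKEN = "<entra-id-token-for-fabric-api>"
WORKSPACE_ID = "<fabric-workspace-id>"         # the ID after groups/ in the URL
LAKEHOUSE_ID = "<fabric-lakehouse-id>"         # the ID after lakehouses/ in the URL
ADLS_CONNECTION_ID = "<fabric-adls-connection-id>"
CATALOG, SCHEMA = "gold", "edw_gold"

# 1. List tables in the Unity Catalog schema
tables = requests.get(
    f"{DATABRICKS_URL}/api/2.1/unity-catalog/tables",
    headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
    params={"catalog_name": CATALOG, "schema_name": SCHEMA},
    timeout=30,
).json().get("tables", [])

# 2. Create a OneLake shortcut for each external table
for t in tables:
    if t.get("table_type") != "EXTERNAL":
        print(f"Skipping {t['name']}: managed Delta tables are not supported")
        continue
    # storage_location looks like abfss://<container>@<account>.dfs.core.windows.net/<path>
    container, _, rest = t["storage_location"].removeprefix("abfss://").partition("@")
    account_host, _, subpath = rest.partition("/")
    body = {
        "path": "Tables",
        "name": t["name"],
        "target": {"adlsGen2": {
            "location": f"https://{account_host}",
            "subpath": f"/{container}/{subpath}",
            "connectionId": ADLS_CONNECTION_ID,
        }},
    }
    resp = requests.post(
        f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/items/{LAKEHOUSE_ID}/shortcuts",
        headers={"Authorization": f"Bearer {FABRIC_TOKEN}"},
        json=body,
        timeout=30,
    )
    print(t["name"], resp.status_code)
```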

Considerations: It is always recommended to use Databricks OAuth for authentication and Azure Key Vault for storing secrets. 

Key benefits: With this integration, you can access and automatically sync your Unity Catalog external Delta tables to Fabric. 

 

  7. Directly Write into OneLake from Databricks Notebook

What it is: This integration method enables Azure Databricks notebooks to write data directly into Microsoft Fabric’s OneLake. 

When to use the notebook solution: 

    • You need to read/write shortcuts from Fabric to your Unity Catalog table data in ADLS. 
    • You need to access Unity Catalog tables from Databricks on AWS and GCP. 
    • Your UC data is not in the default Unity Catalog managed storage that is created with your Azure Databricks workspace. 
    • Your Azure Databricks workspaces are behind a private endpoint. 

Pre-requisite: 

    • Fabric: Set up your Azure Data Lake Storage Gen2 connection in Fabric. 

Steps to implement: 

    • Retrieve OneLake ABFS Path 
        • In Microsoft Fabric, navigate to your Lakehouse. 
        • Go to the Files section (or Tables if you plan to write to a table folder). 
        • Click on the ellipsis (…) next to the folder or table and select Copy ABFS Path. 
        • This path is what you’ll use in your Databricks notebook to write data directly into OneLake. 
    • Configure Credentials in Databricks 
        • Databricks needs permission to write to OneLake, for example via a service principal that has been granted access to the Fabric workspace. 
    • Write Data to OneLake 
        • Use Spark DataFrame write APIs to write in supported formats such as Delta or Parquet (see the sketch after this list). 
    • Verify in Fabric 
        • In the Fabric Lakehouse, go to: 
        • Tables: check whether your Delta table appears. 
        • Files: verify the raw files if you wrote to the Files folder. 
        • You can now use Power BI or Fabric Notebooks to query the data. 
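A minimal sketch of the credential configuration and the write, assuming a service principal with access to the Fabric workspace; the tenant/client IDs, secret scope, workspace and Lakehouse names, and source table are placeholders, and the ABFS path is the one copied in the first step:

```python
# Hedged sketch: authenticate to OneLake with a service principal and write
# from a Databricks notebook. All identifiers below are placeholders.
workspace_name = "MyFabricWorkspace"
lakehouse_root = f"abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/MyLakehouse.Lakehouse"

tenant_id = "<entra-tenant-id>"
client_id = "<service-principal-client-id>"
client_secret = dbutils.secrets.get("my-scope", "fabric-sp-secret")  # hypothetical secret scope

# OAuth (client credentials) configuration for the ABFS driver
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", client_secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# Write a DataFrame as a Delta table under the Lakehouse Tables folder
df = spark.table("main.sales.curated")  # any source DataFrame (placeholder)
df.write.format("delta").mode("overwrite").save(f"{lakehouse_root}/Tables/sales_curated")

# Or write raw files under the Files folder
df.write.mode("overwrite").parquet(f"{lakehouse_root}/Files/exports/sales_curated")
```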


 

  8. OneLake Shortcuts with Trusted Workspace Access

Disclaimer: Databricks does not recommend using Trusted Workspace Access, as it bypasses Unity Catalog’s governance model. However, it may be a viable option for customers who either do not use Unity Catalog or have their Databricks environment secured behind a private endpoint. In such cases, it provides a method to bring data into Microsoft Fabric.  

What it is: Use Microsoft Fabric’s OneLake shortcut feature to mount Azure Databricks data (stored in ADLS Gen2) into Fabric. Trusted Workspace Access ensures this link is secure even if the storage account has firewall or private endpoint restrictions. This method avoids any data copies – Fabric directly reads the Delta Lake files managed by Databricks. Find the prerequisites to enable Trusted Workspace access here: Trusted workspace access in Microsoft Fabric.  

Steps to implement: 

    • In Microsoft Fabric, navigate to Workspaces: If you don’t already have one, create a new workspace. Otherwise, select an existing workspace to proceed. Create a Lakehouse that will be used to access the ADLS Gen2 shortcut we'll set up later. To enable Trusted Workspace Access, you’ll need to create a workspace identity. Go to ‘Workspace settings’ and choose ‘Workspace identity’. Then click on “+ Workspace identity” to create one. 

 

 

    • Prepare the ADLS Gen2 Storage: If you don't have one yet, create a new Storage Account. If you already do, make sure it's an ADLS Gen2 storage account with the Hierarchical Namespace enabled. Create a container (e.g. “fabricdata”) if not already present. (Optionally, upload a sample file to this container for testing connectivity later.) 

 

    • Grant the Workspace Identity Access to ADLS: In the Azure Portal, navigate to the storage account’s container (or the account itself) and open the Access Control (IAM) tab. Click Add role assignment:  
    • Select Role: Storage Blob Data Reader (for read-only access; or Contributor if write access is needed).  

 

 

 

    • Select Assign access to: User, group, or service principal.  

 

 

    • Click + Select members, find the Fabric workspace identity (use the workspace name from step 1) and select it.  
    • Review and assign the role to grant the Fabric workspace read access to that container.  

This follows least-privilege by limiting access to only the needed container. (Note: Alternatively, you could use a dedicated service principal or user and assign it, but using the workspace identity is simpler for trusted access.) 

 

    • Disable Public Access on the Storage Account. In the storage account’s Networking settings, set Public network access to Disabled and save. This ensures the storage isn’t reachable from any public IP.  

Now the storage is accessible only via its private endpoint (or trusted Azure services with exceptions). If you try to list the container in the portal or connect without proper exceptions, you should get a 403 error, confirming the lock-down. 

 

    • Enable Trusted Access via Resource Instance Rule: Because the storage has public access disabled or a firewall, configure an exception for the Fabric workspace:  
    • Go to Azure Portal’s Custom deployment and load a template (ARM or Bicep) that adds a resourceAccessRules entry for your Fabric workspace.  

Update the following values in the template (Microsoft’s documentation provides a sample template): 

    • <storage account name>, <subscription id of storage account>, <resource group name> You can find all of this on the Storage Account’s Overview page in the Azure portal. 
    • <region> The Azure region where the Storage Account is deployed. 
    • <tenantid> The Tenant ID associated with your Entra ID. You can find this on the main page of Microsoft Entra in the Azure portal, or by searching "Tenant ID" in the top search bar. 
    • <workspace-id>: In Fabric, navigate to the workspace being used; the Workspace ID is the value that appears in the URL after “.../groups/”. For this demo, the ID is intentionally obscured. 
    • Note: You must use the Fabric placeholder subscription ID 00000000-0000-0000-0000-000000000000 in the resourceId. 
    • This creates the necessary firewall exception. Next, select the resource group for the deployment, then ‘Review + create’. This deploys the template; wait for the deployment to complete successfully. 
    • After deployment, verify the storage’s Networking settings: under Resource instances allowed, you should see an entry for Microsoft.Fabric/workspaces with your workspace ID. (This confirms Fabric can access it even via private endpoint. Without this, Fabric would get HTTP 403.) 
    • Now create a OneLake Shortcut in Fabric: In the Fabric workspace, open your Lakehouse (create one if needed). Click Get Data, then select ‘Azure Data Lake Storage Gen2’. 

When adding the connection:  

    • Select ‘Create new connection’ and place the Data Lake storage URL in the URL section. (https://<storage account name>.dfs.core.windows.net/ ) 
      From the ‘Authentication kind’ select ‘Workspace identity’. 
    • Once you select ‘Next’ and see the containers, it indicates that the connection and authentication were successful.  
    • Now select the files to shortcut to and select ‘Create’. 
    •  Access Your Data: Now you can visualize the data within Fabric via a OneLake shortcut. 

 

Best practice: Keep using the Delta format on Databricks for optimal compatibility, and use Direct Lake mode in Power BI for the best performance. 

 

Conclusion 

The choice of method depends on the use case; in many cases, multiple methods are used in tandem. With the steps outlined in this article, you can confidently implement each integration and unlock unified analytics across Databricks and Fabric. 

 

Approach: Direct Publish from DBSQL to Power BI 
Pros: Simple setup for quick reporting; no transformation needed; direct query from Fabric to Power BI 
Cons: Not suitable for complex ETL; performance issues with large datasets 
Use Cases: Ad-hoc dashboards; quick reporting needs 

Approach: Azure Databricks Mirroring Catalog in Fabric 
Pros: Zero data duplication; read-only via open protocol 
Cons: Limited to supported features; read-only, no write-back 
Use Cases: Enterprise-scale integration; centralized governance 

Approach: Azure Databricks Delta Sharing with Fabric 
Pros: Ideal for cross-org sharing; secure, controlled access; no direct connectivity needed 
Cons: Read-only in Fabric; requires Delta Sharing setup 
Use Cases: Partner/vendor data exchange; multi-tenant collaboration 

Approach: Azure Databricks Pipeline Activities in Fabric 
Pros: Centralized orchestration in Fabric; native integration for hybrid workflows 
Cons: Requires pipeline setup; batch-oriented, not real-time 
Use Cases: Complex workflows; Fabric as orchestration hub 

Approach: Power BI Task for Jobs in Azure Databricks 
Pros: BI refresh controlled by Databricks; tight integration with Databricks pipelines 
Cons: Requires job orchestration logic; adds complexity if Fabric orchestration is also needed 
Use Cases: Databricks-driven reporting cadence 

Approach: Integrate Databricks External Tables with OneLake 
Pros: No data duplication; seamless access to Unity Catalog tables; uses OneLake shortcuts 
Cons: Read-only access; dependent on Unity Catalog setup 
Use Cases: Governance-first scenarios; large datasets with zero-copy 

Approach: Directly Write into OneLake from Databricks Notebook 
Pros: Full control over data format and schema; flexible for custom integration 
Cons: Requires custom code; risk of schema drift 
Use Cases: Custom ETL pipelines; fine-grained control 

Approach: OneLake Shortcuts (Trusted Access) with Fabric 
Pros: No data copies; high-performance access; no Databricks compute required 
Cons: Requires trusted access config; limited to supported scenarios 
Use Cases: Unified analytics without duplication 

Updated Sep 13, 2025
Version 5.0