Microsoft Fabric's database mirroring feature is a game changer for organizations using Azure SQL DB, Azure Cosmos DB, or Snowflake in their cloud environment! Mirroring offers near real-time replication with just a few clicks, AND at no cost!
What can you do?
Metadata-driven pipelines in Microsoft Fabric enable you to streamline data ingestion and transformations with minimal coding, lower maintenance, and enhanced scalability. And when your source is Azure SQL DB, Azure Cosmos DB, or Snowflake, this becomes even easier!
For this architecture, I considered two options: building the semantic model over SQL views on the mirrored tables, or building it directly over the mirrored tables themselves. Since SQL views over mirrored tables always fall back to DirectQuery rather than Direct Lake, I decided against building a semantic model over SQL views. Instead, I created a semantic model over the mirrored tables, plus a separate Fabric Data Warehouse with its own semantic model over it.
The semantic model over the mirrored tables serves near real-time queries about specific data transactions and statuses. The semantic model over the Fabric Data Warehouse serves fast analytics over large datasets.
This solution addresses two key use cases: providing near real-time responses to queries about specific data transactions and statuses, and delivering rapid analytics over large datasets. Continue reading to learn how to implement this architecture in your own environment.
Below are detailed steps to build the metadata pipeline. The data source is the Wide World Importers SQL database, which you can download here. Then follow the instructions to import it into an Azure SQL DB.
1. Mirror the Azure SQL DB
From the Synapse Data Warehouse experience, choose the Mirrored Azure SQL DB option:
Then choose the tables to mirror:
After the mirroring has started, the main canvas will say "Mirrored Azure SQL Database is running". Click Monitor replication to see the number of rows replicated and the last completion time:
At this point, both a SQL analytics endpoint and a default semantic model are created (1a). But I created a new semantic model (1b) and set up the table relationships:
2. Create a Fabric Data Warehouse
Create the fact tables and any other tables that will be incrementally loaded (2a). You can manually create the dimension tables and other tables that are fully loaded, OR you can have the tables auto-created by the pipeline's Copy Data activity, as I do later.
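To make this concrete, here is a minimal sketch of a fact table definition. The FactSale table, its schema, and its columns are my assumptions, loosely based on Wide World Importers rather than the exact WWI definitions:

```sql
-- Illustrative fact table in the Fabric Data Warehouse. All names
-- here are assumptions, not the exact Wide World Importers schema.
CREATE TABLE dbo.FactSale
(
    SaleKey           bigint        NOT NULL,
    CustomerKey       int           NOT NULL,
    StockItemKey      int           NOT NULL,
    InvoiceDateKey    date          NOT NULL,
    Quantity          int           NOT NULL,
    TotalIncludingTax decimal(18,2) NOT NULL,
    LastModifiedWhen  datetime2(6)  NOT NULL  -- drives the incremental load
);
```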
Create the views over the mirrored database tables, along with the stored procedures (2b):
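For example, a view over the mirrored tables might look like the sketch below. I'm assuming the mirrored database item is named WWI_Mirrored; since it lives in the same workspace as the warehouse, the view can reference it with three-part naming:

```sql
-- Illustrative view in the Fabric Data Warehouse over the mirrored
-- Azure SQL DB. WWI_Mirrored is an assumed item name; the columns
-- map to the illustrative FactSale table above.
CREATE VIEW dbo.vwFactSale
AS
SELECT
    il.InvoiceLineID            AS SaleKey,
    i.CustomerID                AS CustomerKey,
    il.StockItemID              AS StockItemKey,
    CAST(i.InvoiceDate AS date) AS InvoiceDateKey,
    il.Quantity                 AS Quantity,
    il.ExtendedPrice            AS TotalIncludingTax,
    il.LastEditedWhen           AS LastModifiedWhen
FROM WWI_Mirrored.Sales.Invoices     AS i
JOIN WWI_Mirrored.Sales.InvoiceLines AS il
    ON il.InvoiceID = i.InvoiceID;
```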
Create and load the metadata table (2c) with information on how to load each fact or dimension table:
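Here is a sketch of what such a metadata table could look like, with one row per table to load. The table name and columns are my assumptions, based on how the pipeline uses them in the steps below:

```sql
-- Illustrative metadata table driving the pipeline. Each row tells
-- the pipeline where to read, where to write, and how to load.
CREATE TABLE dbo.TableLoadMetadata
(
    TableLoadID       int          NOT NULL,
    SourceViewName    varchar(128) NOT NULL,  -- view over the mirrored DB
    DestinationSchema varchar(128) NOT NULL,
    DestinationTable  varchar(128) NOT NULL,
    LoadType          varchar(20)  NOT NULL,  -- 'Full' or 'Incremental'
    UpsertProcName    varchar(128) NULL,      -- proc used for incremental loads
    LastLoadDate      datetime2(6) NULL,      -- latest transaction date loaded
    PipelineStartTime datetime2(6) NULL,
    PipelineEndTime   datetime2(6) NULL,
    RowsInserted      int          NULL,
    RowsUpdated       int          NULL
);

INSERT INTO dbo.TableLoadMetadata
    (TableLoadID, SourceViewName, DestinationSchema,
     DestinationTable, LoadType, UpsertProcName)
VALUES
    (1, 'dbo.vwDimCustomer', 'dbo', 'DimCustomer', 'Full', NULL),
    (2, 'dbo.vwFactSale',    'dbo', 'FactSale',    'Incremental', 'dbo.LoadFactSale');
```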
3. Create the data pipelines to load the fact and dimension tables
Below is the orchestrator pipeline:
Set variable – set pipelinestarttime to the current date/time. This is logged in the metadata table for each table loaded
Lookup – get the table load attributes from the metadata table for each table to load (a sketch of this query follows these steps)
For each table to load
Invoke the pipeline, passing in the current row object and the date/time the orchestrator pipeline started:
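As a concrete example, the Lookup activity's query against the illustrative metadata table from step 2c might be as simple as:

```sql
-- One row per table to load; the ForEach activity iterates over this
-- result set and passes each row object to the child pipeline.
SELECT
    TableLoadID,
    SourceViewName,
    DestinationSchema,
    DestinationTable,
    LoadType,
    UpsertProcName,
    LastLoadDate
FROM dbo.TableLoadMetadata;
```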
Load warehouse table pipeline
Set variable – set the pipeline start time for tracking each table load
If activity – check if full or incremental load
If full load
Use a Copy Data activity to load the Data Warehouse table, set the pipeline end time variable, and update the metadata table with load information
Copy data activity
Source settings reference the view over the mirrored database:
Destination settings reference the data warehouse table:
Note that the data warehouse table will be dropped and re-created each time.
Set the pipeline end time variable
Run Script to update the pipeline run details for this table (a sketch of this script follows)
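Here is a sketch of that Script activity's statement, again assuming the illustrative metadata table from step 2c. In the pipeline, the values come from dynamic content (variables, parameters, and the Copy Data activity output); literal placeholders stand in for them here. The Script activity at the end of the incremental branch below follows the same pattern:

```sql
-- Record the outcome of this table's load. The literal values below
-- are placeholders for what the pipeline injects at run time via
-- dynamic content expressions.
UPDATE dbo.TableLoadMetadata
SET PipelineStartTime = '2024-06-01T02:00:00',  -- pipelinestarttime variable
    PipelineEndTime   = '2024-06-01T02:03:12',  -- pipelineendtime variable
    RowsInserted      = 125000,                 -- from the Copy Data output
    LastLoadDate      = '2024-06-01T01:59:58'   -- latest transaction date loaded
WHERE TableLoadID = 2;                          -- from the current metadata row
```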
If not a full load, then run the incremental load activities
Lookup activity calls a stored procedure to insert or update new or changed records in the destination table (a sketch of such a procedure appears after these steps). The value for the StartDate parameter is the latest date from the previous load of this table. The value for the EndDate parameter is usually null and is only set if there is a need to load or reload a subset of data.
The stored procedure performs an insert or update, depending upon whether or not the key value exists in the destination. Only the records from the source that have changed since the last table load are selected. This reduces the number of updates performed.
The stored procedure returns how many rows were inserted or updated, along with the latest transaction date of the data loaded, which is needed for the next incremental load.
Set the pipeline end time
Script activity updates the table load details:
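To make the incremental branch concrete, below is a minimal sketch of the upsert stored procedure described above, using the illustrative FactSale table and view from step 2. It is written as separate UPDATE and INSERT statements rather than a MERGE, and it is an assumed sketch, not the exact code behind my pipeline:

```sql
-- Illustrative incremental upsert procedure; all names are assumptions
-- consistent with the earlier sketches.
CREATE PROCEDURE dbo.LoadFactSale
    @StartDate datetime2(6),        -- latest transaction date of the previous load
    @EndDate   datetime2(6) = NULL  -- normally NULL; set only to (re)load a subset
AS
BEGIN
    DECLARE @RowsUpdated  int;
    DECLARE @RowsInserted int;

    -- Update rows whose key already exists, considering only source rows
    -- changed since the last load to minimize the updates performed.
    UPDATE t
    SET    t.Quantity          = s.Quantity,
           t.TotalIncludingTax = s.TotalIncludingTax,
           t.LastModifiedWhen  = s.LastModifiedWhen
    FROM   dbo.FactSale   AS t
    JOIN   dbo.vwFactSale AS s
           ON s.SaleKey = t.SaleKey
    WHERE  s.LastModifiedWhen > @StartDate
      AND (@EndDate IS NULL OR s.LastModifiedWhen <= @EndDate);

    SET @RowsUpdated = @@ROWCOUNT;

    -- Insert rows whose key does not yet exist in the destination.
    INSERT INTO dbo.FactSale
        (SaleKey, CustomerKey, StockItemKey, InvoiceDateKey,
         Quantity, TotalIncludingTax, LastModifiedWhen)
    SELECT s.SaleKey, s.CustomerKey, s.StockItemKey, s.InvoiceDateKey,
           s.Quantity, s.TotalIncludingTax, s.LastModifiedWhen
    FROM   dbo.vwFactSale AS s
    WHERE  s.LastModifiedWhen > @StartDate
      AND (@EndDate IS NULL OR s.LastModifiedWhen <= @EndDate)
      AND NOT EXISTS (SELECT 1 FROM dbo.FactSale AS t
                      WHERE t.SaleKey = s.SaleKey);

    SET @RowsInserted = @@ROWCOUNT;

    -- Returned to the Lookup activity: the row counts plus the latest
    -- transaction date, which becomes @StartDate on the next run.
    SELECT @RowsInserted         AS RowsInserted,
           @RowsUpdated          AS RowsUpdated,
           MAX(LastModifiedWhen) AS LastTransactionDate
    FROM   dbo.FactSale;
END;
```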
4. Build a new semantic model in the Fabric/Power BI service
Create relationships between the tables, DAX calculations, dimension hierarchies, display formats – anything you need for your analytics
Note that all tables have Direct Lake connectivity, as indicated by the dashed blue line. Direct Lake offers the performance of Import semantic models without the overhead of refreshing the data.
5. Create reports
Create reports from your semantic model
Continue building out more reports and dashboards, setting up security, scheduling the data warehouse refresh (which will now be super fast since the source data is already in Fabric), creating apps, and adding more data sources – whatever it takes to get the analytics your organization needs into Fabric!
Mirroring - Microsoft Fabric | Microsoft Learn
What is data warehousing in Microsoft Fabric? - Microsoft Fabric | Microsoft Learn
Data Factory in Microsoft Fabric documentation - Microsoft Fabric | Microsoft Learn
Work with semantic models in Microsoft Fabric - Training | Microsoft Learn
Create reports in the Power BI - Microsoft Fabric | Microsoft Learn
Dimensional modeling in Microsoft Fabric Warehouse - Microsoft Fabric | Microsoft Learn
If your source database is not supported for mirroring (yet!), check out these other articles I wrote:
Metadata Driven Pipelines for Microsoft Fabric - Microsoft Community Hub