Previously, we talked about "Microsoft Fabric - How can a SQL user or DBA connect – Part 2".
With the release of Microsoft Fabric, and now that you are playing around with it, I am sure most of you have a few questions:
- Lakehouse, Data warehouse, or Power BI Datamart?
- Dataflow, Copy activity (similar to ADF), or Spark notebooks for data orchestration?
We hear you, and the Microsoft docs have very good articles that help you decide based on your use case and criteria.
Lakehouse, Data warehouse, or Power BI Datamart?
Use this reference guide and the example scenarios to help you choose between a data warehouse, a lakehouse, and a Power BI Datamart for your workloads in Microsoft Fabric.
Data warehouse and lakehouse properties
| | Data warehouse | Lakehouse | Power BI Datamart |
|---|---|---|---|
| Data volume | Unlimited | Unlimited | Up to 100 GB |
| Type of data | Structured | Unstructured, semi-structured, structured | Structured |
| Primary developer persona | Data warehouse developer, SQL engineer | Data engineer, data scientist | Citizen developer |
| Primary developer skill set | SQL | Spark (Scala, PySpark, Spark SQL, R) | No code, SQL |
| Data organized by | Databases, schemas, and tables | Folders and files, databases and tables | Database, tables, queries |
| Read operations | Spark, T-SQL | Spark, T-SQL | Spark, T-SQL, Power BI |
| Write operations | T-SQL | Spark (Scala, PySpark, Spark SQL, R) | Dataflows, T-SQL |
| Multi-table transactions | Yes | No | No |
| Primary development interface | SQL scripts | Spark notebooks, Spark job definitions | Power BI |
| Security | Object level (table, view, function, stored procedure, etc.), column level, row level, DDL/DML | Row level, table level (when using T-SQL), none for Spark | Built-in RLS editor |
| Access data via shortcuts | Yes (indirectly through the lakehouse) | Yes | No |
| Can be a source for shortcuts | Yes (tables) | Yes (files and tables) | No |
| Query across items | Yes, query across lakehouse and warehouse tables | Yes, query across lakehouse and warehouse tables; query across lakehouses (including shortcuts using Spark) | No |
For scenarios and details, see: Fabric decision guide - lakehouse or data warehouse - Microsoft Fabric | Microsoft Learn
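To make the comparison above a little more concrete, here is a minimal Python sketch that turns a few of the properties (data volume, type of data, multi-table transactions, developer skill set) into a rough first suggestion. The `Workload` class, the `suggest_store` function, and the order in which the checks run are all assumptions made for illustration; they are not part of Fabric or of the decision guide, and real workloads should still be checked against the scenarios in the guide.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload (not a Fabric type)."""
    data_gb: float                        # expected data volume in GB
    has_unstructured_data: bool           # files, logs, images, JSON, ...
    needs_multi_table_transactions: bool
    primary_skill: str                    # "sql", "spark", or "no-code"


def suggest_store(w: Workload) -> str:
    """Return a rough first suggestion based on the properties above.

    The priority of the checks is an assumption of this sketch.
    """
    # Only the data warehouse is listed with multi-table transactions.
    if w.needs_multi_table_transactions:
        return "Data warehouse"
    # Unstructured or semi-structured data and Spark skills point to the lakehouse.
    if w.has_unstructured_data or w.primary_skill == "spark":
        return "Lakehouse"
    # The datamart targets citizen developers and is listed as "Up to 100 GB".
    if w.primary_skill == "no-code" and w.data_gb <= 100:
        return "Power BI Datamart"
    # Structured data with SQL skills, or volumes beyond the datamart limit.
    return "Data warehouse"


if __name__ == "__main__":
    small_bi = Workload(50, False, False, "no-code")
    print(suggest_store(small_bi))
```

A helper like this is only a starting point: several rows (security, shortcuts, querying across items) are left out and may matter more for your scenario.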
Copy activity, Dataflow, or Spark
| | Pipeline copy activity | Dataflow Gen 2 | Spark |
|---|---|---|---|
| Use case | Data lake and data warehouse migration, data ingestion, lightweight transformation | Data ingestion, data transformation, data wrangling, data profiling | Data ingestion, data transformation, data processing, data profiling |
| Primary developer persona | Data engineer, data integrator | Data engineer, data integrator, business analyst | Data engineer, data scientist, data developer |
| Primary developer skill set | ETL, SQL, JSON | ETL, M, SQL | Spark (Scala, Python, Spark SQL, R) |
| Code written | No code, low code | No code, low code | Code |
| Data volume | Low to high | Low to high | Low to high |
| Development interface | Wizard, canvas | Power Query | Notebook, Spark job definition |
| Sources | 30+ connectors | 150+ connectors | Hundreds of Spark libraries |
| Destinations | 18+ connectors | Lakehouse, Azure SQL Database, Azure Data Explorer, Azure Synapse Analytics | Hundreds of Spark libraries |
| Transformation complexity | Low: lightweight - type conversion, column mapping, merge/split files, flatten hierarchy | Low to high: 300+ transformation functions | Low to high: support for native Spark and open-source libraries |
For scenarios and details, see: Fabric decision guide - copy activity, dataflow, or Spark - Microsoft Fabric | Microsoft Learn
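As a rough illustration of how the "Code written" and "Transformation complexity" rows can narrow the choice, the sketch below encodes those two rows as a small lookup and filters it. The `TOOLS` dictionary and the `candidate_tools` function are hypothetical names used only for this example, and reducing complexity to just "low" or "high" is a simplifying assumption.

```python
from typing import List

# Two rows of the comparison above, encoded by hand (illustrative only).
TOOLS = {
    "Pipeline copy activity": {"requires_code": False, "max_complexity": "low"},
    "Dataflow Gen 2": {"requires_code": False, "max_complexity": "high"},
    "Spark": {"requires_code": True, "max_complexity": "high"},
}

# Simplified ordering of transformation complexity (an assumption).
COMPLEXITY_RANK = {"low": 0, "high": 1}


def candidate_tools(willing_to_code: bool, complexity: str) -> List[str]:
    """List the tools whose listed properties fit the given needs."""
    needed = COMPLEXITY_RANK[complexity]
    candidates = []
    for name, props in TOOLS.items():
        # Skip code-first tools when the team wants no code or low code.
        if props["requires_code"] and not willing_to_code:
            continue
        # Skip tools whose listed transformation complexity is too low.
        if needed > COMPLEXITY_RANK[props["max_complexity"]]:
            continue
        candidates.append(name)
    return candidates


if __name__ == "__main__":
    print(candidate_tools(willing_to_code=False, complexity="high"))
```

When more than one tool remains, the other rows (connectors, developer persona, development interface) are the natural tie-breakers.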
Check out the video where we go over the same scenarios:
Microsoft Fabric Decision Trees, Deciding Which Service to USE!!
Updated Jun 14, 2023
Version 1.0
neerajny