How to build an End-to-End Analytics Solution with Lakehouse in Microsoft Fabric
Published Jan 16 2024

Architectural Setup  

byansianthony_0-1705324392402.gif

 

 

Introduction  
 
Data is the new oil, and data analytics is the key to unlocking its value. Data analytics can help organizations to gain insights, make decisions, and optimize outcomes. However, data analytics is not a simple task. It involves various challenges, such as data quality, data integration, data processing, data modelling, data visualization, and data governance. 

To address these challenges, Microsoft Fabric offers a comprehensive suite of services for data engineering, data science, data warehousing, real-time analytics, and business intelligence.  

 

 

Scenario 
Imagine you are a chef who wants to create delicious dishes using various ingredients and recipes. You have a kitchen with all the tools and appliances you need, such as a stove, an oven, a blender, and a mixer, and you have access to a pantry where you can store and retrieve different kinds of food items: grains, fruits, vegetables, meats, dairy, and so on. You want to share your culinary creations with your customers and get their feedback and ratings. Microsoft Fabric is like that kitchen and pantry: you can cook up data solutions using tools and services such as Azure Synapse, Azure Data Factory, and Power BI, and you can store and access different kinds of data, whether structured, unstructured, batch, or streaming. Finally, you can share your data insights and reports with your stakeholders and get their feedback and ratings.
 
In this blog, you will learn how to use Microsoft Fabric to create an end-to-end analytics solution using Lakehouse. You will also learn how to address some of the common challenges and considerations when working with data, such as: 
 

  • How to create and manage a Lakehouse that can store and process data of any scale, type, and structure 
  • How to ingest, transform, and load data from various sources and formats into your Lakehouse 
  • How to perform data engineering and data science tasks  
  • How to query and analyze data using SQL and Spark  
  • How to create and share interactive reports and dashboards using Power BI 

 

Pre-requisites 

  1. A Microsoft 365 Developer Account with Admin rights. 
  2. A Microsoft Fabric Free Trial 
     

How to create and manage a Lakehouse  

The following steps show how to set up a Lakehouse in Microsoft Fabric:
 
1. Create a Fabric workspace: A workspace is a place where colleagues collaborate to create items such as Lakehouse, warehouses, and reports. From the experience switcher located at the bottom left, select the Data Engineering Experience. 

byansianthony_1-1705324392419.png

 

 

“My Workspace” is your personal default workspace: you are always the sole owner of its content, so it is a good place to experiment with how items work without affecting items that other collaborators have access to. For now, ignore it; select Workspaces and then New workspace.

 

byansianthony_2-1705324392427.jpeg

 

Fill out the create workspace form with the following details:  

  • Name: Enter a unique name for your workspace 
  • Description: Enter an optional description for your workspace 
  • Advanced: Under License mode, select Premium capacity and then choose a premium capacity that you have access to. Select Apply to create and open the workspace.  
    byansianthony_3-1705324392436.png

     

     

Note: Every Fabric item you create is stored in the currently open workspace by default. To store an item in a different workspace, navigate to the Workspaces tab and select the target workspace before creating the item.

 

 2. Create a Lakehouse: Navigate to the workspace you created earlier. From the top navigation ribbon, select New, and from the item drop-down menu, select Lakehouse. 
 
byansianthony_4-1705324392448.png

 

 Congratulations, you now have your Lakehouse set up. Next, you will ingest the Sample Data 
 

3. Ingest sample data: If you don’t have OneDrive configured, sign up for the Microsoft 365 free trial. Download the dimension_customer.csv file from the Fabric Samples repo. In the Lakehouse explorer, you will see options to load data into the Lakehouse. Select New Dataflow Gen2, and on the new dataflow pane, select Import from a Text/CSV file.  
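The Dataflow Gen2 import above is entirely UI-driven, but it can help to picture what the load step actually does. As a rough local sketch (pure-Python, with made-up sample rows — the real dimension_customer.csv from the Fabric Samples repo has many more rows and columns), the import is essentially a CSV parse into a table of records:

```python
import csv
import io

# A tiny stand-in for dimension_customer.csv; these rows are illustrative
# only and do not come from the actual sample file.
sample_csv = """CustomerKey,Customer,BuyingGroup
1,Tailspin Toys,Tailspin Toys
2,Wingtip Toys,Wingtip Toys
3,Contoso Retail,N/A
"""

# Parse the CSV into a list of records, roughly what the Text/CSV
# import does before Fabric lands the data as a Lakehouse table.
rows = list(csv.DictReader(io.StringIO(sample_csv)))
print(len(rows), "rows loaded")   # 3 rows loaded
print(sorted(rows[0].keys()))     # ['BuyingGroup', 'Customer', 'CustomerKey']
```

In Fabric itself, of course, the dataflow handles parsing, typing, and the write to Delta for you; the sketch only shows the shape of the data that comes out of the CSV.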
 
byansianthony_5-1705324392459.png

 

 On the Connect to data source pane, drag and drop the dimension_customer.csv file that you downloaded. 
 

byansianthony_6-1705324392466.png

 

 4. Build a report: From the preview file data page, preview the data and select Create to proceed and return to the dataflow canvas. In the Query settings pane, update the Name field to dimension_customer. Note: Fabric appends a space and a number to the table name by default; remove them, because table names must be lower case and must not contain spaces.  
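The naming rule above (lower case, no spaces, and no auto-appended suffix) is easy to express as a small helper. This is a local illustration of the rule, not a Fabric API — the `normalize_table_name` function and its suffix-stripping behaviour are assumptions made for the sketch:

```python
import re

def normalize_table_name(name: str) -> str:
    """Illustrative helper: Lakehouse table names must be lower case and
    must not contain spaces, so strip a trailing ' 2'-style suffix (the
    kind Fabric may append by default) and replace spaces with underscores."""
    name = re.sub(r"\s+\d+$", "", name)        # drop a trailing space+number
    return name.strip().lower().replace(" ", "_")

print(normalize_table_name("dimension_customer 2"))  # dimension_customer
print(normalize_table_name("Dimension Customer"))    # dimension_customer
```

In the Fabric UI you simply type the corrected name into the Name field; the helper just makes the constraint explicit.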

byansianthony_7-1705324392475.png

 

 In case you have other data items that you want to associate with the Lakehouse, select Add data destination from the menu items and then select Lakehouse. If needed, from the Connect to data destination screen, sign into your account and select Next. 
 

byansianthony_8-1705324392486.png

 

From the dataflow canvas, you can transform the data based on your business requirements. Once done, select the Publish button at the bottom of the screen, or choose Publish later from the button's drop-down menu.
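What you transform on the canvas depends on your business requirements, but a typical cleanup step — trimming stray whitespace from text fields before publishing — can be sketched locally. This is a pure-Python illustration of the idea, not Power Query M, and the sample record is made up:

```python
def clean_record(record: dict) -> dict:
    """Illustrative transform: trim whitespace from string fields,
    the kind of light cleanup often applied on the dataflow canvas."""
    return {k: v.strip() if isinstance(v, str) else v
            for k, v in record.items()}

raw = {"CustomerKey": 1, "Customer": "  Tailspin Toys ", "BuyingGroup": "Tailspin Toys"}
print(clean_record(raw))
# {'CustomerKey': 1, 'Customer': 'Tailspin Toys', 'BuyingGroup': 'Tailspin Toys'}
```

In the dataflow itself you would apply the equivalent step (for example, a trim transformation) through the canvas UI rather than in code.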

byansianthony_9-1705324392492.png

 

 

byansianthony_10-1705324392495.png

 

 

 

 In the item view, a spinning circle next to the dataflow's name indicates that publishing is in progress. 

byansianthony_11-1705324392498.png

 

 

When publishing is complete, select the ellipsis (…). 

byansianthony_12-1705324392502.png

 

 

 Select Properties, rename the dataflow to Load Lakehouse Table, and select Save. 

byansianthony_13-1705324392506.png

 

Once the dataflow is refreshed, select your new Lakehouse in the left navigation panel to view the dimension_customer Delta table. Select the table to preview its data.  

The SQL analytics endpoint of the Lakehouse enables you to query the data with SQL statements. Select the SQL analytics endpoint from the Lakehouse drop-down menu.
 
 

byansianthony_14-1705324392514.png

 

 Select the dimension_customer table to preview the data, then select New SQL query at the top of the Fabric interface. 

byansianthony_15-1705324392521.png

 

The above sample query aggregates the row count based on the BuyingGroup column of the dimension_customer table.  
To run the script, select the Run icon at the top of the script.  
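The exact query text is only visible in the screenshot, but based on the description it counts rows per value of BuyingGroup. A hedged equivalent is shown below against an in-memory SQLite table with made-up rows (Fabric's SQL analytics endpoint speaks T-SQL, but a simple GROUP BY like this is portable):

```python
import sqlite3

# Build a tiny stand-in for the dimension_customer table; the rows are
# illustrative only, not taken from the actual sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dimension_customer (CustomerKey INTEGER, BuyingGroup TEXT)")
conn.executemany(
    "INSERT INTO dimension_customer VALUES (?, ?)",
    [(1, "Tailspin Toys"), (2, "Tailspin Toys"), (3, "Wingtip Toys")],
)

# Roughly the kind of query the screenshot shows: a row count per BuyingGroup.
query = """
SELECT BuyingGroup, COUNT(*) AS TotalCount
FROM dimension_customer
GROUP BY BuyingGroup
ORDER BY BuyingGroup
"""
for row in conn.execute(query):
    print(row)
# ('Tailspin Toys', 2)
# ('Wingtip Toys', 1)
```

In Fabric you would paste the equivalent statement into the New SQL query editor and run it against the Lakehouse's SQL analytics endpoint.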

byansianthony_16-1705324392527.png

 

In the item view of the workspace, select the “yourlakehouse” default semantic model 

byansianthony_17-1705324392535.png

 

From the semantic model pane, you can view all the tables. You can create reports from scratch, create a paginated report, or let Power BI automatically generate a report based on your data. 

byansianthony_18-1705324392541.png


Notice that the report is generated in a couple of seconds. Click View report to open the automatically generated report. 

byansianthony_19-1705324392547.png

Save the report by selecting Save from the top ribbon. Notice there is a right navigation bar that you can use to make further changes to the report, such as including or excluding other tables or columns, to meet your requirements. 

byansianthony_20-1705324392553.png

 
To share your report, head back to the item view of your workspace and select the Share icon to share the report within your organization. 

byansianthony_21-1705324392560.png

 

Congratulations, you have just created an end-to-end Analytics solution with Microsoft Fabric in a matter of minutes. Well done! To manage your storage and costs effectively, don’t forget to clean up your resources after you are done. 

 

Further study guides 

Microsoft Fabric Learn Together, starting January 23, 2024 through February 8, 2024 (18 episodes): expert-led live walk-throughs covering all the Learn modules to prepare you for the upcoming DP-600 exam, which leads to the Fabric Analytics Engineer Associate certification. Nine episodes delivered in both India and Americas time zones. Register now for this exclusive live learning experience.

Also, sign up for the Fabric Cloud Skills Challenge at https://aka.ms/fabric30dtli  and complete all the modules to become eligible for a 50% discount on the DP-600 exam.


Learn how to use Copilot in Microsoft Fabric, your AI assistant for data insights. 

Join the Fabric Community to stay updated on the latest about Microsoft Fabric. 

Consider joining the Fabric Career Hub so you won’t miss out on any careers in Microsoft Fabric. 

 

 
 

 

 

 
