If you come from a SQL background, the Fabric Warehouse is made for you.
This article discusses Stored Procedures (SPROCs) in the Fabric Warehouse, explains how they can benefit customers with SQL skills, and provides a video demonstrating the speed and power of SPROCs in the Fabric Warehouse.
Before Fabric Pipelines and Dataflows, before Azure Data Factory, and before SQL Server Integration Services (SSIS), there were Stored Procedures on SQL Server. For data professionals who are literate in SQL, Stored Procedures are a simple and reliable way to perform tasks such as creating and updating tables, running queries for data transformations, and more.
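As a minimal sketch of what such a procedure can look like in the Fabric Warehouse (the table, column, and procedure names here are hypothetical, not from any specific solution), a Stored Procedure that rebuilds a summary table each time it runs might be written as:

```sql
-- Hypothetical example: rebuild a daily summary table on each run.
-- All object and column names are illustrative.
CREATE PROCEDURE dbo.usp_RefreshDailySummary
AS
BEGIN
    -- Drop and recreate the target table so the procedure is rerunnable.
    DROP TABLE IF EXISTS dbo.DailySummary;

    -- CTAS (CREATE TABLE AS SELECT) builds the table from a query result.
    CREATE TABLE dbo.DailySummary AS
    SELECT
        ClaimDate,
        COUNT(*)           AS ClaimCount,
        SUM(PaymentAmount) AS TotalPaid
    FROM dbo.Claims
    GROUP BY ClaimDate;
END;
```

Anyone who has written T-SQL on SQL Server will recognize this pattern immediately, which is exactly the point: the skills transfer directly.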
In the Healthcare industry segment, especially with Providers, both on-premises and cloud SQL Server deployments are still a popular and reliable platform for data and analytics. Many data professionals working in these environments write and think about the SQL language more frequently than they do spoken languages.
A great deal of the press and media coverage for Microsoft Fabric has focused on Spark Notebooks (deservedly so; Spark Notebooks are fantastic). While you can write SQL in Spark Notebooks, the most popular language for Notebooks is Python, which is very different from the SQL Server world many of us grew up embracing. I've had some customers tell me that "Fabric looks promising, but our SQL experts will need to re-skill for Spark Notebooks. Also, our SQL Views and Stored Procedures will need to be rewritten." The Fabric Warehouse, often overlooked by the press and media, is designed in the spirit of a SQL Server environment. For some use cases, it provides a great path for moving to Fabric without re-skilling and rewriting queries.
My colleague Inder Rana and I recently created a GitHub Repo for deploying 249 million rows of real CMS data to a Fabric environment for testing and demos. The end-to-end solution takes about 20 minutes to deploy using fully automated scripts in Spark Notebooks, and then includes instructions for deploying the Power BI components. As an alternative to the Spark Notebooks, there is also an option to deploy the Gold layer of the medallion architecture to the Fabric Warehouse using Stored Procedures. You can try out the GitHub Repo from this link: fabric-samples-healthcare/analytics-bi-directlake-starschema at main · gregbeaumont/fabric-samples-healthcare
Initially, we offered the Stored Procedures option as an alternative to Spark Notebooks for the purposes of 1) demonstrating different options in Fabric, 2) providing a path that SQL experts would appreciate, and 3) comparing semantic model query performance in the Warehouse versus the Lakehouse. When testing, the Stored Procedures often ran twice as fast as the Spark Notebooks! The results were surprising at first, and after validating that Warehouse Stored Procedures are indeed very powerful, I was inspired to create this video and blog post.
If you come from a SQL background and you prefer to write SQL, or if you are assessing migration of analytic workloads to Fabric but you don't want to convert your Stored Procedures and Views to Python, the Fabric Warehouse is a powerful option and worth the effort to evaluate.
Below is a link to the video, which walks through deploying Stored Procedures that transform a flattened table of 249 million rows into a star schema in the Fabric Warehouse: four dimension tables and a fact table, optimized for Power BI query performance:
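The flattened-table-to-star-schema transformation in the video follows a common T-SQL pattern, sketched below with hypothetical table and column names (the actual CMS column names in the repo differ): first build each dimension from the distinct values in the flat table, assigning a surrogate key, then build the fact table by joining back to the dimensions to pick up those keys.

```sql
-- Hypothetical sketch of the star-schema build; names are illustrative.
-- Step 1: build a dimension from the flat table's distinct values,
-- generating a surrogate key with ROW_NUMBER().
CREATE TABLE dbo.DimProvider AS
SELECT
    ROW_NUMBER() OVER (ORDER BY ProviderName, ProviderState) AS ProviderKey,
    ProviderName,
    ProviderState
FROM (
    SELECT DISTINCT ProviderName, ProviderState
    FROM dbo.FlatClaims
) AS d;

-- Step 2: build the fact table, joining back to the dimension
-- so the fact rows carry the narrow surrogate key instead of text columns.
CREATE TABLE dbo.FactClaims AS
SELECT
    p.ProviderKey,
    f.ClaimDate,
    f.PaymentAmount
FROM dbo.FlatClaims AS f
JOIN dbo.DimProvider AS p
    ON  f.ProviderName  = p.ProviderName
    AND f.ProviderState = p.ProviderState;
```

Replacing wide text columns with integer surrogate keys is what makes the resulting star schema compact and fast for Power BI (Direct Lake or Import) to query.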