Blog Post

Healthcare and Life Sciences Blog
3 MIN READ

Deploy an End-to-End Azure Synapse Analytics and Power BI Solution using CMS Medicare Data

Greg_Beaumont's avatar
Greg_Beaumont
Icon for Microsoft rankMicrosoft
Dec 02, 2020

For many people, hands-on experience is often the best way to learn and evaluate data tools. I've been working with a colleague from our Azure team, Kunal Jain, to put together an end-to-end Azure Synapse and Power BI solution using 120+ million rows of real CMS Medicare Part D Data that is available for use in the public domain. If you're not highly technical, and you've never used Azure or Power BI before, you can still deploy this solution with a few simple steps using an Azure ARM template. We also provide a video to walk you through a paint-by-numbers tutorial. The Azure ARM template will automatically:

  • Create Azure Data Lake, Azure Data Factory, and Azure Synapse
  • Pull the raw data from CMS into a Data Lake using Azure Data Factory
  • Shape the data and create a dimensional model for Azure Synapse
  • Deploy the solution to Azure Synapse, including performance tuning settings

The entire process takes about an hour and a half to run, with most of that time spent on waiting for the scripts from the ARM template to finish. Once deployed, another video also walks you through the steps of connecting it to a pre-built Power BI report template. The whole process should take about 1-2 hours with no code required, and at the end you can review and evaluate and end-to-end Azure and Power BI solution using real CMS Medicare Part D Data:

Here is a link to the GitHub site: https://github.com/kunal333/E2ESynapseDemo 

 

While the source CMS data is real public Healthcare data, the intent of this project is to provide you with a simple end-to-end solution for the purposes of learning, demos, and tool evaluation. We intend to enhance and build upon this solution in the future, but it is not a supported solution intended to be used for production purposes. 

 

Below is a tutorial video for deploying the solution. Note that this is designed to be low code with only a few things to cut and paste. All you need is an Azure account, and you can pause the Synapse instance or delete the entire Resource Group at any time. There are also simple instructions on the GitHub page:

 

Here is another tutorial video describing the process by which to connect the Power BI Template file containing the business logic:

 

The following diagram summarizes the steps of the whole process:

Azure Data Factory, along with Power BI, creates the following logical model that enables highly performant end user queries for complicated questions about the data. Notice that a CSV file is also added to the Power BI layer to demonstrate that custom criteria from a business user can be used to query the Synapse data:

 Calculations have been added to the Power BI Semantic Layer to enable complex analytics such as Pareto Analysis: 

 Below is a screenshot of the pre-built Power BI report:

 

More information about the solution is available at the GitHub site: https://github.com/kunal333/E2ESynapseDemo 

 

If you deploy this solution, we'd appreciate if you could take the time to provide some feedback. What was your experience with the ARM template? How do you plan to use this solution? What types of similar solutions can we provide in the future that would be valuable?:

 

 

Updated Dec 02, 2020
Version 1.0
  • dz6kb4's avatar
    dz6kb4
    Copper Contributor

    Hello.

     

    I get as far as running the deploySynapseStorageADF.ps1 script  step 13/15 I think line 219.  I then get the below error - it seems to try over and over but can't get past this bit.  Any ideas please? Thankyou

     

    Get-AzDataFactoryV2PipelineRun: /home/anthony/E2ESynapseDemo/deploySynapseStorageADF.ps1:219
    Line |
    219 | … ceGroupName -DataFactoryName $DataFactoryName -PipelineRunId $element
    | ~~~~~~~~
    | Cannot validate argument on parameter 'PipelineRunId'. The argument is null or empty. Provide an argument that is not null or empty, and then try the
    | command again.

  • Worked like a charm to me today when I first run the script, choosing create all resources from scratch. dz6kb4  You can maybe try the same and see if that works.