Building a Real World Health Condition Dashboard with Azure Databricks and Delta Lake
Published Dec 16 2020 06:00 AM

This post was authored by Bruce Nelson, Senior Solutions Architect at Databricks, and Clinton Ford, Staff Partner Marketing Manager at Databricks.



Healthcare organizations are improving the patient experience and delivering better health outcomes with analytic dashboards and machine learning models on top of existing electronic health records (EHR), digital medical images and streaming data from medical devices and wearables. Azure Databricks and Delta Lake make it easier to work with large clinical datasets to identify top patient conditions.


Using Delta Lake to build a comorbidity dashboard

Our simulated EHR data represent roughly 10,000 patients in Massachusetts and were generated with the Synthea simulator. An ETL notebook ingests and de-identifies the data, then prepares it for a visualization notebook. In the visualization notebook we build charts and a simple dashboard that show the top conditions (comorbidities) in our real-world data and analyze the correlation between any two conditions specified by the user.


[Figure: Modern Analytics for Healthcare and Life Sciences (HLS)]


Extract, transform and load (ETL)

To begin, we use PySpark to read EHR data from comma-separated values (CSV) files, de-identify patients' personally identifiable information (PII) and write the result to Delta Lake for analysis. Using Delta Lake is a best practice for ingestion, ETL and stream processing: it is an open source format that supports ACID transactions, offers faster processing with Delta Engine and integrates easily with other Azure services for additional use cases.
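
The read, de-identify and write steps described above can be sketched roughly as follows. This is a minimal illustration, not the code from our ETL notebook: the column names in PII_COLUMNS, the SALT value, and the function names pseudonymize and run_etl are all assumptions made for the example, and salted hashing is only one possible de-identification approach. The PySpark import sits inside run_etl so that the small hashing helper can be tried without a Spark runtime.

```python
import hashlib

# Assumed PII column names for illustration; adjust to the actual CSV schema.
PII_COLUMNS = ["SSN", "FIRST", "LAST", "ADDRESS"]

# Hypothetical salt; in practice this would come from a secret store.
SALT = "replace-with-a-secret"


def pseudonymize(value, salt=SALT):
    """Return a salted SHA-256 hex digest of value, or None if value is missing."""
    if value is None:
        return None
    return hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()


def run_etl(spark, csv_path, delta_path):
    """Read a CSV file, hash the assumed PII columns, and write a Delta table.

    spark is an existing SparkSession; csv_path and delta_path are
    caller-supplied locations (hypothetical in this sketch).
    """
    # Imported lazily so that pseudonymize() above works without Spark installed.
    from pyspark.sql import functions as F

    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Replace each PII column that is present with a salted SHA-256 hash.
    # Null inputs stay null, mirroring pseudonymize(None).
    for name in PII_COLUMNS:
        if name in df.columns:
            salted = F.concat(F.lit(SALT), F.col(name).cast("string"))
            df = df.withColumn(name, F.sha2(salted, 256))

    # Write the de-identified data in Delta format for downstream analysis.
    df.write.format("delta").mode("overwrite").save(delta_path)
    return df
```

In an Azure Databricks notebook, where a SparkSession named spark is already available, a call such as run_etl(spark, csv_path, delta_path) with your own paths would run the whole sequence. Hashing keeps values joinable across tables without exposing them, but whether it satisfies a given de-identification requirement depends on your compliance context.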


[Figure: Data Ingestion, Streaming and ETL for HLS]



EHR data analysis and comorbidity dashboard

In this notebook we visualize top conditions in the database and create a simple dashboard to analyze the correlation between any two conditions specified by the user. You can share this notebook as a dashboard following these instructions.
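
To make the two analyses above concrete, here is a small plain-Python sketch of ranking conditions by patient count and measuring how strongly two conditions co-occur. It is not the notebook's implementation: the function names, the (patient, condition) row layout and the sample rows are hypothetical, and the phi coefficient (the Pearson correlation of two yes/no variables) is our assumed choice of correlation measure for this illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def conditions_by_patient(rows):
    """Group (patient_id, condition) rows into a set of conditions per patient."""
    by_patient = defaultdict(set)
    for patient_id, condition in rows:
        by_patient[patient_id].add(condition)
    return by_patient


def top_conditions(rows, n=10):
    """Return the n most common conditions, counting each patient at most once."""
    counts = Counter()
    for conditions in conditions_by_patient(rows).values():
        counts.update(conditions)
    return counts.most_common(n)


def phi_coefficient(rows, cond_a, cond_b):
    """Phi coefficient between two conditions across the patients in rows.

    Only patients with at least one row are counted, so patients with no
    recorded condition are not part of the 2x2 table in this sketch.
    """
    both = only_a = only_b = neither = 0
    for conditions in conditions_by_patient(rows).values():
        has_a, has_b = cond_a in conditions, cond_b in conditions
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    denom = sqrt(
        (both + only_a) * (only_b + neither) * (both + only_b) * (only_a + neither)
    )
    # If either condition is present in all or none of the patients, phi is undefined; return 0.0.
    return (both * neither - only_a * only_b) / denom if denom else 0.0


# Hypothetical sample rows, purely for illustration.
sample_rows = [
    ("p1", "Hypertension"), ("p1", "Diabetes"),
    ("p2", "Hypertension"), ("p2", "Diabetes"),
    ("p3", "Hypertension"),
    ("p4", "Asthma"),
]
print(top_conditions(sample_rows, n=2))
print(phi_coefficient(sample_rows, "Hypertension", "Diabetes"))
```

A value near 1 suggests the two conditions tend to appear in the same patients, a value near -1 suggests they rarely do, and a value near 0 suggests little association. On a large Delta table the same counts would normally be computed with Spark aggregations rather than in driver-side Python.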


[Figure: Comorbid Condition Browser]

Next steps

For additional background on this use case, see this blog post. See live demos or get hands-on at an Azure Databricks event. Go even deeper with this 3-part webinar training series to operationalize machine learning models for your own organization.

Last update: Jan 26 2021 03:06 PM