Let’s start this article off with a simple summary of the use case:
Consider the diagram below. A User can see aggregated data for all Specialties, but only two States. When the User drills to detail, they can only see detail data containing PII for one State and one Specialty:
Control Access to Detail Level PII without Compromising Analytics for Aggregated Data
Traditional Row Level Security (RLS) can be used to prevent users from viewing data they are not supposed to see, but it also prevents them from viewing aggregated metrics of the rows obfuscated by RLS. Also, even when PII has been scrubbed from data, re-identification can still be an issue. For example, there may be only one person in a zip code with a rare medical condition. Even though their PII has been scrubbed from a reporting data set, showing data for that zip code and medical condition could lead to re-identification of the person. It is often a best practice to set the grain of de-identified data at a count of persons that reduces or eliminates this risk. Implementing this concept into Power BI Aggregation Tables or Azure Synapse Materialized Views can prevent queries from yielding granular data below a certain patient count. In the example below, that minimum is 8 patients:
Reduce the risk of Re-Identification by Aggregating Counts of Patients in the Table Grain
In the example above, the Aggregation Table (left) can be set to have different RLS than the Detail Table (right). If the Aggregation is cached in Power BI as an Import table, and the Detail Table is in Azure Synapse, then drilling to details which are different from the aggregates can be achieved such as in the example below:
Allow Analytics of Aggregated Data while still controlling Access to Detail-Level PII
When using Azure Synapse Analytics with Power BI, you can apply RLS to any of the three basic types of data tables: 1) Power BI Import tables for ease-of-use and highly efficient query performance from an optimized cache of data, 2) Azure Synapse Materialized Views for MPP performance using efficient queries, and 3) Azure Synapse Tables for limitless scale. The diagram below shows these three options that you can mix-and-match in a Composite Model and even layer them as Power BI Aggregations:
Power BI Composite Models offer Different Options for Persisting Data
Using a methodology of controlling access to Detail level data differently from Aggregated data can be used in many different, flexible and useful ways. Addressing concerns with protecting PII is a common use case. There are a few considerations that should be evaluated and rigorously tested if considering this type of architecture:
This article is the fourth in a series exploring how Power BI paired with Azure data tools creates a flexible, scale-able, and achievable healthcare analytics architecture:
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.