Serverless Synapse SQL pools enable you to read Parquet/CSV files or Cosmos DB collections and return their content as a set of rows. In some scenarios, you would need to ensure that a reader cannot access some rows in the underlying data source. This way, you are limiting the result set that will be returned to the users based on some security rules. In this scenario, called Row-level security, you would like to return a subset of data depending on the reader’s identity or role.
Row-level security is supported in dedicated SQL pools, but it is not supported in serverless pools (you can propose this feature in Azure feedback site). In some cases, you can implement your own custom row-level security rules using standard T-SQL code.
In this post you will see how to implement RLS by specifying the security rules in the WHERE clause or by using inline table-value function (iTVF).
Let’s imagine that we have a view or external table created on top of the COVID data set placed in the Azure Data Lake storage:
create or alter view dbo.covid as select * from openrowset( bulk 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet', format = 'parquet') as rows
The readers will get all COVID cases that are written in the Parquet file. Let’s imagine that the requirement is to restrict the user access and allow them to see just a subset of data based on the following rules:
We can represent these security rules using the following predicate (this is T-SQL pseudo-syntax):
( USER_NAME = 'firstname.lastname@example.org' AND geo_id = 'RS' ) OR ( USER IS IN ROLE 'AfricaAnalyst' AND continent_exp = 'Africa' )
You can use the system functions like SUSER_SNAME() or IS_ROLEMEMBER() to identify the caller and easily check should you return some rows to the current user. We just need to express this condition in T-SQL and add it as a filtering condition in the view.
In this post you will see two methods for filtering the rows based on security rules:
As a prerequisite, you should setup a database role that should have limited access to the underlying data set.
CREATE ROLE AfricaAnalyst; CREATE USER [email@example.com] FROM EXTERNAL PROVIDER CREATE USER [firstname.lastname@example.org] FROM EXTERNAL PROVIDER CREATE USER [email@example.com] FROM EXTERNAL PROVIDER ALTER ROLE AfricaAnalyst ADD MEMBER [firstname.lastname@example.org]; ALTER ROLE AfricaAnalyst ADD MEMBER [email@example.com]; ALTER ROLE AfricaAnalyst ADD MEMBER [firstname.lastname@example.org];
In this code we have added a role and added three users to this role.
The easiest way to implement row-level security is to directly embed security predicates in the WHERE condition of the view:
create or alter view dbo.covid as select * from openrowset( bulk 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet', format = 'parquet') as rows WHERE ( SUSER_SNAME() = 'email@example.com' AND geo_id = 'RS') OR ( IS_ROLEMEMBER('AfricaAnalyst', SUSER_SNAME()) = 1 AND continent_exp = 'Africa')
The Azure AD principal firstname.lastname@example.org will see only the COVID cases reported in Serbia:
Azure AD principals that belong to ‘AfricaAnalysts’ role will see the cases reported in Africa when running the same query:
Without any change in the query, the readers will be able to access only the rows that they are allowed to see based on their identity.
In some cases, it would be hard to maintain security rules if you put them directly in the view definition. The security rules might change, and you don’t want to track all possible views to update them. The WHERE condition in the view may contain other predicates, and some bug in AND/OR logic might change or disable the security rules.
The better idea would be to put the security predicates in a separate function and just apply the function on the views that should return limited data.
We can create a separate schema called secure, and put the security rules in the inline table value function (iTVF):
create schema secure go create or alter function secure.geo_predicate(@geo_id varchar(20), @continent_exp varchar(20)) returns table return ( select condition = 1 WHERE (SUSER_SNAME() = 'email@example.com' AND @Geo_id = 'RS') OR (IS_ROLEMEMBER('AfricaAnalyst', SUSER_SNAME())=1 AND continent_exp='Africa') )
This predicate is very similar to the predicates used in native row-level security in the T-SQL language. This predicate evaluates a security rule based on geo_id and continent_exp values. These values are provided as the function parameters. The function will internally use SUSER_SNAME() and IS_ROLEMEMBER() functions and evaluate security rules based on geo_id and continent_exp values. If the current user can see a row with the provided values, the function fill return value 1. Otherwise, it will not return any value.
Instead of modifying every view by adding new conditions in the WHERE clause, we can create the secure wrapper views where we will apply this predicate. In the following example, I’m creating a view secure.covid where I will apply the predicate and pass the columns that will be used in the predicate to evaluate should this row be returned or not:
create or alter view secure.covid as select * from dbo.covid cross apply security.geo_predicate(geo_id, continent_exp) go
Note that in this case the view dbo.covid will not have the WHERE clause that contains the security rule. Security rules are filtering rows by applying security.geo_predicate() iTVF on the original view. For every row in the dbo.covid view. The query will provide the values of geo_id and continent_exp columns to the iTVF predicate. The predicate will remove the row from the output if these values do not satisfy criterion.
A user firstname.lastname@example.org will see the filtered results based on his context:
Placing the security rules in a separate iTVFs and creating the secure wrapper views will make your code more maintainable.
Although native row-level-security is not available in serverless SQL pool, you can easily implement the similar security rules using the standard T-SQL functionalities.
You just need to ensure that a reader cannot bypass the security rules.
This method uses the core T-SQL functionalities and will not introduce additional overhead compared to the native row-level security. If this workaround doesn’t work in your scenario, you can propose this feature in Azure feedback site.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.