Azure Synapse Analytics is analytics service that enables you to implement solutions that enable you users to access data in Azure storage and define permission models that define what user can access some data. Azure Active directory is recommended model for accessing data and defining permission rules on your data. In addition to Azure AD permission model, you can define additional security policies that protect your data even in some cases where Azure AD permission model cannot be used. In this article you will see how to setup fine-grained security policy for SQL users that can access some parts of storage using workspace identity or SAS key. This is must-have setup for scenarios where SQL principals access data or serverless Synapse SQL pool access storage using Managed Identity or SAS token.
Synapse SQL runtime in Azure Synapse Analytics workspace enables you to define access rights and permissions to read data in two security layers:
Permissions of the user who would like to read data from storage files are checked in both layers:
In order to pass SQL security check, user would need to have some SELECT, GRANT REFERENCES CREDENTIAL, ADMINISTER BULK OPERATIONS SQL permission to initiate call to ADLS storage layers.
Then request will be checked on storage layer by verifying validity of SAS token that is used to access files, or ACL permissions of Azure AD identity that is accessing storage.
You have flexibility to choose would you configure security check only in SQL, storage or on both places depending on your use case.
If you are using Azure Active directory passthrough authentication, you can define granular access rules in Azure storage layer and specify which users could access some files and folders by assigning Azure roles such as Storage Blob Reader or Storage Blob Owner. In this case you can rely only on ADLS storage security checks.
However, there are some cases when you will not use AAD passthrough:
If you are using some of these authentication methods, your Synapse SQL runtime has access to any file/folder/container placed in storage layer. If you have different user roles who are accessing data, you need to ensure that some users have access only to some subset of folders. Since you don’t have fine-grained ACL permissions on storage, you need to do the following steps to define permissions in SQL runtime:
Let’s see how to apply this security model in the scenario where two user roles can access only some subfolders in storage.
We have ADLS storage with three data sets – Product, RetailSales, and StoreDemographics placed in different folders on the same ADLS storage account. Synapse SQL access storage using Managed Identity that has full access to all folders in storage.
We have two roles in this scenario:
We need to ensure that these roles can access only subsets of data, although Synapse SQL has full access. Therefore, we need to define access rights on SQL layer that will protect access to the resources.
In this step we will create two logins that will enable sales managers and store managers to access Synapse SQL:
CREATE LOGIN StoreManager WITH PASSWORD = '100reM4n4G3r!@#$';
GO
CREATE USER StoreManager FROM LOGIN StoreManager;
GO
CREATE LOGIN SalesManager WITH PASSWORD = 'Sa<M4n4G3r!@#$';
GO
CREATE USER SalesManager FROM LOGIN SalesManager;
GO
Now we have two username/password pairs that can access Synapse SQL, but they still cannot access storage.
We need some database scoped credential that Synapse SQL runtime will use to the ADLS access storage. Let’s imagine that we are enabling Synapse SQL to access private storage protected with firewall using Managed Identity of the workspace:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Y...0'
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity WITH IDENTITY = 'Managed Identity'
GO
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::WorkspaceIdentity TO StoreManager;
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::WorkspaceIdentity TO SalesManager;
GO
Once we create DATABASE SCOPED CREDENTIAL, we need to enable users to reference that credential so they can access storage.
Users who have REFERENCES permission on some storage might use OPENROWSET function to access any file on that storage. Therefore, we need to ensure that they cannot use this function by explicitly denying ADMINISTER BULK OPERATIONS in master database and ADMINISTER DATABASE BULK OPERATIONS permissions in data warehouse:
--> USE master
DENY ADMINISTER BULK OPERATIONS TO StoreManager;
DENY ADMINISTER BULK OPERATIONS TO SalesManager;
--> USE RetailStore
DENY ADMINISTER DATABASE BULK OPERATIONS TO StoreManager;
DENY ADMINISTER DATABASE BULK OPERATIONS TO SalesManager;
Since we have three datasets placed in three folders, we need to create three external tables that will access storage using some credential:
CREATE EXTERNAL DATA SOURCE [Data] WITH
( LOCATION = N'https://....dfs.core.windows.net/data', CREDENTIAL = WorkspaceIdentity )
GO
CREATE SCHEMA store
GO
CREATE EXTERNAL TABLE store.Product (...)
WITH (DATA_SOURCE = Data, LOCATION = N'Product/',FILE_FORMAT = ParquetSnappy)
GO
CREATE EXTERNAL TABLE store.[RetailSales] (...)
WITH (DATA_SOURCE = Data, LOCATION = N'RetailSales/',FILE_FORMAT = ParquetSnappy)
GO
CREATE EXTERNAL TABLE [store].[StoreDemographics] (...)
WITH (DATA_SOURCE = Data, LOCATION = N'StoreDemographics/',...)
GO
Any user that can select data from these tables can read the content of underlying files in ADLA storage.
Finally, we need to implement required security settings and allow store managers and sales managers to access only their data sets via proxy external tables:
GRANT SELECT ON OBJECT::store.Product TO StoreManager;
GRANT SELECT ON OBJECT::store.StoreDemographics TO StoreManager;
GO
GRANT SELECT ON OBJECT::store.Product TO SalesManager;
GRANT SELECT ON OBJECT::store.RetailSales TO SalesManager;
Now if we try to select data as Store Managers, we will get the results:
However, if these users try to access store.RetailSales they will get error:
The similar results will get sales manager when trying to access the tables.
Serverless Synapse SQL runtime enables to define fine-grained permissions and control what resources your users can access. Even if you provide full storage access to Synapse SQL runtime, you are still not loosing ability to define fine-grained permission to your users using SQL runtime permission model.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.