Data Security: Azure Key Vault in Databricks
Great article on securing secrets in Databricks! However, I'd like to suggest an even more secure approach: Azure Key Vault-backed secret scopes (https://learn.microsoft.com/en-us/azure/databricks/security/secrets/).
The current approach still requires storing service principal credentials (tenant ID, client ID, and client secret) somewhere in your code or environment variables, which introduces security risks:
- Client secrets in code = security vulnerability
- Environment variables = still exposed in cluster configuration and still have to be managed and rotated
A better solution: Azure Key Vault-backed secret scopes
With Azure Key Vault-backed secret scopes, you can:
- Create a secret scope directly linked to your Azure Key Vault (a sketch follows this list)
- Use Databricks' built-in integration with Azure Key Vault
- Eliminate the need to store any credentials in your notebooks or environment
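For reference, here is a minimal sketch of creating such a scope through the Secrets REST API (the Create Secret Scope UI works just as well). The workspace URL, Azure AD token, and Key Vault resource ID / DNS name are placeholders you'd swap for your own values:

# Minimal sketch: create an Azure Key Vault-backed secret scope via the
# Databricks Secrets REST API. All identifiers below are placeholders.
import requests

workspace_url = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
aad_token = "<azure-ad-access-token>"  # AKV-backed scope creation needs an Azure AD token

payload = {
    "scope": "my-keyvault-scope",
    "scope_backend_type": "AZURE_KEYVAULT",
    "backend_azure_keyvault": {
        "resource_id": "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<vault-name>",
        "dns_name": "https://<vault-name>.vault.azure.net/",
    },
}

resp = requests.post(
    f"{workspace_url}/api/2.0/secrets/scopes/create",
    headers={"Authorization": f"Bearer {aad_token}"},
    json=payload,
)
resp.raise_for_status()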
Usage:
# No credentials needed in code!
# Just reference the secret scope and key name
connection_string = dbutils.secrets.get(scope="my-keyvault-scope", key="db-connection-string")
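As a quick sanity check after wiring up the scope, you can list the secret names it exposes and feed a retrieved value straight into a reader. The JDBC URL, table, user, and "db-password" key below are just illustrative assumptions:

# List the secrets visible through the scope (only names are returned, never values)
for s in dbutils.secrets.list("my-keyvault-scope"):
    print(s.key)

# Example: use a retrieved secret in a JDBC read (URL, table, and user are placeholders)
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
    .option("dbtable", "dbo.my_table")
    .option("user", "app_user")
    .option("password", dbutils.secrets.get(scope="my-keyvault-scope", key="db-password"))
    .load()
)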
Advanced: Use secrets in Spark config and environment variables (https://learn.microsoft.com/en-us/azure/databricks/security/secrets/secrets-spark-conf-env-var)
You can also reference secrets directly in cluster configuration:
# Spark configuration property
spark.password {{secrets/my-keyvault-scope/db-password}}
# Environment variable
SPARKPASSWORD={{secrets/my-keyvault-scope/db-password}}
Then access them in your code:
import os
# From Spark config
password = spark.conf.get("spark.password")
# From environment variable (in init scripts)
password = os.environ.get("SPARKPASSWORD")
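If you create clusters programmatically, the same {{secrets/...}} references can go straight into the cluster spec. Here is a rough sketch against the Clusters REST API; the runtime version, node type, workspace URL, and token are placeholders, not recommendations:

import requests

workspace_url = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
token = "<databricks-token>"  # placeholder

cluster_spec = {
    "cluster_name": "secrets-demo",
    "spark_version": "14.3.x-scala2.12",   # placeholder runtime
    "node_type_id": "Standard_DS3_v2",     # placeholder node type
    "num_workers": 1,
    # The {{secrets/...}} references are resolved at cluster start;
    # the plaintext values are never stored in the cluster spec itself.
    "spark_conf": {
        "spark.password": "{{secrets/my-keyvault-scope/db-password}}"
    },
    "spark_env_vars": {
        "SPARKPASSWORD": "{{secrets/my-keyvault-scope/db-password}}"
    },
}

resp = requests.post(
    f"{workspace_url}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {token}"},
    json=cluster_spec,
)
resp.raise_for_status()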
Benefits:
- No service principal credentials in code or environment
- Databricks authenticates to Key Vault using managed identity
- Secret values are redacted when printed in notebook output
- Fine-grained access control via Databricks secret scope ACLs (see the sketch after this list)
- Secrets automatically redacted from logs
- Works seamlessly with cluster-scoped init scripts
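On the ACL point above, scope permissions are managed through the Secrets API (or the Databricks CLI). A minimal sketch, assuming a hypothetical "data-analysts" group and the same placeholder workspace URL and token as before:

import requests

workspace_url = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
token = "<databricks-token>"  # placeholder

# Grant the (hypothetical) "data-analysts" group read-only access to the scope.
resp = requests.post(
    f"{workspace_url}/api/2.0/secrets/acls/put",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "scope": "my-keyvault-scope",
        "principal": "data-analysts",
        "permission": "READ",
    },
)
resp.raise_for_status()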