Azure Policy has multiple uses including general governance, monitoring setup, security, and compliance. It should not be used to deal with items better handled with role-based access control (RBAC). The following rules codify this:
Note: Many professionals use security and compliance interchangeably. Security encompasses much more than checkboxes on a compliance spreadsheet; however, complying with the Microsoft Cloud Security Benchmark and NIST SP 800-53 is a decent baseline for enforcing security aspects with Azure Policy.
I am not covering PaC solutions in detail here. I recommend Enterprise Azure Policy as Code (EPAC). I am one of its maintainers, so, not surprisingly, I believe EPAC to be vastly superior to any other PaC solution.
Cloud and most on-prem datacenters are software defined, leading to the term Infrastructure as Code (IaC). Azure Policy is a special form of infrastructure; therefore, we call the approach Policy as Code (PaC). When adopting (or building) a Policy as Code solution, you should ensure that deployments are:
Custom Policy/Initiative Definitions and Policy Assignments need to be deployed at a scope.
Custom definitions should always be deployed at the top Management Group (MG) in each tenant. That MG should be the single MG (no siblings) underneath the “Tenant root group” as recommended by Microsoft (see https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/design-areas) or at the actual “Tenant root group” if you are not following Microsoft’s recommendation verbatim.
Policy Assignments must be at this level or lower. They should be at the highest MG possible. Do NOT assign Policies to subscriptions or resource groups.
Note 1: The landing zones diagram in the link above shows Policy Assignments at the subscription level, which is technically incorrect: they are applied at the management group scope and inherited by subscriptions. The rest of the Cloud Adoption Framework documentation puts them correctly at the Management Group level (see https://github.com/Azure/Enterprise-Scale/wiki/ALZ-Policies).
Note 2: You must set the default MG for new subscriptions to a MG at or below the scope where the security-oriented Policy Assignments are deployed. This prevents rogue subscriptions from bypassing the security controls you enforce with Azure Policy.
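The scope rule above can be checked mechanically. The following is a minimal sketch in Python, assuming you have already exported your Policy Assignments as a list of dictionaries with `name` and `scope` keys (that input shape and the function names are hypothetical, not part of any Azure SDK). It relies only on the standard resource ID formats for management groups, subscriptions, and resource groups.

```python
import re

# Resource ID patterns for the three kinds of scope discussed above.
_MG = re.compile(r"^/providers/Microsoft\.Management/managementGroups/[^/]+$", re.IGNORECASE)
_SUB = re.compile(r"^/subscriptions/[^/]+$", re.IGNORECASE)
_RG = re.compile(r"^/subscriptions/[^/]+/resourceGroups/[^/]+$", re.IGNORECASE)


def classify_scope(scope: str) -> str:
    """Return 'managementGroup', 'subscription', 'resourceGroup' or 'other'."""
    scope = scope.rstrip("/")
    if _MG.match(scope):
        return "managementGroup"
    if _SUB.match(scope):
        return "subscription"
    if _RG.match(scope):
        return "resourceGroup"
    return "other"


def assignments_below_mg(assignments: list[dict]) -> list[str]:
    """List the names of assignments whose scope is not a Management Group."""
    return [a["name"] for a in assignments
            if classify_scope(a["scope"]) != "managementGroup"]
```

Running `assignments_below_mg` over an export gives you a list of assignments to review and, most likely, move up to a MG.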
Policies are inert elements in Azure until you create a Policy Assignment at a scope. Each assignment should:
“Azure Security Benchmark” (ASB - "name": "1f3afdf9-d0c9-4c3d-847f-89da613e70a8") is automatically assigned by Defender for Cloud in each subscription to protect new environments. All Policy effects are set to “Audit”. In most scenarios, you will set some of the effects to “Deny”. It is best to create a new Assignment at a MG (see “Management Groups and Policy Resources” above) to change the effects centrally. Once done, you should remove the auto-assigned Policy Assignments to avoid problems with overlapping assignments.
It is essential that ASB is assigned to cover all subscriptions. Defender for Cloud depends on this Policy Assignment.
You may assign additional security-oriented and compliance-oriented Initiatives, such as "NIST SP 800-53 Rev. 5" ("name": "179d1daa-458f-4e47-8086-2a68d0d6c38f"). You should limit yourself to no more than 5 Initiatives (including custom Initiatives). Larger numbers will make maintenance and managing Policy Exemptions extremely difficult.
Assignments containing Policies with a Modify or DeployIfNotExists effect require a Managed Identity (MI). The MI must be granted the Azure roles specified in the details section of the Policy rule.
I prefer a system-assigned Managed Identity SPN (service principal name) since it cannot be used outside a single assignment, eliminating the minimal threat of malicious usage (Azure provides controls for the usage).
If you need to reduce the number of role assignments, you can use a user-assigned MI instead.
First, question the need for any requested custom Policy/Initiative definition. While the built-in Policies are not perfect, their design choices often stem from constraints and conflicts between settings and include tradeoffs between risk and usability. If you still think you need custom definitions, sleep on it and revisit the topic one more time.
If you have multiple tenants, the same definition should be propagated to every tenant (DRY principle). Do not use a separate repo per tenant, which would cause copy/paste issues (WET anti-pattern).
Custom Policy definitions are notoriously hard to design/implement. Debugging issues is even harder. There are a few items which will make the experience easier.
Azure’s community-contributed Policy definitions repo (https://github.com/Azure/Community-Policy/blob/master) contains a script which validates the above and corrects the definition if necessary (see https://github.com/Azure/Community-Policy/blob/master/Submit-PolicyDefinitionFile.ps1).
Do not include system generated properties:
Policy effects should always be parameterized. Name the parameter “effect”, set its displayName to “Effect”, and specify “allowedValues” and a “defaultValue”. Recommended combinations are:

| "allowedValues" set | Recommended "defaultValue" |
| --- | --- |
| "Append", "Deny", "Audit", "Disabled" | Append |
| "Append", "Audit", "Disabled" | Append |
| "Modify", "Deny", "Audit", "Disabled" | Modify |
| "Modify", "Audit", "Disabled" | Modify |
| "Deny", "Audit", "Disabled" | Audit |
| "Audit", "Disabled" | Audit |
| "DeployIfNotExists", "AuditIfNotExists", "Disabled" | AuditIfNotExists or DeployIfNotExists |
| "AuditIfNotExists", "Disabled" | AuditIfNotExists |
| "DenyAction", "Disabled" | DenyAction |
| "Manual", "Disabled" | Manual |
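The effect-parameter convention above is easy to lint. Here is a minimal sketch in Python, assuming the custom definition has been loaded from its JSON file into a dictionary with the usual `properties` wrapper. `effect_parameter_problems` is a hypothetical helper, not part of any Azure tooling, and it checks only the naming rules stated above, not the recommended combinations in the table.

```python
def effect_parameter_problems(policy_definition: dict) -> list[str]:
    """Return a list of deviations from the 'effect' parameter convention."""
    props = policy_definition.get("properties", {})
    param = props.get("parameters", {}).get("effect")
    if param is None:
        return ['missing parameter "effect"']

    problems = []
    if param.get("metadata", {}).get("displayName") != "Effect":
        problems.append('displayName should be "Effect"')

    allowed = param.get("allowedValues")
    if not allowed:
        problems.append('"allowedValues" is missing or empty')

    default = param.get("defaultValue")
    if default is None:
        problems.append('"defaultValue" is missing')
    elif allowed and default not in allowed:
        problems.append('"defaultValue" is not in "allowedValues"')

    # The rule itself must reference the parameter instead of a literal effect.
    then_effect = props.get("policyRule", {}).get("then", {}).get("effect", "")
    if then_effect.replace(" ", "").lower() != "[parameters('effect')]":
        problems.append("policyRule.then.effect should be \"[parameters('effect')]\"")
    return problems
```

An empty list means the definition follows the convention. Anything else is a candidate for a failed check in your PaC pipeline.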
Append, Modify and DeployIfNotExists Policies are only advisable if the required parameters are known at Policy Assignment time.
Note: Modify and Append can interfere with desired state deployment technologies (e.g., Terraform). Terraform has an element “ignore_changes” to account for this problem (see https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle#ignore_changes).
Initiative (Policy Set) definitions benefit from the same guidelines as Policy definitions.
Parameters (especially effect parameters) should be surfaced by the Initiative. You will need to prefix the Policy-level parameter name with an indicator for the Policy in the Initiative.
When including Policies with a GUID name, I recommend that you make the policyDefinitionReferenceId a short version of the Policy’s displayName to make the Initiative readable.
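Surfacing parameters by hand is tedious, so it helps to generate the Initiative's parameter block. The sketch below, in Python, assumes a hypothetical input list where each member carries a readable `referenceId`, its `policyDefinitionId`, and its Policy-level `parameters`. The `<referenceId>_<parameterName>` naming is one possible prefix convention for illustration, not a rule from Azure.

```python
def surface_parameters(members: list[dict]) -> dict:
    """Build the 'parameters' and 'policyDefinitions' parts of an Initiative.

    Every Policy-level parameter is re-declared at the Initiative level
    under a prefixed name and wired through to the member Policy.
    """
    initiative_params = {}
    policy_definitions = []
    for member in members:
        ref = member["referenceId"]
        passed = {}
        for name, spec in member["parameters"].items():
            surfaced = f"{ref}_{name}"  # hypothetical naming convention
            initiative_params[surfaced] = spec
            passed[name] = {"value": f"[parameters('{surfaced}')]"}
        policy_definitions.append({
            "policyDefinitionReferenceId": ref,
            "policyDefinitionId": member["policyDefinitionId"],
            "parameters": passed,
        })
    return {"parameters": initiative_params,
            "policyDefinitions": policy_definitions}
```

With a short, descriptive `referenceId`, the surfaced parameter names stay readable in the assignment even when the underlying Policy is named by a GUID.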
Even with the best intentions some Policies may get in the way. If there is a business reason within acceptable risk parameters, you can grant an Exemption.
Exemptions come in two flavors (without any technical meaning):
Exemptions allow metadata. Add a link in metadata to the work item (e.g., Azure DevOps work item, GitHub issue, Jira ticket, etc.) to keep a record of why the exemption was granted and who granted it.
If you exempt an entire subscription with a Mitigated Exemption, it is likely that you should have used notScope (called Excluded Scope in Azure Portal) in the Assignment instead.
Warning: When you delete a Policy Assignment that has Exemptions, the Exemptions are not deleted and become orphaned.
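Orphaned Exemptions can be found by comparing each Exemption's `policyAssignmentId` against the assignments that still exist. The sketch below, in Python, assumes you have exported both lists beforehand. The input shapes and the function name are hypothetical, and the IDs are compared case-insensitively on the assumption that exports may differ in casing.

```python
def orphaned_exemptions(exemptions: list[dict],
                        existing_assignment_ids: list[str]) -> list[str]:
    """Return the names of Exemptions whose Policy Assignment no longer exists."""
    known = {assignment_id.lower() for assignment_id in existing_assignment_ids}
    return [
        exemption["name"]
        for exemption in exemptions
        if exemption["properties"]["policyAssignmentId"].lower() not in known
    ]
```

Running this as part of a regular operational script gives you a cleanup list before orphans accumulate.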
Operational tasks (e.g., Remediation tasks, generating documentation) must be scripted. Do not use CI/CD tools (including Terraform) to execute operational tasks since CI/CD is intended to deploy resources, not to operate those resources.
I keep track of changes by cloning and following Microsoft’s official Azure Policy repo on GitHub (https://github.com/Azure/azure-policy/tree/master/built-in-policies). When I receive an email about a merged PR (pull request), I’ll fetch the latest version from GitHub into my clone. This allows me to use Visual Studio Code on my local clone instead of using Azure Portal or GitHub web interface.