Core Infrastructure and Security Blog

Azure Policy Recommended Practices

Microsoft

May 04, 2023

Azure Policy has multiple uses including general governance, monitoring setup, security, and compliance. It should not be used to deal with items better handled with role-based access control (RBAC). The following rules codify this:

Prohibit anybody and any service from doing something: Azure Policy.
Prohibit specific users and service principals from doing something: RBAC.

Note: Many professionals use security and compliance interchangeably. Security encompasses much more than some checkboxes on a compliance spreadsheet; however, complying with Microsoft Cloud Security Benchmark and NIST SP 800-53 is a decent baseline for enforcing security aspects with Azure Policy.

Policy as Code

I am not covering PaC solutions in detail here. I recommend Enterprise Azure Policy as Code (EPAC). I’m one of its maintainers, so, not surprisingly, I believe EPAC to be vastly superior to any other PaC solution.

Cloud and most on-prem datacenters are software defined, leading to the term Infrastructure as Code (IaC). Azure Policy is a special form of infrastructure; therefore, we call the approach Policy as Code (PaC). When adopting (or building) a Policy as Code solution, you should ensure that deployments are:

Idempotent (you can run the deployment multiple times without any harm).
Desired state (reverses any drift from the last deployment).
Supportive of co-existence (different teams can each own some aspects of Azure Policy).
DRY (Don't Repeat Yourself): each piece of code/definition lives in one place, instead of repeating the same information in multiple files (Write Everything Twice/Thrice, the WET anti-pattern).
Driven by CI/CD following GitHub flow (https://docs.github.com/en/get-started/quickstart/github-flow) or a similarly simple branching strategy.
Round-trip capable (the solution can read an existing environment and extract the existing deployment to be ingested later).
Minimal in the amount of JSON, Bicep, or Terraform to be written.
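
The first two properties can be illustrated with a minimal sketch. It assumes that both the desired state (from the repo) and the deployed state (read from Azure) are available as dictionaries keyed by resource ID. The function names plan_changes and apply_plan are hypothetical and not part of EPAC or any Azure SDK.

```python
def plan_changes(desired: dict, deployed: dict) -> dict:
    """Compare desired and deployed resources (keyed by resource ID)."""
    return {
        # In the repo but not yet in Azure.
        "create": sorted(k for k in desired if k not in deployed),
        # In both places, but the deployed copy has drifted.
        "update": sorted(k for k in desired if k in deployed and desired[k] != deployed[k]),
        # In Azure but no longer in the repo.
        "delete": sorted(k for k in deployed if k not in desired),
    }


def apply_plan(desired: dict, deployed: dict, plan: dict) -> dict:
    """Return the deployed state after executing the plan (simulation only)."""
    result = dict(deployed)
    for key in plan["create"] + plan["update"]:
        result[key] = desired[key]
    for key in plan["delete"]:
        del result[key]
    return result


# Hypothetical resources for illustration.
desired = {"assignment-a": {"effect": "Deny"}, "assignment-b": {"effect": "Audit"}}
deployed = {"assignment-b": {"effect": "Deny"}, "assignment-c": {"effect": "Audit"}}

plan = plan_changes(desired, deployed)
deployed = apply_plan(desired, deployed, plan)
second_plan = plan_changes(desired, deployed)  # idempotent: nothing left to do
```

In a shared environment, the delete step would have to be restricted to resources the deploying team owns; otherwise one team's desired state would remove another team's Policy resources.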

Management Groups and Policy Resources

Custom Policy/Initiative Definitions and Policy Assignments need to be deployed at a scope.

Custom definitions should always be deployed at the top Management Group (MG) in each tenant. That MG should be the single MG (no siblings) underneath the “Tenant root group” as recommended by Microsoft (see https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/design-areas) or at the actual “Tenant root group” if you are not following Microsoft’s recommendation verbatim.

Policy Assignments must be at this level or lower. They should be at the highest MG possible. Do NOT assign Policies to subscriptions or resource groups.

Note 1: The landing zones diagram in the link above shows Policy Assignments at the subscription level, which is technically incorrect since they are applied at the management group scope and inherited by subscriptions. The rest of the Cloud Adoption Framework documentation puts them correctly at the Management Group level (see https://github.com/Azure/Enterprise-Scale/wiki/ALZ-Policies).

Note 2: You must set the default management group for new subscriptions to an MG at or below the scope where the security-oriented Policy Assignments are deployed. This prevents rogue subscriptions from bypassing the security controls you enforce with Azure Policy.
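
The rule that Policy Assignments belong at a management group scope can be checked mechanically. The following is a minimal sketch that assumes assignments have already been exported as dictionaries with "name" and "scope" keys; the helper names and sample data are hypothetical.

```python
import re

# Management group scopes have the form
# /providers/Microsoft.Management/managementGroups/<name>
MG_SCOPE = re.compile(
    r"^/providers/Microsoft\.Management/managementGroups/[^/]+$",
    re.IGNORECASE,
)


def is_management_group_scope(scope: str) -> bool:
    """True if the scope is a management group rather than a subscription or resource group."""
    return MG_SCOPE.match(scope) is not None


def find_low_level_assignments(assignments: list) -> list:
    """Return names of assignments scoped to subscriptions or resource groups."""
    return [a["name"] for a in assignments if not is_management_group_scope(a["scope"])]


# Hypothetical sample data.
assignments = [
    {"name": "sec-baseline", "scope": "/providers/Microsoft.Management/managementGroups/contoso"},
    {"name": "team-special", "scope": "/subscriptions/00000000-0000-0000-0000-000000000000"},
]
flagged = find_low_level_assignments(assignments)
```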

Policy Assignments

Policies are inert elements in Azure until you create a Policy Assignment at a scope. Each assignment should:

Define a semi-readable short name (limited to 24 characters by Azure).
Define a readable displayName (visible in Portal).
Define a description.
Optionally include metadata, such as a work item id.
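
These requirements lend themselves to a simple pre-deployment check. The sketch below assumes an assignment is represented as a dictionary with a top-level "name" and a nested "properties" object; check_assignment and the sample values are hypothetical.

```python
MAX_ASSIGNMENT_NAME_LENGTH = 24  # limit stated in the list above


def check_assignment(assignment: dict) -> list:
    """Return a list of human-readable problems; an empty list means the assignment passes."""
    problems = []
    name = assignment.get("name", "")
    if not name:
        problems.append("name is missing")
    elif len(name) > MAX_ASSIGNMENT_NAME_LENGTH:
        problems.append(f"name exceeds {MAX_ASSIGNMENT_NAME_LENGTH} characters")
    properties = assignment.get("properties", {})
    for field in ("displayName", "description"):
        if not properties.get(field):
            problems.append(f"{field} is missing")
    return problems


# Hypothetical assignment; the metadata key "workItemId" is an illustrative choice.
assignment = {
    "name": "sec-baseline-prod",
    "properties": {
        "displayName": "Security baseline for production",
        "description": "Enforces the security baseline on all production subscriptions.",
        "metadata": {"workItemId": "12345"},
    },
}
problems = check_assignment(assignment)
```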

“Azure Security Benchmark” (ASB - "name": "1f3afdf9-d0c9-4c3d-847f-89da613e70a8") is automatically assigned by Defender for Cloud in each subscription to protect new environments. All Policy effects are set to “Audit”. In most scenarios, you will set some of the effects to “Deny”. It is best to create a new Assignment at an MG (see “Management Groups and Policy Resources” above) to change the effects centrally. Once done, you should remove the auto-assigned Policy Assignments to avoid difficulties with overlaps.

It is essential that ASB is assigned to cover all subscriptions. Defender for Cloud depends on this Policy Assignment.

You may assign additional security-oriented and compliance-oriented Initiatives, such as "NIST SP 800-53 Rev. 5" ("name": "179d1daa-458f-4e47-8086-2a68d0d6c38f"). You should limit yourself to no more than 5 Initiatives (including custom Initiatives). Larger numbers will make maintenance and managing Policy Exemptions extremely difficult.

Assignments containing Policies with Modify or DeployIfNotExists effects require a Managed Identity (MI). The MI must be granted Azure roles, as specified in the details section of the Policy rule.
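
As a rough illustration of where those roles come from, the sketch below reads roleDefinitionIds from the details object under "then" in a Policy rule. It assumes the definition is available as a parsed JSON dictionary; the function name is hypothetical and the role GUID is a placeholder, not a real role.

```python
def required_role_definition_ids(policy_definition: dict) -> list:
    """Collect the role definition IDs a Modify/DeployIfNotExists Policy asks for."""
    rule = policy_definition.get("properties", {}).get("policyRule", {})
    details = rule.get("then", {}).get("details", {})
    if not isinstance(details, dict):
        # Some effects (for example Append) use a list for details and need no roles.
        return []
    return sorted(set(details.get("roleDefinitionIds", [])))


# Hypothetical, heavily trimmed definition; the role GUID is a placeholder.
definition = {
    "properties": {
        "policyRule": {
            "if": {"field": "type", "equals": "Microsoft.Storage/storageAccounts"},
            "then": {
                "effect": "[parameters('effect')]",
                "details": {
                    "roleDefinitionIds": [
                        "/providers/Microsoft.Authorization/roleDefinitions/00000000-0000-0000-0000-000000000001"
                    ],
                },
            },
        }
    }
}
roles = required_role_definition_ids(definition)
```

Running this over every Policy in an Initiative and taking the union gives the set of roles the assignment's MI needs at the assignment scope.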

I prefer system-assigned Managed Identities since their service principals cannot be used outside a single assignment. This eliminates the threat of malicious usage, which is minimal in any case because Azure provides controls for the usage.

If you need to reduce the number of role assignments, you can use a user-assigned MI instead.

Custom Definitions

First, question the need for any custom Policy/Initiative definition requested. While the built-in Policies are not perfect, their design choices often stem from constraints and conflicts between settings and include tradeoffs in risk versus usability. If you still think you need custom definitions, sleep on it and revisit the topic one more time.

If you have multiple tenants, the same definition should be propagated to every tenant (DRY principle). Do not use a separate repo per tenant, which would cause copy/paste issues (WET anti-pattern).

Policy Definitions

Custom Policy definitions are notoriously hard to design/implement. Debugging issues is even harder. There are a few items which will make the experience easier.

The name should be a GUID or a unique name within your company. Using a GUID simplifies contributing the Policy to the community or merging multiple tenants, especially in a merger (companies) scenario.
Create a nested properties structure with only the name outside.
Supply a displayName for the Policy.
Description is highly recommended.
version - in metadata; use semantic versioning.
category - in metadata, must be one of the categories in the built-in Policies and Policy Sets.
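
A minimal sketch of a definition that follows this list is shown below. The builder function new_policy_definition is hypothetical, the "mode" value of "All" is an assumption, and the Policy rule is a placeholder rather than a useful rule.

```python
import uuid


def new_policy_definition(display_name: str, description: str, category: str,
                          policy_rule: dict, version: str = "1.0.0") -> dict:
    """Build a definition with only the name outside the nested properties structure."""
    return {
        "name": str(uuid.uuid4()),  # a GUID keeps the name unique across tenants
        "properties": {
            "displayName": display_name,
            "description": description,
            "mode": "All",  # assumption for this sketch
            "metadata": {
                "version": version,    # semantic versioning
                "category": category,  # must match a built-in category
            },
            "policyRule": policy_rule,
        },
    }


# Placeholder rule for illustration only.
definition = new_policy_definition(
    display_name="Storage accounts should be reviewed",
    description="Illustrative placeholder definition.",
    category="Storage",
    policy_rule={
        "if": {"field": "type", "equals": "Microsoft.Storage/storageAccounts"},
        "then": {"effect": "[parameters('effect')]"},
    },
)
```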

Azure’s community-contributed Policy definitions repo (https://github.com/Azure/Community-Policy/blob/master) contains guidance (https://github.com/Azure/Community-Policy/blob/main/CONTRIBUTING.md), a script for validating a Policy definition (https://github.com/Azure/Community-Policy/blob/main/Scripts/Confirm-PolicyDefinitionIsValid.ps1), and a script to reformat/repair a Policy definition (https://github.com/Azure/Community-Policy/blob/main/Scripts/Out-FormattedPolicyDefinition.ps1).

Do not include system generated properties:

properties.policyType
properties.metadata
- createdOn
- createdBy
- updatedOn
- updatedBy
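
When a definition has been exported from an existing environment, these properties can be stripped before committing it to the repo. The following is a minimal sketch that assumes the definition is a parsed JSON dictionary; strip_system_properties is a hypothetical helper, not one of the Community-Policy scripts.

```python
import copy

SYSTEM_METADATA_KEYS = ("createdOn", "createdBy", "updatedOn", "updatedBy")


def strip_system_properties(definition: dict) -> dict:
    """Return a copy of the definition without system-generated properties."""
    cleaned = copy.deepcopy(definition)  # leave the caller's object untouched
    properties = cleaned.get("properties", {})
    properties.pop("policyType", None)
    metadata = properties.get("metadata", {})
    for key in SYSTEM_METADATA_KEYS:
        metadata.pop(key, None)
    return cleaned


# Hypothetical exported definition (values are placeholders).
exported = {
    "name": "my-policy",
    "properties": {
        "policyType": "Custom",
        "displayName": "My Policy",
        "metadata": {
            "version": "1.0.0",
            "category": "Storage",
            "createdOn": "2023-05-04T00:00:00Z",
            "createdBy": "someone",
        },
    },
}
cleaned = strip_system_properties(exported)
```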

Policy effects should always be parameterized. Name the parameter “effect”, set its displayName to “Effect”, and specify “allowedValues” and a “defaultValue”. Recommended combinations are:

"allowedValues" set	Recommended "defaultValue"
"Append", "Deny", "Audit", "Disabled"	Append
"Append", "Audit", "Disabled" (use only when Deny is not possible)	Append
"Modify", "Deny", "Audit", "Disabled"	Modify
"Modify", "Audit", "Disabled" (use only when Deny is not possible)	Modify
"Deny", "Audit", "Disabled"	Audit
"Audit", "Disabled" (use only when Deny is not possible)	Audit
"DeployIfNotExists", "AuditIfNotExists", "Disabled"	AuditIfNotExists or DeployIfNotExists
"AuditIfNotExists", "Disabled"	AuditIfNotExists
"DenyAction", "Disabled"	DenyAction
"Manual", "Disabled"	Manual

Append, Modify and DeployIfNotExists Policies are only advisable if the required parameters are known at Policy Assignment time.
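
The effect parameter convention and the recommended combinations above can be captured in a small helper. This is a sketch only: effect_parameter is a hypothetical function, and for the DeployIfNotExists set it arbitrarily picks AuditIfNotExists from the two recommended defaults.

```python
# Recommended defaults, keyed by the allowedValues set from the table above.
RECOMMENDED_DEFAULTS = {
    ("Append", "Deny", "Audit", "Disabled"): "Append",
    ("Append", "Audit", "Disabled"): "Append",
    ("Modify", "Deny", "Audit", "Disabled"): "Modify",
    ("Modify", "Audit", "Disabled"): "Modify",
    ("Deny", "Audit", "Disabled"): "Audit",
    ("Audit", "Disabled"): "Audit",
    # The table allows either default; this sketch picks the non-modifying one.
    ("DeployIfNotExists", "AuditIfNotExists", "Disabled"): "AuditIfNotExists",
    ("AuditIfNotExists", "Disabled"): "AuditIfNotExists",
    ("DenyAction", "Disabled"): "DenyAction",
    ("Manual", "Disabled"): "Manual",
}


def effect_parameter(allowed_values: list) -> dict:
    """Build the "parameters" fragment for a parameterized effect."""
    key = tuple(allowed_values)
    if key not in RECOMMENDED_DEFAULTS:
        raise ValueError(f"not a recommended allowedValues set: {allowed_values}")
    return {
        "effect": {
            "type": "String",
            "metadata": {"displayName": "Effect"},
            "allowedValues": list(allowed_values),
            "defaultValue": RECOMMENDED_DEFAULTS[key],
        }
    }


parameters = effect_parameter(["Deny", "Audit", "Disabled"])
# The Policy rule then references the parameter instead of a literal effect.
then_block = {"effect": "[parameters('effect')]"}
```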

Note: Modify and Append can interfere with desired state deployment technologies (e.g., Terraform). Terraform provides the lifecycle meta-argument “ignore_changes” to account for this problem (see https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle#ignore_changes).

Policy Set Definitions

Initiative (Policy Set) definitions benefit from the same guidelines as Policy definitions.

The name should be a GUID or a unique name within your company. Using a GUID simplifies contributing the Initiative to the community or merging multiple tenants, especially in a merger (companies) scenario.
Create a nested properties structure with only the name outside.
Supply a displayName for the Initiative.
Description is highly recommended.
version - in metadata; use semantic versioning.
category - in metadata, must be one of the categories in the built-in Policies and Policy Sets.

Parameters (especially effect parameters) should be surfaced by the Initiative. You will need to prefix the Policy level name with an indicator for the Policy in the Initiative.

When including Policies with a GUID name, I recommend that you make the policyDefinitionReferenceId a short version of the Policy’s displayName to make the Initiative readable.
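
Surfacing parameters with a per-Policy prefix could look like the sketch below. It assumes the readable policyDefinitionReferenceId doubles as the prefix and uses an underscore as separator; both are illustrative choices, as are the function name surface_policy and the placeholder definition ID.

```python
import copy


def surface_policy(reference_id: str, policy_definition_id: str, policy_parameters: dict):
    """Return the Initiative's policyDefinitions entry and the parameters it surfaces."""
    initiative_parameters = {}
    mapping = {}
    for name, definition in policy_parameters.items():
        surfaced_name = f"{reference_id}_{name}"  # prefix avoids clashes between Policies
        initiative_parameters[surfaced_name] = copy.deepcopy(definition)
        mapping[name] = {"value": f"[parameters('{surfaced_name}')]"}
    entry = {
        "policyDefinitionReferenceId": reference_id,
        "policyDefinitionId": policy_definition_id,
        "parameters": mapping,
    }
    return entry, initiative_parameters


# Placeholder IDs and a typical effect parameter.
entry, surfaced = surface_policy(
    reference_id="storageHttpsOnly",
    policy_definition_id="/providers/Microsoft.Authorization/policyDefinitions/00000000-0000-0000-0000-000000000002",
    policy_parameters={
        "effect": {
            "type": "String",
            "allowedValues": ["Deny", "Audit", "Disabled"],
            "defaultValue": "Audit",
        }
    },
)
```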

Policy Exemptions

Even with the best intentions some Policies may get in the way. If there is a business reason within acceptable risk parameters, you can grant an Exemption.

Exemptions come in two flavors (without any technical meaning):

Mitigated – Most often used for permanent exemptions. An example is allowing public IP addresses for a storage account which is used as an upload folder, combined with mitigations such as virus scans and deletion of processed data.
Waiver – Most often used for temporary exemptions to allow a solution team to fix their non-compliant deployment. Generally granted until the Monday after the ETA (estimated time of arrival) for the fix.

Exemptions allow metadata. Add a link in metadata to the work item (e.g., Azure DevOps work item, GitHub issue, Jira ticket, etc.) to keep a record of why the exemption was granted and who granted it.
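
A Waiver that expires on the Monday after the ETA and records its work item might be built as in the sketch below. The function names, the metadata key "workItem", and all IDs and URLs are hypothetical placeholders.

```python
from datetime import date, datetime, time, timezone, timedelta


def monday_after(eta: date) -> date:
    """Return the first Monday strictly after the given date."""
    # date.weekday() is 0 for Monday, so this adds between 1 and 7 days.
    return eta + timedelta(days=7 - eta.weekday())


def waiver_exemption(name: str, assignment_id: str, eta: date, work_item_url: str) -> dict:
    """Build a temporary (Waiver) exemption that links back to its work item."""
    expires = datetime.combine(monday_after(eta), time.min, tzinfo=timezone.utc)
    return {
        "name": name,
        "properties": {
            "policyAssignmentId": assignment_id,
            "exemptionCategory": "Waiver",
            "expiresOn": expires.isoformat(),
            "metadata": {"workItem": work_item_url},  # hypothetical key name
        },
    }


exemption = waiver_exemption(
    name="team-x-storage-fix",
    assignment_id="/providers/Microsoft.Management/managementGroups/contoso"
                  "/providers/Microsoft.Authorization/policyAssignments/sec-baseline",
    eta=date(2023, 7, 21),  # a Friday
    work_item_url="https://example.com/workitems/12345",
)
```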

If you exempt an entire subscription with a Mitigated exemption, it is likely that you should have used notScope (called Excluded Scope in Azure Portal) in the Assignment instead.

Warning: When you delete a Policy Assignment that has Exemptions, the Exemptions are not deleted and become orphaned.
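
Orphaned Exemptions can be detected by comparing each Exemption's policyAssignmentId with the assignments that still exist. The sketch below assumes both lists have already been retrieved as plain data; find_orphaned_exemptions and the sample IDs are hypothetical.

```python
def find_orphaned_exemptions(exemptions: list, existing_assignment_ids: list) -> list:
    """Return names of exemptions whose Policy Assignment no longer exists."""
    # Azure resource IDs are case-insensitive, so compare in lower case.
    known = {assignment_id.lower() for assignment_id in existing_assignment_ids}
    return [
        exemption["name"]
        for exemption in exemptions
        if exemption["properties"]["policyAssignmentId"].lower() not in known
    ]


# Placeholder data for illustration.
MG = "/providers/Microsoft.Management/managementGroups/contoso"
existing = [f"{MG}/providers/Microsoft.Authorization/policyAssignments/sec-baseline"]
exemptions = [
    {"name": "keep-me", "properties": {
        "policyAssignmentId": f"{MG}/providers/microsoft.authorization/policyassignments/SEC-BASELINE"}},
    {"name": "orphan", "properties": {
        "policyAssignmentId": f"{MG}/providers/Microsoft.Authorization/policyAssignments/deleted-one"}},
]
orphans = find_orphaned_exemptions(exemptions, existing)
```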

Operating Azure Policy

Operational tasks (e.g., Remediation tasks, generating documentation) must be scripted. Do not use CI/CD tools (including Terraform) to execute operational tasks since CI/CD is intended to deploy resources, not to operate those resources.

Keeping track of built-in changes

I keep track of changes by cloning and following Microsoft’s official Azure Policy repo on GitHub (https://github.com/Azure/azure-policy/tree/master/built-in-policies). When I receive an email about a merged PR (pull request), I’ll fetch the latest version from GitHub into my clone. This allows me to use Visual Studio Code on my local clone instead of using Azure Portal or GitHub web interface.

Updated Jul 21, 2023

Version 5.0

Heinrich_Gantenbein

14 Comments

akumar1911
MCT
Oct 14, 2024
Great summary and guidance on Azure policies.
abarrionuevo
Copper Contributor
Jun 26, 2024
Hello, Heinrich_Gantenbein
I consider as inconsistent this part of this article "Policy Assignment must be at this level or lower. They should be at the highest MG group possible. Do NOT assign Policies to subscriptions or resource groups." with the note "This is the most streamlined approach for creating a remediation task and is supported for policies assigned on a subscription. For policies assigned on a management group, remediation tasks should be created using https://learn.microsoft.com/en-us/azure/governance/policy/how-to/remediate-resources?tabs=azure-portal#option-1-create-a-remediation-task-from-the-remediation-page or https://learn.microsoft.com/en-us/azure/governance/policy/how-to/remediate-resources?tabs=azure-portal#option-2-create-a-remediation-task-from-a-non-compliant-policy-assignment after evaluation has determined resource compliance." at the article https://learn.microsoft.com/en-us/azure/governance/policy/how-to/remediate-resources?tabs=azure-portal

As for this last article, if we create a remediation task as a part of the policy assignment creation, and assign it to a MG, the remediation task is not effectively created, and we have to create it from the Remediation section afterwards.

Could you please clarify?

Thanks
Best regards
Abhijeetbhor
Copper Contributor
Aug 05, 2023
What is the best way to automate custom policy testing using a canary subscription to evaluate policy behaviour and validate that it is working as expected?
Heinrich_Gantenbein
Microsoft
Jul 21, 2023
Jesse Loudon : Thank you for reporting this. I fixed the links. We recently renamed both the branch (to main) and the names of the scripts.
Jesse Loudon
Brass Contributor
Jul 20, 2023
Just a heads up this link to the script mentioned as validating custom definitions appears to be broken.
https://github.com/Azure/Community-Policy/blob/master/Submit-PolicyDefinitionFile.ps1
Heinrich_Gantenbein
Microsoft
May 31, 2023
joe-zuchora : Microsoft Defender for Cloud (aka Azure Security Center) has no way of knowing the structure of your MGs and therefore can only apply settings at subscription level. With EPAC (https://aka.ms/epac) you can have it remove all the auto-assignment (default behavior) and create the assignments at the MG-level(s).
Heinrich_Gantenbein
Microsoft
May 31, 2023
amityadav : Hi, we do not provide support for Policy issues here. I recommend that you open a support ticket with Microsoft. Thank you.
amityadav
Copper Contributor
May 31, 2023

Hi team,

Unable to fix - SQL servers on machines should have vulnerability findings resolved.

Policy is showing NotFound for non-compliant resources; the details are below. I do not understand how to debug and fix this to make the resource compliant.

Compliance details
Resource name:sqlvulnerability
Resource type
:
Microsoft.Compute/virtualMachines
Scope
:
Corp-Development-01/rg-vm-dev-eastus2-01
Parent resource
:
resourcegroups/rg-vm-dev-eastus2-01
Location
:
East US 2
Resource ID
:
/subscriptions/6a0bd5a4-9826-44fa-9023-813f69860137/resourcegroups/rg-vm-dev-eastus2-01/providers/microsoft.compute/virtualmachines/sqlvulnerability
Compliance state
Non-compliant
Last evaluated
5/31/2023, 10:38 AM
Definition version
1.0.0
Initiative version
57.14.0
Reason for non-compliance
No related resources match the effect details in the policy definition. (Error code: AssessmentNotFound)
Existence condition
Type
Microsoft.Security/assessments
Name
f97aa83c-9b63-4f9a-99f6-b22c4398f936
joe-zuchora
Copper Contributor
May 11, 2023
Good insight but very frustrating that the ASB policy is assigned at the subscription level by default when it's clearly not a best practice.
irshad
Brass Contributor
May 08, 2023
Heinrich_Gantenbein Many thanks for your support .it helps in my work.