Jun 30 2020 09:24 AM - last edited on Feb 05 2021 01:02 PM by Eric Starker
Will there be an out-of-the-box Diagnostic setting Policy Initiative for Azure Resources to enable monitoring for your resources at scale?
The best you can find right now is from Tao Yang:
There are some policies available, but not for all resource types, which makes it difficult to enable monitoring at scale.
Jun 30 2020 09:31 AM - edited Jun 30 2020 09:35 AM - Solution
@pvyver great question! No "out-of-the-box" diagnostic setting policy initiative is planned as of yet. As Tao points out, the best way to do this is via Azure Policy. Agreed that pre-built policies exist only for a subset of resource types, but a policy can be authored per resource type. Definitely something we can look into, though.
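To illustrate what "a policy can be authored per resource type" might look like, here is a minimal Python sketch that builds a deployIfNotExists policy definition as a dictionary. It is not the format of any particular built-in policy: the function name `build_diagnostic_policy`, the parameter names (`logAnalytics`, `profileName`), the `existenceCondition` alias, and the API versions are assumptions to verify against the Azure Policy documentation, and the role definition ID is a placeholder you would supply yourself.

```python
import json


def build_diagnostic_policy(resource_type, log_categories, role_definition_ids):
    """Build a policy definition (as a dict) for ONE resource type.

    Hypothetical helper: the overall shape follows the usual
    deployIfNotExists layout, but every field should be checked
    against current Azure Policy documentation before use.
    """
    logs = [{"category": name, "enabled": True} for name in log_categories]

    # Nested ARM template that the policy deploys when no matching
    # diagnostic setting exists (apiVersion is an assumption).
    template = {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            "resourceName": {"type": "string"},
            "location": {"type": "string"},
            "logAnalytics": {"type": "string"},
            "profileName": {"type": "string"},
        },
        "resources": [
            {
                "type": f"{resource_type}/providers/diagnosticSettings",
                "apiVersion": "2017-05-01-preview",
                "name": "[concat(parameters('resourceName'), '/Microsoft.Insights/', parameters('profileName'))]",
                "location": "[parameters('location')]",
                "properties": {
                    "workspaceId": "[parameters('logAnalytics')]",
                    "logs": logs,
                },
            }
        ],
    }

    return {
        "properties": {
            "displayName": f"Deploy diagnostic settings for {resource_type} to Log Analytics",
            "mode": "Indexed",
            "parameters": {
                "logAnalytics": {"type": "String"},
                "profileName": {"type": "String", "defaultValue": "setByPolicy"},
            },
            "policyRule": {
                # Match only the one resource type this policy targets.
                "if": {"field": "type", "equals": resource_type},
                "then": {
                    "effect": "deployIfNotExists",
                    "details": {
                        "type": "Microsoft.Insights/diagnosticSettings",
                        "name": "[parameters('profileName')]",
                        "roleDefinitionIds": role_definition_ids,
                        "existenceCondition": {
                            "field": "Microsoft.Insights/diagnosticSettings/workspaceId",
                            "equals": "[parameters('logAnalytics')]",
                        },
                        "deployment": {
                            "properties": {
                                "mode": "incremental",
                                "template": template,
                                "parameters": {
                                    "resourceName": {"value": "[field('name')]"},
                                    "location": {"value": "[field('location')]"},
                                    "logAnalytics": {"value": "[parameters('logAnalytics')]"},
                                    "profileName": {"value": "[parameters('profileName')]"},
                                },
                            }
                        },
                    },
                },
            },
        }
    }


if __name__ == "__main__":
    policy = build_diagnostic_policy(
        "Microsoft.KeyVault/vaults",
        ["AuditEvent"],
        # Placeholder: substitute the role definition ID(s) you need.
        ["/providers/Microsoft.Authorization/roleDefinitions/<role-definition-guid>"],
    )
    print(json.dumps(policy, indent=2))
```

Because the resource type and log categories are the only inputs that change, one such definition would be generated per resource type, which is why the amount of hand-authoring grows quickly without automation.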
Jun 30 2020 09:48 AM - edited Jun 30 2020 09:51 AM
@pvyver You can also check out a script we have available that creates custom Azure Policy definitions for Azure resource types that support Azure Diagnostics logs and metrics. The script can create policies for both Event Hub and Log Analytics sinks. Check it out here: https://github.com/JimGBritt/AzurePolicy/tree/master/AzureMonitor/Scripts
Jun 30 2020 09:59 AM
@Rahul Bagaria Thanks for surfacing this script. Tao's approach is definitely where things started. The biggest challenge was that there was no easy programmatic way to combine those policies into an initiative represented as an ARM template. Reading through Tao's blog post, there are a ton of things he needed to do, plus hours of testing. That testing is valuable: it confirms everything works in your environment and helps you understand how much ingestion may occur. Building the policies to begin with is a bit of an investigative task all its own.
The script that Rahul surfaced, documented here: https://aka.ms/AzPolicyScripts, allows a customer to discover the resources in their environment that support Azure Diagnostics and then generate policies for them. If a policy initiative is needed in the form of an ARM template (which is the recommended way to manage this at scale), the script can export that too. It does require discovering existing resources across a resource group, subscription, management group, or tenant, but at least the flexibility is there to run that discovery at different scopes.
Generating the ARM template can also be done in a pipeline that diffs the output to detect drift and updates to the PaaS resources being managed. That would be the last piece of this puzzle.
If you use the ARM template published on Tao's site, it will get you partway there: it covers the resource types that existed when he built it and the log categories that were supported at that time. Work will be needed to true it up given the updates since his post, and this script can help. I'd say they both have value: Tao's post and his hard work surfaced a fundamental challenge and a way to solve it, and this script helps automate that approach.
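The drift-detection idea can be sketched roughly as follows. This is not part of the script linked above; it assumes you have already reduced the previously exported initiative and a fresh discovery run to plain mappings of resource type to log category names. The function name `diff_categories` and the sample data are hypothetical.

```python
def diff_categories(previous, current):
    """Compare two {resource_type: [log categories]} mappings.

    `previous` would come from the last exported initiative and
    `current` from a fresh discovery run (both are assumptions
    about how a pipeline might store this data).
    """
    prev_types, curr_types = set(previous), set(current)

    changed = {}
    for rtype in sorted(prev_types & curr_types):
        added = sorted(set(current[rtype]) - set(previous[rtype]))
        removed = sorted(set(previous[rtype]) - set(current[rtype]))
        if added or removed:
            changed[rtype] = {"added": added, "removed": removed}

    return {
        # Resource types seen now that the old initiative did not cover.
        "new_types": sorted(curr_types - prev_types),
        # Resource types the old initiative covered but discovery no longer finds.
        "missing_types": sorted(prev_types - curr_types),
        # Resource types whose supported log categories changed.
        "changed": changed,
    }


def has_drift(report):
    """True when any part of the report is non-empty."""
    return bool(report["new_types"] or report["missing_types"] or report["changed"])


if __name__ == "__main__":
    # Illustrative sample data only.
    last_export = {"Microsoft.KeyVault/vaults": ["AuditEvent"]}
    discovery = {
        "Microsoft.KeyVault/vaults": ["AuditEvent", "ExampleNewCategory"],
        "Microsoft.Network/networkSecurityGroups": ["NetworkSecurityGroupEvent"],
    }
    report = diff_categories(last_export, discovery)
    print(report)
    print("drift detected:", has_drift(report))
```

A pipeline step could fail or open a work item whenever `has_drift` returns true, prompting a regeneration of the initiative.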
Jun 30 2020 10:02 AM
@pvyver Spot on, yes. There is a gap today in what you are looking for around "Auto Generation of Policy". Work is being done to improve this experience, but for now this is probably one of the best approaches for at least removing the strain of building the policies to begin with. Your point is well taken, however. This has been a challenge from the beginning: unless a resource of a given type exists, there is no way to know which metrics and log categories that resourceType supports.
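A small sketch of why an existing resource is needed: to my understanding, the diagnostic settings categories are listed per resource, so the request URL is built from a concrete resource ID rather than from a resource type alone. The helper names below are hypothetical, the `api-version` value and the response shape are assumptions to check against the Azure Monitor REST API reference, and the sample payload is made up for illustration. No network call is made here.

```python
ARM_ENDPOINT = "https://management.azure.com"


def categories_url(resource_id, api_version="2021-05-01-preview"):
    """Build the URL that lists diagnostic categories for ONE existing resource.

    The api_version default is an assumption; confirm it before use.
    """
    return (
        f"{ARM_ENDPOINT}/{resource_id.strip('/')}"
        f"/providers/Microsoft.Insights/diagnosticSettingsCategories"
        f"?api-version={api_version}"
    )


def split_categories(payload):
    """Split a (presumed) response payload into log and metric category names."""
    logs, metrics = [], []
    for item in payload.get("value", []):
        kind = item.get("properties", {}).get("categoryType")
        if kind == "Logs":
            logs.append(item["name"])
        elif kind == "Metrics":
            metrics.append(item["name"])
    return {"logs": logs, "metrics": metrics}


if __name__ == "__main__":
    # Hypothetical resource ID; a real one must point at a deployed resource.
    rid = (
        "/subscriptions/<subscription-id>/resourceGroups/<rg>"
        "/providers/Microsoft.KeyVault/vaults/<vault-name>"
    )
    print(categories_url(rid))

    # Made-up payload in the shape this sketch assumes.
    sample = {
        "value": [
            {"name": "AuditEvent", "properties": {"categoryType": "Logs"}},
            {"name": "AllMetrics", "properties": {"categoryType": "Metrics"}},
        ]
    }
    print(split_categories(sample))
```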