Azure Monitor
Generally Available - High scale mode in Azure Monitor - Container Insights
Container Insights is Azure Monitor's solution for collecting logs from your Azure Kubernetes Service (AKS) clusters. As AKS adoption continues to grow, an increasing number of customers have log scaling needs that hit the limits of log collection in Container Insights. Last August, we announced the public preview of High Scale mode in Container Insights to help customers achieve higher log collection throughput from their AKS clusters. Today, we are happy to announce the general availability of High Scale mode.

High Scale mode is ideal for customers approaching or exceeding 10,000 logs/sec from a single node. When High Scale mode is enabled, Container Insights makes multiple configuration changes that lead to higher overall throughput. These include using a more powerful agent setup, using a different data pipeline, allocating more memory for the agent, and more. All these changes are made in the background by the service and do not require input or configuration from customers.

High Scale mode impacts only the data collection layer (with a new DCR) – the rest of the experience remains the same. Data flows to the existing tables, and your queries and alerts work as before.

High Scale mode is available to all customers but is currently turned off by default. In the future, we plan to enable High Scale mode by default for all customers to reduce the chance of log loss when workloads scale. To get started with High Scale mode, please see our documentation at https://aka.ms/cihsmode
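If you're unsure whether your clusters are approaching the 10,000 logs/sec per-node threshold mentioned above, one rough way to estimate it is to measure recent per-node ingestion into the ContainerLogV2 table. This is an illustrative sketch rather than an official sizing tool - it assumes the default ContainerLogV2 schema and only approximates the average rate over the chosen window:

// Approximate average container log rate per node over the last hour
ContainerLogV2
| where TimeGenerated > ago(1h)
| summarize LogCount = count() by Computer
| extend ApproxLogsPerSec = LogCount / 3600.0
| order by ApproxLogsPerSec desc

Averages can hide bursts, so if peak rate matters for your workload, narrow the bin (for example, summarize by bin(TimeGenerated, 1m)) before dividing.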
Automate Your Log Analytics Workflows with AI and Logic Apps

In this post, we'll demonstrate how to build a simple yet powerful workflow using Azure Logic Apps, Log Analytics queries, and LLMs to automate log analysis, save time, and spot issues faster. While we focus here on an example using Application Insights data with Azure OpenAI, the same approach can be applied to any Log Analytics source - whether raw logs, security events, or custom logs. By customizing your queries and AI prompts to match your data and the model's capabilities, you can easily adapt this workflow to meet your specific needs.

Note: This blog post offers guidance for automating workflows with Log Analytics data and LLMs using existing Azure Monitor products. It's intended as a flexible approach based on user needs and preferences, providing an additional option alongside other Microsoft experiences, such as Azure Monitor issues and investigations (preview).

Application Insights as a Use Case

Imagine you're an Application Insights user relying on the AppTraces table - detailed logs of events, errors, and critical traces. You need to spot hour-over-hour spikes or drops, identify the operations causing the most issues, and detect recurring patterns or keywords that reveal deeper problems. These insights help turn raw data into actionable information. Running queries and analyzing logs regularly is essential, and automation offers a way to make this process more efficient. This saves time and helps you focus on the most impactful insights - so you can quickly move on to what matters next.

With Azure Logic Apps, you can create a recurring workflow that automatically runs your Log Analytics queries, sends the summarized results to Azure OpenAI for analysis, and delivers a clear, actionable report straight to your inbox on your preferred schedule.

From Logs to Insights: Step-by-Step AI Workflow

1. Create a Logic App
Go to the Azure Portal and create a new Logic App. Open the Logic App Designer to start building your workflow.
Helpful resource: Overview - Azure Logic Apps | Microsoft Learn

2. Set a Trigger
Add a trigger to start your flow - for this scenario, we recommend using the Recurrence trigger to schedule it on a weekly basis (or any frequency you prefer). You can choose other triggers depending on your specific needs.

3. Query Your Log Analytics Data
Add the Azure Monitor Logs - "Run query and list results" connector to your Logic App and connect it to your Log Analytics workspace (or another relevant resource). Write a Kusto Query Language (KQL) query to pull data from Log Analytics tables. In our example, the query retrieves aggregated error-level (SeverityLevel = 3) and critical-level (SeverityLevel = 4) traces from the last week, grouped by hour and operation name, with three sample messages for context. This not only shows the number of errors, when they occurred, and which operations were most affected, but also gives the LLM in the next step a solid foundation for uncovering deeper insights and trends.

The query:

AppTraces
| where TimeGenerated > startofday(ago(7d))
| where SeverityLevel in (3, 4) // Error = 3, Critical = 4
| summarize TracesCount = count(), SampleMessages = make_list(Message, 3) by bin(TimeGenerated, 1h), SeverityLevel, OperationName
| order by TimeGenerated asc

Tip: Log datasets can be huge - use the summarize operator to aggregate results and reduce the volume sent to the AI model (see the trimming sketch below).
Helpful resource: Connect to Log Analytics or Application Insights - Azure Logic Apps | Microsoft Learn
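If the weekly result set grows large, the payload passed to the model can exceed token limits. One way to keep it compact - shown here as an optional, illustrative tweak rather than part of the original walkthrough - is to truncate each sample message and cap the number of rows returned:

AppTraces
| where TimeGenerated > startofday(ago(7d))
| where SeverityLevel in (3, 4)
| summarize TracesCount = count(),
            SampleMessages = make_list(substring(Message, 0, 200), 3) // keep only the first 200 characters of each sample
    by bin(TimeGenerated, 1h), SeverityLevel, OperationName
| top 500 by TracesCount desc // cap the number of rows handed to the LLM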
4. Prerequisite - Azure OpenAI Resource Configuration
Make sure you have an Azure OpenAI resource set up and an AI model (e.g., GPT-4) deployed before continuing with your workflow.
Helpful resource: What is Azure OpenAI in Azure AI Foundry Models? | Microsoft Learn

5. Analyze and Summarize with Azure OpenAI
In Logic Apps, add an HTTP action and set the parameters to call the Azure OpenAI API endpoint. Pass the query results from the previous step (step 3) as input and instruct the OpenAI model to:
Summarize key findings - for example, the total number of errors and critical events, and the top operations generating the most issues.
Highlight anomalies or trends - such as spikes in errors over time (hour-by-hour) and recurring error patterns or keywords.
Provide recommendations prioritized by urgency to guide the next steps.
Format the output in HTML for easy email rendering.

Tip: The body structure sent to the AI includes both System and User roles, formatted together as one string (see below).
Helpful resource: How to use Assistants with Logic apps - Azure OpenAI | Microsoft Learn

Here's the prompt example:

{
  "messages": [
    {
      "role": "system",
      "content": "You are an Azure AI tool that creates a weekly report based solely on this prompt and input JSON data from Log Analytics. The input is a list of records, each with these fields: TimeGenerated (ISO 8601 timestamp string), SeverityLevel (integer, where 3=Error, 4=Critical), OperationName (string), TracesCount (integer), SampleMessages (JSON string representing a list of up to 3 messages). Your tasks: 1) Sum the TracesCount values accurately to provide total counts for the entire week and broken down by day and SeverityLevel. 2) Present TracesCount counts per OperationName, grouped by hour and day with severity-level breakdowns. 3) Identify and list the top 10 OperationNames by combined Error and Critical TracesCount for the week, including up to 3 unique sample messages per OperationName, removing duplicates. 4) Compare TracesCount hour-by-hour and day-by-day, calculating percentage changes and highlighting spikes (>100% increase) and significant drops. 5) Detect any new OperationNames appearing during the week that did not appear before. 6) Highlight recurring Errors and Critical issues based on keywords: timeout, exception, outofmemory, connection refused. 7) Assign urgency levels based on frequency, impact, and trends. 8) Provide clear, prioritized recommendations for resolving the main issues. Format your output as valid inline-styled HTML using only these tags: <h2>, <h3>, <p>, <ul>, <li>, and <hr>. Include these report sections in this order: Executive Summary, Weekly Totals and Daily Breakdown, Hourly and Daily Trend Comparison, New & Emerging OperationNames, Detailed Operation Errors, Data Quality & Confidence, Recommendations. Include an opening title with the report's time period."
    },
    {
      "role": "user",
      "content": "string(outputs('Run_query_and_list_results'))"
    }
  ]
}

6. Send the Report via Email
Use the Send an email (V2) connector, or another endpoint connector such as Teams. Send the AI-generated report to your team, stakeholders, or yourself. Customize the email subject, body, and importance level as needed.

Section of the final email report: (screenshot included in the original post)

Important reminder: Once your flow is ready, enable it in Logic Apps to ensure it starts running according to the schedule.
Key Takeaways

By combining Azure Logic Apps, Log Analytics, and Azure OpenAI, you can turn raw, complex logs into clear, actionable insights - automatically. This workflow helps reduce manual analysis time and enables faster responses to critical issues.

Ready to try? Build your own automated log insights workflow today and empower your team with AI-driven clarity.
General Availability of Auxiliary Logs and Reduced Pricing

Azure Monitor Logs is trusted by hundreds of thousands of organizations to monitor mission-critical workloads. But with such a diverse customer base, there's no one-size-fits-all solution. That's why we're excited to announce a series of major advancements to Auxiliary Logs, the Azure Monitor plan designed for high-volume logs. Auxiliary Logs works in tandem with all other Azure Monitor tools, including the more powerful Basic and Analytics Logs plans. Together, they are the one-stop shop for all the logging needs of an organization.

Auxiliary Logs were introduced last year and have gained a lot of traction since. Many customers ingest data into Auxiliary Logs, with several teams ingesting more than a petabyte of logs per day. Over the last few months, we have moved Auxiliary Logs to General Availability, made them available in all regions, and made numerous enhancements to the service. Auxiliary Logs were first introduced with support for custom logs only; security data was added shortly afterwards and is now also supported. Additional table support will be available soon. Learn more about table plans here.

We're also announcing a significant price reduction for Auxiliary Logs, making them even more cost-effective and accessible for high-volume scenarios. For detailed pricing information and charges, visit the Azure Monitor pricing page. This is part of a broader strategy as we align and evolve our data lake assets and pricing models, with the goal of enabling customers to benefit from modern data lake technology - including batch computing and federated access without duplication - across multiple use cases spanning security and observability on a common technology stack. The Sentinel data lake announced recently is a key part of this evolution. Data ingested into the Sentinel data lake can be accessed in Auxiliary Logs without copying, and vice versa. Stay tuned for more information later this year on how we're evolving our data strategy for operational scenarios.

Enhanced Query Capabilities

We have worked to make queries on Auxiliary Logs faster and more powerful. That includes:
Expanded KQL support: All KQL operators on a single table are now supported, including the lookup operator to Analytics tables.
Performance boosts: Built on Delta Parquet, Auxiliary Logs now benefit from improved encoding and partitioning that make queries much more efficient, though indexed tiers like Basic Logs and Analytics Logs will still perform better.
Extended time range: Queries are no longer limited to the last 30 days - you can now query across any time period.
Cost estimation preview: Get a cost estimate before running your query.

General Availability of Summary Rules

We are also announcing the General Availability of summary rules. Summary rules have quickly become a key resource for optimizing data ingestion and analysis, having been adopted by a significant number of customers during the preview period. Summary rules enable users to efficiently summarize high-ingestion-rate streams across Analytics, Basic, or Auxiliary plans, supporting robust analysis, dashboarding, and long-term reporting via summarized Analytics tables. Unlike conventional ETL processes, raw data remains in its original tables, allowing for detailed investigations as needed.

Key enhancements include:
Increased rule limits per workspace
The ability to retry bins affected by incidents
Expanded regional availability

Customers can now utilize summary rules at greater scale and with increased confidence. Learn more about summary rules here.
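To make the summary rules scenario concrete, here's an illustrative sketch of the kind of KQL a summary rule might run on a schedule. The table name ContosoFirewall_CL and its columns are hypothetical, and the actual rule is configured through the summary rules experience rather than executed as a standalone query:

// Hourly roll-up of a high-volume custom table into a compact summary
ContosoFirewall_CL
| summarize EventCount = count(),
            DistinctSources = dcount(SourceIp_s)
    by bin(TimeGenerated, 1h), Action_s

The summarized output lands in an Analytics table, so dashboards and alerts can run against the compact roll-up while the verbose source rows stay in the cheaper tier.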
Search Jobs: More Power, More Flexibility

Search jobs allow users to scan vast amounts of data asynchronously and ingest the results into an Analytics table for further investigation. Based on customer feedback, we've made the following improvements:
More results can be loaded - up to 100 million records (coming soon).
An improved user interface that streamlines search job execution.
A cost prediction before running a search job.
Increased concurrency and the removal of additional limits.
Support for all KQL operators on a single table, including the lookup operator to Analytics tables (coming soon).

Learn more about search jobs here.

Public Preview of KQL Transformations for Auxiliary Logs

Last, but not least, we're excited to announce the public preview of KQL-based transformations for Auxiliary Logs. This milestone brings Auxiliary Logs to feature parity with other Azure Monitor log tiers in terms of ingestion-time transformations, eliminating the previous limitation of ingesting only raw custom logs into the Auxiliary tier. With this new capability, you can now apply filtering and transformation logic at ingestion time, enabling a more strategic and cost-effective approach to managing high-volume, low-fidelity logs.

By using Data Collection Rules (DCRs) with Kusto Query Language (KQL) expressions, you can:
Filter out noise to reduce data volume.
Parse and shape fields to prepare logs for efficient downstream consumption.
Split data across multiple tables or tiers for cost-performance optimization.

What makes this especially powerful is that transformations apply to both custom and standard log streams directed to custom tables in the Auxiliary tier. For example, you can now route a portion, or the entirety, of specific platform logs to a custom table in Auxiliary storage, applying transformations as needed. Applying custom transformations and filtering data ingested into the Auxiliary tier incurs a log processing charge. For detailed pricing information and charges, please refer to the Azure Monitor pricing page. Learn more about ingestion-time transformations here.
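As an illustration of the ingestion-time transformations described above, here's a minimal sketch of a transformKql expression that drops noisy rows and extracts a field before the data lands in an Auxiliary custom table. The incoming column names (RawData and the extracted ClientIp) are assumptions for the sake of the example; your DCR's input stream schema will differ:

// Example transformKql for a DCR stream feeding an Auxiliary custom table
source
| where RawData !has "healthcheck" // filter out noisy probe traffic
| extend ClientIp = extract(@"client=(\S+)", 1, RawData) // parse a field for downstream use
| project TimeGenerated, ClientIp, RawData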
Optimize Azure Log Costs: Split Tables and Use the Auxiliary Tier with DCR

This blog is a continuation of my previous blog, where I discussed saving ingestion costs by splitting logs into multiple tables and opting for the Basic tier. Now that the transformation feature for Auxiliary Logs has entered the public preview stage, I'll take a deeper dive, showing how to implement transformations to split logs across tables and route some of them to the Auxiliary tier.

A quick refresher: Azure Monitor offers several log plans that customers can opt for depending on their use cases. These log plans include:

Analytics Logs – This plan is designed for frequent, concurrent access and supports interactive usage by multiple users. It drives the features in Azure Monitor Insights and powers Microsoft Sentinel. It is designed to manage critical and frequently accessed logs optimized for dashboards, alerts, and advanced business queries.
Basic Logs – Improved to support even richer troubleshooting and incident response with fast queries while saving costs. Now available with a longer retention period and the addition of KQL operators to aggregate and look up data.
Auxiliary Logs – Our new, inexpensive log plan that enables ingestion and management of verbose logs needed for auditing and compliance scenarios. These may be queried with KQL on an infrequent basis and used to generate summaries.

A diagram in the original post provides detailed information about the log plans and their use cases. More details about Azure Monitor Logs can be found here: Azure Monitor Logs - Azure Monitor | Microsoft Learn

**Note** This blog focuses on switching to Auxiliary Logs only. I recommend going through our public documentation for detailed insights into the feature-wise comparison of the log plans, which should help you make the right decisions when choosing log plans.

At this stage, I assume you're aware of the different log tiers that Azure Monitor offers and have decided to switch to Auxiliary Logs for high-volume, low-fidelity logs. Let's look at the high-level approach we're going to follow:

Review the relevant tables and figure out which portion of the logs can be moved to the Auxiliary tier.
Create a DCR-based custom table with the same schema as the original table. For example, if you wish to split the Syslog table and ingest a portion of it into the Auxiliary tier, create a DCR-based custom table with the same schema as the Syslog table. At this point, switching the table plan via the UI is not possible, so I recommend using a PowerShell script to create the DCR-based custom table.
Once the DCR-based custom table is created, implement a DCR transformation to split the table.
Configure the total retention period of the Auxiliary table (this configuration is done while creating the table).

Let's get started

Use Case: In this demo, I'll split the Syslog table and route "Informational" logs to the Auxiliary table.

Creating a DCR-based custom table: Previously a complex task, creating custom tables is now easy, thanks to a PowerShell script by MarkoLauren. Simply input the name of an existing table, and the script creates a DCR-based custom table with the same schema. Let's see it in action now:

Download the script locally.
Update the resourceID details in the script and save it.
Upload the updated script to Azure Cloud Shell.
Load the file and enter the table name from which you wish to copy the schema. In my case, it's the "Syslog" table.
Enter the new table name, table type, and total retention period.

**Note** We highly recommend you review the PowerShell script thoroughly and test it properly before executing it in production. We don't take any responsibility for the script.

As you can see, the Aux_Syslog_CL table has been created. Let's validate it in the Log Analytics workspace > Tables section.

Now that the Auxiliary table has been created, the next step is to implement the transformation logic at the data collection rule level.

The next step is to update the Data Collection Rule template to split the logs

Since we already created the custom table, we need a transformation that splits the Syslog table and routes the logs with SeverityLevel "info" to the Auxiliary table. Let's see how it works:

Browse to the Data Collection Rule blade.
Open the DCR for the Syslog table, then click Export template > Deploy > Edit template.
In the dataFlows section, I've created 2 streams for splitting the logs (see the transformation sketch at the end of this post). Details about the streams:
1st stream: Drops the Syslog messages where SeverityLevel is "info" and sends the rest of the logs to the Syslog table.
2nd stream: Captures all Syslog messages where SeverityLevel is "info" and sends those logs to the Aux_Syslog_CL table.
Save and deploy the updated template.

Let's see if it works as expected

Browse to Azure > Microsoft Sentinel > Logs, and query the Auxiliary table to confirm that data is being ingested into it. As we can see, the logs where SeverityLevel is "info" are being ingested into the Aux_Syslog_CL table and the rest of the logs are flowing into the Syslog table. Some nice cost savings are coming your way - hope this helps!
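For reference, here is a minimal sketch of what the two transformKql expressions in the dataFlows section might look like. This is illustrative - match the exact stream names, destinations, and severity values to your own DCR and Syslog configuration:

// Stream 1 - keep everything except informational messages in the Syslog table
source
| where SeverityLevel != "info"

// Stream 2 - route informational messages to the Aux_Syslog_CL custom table
source
| where SeverityLevel == "info"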
General Availability of Azure Monitor Network Security Perimeter Features

We're excited to announce that Azure Monitor Network Security Perimeter features are now generally available! This update is an important step forward for Azure Monitor's security, providing comprehensive network isolation for your monitoring data. In this post, we'll explain what Network Security Perimeter is, why it matters, and how it benefits Azure Monitor users.

Network Security Perimeter is purpose-built to strengthen network security and monitoring, enabling customers to establish a more secure and isolated environment. As enterprise interest grows, it's clear that this feature will play a key role in elevating the protection of Azure PaaS resources against evolving security threats.

What is Network Security Perimeter and Why Does It Matter?

Network Security Perimeter is a network isolation feature for Azure PaaS services that creates a trusted boundary around your resources. Azure Monitor's key components (like Log Analytics workspaces and Application Insights) run outside of customer virtual networks; Network Security Perimeter allows these services to communicate only within an explicit perimeter and blocks any unauthorized public access. In essence, the perimeter acts as a virtual firewall at the Azure service level – by default it restricts public network access to resources inside the perimeter, and only permits traffic that meets your defined rules. This prevents unwanted network connections and helps prevent data exfiltration (sensitive monitoring data stays within your control).

For Azure Monitor customers, Network Security Perimeter is a game-changer. It addresses a common enterprise request for "zero trust" network security on Azure's monitoring platform. Previously, while you could use Private Link to secure traffic from your VNets to Azure Monitor, Azure Monitor's own service endpoints were still accessible over the public internet. The perimeter closes that gap by enforcing network controls on Azure's side. This means you can lock down your Log Analytics workspace or Application Insights resource to only accept data from specific sources (e.g., certain IP ranges, or other resources in your perimeter) and only send data out to authorized destinations. If anything or anyone outside those rules attempts to access your monitoring resources, Network Security Perimeter will deny the request and log the attempt for auditing.

In short, Network Security Perimeter brings a new level of security to Azure Monitor: it allows organizations to create a logical network boundary around their monitoring resources, much like a private enclave. This is crucial for customers in regulated industries (finance, government, healthcare) who need to ensure their cloud services adhere to strict network isolation policies. By using the perimeter, Azure Monitor can be safely deployed in environments that demand no public exposure and thorough auditing of network access. It's an important step in strengthening Azure Monitor's security posture and aligning with enterprise zero-trust networking principles.

Key Benefits of Network Security Perimeter in Azure Monitor

With Network Security Perimeter now generally available, Azure Monitor users gain several powerful capabilities:

🔒 Enhanced Security & Data Protection: Azure PaaS resources in a perimeter can communicate freely with each other, but external access is blocked by default.
You define explicit inbound/outbound rules for any allowed public traffic, ensuring no unauthorized network access to your Log Analytics workspaces, Application Insights components, or other perimeter resources. This greatly reduces the risk of data exfiltration and unauthorized access to monitoring data.

⚖️ Granular Access Control: Network Security Perimeter supports fine-grained rules to tailor access. You can allow inbound access by specific IP address ranges or Azure subscription IDs, and allow outbound calls to specific fully qualified domain names (FQDNs). For example, you might permit only your corporate IP range to send telemetry to a workspace, or allow a workspace to send data out only to contoso-api.azurewebsites.net. This level of control ensures that only trusted sources and destinations are used.

📜 Comprehensive Logging & Auditing: Every allowed or denied connection governed by Network Security Perimeter can be logged. Azure Monitor's Network Security Perimeter integration provides unified access logs for all resources in the perimeter. These logs give you visibility into exactly what connections were attempted, from where, and whether they were permitted or blocked. This is invaluable for auditing and compliance – for instance, proving that no external IPs accessed your workspace, or detecting unexpected outbound calls. The logs can be sent to a Log Analytics workspace or storage account for retention and analysis (see the query sketch at the end of this post).

🔧 Seamless Integration with Azure Monitor Services: Network Security Perimeter is natively integrated across Azure Monitor's services and workflows. Log Analytics workspaces and Application Insights components support Network Security Perimeter out of the box, meaning ingestion, queries, and alerts all enforce perimeter rules behind the scenes. Azure Monitor alerts (scheduled query rules) and action groups also work with Network Security Perimeter, so alert notifications and automation actions respect the perimeter (for example, an alert sending to an Event Hub will check Network Security Perimeter rules). This end-to-end integration ensures that securing your monitoring environment with Network Security Perimeter doesn't break any functionality – everything continues to work, but within your defined security boundary.

🤝 Consistent, Centralized Management: Network Security Perimeter introduces a uniform way to manage network access for multiple resources. You can group resources from different services (and even different subscriptions) into one perimeter and manage network rules in one place. This "single pane of glass" approach simplifies operations: network admins can define a perimeter once and apply it to all relevant Azure Monitor components (and other supported services). It's a more scalable and consistent method than maintaining disparate firewall settings on each service. Network Security Perimeter uses Azure's standard API and portal experience, so setting up a perimeter and rules is straightforward.

🌐 No-Compromise Isolation (with Private Link): Network Security Perimeter complements existing network security options. If you're already using Azure Private Link to keep traffic off the internet, Network Security Perimeter adds another layer of protection. Private Link secures traffic between your VNet and Azure Monitor; Network Security Perimeter secures Azure Monitor's service endpoints themselves.
Used together, you achieve defense in depth: for example, a workspace can be accessible only via private endpoint and only accept data from certain sources due to Network Security Perimeter rules. This layered approach helps meet even the most stringent security requirements.

In conclusion, Network Security Perimeter for Azure Monitor provides strong network isolation, flexible control, and visibility – all integrated into the Azure platform. It helps organizations confidently use Azure Monitor in scenarios where they need to lock down network access and simplify compliance.

For detailed information on configuring Azure Monitor with a Network Security Perimeter, please refer to the following link: Configure Azure Monitor with Network Security Perimeter.
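To follow up on the auditing capability above, here's a hypothetical sketch of how you might review perimeter activity once access logs are flowing into a Log Analytics workspace. The table name (NSPAccessLogs) and the use of the Category column are assumptions - verify the actual access-log schema available in your workspace before relying on this:

// Count recent perimeter access-log entries by category (e.g., allowed vs. denied inbound)
NSPAccessLogs
| where TimeGenerated > ago(24h)
| summarize Attempts = count() by Category
| order by Attempts desc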
Introducing the Improved Search Job Experience in Azure Monitor Log Analytics

A search job is an asynchronous query that runs on any data in your Log Analytics workspace, including data in long-term retention, making the results available for further queries in a new Analytics table within your workspace. To efficiently search massive datasets, a search job divides the query into smaller time-based segments, processes them in parallel, and returns the results. This approach optimizes scalability and enables reliable analysis, even over petabytes of data. We're excited to announce significant enhancements to search jobs, designed to make large-scale data exploration faster, easier, and more efficient.

What's New in Search Job

Our latest update includes several powerful improvements:
An intuitive and streamlined UI experience for faster and simpler setup.
A cost estimation preview before running a search job.

Previously, we had system limitations in place to ensure stability. Now, as more customers use search jobs, we're removing most of these limits to enhance your experience:
Result limits are being increased, with support for up to 100 million records coming soon.
Enhanced concurrency, allowing more jobs to run in parallel.
Removal of the search date-range limit - any date range across the table's retention is now supported.

These updates make it easier to explore massive datasets while giving you greater control over costs and performance.

Explore the New UI Experience

Let's walk through a familiar scenario to showcase the new UI. Imagine you want to check whether a specific client IP address has repeatedly accessed your system over the past year, as part of investigating suspicious activity. With the new search job experience, scanning through massive volumes of logs is now fast, simple, and intuitive.

Step-by-Step:
Start by typing your query or selecting the relevant table - here, we're querying the SecurityEvent table for a suspicious IP address (see the query sketch at the end of this post).
Open the ellipsis menu (…) on the right and choose "Search Job".
Use the time picker to set your date range. For example, select 'Last year' to view a full year of activity, or choose a longer period if needed.
Name your new results table, such as SecurityEventJuly25.
Before running the job, you'll see an approximate cost estimation, helping you decide whether to proceed with the query.
Click Run to launch the search job. A new table is created in your workspace, allowing you to analyze results efficiently without impacting performance.

This new UI flow makes it seamless to handle even large-scale investigations like this, with fewer clicks and better visibility along the way.

What's Next?

We're continuing to enhance search jobs with broader KQL operator support and additional features. Stay tuned for more updates! For a deeper dive into all these improvements, check out the full documentation at https://aka.ms/LogAnalyticsSearchJobs. For questions or feedback, feel free to leave a comment on the blog or use the "Give feedback" form directly in the Logs UI.
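As an illustration of the walkthrough above, here's roughly what the pieces look like in KQL. The IP address is a placeholder, and the _SRCH suffix on the results table reflects the usual naming convention for search job result tables - double-check the exact table name created in your workspace:

// Query submitted as the search job (placeholder IP)
SecurityEvent
| where IpAddress == "203.0.113.45"

// Once the job completes, analyze the results from the new Analytics table
SecurityEventJuly25_SRCH
| summarize Hits = count() by bin(TimeGenerated, 1d), Account
| order by TimeGenerated asc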
Public Preview: Smarter Troubleshooting in Azure Monitor with AI-powered Investigation

Investigate smarter – click, analyze, and easily mitigate with Azure Monitor investigations! We are excited to introduce the public preview of Azure Monitor issues and investigations. These new capabilities are designed to enhance your troubleshooting experience and streamline the process of resolving health degradations in your applications and infrastructure.
Announcing the Public Preview of Azure Monitor health models

Troubleshooting modern cloud-native workloads has become increasingly complex. As applications scale across distributed services and regions, pinpointing the root cause of performance degradation or outages often requires navigating a maze of disconnected signals, metrics, and alerts. This fragmented experience slows down troubleshooting and burdens engineering teams with manual correlation work.

We address these challenges by introducing a unified, intelligent concept of workload health that's enriched with application context. Health models streamline how you monitor, assess, and respond to issues affecting your workloads. Built on Azure service groups, they provide an out-of-the-box model tailored to your environment, consolidate signals to reduce alert noise, and surface actionable insights — all designed to accelerate detection, diagnosis, and resolution across your Azure landscape.

Overview

Azure Monitor health models enable customers to monitor the health of their applications with ease and confidence. These models use the Azure-wide workload concept of service groups to infer the scope of workloads and provide out-of-the-box health criteria based on platform metrics for Azure resources.

Key Capabilities

Out-of-the-Box Health Model
Customers often struggle with defining and monitoring the health of their workloads due to the variability of metrics across different Azure resources. Azure Monitor health models provide a simplified out-of-the-box health experience built using Azure service group membership. Customers can define the scope of their workload using service groups and receive default health criteria based on platform metrics. This includes recommended alert rules for various Azure resources, ensuring comprehensive monitoring coverage.

Improved Detection of Workload Issues
Isolating the root cause of workload issues can be time-consuming and challenging, especially when dealing with multiple signals from various resources. The health model aggregates health signals across the model to generate a single health notification, helping customers isolate the type of signal that became unhealthy. This enables quick identification of whether the issue is related to backend services or user-centric signals.

Quick Impact Assessment
Assessing the impact of workload issues across different regions and resources can be complex and slow, leading to delayed responses and prolonged downtime. The health model provides insights into which Azure resources or components have become unhealthy, which regions are affected, and the duration of the impact based on health history. This allows customers to quickly assess the scope and severity of issues within the workload.

Localize the Issue
Identifying the specific signals and resources that triggered a health state change can be difficult, leading to inefficient troubleshooting and resolution processes. Health models inform customers which signals triggered the health state change and which service group members were affected. This enables quick isolation of the source of trouble and notification of the relevant team, streamlining the troubleshooting process.

Customizable Health Criteria for Bespoke Workloads
Many organizations operate complex, bespoke workloads that require their own specific health definitions. Relying solely on default platform metrics can lead to blind spots or false positives, making it difficult to accurately assess the true health of these custom applications.
Azure Monitor health models allow customers to tailor health assessments by adding custom health signals. These signals can be sourced from Azure Monitor data such as Application Insights, Managed Prometheus, and Log Analytics (see the query sketch at the end of this post). This flexibility empowers teams to tune the health model to reflect the unique characteristics and performance indicators of their workloads, ensuring more precise and actionable health insights.

Getting Started

Ready to simplify and accelerate how you monitor the health of your workloads? Getting started with Azure Monitor health models is easy — and during the public preview, it's completely free to use. Pricing details will be shared ahead of general availability (GA), so you can plan with confidence.

Start Monitoring in Minutes

Define your service group: Create your service group and add the relevant resources as members. If you don't yet have access to service groups, you can join here.
Create your health model: In the Azure portal, navigate to Health Models and create your first model. You'll get out-of-the-box health criteria applied automatically.
Customize to fit your needs: In many cases the default health signals may suit your needs, but customization is supported as well.
Investigate and act: Use the health timeline and our alerting integration to quickly assess impact, isolate issues, and take action — all from a single pane of glass.

You can access health models today in the Azure portal! For more details on how to get started with health models, please refer to our documentation.

We Want to Hear From You

Azure Monitor health models are built with our customers in mind — and your feedback is essential to shaping the future of this experience. Whether you're using the out-of-the-box health model or customizing it to fit your unique workloads, we want to know what's working well and where we can improve.

Share Your Feedback
Use the "Give Feedback" feature directly within the Azure Monitor health models experience to send us your thoughts in context.
Post your ideas in the Azure Monitor community.
Prefer email? Reach out to us at azmonhealthmodels@service.microsoft.com — we're listening.

Your insights help us prioritize features, improve usability, and ensure Azure Monitor continues to meet the evolving needs of modern cloud-native operations.
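To give a feel for the kind of Log Analytics-based custom health signal mentioned above, here's an illustrative sketch of a query that could back a signal tracking request failure rate. The table and threshold are assumptions for the example - the actual signal is configured inside the health model experience, and your workload may key off different telemetry:

// Percentage of failed requests over the last 15 minutes - could back a custom health signal
AppRequests
| where TimeGenerated > ago(15m)
| summarize FailureRatePercent = 100.0 * countif(Success == false) / count()

A health model could, for instance, treat the component as degraded when this value crosses a few percent and unhealthy at a higher threshold - tune both to your own service-level objectives.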