
Azure Observability Blog

What’s new in Observability and Resiliency @ Ignite 2024

Julie_Wang
Microsoft
Nov 19, 2024

Observability and resiliency are crucial in IT operations as they ensure the continuous performance and reliability of systems. With Azure Monitor, organizations can gain full-stack visibility into their applications and infrastructure, enabling proactive issue detection and resolution. Combined with powerful tools like Azure Chaos Studio, customers can further enhance resiliency at scale by leveraging controlled experimentation to test and improve systems.

At Ignite 2024, we are excited to announce new capabilities that will help enhance the performance of your business-critical applications and workloads across cloud-to-edge environments.

Improvements we’re making in the observability space

Simplified user experience

Across Azure Monitor, we are striving to make the user experience as simple and intuitive as possible so you can focus on mission-critical tasks and troubleshoot with ease. We are excited to announce a few features that will enable this, some available today and others coming soon.

First, we’re making it easier to see all your Prometheus metrics, managed infrastructure, and container logs in a single pane of glass with the new unified monitoring blade for AKS (public preview by early January 2025), allowing you to easily detect where the issues lie across the stack. Our new blended experience offers basic monitoring out of the box, free for all AKS customers, with the option to upgrade monitoring through managed Prometheus and container logs. Learn more: aka.ms/unifiedAKSmonitoring

Until now, Azure Monitor Logs relied on Kusto Query Language (KQL) for users to express their questions as queries. KQL is a powerful language, but it takes some knowledge to use. The Log Analytics simple mode experience (public preview) was created to bridge that knowledge gap: it exposes the most popular KQL operators and actions through a simple point-and-click experience that requires no KQL knowledge at all. You can easily switch from Simple mode to KQL mode to get the full power of KQL for deeper insights. Simple mode will be generally available in early January 2025. Learn more: aka.ms/LogAnalyticsSimpleMode

Navigate traditional log search alerts in a simpler and faster way with event-based alerts (public preview by early January 2025). Unlike traditional log search alerts, which aggregate rows over a defined period, event-based alerts evaluate each row individually. This feature can also help customers using Basic Logs to create alerts, as it offers a "simple mode" for log search alerts defined on a single row in a specific Log Analytics table. Stay tuned for more information!
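To make the difference between the two evaluation styles concrete, here is a minimal conceptual sketch in Python. It is not how Azure Monitor implements alerts; the `LogRow` fields, the function names, and the sample data are all hypothetical and exist only to contrast "aggregate over a window" with "evaluate each row."

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class LogRow:
    """One hypothetical log record; field names are illustrative only."""
    timestamp: int  # seconds since the start of the evaluation window
    level: str
    message: str


def aggregated_alert(rows: Iterable[LogRow],
                     predicate: Callable[[LogRow], bool],
                     threshold: int) -> bool:
    """Aggregated style: count matching rows in the window and fire
    at most once, only if the count exceeds the threshold."""
    matches = sum(1 for row in rows if predicate(row))
    return matches > threshold


def event_based_alerts(rows: Iterable[LogRow],
                       predicate: Callable[[LogRow], bool]) -> List[LogRow]:
    """Event-based style: every row is evaluated on its own, so each
    matching row produces its own alert."""
    return [row for row in rows if predicate(row)]


def is_error(row: LogRow) -> bool:
    return row.level == "Error"


window = [
    LogRow(5, "Info", "service started"),
    LogRow(42, "Error", "disk quota exceeded"),
    LogRow(90, "Error", "upstream timeout"),
]

fired = aggregated_alert(window, is_error, threshold=2)
events = event_based_alerts(window, is_error)
```

With this sample window, the aggregated rule stays silent because two errors do not exceed a threshold of two, while the event-based rule surfaces both error rows individually.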

We are also making it easier for you to take full advantage of Azure Monitor’s application observability capabilities for your Java and Node.js applications running on Azure Kubernetes Service (AKS) with auto-instrumentation (public preview by early January 2025). You’ll soon be able to enable OpenTelemetry-based application monitoring for your AKS namespaces without modifying any application code and get detailed observability into every step of your end-to-end transactions, across all your microservices. Stay tuned for more information!

Next, combine monitoring and diagnostics in one place with the integration of Performance Diagnostics with Azure Monitor (public preview by early January 2025). With the integration of the Performance Diagnostics tool in the VM Overview Monitoring tab and VM Insights blade of Azure Monitor, you’ll be able to troubleshoot directly within your Azure Monitor workflow and easily access continuous and on-demand insights, recommendations, and diagnostics data for VM performance issues. Stay tuned and learn more by visiting the documentation.

Finally, leverage Application Insights Code Optimizations to help you identify and resolve performance bottlenecks at the code level in your .NET applications running on Azure. Utilizing an advanced AI-based model, it analyzes Application Insights profiler traces and displays actionable next steps in the Azure portal at no additional cost. This saves you time and increases your productivity, from detection in the Azure portal to code-level resolution with GitHub Copilot. More details in this blog or documentation.

Enhanced data visibility

Gain richer context and improved visibility into your workloads with new Kubernetes metadata and logs filtering capabilities (generally available today) in Azure Monitor Container Insights. This feature enhances the ContainerLogV2 schema with additional Kubernetes metadata such as PodLabels, PodAnnotations, PodUid, Image, ImageID, ImageRepo, and ImageTag. The logs filtering feature provides filtering capabilities for both workload and platform (i.e., system namespaces) logs coming out of containers. Additionally, enhance your Kubernetes metadata experience by leveraging the Grafana dashboard to visualize log levels, volume, rate, records, and much more. Learn more: aka.ms/KubernetesMetadataLogsFilteringGA
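As a rough illustration of what namespace filtering combined with richer metadata makes possible, the Python sketch below filters a handful of in-memory records. It is a conceptual model only, not the Container Insights configuration: the record layout is simplified, the `PodNamespace` and `LogMessage` field names and the `kube-system` example are assumptions, and `filter_logs` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List, Optional

# Assumption: "kube-system" stands in for the platform (system) namespaces.
SYSTEM_NAMESPACES = {"kube-system"}


def filter_logs(records: Iterable[Dict[str, Any]],
                include_system: bool = False,
                required_label: Optional[str] = None) -> List[Dict[str, Any]]:
    """Keep workload logs (and optionally platform logs), and optionally
    only records whose pod carries a given label key."""
    kept = []
    for record in records:
        if not include_system and record["PodNamespace"] in SYSTEM_NAMESPACES:
            continue
        labels = record.get("PodLabels", {})
        if required_label is not None and required_label not in labels:
            continue
        kept.append(record)
    return kept


sample = [
    {"PodNamespace": "kube-system", "PodLabels": {"k8s-app": "kube-dns"},
     "ImageTag": "v1", "LogMessage": "dns ready"},
    {"PodNamespace": "shop", "PodLabels": {"app": "checkout"},
     "ImageTag": "2.3.1", "LogMessage": "order placed"},
    {"PodNamespace": "shop", "PodLabels": {},
     "ImageTag": "0.9.0", "LogMessage": "debug noise"},
]

workload_only = filter_logs(sample)
labelled = filter_logs(sample, required_label="app")
```

Here `workload_only` drops the system-namespace record, and `labelled` narrows the result further to pods that carry an `app` label, which is the kind of slicing the added metadata is meant to support.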

Next, we are announcing the availability of Grafana 11. This version brings improved visualizations powered by new features for the Azure Monitor plugin in Grafana, including support for Basic Logs, ContainerLogV2, and Prometheus Exemplars to App Insights traces. Learn more: aka.ms/AzureManagedGrafana11

Increased support for security

Security is a top priority for Microsoft as we know how important it is for our customers. Observability is especially critical for security scenarios because it provides deep insights into network traffic and application performance, enabling proactive detection and resolution of potential security threats.

Today, we are announcing the general availability of Advanced Container Networking Services, a solution designed to address the evolving needs of modern containerized applications. By offering comprehensive network visibility and robust security features, this service enables users to confidently manage, secure, and observe the intricate network of their AKS clusters. Advanced Container Networking Services has two main pillars: Container Network Observability and Container Network Security. To learn more about Container Network Security and its capabilities, see What is Container Network Security? To learn more about Advanced Container Networking Services and its capabilities, see What is Advanced Container Networking Services?

Azure Monitor also now supports Network Security Perimeter (NSP) features, enhancing security and monitoring capabilities across 6 Azure cloud regions. NSP allows Azure PaaS resources to communicate securely within a trusted boundary and helps prevent unauthorized access and data exfiltration. Administrators can define logical network isolation boundaries and configure common public access controls for multiple PaaS resources using a uniform API and a consistent user experience. This centralized management simplifies the process of securing Azure PaaS resources.

The Network Security Perimeter (NSP) enables logging for resources inside the perimeter, providing visibility into ingress and egress traffic patterns. This helps with auditing and compliance, as well as identifying potential security threats. NSP supports complex network setups and integrates seamlessly with other Azure services, ensuring security measures are consistently applied across different services. Specific benefits of NSP include enhanced security, centralized management, granular access control, logging and monitoring, seamless integration, and support for complex setups, making it a valuable tool for enhancing network security and ensuring data integrity.

Please note that during the public preview, Azure Monitor NSP features will be operational only in 6 production regions (East US, East US 2, North Central US, South Central US, West US, and West US 2). More regions will be rolled out in mid-December, and by the end of January, all the remaining 50 regions will begin supporting NSP features in Azure Monitor. Learn more by reading the blog.

Recent updates for Azure Chaos Studio

Since Azure Chaos Studio went live one year ago at Ignite 2023, we’ve been working to enhance the platform to meet all your chaos experimentation needs. Read on to learn more about the updates we’ve made since last Ignite.

New product functionality

Pause Process fault for Windows Virtual Machines: Azure Chaos Studio now supports a new agent-based fault action for Windows virtual machines. This new agent-based fault allows customers to pause a Windows process running in an Azure virtual machine for a specified duration as part of a Chaos Experiment. This can help test applications running on virtual machines and validate their resilience to stutters, maintenance events, or resource constraints.

Network Isolation Fault for Virtual Machines: Azure Chaos Studio now supports a new agent-based fault action for Windows and Linux virtual machines and virtual machine scale sets. This allows customers to isolate an Azure virtual machine from network connections by dropping all packets (subject to certain environment limitations) for the specified duration as part of a Chaos Experiment. This can help test applications inside virtual machines and their resilience to losing network traffic. It can be used in Azure Chaos Studio by deploying templates, using the REST API, or designing experiments in the Azure portal. Details, limitations, and examples are available in the fault library.
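For readers who plan to use templates or the REST API, the Python sketch below assembles the general shape of an experiment body with one step, one branch, and one continuous action. Treat it as an assumption-laden outline rather than a reference: the selector/step/branch/action layout is written from memory of the experiment format and should be checked against the Chaos Studio documentation, `build_experiment` is a hypothetical helper, and the fault URN and target resource ID are deliberately left as placeholders to be filled in from the fault library and your own environment.

```python
import json
from typing import Any, Dict

# Placeholder: copy the real URN for the fault from the fault library.
FAULT_URN_PLACEHOLDER = "<fault-urn-from-the-fault-library>"


def build_experiment(target_id: str, duration: str = "PT5M") -> Dict[str, Any]:
    """Return an experiment body with a single continuous fault action.

    Assumption: the body uses selectors plus steps -> branches -> actions,
    and durations are ISO 8601 strings such as "PT5M".
    """
    selector_id = "vm-selector"
    return {
        "properties": {
            "selectors": [
                {
                    "type": "List",
                    "id": selector_id,
                    "targets": [{"type": "ChaosTarget", "id": target_id}],
                }
            ],
            "steps": [
                {
                    "name": "step-1",
                    "branches": [
                        {
                            "name": "branch-1",
                            "actions": [
                                {
                                    "type": "continuous",
                                    "name": FAULT_URN_PLACEHOLDER,
                                    "duration": duration,
                                    "selectorId": selector_id,
                                    "parameters": [],
                                }
                            ],
                        }
                    ],
                }
            ],
        }
    }


experiment = build_experiment("<resource-id-of-the-chaos-target>")
print(json.dumps(experiment, indent=2))
```

Keeping the body in a small builder function makes it easy to reuse the same structure for other agent-based faults by swapping the URN and parameters once they are confirmed in the fault library.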

Support for AKS faults with AKS-Managed Microsoft Entra: Azure Chaos Studio’s faults for Azure Kubernetes Service clusters, using Chaos Mesh, now work with managed identity authentication. Previously, local authentication through the Kubernetes API server was the only supported authentication method. To enable customers with an enhanced security posture for AKS clusters to perform fault injection, the Version 2.2 AKS Chaos Mesh faults now also support managed identity authentication. Learn more: aka.ms/ChaosAKSEntra

Support for resource tags: This enables users to apply tags, or metadata elements, to their Azure resources, resource groups, and subscriptions. Tagging helps them organize Azure resources and manage hierarchy.

New Azure built-in roles to simplify access management: Chaos Studio Experiment Contributor, Chaos Studio Operator, and Chaos Studio Reader are new Azure built-in roles users can utilize to manage access to Azure Chaos Studio within their organizations.

New regions supported

Azure Chaos Studio is now available in the Canada Central region. Availability for Germany West Central is coming soon (expected mid-January 2025).

For more information about Azure Chaos Studio, see Azure Chaos Studio documentation.

How to learn more about observability and resiliency at Ignite 2024

Attend the following breakout sessions and live demos:

We also encourage you to meet the team and ask your questions at the Copilot & Azure Monitor expert meet-up booth!

Continue the conversation here with us in the Observability Tech Community Blog. For more announcements, check out this blog on what’s new in operations and management from VP of Observability, Farzana Rahman.

Updated Nov 19, 2024
Version 5.0