Cloud Native Monitoring for Applications on Microsoft Azure

Published May 11 2021 06:31 AM
Microsoft

Introduction

Application management is the core function of maintaining existing application portfolios. While traditional approaches to application management can constrain enterprises and hamper modernization and digital transformation initiatives, the latest monitoring and automatic alerting capabilities can help increase speed and agility.

With the advent of cloud platforms, most organizations now have an application footprint that resides in the cloud. This trend is increasing over time and brings about a paradigm shift in how cloud-hosted applications must be monitored. Most cloud platforms, including Microsoft Azure, provide their own set of native tools that enable application monitoring. This article covers the relationship between application management and monitoring tools, recommendations on how to choose the right monitoring tool and DXC Technology’s automation solution for monitoring applications in Microsoft Azure.

 

Application management services and monitoring tools: How they play together

Application management services address ongoing support for applications, typically involving defect repair and issues that arise in a supported application. Application issues are often reported by users or customers before the support team realizes an issue exists, which leads to poor customer satisfaction. More often than not, a portion of these issues is repetitive, which means the same resolution procedure has to be applied manually over and over again. These manual and repetitive tasks often account for up to 40 percent of the application support team’s workload.

This is where monitoring tools can play a major role. By using techniques such as real-user monitoring and synthetic monitoring, it is possible to proactively identify issues with the application. Real-user monitoring can detect issues that occur in the application during use. Synthetic monitoring enables checking whether the application is available and also enables simulating a user for specific scenarios to test whether the application is working as expected.
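As a rough illustration of the availability side of synthetic monitoring, the sketch below probes a URL and reports whether it answered with a success status. It is a minimal standard-library example, not how Azure's URL ping tests are implemented; the function name `check_availability` and the injectable `opener` parameter are assumptions made here for illustration and testability.

```python
import urllib.error
import urllib.request
from typing import Callable


def check_availability(
    url: str,
    timeout_seconds: float = 5.0,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Return True if the URL answers with an HTTP 2xx status in time."""
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError (4xx/5xx) and timeouts all derive from OSError.
        return False
```

A scheduler would call such a check at a fixed interval and raise an alert after a number of consecutive failures; scenario tests that simulate a user go further by stepping through several requests in sequence.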

Troubleshooting an issue when a monitoring tool is in place means that the support team uses real data points captured by the tool, which enables it to identify the root cause of the issue — thus eliminating the need for guesswork. By adding hooks to these monitoring mechanisms, it is possible to automatically detect when an issue or incident occurs in a specific application and also to alert relevant stakeholders when the issue occurs.

These detection and reporting capabilities mean the application support team can be made aware of the issue immediately and doesn’t have to wait for users or customers to report it. Automatic detection and alerting enable the team to respond to the incident faster, reducing the mean time to restore (MTTR). Depending on the scenario, real-time notification means the application support team may be able to either fix the issue before it is found by the user or at least show the user an upfront warning message that a specific feature is unavailable and undergoing maintenance.
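To make the MTTR figure concrete, here is a small sketch of how it could be computed from incident records. It assumes the clock runs from detection to restoration and that each incident is a `(detected, restored)` pair; both the function name and that record shape are illustrative assumptions, not something the article prescribes.

```python
from datetime import datetime, timedelta


def mean_time_to_restore(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Average of (restored - detected) over all incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (restored - detected for detected, restored in incidents),
        timedelta(),  # start value so that timedeltas can be summed
    )
    return total / len(incidents)


# Two incidents restored after 30 and 90 minutes respectively.
example = [
    (datetime(2021, 5, 1, 9, 0), datetime(2021, 5, 1, 9, 30)),
    (datetime(2021, 5, 2, 14, 0), datetime(2021, 5, 2, 15, 30)),
]
mttr = mean_time_to_restore(example)
```

Because automatic detection moves the detection timestamp earlier than a user report would, the same calculation also shows how much earlier restoration work can begin.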

Correspondingly, for a subset of the repetitive issues, it may be possible to automate a sequence of manual steps to arrive at an automatic resolution. Implementing the appropriate monitoring solution, coupled with automation capabilities, makes it possible to lower costs by up to 30 percent.

 

Application monitoring strategies

As more enterprises adopt cloud platforms for hosting applications, it is important to have a strategy to monitor the applications. This enables the application management team to respond quickly to issues that arise.

Cloud platform vendors such as Microsoft provide cloud-native monitoring tools for their cloud platforms. In addition, vendors that used to provide tools for monitoring applications hosted on-premises have jumped onto the cloud bandwagon and now provide tools for monitoring in the cloud. As the focus of this article is on cloud-native monitoring, only the key advantages of the cloud-native monitoring tools are covered here:

Key advantages of cloud-native monitoring tools

  • No installation requirements – one can provision and configure the tools, and monitoring of the application starts immediately.
  • No additional licensing requirements – the cost of the monitoring tool is charged like any other Azure resource as part of the monthly cloud spend.

The DXC approach: Application Service Automation and AMS

DXC Technology’s approach to monitoring and automation, known as Application Service Automation (ASA), includes a modular framework that covers the entire closed-loop automation cycle from automatic detection to correction.

This framework is both platform- and technology-agnostic and can be adapted to various tool stacks based on customer needs. The solution described next adheres to DXC’s underlying ASA framework but is based fully on cloud-native tools provided by Microsoft Azure. The next section presents how Azure-native tools can be leveraged to provide an Azure-based variant of DXC’s ASA framework.

For more information on DXC’s Application Management Services and ASA, visit DXC Application Management Services

 

Leveraging Azure cloud-native monitoring

This solution focuses on providing application monitoring and adds some custom-built automation features. It provides a high-level guideline on how the cloud-native monitoring tools provided by the Azure platform can be used in combination to provide a successful monitoring solution. Here is an introduction to the tools that would be used as part of the solution.

 

Azure cloud-native monitoring tools

The Azure platform brings with it monitoring and orchestration tools that have been incorporated into this solution. Depicted below is a high-level reference view of the Azure-native tools used.

 


Figure 1 – Microsoft Azure cloud-native monitoring tools

The in-scope applications shown in the diagram represent the portfolio of applications for which cloud-native monitoring using Azure-native tools needs to be enabled. The IT Service Management (ITSM) tool shown on the right-hand side represents the customer’s ticketing tool where the incident details are captured.

Azure Monitor is the comprehensive monitoring service native to the Azure cloud platform; it can collect and analyze telemetry data from applications. It comprises several tools, each supporting a different aspect of application monitoring. The main ones are:

  • Azure Monitor for VMs and Azure Monitor for Containers provide monitoring of the infrastructure components.
  • Azure Application Insights enables tracing of user requests as they travel through an application and can collect telemetry that is emitted from applications. It also provides features such as multistep web tests and URL ping tests, which enable synthetic monitoring.
  • Azure Monitor Alerts provide a mechanism to trigger an action based on the evaluation of a metric captured in Azure Application Insights and Azure Monitor.
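The idea behind an alert, evaluating an aggregated metric against a threshold and triggering an action, can be sketched in a few lines. This is a deliberately simplified, hypothetical model: `ThresholdRule` and `evaluate` are names invented here and do not correspond to the Azure Monitor API, which offers more operators, time windows and evaluation frequencies.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical, simplified stand-in for a metric alert rule."""

    metric_name: str
    aggregate: Callable[[Sequence[float]], float]  # e.g. sum, max, statistics.mean
    threshold: float

    def should_fire(self, samples: Sequence[float]) -> bool:
        """Fire when the aggregated samples in the window exceed the threshold."""
        if not samples:
            return False  # no data in the window: do not fire
        return self.aggregate(samples) > self.threshold


def evaluate(rule: ThresholdRule, samples: Sequence[float], action: Callable[[str], None]) -> bool:
    """Run the action (for example, notify a group) if the rule fires."""
    if rule.should_fire(samples):
        action(rule.metric_name)
        return True
    return False
```

For example, a rule built as `ThresholdRule("failed_requests", sum, 5)` would fire on the window `[2, 3, 1]`, because the total of 6 exceeds the threshold of 5.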

Azure Logic Apps are a serverless compute capability available on the Azure platform and are used as part of the automated resolution strategy of the solution. DXC’s solution approach for monitoring applications based on Azure Application Insights is defined below.

 

Solution approach

In-scope application: enabling Application Insights

The in-scope application is first enabled to emit telemetry by adding the appropriate Application Insights software development kit (SDK), for example the Java SDK or the .NET SDK, to the application. At the time of writing, Azure Application Insights has SDKs for Java, .NET, ASP.NET Core, Node.js, Python and JavaScript.

The application is then recompiled and deployed onto the Azure virtual machine (VM). Only the addition of the SDK is needed; no further intrusive code changes are necessary. Once this is done, the application starts to emit telemetry, which is captured in Azure Application Insights.

Application Insights also supports monitoring based on “codeless attach” or “auto-instrumentation,” where applications can be monitored without the need to add the SDK. This approach is still evolving, and not all scenarios are supported yet. For the latest information on this approach, refer to the Microsoft documentation on the topic.

 


Figure 2 – Azure Application Insights dashboard

As can be seen in Figure 2, any exceptions that the application emits are also captured, under the “Failed Requests” section. The logical next step is to propagate this error to the ITSM tool and ensure that a ticket is raised. This can be done either by

  • using an Azure Alert and then attaching an Action Group that calls a Logic App, or
  • using a Logic App directly to poll the Application Insights Logs for exceptions and then raise a ticket in the ITSM tool via the DXC CASA module. (CASA, short for “Controller for ASA,” is DXC’s custom-built module that helps integrate the different toolsets.)

Here we take the latter approach since it provides additional flexibility.

 


Figure 3 – Azure Logic Apps for automatic ITSM ticket creation

 

Figure 3 shows the Azure Logic App that queries Application Insights by using the built-in Connector for Azure Application Insights. Once the exception details are retrieved from the query, the Logic App creates a ticket in the ITSM tool using connectors. As an example, a ServiceNow connector is shown here.
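The two steps in that flow, querying for recent exceptions and turning each result into a ticket, can be sketched outside Logic Apps as well. The example below is only an approximation of the idea: it builds a Kusto (KQL) query against the Application Insights `exceptions` table and maps a result row to a ticket payload. The helper names and the payload fields `short_description` and `description` are assumptions made for illustration; the solution described here uses Logic Apps connectors and the DXC CASA module rather than Python, and a real ITSM tool defines its own schema.

```python
from typing import Any


def build_exceptions_query(lookback_minutes: int) -> str:
    """Build a KQL query for exceptions logged in the last N minutes."""
    if lookback_minutes <= 0:
        raise ValueError("lookback_minutes must be positive")
    return (
        "exceptions\n"
        f"| where timestamp > ago({lookback_minutes}m)\n"
        "| project timestamp, type, outerMessage, operation_Name"
    )


def to_incident_payload(row: dict[str, Any]) -> dict[str, str]:
    """Map one exception row to a ticket payload (field names are assumed)."""
    return {
        "short_description": f"{row['type']} in {row['operation_Name']}",
        "description": f"{row['timestamp']}: {row['outerMessage']}",
    }
```

Matching the lookback window to the polling interval helps avoid both gaps and duplicate tickets; a production flow would typically also de-duplicate on an identifier such as the operation or problem ID.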

Here, serverless computing elements such as Azure Logic Apps are used to integrate and orchestrate error reporting to the ITSM tool and to notify relevant stakeholders. Using serverless components such as Logic Apps to orchestrate error detection and resolution brings several advantages. Logic Apps tend to be low-maintenance since Azure handles patches and updates, unlike a virtual machine (VM), which would itself add management overhead. Logic Apps are based on the concept of “low-code, no-code” development and support rapid implementation of the required logic with low effort. Logic Apps also store a run history for each instance that has run and provide a historical visualization of each run, including the runtime values present at the time of the run.

 


Figure 4 – Azure Logic Apps – run history

If an error occurs in the application, this error is captured in Azure Application Insights. By means of the orchestration built with the help of Azure Logic Apps and DXC CASA, this error is now propagated to the ITSM layer.

If we assume that this issue is a frequent and repetitive one, an evaluation is done to determine whether it has the potential to be automated. If automation is found to be feasible, it is implemented, so that the issue can also be fixed automatically.
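A first pass at spotting frequent and repetitive issues can be as simple as counting tickets per category. The sketch below assumes that tickets already carry a category label and uses an arbitrary cut-off of three occurrences; both are illustrative assumptions, and frequency alone does not establish that an automated fix is feasible.

```python
from collections import Counter


def automation_candidates(
    ticket_categories: list[str],
    min_occurrences: int = 3,
) -> list[tuple[str, int]]:
    """Return (category, count) pairs that recur often enough, most frequent first."""
    counts = Counter(ticket_categories)
    return [
        (category, count)
        for category, count in counts.most_common()
        if count >= min_occurrences
    ]


# Hypothetical ticket history: only "disk_full" reaches the cut-off.
history = ["disk_full", "timeout", "disk_full", "disk_full", "timeout", "login"]
candidates = automation_candidates(history)
```

The resulting shortlist is only the input to the evaluation described above, which still has to judge whether the manual resolution steps can be scripted safely.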

 


Figure 5 – Automated resolution using Logic Apps

The Logic Apps layer serves as both the resolution orchestrator and the implementation of the actual resolution flow. The resolution orchestrator receives the open tickets from the ITSM tool via the DXC CASA module and then delegates the resolution to the specific resolve flow. Azure Logic Apps are also leveraged to implement the specific automated resolution (the resolve flow). The automation responsible for fixing the issue is both scenario-specific and application-specific, so further details about specific issues are not covered in this article. The DXC CASA module ensures that the ticket opening and closure data is propagated to the Data Lake, which powers the Dashboard. This Dashboard layer provides insights into the workings of the entire closed-loop automation via various graphical charts.

Once the application- and scenario-specific automation runs, the issue is fixed automatically. The automation mechanism would also close the ticket that was created once the fix has been applied, thus providing an end-to-end automation of the issue.
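The orchestrator pattern described above (receive open tickets, delegate each one to a scenario-specific resolve flow, close the ticket once the fix succeeds) can be sketched as a simple dispatch table. This is a conceptual illustration only: the `Ticket` class, the `orchestrate` function and the category keys are hypothetical, and in the actual solution both roles are played by Logic Apps with tickets arriving via the DXC CASA module.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Ticket:
    ticket_id: str
    category: str
    status: str = "open"


def orchestrate(
    tickets: list[Ticket],
    resolve_flows: dict[str, Callable[[Ticket], bool]],
) -> list[str]:
    """Delegate each open ticket to its resolve flow and close it on success.

    Returns the IDs of the tickets that were closed.
    """
    closed: list[str] = []
    for ticket in tickets:
        if ticket.status != "open":
            continue
        flow = resolve_flows.get(ticket.category)
        if flow is None:
            continue  # no automation for this scenario; leave it to the support team
        if flow(ticket):  # the flow reports whether the fix was applied
            ticket.status = "closed"
            closed.append(ticket.ticket_id)
    return closed
```

Keeping the orchestrator separate from the resolve flows means a new scenario only requires registering one more flow, while tickets without a matching flow stay open for manual handling.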

 

Key benefits

  • Improving resiliency by ensuring a higher uptime for the application while lowering the MTTR
  • Improving customer experience by detecting and reacting to issues before the customer is affected.
  • Leveraging automation to accelerate processes, reducing redundant manual efforts
  • Reducing cost, as the overall cost of a solution built using the described approach will be much lower than solutions based on any of the leading third-party tools.

Conclusion

This article covers the relationship between application management services and monitoring tools, application-monitoring strategies, and cloud-native and third-party monitoring tools. Its recommendation with regard to monitoring tools is to find the right fit of the tool based on business criticality and the type of application. Microsoft Azure provides all the relevant building blocks required to weave together an end-to-end solution that includes application monitoring, automated ticket creation and automated resolution. For more information, visit: DXC Application Service Automation.

 


 

Vikram Srivatsa is a senior architect and part of the Worldwide Applications Service Line at DXC Technology, based in Bengaluru, India. He has vast experience in architecting enterprise applications, and his current area of focus includes creation of cloud-native solutions for the enterprise.


 

Ashish Thakur is a product engineer for Application Services at DXC Technology, based in Noida, India. He is knowledgeable in cloud and service delivery automation using Azure-native tools and in architecting solutions and building proofs of concept.

Version history
Last update: Apr 08 2022 11:16 AM