First published on CloudBlogs on Jan 09, 2014

Over the last three “Best Practices” posts, I’ve looked at how to Plan, Build, and Deploy a Hybrid Cloud. Now that your Hybrid Cloud is up and running, it’s time to look at how to keep it running in peak form.

In this post I’ll examine two critical aspects of hybrid cloud operations: Health monitoring and self-service, including how to identify and troubleshoot some of the most common maintenance questions and obstacles.

To successfully operate your Hybrid Cloud, both of these factors will need to be planned for and actively executed on a regular basis.

Health Monitoring

Simply put: You can’t manage what you don’t monitor.

Health Monitoring (which encompasses the infrastructure with the services running and the self-service delivery) is the ability to repeatedly and consistently deploy services needed by your teams and/or your end users.

To start, I want to examine the components involved in management through monitoring. You’ll need to consider the artifacts within your Hybrid Cloud which require monitoring. These artifacts include the IIS instance that is hosting your PaaS offerings, the SQL Server database, and the applications themselves. With System Center Operations Manager 2012 R2 (SCOM) we can use multiple management packs to discover these artifacts and then report on and analyze their health, all from a single interface.

A key tool for monitoring the health of your Hybrid Cloud is the System Center Management Pack for Windows Azure Fabric, which gathers data on your public infrastructure that’s running in Azure.

This management pack gives SCOM the following functionality:

  • Discovers Windows Azure Cloud services.
  • Provides info on the status of each role instance.
  • Collects and monitors performance information per role instance.
  • Collects and monitors Windows events per role instance.
  • Collects and monitors the .NET Framework trace messages from each role instance.
  • Grooms performance, event, and the .NET Framework trace data from Windows Azure storage.
  • Changes the number of role instances.
  • Discovers Windows Azure Virtual Machines.
  • Provides status of each role instance of a Virtual Machine.
  • Discovers Windows Azure Storage services.
  • Monitors availability and size of each storage instance and optionally alerts.
  • Discovers relationships between discovered Azure resources to see which other resources a particular Azure resource is using. This information is then displayed in a topology dashboard.
  • Monitors management and cloud service certificates and sends an alert if the certificates are about to expire.
  • Includes a new Distributed Application template that lets you create distributed applications that span Azure as well as on-prem resources for Hybrid monitoring scenarios.
  • Includes a set of dashboards for the Hybrid monitoring scenarios.
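To make one of these checks concrete, here is a minimal sketch of the idea behind the certificate expiration alert in the list above. This is not how the management pack is implemented; the `Certificate` record, the `expiring_certificates` helper, and the 30-day warning window are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Certificate:
    # Hypothetical record; a real monitor would read these values
    # from the certificate itself.
    name: str
    not_after: datetime  # expiry timestamp (UTC)


def expiring_certificates(certs, now, warn_within=timedelta(days=30)):
    """Return (name, days_left) for certificates expiring within the window.

    Certificates that have already expired are included with a
    negative days_left value.
    """
    alerts = []
    for cert in certs:
        remaining = cert.not_after - now
        if remaining <= warn_within:
            alerts.append((cert.name, remaining.days))
    # Most urgent first.
    return sorted(alerts, key=lambda item: item[1])


if __name__ == "__main__":
    now = datetime(2014, 1, 9, tzinfo=timezone.utc)
    certs = [
        Certificate("management-cert", datetime(2014, 1, 20, tzinfo=timezone.utc)),
        Certificate("cloud-service-cert", datetime(2015, 1, 9, tzinfo=timezone.utc)),
    ]
    for name, days_left in expiring_certificates(certs, now):
        print(f"ALERT: {name} expires in {days_left} day(s)")
```

The point of the sketch is simply that the warning window is a tunable threshold: a longer window gives you more lead time to renew, at the cost of noisier alerts.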

Pretty impressive, right?

To get an even deeper look at your storage, use the Azure SQL Database Management Pack. The monitoring provided by this particular management pack includes availability, performance data collection, and default thresholds. This allows you to seamlessly integrate the monitoring of Windows Azure SQL Database components into your Hybrid Cloud service monitoring scenarios.

To further enhance your ability to look deep into applications, you should check out the new Microsoft Monitoring Agent (MMA) as part of the SCOM monitoring offering. The MMA allows you to bootstrap into our PaaS services and gather more detail on the applications themselves. This essentially allows application monitoring in Azure .NET applications at the same level available with our on-prem .NET monitoring. You can use the MMA with Operations Manager to report application performance issues and failures and have them forwarded to Visual Studio Team Foundation Server as work items in a project.

For more details on this new multi-homing monitoring agent, review this post here.

When you’re ready to kick off your operation in earnest, here’s a short list of Management Packs and links to knowledge that may help in your Hybrid Cloud management efforts:

Self Service

The National Institute of Standards and Technology defines cloud computing as having five essential characteristics:

  1. On-demand self-service
  2. Broad network access
  3. Resource pooling
  4. Rapid elasticity
  5. Measured service

With these five characteristics in mind, and looking at it from an IT perspective, it’s easy to see an underlying theme of rapidly deploying an automatically scalable service via self-service as the primary function of operating a cloud.

Adopting this definition may require a change in process. For example, the term “rapidly deploying” already assumes I am pre-authorizing compute/storage/network resources to a user/tenant of the cloud – but this doesn’t mean I’ve abandoned the approval model. Instead, what you’ll likely be doing is moving the approval to a place where it will happen sooner in the process – or maybe it’s a switch to a measurement/disapprove model based on over/under usage.
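As a rough illustration of what a measurement-based model could look like, the sketch below classifies a tenant by how much of its pre-authorized quota is actually in use. Everything here is hypothetical: the `TenantUsage` record, the `review_usage` helper, and the 20% and 90% thresholds are assumptions chosen for the example, not anything defined by Windows Azure Pack or System Center.

```python
from dataclasses import dataclass
from enum import Enum


class Review(Enum):
    OK = "ok"
    OVER_USAGE = "over-usage"
    UNDER_USAGE = "under-usage"


@dataclass
class TenantUsage:
    # Hypothetical record: cores pre-authorized vs. cores actually consumed.
    tenant: str
    quota_cores: int
    used_cores: int


def review_usage(usage, low=0.2, high=0.9):
    """Classify a tenant by the fraction of its pre-authorized quota in use."""
    if usage.quota_cores <= 0:
        raise ValueError("quota_cores must be positive")
    ratio = usage.used_cores / usage.quota_cores
    if ratio > high:
        return Review.OVER_USAGE   # candidate for a quota conversation
    if ratio < low:
        return Review.UNDER_USAGE  # candidate for reclaiming resources
    return Review.OK


if __name__ == "__main__":
    for usage in (
        TenantUsage("finance", quota_cores=100, used_cores=95),
        TenantUsage("marketing", quota_cores=100, used_cores=10),
        TenantUsage("engineering", quota_cores=100, used_cores=50),
    ):
        print(usage.tenant, review_usage(usage).value)
```

In this kind of model nobody approves each request up front; the tenant deploys within its quota, and the review only kicks in when measured usage drifts outside the expected band.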

In a self-service scenario, Windows Azure Pack (WAP) becomes the extensible interface for the service administrators and tenants of the self-service experience. WAP is derived from the same service management API layer in Windows Azure, and it is back-ended by Windows Server 2012 R2 and System Center 2012 R2 to allow the management of on-premises resources alongside the automation and self-service features required to become a true Hybrid Cloud.

With these types of tools, not only is the operation of your Hybrid Cloud remarkably simplified, but the experience across your entire cloud environment is consistent due to the similar Management API and Portal across public and private clouds.

To get started with the deployment and configuration of WAP, you can read a detailed overview here.

To test your deployment, check out the Microsoft Best Practices Analyzer, which you can run on an automated interval using Task Scheduler.

When offering services in a cloud, we need to include the first-line set of applications and services required in almost all enterprises, such as collaboration. In the WAP interface these services are surfaced as Gallery Items for VM Roles. To get you started on the right path, the Building Clouds team has already built some service templates that are ready to use for the following workloads:

These tools can be accessed using the Web Platform Installer (WebPI); you can learn about using WebPI here.

Creating these VM Role Gallery Items for your custom line-of-business services or other third-party services is made much easier with the VM Role Authoring tool, which you can download here.


One final element to consider (and, yes, this goes beyond the two key things I mentioned in the intro) is a big key for success: Automation. To build your expertise on this topic, I really recommend the Automation track over at Building Clouds. You’ll find a number of great examples to help you along this path, and the recent 4-part series on Automating Hybrid Clouds (linked below) will walk you through the specifics of taking many of the manual tasks around managing a Hybrid Cloud and converting them to Automation.