Kubernetes has become a platform of choice for building cloud native applications. Kubernetes is highly scalable, highly available, and easy to use, and has many other advantages that make it an excellent choice for building distributed applications.
However, its distributed nature means that monitoring everything that happens within the cluster can be a challenge. Prometheus and Grafana address this by collecting metrics from the cluster and presenting them on dashboards.
In this article, we will set up a Kubernetes cluster using Azure Kubernetes Service (AKS) and deploy Prometheus and Grafana to gather monitoring data and visualize it.
Understand the tooling
Prometheus is an open source project that was originally created at SoundCloud in 2012 and donated to the Cloud Native Computing Foundation (CNCF) in 2016, where it became the second hosted project after Kubernetes itself.
Prometheus collects and stores metrics from various sources and exposes them to the user in a way that is easy to understand and consume. It’s a tool that can monitor the health of your cluster, the performance of your applications, and the availability of your services.
Prometheus uses an exporter architecture. Exporters are APIs that may collect or receive raw metrics from a service and expose them in a specific format that Prometheus consumes.
Once Prometheus discovers a new exporter (or you configure one manually), it starts collecting metrics from that service and stores them in persistent storage.
Grafana is a web application used to visualize the metrics that Prometheus collects. It does not produce any metrics itself; it queries them from a data source such as Prometheus and displays them in a way that’s easy to understand through plots, charts, and dashboards.
We will create a Kubernetes cluster using Azure Kubernetes Service (AKS). To follow along, you will need an Azure account, the Azure CLI, kubectl, and Helm.
You can install the latest versions of kubectl and Helm using the Azure CLI, or install them manually if you prefer.
Install the CLI tools on your local machine, since you will need to forward a local port to access both the Prometheus and Grafana web interfaces.
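Before continuing, it can help to confirm that the three tools are actually on your PATH. The following is a minimal sketch for bash/zsh; it only checks that the commands exist and says nothing about their versions.

```shell
# Collect the names of any required CLI tools that are not on the PATH.
missing=""
for tool in az kubectl helm; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    missing="$missing $tool"
  fi
done

# Print a short summary of the result.
if [ -n "$missing" ]; then
  echo "Missing tools:$missing"
else
  echo "All required tools are installed."
fi
```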
Create an Azure Kubernetes Service (AKS) Cluster
Sign in to the Azure CLI by running the az login command.
Install or update kubectl.
az aks install-cli
Create two bash/zsh variables which we will use in subsequent commands. You may change the syntax below if you are using another shell.
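The commands in this article refer to these variables as $RESOURCE_GROUP and $AKS_NAME. The values below are placeholders chosen for illustration only, so substitute names that suit your own subscription.

```shell
# Placeholder values (assumptions, not prescribed names): choose your own.
RESOURCE_GROUP="aks-monitoring-rg"
AKS_NAME="aks-monitoring-cluster"

# Echo the values so you can confirm them before running later commands.
echo "Resource group: $RESOURCE_GROUP"
echo "Cluster name:   $AKS_NAME"
```

The variables only exist in the current shell session, so set them again if you open a new terminal.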
Create a resource group. We have chosen to create this in the eastus Azure region.
az group create --name $RESOURCE_GROUP --location eastus
Create a new AKS cluster using the az aks create command. Here we create a 3 node cluster using the B-series Burstable VM type, which is cost-effective and suitable for small test/dev workloads such as this.
az aks create --resource-group $RESOURCE_GROUP \
--name $AKS_NAME \
--node-count 3 \
--node-vm-size Standard_B2s
This may take a few minutes to complete.
Authenticate to the cluster we have just created.
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $AKS_NAME
We can now access our Kubernetes cluster with kubectl. Use kubectl to see the nodes we have just created.
kubectl get nodes
Install Prometheus and Grafana
Prometheus can be installed either by using Helm or by using the official operator step by step. We’ll use the Helm chart because it’s quick and easy.
The operator is part of the kube-prometheus project, which is a set of Kubernetes manifests that will not only install Prometheus but also configure Grafana to be used along with it and make all the components highly available. Let’s install Prometheus using Helm.
Add its repository to our repository list and update it.
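A minimal sketch of these steps follows, assuming the community-maintained kube-prometheus-stack chart (the chart whose default Grafana password is prom-operator). The release name "prometheus" and the namespace "monitoring" are assumptions made for this example, not requirements.

```shell
# Assumed names for illustration: change them if you prefer others.
RELEASE_NAME="prometheus"
NAMESPACE="monitoring"

# Register the prometheus-community chart repository and refresh the local index.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install the chart into its own namespace, creating the namespace if needed.
helm install "$RELEASE_NAME" prometheus-community/kube-prometheus-stack \
  --namespace "$NAMESPACE" \
  --create-namespace

# List the pods to watch the components start up.
kubectl get pods --namespace "$NAMESPACE"
```

Keeping the stack in a dedicated namespace makes it easier to inspect or remove later without touching other workloads.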
The default username for Grafana is admin and the default password is prom-operator. You can change it in the Grafana UI later.
Note: To ensure security, do not expose your Prometheus or Grafana endpoints to the public internet using a Service or Ingress.
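One way to reach Grafana without exposing it is to forward a local port to its Service. The sketch below assumes the chart was installed as a release named "prometheus" in a namespace named "monitoring", which would make the Grafana Service "prometheus-grafana" listening on port 80; list the Services first and adjust the name if yours differs.

```shell
# Assumed values for illustration: verify them against the service list below.
NAMESPACE="monitoring"
GRAFANA_SERVICE="prometheus-grafana"
LOCAL_PORT=3000

# Show the Services in the namespace so the Grafana Service name can be confirmed.
kubectl get services --namespace "$NAMESPACE"

# Forward the local port to the Grafana Service; this runs until you press Ctrl+C.
kubectl port-forward --namespace "$NAMESPACE" "service/$GRAFANA_SERVICE" "$LOCAL_PORT:80"
```

While the command is running, Grafana should be reachable at http://localhost:3000 from your local machine only.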
Go to Dashboards -> Manage where you will see many dashboards that have been created for you.
Grafana dashboard list
These are all created by the Prometheus operator to ease the configuration process.
Click on the etcd dashboard and you’ll see an empty dashboard. What has happened?
The empty etcd dashboard
Since AKS is a managed Kubernetes service, it doesn’t allow you to see internal control plane components such as the etcd store, the controller manager, or the scheduler. There’s no point in trying to scrape those metrics, because the attempts will always fail. Let's disable these targets by upgrading our Prometheus release:
Once the upgrade has run, nothing will look different: the dashboard will continue to be empty, but we won’t be wasting resources trying to get its metrics.
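A minimal sketch of such an upgrade, assuming the kube-prometheus-stack chart installed as a release named "prometheus" in a namespace named "monitoring" (both names are assumptions for this example). The three values switch off scraping for etcd, the controller manager, and the scheduler.

```shell
# Assumed names for illustration: use the release and namespace you installed with.
RELEASE_NAME="prometheus"
NAMESPACE="monitoring"

# Keep the existing configuration and disable the control plane scrape targets.
helm upgrade "$RELEASE_NAME" prometheus-community/kube-prometheus-stack \
  --namespace "$NAMESPACE" \
  --reuse-values \
  --set kubeEtcd.enabled=false \
  --set kubeControllerManager.enabled=false \
  --set kubeScheduler.enabled=false
```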
Note: If you are running an older version of Kubernetes, it might be necessary to turn off HTTPS metrics scraping for the kubelet, since older kubelets expose their metrics over plain HTTP. For this, you’ll need to set the kubelet.serviceMonitor.https parameter in the Helm chart to false:
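As a sketch, the parameter can be set with another upgrade of the release. The release name "prometheus" and the namespace "monitoring" are again assumptions for illustration.

```shell
# Assumed names for illustration: use the release and namespace you installed with.
RELEASE_NAME="prometheus"
NAMESPACE="monitoring"

# Keep the existing configuration and scrape the kubelet over HTTP instead of HTTPS.
helm upgrade "$RELEASE_NAME" prometheus-community/kube-prometheus-stack \
  --namespace "$NAMESPACE" \
  --reuse-values \
  --set kubelet.serviceMonitor.https=false
```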