Prometheus metric label value is overridden by pod label


Hi, I have configured scraping of Prometheus metrics with Container insights following these instructions: https://docs.microsoft.com/en-us/azure/azure-monitor/containers/container-insights-prometheus-integr...

 

I noticed that metric labels are stored in the InsightsMetrics.Tags field. In my case I have this metric:

 

kong_latency_bucket{type="kong",service="google",route="google.route-1",le="00001.0"} 1

 

This means that InsightsMetrics.Tags should contain the following fields: type, service, route, and le.
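
For the sample series above I would expect Tags to carry the metric's own label values, roughly like this (written out by hand, not actual agent output):

type = "kong"
service = "google"
route = "google.route-1"
le = "00001.0"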

However, if my pod has a "service" label, then the pod label's value is stored in the InsightsMetrics.Tags.service field instead of the metric label's value, so I lose the metric label information. Is there a way to specify a prefix for Prometheus metric labels in the InsightsMetrics.Tags field, or anything else that would let me keep the correct value in the service field?

3 Replies

@adamlepkowski Can you explain a bit more? Do you mean that the tag 'service' has the pod name instead of the label's actual value (google in your example below)? What scraping mechanism are you using? If you can share the config map snippet, that would be great.

@vishiy Tags.service has the value of the pod label instead of the value from the Prometheus metric. Configuration below:

 

Pod YAML definition:

apiVersion: v1
kind: Pod
metadata:
  name: pkad
  labels:
    service: 'value that will be put into Tags'
spec:
  containers:
.......

Prometheus metric:

kong_latency_bucket{type="kong",service="google",route="google.route-1",le="00001.0"} 1

 

In Tags.service I should have the value "google", but instead I have "value that will be put into Tags".

 

Config map:

kind: ConfigMap
apiVersion: v1
data:
  schema-version:
    #string.used by agent to parse config. supported versions are {v1}. Configs with other schema versions will be rejected by the agent.
    v1
  config-version:
    #string.used by customer to keep track of this config file's version in their source control/repository (max allowed 10 chars, other chars will be truncated)
    ver1
  log-data-collection-settings: |-
    [log_collection_settings]
       [log_collection_settings.stdout]
          enabled = true
          exclude_namespaces = ["kube-system"]

       [log_collection_settings.stderr]
          enabled = true
          exclude_namespaces = ["kube-system"]

       [log_collection_settings.env_var]
          enabled = true
       [log_collection_settings.enrich_container_logs]
          enabled = false
       [log_collection_settings.collect_all_kube_events]
          enabled = false
    
  prometheus-data-collection-settings: |-
    [prometheus_data_collection_settings.cluster]
        interval = "1m"

        monitor_kubernetes_pods = true

    [prometheus_data_collection_settings.node]
        interval = "1m"

  alertable-metrics-configuration-settings: |-
    [alertable_metrics_configuration_settings.container_resource_utilization_thresholds]
        container_cpu_threshold_percentage = 95.0
        container_memory_rss_threshold_percentage = 95.0
        container_memory_working_set_threshold_percentage = 95.0
metadata:
  name: container-azm-ms-agentconfig
  namespace: kube-system
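
Since monitor_kubernetes_pods = true is set, the pods are opted into scraping through the usual Prometheus annotations; the port and path below are placeholders, not the actual values from my (elided) pod spec:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8001"      # placeholder - the real container port
    prometheus.io/path: "/metrics"  # placeholder - the real metrics path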

 

This is a Telegraf issue; Telegraf is an OSS component we use. I will track and fix this.
Since it also adds pod labels as dimensions for the metrics, collisions are resolved in an undesirable way (I would prefer to use the actual metric labels, as opposed to pod labels, for colliding time series).

Can you work around this with one of the options below until we resolve it?
1) Instead of scraping through pod annotations, can you scrape a Kubernetes service endpoint, if possible? (See the config sketch after this list.)
2) Can you update/patch the pod labels for your deployments, maybe?
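
For option 1, the relevant change in the container-azm-ms-agentconfig config map would look roughly like this; the service DNS name, namespace, and port are placeholders for your own Kong proxy service, and this assumes the metrics endpoint is reachable through a Kubernetes service:

  prometheus-data-collection-settings: |-
    [prometheus_data_collection_settings.cluster]
        interval = "1m"
        # stop scraping via pod annotations
        monitor_kubernetes_pods = false
        # placeholder service DNS name and port - replace with your own service
        kubernetes_services = ["http://kong-proxy.kong-namespace:8001/metrics"]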

 

Thank you for bringing this to our attention, @adamlepkowski