Adding custom charts to components


Overview

SUSE Observability already provides many metric charts by default on most types of components that represent Kubernetes resources. Extra metric charts can be added to any set of components whenever needed. When adding metrics to components there are 2 options:

  1. The metrics are already collected by SUSE Observability but aren't visualized on a component by default

  2. The metrics aren't yet collected by SUSE Observability at all and therefore aren't available yet

For option 1, the steps below show how to create a metric binding, which configures SUSE Observability to add a specific metric to a specific set of components.

For option 2, first make sure that the metrics are available in SUSE Observability by sending them to SUSE Observability using the Prometheus remote write protocol. Only after that continue by adding charts for the metrics to the components.

Steps

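Steps to create a metric binding:

  1. Write the outline of the metric binding

  2. Write the topology query (STQL) to select the components

  3. Write the PromQL query for the desired metric

  4. Bind the correct time series to each component

  5. Create or update the metric binding in SUSE Observability

As an example, the steps will add a metric binding for the replica counts of Kubernetes deployments. This is just an example; this metric binding already exists in SUSE Observability by default.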

Write the outline of the metric binding

Create a new YAML file called metric-bindings.yaml and add this YAML template to it to create your own metric binding. Open it in your favorite code editor to change it throughout this guide. At the end the SUSE Observability CLI will be used to create and update the metric bindings in SUSE Observability.

nodes:
- _type: MetricBinding
  chartType: line
  enabled: true
  tags: {}
  unit: 
  name: 
  description: 
  priority: 
  identifier: urn:custom:metric-binding:...
  queries:
    - expression:
      alias:
  scope:
  layout:
    metricPerspective:
      tab:
      section:
      weight:
    componentSummary:
      weight:

The fields in this template are:

  • _type: SUSE Observability needs to know this is a metric binding, so the value always needs to be MetricBinding

  • chartType: SUSE Observability will support different chart types (line, bar, etc.); currently only line is supported

  • enabled: Set to false to keep the metric binding but not show it to users

  • tags: Will be used to organize metrics in the user interface, can be left empty using {}

  • name: The name for the metric binding

  • description: Optional description, displayed on-hover of the name

  • priority: [Deprecated] One of HIGH, MEDIUM, or LOW. Main sort order for metrics on a component (in the order they're mentioned here), secondary sort order is the name.

  • identifier: A URN (uniform resource name), used as the unique identifier of the metric binding. It must start with urn:custom:metric-binding:, the remainder is free-format as long as it's unique amongst all metric bindings.

  • queries: A list of queries to show in the chart for the metric binding (see also the following sections)

  • scope: The topology scope of the metric binding, a topology query that selects the components on which this metric binding will be shown.

Fill in all the parts that are already known first (with the deployment replica counts as the example):

_type: MetricBinding
chartType: line
enabled: true
tags: {}
unit: short
name: Replica counts
priority: MEDIUM
identifier: urn:custom:metric-binding:my-deployment-replica-counts
queries:
  - expression:
    alias:
scope:

The queries and scope sections will be filled in during the next steps. Note that the unit used is short, which simply renders a numeric value. If you're not sure yet about the unit of the metric, you can leave it open and decide on the correct unit when writing the PromQL query.

Write the topology query

type = "deployment" and label = "stackpack:kubernetes"

The type filter selects all deployments, while the label filter selects only components created by the Kubernetes stackpack (label name is stackpack and label value is kubernetes). The latter can also be omitted to get the same result.

Switch to the advanced mode to copy the resulting topology query and put it in the scope field of the metric binding.

Metric bindings only support query filters; query functions like withNeighborsOf aren't supported and can't be used.
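As a sketch of what a filter-only scope can look like, the fragment below narrows the example scope to a single namespace. The value production is a hypothetical placeholder, and the fragment assumes that the components carry a namespace label (the same label that the query binding references as ${tags.namespace}).

```yaml
# Only filters combined with "and"; no query functions such as withNeighborsOf.
# "namespace:production" is a hypothetical label value used for illustration.
scope: type = "deployment" and label = "stackpack:kubernetes" and label = "namespace:production"
```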

Write the PromQL query

For the total number of replicas use the kubernetes_state_deployment_replicas metric. To make the chart representative of the underlying time series data, extend the query with an aggregation over the ${__interval} parameter:

max_over_time(kubernetes_state_deployment_replicas[${__interval}])

In this specific case use max_over_time to make sure the chart always shows the highest number of replicas at any given time. For longer time ranges this means that a short dip in replicas won't be shown. To emphasize the lowest number of replicas, use min_over_time instead.
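For comparison, a queries entry that emphasizes dips could look like the sketch below. It uses the same metric with min_over_time; the alias Minimum replicas is only an illustrative name.

```yaml
queries:
  # min_over_time keeps the lowest value within each interval,
  # so a short dip in replicas stays visible on long time ranges.
  - expression: min_over_time(kubernetes_state_deployment_replicas[${__interval}])
    alias: Minimum replicas
```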

Copy the query into the expression property of the first entry in the queries field of the metric binding. Use Total replicas as an alias. This is the name that will show in the chart legend.

Bind the correct time series to each component

The metric binding with all fields filled in:

_type: MetricBinding
chartType: line
enabled: true
tags: {}
unit: short
name: Replica counts
priority: MEDIUM
identifier: urn:custom:metric-binding:my-deployment-replica-counts
queries:
  - expression: max_over_time(kubernetes_state_deployment_replicas[${__interval}])
    alias: Total replicas
scope: type = "deployment" and label = "stackpack:kubernetes"

Creating it in SUSE Observability and viewing the "Replica counts" chart on a deployment component gives an unexpected result: the chart shows the replica counts for all deployments. Logically one would expect only 1 time series, the replica count for this specific deployment.

To fix this, make the PromQL query specific to a component by using information from the component. Filter on enough metric labels to select only the specific time series for the component. This is the "binding" of the correct time series to the component. For anyone experienced in making Grafana dashboards, this is similar to a dashboard with parameters that are used in the queries on the dashboard. Change the query in the metric binding to this:

max_over_time(kubernetes_state_deployment_replicas{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", deployment="${name}"}[${__interval}])

The PromQL query now filters on 3 labels, cluster_name, namespace and deployment. Instead of specifying an actual value for these labels a variable reference to fields of the component is used. In this case the labels cluster-name and namespace are used, referenced using ${tags.cluster-name} and ${tags.namespace}. Further the component name is referenced with ${name}.

Supported variable references are:

  • Any component label, using ${tags.<label-name>}

  • The component name, using ${name}

The cluster name, namespace and a combination of the component type and name are usually enough to select the metrics for a specific component from Kubernetes. These labels, or similar labels, are available on most metrics and components.
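Because any component label can be referenced, a binding isn't limited to the cluster name and namespace. The sketch below is purely illustrative: the metric my_app_queue_length and the team label are hypothetical names and aren't shipped with SUSE Observability.

```yaml
queries:
  # my_app_queue_length and the "team" label are hypothetical placeholders.
  # ${tags.team} resolves to the value of the component's "team" label.
  - expression: max_over_time(my_app_queue_length{cluster_name="${tags.cluster-name}", team="${tags.team}"}[${__interval}])
    alias: Queue length
```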

Create or update the metric binding in SUSE Observability

Use the SUSE Observability CLI to create the metric binding in SUSE Observability. Make sure the metric-bindings.yaml is saved and looks like this:

nodes:
- _type: MetricBinding
  chartType: line
  enabled: true
  tags: {}
  unit: short
  name: Replica counts
  priority: MEDIUM
  identifier: urn:custom:metric-binding:my-deployment-replica-counts
  queries:
    - expression: max_over_time(kubernetes_state_deployment_replicas{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", deployment="${name}"}[${__interval}])
      alias: Total replicas
  scope: type = "deployment" and label = "stackpack:kubernetes"

Then use the SUSE Observability CLI to apply the file:

sts settings apply -f metric-bindings.yaml

Verify the results in SUSE Observability by opening the metrics perspective for a deployment. If you're not happy with the result simply change the metric binding in the YAML file and run the command again to update it. The list of nodes supports adding many metric bindings. Simply add another metric binding entry to the YAML array using the same steps as before.
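A file with more than one metric binding could look like the sketch below. It combines the deployment binding from the steps with the "Container restarts" pod binding that this page uses as its alias example; each entry needs its own unique identifier.

```yaml
nodes:
# First binding: replica counts on deployments
- _type: MetricBinding
  chartType: line
  enabled: true
  tags: {}
  unit: short
  name: Replica counts
  priority: MEDIUM
  identifier: urn:custom:metric-binding:my-deployment-replica-counts
  queries:
    - expression: max_over_time(kubernetes_state_deployment_replicas{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", deployment="${name}"}[${__interval}])
      alias: Total replicas
  scope: type = "deployment" and label = "stackpack:kubernetes"
# Second binding: container restarts on pods
- _type: MetricBinding
  chartType: line
  enabled: true
  tags: {}
  unit: short
  name: Container restarts
  priority: MEDIUM
  identifier: urn:custom:metric-binding:my-pod-restart-count
  queries:
    - expression: max by (cluster_name, namespace, pod_name, container) (kubernetes_state_container_restarts{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", pod_name="${name}"})
      alias: Restarts - ${container}
  scope: (label = "stackpack:kubernetes" and type = "pod")
```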

The identifier is used as the unique key of a metric binding. Changing the identifier will create a new metric binding instead of updating the existing one.

The sts settings command has more options, for example it can list all metric bindings:

sts settings list --type MetricBinding

Finally, to delete a metric binding use:

sts settings delete --ids <id>

The <id> in this command isn't the identifier but the number in the Id column of the sts settings list output.

The recommended way of working is to store metric bindings (and any other custom resources created in SUSE Observability) as YAML files in a Git repository. From there, changes can be applied manually, or the process can be fully automated by using the SUSE Observability CLI in a CI/CD system like GitHub Actions or GitLab pipelines.

Other options

More than 1 time series in a chart

There is only 1 unit for a metric binding (it gets plotted on the y-axis of the chart). As a result you should only combine queries that produce time series with the same unit in 1 metric binding. Sometimes it's possible to convert the unit. For example, CPU usage might be reported in milli-cores or cores; milli-cores can be converted to cores by dividing by 1000, like this: (<original-query>) / 1000.
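Since 1 core equals 1000 milli-cores, going from milli-cores to cores is a division. A minimal sketch of such a query entry, where <original-query> remains a placeholder for a query that returns milli-cores and the alias is only illustrative:

```yaml
queries:
  # <original-query> is a placeholder: replace it with a query that returns milli-cores.
  # Dividing by 1000 converts the result to cores.
  - expression: (<original-query>) / 1000
    alias: CPU usage (cores)
```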

There are 2 ways to get more than 1 time series in a single metric binding and therefore in a single chart:

  1. Write a PromQL query that returns multiple time series for a single component

  2. Add more PromQL queries to the metric binding

  • Comparing total replicas vs desired and available

  • Resource usage: limits, requests and usage in a single chart

nodes:
- _type: MetricBinding
  chartType: line
  enabled: true
  tags: {}
  unit: short
  name: Replica counts
  priority: MEDIUM
  identifier: urn:custom:metric-binding:my-deployment-replica-counts
  queries:
    - expression: max_over_time(kubernetes_state_deployment_replicas{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", deployment="${name}"}[${__interval}])
      alias: Total replicas
    - expression: max_over_time(kubernetes_state_deployment_replicas_available{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}",  deployment="${name}"}[${__interval}])
      alias: Available - ${cluster_name} - ${namespace} - ${deployment}
    - expression: max_over_time(kubernetes_state_deployment_replicas_unavailable{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}",  deployment="${name}"}[${__interval}])
      alias: Unavailable - ${cluster_name} - ${namespace} - ${deployment}
    - expression: min_over_time(kubernetes_state_deployment_replicas_desired{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}",  deployment="${name}"}[${__interval}])
      alias: Desired - ${cluster_name} - ${namespace} - ${deployment}
  scope: type = "deployment" and label = "stackpack:kubernetes"

Using metric labels in aliases

When a single query returns multiple time series per component, they show up as multiple lines in the chart, but in the legend they all use the same alias. To be able to tell the time series apart, the alias can include references to the metric labels using the ${label} syntax. For example, here is a metric binding for the "Container restarts" metric on a pod (a pod can have multiple containers):

_type: MetricBinding
chartType: line
enabled: true
id: -1
identifier: urn:custom:metric-binding:my-pod-restart-count
name: Container restarts
priority: MEDIUM
queries:
- alias: Restarts - ${container}
  expression: max by (cluster_name, namespace, pod_name, container) (kubernetes_state_container_restarts{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", pod_name="${name}"})
scope: (label = "stackpack:kubernetes" and type = "pod")
unit: short

Note that the alias references the container label of the metric. Make sure the label is present on the query result; when the label is missing, ${container} is rendered as literal text to help with troubleshooting.
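The difference is easiest to see side by side. In the sketch below, the first entry keeps the container label in its aggregation, while the second entry is a deliberately broken variant that aggregates the label away, so its alias has nothing to resolve.

```yaml
queries:
  # "container" is listed in the "by" clause, so ${container} resolves per time series.
  - alias: Restarts - ${container}
    expression: max by (cluster_name, namespace, pod_name, container) (kubernetes_state_container_restarts{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", pod_name="${name}"})
  # Broken on purpose: "container" is not in the "by" clause, so the label is missing
  # from the result and the legend shows the literal text ${container}.
  - alias: Restarts - ${container}
    expression: max by (cluster_name, namespace, pod_name) (kubernetes_state_container_restarts{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", pod_name="${name}"})
```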

Layouts

Each component can be associated with various technologies or protocols such as k8s, networking, runtime environments (e.g., JVM), protocols (HTTP, AMQP), etc. Consequently, a multitude of different metrics can be displayed for each component. For easier readability, SUSE Observability can organize these charts into tabs and sections. To display a chart (MetricBinding) within a specific tab or section, you need to configure the layout property. Any MetricBinding without a specified layout will be displayed in a tab and section named Other.

Here is an example configuration:

layout:
  metricPerspective:
    tab: Kubernetes Pod
    section: Performance
    weight: 2
  componentSummary:
    weight: 2

Fields:

  • layout.metricPerspective - Defines how the metric is displayed on the Metrics perspective. Metrics are grouped into tabs and then into sections.

  • layout.metricPerspective.tab - Tab name. Tabs are sorted alphabetically.

  • layout.metricPerspective.section - Section name. Sections are sorted alphabetically.

  • layout.metricPerspective.weight - Metrics within a section are sorted primarily by weight (ascending) and secondarily by name (alphabetical).

  • layout.componentSummary - Specifies that the metric is displayed in the component details sidebar when a component is selected. Charts appear there only when this property is defined.

  • layout.componentSummary.weight - The weight of the chart. Charts are sorted by weight in ascending order, and only the first 3 charts are displayed.
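To show where the layout property sits in a complete metric binding, here is a sketch based on the replica counts example. The tab name Kubernetes Deployment, the section name Replicas and the weights are illustrative assumptions; pick names that match how you want the charts grouped.

```yaml
nodes:
- _type: MetricBinding
  chartType: line
  enabled: true
  tags: {}
  unit: short
  name: Replica counts
  priority: MEDIUM
  identifier: urn:custom:metric-binding:my-deployment-replica-counts
  queries:
    - expression: max_over_time(kubernetes_state_deployment_replicas{cluster_name="${tags.cluster-name}", namespace="${tags.namespace}", deployment="${name}"}[${__interval}])
      alias: Total replicas
  scope: type = "deployment" and label = "stackpack:kubernetes"
  layout:
    metricPerspective:
      tab: Kubernetes Deployment   # illustrative tab name
      section: Replicas            # illustrative section name
      weight: 1                    # lower weights sort first within the section
    componentSummary:
      weight: 1                    # defining this also shows the chart in the sidebar
```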

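Two more fields of the template from "Write the outline of the metric binding":

  • unit: The unit of the values in the time series returned by the query or queries, used to render the Y-axis of the chart. See the supported units reference for all units.

  • layout: How to group charts on the different perspective views, for example on the Metrics perspective. See the Layouts section for the details.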

To write the topology query, use the Explore view of the Topology perspective, http://your-instance/#/views/explore, and select the components that need to show the new metric. Both the basic and advanced views can be used to make the selection. The most common fields to select topology on for metric bindings are type, for the component type, and label, for selecting on any of the component labels. For example, for the deployments the query is type = "deployment" and label = "stackpack:kubernetes".

To write the PromQL query, go to the metric explorer of your SUSE Observability instance, http://your-instance/#/metrics, and use it to query for the metric of interest. The explorer has auto-completion for metrics, labels, label values, PromQL functions and operators to help you out. Start with a short time range of, for example, an hour to get the best results.

In SUSE Observability the size of the metric chart automatically determines the granularity of the metric shown in the chart. PromQL queries can be adjusted to make optimal use of this behavior to get a representative chart for the metric. The page "Writing PromQL for charts" explains this in detail.

To create the metric binding, use the SUSE Observability CLI command sts settings apply -f metric-bindings.yaml.

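For the 2 ways to get more than 1 time series in a chart: an example of the first option is given in the section "Using metric labels in aliases". The second option can be useful for comparing related metrics; some typical use cases are listed as bullets after the two options.

To add more queries to a metric binding, simply repeat steps 3 and 4 and add the query as an extra entry in the list of queries. For the deployment replica counts there are several related metrics that can be included in the same chart, as the "Replica counts" metric binding with four queries shows.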

Image captions (images not included):

  • The incorrect chart for a single deployment; it shows the replica count for all deployments

  • After adding the parameterized filters the resulting chart looks as expected, with only 1 time series for this component

  • Component Highlights page that shows the labels and component name (both highlighted in red)

  • Metric binding with multiple metrics