Kubernetes
StackState Self-hosted v4.5.x
This page describes StackState v4.5.x. The StackState 4.5 version range is End of Life (EOL) and no longer supported. We encourage customers still running the 4.5 version range to upgrade to a more recent release.
StackState Agent V2
To retrieve topology, events and metrics data from a Kubernetes cluster, you will need to have the following installed in the cluster:
StackState Agent V2 on each node in the cluster
StackState Cluster Agent on one node
kube-state-metrics
To integrate with other services, a separate instance of the StackState Agent should be deployed on a standalone VM.
The Kubernetes integration collects topology data in a Kubernetes cluster, as well as metrics and events. To achieve this, different types of StackState Agent are used:
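Component | Pod name |
---|---|
StackState Cluster Agent | cluster-agent |
StackState Agent V2 | cluster-agent-agent |
StackState ClusterCheck Agent (optional) | |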
To integrate with other services, a separate instance of the StackState Agent should be deployed on a standalone VM. It is not currently possible to configure a StackState Agent deployed on a Kubernetes cluster with checks that integrate with other services.
StackState Cluster Agent is deployed as a Deployment. There is one instance for the entire cluster:
Topology and events data for all resources in the cluster are retrieved from the Kubernetes API
Control plane metrics are retrieved from the Kubernetes API
When cluster checks are enabled, cluster checks configured here are run by the deployed StackState ClusterCheck Agent pod.
StackState Agent V2 is deployed as a DaemonSet with one instance on each node in the cluster:
Host information is retrieved from the Kubernetes API.
Container information is collected from the Docker daemon.
Metrics are retrieved from kubelet running on the node and also from kube-state-metrics if this is deployed on the same node.
By default, metrics are also retrieved from kube-state-metrics if it is deployed on the same node as the StackState Agent pod. On a large Kubernetes cluster this can cause issues, because every StackState Agent pod must be allocated enough CPU and memory to run the check even though only one pod actually runs it. To avoid this, it is advisable to enable cluster checks so that metrics are gathered from kube-state-metrics by a dedicated StackState ClusterCheck Agent.
The StackState ClusterCheck Agent is an additional StackState Agent V2 pod that is deployed only when cluster checks are enabled in the Helm chart. When deployed, cluster checks configured on the StackState Cluster Agent will be run by the StackState ClusterCheck Agent pod.
On large Kubernetes clusters, you can run the kubernetes_state check on the ClusterCheck Agent. This check gathers metrics from kube-state-metrics and sends them to StackState. The ClusterCheck Agent is also useful to run checks that do not need to run on a specific node and monitor non-containerized workloads such as:
Out-of-cluster datastores and endpoints (for example, RDS or CloudSQL).
Load-balanced cluster services (for example, Kubernetes services).
StackState Agent v2.15.0 supports monitoring the following Kubernetes versions and environments:
Kubernetes 1.16 - 1.21
EKS (with Kubernetes 1.16 - 1.21)
Docker container runtime (not containerd, cri-o)
Default networking
StackState Agent connects to the StackState Receiver API at the specified StackState Receiver API address. The correct address to use is specific to your installation of StackState.
The StackState Agent, Cluster Agent and kube-state-metrics can be installed together using the Cluster Agent Helm Chart:
If you do not already have it, you will need to add the StackState helm repository to the local helm client:
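A minimal sketch of this step, assuming the repository is published at https://helm.stackstate.io and is registered locally under the name stackstate. Neither value appears on this page, so verify both against the helm command shown in the StackState UI. The function name add_stackstate_repo is hypothetical.

```shell
# Assumed values: confirm them against the command shown in the StackState UI.
STS_HELM_REPO_NAME="stackstate"
STS_HELM_REPO_URL="https://helm.stackstate.io"

# Hypothetical helper: registers the chart repository, then refreshes the local index.
add_stackstate_repo() {
  helm repo add "$STS_HELM_REPO_NAME" "$STS_HELM_REPO_URL" &&
    helm repo update
}
```

Call add_stackstate_repo once on the machine that runs helm.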
Deploy the StackState Agent, Cluster Agent and kube-state-metrics. Use the helm command provided in the StackState UI after you have installed the StackPack. For large Kubernetes clusters, you can enable cluster checks to run the kubernetes_state check in a StackState ClusterCheck Agent pod.
stackstate.cluster.authToken
In addition to the variables included in the provided helm command, it is recommended to set stackstate.cluster.authToken. The variable is optional; however, if it is not provided, a new random value is generated each time a helm upgrade is performed, which could leave some pods in the cluster with an incorrect configuration.
For example:
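A sketch of such a command, wrapped in a hypothetical helper so the token is passed explicitly. The namespace stackstate, the release name stackstate-cluster-agent and the chart reference stackstate/cluster-agent are assumptions. All remaining flags should be copied from the helm command shown in the StackState UI and passed through as extra arguments.

```shell
# Hypothetical helper: deploys or upgrades the Agents with a fixed cluster auth token.
# Namespace, release name and chart reference are assumptions; adjust to your setup.
deploy_stackstate_agents() {
  local auth_token="$1"   # keep this value identical across upgrades
  shift
  helm upgrade --install \
    --namespace stackstate \
    --create-namespace \
    --set-string "stackstate.cluster.authToken=${auth_token}" \
    "$@" \
    stackstate-cluster-agent stackstate/cluster-agent
}
```

For instance, deploy_stackstate_agents "$MY_TOKEN" followed by the --set-string flags from the UI command reuses the same token on every run, so pods are not left with a stale value after an upgrade.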
Full details of the available values can be found in the Cluster Agent Helm Chart documentation (github.com).
To upgrade the Agents running in your Kubernetes cluster, run the helm upgrade command provided on the StackPacks > Integrations > Kubernetes screen of the StackState UI. This is the same command used to deploy the StackState Agent and Cluster Agent.
Optionally, the chart can be configured to start an additional StackState Agent V2 pod as a StackState ClusterCheck Agent pod. Cluster checks that are configured on the StackState Cluster Agent will then be run by the deployed StackState ClusterCheck Agent pod.
To enable cluster checks and deploy the ClusterCheck Agent pod, create a values.yaml file to deploy the cluster-agent Helm chart and add the following YAML segment:
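A minimal sketch of that segment, written from the shell so it can be pasted directly. The key clusterChecks.enabled is an assumption based on the feature name; confirm it in the Cluster Agent Helm Chart documentation before use.

```shell
# Writes a values.yaml that enables cluster checks.
# Assumption: the chart exposes this switch as clusterChecks.enabled.
cat > values.yaml <<'EOF'
clusterChecks:
  enabled: true
EOF
```

The file can then be passed to the helm command from the StackState UI with the --values values.yaml flag.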
The kubernetes_state check is responsible for gathering metrics from kube-state-metrics and sending them to StackState. It is configured on the StackState Cluster Agent and, by default, runs in the StackState Agent pod that is on the same node as the kube-state-metrics pod.
In a default deployment, all pods running a StackState Agent must be configured with sufficient CPU and memory requests and limits to run the check. This can consume a lot of memory in a large Kubernetes cluster. Since only one StackState Agent pod actually runs the check, a lot of CPU and memory is allocated but never used.
To remedy this situation, the kubernetes_state check can be configured to run as a cluster check. In this case, only the ClusterCheck Agent requires resources to run the check and the allocation for other pods can be reduced.
Update the values.yaml file used to deploy the cluster-agent, for example:
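The layout below is an illustrative sketch only. The key names (clusterChecks.enabled, agent.config.override, clusterAgent.config.override), the configuration path and the kube-state-metrics URL are assumptions that this page does not confirm, and KUBE_STATE_METRICS_SERVICE is a placeholder for the service name in your cluster. Check each of them against the Cluster Agent Helm Chart documentation.

```shell
# Writes a values.yaml that moves the kubernetes_state check to the ClusterCheck Agent.
# Every key, path and URL below is an assumption; KUBE_STATE_METRICS_SERVICE is a placeholder.
cat > values.yaml <<'EOF'
clusterChecks:
  # Deploy the ClusterCheck Agent pod.
  enabled: true
agent:
  config:
    override:
      # Empty file: intended to stop the node Agents from running kubernetes_state.
      - name: auto_conf.yaml
        path: /etc/stackstate-agent/conf.d/kubernetes_state.d
        data: ""
clusterAgent:
  config:
    override:
      # Define kubernetes_state as a cluster check with an explicit URL.
      - name: conf.yaml
        path: /etc/stackstate-agent/conf.d/kubernetes_state.d
        data: |
          cluster_check: true
          init_config:
          instances:
            - kube_state_url: http://KUBE_STATE_METRICS_SERVICE:8080/metrics
EOF
```

The idea is that the check is defined once on the Cluster Agent and switched off on the per-node Agents, so only the ClusterCheck Agent needs the resources to run it.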
StackState Agent V2 can be configured to reduce data production, tune the process blacklist, or turn off specific features when not needed. The required settings are described in detail on the page advanced Agent configuration.
To integrate with other external services, a separate instance of the StackState Agent should be deployed on a standalone VM. It is not currently possible to configure a StackState Agent deployed on a Kubernetes cluster with checks that integrate with other services.
To check the status of the Kubernetes integration, check that the StackState Cluster Agent (cluster-agent) pod and all of the StackState Agent (cluster-agent-agent) pods have status READY.
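One way to inspect pod status is with kubectl, sketched here as a hypothetical helper. The default namespace stackstate is an assumption; pass the namespace you deployed into.

```shell
# Hypothetical helper: lists pods so the READY and STATUS columns can be inspected.
# Assumption: the Agents were deployed into the "stackstate" namespace.
list_stackstate_pods() {
  local namespace="${1:-stackstate}"
  kubectl get pods --namespace "$namespace"
}
```

In the output, look for the pods whose names contain cluster-agent and cluster-agent-agent.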
To find the status of an Agent check:
Find the Agent pod that is running on the node where you would like to find a check status:
Run the command:
Look for the check name under the Checks section.
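The steps above can be sketched with two hypothetical helpers. The namespace default and the in-container command agent status are assumptions, since the original commands are not shown on this page; verify the status command against your Agent version.

```shell
# Hypothetical helper: lists Agent pods together with the node each one runs on.
find_agent_pods() {
  local namespace="${1:-stackstate}"
  kubectl get pods --namespace "$namespace" -o wide
}

# Hypothetical helper: prints the Agent status report for one pod.
# Assumption: the Agent CLI inside the container is invoked as "agent status".
agent_check_status() {
  local pod="$1" namespace="${2:-stackstate}"
  kubectl exec "$pod" --namespace "$namespace" -- agent status
}
```

Use the NODE column from find_agent_pods to pick the pod, then pass its name to agent_check_status.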
To uninstall the StackState Cluster Agent and the StackState Agent from your Kubernetes cluster, run a Helm uninstall:
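A minimal sketch of the uninstall, assuming the release was installed as stackstate-cluster-agent in the stackstate namespace. Both defaults are assumptions; substitute the release name and namespace used at deploy time.

```shell
# Hypothetical helper: removes the Helm release that deployed the Agents.
# Assumed defaults: release "stackstate-cluster-agent", namespace "stackstate".
uninstall_stackstate_agents() {
  local release="${1:-stackstate-cluster-agent}" namespace="${2:-stackstate}"
  helm uninstall "$release" --namespace "$namespace"
}
```

If you are unsure of the release name, helm list --all-namespaces shows the installed releases.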