Debug topology synchronization
StackState Self-hosted v5.1.x
This page explains several tools that can be used to troubleshoot a topology synchronization.
A topology synchronized using StackState Agent follows the process described below:
StackState Agent:
Connects to a data source to collect data.
Connects to the StackState Receiver to push collected data to StackState (in JSON format).
StackState Receiver:
Extracts topology and telemetry payloads from the received JSON.
Puts messages on the Kafka bus.
Kafka:
Stores received data in topics.
Read the troubleshooting steps for Kafka.
Synchronization:
Reads data from a topic as it becomes available on the Kafka bus.
Processes retrieved data.
Confirm that a custom synchronization is running:
Use the stac CLI to list all topology synchronization streams.
The synchronization should be included in the list and have created components/relations.
If a custom synchronization isn't listed, you will need to recreate the synchronization.
If no components appear after making changes to a synchronization, or the data isn't as expected, follow the steps described in the sections below to check each step in the topology synchronization process.
If relations are missing from the topology, read the note on troubleshooting synchronization of relations.
For integrations that run through StackState Agent, StackState Agent is a good place to start an investigation.
Check the StackState Agent log for hints that it has problems connecting to StackState.
The integration can be triggered manually by running the command stackstate-agent check <check_name> -l debug in your terminal. This command won't send any data to StackState. Instead, it returns the collected topology and telemetry on standard output, along with any generated log messages.
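A minimal sketch of wrapping the manual check run in a script, so that the output can be saved and inspected later. Only the command itself comes from this page. The helper names build_check_command and run_check, the output file handling, and the example check name are illustrative assumptions.

```python
import shutil
import subprocess


def build_check_command(check_name: str) -> list[str]:
    """Return the argv for a manual, debug-level run of one Agent check."""
    if not check_name.strip():
        raise ValueError("check_name must not be empty")
    return ["stackstate-agent", "check", check_name, "-l", "debug"]


def run_check(check_name: str, output_path: str) -> int:
    """Run the check and save stdout and stderr to output_path.

    Returns the exit code of the process. The check command itself
    does not send any data to StackState.
    """
    if shutil.which("stackstate-agent") is None:
        raise FileNotFoundError("stackstate-agent not found on PATH")
    result = subprocess.run(
        build_check_command(check_name),
        capture_output=True,
        text=True,
        check=False,
    )
    with open(output_path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
        handle.write(result.stderr)
    return result.returncode
```

For example, run_check("my_check", "my_check_debug.txt") would write the collected topology, telemetry and log messages for a hypothetical check named my_check to a local file.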
The StackState Receiver receives JSON data from StackState Agent V3.
Check the StackState Receiver logs for JSON deserialization errors.
Topology and telemetry are stored on Kafka on separate topics. The StackState topology synchronization reads data from a Kafka bus once it becomes available.
Use the stac CLI to list the topics on Kafka and check the messages on a topic:
List all topics present on Kafka: stac topic list. A topic should be present with a name in the format sts_topo_<instance_type>_<instance_url>, where <instance_type> is the recognizable name of an integration and <instance_url> corresponds to the StackState Agent integration YAML (usually the URL of the data source).
Check messages on a Kafka topic: stac topic show <topic_name>. If there are recent messages on the Kafka bus, then the issue isn't in the data collection.
The StackState topology synchronization reads messages from a topic on the Kafka data bus. The Kafka topic used by a synchronization is defined in the Sts data source.
Check if the topic name defined in the Sts data source matches what is returned by the stackstate-agent check command. Note that topic names are case-sensitive.
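The topic name comparison can be sketched as a small helper that also flags a case-only difference, since that is an easy mismatch to overlook. This is an illustration only: the function name diagnose_topic and the topic names in the usage note are hypothetical, and the list of available topics is assumed to have been copied from the CLI output by hand.

```python
def diagnose_topic(configured: str, available: list[str]) -> str:
    """Compare the topic set in the Sts data source with the topics on Kafka."""
    if configured in available:
        return "match"
    # Topic names are case-sensitive, so a name that differs only in
    # letter case does not count as a match.
    near = [topic for topic in available if topic.lower() == configured.lower()]
    if near:
        return f"case mismatch: configured '{configured}', found '{near[0]}'"
    return "missing"
```

With a hypothetical topic sts_topo_demo_https://example.test on Kafka, a data source configured with sts_topo_Demo_https://example.test would be reported as a case mismatch rather than a match.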
Check the error counter for the synchronization on the StackState UI page Settings > Topology Synchronization > Synchronizations. Increasing numbers tell you that there was an error while processing received data.
To troubleshoot processing errors, refer to the relevant StackState log files. The provided log messages will help you to resolve the issue. For details on working with the StackState log files on Kubernetes and Linux see the pages under Configure > Logging.
Check the stackstate.log or, for Kubernetes, the stackstate-api pod.
If there is an issue with the ID extractor, an exception will be logged here for each received topology element. No topology will be synchronized, but the synchronization’s error counter will not increase.
Check the synchronization’s specific log file or, for Kubernetes, the stackstate-sync pod for log messages that include the synchronization’s name.
Issues with a mapping function defined for a synchronization mapping will be reported here. The type is also logged to help determine which mapping to look at. The synchronization’s error counter will increase.
Issues with templates are also logged here. The synchronization’s error counter will increase.
It's possible that a relation references a source or target component that doesn't exist. Components are always processed before relations. If a component referenced by a relation isn't present in the synchronization’s topology, the relation won't be created. When this happens, a warning is logged to the synchronization’s specific log file or the stackstate-sync pod. The component external ID and relation external ID are included in the warning to help identify the missing component.
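The rule described above can be modeled in a few lines. This is a simplified illustration, not StackState's implementation: the Relation class, the resolve_relations function and the warning text are all hypothetical, and only the behavior (components first, relations skipped when an endpoint is missing) is taken from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    external_id: str
    source_id: str
    target_id: str


def resolve_relations(component_ids, relations):
    """Return (created_relations, warnings) for one batch of topology data."""
    # Components are processed first, so their IDs are known up front.
    known = set(component_ids)
    created, warnings = [], []
    for relation in relations:
        missing = [
            cid
            for cid in (relation.source_id, relation.target_id)
            if cid not in known
        ]
        if missing:
            # The relation is not created; both external IDs are reported.
            warnings.append(
                f"relation {relation.external_id} skipped, "
                f"missing component(s): {', '.join(missing)}"
            )
        else:
            created.append(relation)
    return created, warnings
```

When a relation is missing from the topology, the equivalent real-world check is to search the warning for the component external ID and confirm whether the data source ever delivered that component.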
When StackState is deployed on Kubernetes, logs about synchronization can be found in the stackstate-sync pod and the stackstate-api pod. The name of the synchronization is shown in the log entries.
The stackstate-sync pod has details of:
Template/mapping function errors.
Component types that don't have a mapping.
Relations connected to a non-existing component.
Messages that have been discarded due to a slow synchronization.
The stackstate-api pod has details of:
ID extractor errors.
StackPacks.
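Because the name of the synchronization appears in the log entries, a saved copy of the pod logs can be narrowed down to one synchronization. The sketch below assumes the logs were first saved to a text file. The function name, the file name and the synchronization name in the usage note are hypothetical.

```python
def lines_for_synchronization(log_lines, sync_name):
    """Keep only the log lines that mention the given synchronization name."""
    return [line.rstrip("\n") for line in log_lines if sync_name in line]


def print_matches(log_path, sync_name):
    """Print matching lines from a saved log file."""
    with open(log_path, encoding="utf-8") as handle:
        for line in lines_for_synchronization(handle, sync_name):
            print(line)
```

For example, print_matches("stackstate-sync.log", "My Synchronization") would print only the entries for a synchronization with that name. The match is a plain case-sensitive substring search, so the name must be written exactly as it appears in the logs.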
Returns a list of all current topology synchronization streams.
⚠️ From StackState v5.0, the old sts CLI is called stac. The old CLI is now deprecated.
The new sts CLI replaces the stac CLI. It's advised to install the new sts CLI and upgrade any installed instance of the old sts CLI to stac. For details see:
Shows the data of a specific topology synchronization stream, including detailed latency of the data being processed. The id might be either a node id or the identifier of a topology synchronization. The search gives priority to the node id.