Troubleshooting StackState startup
This page describes StackState version 4.0.
The StackState 4.0 version range is End of Life (EOL) and no longer supported. We encourage customers still running the 4.0 version range to upgrade to a more recent release.
Issues getting StackState started
Here is a quick guide for troubleshooting the startup of StackState:
Check whether the StackGraph systemd service is running:
sudo systemctl status stackgraph.service
Check whether the StackState systemd service is running:
sudo systemctl status stackstate.service
Check that you can connect to the StackState user interface, which listens on TCP port 7070 by default.
Check the log files for errors. They are located in:
/opt/stackstate/var/log/
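The last two checks can also be scripted. The following is a minimal sketch, not part of StackState: the function names are made up for illustration, it assumes the user interface runs on the same host on the default port 7070, and it assumes that problems show up as log lines containing the text ERROR.

```python
import socket
from pathlib import Path


def port_is_open(host: str = "localhost", port: int = 7070, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


def find_error_lines(log_dir: str = "/opt/stackstate/var/log", marker: str = "ERROR"):
    """Yield (file, line number, text) for every log line that contains the marker."""
    # The log directory has one subdirectory per component, so search recursively.
    for log_file in sorted(Path(log_dir).rglob("*.log")):
        try:
            with log_file.open(errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if marker in line:
                        yield log_file, number, line.rstrip()
        except OSError:
            # Unreadable file (for example a permissions problem): skip it.
            continue
```

Calling port_is_open() with no arguments tests the default port on the local host, and iterating over find_error_lines() lists candidate lines to inspect by hand.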
Known issues
Timeout notification when uninstalling or upgrading a StackPack
Uninstalling or upgrading a StackPack can fail with a timeout message. This happens when StackState is under high load or when a large amount of data is related to the StackPack. We are working on solving this issue. For the time being, retry the uninstall or upgrade operation until it succeeds.
Error InterruptedException when opening a view
Symptom: opening a view that is expected to contain a large topology results in an error, and the log file /opt/stackstate/var/log/stackstate.log shows an InterruptedException.
Cause: topology elements that are not cached are not fully retrieved from StackGraph before the timeout expires, which triggers an InterruptedException.
Possible solution: increase the cache size by editing StackState's configuration.
In /opt/stackstate/etc/application_stackstate.conf, add the following configuration:
stackgraph.vertex.cache.size = <size>
where <size> is the number of graph vertices. An initial cache size can be obtained by adding:
number of components * 10,
number of relations * 10,
number of checks * 5.
The default cache size is 8191. Make sure the cache size is defined as a power of two minus one, e.g. 2^13 - 1 = 8191.
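The sizing rule can be worked through in a few lines of Python. This is only a sketch: the function name is hypothetical, and two choices are assumptions rather than documented behavior, namely rounding the estimate up (not down) to the next value of the form 2^n - 1, and never suggesting less than the default of 8191.

```python
def suggested_cache_size(components: int, relations: int, checks: int) -> int:
    """Estimate a value for stackgraph.vertex.cache.size from topology counts."""
    # Initial estimate: components * 10 + relations * 10 + checks * 5.
    estimate = components * 10 + relations * 10 + checks * 5
    # Assumption: do not go below the default cache size of 8191 (2^13 - 1).
    estimate = max(estimate, 8191)
    # Assumption: round up to the next power of two minus one.
    # For an n-bit number, 2^n - 1 is the smallest such value that is >= the number.
    return (1 << estimate.bit_length()) - 1


# 1000 components, 1500 relations, 500 checks:
# 10000 + 15000 + 2500 = 27500, rounded up to 2^15 - 1.
print(suggested_cache_size(1000, 1500, 500))  # prints 32767
```

The resulting number is what would replace <size> in the configuration line.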
Make sure that StackState has enough memory available. The available memory can be configured by editing /opt/stackstate/etc/processmanager/processmanager.conf. Under the process named stackstate-server, change -Xmx1G to -Xmx<N>G, where <N> is the desired amount of memory in GB. For example, change the setting to -Xmx8G to make 8 GB of memory available to StackState.
Restart StackState for the changes to take effect:
sudo systemctl restart stackstate.service
Error illegal reflective access when starting StackState
Symptom: when starting any component of StackState, the log shows a message reporting illegal reflective access.
Cause: running StackState on a Java version newer than JDK 8.
Solution: install JDK 8.
Error /opt/stackstate/*/bin/*.sh: line 45: /opt/stackstate/var/log/*/*.log: Permission denied
Symptom: when starting any component of StackState, the log shows a message similar to the following:
/opt/stackstate/*/bin/*.sh: line 45: /opt/stackstate/var/log/*/*.log: Permission denied
Cause: StackState was started as root or as another user, and afterwards started as a service.
Solution: remove the contents of the /opt/stackstate/var/log/stackstate and /opt/stackstate/var/log/stackgraph directories and restart StackState.
Error /opt/stackstate/var/log/license-check/license-app.log: Permission denied
Symptom: when starting any component of StackState, the log shows a message similar to the following:
/opt/stackstate/var/log/license-check/license-app.log: Permission denied
Cause: the license key registration command was executed as root or as another user, and StackState was afterwards started as a service.
Solution: remove the contents of the /opt/stackstate/var/log/license-check directory and restart StackState.
Error InvalidSchema("No connection adapters were found for '%s' % url")
Symptom: no data is received in StackState from an AWS source that has access to the StackState receiver service. The CloudWatch log stream of the AWS Lambda function StackState-Topo-Cron shows a message similar to the following:
InvalidSchema("No connection adapters were found for '%s' % url")
Cause: the STACKSTATE_BASE_URL environment variable of the Lambda function is not correct.
Solution: check that the URL provided in the STACKSTATE_BASE_URL environment variable of the AWS Lambda function is correct. Make sure that the protocol is specified, e.g. http://, and that the URL points to the correct port. Read more on configuring the receiver base URL.
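A quick way to sanity-check a candidate value before saving it on the Lambda function is to parse it with Python's standard library. This is a sketch under assumptions: the function name is hypothetical, and the host, port, and path in the usage example are made-up placeholders, not the values for any real installation.

```python
from typing import List
from urllib.parse import urlsplit


def base_url_problems(url: str) -> List[str]:
    """Return a list of likely problems with a STACKSTATE_BASE_URL value."""
    problems = []
    parts = urlsplit(url.strip())
    # A missing or unexpected protocol is the usual trigger for InvalidSchema.
    if parts.scheme not in ("http", "https"):
        problems.append("protocol missing or not http/https")
    if not parts.hostname:
        problems.append("host missing")
    try:
        port = parts.port
    except ValueError:
        problems.append("port is not a valid number")
    else:
        if port is None:
            problems.append("no explicit port; confirm the default port is the receiver port")
    return problems


# Placeholder values for illustration only.
print(base_url_problems("http://stackstate.example.com:7077/stsAgent"))  # prints []
print(base_url_problems("stackstate.example.com:7077/stsAgent"))  # non-empty list
```

An empty list only means the value is well formed. It does not confirm that the receiver is reachable at that address.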
Error java.lang.IllegalStateException: Requested index specs do not match the catalog.
Symptom: StackState does not start after an upgrade to a newer version. The StackState log shows:
java.lang.IllegalStateException: Requested index specs do not match the catalog.
Cause: the newer version introduced index changes.
Solution: follow the reindex process.
Error ERROR | dd.collector | checks.splunk_topology(__init__.py:1002) | Check 'splunk_topology' instance #0 failed
Symptom: a Splunk saved search with a SID (Splunk job ID) results in the message ERROR: CheckException: Splunk topology failed with message: 400 Client Error: Bad Request for url: followed by a URL. The StackState log in /var/log/stackstate/collector.log shows the following:
ERROR | dd.collector | checks.splunk_topology(__init__.py:1002) | Check 'splunk_topology' instance #0 failed
Cause: the saved search definition contains an error, or the job ID (SID) is no longer available in Splunk. Jobs in Splunk expire, after which they are no longer available from the jobs/activity screen or the saved search.
Solution: check the status of the Splunk job in the Splunk Activity screen using its SID. To do this, extract the job ID from the URL provided in the error message. The SID is located in the URL right after /jobs/, as in .../search/jobs/{SID}/.... You can then look up this job in the Splunk Activity menu.
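Extracting the SID can be done by hand, or with a small helper such as the sketch below. The function name is hypothetical and the URL in the usage example is a made-up placeholder. The only thing taken from the text above is that the SID is the path segment directly after /search/jobs/.

```python
import re
from typing import Optional
from urllib.parse import unquote, urlsplit


def extract_sid(url: str) -> Optional[str]:
    """Return the path segment that follows /search/jobs/ in the URL, or None."""
    # Work on the path only, so query parameters cannot interfere.
    path = urlsplit(url).path
    match = re.search(r"/search/jobs/([^/]+)", path)
    # The SID may be percent-encoded in the URL, so decode it.
    return unquote(match.group(1)) if match else None


# Placeholder URL for illustration only.
url = "https://splunk.example.com:8089/services/search/jobs/1568121234.567/results?output_mode=json"
print(extract_sid(url))  # prints 1568121234.567
```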