AAD Standalone Deployment
StackState Self-hosted v4.6.x
This page describes StackState version 4.6.
The Autonomous Anomaly Detector (AAD) is a StackState service that is configured and deployed as part of the standard installation. In some cases, the AAD can be deployed standalone using the AAD Helm chart, for example when StackState and the AAD are deployed in separate Kubernetes clusters. The standalone AAD deployment option is recommended only for users with advanced knowledge of Kubernetes.
The Autonomous Anomaly Detector consists of two components:
- The AAD Kubernetes service
- The AAD StackPack.
The sections below explain how to configure the AAD Kubernetes service and the AAD StackPack in order to perform standalone deployment. Note that a training period is required before the AAD can begin to report anomalies.
A minimal deployment of the AAD Kubernetes service with the default options requires one of the following instance types:
- Amazon EKS: 1 instance of type
- Azure AKS: 1 instance of type F4s v2 (Intel or AMD CPUs)
- Self-hosted Kubernetes: 1 instance with 4 CPUs and 6 GB of memory
To handle more streams or to reduce detection latency, the service can be scaled. For details on how to scale the service, contact StackState support.
The AAD Kubernetes service is stateless and survives restarts, so it can be restarted or relocated to a different Kubernetes node at any time. To take full advantage of this capability, it is recommended to run the service on low-cost AWS Spot Instances or Azure low-priority VMs.
Standalone deployment consists of two steps: Install the AAD StackPack and install the AAD Kubernetes service.
Install the AAD StackPack from the StackPacks page in StackState.
After installing the AAD StackPack, install the AAD Kubernetes service.
- 1. Get access to quay.io: To be able to pull the Docker image, you will need access to quay.io. Access credentials can be requested from StackState support.
- 2. Add the StackState Helm repo: helm repo add stackstate https://helm.stackstate.io
- 3. Fetch the AAD chart: helm fetch stackstate/anomaly-detection
Create the file values.yaml, including the configuration described below, and save it to disk:
- pullSecretUsername - the image registry username (from step 1).
- instance - the StackState instance URL. This must be a StackState internal URL to keep traffic inside the Kubernetes network and namespace. e.g
- ingress - Ingress provides access to the technical interface of the AAD, which is useful for troubleshooting. The technical interface can be accessed using the kubectl proxy command:
kubectl proxy
After the proxy is running, the technical interface can be accessed using the path below:
http://localhost:8001/api/v1/namespaces/<namespace>/services/http:<release-name>-anomaly-detection:8090/proxy/
Optionally, the technical interface can be exposed using an Ingress configuration. The example below shows how to configure an nginx-ingress controller. Setting up the controller itself is beyond the scope of this document. More information about how to set up Ingress can be found at:
# Key nesting below is reconstructed from the option names used on this page;
# verify it with: helm show all stackstate/anomaly-detection
image:
  pullSecretUsername: <image registry username>
# StackState instance URL
stackstate:
  instance: <stackstate instance url>
# status UI ingress (the configuration below is an example for the nginx ingress controller)
ingress:
  hostname: <domain name> # e.g. spotlight.domain.com
  port: 8090 # status page will be available on spotlight.domain.com:8090
  annotations:
    external-dns.alpha.kubernetes.io/hostname: <domain name> # e.g. spotlight.domain.com
  hosts:
    - host: <domain name> # e.g. spotlight.domain.com
Details of all configuration options are available in the anomaly-detection chart with the command below.
helm show all stackstate/anomaly-detection
By default, the AAD Kubernetes service is configured to use Kubernetes token authentication. This requires no additional configuration, but the AAD Kubernetes service must be installed into the same cluster and namespace as StackState. If this is not possible, there are two other options for authentication:
- StackState API token authentication. The token can be obtained from the User Profile page.
  ...
  stackstate:
    authType: api-token
    apiToken: <stackstate api token>
  ...
- Cookie authentication. This type of authentication is not recommended and exists only for troubleshooting and testing purposes.
  ...
  stackstate:
    authType: cookie
    username: <username>
    password: <password>
  ...
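The configuration options and the API token authentication settings above can be combined into a single values file. The sketch below is one way to generate it, not a definitive layout: the function name write_values is hypothetical, and the nesting of pullSecretUsername under image and of instance under stackstate is an assumption inferred from the option names on this page. Check it against the output of helm show all stackstate/anomaly-detection before use.

```shell
#!/bin/sh
# Write a values file that uses StackState API token authentication.
# Assumption: pullSecretUsername sits under "image" and instance under
# "stackstate" (inferred from this page, not confirmed against the chart).
# Usage: write_values <file> <registry-username> <instance-url> <api-token>
write_values() {
  cat > "$1" <<EOF
image:
  pullSecretUsername: "$2"
stackstate:
  instance: "$3"
  authType: api-token
  apiToken: "$4"
EOF
}
```

For example, write_values ./values.yaml followed by the image registry username, the StackState instance URL and the API token produces a file that can be passed to Helm. The file contains a secret, so restrict its permissions accordingly.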
Run the command below, specifying the StackState namespace and the image registry password. Note that the AAD Kubernetes service must be installed in the same namespace as StackState to be able to use the default token authentication. Otherwise, use one of the other authentication types described above.
helm upgrade anomaly-detector stackstate/anomaly-detection \
  --install \
  --namespace <stackstate-namespace> \
  --set image.pullSecretPassword=<image registry password> \
  --values ./values.yaml
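Once the Helm command has completed, it can be useful to confirm that the release exists and that its pods have started. The helper below is a minimal sketch: the function name aad_status is hypothetical, and the release name anomaly-detector is taken from the install command on this page. It only wraps the standard helm status and kubectl get pods commands.

```shell
#!/bin/sh
# Show the Helm release status and the pods in the given namespace.
# Usage: aad_status <stackstate-namespace>
aad_status() {
  if [ -z "$1" ]; then
    echo "usage: aad_status <namespace>" >&2
    return 2
  fi
  # Release name as used in the install command on this page.
  helm status anomaly-detector --namespace "$1" &&
    kubectl get pods --namespace "$1"
}
```

The pod listing covers the whole namespace, so when the AAD shares a namespace with StackState, its pods appear alongside the StackState pods.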
The AAD will need to train on your data before it can begin reporting anomalies. With data collected in 1-minute buckets, the AAD requires a 2-hour training period. If historical data exists for relevant metric streams, this will also be used for training the AAD. In this case, the first results can be expected within an hour. Up to a day of data is used for training. After the initial training, the AAD will continuously refine its model and adapt to changes in the data.
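To put the training figures in perspective, the number of 1-minute buckets in a given window is simple arithmetic. The helper name buckets_in is hypothetical, and the sketch only illustrates the numbers quoted above. It does not describe how the AAD itself counts or selects data.

```shell
#!/bin/sh
# buckets_in <hours> <bucket-minutes>: number of buckets in a time window.
buckets_in() {
  echo $(( $1 * 60 / $2 ))
}

buckets_in 2 1    # 2-hour training period, 1-minute buckets -> 120
buckets_in 24 1   # one day of history, 1-minute buckets -> 1440
```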
Upgrading a standalone AAD instance consists of two steps: upgrade the AAD StackPack and upgrade the AAD Kubernetes service.
When a new version of the StackPack is available, click Upgrade on the AAD StackPack page.
The AAD Kubernetes service is upgraded when a new version of the Helm chart becomes available. To upgrade, follow the installation steps starting from step 3 (fetching the new AAD chart).
To deactivate the AAD, uninstall the AAD StackPack. The AAD Kubernetes service will continue running and reserve its compute resources, but anomaly detection will not be executed.
To re-enable the AAD Kubernetes service, you can simply install the AAD StackPack again. It is not necessary to repeat the installation of the AAD Kubernetes service.
To completely remove the AAD Kubernetes service and the AAD StackPack:
- Uninstall the AAD Kubernetes service: helm delete anomaly-detector
- Uninstall the AAD StackPack
The status UI provides details on the technical state of the AAD. You can use it to retrieve information about scheduling progress, possible errors, the ML models selected and job statistics.
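To access the status UI, run kubectl proxy. While the proxy is running, the UI is accessible at the URL:
http://localhost:8001/api/v1/namespaces/<namespace>/services/http:<release-name>-anomaly-detection:8090/proxy/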
Alternatively, an Ingress can be configured for the anomaly-detection deployment to expose the status UI (for details, see the ingress option in the configuration of the AAD Kubernetes service).
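The kubectl proxy path for the status UI follows the fixed pattern given in the configuration section, so it can be assembled from the namespace and the Helm release name. The function below is a small sketch with a hypothetical name, status_ui_url. It assumes kubectl proxy is listening on its default port 8001, as in the path shown on this page.

```shell
#!/bin/sh
# Build the kubectl proxy URL for the AAD status UI.
#   $1 - namespace the AAD Kubernetes service is installed in
#   $2 - Helm release name (anomaly-detector in the install command on this page)
status_ui_url() {
  printf 'http://localhost:8001/api/v1/namespaces/%s/services/http:%s-anomaly-detection:8090/proxy/\n' \
    "$1" "$2"
}
```

With kubectl proxy running in another terminal, the printed URL can be opened in a browser or fetched with a tool such as curl.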
Common questions that can be answered in the status UI:
Is the AAD Kubernetes service running? If the status UI is accessible, the service is running. If the status UI is not available, either the service is not running or the Ingress has not been configured (see the install section).
Can the AAD Kubernetes service reach StackState? Check the status UI sections Top errors and Last stream polling results. Errors here usually indicate connection problems.
Has the AAD Kubernetes service selected metric streams for anomaly detection? The status UI section Anomaly Detection Summary shows the total time of all registered streams. If no streams are selected, it will be zero.
Is the AAD Kubernetes service detecting anomalies? The status UI section Top Anomalous Streams shows the streams with the highest number of anomalies. No streams in this section means that no anomalies have been detected. The status UI section Anomaly Detection Summary shows other relevant metrics, such as total time of all registered streams, total checked time and total time of all anomalies detected.
Is the AAD Kubernetes service scheduling streams? The status UI tab Job Progress shows a ranked list of streams with scheduling progress, including the last time each stream was scheduled.