Using Autodiscovery with Docker

Autodiscovery was previously called Service Discovery. It's still called Service Discovery throughout the Agent's code and in some configuration options.

Docker is being adopted rapidly. Orchestration platforms like Docker Swarm, Kubernetes, and Amazon ECS make running Docker-ized services easier and more resilient by managing orchestration and replication across hosts. But all of that makes monitoring more difficult. How can you reliably monitor a service which is unpredictably shifting from one host to another?

The StackState Agent can automatically track which services are running where, thanks to its Autodiscovery feature. Autodiscovery lets you define configuration templates for Agent checks and specify which containers each check should apply to. The Agent enables, disables, and regenerates static check configurations from the templates as containers come and go. When your NGINX container moves from one IP address to another, Autodiscovery helps the Agent update its NGINX check configuration with the new IP address so it can keep collecting NGINX metrics without any action on your part.

How it Works

In a traditional non-container environment, StackState Agent configuration is static, like the environment in which it runs. The Agent reads check configurations from disk when it starts, and as long as it’s running, it continuously runs every configured check. The configuration files are static, and any network-related options configured within them serve to identify specific instances of a monitored service (e.g. a Redis instance at a fixed IP address and port). When an Agent check cannot connect to such a service, you’ll be missing metrics until you troubleshoot the issue. The Agent check will retry its failed connection attempts until an administrator revives the monitored service or fixes the check’s configuration.

With Autodiscovery enabled, the Agent runs checks differently.

Different Configuration

Static configuration files aren’t suitable for checks that collect data from ever-changing network endpoints, so Autodiscovery uses templates for check configuration. In each template, the Agent looks for two template variables—%%host%% and %%port%%—to appear in place of any normally-hardcoded network options. For example: a template for the Agent’s Go Expvar check would contain the option expvar_url: http://%%host%%:%%port%%. For containers that have more than one IP address or exposed port, you can direct Autodiscovery to pick the right ones by using template variable indexes.
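To make the substitution concrete, here is a minimal Python sketch of how a template value could be rendered into a static configuration. It is not the Agent's actual implementation: the helper name `render_template` and the sample address and port are hypothetical, chosen only for illustration.

```python
import re

# Matches template variables such as %%host%% or %%port%%.
TEMPLATE_VAR = re.compile(r"%%(\w+)%%")


def render_template(value, variables):
    """Replace every %%name%% in `value` with variables[name].

    Hypothetical helper for illustration only. Raises KeyError if the
    template uses a variable that was not supplied.
    """
    return TEMPLATE_VAR.sub(lambda m: str(variables[m.group(1)]), value)


# Illustrative values; a real Agent would read these from the container.
template = {"expvar_url": "http://%%host%%:%%port%%"}
container = {"host": "172.17.0.2", "port": 8080}

static_config = {key: render_template(val, container) for key, val in template.items()}
print(static_config)
```

The rendered dictionary is what the text calls a static check configuration: the same shape as the template, with concrete network values in place of the variables.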

Because templates don’t identify specific instances of a monitored service—which %%host%%? which %%port%%?—Autodiscovery needs one or more container identifiers for each template so it can determine which IP(s) and Port(s) to substitute into the templates. For Docker, container identifiers are image names or container labels.

Finally, Autodiscovery can load check templates from places other than disk. Other possible template sources include key-value stores like Consul, and, when running on Kubernetes, Pod annotations.

Different Execution

When the Agent starts with Autodiscovery enabled, it loads check templates from all available template sources—not just one or another—along with the templates’ container identifiers. Unlike in a traditional Agent setup, the Agent doesn’t run all checks all the time; it decides which checks to enable by inspecting all containers currently running on the same host as the Agent.

As the Agent inspects each running container, it checks if the container matches any of the container identifiers from any loaded templates. For each match, the Agent generates a static check configuration by substituting the matching container’s IP address and port. Then it enables the check using the static configuration.

The Agent watches for Docker events—container creation, destruction, starts, and stops—and enables, disables, and regenerates static check configurations on such events.

How to set it up

Running the Agent Container

No matter what container orchestration platform you use, you’ll first need to run a single docker-sts-agent container on every host in your cluster. If you use Kubernetes, see the Kubernetes integration page for instructions on running docker-sts-agent.

If you use Docker Swarm, run the following command on one of your manager nodes:

docker service create \
  --name sts-agent \
  --mode global \
  --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
  --mount type=bind,source=/proc/,target=/host/proc/,ro=true \
  --mount type=bind,source=/sys/fs/cgroup/,target=/host/sys/fs/cgroup,ro=true \
  -e SD_BACKEND=docker \
  stackstate/docker-sts-agent:latest

Note that if you want the Agent to autodiscover JMX-based checks, you MUST:

  1. Use the stackstate/docker-sts-agent:latest-jmx image. This image is based on latest, but it includes a JVM, which the Agent needs in order to run jmxfetch.
  2. Pass the environment variable SD_JMX_ENABLE=yes when starting stackstate/docker-sts-agent:latest-jmx.

Setting up Check Templates

Each Template Source section below shows a different way to configure check templates and their container identifiers.

Template Source: Files (Auto-conf)

Storing templates as local files is easy to understand and doesn’t require an external service or a specific orchestration platform. The downside is that you must restart your Agent containers each time you change, add, or remove templates.

The Agent looks for Autodiscovery templates in its conf.d/auto_conf directory, which contains default templates for a number of checks, including Apache (apache.yaml) and Redis (redisdb.yaml).

These templates may suit you in basic cases, but if you need to use custom check configurations (say you want to enable extra check options, use different container identifiers, or use template variable indexing), you’ll have to write your own auto-conf files. You can then provide them to the Agent container in one of these ways:

  1. On Kubernetes, add them using ConfigMaps
  2. Build a modified version of the docker-sts-agent image that includes them

Example: Apache check

Here’s the apache.yaml template packaged with docker-sts-agent:

docker_images:
  - httpd

init_config:

instances:
  - apache_status_url: http://%%host%%/server-status?auto

It looks like a minimal Apache check configuration, but notice the docker_images option. This required option lets you provide container identifiers. Autodiscovery will apply this template to any containers on the same host that run an httpd image.

Any httpd image. Suppose you have one container running library/httpd:latest and another running yourusername/httpd:v2. Autodiscovery will apply the above template to both containers. When it’s loading auto-conf files, Autodiscovery cannot distinguish between identically-named images from different sources or with different tags, and you must provide short names for container images, e.g. httpd, NOT library/httpd:latest.
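The short-name matching described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Agent's code: `short_image_name` and `template_applies` are hypothetical helpers, and the parsing rule (drop everything up to the last slash, then any digest or tag) is an assumption about how a short name is derived.

```python
def short_image_name(image):
    """Reduce an image reference to its short name.

    Hypothetical helper: 'library/httpd:latest' -> 'httpd'.
    """
    # Drop any registry or namespace prefix (everything up to the last slash).
    name = image.rsplit("/", 1)[-1]
    # Drop a digest suffix such as '@sha256:...', if present.
    name = name.split("@", 1)[0]
    # Drop a tag suffix such as ':latest', if present.
    return name.split(":", 1)[0]


def template_applies(docker_images, image):
    """Return True if a template's docker_images list matches the image."""
    return short_image_name(image) in docker_images


for image in ["library/httpd:latest", "yourusername/httpd:v2", "redis:3.2"]:
    print(image, template_applies(["httpd"], image))
```

Both httpd images reduce to the same short name, which is why a single auto-conf template cannot tell them apart.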

If this is too limiting—if you need to apply different check configurations to different containers running the same image—use labels to identify the containers. Label each container differently, then add each label to any template file’s docker_images list (yes, docker_images is where to put any kind of container identifier, not just images).

Template Source: Key-value Store

Autodiscovery can use Consul, etcd, and Zookeeper as template sources. To use a key-value store, you must configure it in stackstate.conf or in environment variables passed to the docker-sts-agent container.

Configure in stackstate.conf

In the stackstate.conf file, set the sd_config_backend, sd_backend_host, and sd_backend_port options to, respectively, the key-value store type—etcd, consul, or zookeeper—and the IP address and port of your key-value store:

# For now only Docker is supported so you just need to un-comment this line.
service_discovery_backend: docker

# Define which key/value store must be used to look for configuration templates.
# Default is etcd. Consul is also supported.
sd_config_backend: etcd

# Settings for connecting to the backend. The port shown is the default; edit these if you run a different config.
sd_backend_host: <KEY_VALUE_STORE_IP>
sd_backend_port: 4001

# By default, the agent will look for the configuration templates under the
# `/stackstate/check_configs` key in the back-end.
# If you wish otherwise, uncomment this option and modify its value.
# sd_template_dir: /stackstate/check_configs

# If your Consul store requires token authentication for service discovery, you can define that token here.
# consul_token: f45cbd0b-5022-samp-le00-4eaa7c1f40f1

If you’re using Consul and the Consul cluster requires authentication, set consul_token.

Restart the Agent to effect the configuration change.

Configure in environment variables

If you prefer to use environment variables, pass the same options to the container when starting it:

docker service create \
  --name sts-agent \
  --mode global \
  --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
  --mount type=bind,source=/proc/,target=/host/proc/,ro=true \
  --mount type=bind,source=/sys/fs/cgroup/,target=/host/sys/fs/cgroup,ro=true \
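  -e SD_BACKEND=docker \
  -e SD_CONFIG_BACKEND=etcd \
  -e SD_BACKEND_HOST=<KEY_VALUE_STORE_IP> \
  -e SD_BACKEND_PORT=4001 \
  stackstate/docker-sts-agent:latest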

Note that the option to enable Autodiscovery is called service_discovery_backend in stackstate.conf, but it’s called just SD_BACKEND as an environment variable.

With the key-value store enabled as a template source, the Agent looks for templates under the key /stackstate/check_configs. Autodiscovery expects a key-value hierarchy like this:

/stackstate/
  check_configs/
    docker_image_1/                 # container identifier, e.g. httpd
      - check_names: [<CHECK_NAME>] # e.g. apache
      - init_configs: [<INIT_CONFIG>]
      - instances: [<INSTANCE_CONFIG>]

Each template is a 3-tuple: check name, init_config, and instances. The docker_images option from the previous section, which provided container identifiers to Autodiscovery, is not required here; for key-value stores, container identifiers appear as first-level keys under check_configs. (Also note that the file-based template in the previous section didn’t need a check name like this example does; there, the Agent inferred the check name from the file name.)

Example: Apache check

The following etcd commands create an Apache check template equivalent to that from the previous section’s example:

etcdctl mkdir /stackstate/check_configs/httpd
etcdctl set /stackstate/check_configs/httpd/check_names '["apache"]'
etcdctl set /stackstate/check_configs/httpd/init_configs '[{}]'
etcdctl set /stackstate/check_configs/httpd/instances '[{"apache_status_url": "http://%%host%%/server-status?auto"}]'

Notice that each of the three values is a list. Autodiscovery assembles list items into check configurations based on shared list indexes. In this case, it composes the first (and only) check configuration from check_names[0], init_configs[0] and instances[0].

Unlike auto-conf files, key-value stores may use the short OR long image name as container identifiers, e.g. httpd OR library/httpd:latest. The next example uses a long name.

Example: Apache check with website availability monitoring

The following etcd commands create the same Apache template and add an HTTP check template to monitor whether the website created by the Apache container is available:

etcdctl set /stackstate/check_configs/library/httpd:latest/check_names '["apache", "http_check"]'
etcdctl set /stackstate/check_configs/library/httpd:latest/init_configs '[{}, {}]'
etcdctl set /stackstate/check_configs/library/httpd:latest/instances '[{"apache_status_url": "http://%%host%%/server-status?auto"},{"name": "My service", "url": "http://%%host%%", "timeout": 1}]'

Again, the order of each list matters. The Agent can only generate the HTTP check configuration correctly if all parts of its configuration have the same index across the three lists (they do—the index is 1).
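The index-based pairing can be sketched in Python as below. This is a simplified illustration, not the Agent's implementation: `assemble_checks` is a hypothetical helper, and rejecting lists of unequal length is an assumption about sensible behavior rather than documented Agent behavior.

```python
import json


def assemble_checks(check_names, init_configs, instances):
    """Pair up the three JSON lists by index into per-check configurations."""
    names = json.loads(check_names)
    inits = json.loads(init_configs)
    insts = json.loads(instances)
    # Assumption: mismatched lengths are treated as a configuration error.
    if not (len(names) == len(inits) == len(insts)):
        raise ValueError("the three lists must have the same length")
    return [
        {"check_name": name, "init_config": init, "instance": inst}
        for name, init, inst in zip(names, inits, insts)
    ]


# The same three values as stored in the key-value store above.
checks = assemble_checks(
    '["apache", "http_check"]',
    '[{}, {}]',
    '[{"apache_status_url": "http://%%host%%/server-status?auto"},'
    ' {"name": "My service", "url": "http://%%host%%", "timeout": 1}]',
)
for check in checks:
    print(check["check_name"], "->", check["instance"])
```

Because the values are parsed as JSON, every key must be quoted; running the stored strings through a parser like this before writing them to the store is a cheap way to catch mistakes.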


Template Variable Indexes

For containers that have many IP addresses or expose many ports, you can tell Autodiscovery which ones to choose by appending an underscore to the template variable, followed by an integer, e.g. %%host_0%%, %%port_4%%. After inspecting the container, Autodiscovery sorts the IPs and ports numerically and in ascending order, so for a container that exposes ports 80, 443, and 8443, %%port_0%% refers to port 80. Non-indexed template variables refer to the last item in the sorted list, so in this case, %%port%% means port 8443.

You can also add a network name suffix to the %%host%% variable—%%host_bridge%%, %%host_swarm%%, etc—for containers attached to multiple networks. When %%host%% does not have a suffix, Autodiscovery picks the container’s bridge network IP address.
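The port indexing rule can be sketched in Python as follows. This covers only %%port%% and %%port_N%%, not the network-name suffixes for %%host%%, and `resolve_port` is a hypothetical helper rather than part of the Agent.

```python
import re

# Accepts 'port' or 'port_<index>' (the text between the %% markers).
PORT_VAR = re.compile(r"port(?:_(\d+))?")


def resolve_port(variable, exposed_ports):
    """Pick a port for a template variable name such as 'port' or 'port_0'."""
    ports = sorted(exposed_ports)  # numeric, ascending
    match = PORT_VAR.fullmatch(variable)
    if match is None:
        raise ValueError("not a port variable: %r" % variable)
    if match.group(1) is None:
        # No index: use the last item in the sorted list.
        return ports[-1]
    # An out-of-range index raises IndexError here; how the Agent itself
    # handles that case is not covered by this sketch.
    return ports[int(match.group(1))]


exposed = [8443, 80, 443]
print(resolve_port("port_0", exposed))
print(resolve_port("port", exposed))
```

With ports 80, 443, and 8443 exposed, the first call selects 80 and the second selects 8443, matching the example in the text.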

Template Source Precedence

If you provide a template for the same check type via multiple template sources, the Agent looks for templates in the following order (using the first one it finds):

  • Kubernetes annotations
  • Key-value stores
  • Files

So if you configure a redisdb template both in Consul and as a file (conf.d/auto_conf/redisdb.yaml), the Agent will use the template from Consul.
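The precedence rule amounts to a first-match lookup over an ordered list of sources. The Python sketch below illustrates it; the source labels and the `pick_template` helper are hypothetical names, not identifiers used by the Agent.

```python
# Sources in the order the Agent consults them, highest precedence first.
SOURCE_PRECEDENCE = ["kubernetes_annotations", "key_value_store", "files"]


def pick_template(check_name, templates_by_source):
    """Return (source, template) for the first source that defines the check.

    Returns None if no source provides a template for `check_name`.
    """
    for source in SOURCE_PRECEDENCE:
        template = templates_by_source.get(source, {}).get(check_name)
        if template is not None:
            return source, template
    return None


# A redisdb template defined both in a key-value store and as a file.
available = {
    "key_value_store": {"redisdb": {"instances": [{"host": "%%host%%", "port": "%%port%%"}]}},
    "files": {"redisdb": {"instances": [{"host": "%%host%%", "port": "6379"}]}},
}
source, template = pick_template("redisdb", available)
print(source)
```

Here the key-value store wins over the file, as in the Consul example above.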


When you’re not sure if Autodiscovery is loading certain checks you’ve configured, use the Agent’s configcheck init script command. For example, to confirm that your redis template is being loaded from a Kubernetes annotation—not the default auto_conf/redisdb.yaml file:

# docker exec -it <agent_container_name> /etc/init.d/stackstate-agent configcheck
Check "redisdb":
  source --> Kubernetes Pod Annotation
  config --> {'instances': [{u'host': u'', u'port': u'6379', 'tags': [u'image_name:kubernetes/redis-slave', u'kube_namespace:guestbook', u'app:redis', u'role:slave', u'docker_image:kubernetes/redis-slave:v2', u'image_tag:v2', u'kube_replication_controller:redis-slave']}], 'init_config': {}}

To check whether Autodiscovery is loading JMX-based checks:

# docker exec -it <agent_container_name> cat /opt/stackstate-agent/run/jmx_status.yaml
timestamp: 1499296559130
checks:
  failed_checks: {}
  initialized_checks:
    SD-jmx_0:
    - {message: null, service_check_count: 0, status: OK, metric_count: 13, instance_name: SD-jmx_0-}