Stackstate-Docker Integration

Overview

Get metrics from Docker in real time to:

  • Visualize your containers’ performance.
  • Correlate the performance of containers with the applications running inside.

There are three ways to set up the Docker integration: install the agent on the host, in a single privileged container, or in each individual container.

For more details about configuring this integration refer to the following file(s) on GitHub:

Note: docker_daemon replaces the older docker integration going forward.

Installation

Host Installation

  1. Ensure Docker is running on the host.
  2. Install the agent as described in the agent installation instructions for your host OS.
  3. Add the agent user to the docker group: usermod -a -G docker sts-agent
  4. Create docker_daemon.yaml by copying the example file in the agent conf.d directory. With a standard Docker install on the host, you should not need to change anything for the integration to work.
  5. To enable other integrations, use docker ps to identify the ports used by the corresponding applications.
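
Step 4 above can be sketched as a small shell helper. This is only an illustration: the function name enable_docker_check is hypothetical, the example file name docker_daemon.yaml.example is an assumption, and the conf.d location depends on how the agent was installed, so it is passed in as an argument.

```shell
# Create docker_daemon.yaml from the example file in the given conf.d directory.
# Assumption: the example file is named docker_daemon.yaml.example.
enable_docker_check() {
  conf_dir=$1
  example="$conf_dir/docker_daemon.yaml.example"
  target="$conf_dir/docker_daemon.yaml"

  # Fail early if the example file is not where we expect it.
  [ -f "$example" ] || { echo "missing $example" >&2; return 1; }

  # Never overwrite a configuration that already exists.
  [ -e "$target" ] || cp "$example" "$target"
}

# Possible usage on the host (the conf.d path is a placeholder):
#   usermod -a -G docker sts-agent          # step 3
#   enable_docker_check /path/to/agent/conf.d   # step 4
#   docker ps --format '{{.Names}}\t{{.Ports}}' # step 5: list ports per container
```

The helper leaves an existing docker_daemon.yaml untouched, so it is safe to run again after the file has been edited by hand.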

Single Container Installation

TODO this part of the documentation has to be updated.

Each Container Installation

  1. Ensure Docker is running on the host.
  2. Add a RUN command to the Dockerfile as listed in the agent installation instructions in the app for the OS used in the container. For instance, if the container is based on an Ubuntu image, use the Debian installation instructions.

Validate Installation

  1. Restart the agent.
  2. Execute the info command and verify that the integration check has passed. The output of the command should contain a section similar to the following:

     Checks
     ======
    
       [...]
       docker_daemon
       -------------
         - instance #0 [OK]
         - Collected 50 metrics, 0 events & 2 service checks
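
The manual check above could be automated along these lines. This is a rough sketch, not part of the agent: the function name docker_check_ok is hypothetical, and it assumes the info output has the layout shown above, with the docker_daemon section ending at the next blank line.

```shell
# Read the info output on stdin and succeed only if the docker_daemon
# section lists at least one instance and every listed instance is [OK].
docker_check_ok() {
  awk '
    # The section starts at a line that contains only the check name.
    /^[ \t]*docker_daemon[ \t]*$/ { in_check = 1; next }
    # Assumption: a blank line ends the section.
    in_check && /^[ \t]*$/ { in_check = 0 }
    in_check && /instance #[0-9]+/ {
      seen++
      if ($0 !~ /\[OK\]/) bad++
    }
    END { exit !(seen > 0 && bad == 0) }
  '
}
```

To use it, pipe the output of the agent info command into docker_check_ok and inspect the exit status; a non-zero status means the section was missing or an instance did not report [OK].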
    

Troubleshooting

Single container install not working on Amazon Linux

Try using the following command to run the container:

docker run -d --name sts-agent -h `hostname` \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /proc/:/host/proc/:ro \
  -v /cgroup/:/host/sys/fs/cgroup:ro \
  -e API_KEY={your_api_key_here} \
  stackstate/docker-sts-agent

Certificate not trusted error

If you get an error in the collector log like one of the following:

  • 2017-03-31 12:18:44 CEST | CRITICAL | dd.collector | checks.docker_daemon(docker_daemon.py:185) | Error while fetching server API version: ('Connection aborted.', BadStatusLine('\x15\x03\x01\x00\x02\x02\n',))
  • a line with CERTIFICATE_ERROR

Most likely the URL in the configuration does not match the hostname the certificate was issued for. You can test the certificate with the following curl command ($HOST must be a hostname for which the certificate is valid):

curl https://$HOST:2376/images/json \
--cert ~/.docker/cert.pem \
--key ~/.docker/key.pem \
--cacert ~/.docker/ca.pem
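
To compare the configured URL with the certificate, it can help to pull the hostname out of the URL first. The helper below is a minimal sketch; url_host is a hypothetical name, and it does not handle bracketed IPv6 addresses.

```shell
# Print the host part of a URL such as https://docker.example.com:2376/path.
url_host() {
  rest=${1#*://}                # drop the scheme, if any
  rest=${rest%%/*}              # drop the path, if any
  printf '%s\n' "${rest%%:*}"   # drop the port, if any
}

# Possible usage: print the host from the configured URL, then list the
# names the server certificate was issued for and compare them by eye.
#   HOST=$(url_host "https://docker.example.com:2376")
#   openssl s_client -connect "$HOST:2376" </dev/null 2>/dev/null |
#     openssl x509 -noout -text
```

If the host printed by url_host does not appear in the certificate subject or its alternative names, that mismatch is the likely cause of the error.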

Metrics

docker.cpu.system
(gauge)
The fraction of time the CPU is executing system calls on behalf of processes of this container
shown as fraction
docker.cpu.system.95percentile
(gauge)
95th percentile of docker.cpu.system
shown as fraction
docker.cpu.system.avg
(gauge)
Average value of docker.cpu.system
shown as fraction
docker.cpu.system.count
(rate)
The rate that the value of docker.cpu.system was sampled
shown as sample/second
docker.cpu.system.max
(gauge)
Max value of docker.cpu.system
shown as fraction
docker.cpu.system.median
(gauge)
Median value of docker.cpu.system
shown as fraction
docker.cpu.user
(gauge)
The fraction of time the CPU is under direct control of processes of this container
shown as fraction
docker.cpu.user.95percentile
(gauge)
95th percentile of docker.cpu.user
shown as fraction
docker.cpu.user.avg
(gauge)
Average value of docker.cpu.user
shown as fraction
docker.cpu.user.count
(rate)
The rate that the value of docker.cpu.user was sampled
shown as sample/second
docker.cpu.user.max
(gauge)
Max value of docker.cpu.user
shown as fraction
docker.cpu.user.median
(gauge)
Median value of docker.cpu.user
shown as fraction
docker.cpu.throttled
(gauge)
Number of times the cgroup has been throttled
docker.mem.cache
(gauge)
The amount of memory that is being used to cache data from disk (e.g. memory contents that can be associated precisely with a block on a block device)
shown as byte
docker.mem.cache.95percentile
(gauge)
95th percentile value of docker.mem.cache
shown as byte
docker.mem.cache.avg
(gauge)
Average value of docker.mem.cache
shown as byte
docker.mem.cache.count
(rate)
The rate that the value of docker.mem.cache was sampled
shown as sample/second
docker.mem.cache.max
(gauge)
Max value of docker.mem.cache
shown as byte
docker.mem.cache.median
(gauge)
Median value of docker.mem.cache
shown as byte
docker.mem.rss
(gauge)
The amount of non-cache memory that belongs to the container's processes. Used for stacks, heaps, etc.
shown as byte
docker.mem.rss.95percentile
(gauge)
95th percentile value of docker.mem.rss
shown as byte
docker.mem.rss.avg
(gauge)
Average value of docker.mem.rss
shown as byte
docker.mem.rss.count
(rate)
The rate that the value of docker.mem.rss was sampled
shown as sample/second
docker.mem.rss.max
(gauge)
Max value of docker.mem.rss
shown as byte
docker.mem.rss.median
(gauge)
Median value of docker.mem.rss
shown as byte
docker.mem.swap
(gauge)
The amount of swap currently used by the container
shown as byte
docker.mem.swap.95percentile
(gauge)
95th percentile value of docker.mem.swap
shown as byte
docker.mem.swap.avg
(gauge)
Average value of docker.mem.swap
shown as byte
docker.mem.swap.count
(rate)
The rate that the value of docker.mem.swap was sampled
shown as sample/second
docker.mem.swap.max
(gauge)
Max value of docker.mem.swap
shown as byte
docker.mem.swap.median
(gauge)
Median value of docker.mem.swap
shown as byte
docker.mem.active_anon
(gauge)
The amount of "active" RSS memory. Active memory is not swapped to disk
shown as byte
docker.mem.inactive_anon
(gauge)
The amount of "inactive" RSS memory. Inactive memory is swapped to disk when necessary
shown as byte
docker.mem.active_file
(gauge)
The amount of "active" cache memory. Active memory is reclaimed by the system only after "inactive" has been reclaimed
shown as byte
docker.mem.inactive_file
(gauge)
The amount of "inactive" cache memory. Inactive memory may be reclaimed first when the system needs memory
shown as byte
docker.mem.mapped_file
(gauge)
The amount of memory mapped by the processes in the control group
shown as byte
docker.mem.pgfault
(gauge)
The rate at which processes in the container trigger page faults by accessing a nonexistent or protected part of their virtual address space. Usually a page fault of this type results in a segmentation fault
docker.mem.pgmajfault
(gauge)
The rate at which processes in the container trigger page faults by accessing a part of their virtual address space that was swapped out or corresponds to a mapped file. Usually a page fault of this type results in fetching data from disk instead of from memory
docker.mem.pgpgin
(gauge)
The rate at which pages are "charged" (added to the accounting) of a cgroup
shown as page/second
docker.mem.pgpgout
(gauge)
The rate at which pages are "uncharged" (removed from the accounting) of a cgroup
shown as page/second
docker.mem.unevictable
(gauge)
The amount of memory that cannot be reclaimed. Usually this memory contains sensitive data and has been locked to ensure that it never swaps to disk
shown as byte
docker.container.size_rw
(gauge)
Total size of all the files in the container which have been created or changed by processes running in the container
shown as byte
docker.container.size_rw.95percentile
(gauge)
95th percentile of docker.container.size_rw
shown as byte
docker.container.size_rw.avg
(gauge)
Average value of docker.container.size_rw
shown as byte
docker.container.size_rw.count
(rate)
The rate that the value of docker.container.size_rw was sampled
shown as sample/second
docker.container.size_rw.max
(gauge)
Max value of docker.container.size_rw
shown as byte
docker.container.size_rw.median
(gauge)
Median value of docker.container.size_rw
shown as byte
docker.container.size_rootfs
(gauge)
Total size of all the files in the container
shown as byte
docker.container.size_rootfs.95percentile
(gauge)
95th percentile of docker.container.size_rootfs
shown as byte
docker.container.size_rootfs.avg
(gauge)
Average value of docker.container.size_rootfs
shown as byte
docker.container.size_rootfs.count
(rate)
The rate that the value of docker.container.size_rootfs was sampled
shown as sample/second
docker.container.size_rootfs.max
(gauge)
Max value of docker.container.size_rootfs
shown as byte
docker.container.size_rootfs.median
(gauge)
Median value of docker.container.size_rootfs
shown as byte
docker.containers.running
(gauge)
The number of containers running on this host
docker.containers.stopped
(gauge)
The number of containers stopped on this host
docker.images.available
(gauge)
The number of top-level images
docker.images.intermediate
(gauge)
The number of intermediate images, which are intermediate layers that make up other images
docker.mem.limit
(gauge)
The memory limit for the container, if set
shown as byte
docker.mem.limit.95percentile
(gauge)
95th percentile of docker.mem.limit. Ordinarily this value will not change
shown as byte
docker.mem.limit.avg
(gauge)
Average value of docker.mem.limit. Ordinarily this value will not change
shown as byte
docker.mem.limit.count
(rate)
The rate that the value of docker.mem.limit was sampled
shown as sample/second
docker.mem.limit.max
(gauge)
Max value of docker.mem.limit. Ordinarily this value will not change
shown as byte
docker.mem.limit.median
(gauge)
Median value of docker.mem.limit. Ordinarily this value will not change
shown as byte
docker.mem.sw_limit
(gauge)
The swap + memory limit for the container, if set
shown as byte
docker.mem.sw_limit.95percentile
(gauge)
95th percentile of docker.mem.sw_limit. Ordinarily this value will not change
shown as byte
docker.mem.sw_limit.avg
(gauge)
Average value of docker.mem.sw_limit. Ordinarily this value will not change
shown as byte
docker.mem.sw_limit.count
(rate)
The rate that the value of docker.mem.sw_limit was sampled
shown as sample/second
docker.mem.sw_limit.max
(gauge)
Max value of docker.mem.sw_limit. Ordinarily this value will not change
shown as byte
docker.mem.sw_limit.median
(gauge)
Median value of docker.mem.sw_limit. Ordinarily this value will not change
shown as byte
docker.mem.in_use
(gauge)
The fraction of used memory to available memory, if the limit is set
shown as fraction
docker.mem.in_use.95percentile
(gauge)
95th percentile of docker.mem.in_use
shown as fraction
docker.mem.in_use.avg
(gauge)
Average value of docker.mem.in_use
shown as fraction
docker.mem.in_use.count
(rate)
The rate that the value of docker.mem.in_use was sampled
shown as sample/second
docker.mem.in_use.max
(gauge)
Max value of docker.mem.in_use
shown as fraction
docker.mem.in_use.median
(gauge)
Median value of docker.mem.in_use
shown as fraction
docker.mem.sw_in_use
(gauge)
The fraction of used swap + memory to available swap + memory, if the limit is set
shown as fraction
docker.mem.sw_in_use.95percentile
(gauge)
95th percentile of docker.mem.sw_in_use
shown as fraction
docker.mem.sw_in_use.avg
(gauge)
Average value of docker.mem.sw_in_use
shown as fraction
docker.mem.sw_in_use.count
(rate)
The rate that the value of docker.mem.sw_in_use was sampled
shown as sample/second
docker.mem.sw_in_use.max
(gauge)
Max value of docker.mem.sw_in_use
shown as fraction
docker.mem.sw_in_use.median
(gauge)
Median value of docker.mem.sw_in_use
shown as fraction
docker.io.read_bytes
(gauge)
Bytes read per second from disk by the processes of the container
shown as byte/second
docker.io.read_bytes.95percentile
(gauge)
95th percentile of docker.io.read_bytes
shown as byte/second
docker.io.read_bytes.avg
(gauge)
Average value of docker.io.read_bytes
shown as byte/second
docker.io.read_bytes.count
(rate)
The rate that the value of docker.io.read_bytes was sampled
shown as sample/second
docker.io.read_bytes.max
(gauge)
Max value of docker.io.read_bytes
shown as byte/second
docker.io.read_bytes.median
(gauge)
Median value of docker.io.read_bytes
shown as byte/second
docker.io.write_bytes
(gauge)
Bytes written per second to disk by the processes of the container
shown as byte/second
docker.io.write_bytes.95percentile
(gauge)
95th percentile of docker.io.write_bytes
shown as byte/second
docker.io.write_bytes.avg
(gauge)
Average value of docker.io.write_bytes
shown as byte/second
docker.io.write_bytes.count
(rate)
The rate that the value of docker.io.write_bytes was sampled
shown as sample/second
docker.io.write_bytes.max
(gauge)
Max value of docker.io.write_bytes
shown as byte/second
docker.io.write_bytes.median
(gauge)
Median value of docker.io.write_bytes
shown as byte/second
docker.image.virtual_size
(gauge)
Size of all layers of the image on disk
shown as byte
docker.image.size
(gauge)
Size of all layers of the image on disk
shown as byte
docker.net.bytes_rcvd
(gauge)
Bytes received per second from the network
shown as byte/second
docker.net.bytes_rcvd.95percentile
(gauge)
95th percentile of docker.net.bytes_rcvd
shown as byte/second
docker.net.bytes_rcvd.avg
(gauge)
Average value of docker.net.bytes_rcvd
shown as byte/second
docker.net.bytes_rcvd.count
(rate)
The rate that the value of docker.net.bytes_rcvd was sampled
shown as sample/second
docker.net.bytes_rcvd.max
(gauge)
Max value of docker.net.bytes_rcvd
shown as byte/second
docker.net.bytes_rcvd.median
(gauge)
Median value of docker.net.bytes_rcvd
shown as byte/second
docker.net.bytes_sent
(gauge)
Bytes sent per second to the network
shown as byte/second
docker.net.bytes_sent.95percentile
(gauge)
95th percentile of docker.net.bytes_sent
shown as byte/second
docker.net.bytes_sent.avg
(gauge)
Average value of docker.net.bytes_sent
shown as byte/second
docker.net.bytes_sent.count
(rate)
The rate that the value of docker.net.bytes_sent was sampled
shown as sample/second
docker.net.bytes_sent.max
(gauge)
Max value of docker.net.bytes_sent
shown as byte/second
docker.net.bytes_sent.median
(gauge)
Median value of docker.net.bytes_sent
shown as byte/second
docker.disk.used
(gauge)
Average number of bytes the container's disk uses
shown as byte
docker.disk.free
(gauge)
Average number of bytes the container's disk has free
shown as byte
docker.disk.total
(gauge)
Average number of total bytes in the container's disk
shown as byte