StackState-Mesos & DC/OS Slave Integration

Overview

This Agent check collects metrics from Mesos slaves for:

  • System load
  • Number of tasks failed, finished, staged, running, etc.
  • Number of executors running, terminated, etc.
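
As a rough illustration of where these numbers come from, the sketch below maps a few entries of a Mesos slave metrics snapshot onto metric names used on this page. It assumes the snapshot is a JSON object keyed by names such as slave/tasks_failed, as served by the slave's /metrics/snapshot endpoint. The mapping table and the map_snapshot function are hypothetical and are not taken from the check's source.

```python
# Hypothetical mapping from Mesos snapshot keys to metric names on this page.
# Assumption: the snapshot uses keys such as "slave/tasks_failed".
SNAPSHOT_TO_METRIC = {
    "system/load_1min": "mesos.stats.system.load_1min",
    "slave/tasks_failed": "mesos.slave.tasks_failed",
    "slave/tasks_running": "mesos.slave.tasks_running",
    "slave/executors_running": "mesos.slave.executors_running",
}


def map_snapshot(snapshot):
    """Return {metric_name: value} for the snapshot keys listed above."""
    return {
        metric: snapshot[key]
        for key, metric in SNAPSHOT_TO_METRIC.items()
        if key in snapshot
    }


# Sample snapshot with made-up values; unknown keys are ignored.
sample = {"system/load_1min": 0.42, "slave/tasks_failed": 3, "unrelated/key": 1}
print(map_snapshot(sample))
```

Keys missing from the snapshot are skipped rather than reported as zero, so a metric only appears when the slave actually exposes it.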

This check also creates a service check for every executor task.
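
One way to picture the per-task service check is the sketch below, which walks a list of executors and derives a status for each task from its Mesos task state. The status constants, the state-to-status mapping, and the service_checks_for function are assumptions made for illustration; the check's actual naming and mapping may differ.

```python
# Hypothetical status codes; a real Agent check defines its own constants.
OK, CRITICAL, UNKNOWN = 0, 2, 3

# Assumption: in-progress and finished states are healthy,
# while failed, killed, and lost tasks are critical.
TASK_STATE_TO_STATUS = {
    "TASK_STAGING": OK,
    "TASK_STARTING": OK,
    "TASK_RUNNING": OK,
    "TASK_FINISHED": OK,
    "TASK_FAILED": CRITICAL,
    "TASK_KILLED": CRITICAL,
    "TASK_LOST": CRITICAL,
}


def service_checks_for(executors):
    """Yield (task_name, status) for every task of every executor."""
    for executor in executors:
        for task in executor.get("tasks", []):
            status = TASK_STATE_TO_STATUS.get(task["state"], UNKNOWN)
            yield task["name"], status


# Made-up executor data with one healthy and one failed task.
executors = [
    {"tasks": [
        {"name": "web", "state": "TASK_RUNNING"},
        {"name": "batch", "state": "TASK_FAILED"},
    ]}
]
print(list(service_checks_for(executors)))
```

States not present in the mapping fall back to UNKNOWN so that an unexpected state is visible instead of being silently treated as healthy.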

Refer to the Mesos Master integration for information about setting up the master nodes.

Installation

This check is packaged with the Agent, so installing the Agent also installs the check.

Configuration

Metrics

mesos.stats.system.cpus_total (gauge): Number of CPUs available.
mesos.stats.system.load_15min (gauge): Load average for the past 15 minutes.
mesos.stats.system.load_1min (gauge): Load average for the past minute.
mesos.stats.system.load_5min (gauge): Load average for the past 5 minutes.
mesos.stats.system.mem_free_bytes (gauge): Free memory. Shown as byte.
mesos.stats.system.mem_total_bytes (gauge): Total memory. Shown as byte.
mesos.state.task.cpu (gauge): Task CPU.
mesos.state.task.mem (gauge): Task memory. Shown as mebibyte.
mesos.state.task.disk (gauge): Task disk. Shown as mebibyte.
mesos.slave.tasks_failed (count): Number of failed tasks. Shown as task.
mesos.slave.tasks_finished (count): Number of finished tasks. Shown as task.
mesos.slave.tasks_killed (count): Number of killed tasks. Shown as task.
mesos.slave.tasks_lost (count): Number of lost tasks. Shown as task.
mesos.slave.tasks_running (gauge): Number of running tasks. Shown as task.
mesos.slave.tasks_staging (gauge): Number of staging tasks. Shown as task.
mesos.slave.tasks_starting (gauge): Number of starting tasks. Shown as task.
mesos.stats.registered (gauge): Whether this slave is registered with a master.
mesos.stats.uptime_secs (gauge): Slave uptime.
mesos.slave.cpus_percent (gauge): Percentage of allocated CPUs. Shown as percent.
mesos.slave.cpus_used (gauge): Number of allocated CPUs.
mesos.slave.cpus_total (gauge): Number of CPUs.
mesos.slave.disk_percent (gauge): Percentage of allocated disk space. Shown as percent.
mesos.slave.disk_used (gauge): Allocated disk space. Shown as mebibyte.
mesos.slave.disk_total (gauge): Disk space. Shown as mebibyte.
mesos.slave.mem_percent (gauge): Percentage of allocated memory. Shown as percent.
mesos.slave.mem_used (gauge): Allocated memory. Shown as mebibyte.
mesos.slave.mem_total (gauge): Total memory. Shown as mebibyte.
mesos.slave.executors_registering (gauge): Number of executors registering.
mesos.slave.executors_running (gauge): Number of executors running.
mesos.slave.executors_terminated (gauge): Number of terminated executors.
mesos.slave.executors_terminating (gauge): Number of terminating executors.
mesos.slave.frameworks_active (gauge): Number of active frameworks.
mesos.slave.invalid_framework_messages (gauge): Number of invalid framework messages. Shown as message.
mesos.slave.invalid_status_updates (gauge): Number of invalid status updates.
mesos.slave.recovery_errors (gauge): Number of errors encountered during slave recovery. Shown as error.
mesos.slave.valid_framework_messages (gauge): Number of valid framework messages. Shown as message.
mesos.slave.valid_status_updates (gauge): Number of valid status updates.
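
The memory metrics above use two different units: the mesos.stats.system.* memory metrics are shown as byte, while the mesos.slave.* and mesos.state.task.* memory and disk metrics are shown as mebibyte. When comparing them, it can help to convert to a common unit first. The helper names below are hypothetical and the sample values are made up; the only fact relied on is that one mebibyte is 1024 * 1024 bytes.

```python
MIB = 1024 * 1024  # number of bytes in one mebibyte


def mib_to_bytes(mebibytes):
    """Convert a value in mebibytes to bytes."""
    return mebibytes * MIB


def bytes_to_mib(num_bytes):
    """Convert a value in bytes to mebibytes."""
    return num_bytes / MIB


# Made-up sample values for illustration only.
slave_mem_total_mib = 2048          # e.g. a mesos.slave.mem_total reading
system_mem_total_bytes = 4294967296  # e.g. a mesos.stats.system.mem_total_bytes reading

print(mib_to_bytes(slave_mem_total_mib))     # 2147483648
print(bytes_to_mib(system_mem_total_bytes))  # 4096.0
```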