StackState-Ceph Integration

Overview

Enable the StackState-Ceph integration to:

  • Track disk usage across storage pools
  • Receive service checks in case of issues
  • Monitor I/O performance metrics

Installation

Enable the integration on each Ceph monitor host.

Configuration

Adjust the configuration file to match your environment. By default, the check runs /usr/bin/ceph to retrieve metrics; to use a different binary, set the ceph_cmd option. If the command requires sudo access, enable the use_sudo flag.

Any extra tags specific to the cluster can be specified under tags, as usual.
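As an illustration of how ceph_cmd and use_sudo interact, the following Python sketch builds the command line a check might run for one configured instance. It is not the integration's actual code: the build_ceph_command helper, the example instance values, and the "status --format json" arguments are assumptions chosen for the example. Only the option names and the /usr/bin/ceph default come from the text above.

```python
import shlex

DEFAULT_CEPH_CMD = "/usr/bin/ceph"  # default stated in the documentation


def build_ceph_command(instance, args):
    """Return the argument list for invoking the ceph CLI (hypothetical helper).

    instance: dict of options for one configured instance
    args: extra arguments passed to the ceph binary
    """
    # ceph_cmd overrides the default binary path when present
    ceph_cmd = instance.get("ceph_cmd", DEFAULT_CEPH_CMD)
    argv = shlex.split(ceph_cmd) + list(args)
    # use_sudo prefixes the command with sudo when enabled
    if instance.get("use_sudo", False):
        argv = ["sudo"] + argv
    return argv


# Example instance; the path and tag values are made up for illustration
instance = {
    "ceph_cmd": "/usr/local/bin/ceph",
    "use_sudo": True,
    "tags": ["cluster:storage-a"],
}

print(build_ceph_command(instance, ["status", "--format", "json"]))
```

With an empty instance the sketch falls back to /usr/bin/ceph without sudo, which mirrors the default behavior described above.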

For more details about configuring this integration refer to the following file(s) on GitHub:

Validation

Run the Agent's info command, /etc/init.d/stackstate-agent info, and verify that the ceph check ran successfully. The output should contain a section similar to the following:

    Checks
    ======

      [...]

      ceph
      ----
          - instance #0 [OK]
          - Collected 19 metrics, 0 events & 2 service checks
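If you want to automate this verification, the Python sketch below scans info output for the ceph section and extracts the status of each instance. It assumes the layout shown above (a check name, a dashed underline, then "- instance #N [STATUS]" lines). The check_status helper is hypothetical, and status values other than OK are an assumption about what the Agent may print.

```python
import re

SAMPLE_OUTPUT = """\
    Checks
    ======

      [...]

      ceph
      ----
          - instance #0 [OK]
          - Collected 19 metrics, 0 events & 2 service checks
"""

INSTANCE_RE = re.compile(r"- instance #(\d+) \[(\w+)\]")


def check_status(info_output, check_name):
    """Return a list of (instance_number, status) pairs for one check."""
    lines = info_output.splitlines()
    statuses = []
    in_section = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        next_line = lines[i + 1].strip() if i + 1 < len(lines) else ""
        # A check section starts with a name followed by a dashed underline
        if stripped and next_line and set(next_line) == {"-"}:
            in_section = stripped == check_name
            continue
        if in_section:
            match = INSTANCE_RE.match(stripped)
            if match:
                statuses.append((int(match.group(1)), match.group(2)))
    return statuses


statuses = check_status(SAMPLE_OUTPUT, "ceph")
print(statuses)
print(all(status == "OK" for _, status in statuses))
```

In practice the text could come from running the info command with subprocess.run and reading its standard output. The sample string here only reproduces the excerpt shown above.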

Metrics

  • ceph.commit_latency_ms (gauge): Time taken to commit an operation to the journal. Shown as millisecond.
  • ceph.apply_latency_ms (gauge): Time taken to flush an update to disks. Shown as millisecond.
  • ceph.op_per_sec (gauge): IO operations per second for given pool. Shown as operation/second.
  • ceph.read_bytes_sec (gauge): Bytes/second being read. Shown as byte.
  • ceph.write_bytes_sec (gauge): Bytes/second being written. Shown as byte.
  • ceph.num_osds (gauge): Number of known storage daemons. Shown as item.
  • ceph.num_in_osds (gauge): Number of participating storage daemons. Shown as item.
  • ceph.num_up_osds (gauge): Number of online storage daemons. Shown as item.
  • ceph.num_pgs (gauge): Number of placement groups available. Shown as item.
  • ceph.num_mons (gauge): Number of monitor daemons. Shown as item.
  • ceph.aggregate_pct_used (gauge): Overall capacity usage metric. Shown as percent.
  • ceph.total_objects (gauge): Object count from the underlying object store. Shown as item.
  • ceph.num_objects (gauge): Object count for a given pool. Shown as item.
  • ceph.read_bytes (rate): Per-pool read bytes. Shown as byte.
  • ceph.write_bytes (rate): Per-pool write bytes. Shown as byte.
  • ceph.num_pools (gauge): Number of pools. Shown as item.
  • ceph.pgstate.active_clean (gauge): Number of active+clean placement groups. Shown as item.
  • ceph.read_op_per_sec (gauge): Per-pool read operations/second. Shown as operation/second.
  • ceph.write_op_per_sec (gauge): Per-pool write operations/second. Shown as operation/second.
  • ceph.num_near_full_osds (gauge): Number of nearly full OSDs. Shown as item.
  • ceph.num_full_osds (gauge): Number of full OSDs. Shown as item.
  • ceph.osd.pct_used (gauge): Percentage used of full/near full OSDs. Shown as percent.