StackState-Hadoop YARN Integration

Overview

Capture YARN metrics to:

  • Visualize cluster health, performance, and utilization.
  • Analyze and inspect individual application performance.

Setup

Configuration

Install the StackState Agent on the ResourceManager.

  1. Configure the Agent to connect to the ResourceManager by editing conf.d/yarn.yaml:

    init_config:
    
    instances:
        -   resourcemanager_address: localhost
            resourcemanager_port: 8088
    

  2. Restart the Agent
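
Before restarting, it can help to confirm that the address and port in yarn.yaml actually reach the ResourceManager. The following is a minimal sketch, assuming the ResourceManager serves its standard REST endpoint /ws/v1/cluster/metrics over plain HTTP without authentication. The function names are hypothetical, and whether the Agent check uses exactly this endpoint is an assumption.

```python
import json
from urllib.request import urlopen


def cluster_metrics_url(address: str, port: int) -> str:
    """Build the cluster-metrics URL from the values used in yarn.yaml."""
    return f"http://{address}:{port}/ws/v1/cluster/metrics"


def fetch_cluster_metrics(address: str = "localhost",
                          port: int = 8088,
                          timeout: float = 5.0) -> dict:
    """Return the clusterMetrics object reported by the ResourceManager.

    Assumes plain HTTP and no authentication (hypothetical setup).
    Raises URLError if the ResourceManager cannot be reached.
    """
    with urlopen(cluster_metrics_url(address, port), timeout=timeout) as response:
        payload = json.load(response)
    return payload["clusterMetrics"]


# Example (requires a reachable ResourceManager):
#   metrics = fetch_cluster_metrics("localhost", 8088)
#   print("active nodes:", metrics.get("activeNodes"))
```

If the call raises a connection error, the same resourcemanager_address and resourcemanager_port values are unlikely to work for the Agent either.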

Validation

Run the Agent's info command and verify that the integration check passed. The output should contain a section similar to the following:

Checks
======

  [...]

  yarn
  ----
      - instance #0 [OK]
      - Collected 8 metrics & 0 events
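
For scripted validation, the output can be checked programmatically. The sketch below assumes the info output has already been captured as text and follows the layout shown above; the helper name yarn_check_ok is hypothetical, and it only inspects the text it is given.

```python
def yarn_check_ok(info_output: str) -> bool:
    """Return True if every instance listed under the yarn section is [OK]."""
    lines = [line.strip() for line in info_output.splitlines()]
    try:
        start = lines.index("yarn")
    except ValueError:
        return False  # no yarn section in the output

    statuses = []
    # Skip the check name and its dashed underline, then read the bullet lines.
    for line in lines[start + 2:]:
        if not line.startswith("- "):
            break  # end of the yarn section
        if line.startswith("- instance"):
            statuses.append(line.endswith("[OK]"))
    return bool(statuses) and all(statuses)
```

The function returns False when the yarn section is missing, when it lists no instances, or when any instance reports a status other than [OK].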

Data Collected

Metrics

yarn.metrics.apps_submitted (gauge)
    The number of submitted apps. Shown as task.

yarn.metrics.apps_completed (gauge)
    The number of completed apps. Shown as task.

yarn.metrics.apps_pending (gauge)
    The number of pending apps. Shown as task.

yarn.metrics.apps_running (gauge)
    The number of running apps. Shown as task.

yarn.metrics.apps_failed (gauge)
    The number of failed apps. Shown as task.

yarn.metrics.apps_killed (gauge)
    The number of killed apps. Shown as task.

yarn.metrics.reserved_mb (gauge)
    The amount of reserved memory. Shown as mebibyte.

yarn.metrics.available_mb (gauge)
    The amount of available memory. Shown as mebibyte.

yarn.metrics.allocated_mb (gauge)
    The amount of allocated memory. Shown as mebibyte.

yarn.metrics.total_mb (gauge)
    The total amount of memory. Shown as mebibyte.

yarn.metrics.reserved_virtual_cores (gauge)
    The number of reserved virtual cores. Shown as core.

yarn.metrics.available_virtual_cores (gauge)
    The number of available virtual cores. Shown as core.

yarn.metrics.allocated_virtual_cores (gauge)
    The number of allocated virtual cores. Shown as core.

yarn.metrics.total_virtual_cores (gauge)
    The total number of virtual cores. Shown as core.

yarn.metrics.containers_allocated (gauge)
    The number of containers allocated.

yarn.metrics.containers_reserved (gauge)
    The number of containers reserved.

yarn.metrics.containers_pending (gauge)
    The number of containers pending.

yarn.metrics.total_nodes (gauge)
    The total number of nodes. Shown as node.

yarn.metrics.active_nodes (gauge)
    The number of active nodes. Shown as node.

yarn.metrics.lost_nodes (gauge)
    The number of lost nodes. Shown as node.

yarn.metrics.unhealthy_nodes (gauge)
    The number of unhealthy nodes. Shown as node.

yarn.metrics.decommissioned_nodes (gauge)
    The number of decommissioned nodes. Shown as node.

yarn.metrics.rebooted_nodes (gauge)
    The number of rebooted nodes. Shown as node.

yarn.apps.progress (rate)
    The progress of the application as a percent. Shown as percent.

yarn.apps.started_time (rate)
    The time at which the application started (in ms since epoch). Shown as second.

yarn.apps.finished_time (rate)
    The time at which the application finished (in ms since epoch). Shown as second.

yarn.apps.elapsed_time (rate)
    The elapsed time since the application started (in ms). Shown as second.

yarn.apps.allocated_mb (rate)
    The sum of memory in MB allocated to the application's running containers. Shown as mebibyte.

yarn.apps.allocated_vcores (rate)
    The sum of virtual cores allocated to the application's running containers. Shown as core.

yarn.apps.running_containers (rate)
    The number of containers currently running for the application.

yarn.apps.memory_seconds (rate)
    The amount of memory the application has allocated (megabyte-seconds). Shown as second.

yarn.apps.vcore_seconds (rate)
    The amount of CPU resources the application has allocated (virtual core-seconds). Shown as second.

yarn.node.last_health_update (gauge)
    The last time the node reported its health (in ms since epoch). Shown as millisecond.

yarn.node.used_memory_mb (gauge)
    The total amount of memory currently used on the node (in MB). Shown as mebibyte.

yarn.node.avail_memory_mb (gauge)
    The total amount of memory currently available on the node (in MB). Shown as mebibyte.

yarn.node.used_virtual_cores (gauge)
    The total number of vCores currently used on the node. Shown as core.

yarn.node.available_virtual_cores (gauge)
    The total number of vCores available on the node. Shown as core.

yarn.node.num_containers (gauge)
    The total number of containers currently running on the node.
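
The yarn.metrics.* names above line up with the fields of the clusterMetrics object in the ResourceManager REST API. The following sketch illustrates that correspondence. The field names come from the YARN REST API, but the mapping table and the helper to_agent_metrics are hypothetical and written for illustration; they are not taken from the Agent's source.

```python
# Assumed correspondence between ResourceManager clusterMetrics fields
# and the metric names listed in this document.
CLUSTER_METRIC_NAMES = {
    "appsSubmitted": "yarn.metrics.apps_submitted",
    "appsCompleted": "yarn.metrics.apps_completed",
    "appsPending": "yarn.metrics.apps_pending",
    "appsRunning": "yarn.metrics.apps_running",
    "appsFailed": "yarn.metrics.apps_failed",
    "appsKilled": "yarn.metrics.apps_killed",
    "reservedMB": "yarn.metrics.reserved_mb",
    "availableMB": "yarn.metrics.available_mb",
    "allocatedMB": "yarn.metrics.allocated_mb",
    "totalMB": "yarn.metrics.total_mb",
    "reservedVirtualCores": "yarn.metrics.reserved_virtual_cores",
    "availableVirtualCores": "yarn.metrics.available_virtual_cores",
    "allocatedVirtualCores": "yarn.metrics.allocated_virtual_cores",
    "totalVirtualCores": "yarn.metrics.total_virtual_cores",
    "containersAllocated": "yarn.metrics.containers_allocated",
    "containersReserved": "yarn.metrics.containers_reserved",
    "containersPending": "yarn.metrics.containers_pending",
    "totalNodes": "yarn.metrics.total_nodes",
    "activeNodes": "yarn.metrics.active_nodes",
    "lostNodes": "yarn.metrics.lost_nodes",
    "unhealthyNodes": "yarn.metrics.unhealthy_nodes",
    "decommissionedNodes": "yarn.metrics.decommissioned_nodes",
    "rebootedNodes": "yarn.metrics.rebooted_nodes",
}


def to_agent_metrics(cluster_metrics: dict) -> dict:
    """Rename known clusterMetrics fields to metric names; drop unknown fields."""
    return {
        name: cluster_metrics[field]
        for field, name in CLUSTER_METRIC_NAMES.items()
        if field in cluster_metrics
    }
```

An explicit table is used instead of a generic camelCase-to-snake_case conversion because fields such as reservedMB would not convert cleanly to reserved_mb.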