Monitoring HDFS Datanode

This article explains how to configure the hdfs_datanode plugin for sd-agent and confirm that it is returning metrics.

Installing the hdfs_datanode plugin package

Install the hdfs_datanode plugin on Debian/Ubuntu:

sudo apt-get install sd-agent-hdfs_datanode

Install the hdfs_datanode plugin on RHEL/CentOS:

sudo yum install sd-agent-hdfs_datanode

Read more about agent plugins.

Configuring the agent to monitor HDFS

1. Configure /etc/sd-agent/conf.d/hdfs_datanode.yaml

init_config:

instances:
  # The HDFS DataNode check retrieves metrics from the HDFS DataNode's JMX
  # interface. This check must be installed on an HDFS DataNode. The HDFS
  # DataNode JMX URI is composed of the DataNode's hostname and port.
  #
  # The hostname and port can be found in the hdfs-site.xml conf file under
  # the property dfs.datanode.http.address
  #
  - hdfs_datanode_jmx_uri: http://localhost:50075

2. Restart the agent

sudo /etc/init.d/sd-agent restart

or

sudo systemctl restart sd-agent
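
The hdfs_datanode_jmx_uri value from step 1 can also be derived from hdfs-site.xml rather than typed by hand. The following is a minimal Python sketch, not part of sd-agent: the function name datanode_jmx_uri is hypothetical, and it assumes the property value has the usual host:port form. A wildcard bind address such as 0.0.0.0 is swapped for localhost, since the check runs on the DataNode itself.

```python
import xml.etree.ElementTree as ET

PROPERTY_NAME = "dfs.datanode.http.address"


def datanode_jmx_uri(hdfs_site_xml, default="http://localhost:50075"):
    """Build the JMX URI from the text of hdfs-site.xml.

    Hypothetical helper for illustration. Returns `default` when the
    property is missing, empty, or not in host:port form.
    """
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() != PROPERTY_NAME:
            continue
        address = prop.findtext("value", "").strip()
        if ":" not in address:
            return default
        host, _, port = address.rpartition(":")
        # A wildcard bind address is not a usable client target.
        if host in ("", "0.0.0.0"):
            host = "localhost"
        return "http://{}:{}".format(host, port)
    return default
```

Pass it the contents of your hdfs-site.xml. The file's location varies by distribution, so no path is assumed here.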

Verifying the configuration

Run the agent's info command to verify the configuration:

sudo /etc/init.d/sd-agent info

or

/usr/share/python/sd-agent/agent.py info

If the agent is configured correctly, the output will include a section such as:

hdfs_datanode
-----
  - instance #0 [OK]
  - Collected * metrics

You can also view the metrics returned with the following command:

sudo -u sd-agent /usr/share/python/sd-agent/agent.py check hdfs_datanode
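
If the check collects no metrics, it can help to query the DataNode's JMX interface directly and see what it returns. The following is a minimal Python sketch under stated assumptions: that the JMX servlet is served at /jmx under the configured URI, and that the capacity figures live in a bean matching Hadoop:service=DataNode,name=FSDatasetState*. Neither detail comes from this article, so adjust them for your Hadoop version. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: bean name pattern that holds the DataNode dataset state.
BEAN_QUERY = "Hadoop:service=DataNode,name=FSDatasetState*"


def jmx_url(base_uri, query=BEAN_QUERY):
    """Build the query URL for the JMX servlet (assumed to live at /jmx)."""
    return "{}/jmx?{}".format(base_uri.rstrip("/"), urlencode({"qry": query}))


def numeric_attributes(jmx_json):
    """Collect the numeric attributes of every bean in a JMX JSON response."""
    values = {}
    for bean in json.loads(jmx_json).get("beans", []):
        for key, value in bean.items():
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                values[key] = value
    return values


def fetch(base_uri, timeout=5):
    """Query the DataNode over HTTP and return its numeric attributes."""
    with urlopen(jmx_url(base_uri), timeout=timeout) as response:
        return numeric_attributes(response.read().decode("utf-8"))
```

Calling fetch("http://localhost:50075") on the DataNode host should return a dictionary of attribute names and values. A connection error points to a wrong hostname or port in hdfs_datanode.yaml rather than a problem with the agent.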

Configuring graphs

In your Server Density account, click the name of your server in the Devices list, then go to the Metrics tab. Click the + Graph button on the right and choose the hdfs_datanode metrics you want to graph. The metrics are also available when building dashboard graphs.


Monitored metrics

Metric: hdfs.datanode.cache_capacity
Description: Cache capacity in bytes
Unit: byte
Type: float

Metric: hdfs.datanode.cache_used
Description: Cache used in bytes
Unit: byte
Type: float

Metric: hdfs.datanode.dfs_capacity
Description: Disk capacity in bytes
Unit: byte
Type: float

Metric: hdfs.datanode.dfs_remaining
Description: Remaining disk space in bytes
Unit: byte
Type: float

Metric: hdfs.datanode.dfs_used
Description: Disk usage in bytes
Unit: byte
Type: float

Metric: hdfs.datanode.estimated_capacity_lost_total
Description: The estimated capacity lost in bytes
Unit: byte
Type: float

Metric: hdfs.datanode.last_volume_failure_date
Description: The date/time of the last volume failure in milliseconds since epoch
Unit: millisecond
Type: float

Metric: hdfs.datanode.num_blocks_cached
Description: The number of blocks cached
Unit: block
Type: float

Metric: hdfs.datanode.num_blocks_failed_to_cache
Description: The number of blocks that failed to cache
Unit: block
Type: float

Metric: hdfs.datanode.num_blocks_failed_to_uncache
Description: The number of blocks that failed to be removed from the cache
Unit: block
Type: float

Metric: hdfs.datanode.num_failed_volumes
Description: Number of failed volumes
Unit: none
Type: float
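
The capacity metrics are reported as raw byte counts, which are often easier to read as a percentage. The following is a minimal Python sketch of that arithmetic using the values of hdfs.datanode.dfs_used and hdfs.datanode.dfs_capacity. The function name is hypothetical, and the percentage is a derived figure, not a metric the plugin reports.

```python
def dfs_used_percent(dfs_used, dfs_capacity):
    """Return disk usage as a percentage of capacity.

    Returns None when capacity is zero or negative, so callers can tell
    "no data" apart from "0% used".
    """
    if dfs_capacity <= 0:
        return None
    return 100.0 * dfs_used / dfs_capacity
```

For example, a DataNode reporting 25 bytes used out of a capacity of 100 bytes is 25% full.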