This article will help you get the hdfs_datanode plugin for sd-agent configured and returning metrics.
Installing the hdfs_datanode plugin package
Install the hdfs_datanode plugin on Debian/Ubuntu:
sudo apt-get install sd-agent-hdfs_datanode
Install the hdfs_datanode plugin on RHEL/CentOS:
sudo yum install sd-agent-hdfs_datanode
Read more about agent plugins.
Configuring the agent to monitor HDFS
1. Configure /etc/sd-agent/conf.d/hdfs_datanode.yaml
init_config:

instances:
  # The HDFS DataNode check retrieves metrics from the HDFS DataNode's JMX
  # interface. This check must be installed on a HDFS DataNode. The HDFS
  # DataNode JMX URI is composed of the DataNode's hostname and port.
  #
  # The hostname and port can be found in the hdfs-site.xml conf file under
  # the property dfs.datanode.http.address
  #
  - hdfs_datanode_jmx_uri: http://localhost:50075
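If you are unsure which address to use for hdfs_datanode_jmx_uri, you can read it from hdfs-site.xml. The following is a minimal Python sketch, not part of sd-agent. The function name datanode_jmx_uri is hypothetical. It assumes the property value has the form host:port, and it falls back to 0.0.0.0:50075 (matching the port in the example configuration) when the property is not set.

```python
import xml.etree.ElementTree as ET


def datanode_jmx_uri(hdfs_site_xml: str, default: str = "0.0.0.0:50075") -> str:
    """Build an http:// URI from dfs.datanode.http.address (hypothetical helper)."""
    root = ET.fromstring(hdfs_site_xml)
    address = default  # assumed fallback when the property is absent
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "dfs.datanode.http.address":
            address = (prop.findtext("value") or default).strip()
            break
    # Assumes a host:port value; split on the last colon.
    host, _, port = address.rpartition(":")
    # A wildcard bind address cannot be connected to. The check runs on the
    # DataNode itself, so localhost is used instead.
    if host in ("", "0.0.0.0"):
        host = "localhost"
    return f"http://{host}:{port}"
```

Pass the contents of your hdfs-site.xml as a string and copy the returned value into hdfs_datanode.yaml.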
2. Restart the agent
sudo /etc/init.d/sd-agent restart
or
sudo systemctl restart sd-agent
Verifying the configuration
Run the agent's info command to verify the configuration:
sudo /etc/init.d/sd-agent info
or
/usr/share/python/sd-agent/agent.py info
If the agent is configured correctly, you'll see output such as:
hdfs_datanode
-----
  - instance #0 [OK]
  - Collected * metrics
You can also view the metrics returned with the following command:
sudo -u sd-agent /usr/share/python/sd-agent/agent.py check hdfs_datanode
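To cross-check what the agent reports, you can also query the DataNode's JMX interface directly. The sketch below is an illustration under stated assumptions, not the plugin's own code. It assumes the Hadoop JMX JSON servlet is served at /jmx under the configured URI, and that the relevant bean matches Hadoop:service=DataNode,name=FSDatasetState*. This article does not state which bean the plugin reads. All function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed bean query; adjust it if your DataNode exposes a different name.
BEAN_QUERY = "Hadoop:service=DataNode,name=FSDatasetState*"


def jmx_query_url(base_uri: str, bean_query: str = BEAN_QUERY) -> str:
    """Build the /jmx query URL from the hdfs_datanode_jmx_uri value."""
    return base_uri.rstrip("/") + "/jmx?" + urlencode({"qry": bean_query})


def extract_beans(payload: str) -> list:
    """Return the list under the top-level "beans" key of a JMX JSON reply."""
    return json.loads(payload).get("beans", [])


def fetch_beans(base_uri: str, timeout: float = 5.0) -> list:
    """Fetch and parse the beans; needs network access to the DataNode."""
    with urlopen(jmx_query_url(base_uri), timeout=timeout) as resp:
        return extract_beans(resp.read().decode("utf-8"))
```

On the DataNode host, calling fetch_beans("http://localhost:50075") should return at least one bean if the URI in hdfs_datanode.yaml is reachable. An empty list or a connection error suggests the hostname or port is wrong.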
Configuring graphs
Click the name of your server in the Devices list in your Server Density account, then go to the Metrics tab. Click the + Graph button on the right, then choose the hdfs_datanode metrics you want to graph. The metrics are also available to select when building dashboard graphs.
Monitored metrics
Metric | Description | Unit | Type |
---|---|---|---|
hdfs.datanode.cache_capacity | Cache capacity in bytes | byte / None | float |
hdfs.datanode.cache_used | Cache used in bytes | byte / None | float |
hdfs.datanode.dfs_capacity | Disk capacity in bytes | byte / None | float |
hdfs.datanode.dfs_remaining | The remaining disk space left in bytes | byte / None | float |
hdfs.datanode.dfs_used | Disk usage in bytes | byte / None | float |
hdfs.datanode.estimated_capacity_lost_total | The estimated capacity lost in bytes | byte / None | float |
hdfs.datanode.last_volume_failure_date | The date/time of the last volume failure in milliseconds since epoch | millisecond / None | float |
hdfs.datanode.num_blocks_cached | The number of blocks cached | block / None | float |
hdfs.datanode.num_blocks_failed_to_cache | The number of blocks that failed to cache | block / None | float |
hdfs.datanode.num_blocks_failed_to_uncache | The number of failed blocks to remove from cache | block / None | float |
hdfs.datanode.num_failed_volumes | Number of failed volumes | None / None | float |
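
The raw values are reported in bytes and milliseconds, so a little arithmetic makes them easier to read. The following Python sketch shows two such conversions. The function names are hypothetical. It also assumes that a last_volume_failure_date of 0 means no volume has failed, which this article does not confirm.

```python
from datetime import datetime, timezone


def dfs_used_percent(dfs_used: float, dfs_capacity: float) -> float:
    """Percentage of disk capacity in use, from dfs_used and dfs_capacity."""
    if dfs_capacity <= 0:
        return 0.0  # avoid dividing by zero when no capacity is reported
    return 100.0 * dfs_used / dfs_capacity


def volume_failure_time(ms_since_epoch: float):
    """Convert last_volume_failure_date (milliseconds) to a UTC datetime."""
    if ms_since_epoch <= 0:
        return None  # assumption: 0 means no failure has been recorded
    return datetime.fromtimestamp(ms_since_epoch / 1000.0, tz=timezone.utc)
```

For example, dfs_used_percent(25.0, 100.0) returns 25.0.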