This article explains how to configure the hdfs_namenode plugin for sd-agent so that it returns metrics.
Installing the hdfs_namenode plugin package
Install the hdfs_namenode plugin on Debian/Ubuntu:
sudo apt-get install sd-agent-hdfs_namenode
Install the hdfs_namenode plugin on RHEL/CentOS:
sudo yum install sd-agent-hdfs_namenode
Read more about agent plugins.
Configuring the agent to monitor HDFS
1. Configure /etc/sd-agent/conf.d/hdfs_namenode.yaml
init_config:

instances:
  # The HDFS NameNode check retrieves metrics from the HDFS NameNode's JMX
  # interface. This check must be installed on the NameNode. The HDFS
  # NameNode JMX URI is composed of the NameNode's hostname and port.
  #
  # The hostname and port can be found in the hdfs-site.xml conf file under
  # the property dfs.http.address or dfs.namenode.http-address
  #
  - hdfs_namenode_jmx_uri: http://localhost:50070
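The configuration comments say the hostname and port come from hdfs-site.xml. The lookup could be sketched in Python as below. This is a minimal sketch, not part of the plugin: the function name `namenode_jmx_uri` is hypothetical, it assumes the usual Hadoop `<property><name>/<value>` layout, and it assumes plain HTTP. The location of hdfs-site.xml varies by installation, so the function takes the file contents as a string.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Property names taken from the configuration comments above.
# Checking dfs.namenode.http-address first is an assumption of this sketch.
ADDRESS_PROPERTIES = ("dfs.namenode.http-address", "dfs.http.address")


def namenode_jmx_uri(hdfs_site_xml: str) -> Optional[str]:
    """Return an http:// URI built from the NameNode HTTP address, or None."""
    root = ET.fromstring(hdfs_site_xml)
    # Collect every <property> as a name -> value mapping.
    values = {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }
    for name in ADDRESS_PROPERTIES:
        address = values.get(name)
        if address:
            return "http://" + address
    return None
```

Reading the file and passing its text to `namenode_jmx_uri` yields a candidate value for `hdfs_namenode_jmx_uri`. If neither property is set the function returns None, in which case the value shown in the example configuration is the one to try.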
2. Restart the agent
sudo /etc/init.d/sd-agent restart
or
sudo systemctl restart sd-agent
Verifying the configuration
Run the agent's info command to verify the configuration:
sudo /etc/init.d/sd-agent info
or
/usr/share/python/sd-agent/agent.py info
If the agent has been configured correctly, the output will include a section such as:
hdfs_namenode
-----
  - instance #0 [OK]
  - Collected * metrics
You can also view the metrics returned with the following command:
sudo -u sd-agent /usr/share/python/sd-agent/agent.py check hdfs_namenode
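If the check reports no metrics, it can help to query the NameNode's JMX interface directly and confirm it responds at the configured URI. The following is a rough sketch under stated assumptions: that the NameNode serves the standard Hadoop `/jmx` servlet, and that the `FSNamesystemState` bean is a useful one to inspect. Neither the bean name nor the helper names (`jmx_query_url`, `first_bean`, `fetch_bean`) come from this article.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: bean name as exposed by the standard Hadoop /jmx servlet.
BEAN = "Hadoop:service=NameNode,name=FSNamesystemState"


def jmx_query_url(base_uri: str, bean: str = BEAN) -> str:
    """Build the /jmx query URL for one bean under the configured base URI."""
    return base_uri.rstrip("/") + "/jmx?" + urlencode({"qry": bean})


def first_bean(payload: dict) -> dict:
    """Return the first entry of the response's "beans" list, or {} if empty."""
    beans = payload.get("beans", [])
    return beans[0] if beans else {}


def fetch_bean(base_uri: str, bean: str = BEAN, timeout: float = 5.0) -> dict:
    """Fetch one bean's attributes from the NameNode (performs an HTTP GET)."""
    with urlopen(jmx_query_url(base_uri, bean), timeout=timeout) as response:
        return first_bean(json.load(response))
```

Calling `fetch_bean("http://localhost:50070")` on the NameNode host should return a dictionary of attributes if the URI is correct. A connection error or an empty dictionary suggests the hostname, port, or bean name needs checking before looking further at the agent.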
Configuring graphs
Click the name of your server in the Devices list in your Server Density account, then go to the Metrics tab. Click the + Graph button on the right, then choose the hdfs_namenode metrics you want to graph. The metrics will also be available to select when building dashboard graphs.
Monitored metrics
Metric | Description | Unit | Type |
---|---|---|---|
hdfs.namenode.blocks_total | Total number of blocks | block / None | float |
hdfs.namenode.capacity_remaining | Remaining disk space left in bytes | byte / None | float |
hdfs.namenode.capacity_total | Total disk capacity in bytes | byte / None | float |
hdfs.namenode.capacity_used | Disk usage in bytes | byte / None | float |
hdfs.namenode.corrupt_blocks | Number of corrupt blocks | block / None | float |
hdfs.namenode.estimated_capacity_lost_total | Estimated capacity lost in bytes | byte / None | float |
hdfs.namenode.files_total | Total number of files | file / None | float |
hdfs.namenode.fs_lock_queue_length | Lock queue length | None / None | float |
hdfs.namenode.max_objects | Maximum number of files HDFS supports | object / None | float |
hdfs.namenode.missing_blocks | Number of missing blocks | block / None | float |
hdfs.namenode.num_dead_data_nodes | Total number of dead data nodes | node / None | float |
hdfs.namenode.num_decom_dead_data_nodes | Number of decommissioning dead data nodes | node / None | float |
hdfs.namenode.num_decom_live_data_nodes | Number of decommissioning live data nodes | node / None | float |
hdfs.namenode.num_decommissioning_data_nodes | Number of decommissioning data nodes | node / None | float |
hdfs.namenode.num_live_data_nodes | Total number of live data nodes | node / None | float |
hdfs.namenode.num_stale_data_nodes | Number of stale data nodes | node / None | float |
hdfs.namenode.num_stale_storages | Number of stale storages | None / None | float |
hdfs.namenode.pending_deletion_blocks | Number of pending deletion blocks | block / None | float |
hdfs.namenode.pending_replication_blocks | Number of blocks pending replication | block / None | float |
hdfs.namenode.scheduled_replication_blocks | Number of blocks scheduled for replication | block / None | float |
hdfs.namenode.total_load | Total load on the file system | None / None | float |
hdfs.namenode.under_replicated_blocks | Number of under replicated blocks | block / None | float |
hdfs.namenode.volume_failures_total | Total volume failures | None / None | float |
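Several of these metrics are easier to read when combined. As an illustration only, the sketch below derives a percentage of capacity used from `hdfs.namenode.capacity_used` and `hdfs.namenode.capacity_total`, and lists which block health metrics are non-zero. The function names and the idea of holding the values in a plain dictionary are assumptions of this example, not something the agent provides.

```python
from typing import Dict, List, Optional

# Block counters from the table above that are normally expected to be zero
# (treating any non-zero value as worth a look is an assumption of this sketch).
BLOCK_HEALTH_METRICS = (
    "hdfs.namenode.corrupt_blocks",
    "hdfs.namenode.missing_blocks",
    "hdfs.namenode.under_replicated_blocks",
)


def capacity_used_percent(metrics: Dict[str, float]) -> Optional[float]:
    """Return capacity_used as a percentage of capacity_total, or None."""
    total = metrics.get("hdfs.namenode.capacity_total", 0.0)
    if total <= 0:
        return None  # avoid dividing by zero when the total is missing
    used = metrics.get("hdfs.namenode.capacity_used", 0.0)
    return 100.0 * used / total


def nonzero_block_metrics(metrics: Dict[str, float]) -> List[str]:
    """Return the names of block health metrics whose value is above zero."""
    return [name for name in BLOCK_HEALTH_METRICS if metrics.get(name, 0.0) > 0]
```

The same arithmetic can be used as a guide when choosing which metrics to place together on a graph or dashboard.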