Monitoring HDFS Namenode

This article will help you get the hdfs_namenode plugin for sd-agent configured and returning metrics.

Installing the hdfs_namenode plugin package

Install the hdfs_namenode plugin on Debian/Ubuntu:

sudo apt-get install sd-agent-hdfs_namenode

Install the hdfs_namenode plugin on RHEL/CentOS:

sudo yum install sd-agent-hdfs_namenode

Read more about agent plugins.

Configuring the agent to monitor HDFS

1. Configure /etc/sd-agent/conf.d/hdfs_namenode.yaml

init_config: 

instances:
# The HDFS NameNode check retrieves metrics from the HDFS NameNode's JMX
# interface. This check must be installed on the NameNode. The HDFS
# NameNode JMX URI is composed of the NameNode's hostname and port.
#
# The hostname and port can be found in the hdfs-site.xml conf file under
# the property dfs.http.address or dfs.namenode.http-address
#
- hdfs_namenode_jmx_uri: http://localhost:50070

2. Restart the agent

sudo /etc/init.d/sd-agent restart

or

sudo systemctl restart sd-agent
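
If you are unsure which address to use for hdfs_namenode_jmx_uri in step 1, the value can be read from hdfs-site.xml. The Python sketch below is one possible way to do that. It assumes the usual Hadoop configuration layout (property elements with name and value children) and that the NameNode web interface is served over plain http. The function name namenode_jmx_uri and the sample XML are illustrative only and are not part of sd-agent.

```python
import xml.etree.ElementTree as ET

# Property names mentioned in the hdfs_namenode.yaml comments, newer name first.
HTTP_ADDRESS_KEYS = ("dfs.namenode.http-address", "dfs.http.address")


def namenode_jmx_uri(hdfs_site_xml):
    """Return an http:// URI built from the NameNode HTTP address, or None.

    hdfs_site_xml is the text content of hdfs-site.xml.
    """
    root = ET.fromstring(hdfs_site_xml)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name and value:
            props[name.strip()] = value.strip()
    for key in HTTP_ADDRESS_KEYS:
        if key in props:
            # Assumption: plain http. Adjust if your cluster uses https.
            return "http://" + props[key]
    return None


# Illustrative sample only; a real file lives in your Hadoop conf directory.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>localhost:50070</value>
  </property>
</configuration>"""

print(namenode_jmx_uri(SAMPLE))
```

If the property holds a wildcard bind address such as 0.0.0.0, substitute a hostname the agent can actually reach before copying the value into hdfs_namenode.yaml.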

Verifying the configuration

Run the agent's info command to verify the configuration:

sudo /etc/init.d/sd-agent info 

or

/usr/share/python/sd-agent/agent.py info

If the agent is configured correctly, you'll see output such as:

hdfs_namenode
-----
  - instance #0 [OK]
  - Collected * metrics

You can also view the metrics returned with the following command:

sudo -u sd-agent /usr/share/python/sd-agent/agent.py check hdfs_namenode
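
To cross-check the values the agent reports, the NameNode's JMX servlet can also be queried directly. The Python sketch below is a hedged example and not the plugin's own implementation. It assumes the servlet is served at /jmx under the URI configured above, that the Hadoop:service=NameNode,name=FSNamesystem bean is exposed, and that the response is a JSON document with a top-level "beans" list. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed bean name; confirm it against the output of <uri>/jmx on your NameNode.
FS_NAMESYSTEM_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"


def jmx_query_url(base_uri, bean=FS_NAMESYSTEM_BEAN):
    """Build the JMX servlet URL that selects a single bean."""
    return base_uri.rstrip("/") + "/jmx?" + urlencode({"qry": bean})


def first_bean(payload):
    """Return the first bean from a JMX JSON document, or an empty dict."""
    beans = json.loads(payload).get("beans", [])
    return beans[0] if beans else {}


def fetch_bean(base_uri, bean=FS_NAMESYSTEM_BEAN, timeout=5):
    """Fetch one bean from a live NameNode (needs network access to it)."""
    with urlopen(jmx_query_url(base_uri, bean), timeout=timeout) as resp:
        return first_bean(resp.read().decode("utf-8"))
```

On the NameNode host, calling fetch_bean("http://localhost:50070") should return a dictionary of attributes that can be compared with the output of the check command. Attribute names vary between Hadoop versions, so inspect the returned keys rather than relying on a fixed list.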

Configuring graphs

Click the name of your server in the Devices list in your Server Density account, then go to the Metrics tab. Click the + Graph button on the right, then choose the hdfs_namenode metrics to display as graphs. The metrics will also be available to select when building dashboard graphs.


Monitored metrics

Metric                                        Description                                 Unit           Type
hdfs.namenode.blocks_total                    Total number of blocks                      block / None   float
hdfs.namenode.capacity_remaining              Remaining disk space left in bytes          byte / None    float
hdfs.namenode.capacity_total                  Total disk capacity in bytes                byte / None    float
hdfs.namenode.capacity_used                   Disk usage in bytes                         byte / None    float
hdfs.namenode.corrupt_blocks                  Number of corrupt blocks                    block / None   float
hdfs.namenode.estimated_capacity_lost_total   Estimated capacity lost in bytes            byte / None    float
hdfs.namenode.files_total                     Total number of files                       file / None    float
hdfs.namenode.fs_lock_queue_length            Lock queue length                           None / None    float
hdfs.namenode.max_objects                     Maximum number of files HDFS supports       object / None  float
hdfs.namenode.missing_blocks                  Number of missing blocks                    block / None   float
hdfs.namenode.num_dead_data_nodes             Total number of dead data nodes             node / None    float
hdfs.namenode.num_decom_dead_data_nodes       Number of decommissioning dead data nodes   node / None    float
hdfs.namenode.num_decom_live_data_nodes       Number of decommissioning live data nodes   node / None    float
hdfs.namenode.num_decommissioning_data_nodes  Number of decommissioning data nodes        node / None    float
hdfs.namenode.num_live_data_nodes             Total number of live data nodes             node / None    float
hdfs.namenode.num_stale_data_nodes            Number of stale data nodes                  node / None    float
hdfs.namenode.num_stale_storages              Number of stale storages                    None / None    float
hdfs.namenode.pending_deletion_blocks         Number of pending deletion blocks           block / None   float
hdfs.namenode.pending_replication_blocks      Number of blocks pending replication        block / None   float
hdfs.namenode.scheduled_replication_blocks    Number of blocks scheduled for replication  block / None   float
hdfs.namenode.total_load                      Total load on the file system               None / None    float
hdfs.namenode.under_replicated_blocks         Number of under replicated blocks           block / None   float
hdfs.namenode.volume_failures_total           Total volume failures                       None / None    float
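
The capacity metrics above are reported as raw byte counts, so a derived percentage is often easier to read on a dashboard or in an alert. The Python sketch below shows that arithmetic. The function name and the sample figures are illustrative; sd-agent does not produce this value itself.

```python
def capacity_used_percent(capacity_used, capacity_total):
    """Return used capacity as a percentage of total, or None if total is not positive."""
    if capacity_total <= 0:
        return None
    return 100.0 * capacity_used / capacity_total


# Sample values standing in for hdfs.namenode.capacity_used and
# hdfs.namenode.capacity_total (both in bytes): 250 GiB used of 1 TiB.
print(capacity_used_percent(250 * 1024**3, 1024**4))
```

Returning None for a non-positive total avoids a division by zero when the NameNode has not yet reported any capacity.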