This article will help you get the Ceph plugin for sd-agent configured and returning metrics.
Installing the Ceph plugin package
Install the Ceph plugin on Debian/Ubuntu:
sudo apt-get install sd-agent-ceph
Install the Ceph plugin on RHEL/CentOS:
sudo yum install sd-agent-ceph
Read more about agent plugins.
Configuring the agent to monitor Ceph
1. Configure /etc/sd-agent/conf.d/ceph.yaml
init_config:

instances:
  - ceph_cmd: /path/to/your/ceph # default is /usr/bin/ceph
    use_sudo: true # only if the ceph binary needs sudo on your nodes
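Before moving on, you can confirm that the agent user can run the Ceph binary at all. A quick check, assuming the default /usr/bin/ceph path and the sd-agent user created by the package:

sudo -u sd-agent /usr/bin/ceph status

If this fails with a permissions error (for example, because the keyring in /etc/ceph is not readable by sd-agent), set use_sudo: true and complete step 2 below.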
2. Configure sudo for sd-agent, if required
If you set the use_sudo option to true, add a line like the following to your sudoers file:
sd-agent ALL=(ALL) NOPASSWD:/path/to/your/ceph
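Rather than editing /etc/sudoers directly, you may prefer a drop-in file. A minimal sketch, assuming the default /usr/bin/ceph path (adjust it to match your ceph_cmd setting):

echo 'sd-agent ALL=(ALL) NOPASSWD:/usr/bin/ceph' | sudo tee /etc/sudoers.d/sd-agent
sudo chmod 0440 /etc/sudoers.d/sd-agent

Using visudo -f /etc/sudoers.d/sd-agent to edit the file instead will syntax-check the rule before saving it.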
3. Restart the agent
sudo /etc/init.d/sd-agent restart
or
sudo systemctl restart sd-agent
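On systemd hosts you can confirm the agent came back up cleanly with:

sudo systemctl status sd-agent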
Verifying the configuration
Verify the configuration by executing the info command:
sudo /etc/init.d/sd-agent info
or
/usr/share/python/sd-agent/agent.py info
If the agent has been configured correctly, you will see output such as:
ceph
-----
  - instance #0 [OK]
  - Collected * metrics
You can also view the metrics returned with the following command:
sudo -u sd-agent /usr/share/python/sd-agent/agent.py check ceph
Configuring graphs
Click the name of your server from the Devices list in your Server Density account, then go to the Metrics tab. Click the + Graph button on the right, then choose the Ceph metrics to display on the graphs. The metrics will also be available to select when building dashboard graphs.
Monitored metrics
Metric | Description | Unit | Type
---|---|---|---
ceph.aggregate_pct_used | Overall capacity usage | percent | float
ceph.apply_latency_ms | Time taken to flush an update to disks | millisecond | float
ceph.commit_latency_ms | Time taken to commit an operation to the journal | millisecond | float
ceph.num_full_osds | Number of full OSDs | item | float
ceph.num_in_osds | Number of participating storage daemons | item | float
ceph.num_mons | Number of monitor daemons | item | float
ceph.num_near_full_osds | Number of nearly full OSDs | item | float
ceph.num_objects | Object count for a given pool | item | float
ceph.num_osds | Number of known storage daemons | item | float
ceph.num_pgs | Number of placement groups available | item | float
ceph.num_pools | Number of pools | item | float
ceph.num_up_osds | Number of online storage daemons | item | float
ceph.op_per_sec | IO operations per second for a given pool | operation / second | float
ceph.osd.pct_used | Percentage used of full/near-full OSDs | percent | float
ceph.pgstate.active_clean | Number of active+clean placement groups | item | float
ceph.read_bytes | Per-pool read bytes | byte | float
ceph.read_bytes_sec | Bytes per second being read | byte | float
ceph.read_op_per_sec | Per-pool read operations per second | operation / second | float
ceph.total_objects | Object count from the underlying object store | item | float
ceph.write_bytes | Per-pool write bytes | byte | float
ceph.write_bytes_sec | Bytes per second being written | byte | float
ceph.write_op_per_sec | Per-pool write operations per second | operation / second | float
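If a value in a graph looks surprising, you can sanity-check it against the standard Ceph CLI: ceph osd stat reports the known/up/in OSD counts behind ceph.num_osds, ceph.num_up_osds and ceph.num_in_osds, and ceph df reports the per-pool usage and object counts. (The exact commands the plugin invokes internally may differ.)

ceph osd stat
ceph df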