This article will help you get the Cassandra plugin for sd-agent configured and returning metrics.
Installing the Cassandra plugin package
Install the Cassandra plugin on Debian/Ubuntu:
sudo apt-get install sd-agent-cassandra
Install the Cassandra plugin on RHEL/CentOS:
sudo yum install sd-agent-cassandra
Read more about agent plugins.
Configuring the agent to monitor Cassandra
1. Configure the instances in /etc/sd-agent/conf.d/cassandra.yaml:
instances:
  - host: localhost
    port: 7199
    cassandra_aliasing: true
    # user: username
    # password: password
    # process_name_regex: .*process_name.* # Instead of specifying a host, and port. The agent can connect using the attach api.
    #                                      # This requires the JDK to be installed and the path to tools.jar to be set below.
    # tools_jar_path: /usr/lib/jvm/java-7-openjdk-amd64/lib/tools.jar # To be set when process_name_regex is set
    # name: cassandra_instance
    # # java_bin_path: /path/to/java # Optional, should be set if the agent cannot find your java executable
    # # java_options: "-Xmx200m -Xms50m" # Optional, Java JVM options
    # # trust_store_path: /path/to/trustStore.jks # Optional, should be set if ssl is enabled
    # # trust_store_password: password
    # tags:
    #   env: stage
    #   newTag: test
If a username and password are required, uncomment the user and password options and set them, making sure the YAML stays valid.
Instead of a host and port, the agent can also connect through the Attach API: set process_name_regex and tools_jar_path (the path to tools.jar, which requires the JDK to be installed). If SSL is enabled, you'll also need to set the trust store path.
2. Restart the agent:
sudo /etc/init.d/sd-agent restart
or
sudo systemctl restart sd-agent
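The Attach API variant described in step 1 might look like the following. This is a minimal sketch that only rearranges options already present in the sample configuration. The scratch path /tmp/cassandra.yaml.draft and the regex value are placeholders, and leaving out host and port when process_name_regex is set is an assumption to check against your agent version. The script writes a draft to a scratch file so it can be reviewed before it is copied to /etc/sd-agent/conf.d/cassandra.yaml and the agent is restarted.

```shell
#!/bin/sh
# Sketch: write a draft Attach API configuration to a scratch file for review.
# DRAFT is a hypothetical path; nothing under /etc/sd-agent is touched.
DRAFT="${DRAFT:-/tmp/cassandra.yaml.draft}"

cat > "$DRAFT" <<'EOF'
instances:
  - process_name_regex: .*process_name.*   # placeholder; match your Cassandra process
    tools_jar_path: /usr/lib/jvm/java-7-openjdk-amd64/lib/tools.jar   # requires the JDK
    name: cassandra_instance
    cassandra_aliasing: true
    # trust_store_path: /path/to/trustStore.jks   # only if SSL is enabled
    # trust_store_password: password
EOF

echo "Draft written to $DRAFT"
```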
Verifying the configuration
Run the agent's info command to verify the configuration:
sudo /etc/init.d/sd-agent info
or
/usr/share/python/sd-agent/agent.py info
If the agent is configured correctly, you'll see output such as:
cassandra
-----
  - instance #0 [OK]
  - Collected * metrics
You can also view the metrics returned with the following command:
service sd-agent jmx collect
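To use the info check in a script, the output can be scanned for the cassandra section. The sketch below assumes the output keeps the shape of the sample in this section (a line containing only cassandra, followed by an instance line marked [OK]). The function name cassandra_instance_ok is hypothetical, and the exact layout may differ between agent versions.

```shell
#!/bin/sh
# Sketch: succeed only if agent "info" output (read from stdin) reports an
# [OK] instance inside the cassandra section.
cassandra_instance_ok() {
  awk '
    # the section header: a line containing only the word cassandra
    /^[[:space:]]*cassandra[[:space:]]*$/ { in_section = 1; next }
    # an instance line marked OK while inside the cassandra section
    in_section && /instance #[0-9]+ \[OK\]/ { found = 1 }
    # any other bare lowercase name is treated as the start of a new section
    in_section && /^[[:space:]]*[a-z_][a-z0-9_]*[[:space:]]*$/ { in_section = 0 }
    END { exit(found ? 0 : 1) }
  '
}

# Usage (assumes the init script from this article):
#   sudo /etc/init.d/sd-agent info | cassandra_instance_ok && echo "cassandra check OK"
```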
Configuring graphs
In your Server Density account, click the name of your server in the Devices list, then go to the Metrics tab. Click the + Graph button on the right and choose the cassandra metrics you want to graph. The metrics are also available when building dashboard graphs.
Monitored metrics
Metric | Description | Unit / per-unit | Type |
---|---|---|---|
cassandra.active_tasks | The number of tasks that the thread pool is actively executing. | task / None | float |
cassandra.bloom_filter_disk_space_used | Disk space used by the Bloom filters. | byte / None | float |
cassandra.bloom_filter_false_positives | The number of Bloom filter false positives. | event / None | float |
cassandra.bloom_filter_false_ratio | The ratio of Bloom filter false positives to total checks. | fraction / None | float |
cassandra.capacity | The capacity of the caches, such as the key cache and row cache. | byte / None | float |
cassandra.completed_tasks | The number of tasks that the thread pool has completed. | task / None | float |
cassandra.compression_ratio | The compression ratio for all SSTables in a column family. | fraction / None | float |
cassandra.currently_blocked_tasks.count | The number of currently blocked tasks for the thread pool. | task / None | float |
cassandra.db.bloom_filter_disk_space_used | Disk space used by the Bloom filters. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.bloom_filter_disk_space_used instead) | byte / None | float |
cassandra.db.bloom_filter_false_positives | The number of Bloom filter false positives. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.bloom_filter_false_positives instead) | event / None | float |
cassandra.db.bloom_filter_false_ratio | The ratio of Bloom filter false positives to total checks. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.bloom_filter_false_ratio instead) | fraction / None | float |
cassandra.db.completed_tasks | Completed compaction or commitlog tasks. (Metric may not be available for Cassandra versions > 2.2.) | task / None | float |
cassandra.db.compression_ratio | The compression ratio for all SSTables in a column family. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.compression_ratio instead) | fraction / None | float |
cassandra.db.exception_count | The number of exceptions thrown. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.exceptions.count instead) | error / None | float |
cassandra.db.key_cache_recent_hit_rate | Ratio of key cache hits to key cache requests since the last time this attribute was read. (Metric may not be available for Cassandra versions > 2.2.) | fraction / None | float |
cassandra.db.live_disk_space_used | Disk space used by "live" SSTables (only counts non-obsolete files). (Metric may not be available for Cassandra versions > 2.2. Use cassandra.live_disk_space_used.count instead) | byte / None | float |
cassandra.db.live_ss_table_count | Number of "live" (non-obsolete) SSTables. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.live_ss_table_count instead) | file / None | float |
cassandra.db.load | Disk space used on a node. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.load.count instead) | byte / None | float |
cassandra.db.max_row_size | Size of the largest compacted row. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.max_row_size instead) | byte / None | float |
cassandra.db.mean_row_size | Average size of compacted rows. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.mean_row_size instead) | byte / None | float |
cassandra.db.memtable_columns_count | Number of columns in memtable. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.memtable_columns_count instead) | column / None | float |
cassandra.db.memtable_data_size | Size of data stored in memtable. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.memtable_live_data_size instead) | byte / None | float |
cassandra.db.memtable_switch_count | Number of times a full memtable has been switched out for an empty one due to flushing. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.memtable_switch_count.count instead) | event / None | float |
cassandra.db.min_row_size | Size of the smallest compacted row. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.min_row_size instead) | byte / None | float |
cassandra.db.pending_tasks | Pending compaction, commitlog, or column family tasks. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.pending_tasks instead) | task / None | float |
cassandra.db.range_operations | Count of range scan operations. (Metric may not be available for Cassandra versions > 2.2.) | operation / None | float |
cassandra.db.read_count | The number of local read requests for a column family. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.latency.count instead) | read / None | float |
cassandra.db.read_operations | Count of read operations. (Metric may not be available for Cassandra versions > 2.2.) | operation / None | float |
cassandra.db.recent_range_latency_micros | The latency of range scans since the last time this attribute was read. (Metric may not be available for Cassandra versions > 2.2.) | microsecond / None | float |
cassandra.db.recent_read_latency_micros | The latency of reads since the last time this attribute was read. (Metric may not be available for Cassandra versions > 2.2.) | microsecond / None | float |
cassandra.db.recent_write_latency_micros | The latency of writes since the last time this attribute was read. (Metric may not be available for Cassandra versions > 2.2.) | microsecond / None | float |
cassandra.db.total_disk_space_used | Disk space used by a column family. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.total_disk_space_used.count instead) | byte / None | float |
cassandra.db.total_range_latency_micros | Total latency for all range scans. (Metric may not be available for Cassandra versions > 2.2.) | microsecond / None | float |
cassandra.db.total_read_latency_micros | Total latency for all read requests. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.total_latency.count instead) | microsecond / None | float |
cassandra.db.total_write_latency_micros | Total latency for all write requests. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.total_latency.count instead) | microsecond / None | float |
cassandra.db.update_interval | The configurable update interval for the dynamic snitch, which monitors read latency to route requests away from slow nodes. | millisecond / None | float |
cassandra.db.write_count | The number of local write requests for a column family. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.latency.count instead) | write / None | float |
cassandra.db.write_operations | Count of write operations. (Metric may not be available for Cassandra versions > 2.2.) | operation / None | float |
cassandra.exceptions.count | The number of exceptions thrown. | error / None | float |
cassandra.hits.count | The number of hits to a cache. | hit / None | float |
cassandra.internal.active_count | The number of tasks that the thread pool is actively executing. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.active_tasks instead) | task / None | float |
cassandra.internal.completed_tasks | The number of tasks that the thread pool has completed. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.completed_tasks instead) | task / None | float |
cassandra.internal.currently_blocked_tasks | The number of currently blocked tasks for the thread pool. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.currently_blocked_tasks.count instead) | task / None | float |
cassandra.internal.total_blocked_tasks | The cumulative total of currently blocked tasks for the thread pool. (Metric may not be available for Cassandra versions > 2.2.) | task / None | float |
cassandra.latency.count | The number of client requests. | request / None | float |
cassandra.latency.one_minute_rate | Recent rate of client requests, as an exponentially weighted moving average over a one-minute interval. | request / second | float |
cassandra.live_disk_space_used.count | Disk space used by "live" SSTables (only counts non-obsolete files). | byte / None | float |
cassandra.live_ss_table_count | Number of "live" (non-obsolete) SSTables. | file / None | float |
cassandra.load.count | Disk space used on a node. | byte / None | float |
cassandra.max_row_size | Size of the largest compacted row. | byte / None | float |
cassandra.mean_row_size | Average size of compacted rows. | byte / None | float |
cassandra.memtable_columns_count | Number of columns in memtable. | column / None | float |
cassandra.memtable_live_data_size | Size of data stored in memtable. | byte / None | float |
cassandra.memtable_switch_count.count | Number of times a full memtable has been switched out for an empty one due to flushing. | event / None | float |
cassandra.min_row_size | Size of the smallest compacted row. | byte / None | float |
cassandra.net.total_timeouts | Count of requests not acknowledged within configurable timeout window. (Metric may not be available for Cassandra versions > 2.2. Use cassandra.timeouts.count instead) | timeout / None | float |
cassandra.pending_tasks | The number of pending tasks for the thread pool. | task / None | float |
cassandra.requests.count | The number of requests to a cache. | request / None | float |
cassandra.size | Size of cache. | byte / None | float |
cassandra.timeouts.count | Count of requests not acknowledged within configurable timeout window. | timeout / None | float |
cassandra.timeouts.one_minute_rate | Recent timeout rate, as an exponentially weighted moving average over a one-minute interval. | timeout / second | float |
cassandra.total_disk_space_used.count | Disk space used by a column family. | byte / None | float |
cassandra.total_latency.count | Total latency for all client requests. | microsecond / None | float |
cassandra.unavailables.count | Count of requests for which the required number of nodes was unavailable. | error / None | float |
cassandra.unavailables.one_minute_rate | Recent rate of unavailable exceptions, as an exponentially weighted moving average over a one-minute interval. | error / second | float |
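
When moving graphs or alerts off the older cassandra.db.*, cassandra.internal.* and cassandra.net.* names, the replacements named in the table can be applied mechanically. The sketch below is a hypothetical helper (replacement_metric is not part of sd-agent) covering a handful of the mappings from the table. Names with no listed replacement are returned unchanged.

```shell
#!/bin/sh
# Sketch: map a deprecated metric name to the replacement given in the table.
# Only a subset of the table is covered; extend the case list as needed.
replacement_metric() {
  case "$1" in
    cassandra.db.exception_count)        echo "cassandra.exceptions.count" ;;
    cassandra.db.live_disk_space_used)   echo "cassandra.live_disk_space_used.count" ;;
    cassandra.db.load)                   echo "cassandra.load.count" ;;
    cassandra.db.memtable_data_size)     echo "cassandra.memtable_live_data_size" ;;
    cassandra.db.total_disk_space_used)  echo "cassandra.total_disk_space_used.count" ;;
    cassandra.internal.active_count)     echo "cassandra.active_tasks" ;;
    cassandra.net.total_timeouts)        echo "cassandra.timeouts.count" ;;
    *)                                   echo "$1" ;;   # no replacement listed
  esac
}

replacement_metric cassandra.db.load   # prints cassandra.load.count
```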