Configure the Server Density agent to monitor your Mesos cluster to:
• Collect important metrics, including performance and usage
• Identify overall server slowdowns caused by the cluster
Note: This plugin is still in development and is not yet available to install. New plugins will be announced in our release notes.
Monitored metrics
The full list of monitored metrics appears under Mesos metrics at the end of this article.
Configuring the agent to monitor Apache Mesos
1. Configure /etc/sd-agent/conf.d/mesos.yaml
init_config:

instances:
  - url: "http://localhost:5050"
2. Restart the agent
sudo /etc/init.d/sd-agent restart
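If the agent later reports no metrics, it can help to confirm that the URL in mesos.yaml actually answers. The following is a minimal Python sketch, assuming the URL points at a Mesos master, which serves a JSON metrics snapshot at /metrics/snapshot. Whether the agent's Mesos check reads this same endpoint is an assumption, and the helper names (snapshot_url, fetch_snapshot, is_leading_master) are hypothetical and not part of sd-agent.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def snapshot_url(master_url):
    """Build the metrics snapshot URL for a Mesos master base URL."""
    # Normalise to exactly one trailing slash so urljoin appends the path.
    return urljoin(master_url.rstrip("/") + "/", "metrics/snapshot")


def fetch_snapshot(master_url, timeout=5):
    """Fetch and decode the JSON metrics snapshot from the master."""
    with urlopen(snapshot_url(master_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_leading_master(snapshot):
    """Return True if the snapshot reports this master as the elected leader."""
    # Mesos reports master/elected as 1 on the leader and 0 otherwise.
    return snapshot.get("master/elected") == 1
```

Calling fetch_snapshot("http://localhost:5050") on the master host should return a dictionary of metric names and values. A connection error at this point suggests the url value in mesos.yaml needs adjusting before the agent can collect anything.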
Verifying the configuration
Run the agent's info command to verify the configuration:
sudo /etc/init.d/sd-agent info
If the agent has been configured correctly, you’ll see output such as:
mesos
-----
  - instance #0 [OK]
  - Collected 30 metrics & 3 service checks
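For automated checks, the status text above can be parsed rather than read by eye. This is a rough Python sketch that assumes the info output keeps the "instance #N [STATUS]" and "Collected N metrics & N service checks" wording shown in the example. The function names are hypothetical, and the sketch expects only the mesos portion of the output as input.

```python
import re

# Matches a status entry such as "instance #0 [OK]".
INSTANCE_RE = re.compile(r"instance #(\d+) \[(\w+)\]")
# Matches a summary such as "Collected 30 metrics & 3 service checks".
COLLECTED_RE = re.compile(r"Collected (\d+) metrics? & (\d+) service checks?")


def parse_check_status(text):
    """Extract instance statuses and collection counts from info output."""
    instances = {int(num): status for num, status in INSTANCE_RE.findall(text)}
    collected = COLLECTED_RE.search(text)
    return {
        "instances": instances,
        "metrics": int(collected.group(1)) if collected else None,
        "service_checks": int(collected.group(2)) if collected else None,
    }


def all_instances_ok(text):
    """Return True if at least one instance is listed and all report OK."""
    statuses = parse_check_status(text)["instances"].values()
    return bool(statuses) and all(status == "OK" for status in statuses)
```

Passing the example output to parse_check_status yields instance 0 with status OK, 30 metrics and 3 service checks. all_instances_ok returns False when no instance line is found, so an empty or unexpected output is not mistaken for success.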
Configuring graphs
In your Server Density account, click the name of your server in the Devices list, then go to the Metrics tab. Click the + Graph button on the right, then choose the Mesos metrics you want to graph. The same metrics are also available when building dashboard graphs.
Mesos metrics
- mesos.framework.cpu
- mesos.framework.mem
- mesos.framework.disk
- mesos.role.cpu
- mesos.role.mem
- mesos.role.disk
- mesos.cluster.tasks_error
- mesos.cluster.tasks_failed
- mesos.cluster.tasks_finished
- mesos.cluster.tasks_killed
- mesos.cluster.tasks_lost
- mesos.cluster.tasks_running
- mesos.cluster.tasks_staging
- mesos.cluster.tasks_starting
- mesos.cluster.slave_registrations
- mesos.cluster.slave_removals
- mesos.cluster.slave_reregistrations
- mesos.cluster.slave_shutdowns_canceled
- mesos.cluster.slave_shutdowns_scheduled
- mesos.cluster.slaves_active
- mesos.cluster.slaves_connected
- mesos.cluster.slaves_disconnected
- mesos.cluster.slaves_inactive
- mesos.cluster.recovery_slave_removals
- mesos.cluster.cpus_percent
- mesos.cluster.cpus_total
- mesos.cluster.cpus_used
- mesos.cluster.disk_percent
- mesos.cluster.disk_total
- mesos.cluster.disk_used
- mesos.cluster.mem_percent
- mesos.cluster.mem_total
- mesos.cluster.mem_used
- mesos.registrar.queued_operations
- mesos.registrar.registry_size_bytes
- mesos.registrar.state_fetch_ms
- mesos.registrar.state_store_ms
- mesos.registrar.state_store_ms.count
- mesos.registrar.state_store_ms.max
- mesos.registrar.state_store_ms.min
- mesos.registrar.state_store_ms.p50
- mesos.registrar.state_store_ms.p90
- mesos.registrar.state_store_ms.p95
- mesos.registrar.state_store_ms.p99
- mesos.registrar.state_store_ms.p999
- mesos.registrar.state_store_ms.p9999
- mesos.cluster.frameworks_active
- mesos.cluster.frameworks_connected
- mesos.cluster.frameworks_disconnected
- mesos.cluster.frameworks_inactive
- mesos.stats.system.cpus_total
- mesos.stats.system.load_15min
- mesos.stats.system.load_1min
- mesos.stats.system.load_5min
- mesos.stats.system.mem_free_bytes
- mesos.stats.system.mem_total_bytes
- mesos.stats.elected
- mesos.stats.uptime_secs
- mesos.cluster.dropped_messages
- mesos.cluster.outstanding_offers
- mesos.cluster.event_queue_dispatches
- mesos.cluster.event_queue_http_requests
- mesos.cluster.event_queue_messages
- mesos.cluster.invalid_framework_to_executor_messages
- mesos.cluster.invalid_status_update_acknowledgements
- mesos.cluster.invalid_status_updates
- mesos.cluster.valid_framework_to_executor_messages
- mesos.cluster.valid_status_update_acknowledgements
- mesos.cluster.valid_status_updates