This article will help you get the Kafka plugin for sd-agent configured and returning metrics.
Installing the Kafka plugin package
Install the Kafka plugin on Debian/Ubuntu:
sudo apt-get install sd-agent-kafka
Install the Kafka plugin on RHEL/CentOS:
sudo yum install sd-agent-kafka
Read more about agent plugins.
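If you provision a mix of Debian-based and RHEL-based hosts, the choice between the two install commands above can be scripted. The following is a minimal sketch, not part of the agent: the helper name `install_command` is hypothetical, and it assumes that finding `apt-get` or `yum` on the PATH is enough to identify the platform.

```python
import shutil

PACKAGE = "sd-agent-kafka"  # package name taken from the commands above


def install_command(which=shutil.which):
    """Return the install command for the first supported package manager found.

    `which` is injectable so the lookup can be exercised without touching the host.
    """
    if which("apt-get"):
        return ["sudo", "apt-get", "install", PACKAGE]
    if which("yum"):
        return ["sudo", "yum", "install", PACKAGE]
    raise RuntimeError("neither apt-get nor yum was found on PATH")
```

The returned list can be passed to `subprocess.run(...)`, or joined with spaces and printed for review before running it by hand.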
Configuring the agent to monitor Apache Kafka
This guide is for Kafka >= 0.8.2.
1. Configure /etc/sd-agent/conf.d/kafka.yaml
instances:
  - host: localhost
    port: 9999 # This is the JMX port on which Kafka exposes its metrics (usually 9999)
    tags:
      kafka: broker
      # env: stage
      # newTag: test
    # user: username
    # password: password
    # process_name_regex: .*process_name.* # Instead of specifying a host and port, the agent can connect using the attach API.
    #                                      # This requires the JDK to be installed and the path to tools.jar to be set below.
    # tools_jar_path: /usr/lib/jvm/java-7-openjdk-amd64/lib/tools.jar # To be set when process_name_regex is set
    # name: kafka_instance
    # java_bin_path: /path/to/java # Optional, should be set if the agent cannot find your java executable
    # trust_store_path: /path/to/trustStore.jks # Optional, should be set if SSL is enabled
    # trust_store_password: password
  # - host: remotehost
  #   port: 9998 # Producer
  #   tags:
  #     kafka: producer0
  #     env: stage
  #     newTag: test
  # - host: remotehost
  #   port: 9997 # Consumer
  #   tags:
  #     kafka: consumer0
  #     env: stage
  #     newTag: test
You should not have to edit anything in the init_config: section.
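Before restarting the agent, it can help to confirm that something is actually listening on the JMX port named in kafka.yaml. The sketch below is an illustration only: `jmx_port_open` is a hypothetical helper, and a successful TCP connection shows that the port is open, not that a JMX endpoint is answering on it. Kafka's start scripts read the `JMX_PORT` environment variable, so a broker started without it will normally leave the port closed.

```python
import socket


def jmx_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For the example instance above you would call `jmx_port_open("localhost", 9999)`; a `False` result suggests checking the broker's JMX settings or any firewall in between before looking at the agent.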
2. Restart the agent
sudo /etc/init.d/sd-agent restart
or
sudo systemctl restart sd-agent
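Which of the two restart commands applies depends on the host's init system. As a rough sketch, assuming the standard systemd convention that the directory /run/systemd/system exists only when systemd is the running init system, a script could choose between them as follows. The function name `restart_command` is hypothetical.

```python
import os


def restart_command(systemd_dir="/run/systemd/system"):
    """Pick the agent restart command based on whether systemd is running.

    `systemd_dir` is a parameter only so the check can be tested against
    a temporary directory.
    """
    if os.path.isdir(systemd_dir):
        return ["sudo", "systemctl", "restart", "sd-agent"]
    return ["sudo", "/etc/init.d/sd-agent", "restart"]
```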
Verifying the configuration
Run the agent's info command to verify the configuration:
sudo /etc/init.d/sd-agent info
or
/usr/share/python/sd-agent/agent.py info
If the agent has been configured correctly you'll see an output such as:
kafka
-----
  - instance #0 [OK]
  - Collected * metrics
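If you want to check this output from a script, for example in a post-deploy step, the instance status can be extracted with a small parser. This is a sketch under assumptions: it expects the layout shown above (a check name, a line of dashes, then `- instance #N [STATUS]` lines), the name `instance_statuses` is hypothetical, and any status other than `OK` is simply reported as found.

```python
import re

INSTANCE_RE = re.compile(r"-\s*instance #(\d+) \[(\w+)\]")


def instance_statuses(info_text, check="kafka"):
    """Map instance number to status for one check in the agent's info output."""
    lines = [line.strip() for line in info_text.splitlines()]
    statuses = {}
    current = None
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        # A section header is a non-empty line followed by a line made only of dashes.
        if line and nxt and set(nxt) == {"-"}:
            current = line
            continue
        if current == check:
            match = INSTANCE_RE.search(line)
            if match:
                statuses[int(match.group(1))] = match.group(2)
    return statuses
```

Feeding it the captured text of the info command and testing that every value equals `"OK"` gives a simple pass/fail signal; an empty result means the kafka section was not found at all.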
You can also view the metrics returned with the following command:
service sd-agent jmx collect
Configuring graphs
Click the name of your server in the Devices list in your Server Density account, then go to the Metrics tab. Click the + Graph button on the right, then choose the Kafka metrics you want to graph. The metrics will also be available to select when building dashboard graphs.
Monitored metrics
Metric | Description | Unit | Type |
---|---|---|---|
kafka.consumer.bytes_in | Consumer bytes in rate. | byte / second | float |
kafka.consumer.delayed_requests | Number of delayed consumer requests. | request | float |
kafka.consumer.expires_per_second | Rate of delayed consumer request expiration. | eviction / second | float |
kafka.consumer.fetch_rate | The minimum rate at which the consumer sends fetch requests to a broker. | request | float |
kafka.consumer.kafka_commits | Rate of offset commits to Kafka. | write / second | float |
kafka.consumer.max_lag | Maximum consumer lag. | offset | float |
kafka.consumer.messages_in | Rate of consumer message consumption. | message / second | float |
kafka.consumer.zookeeper_commits | Rate of offset commits to ZooKeeper. | write / second | float |
kafka.expires_sec | Rate of delayed producer request expiration. | eviction / second | float |
kafka.follower.expires_per_second | Rate of request expiration on followers. | eviction / second | float |
kafka.log.flush_rate | Log flush rate. | flush / second | float |
kafka.messages_in | Incoming message rate. | message | float |
kafka.net.bytes_in | Incoming byte rate. | byte / second | float |
kafka.net.bytes_out | Outgoing byte rate. | byte / second | float |
kafka.net.bytes_rejected | Rejected byte rate. | byte / second | float |
kafka.producer.bytes_out | Producer bytes out rate. | byte / second | float |
kafka.producer.delayed_requests | Number of producer requests delayed. | request | float |
kafka.producer.expires_per_seconds | Rate of producer request expiration. | eviction / second | float |
kafka.producer.io_wait | Producer I/O wait time. | nanosecond | float |
kafka.producer.message_rate | Producer message rate. | message / second | float |
kafka.producer.request_latency_avg | Producer average request latency. | millisecond | float |
kafka.producer.request_rate | Number of producer requests per second. | request / second | float |
kafka.producer.response_rate | Number of producer responses per second. | response / second | float |
kafka.replication.isr_expands | Rate of replicas joining the ISR pool. | node / second | float |
kafka.replication.isr_shrinks | Rate of replicas leaving the ISR pool. | node / second | float |
kafka.replication.leader_elections | Leader election rate. | event / second | float |
kafka.replication.unclean_leader_elections | Unclean leader election rate. | event / second | float |
kafka.replication.under_replicated_partitions | Number of under-replicated partitions. | none | float |
kafka.request.fetch.failed | Number of client fetch request failures. | request | float |
kafka.request.fetch.failed_per_second | Rate of client fetch request failures per second. | request / second | float |
kafka.request.fetch.time.99percentile | Time for fetch requests for 99th percentile. | request / second | float |
kafka.request.fetch.time.avg | Average time per fetch request. | request / second | float |
kafka.request.handler.avg.idle.pct | Average fraction of time the request handler threads are idle. | fraction | float |
kafka.request.metadata.time.99percentile | Time for metadata requests for 99th percentile. | millisecond | float |
kafka.request.metadata.time.avg | Average time for metadata request. | millisecond | float |
kafka.request.offsets.time.99percentile | Time for offset requests for 99th percentile. | millisecond | float |
kafka.request.offsets.time.avg | Average time for an offset request. | millisecond | float |
kafka.request.produce.failed | Number of failed produce requests. | request | float |
kafka.request.produce.failed_per_second | Rate of failed produce requests per second. | request / second | float |
kafka.request.produce.time.99percentile | Time for produce requests for 99th percentile. | request / second | float |
kafka.request.produce.time.avg | Average time for a produce request. | request / second | float |
kafka.request.update_metadata.time.99percentile | Time for update metadata requests for 99th percentile. | millisecond | float |
kafka.request.update_metadata.time.avg | Average time for a request to update metadata. | millisecond | float |