Monitoring Kafka

This article explains how to configure the Kafka plugin for sd-agent so that it returns metrics.

Installing the kafka plugin package

Install the kafka plugin on Debian/Ubuntu:

sudo apt-get install sd-agent-kafka

Install the kafka plugin on RHEL/CentOS:

sudo yum install sd-agent-kafka
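
If you provision a mix of Debian-based and RHEL-based servers, the two install commands above can be wrapped in one script. The following is a minimal sketch, not part of the agent itself; the function names detect_pkg_manager and install_kafka_plugin are made up for illustration.

```shell
# Print the package manager found on this host: "apt-get", "yum", or "unknown".
# (Hypothetical helper, not shipped with sd-agent.)
detect_pkg_manager() {
  if command -v apt-get >/dev/null 2>&1; then
    echo apt-get
  elif command -v yum >/dev/null 2>&1; then
    echo yum
  else
    echo unknown
  fi
}

# Install the Kafka plugin with whichever package manager was detected.
install_kafka_plugin() {
  case "$(detect_pkg_manager)" in
    apt-get) sudo apt-get install sd-agent-kafka ;;
    yum)     sudo yum install sd-agent-kafka ;;
    *)       echo "No supported package manager found" >&2; return 1 ;;
  esac
}
```

Calling install_kafka_plugin then runs the matching command from this section, or returns a non-zero status if neither package manager is present.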

Read more about agent plugins.

Configuring the agent to monitor Apache Kafka

This guide is for Kafka >= 0.8.2.

1. Configure /etc/sd-agent/conf.d/kafka.yaml

instances:
  - host: localhost
    port: 9999 # This is the JMX port on which Kafka exposes its metrics (usually 9999)
    tags:
      kafka: broker
      # env: stage
      # newTag: test
    # user: username
    # password: password
    # process_name_regex: .*process_name.*  # Instead of specifying a host and port, the agent can connect using the attach API.
    # This requires the JDK to be installed and the path to tools.jar to be set below.
    # tools_jar_path: /usr/lib/jvm/java-7-openjdk-amd64/lib/tools.jar  # To be set when process_name_regex is set
    # name: kafka_instance
    # java_bin_path: /path/to/java  # Optional, should be set if the agent cannot find your java executable
    # trust_store_path: /path/to/trustStore.jks  # Optional, should be set if ssl is enabled
    # trust_store_password: password
  # - host: remotehost
  #   port: 9998 # Producer
  #   tags:
  #     kafka: producer0
  #     env: stage
  #     newTag: test
  # - host: remotehost
  #   port: 9997 # Consumer
  #   tags:
  #     kafka: consumer0
  #     env: stage
  #     newTag: test

You should not need to edit anything in the init_config: section.
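
Before restarting the agent, it can help to confirm that something is listening on the JMX port set in kafka.yaml. The sketch below is an illustration under some assumptions: it needs bash and the coreutils timeout command, the function name jmx_port_state is hypothetical, and an open port only shows that a TCP listener exists, not that it is serving JMX.

```shell
# Print "open" if a TCP connection to HOST:PORT succeeds within 2 seconds,
# otherwise print "closed". (Hypothetical helper, not part of sd-agent.)
jmx_port_state() {
  host=$1
  port=$2
  # bash's /dev/tcp pseudo-device opens a TCP connection to host:port
  if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo open
  else
    echo closed
  fi
}
```

For the example configuration above you would run jmx_port_state localhost 9999. If it prints closed, check that the broker was started with JMX enabled on that port.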

2. Restart the agent

sudo /etc/init.d/sd-agent restart

or

sudo systemctl restart sd-agent

Verifying the configuration

Run the info command to verify the configuration:

sudo /etc/init.d/sd-agent info 

or

/usr/share/python/sd-agent/agent.py info

If the agent has been configured correctly, you'll see output such as:

kafka
-----
  - instance #0 [OK]
  - Collected * metrics
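
To script this check, for example in a deployment pipeline, you can parse the info output. The sketch below assumes the output keeps the layout shown above (a kafka heading followed by one line per instance) and that a blank line ends the section. The function name kafka_instances_ok is hypothetical.

```shell
# Read `sd-agent info` output on stdin.
# Exit 0 only if at least one kafka instance is listed and all report [OK].
kafka_instances_ok() {
  awk '
    # A line containing only "kafka" starts the section
    /^[[:space:]]*kafka[[:space:]]*$/ { in_kafka = 1; next }
    # Assumption: a blank line ends the section
    in_kafka && /^[[:space:]]*$/ { in_kafka = 0 }
    # Count instances, and count those not marked [OK]
    in_kafka && /instance #[0-9]+/ {
      seen++
      if ($0 !~ /\[OK\]/) bad++
    }
    END { exit !(seen > 0 && bad == 0) }
  '
}
```

A possible use: sudo /etc/init.d/sd-agent info | kafka_instances_ok && echo "kafka check OK"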

You can also view the metrics returned with the following command:

service sd-agent jmx collect

Configuring graphs

Click the name of your server from the Devices list in your Server Density account then go to the Metrics tab. Click the + Graph button on the right then choose the kafka metrics to display the graphs. The metrics will also be available to select when building dashboard graphs.


Monitored metrics

Metric Values
kafka.consumer.bytes_in

Consumer bytes in rate.
byte / second
Type: float
kafka.consumer.delayed_requests

Number of delayed consumer requests.
request / None
Type: float
kafka.consumer.expires_per_second

Rate of delayed consumer request expiration.
eviction / second
Type: float
kafka.consumer.fetch_rate

The minimum rate at which the consumer sends fetch requests to a broker.
request / None
Type: float
kafka.consumer.kafka_commits

Rate of offset commits to Kafka.
write / second
Type: float
kafka.consumer.max_lag

Maximum consumer lag.
offset / None
Type: float
kafka.consumer.messages_in

Rate of consumer message consumption.
message / second
Type: float
kafka.consumer.zookeeper_commits

Rate of offset commits to ZooKeeper.
write / second
Type: float
kafka.expires_sec

Rate of delayed producer request expiration.
eviction / second
Type: float
kafka.follower.expires_per_second

Rate of request expiration on followers.
eviction / second
Type: float
kafka.log.flush_rate

Log flush rate.
flush / second
Type: float
kafka.messages_in

Incoming message rate.
message / None
Type: float
kafka.net.bytes_in

Incoming byte rate.
byte / second
Type: float
kafka.net.bytes_out

Outgoing byte rate.
byte / second
Type: float
kafka.net.bytes_rejected

Rejected byte rate.
byte / second
Type: float
kafka.producer.bytes_out

Producer bytes out rate.
byte / second
Type: float
kafka.producer.delayed_requests

Number of producer requests delayed.
request / None
Type: float
kafka.producer.expires_per_seconds

Rate of producer request expiration.
eviction / second
Type: float
kafka.producer.io_wait

Producer I/O wait time.
nanosecond / None
Type: float
kafka.producer.message_rate

Producer message rate.
message / second
Type: float
kafka.producer.request_latency_avg

Producer average request latency.
millisecond / None
Type: float
kafka.producer.request_rate

Number of producer requests per second.
request / second
Type: float
kafka.producer.response_rate

Number of producer responses per second.
response / second
Type: float
kafka.replication.isr_expands

Rate of replicas joining the ISR pool.
node / second
Type: float
kafka.replication.isr_shrinks

Rate of replicas leaving the ISR pool.
node / second
Type: float
kafka.replication.leader_elections

Leader election rate.
event / second
Type: float
kafka.replication.unclean_leader_elections

Unclean leader election rate.
event / second
Type: float
kafka.replication.under_replicated_partitions

Number of under-replicated partitions.
None / None
Type: float
kafka.request.fetch.failed

Number of client fetch request failures.
request / None
Type: float
kafka.request.fetch.failed_per_second

Rate of client fetch request failures.
request / second
Type: float
kafka.request.fetch.time.99percentile

99th percentile of fetch request time.
millisecond / None
Type: float
kafka.request.fetch.time.avg

Average time per fetch request.
millisecond / None
Type: float
kafka.request.handler.avg.idle.pct

Average fraction of time the request handler threads are idle.
fraction / None
Type: float
kafka.request.metadata.time.99percentile

99th percentile of metadata request time.
millisecond / None
Type: float
kafka.request.metadata.time.avg

Average time for metadata request.
millisecond / None
Type: float
kafka.request.offsets.time.99percentile

99th percentile of offset request time.
millisecond / None
Type: float
kafka.request.offsets.time.avg

Average time for an offset request.
millisecond / None
Type: float
kafka.request.produce.failed

Number of failed produce requests.
request / None
Type: float
kafka.request.produce.failed_per_second

Rate of failed produce requests.
request / second
Type: float
kafka.request.produce.time.99percentile

99th percentile of produce request time.
millisecond / None
Type: float
kafka.request.produce.time.avg

Average time for a produce request.
millisecond / None
Type: float
kafka.request.update_metadata.time.99percentile

99th percentile of update metadata request time.
millisecond / None
Type: float
kafka.request.update_metadata.time.avg

Average time for a request to update metadata.
millisecond / None
Type: float