Telegraf performance with the statsd input plugin and kafka output plugin

Dear all,
I need to handle about 100k metrics/sec. But when I test the statsd input with the kafka output, I only get about 600k metrics/min (roughly 10k/sec).
Has anyone tested Telegraf performance, or can anyone give me some ideas?

Machine type:

1 Telegraf server: 2 CPU cores, 4 GB memory
3 Kafka servers: 2 CPU cores and 8 GB memory each

Software Version:

Telegraf 1.3.0
Kafka 0.10.2

Telegraf config:

[global_tags]
[agent]
interval = "60s"
round_interval = true
metric_batch_size = 600000
metric_buffer_limit = 10000000
collection_jitter = "0s"
flush_interval = "60s"
flush_jitter = "0s"
precision = ""
debug = true
quiet = false
logfile = "/var/log/telegraf/telegraf.log"
hostname = ""
omit_hostname = true

[[outputs.kafka]]
brokers = ["10.62.4.160:9092","10.62.4.161:9092","10.62.4.162:9092"]
topic = "telegraf"
routing_tag = "host"
compression_codec = 0
required_acks = 0
max_retry = 3
data_format = "graphite"

[[inputs.statsd]]
service_address = "10.62.4.159:12004"
delete_gauges = true
delete_counters = true
delete_sets = true
delete_timings = true
percentiles = [90]
metric_separator = "."
parse_data_dog_tags = false
allowed_pending_messages = 10000000
percentile_limit = 1000

Test Script:

statsd-tg -d 10.62.4.159 -D 12004 -T 2 -s 0 -c 1000000 -t 0 -g 0

Test Result:

2017-05-18T03:40:00Z D! Output [kafka] buffer fullness: 603796 / 10000000 metrics.
2017-05-18T03:40:59Z D! Output [kafka] wrote batch of 603796 metrics in 59.374510298s
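To turn a "wrote batch" debug line like the one above into a per-second figure, a small script can divide the metric count by the write time. This is only a sketch: the regular expression and the `batch_rate` helper are my own names, and it assumes the duration is always printed as plain seconds (for example `59.374510298s`), which may not hold for longer writes.

```python
import re

# Matches Telegraf debug lines such as:
#   "... D! Output [kafka] wrote batch of 603796 metrics in 59.374510298s"
# Assumption: the duration is printed as plain seconds with an "s" suffix.
BATCH_RE = re.compile(r"Output \[(\w+)\] wrote batch of (\d+) metrics in ([\d.]+)s")


def batch_rate(line):
    """Return (output_name, metrics_per_second) for a 'wrote batch' line, or None."""
    match = BATCH_RE.search(line)
    if match is None:
        return None
    name = match.group(1)
    count = int(match.group(2))
    seconds = float(match.group(3))
    return name, count / seconds


line = "2017-05-18T03:40:59Z D! Output [kafka] wrote batch of 603796 metrics in 59.374510298s"
name, rate = batch_rate(line)
print(f"{name}: {rate:.0f} metrics/s")
```

For the kafka line above this comes out at a little over 10k metrics/s, which matches the 600k/min figure.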

I also tried the socket_writer output plugin over UDP and TCP. With UDP the throughput reaches about 100k metrics/s, and with TCP about 50k metrics/s. But the kafka output plugin only reaches about 10k metrics/s.

Telegraf UDP config:

[global_tags]
[agent]
interval = "60s"
round_interval = true
metric_batch_size = 600000
metric_buffer_limit = 10000000
collection_jitter = "0s"
flush_interval = "60s"
flush_jitter = "0s"
precision = ""
debug = true
quiet = false
logfile = "/var/log/telegraf/telegraf.log"
hostname = ""
omit_hostname = true

[[outputs.socket_writer]]
address = "udp://127.0.0.1:8094"
data_format = "graphite"

[[inputs.statsd]]
service_address = "10.62.4.162:12004"
delete_gauges = true
delete_counters = true
delete_sets = true
delete_timings = true
percentiles = [90]
metric_separator = "."
parse_data_dog_tags = false
allowed_pending_messages = 10000000
percentile_limit = 1000

Result:

2017-05-18T06:50:08Z D! Output [socket_writer] wrote batch of 470028 metrics in 7.928522448s
2017-05-18T06:50:12Z D! Output [socket_writer] wrote batch of 600000 metrics in 7.39794513s
2017-05-18T06:51:00Z D! Output [socket_writer] buffer fullness: 418543 / 10000000 metrics.
2017-05-18T06:51:05Z D! Output [socket_writer] wrote batch of 418543 metrics in 5.534527361s
2017-05-18T06:51:10Z D! Output [socket_writer] wrote batch of 600000 metrics in 7.842004589s

Telegraf TCP config:

[[outputs.socket_writer]]
address = "tcp://127.0.0.1:8094"
data_format = "graphite"

Result:

2017-05-18T07:01:15Z D! Output [socket_writer] wrote batch of 478660 metrics in 15.150335814s
2017-05-18T07:01:17Z D! Output [socket_writer] wrote batch of 600000 metrics in 14.857339777s
2017-05-18T07:02:00Z D! Output [socket_writer] buffer fullness: 399438 / 10000000 metrics.
2017-05-18T07:02:04Z D! Output [socket_writer] wrote batch of 399438 metrics in 4.459943777s
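To compare the three outputs side by side, the batch sizes and write times from the logs in this post can be summed per output. This is a rough sketch: the labels and the `write_rates` helper are my own, and the result is only the rate while a write is in progress (total metrics divided by total write time). It is not an end-to-end rate, and batches whose writes overlap in time would skew it.

```python
from collections import defaultdict

# (output label, metrics written, seconds taken), copied from the log lines in this post.
# The labels are illustrative; Telegraf itself logs both socket tests as "socket_writer".
batches = [
    ("kafka", 603796, 59.374510298),
    ("socket_writer udp", 470028, 7.928522448),
    ("socket_writer udp", 600000, 7.39794513),
    ("socket_writer udp", 418543, 5.534527361),
    ("socket_writer udp", 600000, 7.842004589),
    ("socket_writer tcp", 478660, 15.150335814),
    ("socket_writer tcp", 600000, 14.857339777),
    ("socket_writer tcp", 399438, 4.459943777),
]


def write_rates(rows):
    """Return {output: total metrics / total write seconds} for the given rows."""
    totals = defaultdict(lambda: [0, 0.0])
    for output, count, seconds in rows:
        totals[output][0] += count
        totals[output][1] += seconds
    return {output: count / seconds for output, (count, seconds) in totals.items()}


for output, rate in write_rates(batches).items():
    print(f"{output}: {rate:,.0f} metrics/s while writing")
```

The ordering is the same as in the tests above: UDP is fastest, TCP is in the middle, and kafka is far behind.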