Difference between kafka_consumer_legacy & kafka_consumer

telegraf

#1

Hi,

What is the difference between the kafka_consumer_legacy and kafka_consumer input plugins?
Do they depend on each other in order to consume data from kafka, or should each of them work independently?

I installed influxdb and kafka in my environment and set up some telegraf agents to consume from kafka using the kafka_consumer plugin. After a while, I realized that some data was missing from the influxdb database.

For example,
I have producer telegraf agents running on host1, host2 and host3 (writing data to kafka)
And I have consumer telegraf agents running on host5 and host6 (reading data from kafka, writing data to influxdb)
Suppose they all are running without any problem.
I added one more producer running on host4, but it didn’t appear in influxdb. I set debug = true in telegraf.conf on host4, and the logs confirmed that it was writing data to kafka.
I then checked kafka with the kafka-console-consumer.sh command to see whether the data for host4 was there. Yes, it was.
I also tried adding more producers. Some of them appeared in influxdb but some of them did not.

The issue is that host5 and host6 are not consuming data for some hosts, and I couldn’t find the reason.

Then I switched to the kafka_consumer_legacy plugin. With this one, I now have data in influxdb from all hosts. This is working fine.

So this is why I wonder whether kafka_consumer_legacy and kafka_consumer depend on each other.
What is your experience? Do you think I did something wrong or missed something?

Regards,
ismail


#2

The legacy plugin is no longer maintained and will be removed when we release 2.0, so we should try to get kafka_consumer working for you. It is pretty strange that it would skip the results sent by a particular client. Can you try to come up with a smaller example that reproduces it, perhaps one needing fewer hosts?


#3

I think our issue is not related to kafka_consumer_legacy or kafka_consumer.
Currently telegraf is configured to use the kafka_consumer_legacy plugin, and this telegraf agent still cannot process some data from kafka.
Our environment has a 3-node kafka cluster and a 3-node zookeeper cluster. I have created two topics, telegraf and custommon. One of our telegraf agents is configured to read from the custommon topic.
On the other hand, we have 2 telegraf agents running on different hosts with the exec plugin. They execute scripts and push data to kafka. The data looks like this:

[root@KLMETKFKT1 bin]# /u01/kafka_2.11-1.0.1/bin/kafka-console-consumer.sh --bootstrap-server KLMETKFKT1:9092 --topic custommon
citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=total value=139 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=active value=102 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=disconnected value=37 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=reconnecting value=0 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=other value=0 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=total value=139 1524569136000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=active value=102 1524569136000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=disconnected value=37 1524569136000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=reconnecting value=0 1524569136000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=other value=0 1524569136000000000
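To double-check which hosts are present in console-consumer output like the above, a small script can collect the distinct host tag values. This is only a minimal sketch: the function names parse_tags and hosts_in are made up for illustration, and the parser assumes tag keys and values contain no escaped commas or spaces, which holds for the sample lines shown here but not for line protocol in general.

```python
def parse_tags(line):
    """Return (measurement, tags) for one line-protocol line.

    Assumption: no escaped commas or spaces in the measurement or tags.
    """
    head = line.split(" ", 1)[0]            # "measurement,tag=value,..." part
    measurement, *pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in pairs)
    return measurement, tags


def hosts_in(lines):
    """Collect the distinct `host` tag values, skipping blank lines."""
    hosts = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _, tags = parse_tags(line)
        if "host" in tags:
            hosts.add(tags["host"])
    return hosts


# Two of the lines from the output above, with a blank line in between.
sample = [
    "citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=total value=139 1524569107000000000",
    "",
    "citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=total value=139 1524569136000000000",
]
print(sorted(hosts_in(sample)))
```

Since hosts_in accepts any iterable of lines, the console-consumer output could also be piped into a script that calls hosts_in(sys.stdin). The resulting set can then be compared with the host values that influxdb reports.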

But when we look into influxdb, only one host shows up:

[root@KLMETIDBT1 ~]# influx
Connected to http://localhost:8086 version 1.5.1
InfluxDB shell version: 1.5.1
> use custommon
Using database custommon
> SHOW TAG VALUES ON "custommon" FROM "citrix_icasessions" WITH KEY = "host"
name: citrix_icasessions
key  value
---  -----
host KWXENDDCT3
>

KWXENDDCT4 is a recently configured node. It was set up approximately 2 weeks after KWXENDDCT3.

Do you have any idea?


#4

Perhaps the host is being overwritten by the Kafka consumer? You may want to disable the hostname tag on the Kafka consumers by setting:

[agent]
  omit_hostname = true