Difference between kafka_consumer_legacy & kafka_consumer

Hi,

I wonder what the difference is between the kafka_consumer_legacy and kafka_consumer input plugins.
Do they depend on each other in order to consume data from Kafka, or should each of them work independently?

I installed InfluxDB and Kafka in my environment and set up some Telegraf agents to consume from Kafka using the kafka_consumer plugin. After a while, I realized that some data was missing from the InfluxDB database.

For example:
I have producer Telegraf agents running on host1, host2 and host3 (writing data to Kafka).
And I have consumer Telegraf agents running on host5 and host6 (reading data from Kafka and writing it to InfluxDB).
Suppose they are all running without any problem.
I added one more producer running on host4, but it didn't appear in InfluxDB. I set debug = true in telegraf.conf on host4, and from the logs I confirmed that it was writing data to Kafka.
I checked Kafka with the kafka-console-consumer.sh command to see whether the data for host4 existed. Yes, it was there.
I also tried adding more producers. Some of them appeared in InfluxDB, but some of them did not.

The issue is that host5 and host6 are not consuming data for some hosts, and I couldn't find the reason.

Then I switched to the kafka_consumer_legacy plugin. With this one, I now have data in InfluxDB from all hosts. This is working fine.

So this is why I wonder whether kafka_consumer_legacy and kafka_consumer depend on each other.
What is your experience? Do you think I did something wrong or missed something?

Regards,
ismail

The legacy plugin is no longer maintained and will be removed when we release 2.0, so we should try to get kafka_consumer working for you. It is pretty strange that it would skip the results sent by a particular client. Can you try to come up with a smaller example that reproduces it, perhaps needing fewer hosts?
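
A minimal reproduction could be as small as one consumer agent that prints whatever it reads to stdout, which would show whether the metrics are lost in the consumer or later on the way to InfluxDB. This is only a sketch: the broker address, topic name and consumer group are placeholders rather than values from your setup, and it assumes the producers write InfluxDB line protocol.

```toml
# Sketch of a single-consumer repro config; every value below is a placeholder.
[agent]
  debug = true                         # log plugin activity while testing

[[inputs.kafka_consumer]]
  brokers = ["localhost:9092"]         # placeholder broker address
  topics = ["telegraf"]                # placeholder topic name
  consumer_group = "telegraf_repro"    # fresh group, so no previously committed offsets are reused
  offset = "oldest"                    # where to start when the group has no committed offset
  data_format = "influx"               # assumes producers send line protocol

[[outputs.file]]
  files = ["stdout"]                   # print consumed metrics instead of writing to InfluxDB
  data_format = "influx"
```

Running a single agent with this file and comparing the host tags it prints with what kafka-console-consumer.sh shows for the same topic should help narrow down where the metrics go missing.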

I think our issue is not related to the choice between kafka_consumer_legacy and kafka_consumer.
Currently Telegraf is configured to use the kafka_consumer_legacy plugin, and this Telegraf agent cannot process some of the data from Kafka.
Our environment has 3 Kafka nodes and 3 ZooKeeper nodes. I have created two topics, telegraf and custommon. One of our Telegraf agents is configured to read from the custommon topic.
On the other hand, we have 2 Telegraf agents running on different hosts with the exec plugin. They execute scripts and push the data to Kafka. The data looks like this:

[root@KLMETKFKT1 bin]# /u01/kafka_2.11-1.0.1/bin/kafka-console-consumer.sh --bootstrap-server KLMETKFKT1:9092 --topic custommon
citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=total value=139 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=active value=102 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=disconnected value=37 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=reconnecting value=0 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT3,sessions=other value=0 1524569107000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=total value=139 1524569136000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=active value=102 1524569136000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=disconnected value=37 1524569136000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=reconnecting value=0 1524569136000000000

citrix_icasessions,environment=uat,host=KWXENDDCT4,sessions=other value=0 1524569136000000000

When we look in InfluxDB:

[root@KLMETIDBT1 ~]# influx
Connected to http://localhost:8086 version 1.5.1
InfluxDB shell version: 1.5.1
> use custommon
Using database custommon
> SHOW TAG VALUES ON "custommon" FROM "citrix_icasessions" WITH KEY = "host"
name: citrix_icasessions
key  value
---  -----
host KWXENDDCT3
>

KWXENDDCT4 is a recently configured node. It was set up approximately 2 weeks after KWXENDDCT3.

Do you have any idea?

Perhaps the host tag is being overwritten by the Kafka consumer? You may want to disable the hostname tag on the Kafka consumer agents by setting:

[agent]
  omit_hostname = true