Writing to InfluxDB via Telegraf shows fewer data points than inserted

I'm using Kafka to publish 1000 messages to Telegraf, which in turn writes them to InfluxDB.
The measurement count in the InfluxDB CLI shows only 529 data points for the query
(select count(*) from <meas_name>)
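
Since the data is written through outputs.influxdb_v2, an equivalent count can also be run against the bucket itself. This is a minimal sketch using the influxdb-client Python package (an assumption, not part of my setup); the token is a placeholder, org/bucket match the config below, and the measurement name is the msgId value from the sample message further down (because of json_name_key):

from influxdb_client import InfluxDBClient

# Connection details taken from the Telegraf output config below; token is a placeholder.
client = InfluxDBClient(url="http://localhost:8086", token="<token>", org="qwerty")

# Count points per field in the measurement named after the msgId value.
flux = '''
from(bucket: "BUCK1")
  |> range(start: 0)
  |> filter(fn: (r) => r["_measurement"] == "76543")
  |> group(columns: ["_field"])
  |> count()
'''

for table in client.query_api().query(flux):
    for record in table.records:
        print(record.get_field(), record.get_value())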

The Telegraf config is this:

[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = true
  # quiet = false
  logtarget = "file"
  logfile = "/home/telegraf/telegraf.log"
  logfile_rotation_interval = "2d"
  logfile_rotation_max_size = "50MB"
  logfile_rotation_max_archives = 50
  # log_with_timezone = ""
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = false

[[outputs.influxdb_v2]]
  urls = ["http://localhost:8086"]
  token = "V645342442sfsfsdfR-kky6zA_zOnJOoUdIaeBYWG3dXlkXobfNRDSJQNq33Lnk_K3lYopn3duu9OWSwweNlA9ADDw=="
  organization = "qwerty"
  bucket = "BUCK1"

# Read metrics from Kafka topics
[[inputs.kafka_consumer]]
  ## Kafka brokers.
  brokers = ["localhost:9092"]

  ## Topics to consume.
  topics = ["TOPIC_TEST1", "TOPIC_TEST2"]


  sasl_username = "admin"
  sasl_password = "admin"
  sasl_mechanism = "PLAIN"

  max_message_len = 1000000
  data_format = "json"
  json_name_key = "msgId"
  tag_keys = ["gwId","deviceId","category","uid"]
  json_time_key = "srcTimestamp"
  json_time_format = "unix_ms"

The Telegraf logs do show 200 + 800 (1000) writes.

But for that particular msgId in InfluxDB, the total count of points is in the range of 520-530.
The InfluxDB logs show no sign of any errors.

The message published from Kafka is:

{"msgId":"76543","gwId":"EA0G0018","deviceId":"ED401A0D0000089","category":"TELEMETRY","uid":"ED401A0D0000089-90007-1","temperature":999,"srcTimestamp":1724338050166}

The temperature value ranges from 0 to 999 (just testing).
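
A hypothetical producer for these test messages, sketched with the kafka-python package (topic name, SASL credentials, and field values taken from the config and sample message above; srcTimestamp set in code at millisecond precision), might look roughly like this:

import json
import random
import time

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["localhost:9092"],
    # SASL settings from the Telegraf config; security_protocol is an assumption about the broker listener.
    security_protocol="SASL_PLAINTEXT",
    sasl_mechanism="PLAIN",
    sasl_plain_username="admin",
    sasl_plain_password="admin",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for _ in range(1000):
    producer.send("TOPIC_TEST1", {
        "msgId": "76543",
        "gwId": "EA0G0018",
        "deviceId": "ED401A0D0000089",
        "category": "TELEMETRY",
        "uid": "ED401A0D0000089-90007-1",
        "temperature": random.randint(0, 999),
        # Millisecond timestamp generated in code; at 500-1000 msg/s many
        # messages can fall into the same millisecond and share this value.
        "srcTimestamp": int(time.time() * 1000),
    })

producer.flush()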

Any help is appreciated. Am I missing some config in Telegraf, or is there some optimization required?

Hello @Mayur_Rotti,
My guess is that you’re overwriting data or making upserts.
If you write something with the same series and the same timestamp, you'll end up overwriting the earlier point.
For example:

mymeas,tag1=tagkey fieldkey=0 timestamp1
mymeas,tag1=tagkey fieldkey=2 timestamp1

Telegraf would write two points, but only one point would appear in InfluxDB:

mymeas,tag1=tagkey fieldkey=2 timestamp1
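
To see this outside of Telegraf, here is a minimal sketch with the influxdb-client Python package (token is a placeholder; org and bucket match the config above) that writes two points sharing a series and a timestamp, so only the second one survives:

from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="<token>", org="qwerty")
write_api = client.write_api(write_options=SYNCHRONOUS)

ts = 1724338050166  # same millisecond timestamp for both points

# Same measurement, same tag set, same timestamp -> same series + time.
p1 = Point("mymeas").tag("tag1", "tagkey").field("fieldkey", 0).time(ts, WritePrecision.MS)
p2 = Point("mymeas").tag("tag1", "tagkey").field("fieldkey", 2).time(ts, WritePrecision.MS)

write_api.write(bucket="BUCK1", record=[p1, p2])
# Querying mymeas afterwards returns a single point with fieldkey=2;
# the second write replaced the first instead of adding a row.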

You could also write the values to a file from Telegraf to see if that's happening.
You might need to add a tag to differentiate points. For example, this wouldn't be an overwrite:

mymeas,tag1=tagkey fieldkey=0 timestamp1
mymeas,tag1=tagkey2 fieldkey=2 timestamp1

I hope that helps!

Can you share your telegraf config please?

Thanks for your response @Anaisdg,
Sorry, I forgot to reply. I found the mistake I was making: I was adding the timestamp to the message in code, so when the message frequency was high (say 500 or 1k messages per second) a lot of the millisecond timestamps were identical and the points got overwritten. For now I have removed json_time_key and json_time_format from the Telegraf config so that the timestamp is assigned automatically instead of being taken from the message. I'm still looking for a way to add unique timestamps at the message source when the message frequency is high (a sketch of one approach is below).
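
One common approach, sketched here under the assumption that the producer code can be changed and that Telegraf's json_time_format can be switched to nanosecond precision ("unix_ns"), is to use a nanosecond clock with a small tiebreaker so no two messages ever share a timestamp:

import time

_last_ns = 0

def unique_ts_ns():
    """Return a strictly increasing nanosecond timestamp for srcTimestamp.

    time.time_ns() alone can repeat when the clock resolution is coarse,
    so fall back to last value + 1 on a collision. Not thread-safe; guard
    with a lock if producing from multiple threads.
    """
    global _last_ns
    ts = time.time_ns()
    if ts <= _last_ns:
        ts = _last_ns + 1
    _last_ns = ts
    return ts

# Producer side: "srcTimestamp": unique_ts_ns()
# Telegraf side:  json_time_key = "srcTimestamp", json_time_format = "unix_ns"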

Also, the Telegraf config is pasted in the question.

Thanks!


@Mayur_Rotti Oh whoops, sorry, I don't know how I missed that. Thanks for sharing. So are you all good then?

@Anaisdg Yes. Thank you!