Duplicate Data Points in InfluxDB

Hello All,

My use case is to read from a JSON output file written by k6 (an open-source load testing tool) and write every new JSON line to InfluxDB via Telegraf. I have deployed two pods in a k8s cluster for this purpose: one is a k6 + Telegraf pod and the other is an InfluxDB pod. Below is my Telegraf conf file, followed by a sample of the k6 output it tails:

  [agent]
    interval      = "$POLL_INTERVAL"
    omit_hostname = true
    metric_batch_size = 1000
    metric_buffer_limit = 10000
    flush_interval = "5s"

  [[inputs.tail]]
    files = ["/outputs/result.json"]
    data_format = "json"
    from_beginning = false
    path_tag = ""
    json_name_key = "metric"
    json_string_fields = ["data_value", "data_tags_name", "data_tags_scenario", "data_tags_testrun", "data_tags_workflow", "type", "data_time"]

  [[processors.starlark]]
  source='''
  def apply(metric):
    if metric.fields["type"] == "Point":
        return metric
    return None
  '''

  [[outputs.influxdb]]
    urls = ["$INFLUX_HOST"]
    database = "$INFLUX_DATABASE"
    username = "$INFLUX_USERNAME"
    password = "$INFLUX_PASSWORD"
    timeout = "5s"

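For reference, each line that k6 appends to result.json with its JSON output looks roughly like the sample below (the values are illustrative, and the custom tags depend on the test script). Telegraf's JSON parser flattens the nested keys with underscores, which is where names like data_value and data_tags_name come from.

    {"metric":"http_req_duration","type":"Point","data":{"time":"2021-12-13T10:16:22.030651234Z","value":123.45,"tags":{"name":"https://example.com","scenario":"default","testrun":"run-1","workflow":"checkout"}}}
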
I observed that Telegraf is writing the data points twice in InfluxDB. I was following this issue raised earlier here, but I have not been able to find any viable solution to my problem.
Here is the InfluxDB output for one of the measurements; this is the case for every measurement:


I have tested similar cases and they worked perfectly (like writing to an example.out file and pushing it to InfluxDB using tail); it is only this case that fails.
Happy to get some help regarding this.

Hello @Nilesh786,
Hmm, the points don’t look like exact duplicates since they have different timestamps. But I’m assuming you want to be overwriting the point? I’m not sure why it’s happening, though. Are you running two Telegraf agents, one deployed in each pod, with the same config reading the same JSON? Apologies, I’m a little confused about your architecture.

Do you get this error outside of k6?

No, I am using only one Telegraf agent, and this happens with the k6 scenario only. I tried with a sample .out file and there were no issues, but for the k6 output Telegraf writes the data points from the tailed file twice.

My use case is simple: I am writing the k6 metrics to a JSON file and using a Telegraf agent to tail the file and push the data to InfluxDB. I run the load test with k6 and simultaneously push all the k6 metrics to InfluxDB; I want to avoid using the k6 InfluxDB output because of its limitations.

I’m guessing what’s happening here is that Telegraf is sending the data with no timestamps, so the time assigned to each point is the time at which it is actually written. That would explain the microsecond differences in the timestamps. In InfluxDB, each point is uniquely identified by its measurement, tag set, and timestamp; since the timestamps differ, InfluxDB treats each write as a unique point.

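To illustrate with made-up values, the two writes end up looking something like this in line protocol: identical except for the trailing timestamp, so InfluxDB keeps both rows instead of overwriting one with the other.

    http_req_duration data_tags_name="https://example.com",data_value="123.45",type="Point" 1639390582030000000
    http_req_duration data_tags_name="https://example.com",data_value="123.45",type="Point" 1639390582030417000
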
You need to tell Telegraf to use the JSON data_time field as the time value when writing the point to InfluxDB. I’d also recommend writing some of these as tags instead of everything as string fields. The one column in there that should be a field (data_value) is numeric, so it shouldn’t be included as a string field.

Try this inputs.tail config:

[[inputs.tail]]
    files = ["/outputs/result.json"]
    data_format = "json"
    from_beginning = false
    path_tag = ""
    json_name_key = "metric"
    tag_keys = ["data_tags*", "type"]
    json_time_key = "data_time"
    json_time_format = "2006-01-02T15:04:05Z07:00"    

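With this config the measurement keeps the k6 timestamp from data_time, so if the same line is read again it overwrites the existing point instead of creating a new one. One thing to double-check: since "type" becomes a tag rather than a field here, the Starlark filter would likely need to test metric.tags["type"] instead of metric.fields["type"]. The resulting point should look roughly like this (hypothetical values):

    http_req_duration,data_tags_name=https://example.com,data_tags_scenario=default,data_tags_testrun=run-1,data_tags_workflow=checkout,type=Point data_value=123.45 1639390582030651234
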
Thanks, this works perfectly.