Hey everyone, I’m trying to use the dedup processor with Telegraf (version 1.24.2). Here’s my current config:
[[processors.dedup]]
## Maximum time to suppress output
dedup_interval = "30s"
And my “agent” config:
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "120s"
flush_jitter = "0s"
precision = "10s"
hostname = ""
omit_hostname = true
And my input/output plugins (sorry for the extraneous information, I just want to be complete):
[[outputs.influxdb_v2]]
alias = "sensor-data"
urls = ["https://us-central1-1.gcp.cloud2.influxdata.com"]
token = "$INFLUX_TOKEN"
organization = "Thalo Labs"
bucket = "sensor-data"
content_encoding = "gzip"
data_format = "influx"
[outputs.influxdb_v2.tagpass]
state = ["ACTIVE"]
data = ["true"]
[[inputs.socket_listener]]
service_address = "udp://:$SENSOR_DATA_PORT"
data_format = "json"
json_time_key = "time"
json_time_format = "unix_ms"
json_name_key = "sensor"
tag_keys = [ "state" ]
[inputs.socket_listener.tags]
data="true"
node = "$BALENA_DEVICE_NAME_AT_INIT"
It seems that the processor is successfully filtering out duplicate metrics, but the dedup_interval part of the config is being ignored, or perhaps I misunderstand what it is meant to do. My impression (from the comments and code in telegraf/dedup.go at master · influxdata/telegraf · GitHub) is that dedup_interval sets the maximum period for which a duplicate metric is suppressed, after which the metric should be published again even if it has not changed. For example, if I have identical readings every 10s for 5 minutes (30 metrics total) and a dedup_interval of 30s, then I would expect 10 identical metrics to be published (5 minutes / 30s). The behavior I actually observe is that only 1 metric is published no matter how long the period, and no further metrics are published unless the value changes or I restart the Telegraf instance.
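To make the expected behavior concrete, here is a tiny standalone sketch (my own toy code, not Telegraf's actual implementation; the variable names are made up) of the suppress/re-emit rule I thought dedup_interval implements:

package main

import (
    "fmt"
    "time"
)

func main() {
    const (
        collectionInterval = 10 * time.Second // agent interval
        dedupInterval      = 30 * time.Second // processors.dedup dedup_interval
        runFor             = 5 * time.Minute
    )

    var lastEmitted time.Time // zero value: nothing emitted yet
    emitted := 0

    for elapsed := time.Duration(0); elapsed < runFor; elapsed += collectionInterval {
        now := time.Unix(0, 0).Add(elapsed)
        // Every reading is identical in this scenario, so after the first emit
        // every metric is a duplicate; my expectation is that it still gets
        // re-published once dedupInterval has elapsed since the last emit.
        if lastEmitted.IsZero() || now.Sub(lastEmitted) >= dedupInterval {
            emitted++
            lastEmitted = now
        }
    }

    fmt.Println("metrics emitted:", emitted) // prints 10, i.e. 5 minutes / 30s
}

That is what I expected from the real processor; instead I effectively get 1 metric total. Is my reading of dedup_interval wrong, or is something in my config interfering with it?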