Telegraf processor: remove nan

inviridi · April 21, 2021, 3:39pm

Hi everyone,

I am new to Telegraf and the influx community but am very impressed with the ease of use and the products. With this in mind, I would like your advice on a problem I am facing.

I set up Telegraf to collect data via MODBUS from an energy meter and it works fine. But sometimes the result is “nan”, and I would like to filter these values out with a Telegraf processor before they are written to the DB. I feel that regex could be one way (e.g. by replacing “nan” with “-1”), but simply not logging the value would also be fine. Do you have an idea how to achieve this?

Thank you very much in advance,
Frank

Anaisdg · April 23, 2021, 6:24pm

Hello @inviridi,
Welcome! I’m glad you’re enjoying telegraf. I’ll let the team know!
Have you seen the execd processor plugin? It makes Telegraf extensible in any language: Telegraf passes metrics to your program as line protocol on stdin and reads the processed metrics back from stdout.

Here’s an example:

inviridi · April 23, 2021, 9:30pm

Yes, please! Telegraf is quite a joy to use and most of all it works reliably. I was actually reluctant to give it a try because I just have two kinds of devices to monitor and have a fully working Python script for that. But the network connection is flaky and I wasted hours trying to fail gracefully and reconnect properly. Telegraf just does this so nicely out of the box – I am very grateful.

The execd processor adds complexity I would like to avoid. But if I cannot get it to work with regex, I’ll give it a try. Am I the first to encounter this problem?

Thanks for the welcome, by the way!

Anaisdg · April 26, 2021, 4:51pm

Hello @inviridi,
I don’t think you are, but I personally find regex more frustrating. That’s the only reason I suggested the execd plugin. You might also enjoy the starlark plugin; see “How to Use Starlark with Telegraf” on the InfluxData site.

inviridi · April 27, 2021, 2:48pm

Dear @Anaisdg,

That is a great idea and you are absolutely right about this:

Starlark has strong ideas and the number of options this adds to Telegraf is staggering. And yet I cannot get it to work.

The MWE runs along the lines of:

[[processors.starlark]]
# namepass = ['measurementname']
source = '''
def apply(metric):  
  for k, v in metric.fields.items():
    if v == "nan":
      metric.fields['k'] = -1   
  return metric
'''

which lets me start Telegraf but does not filter the NaNs. According to the Starlark documentation I should be able to write NaN (without the quotes) but this won’t even let me start Telegraf (undefined: NaN).

The actual error message when trying to collect metrics looks like this:

2021-04-27T14:31:45Z E! [agent] Error writing to outputs.cratedb: ERROR: Columns cannot be used in this context. Maybe you wanted to use a string literal which requires single quotes: 'nan' (SQLSTATE XX000)

Can you see what is going wrong here?

Anaisdg · April 27, 2021, 7:14pm

Hello @inviridi,
I’m not a starlark pro, so I’ll share your question with the telegraf team. Thanks in advance for your patience.

You could also easily do some data processing with Flux tasks. Your Flux task would look something like:

import "math"

option v = {
  bucket: "output bucket",
  measurement: "my transformed measurement",
  start: "",
  timeRangeStart: -30m,
  timeRangeStop: now()
}

option task = {
  name: "nan to -1",
  every: 30m,
  offset: 1m
}

from(bucket: "input bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "my original measurement")
  // NaN never compares equal to anything, so test with math.isNaN()
  |> map(fn: (r) => ({
      r with
      _value: if math.isNaN(f: r._value) then -1.0 else r._value
  }))
  |> to(bucket: "output bucket", org: "my org")
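One detail worth checking in a task like this: under IEEE 754, NaN does not compare equal to anything, including itself, so an equality test against NaN can never match. Flux provides math.isNaN() for that purpose. The following is a minimal sketch of the same idea in plain Python (not Flux); the helper name replace_nan and the fallback value are assumptions made for illustration.

```python
import math

nan = float("nan")

# Equality against NaN is always False, even NaN against itself.
equal_to_itself = (nan == nan)


def replace_nan(value, fallback=-1.0):
    """Return fallback when value is NaN, otherwise the value unchanged.

    Hypothetical helper: it mirrors the "NaN to -1.0" mapping of the task.
    """
    return fallback if math.isnan(value) else value
```

A dedicated NaN test is the only reliable way to catch these values; the fallback stays a float so the field type does not change.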

inviridi · April 29, 2021, 6:38am

All good… I look forward to the telegraf team’s advice. Thank you!

helenosheaa · April 29, 2021, 8:28pm

Hi @inviridi,

If you want to replace NaN with -1, a script along the lines of this should work:

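[[processors.starlark]]
namepass = ["measurementname"]
source = '''
def apply(metric):
  for k, v in metric.fields.items():
    # The field value is a number, not a string, so compare its string form
    if str(v) == "NaN":
      # Index with the loop variable k, not the literal 'k'
      metric.fields[k] = -1
  return metric
'''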
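Compared with the earlier MWE, two things change here: the test no longer compares a numeric value with the string "nan", and the write goes to the loop variable k instead of the literal key 'k'. Below is a minimal sketch of both differences in plain Python rather than Starlark. The field names and values are made up, and Python's math.isnan stands in for the str(v) == "NaN" test used in the Telegraf script.

```python
import math

# Hypothetical field set; names and values are invented for illustration.
fields = {"voltage": 230.1, "current": float("nan")}


def broken(fields):
    """Mirrors the MWE: string comparison and a literal 'k' key."""
    out = dict(fields)
    for k, v in fields.items():
        if v == "nan":      # a float never equals a string, so this never fires
            out["k"] = -1   # would create a field literally named "k"
    return out


def fixed(fields):
    """Tests the value itself and writes back under the loop variable k."""
    out = dict(fields)
    for k, v in fields.items():
        if isinstance(v, float) and math.isnan(v):
            out[k] = -1
    return out
```

With the broken version the NaN passes through untouched, which matches the behaviour described earlier in the thread: Telegraf starts, but nothing is filtered.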

Or if you’d like to drop the field on that metric instead:

[[processors.starlark]]
  namepass = ["measurementname"]
  source = '''
load("logging.star", "log")

def apply(metric):
    for k, v in metric.fields.items():
        if str(v) == "NaN":
            metric.fields.pop(k)
            log.warn("Dropped NaN value: metric {} field {}".format(metric.name, k))

    return metric
'''
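The drop variant removes entries from the field collection while looping over it. A rough equivalent in plain Python (not Starlark) needs one extra precaution, sketched below: iterate over a snapshot of the items. The function name drop_nan_fields is hypothetical, and math.isnan again stands in for the string comparison used in the Telegraf script.

```python
import math


def drop_nan_fields(fields):
    """Remove NaN-valued fields in place and return the dropped field names."""
    dropped = []
    # list(...) takes a snapshot, so the dict can be modified inside the loop.
    # Popping while iterating fields.items() directly would raise a
    # RuntimeError in CPython.
    for k, v in list(fields.items()):
        if isinstance(v, float) and math.isnan(v):
            fields.pop(k)
            dropped.append(k)
    return dropped
```

Returning the dropped names plays the role of the log.warn call above: the caller can decide whether and how to report them.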

Let us know if that doesn’t resolve the issue for you.

Thanks!

inviridi · April 30, 2021, 9:26am

Hi @helenosheaa,

sun is shining, it is Friday and both of your solutions work beautifully. Life is good. Thank you so much for your help!

Just a comment: when I searched for solutions earlier, Google did not have many answers to Telegraf-related questions. Perhaps this forum is not accessible to crawlers? I mention it because I feel that Telegraf deserves as much attention as it can get.

Thanks again & have a nice weekend,
Frank

helenosheaa · April 30, 2021, 2:51pm

@inviridi, no problem. I’m happy to hear that helped you!

That’s a good point. I’m not sure whether that’s the case for the crawlers, but I’ll look into it.

Thanks for the feedback and for the positive review of Telegraf!
