Telegraf processor: remove nan

Hi everyone,

I am new to Telegraf and the influx community but am very impressed with the ease of use and the products. With this in mind, I would like your advice on a problem I am facing.

I set up Telegraf to collect data via MODBUS from an energy meter and it works fine. But sometimes the result is “nan”, and I would like to filter these values with a Telegraf processor before they are written to the DB. I feel that regex could be one way (e.g. by replacing “nan” with “-1”), but simply not logging the value would also be fine. Do you have an idea how to achieve this?

Thank you very much in advance,
Frank

Hello @inviridi,
Welcome! I’m glad you’re enjoying telegraf. I’ll let the team know!
Have you seen the execd processor plugin? It lets you extend Telegraf in any language: Telegraf sends metrics as line protocol to your program's stdin, and your program writes the converted metrics back to stdout.

Here’s an example:

Yes, please! Telegraf is quite a joy to use and most of all it works reliably. I was actually reluctant to give it a try because I just have two kinds of devices to monitor and have a fully working Python script for that. But the network connection is flaky and I wasted hours trying to fail gracefully and reconnect properly. Telegraf just does this so nicely out of the box – I am very grateful.

The execd plugin adds complexity I would like to avoid. But if I cannot get it to work with regex, I’ll give it a try. Am I the first to encounter this problem?

Thanks for the welcome, by the way!

Hello @inviridi,
I don’t think you are, but I personally find regex more frustrating. That’s the only reason I suggested the execd plugin. You might also enjoy the starlark plugin; see “How to Use Starlark with Telegraf” (InfluxData).

Dear @Anaisdg,

That is a great idea and you are absolutely right about this:

Starlark has strong ideas and the number of options this adds to Telegraf is staggering. And yet I cannot get it to work. :frowning_face:

The MWE runs along the lines of:

[[processors.starlark]]
# namepass = ['measurementname']
source = '''
def apply(metric):  
  for k, v in metric.fields.items():
    if v == "nan":
      metric.fields['k'] = -1   
  return metric
'''

which lets me start Telegraf but does not filter the NaNs. According to the Starlark documentation I should be able to write NaN (without the quotes) but this won’t even let me start Telegraf (undefined: NaN).

The actual error message when trying to collect metrics looks like this:

2021-04-27T14:31:45Z E! [agent] Error writing to outputs.cratedb: ERROR: Columns cannot be used in this context. Maybe you wanted to use a string literal which requires single quotes: 'nan' (SQLSTATE XX000)

Can you see what is going wrong here?

Hello @inviridi,
I’m not a starlark pro, so I’ll share your question with the telegraf team. Thanks in advance for your patience.

You could also easily do some data processing with Flux tasks. Your Flux task would look something like:

import "math"

option v = {
  bucket: "output bucket",
  measurement: "my transformed measurement",
  start: "",
  timeRangeStart: -30m,
  timeRangeStop: now()
}

option task = {
  name: "nan to -1",
  every: 30m,
  offset: 1m
}

from(bucket: "input bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "my original measurement")
  // NaN never compares equal to itself, so test with math.isNaN() rather than ==
  |> map(fn: (r) => ({
      r with
      _value:
        if math.isNaN(f: r._value) then -1.0
        else r._value
    }))
  |> to(bucket: "output bucket", org: "my org")
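
A note on the NaN test in a task like this: under IEEE 754, NaN does not compare equal to anything, including itself, so an equality check against a NaN value can never match, and a dedicated is-NaN predicate is needed instead. The sketch below shows the same behaviour in plain Python (not Flux); the helper name `nan_to_default` and the sample readings are made up for illustration.

```python
import math

nan = float("nan")

# Equality against NaN is always False, even NaN against itself.
print(nan == nan)       # False
# A dedicated predicate is the reliable way to detect NaN.
print(math.isnan(nan))  # True


def nan_to_default(value, default=-1.0):
    """Return `default` if `value` is a float NaN, otherwise `value` unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return default
    return value


# Hypothetical meter readings, one of them NaN.
readings = [230.1, nan, 229.8]
print([nan_to_default(r) for r in readings])  # [230.1, -1.0, 229.8]
```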

All good… I look forward to the telegraf team’s advice. Thank you!

Hi @inviridi,

If you want to replace NaN with -1, a script along these lines should work:

[[processors.starlark]]
namepass = ["measurementname"]
source = '''
def apply(metric):  
  for k, v in metric.fields.items():
    if str(v) == "NaN":
      metric.fields[k] = -1   
  return metric
'''

Or if you’d like to drop the field on that metric instead:

[[processors.starlark]]
  namepass = ["measurementname"]
  source = '''
load("logging.star", "log")

def apply(metric):
    for k, v in metric.fields.items():
        if str(v) == "NaN":
            metric.fields.pop(k)
            log.warn("Dropped NaN value: metric {} field {}".format(metric.name, k))

    return metric
'''
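
For anyone comparing these scripts with the earlier attempt: that version could not match for two reasons. The field value is a float, so comparing it to the string "nan" is always false, and `metric.fields['k']` writes to a field literally named "k" instead of the field held in the variable `k`. To try the replace-or-drop logic outside Telegraf, here is a minimal sketch in plain Python (not Starlark). It models a metric's fields as an ordinary dict; the field names `voltage` and `current` are hypothetical, and `math.isnan` stands in for the `str(v) == "NaN"` check used in the Starlark scripts.

```python
import math


def is_nan(value):
    """True only for float NaN; strings, ints and bools pass through untouched."""
    return isinstance(value, float) and math.isnan(value)


def replace_nan(fields, default=-1.0):
    """Return a copy of `fields` with every NaN value replaced by `default`."""
    return {k: (default if is_nan(v) else v) for k, v in fields.items()}


def drop_nan(fields):
    """Return a copy of `fields` without the entries whose value is NaN."""
    return {k: v for k, v in fields.items() if not is_nan(v)}


# Hypothetical fields of one metric from the energy meter.
sample = {"voltage": 230.1, "current": float("nan")}

print(replace_nan(sample))  # {'voltage': 230.1, 'current': -1.0}
print(drop_nan(sample))     # {'voltage': 230.1}
```

Both helpers build a new dict rather than mutating the one being iterated, which sidesteps any question about modifying a mapping during iteration.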

Let us know if that doesn’t resolve the issue for you.

Thanks!

Hi @helenosheaa,

The sun is shining, it is Friday, and both of your solutions work beautifully. Life is good. :smiley: Thank you so much for your help!

Just a comment: when I searched for solutions earlier, Google did not have many answers to Telegraf-related questions. Perhaps this forum is not accessible to the crawlers? I mention it because I feel that Telegraf deserves as much attention as it can get.

Thanks again & have a nice weekend,
Frank

@inviridi, no problem. I’m happy to hear that helped you!

That’s a good point, and I’m not sure whether that’s the case for the crawlers. I’ll look into it.

Thanks for the feedback and for the positive review of Telegraf!