Starlark processor - Remove unknown prefix

Hello,
I’m new to InfluxDB, Telegraf, and the Starlark processor.
The basics work well, but I have a special case with metric.fields.

The JSON input strings come in two different forms, for example “1759687200000|1808425.2361111215 Wh” and “278.5 W”. The random prefix before the | should be removed.
The code starts with:

def apply(metric):
    for k, v in metric.fields.items():
        if type(v) == "string":
            longstr = metric.fields[k].find("|")
            if longstr == 14

In the documentation I can’t find how to remove just “1759687200000|”, because the syntax is not always the same.
Any ideas are welcome.
Many thanks
2Xi

Sometimes it’s easier than you think:

def apply(metric):
    for k, v in metric.fields.items():
        if type(v) == "string":
            # find() returns 13 when the "|" follows a 13-character prefix
            longstr = metric.fields[k].find("|")
            if longstr == 13:
                # keep only the part after the "|"
                metric.fields[k] = str(metric.fields[k].rsplit("|")[1])
    return metric
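
The check above only fires when the prefix is exactly 13 characters long. A minimal sketch of a variant that does not depend on the prefix length is shown below. It is written in plain Python so it can be tried outside Telegraf. `strip_prefix` and `apply_to_fields` are hypothetical names, the dict stands in for metric.fields, and it is an assumption that `partition` behaves the same way in the Starlark runtime used by Telegraf.

```python
def strip_prefix(value):
    # Split at the first "|"; sep is "" when the value contains no "|".
    head, sep, tail = value.partition("|")
    return tail if sep else value


def apply_to_fields(fields):
    # fields is a plain dict standing in for metric.fields (assumption).
    for k, v in list(fields.items()):
        if isinstance(v, str):  # in Starlark: type(v) == "string"
            fields[k] = strip_prefix(v)
    return fields
```

Values without a | such as “278.5 W” pass through unchanged, so the same code can handle both input forms.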

I would recommend the regex processor for this. I would resort to Starlark only for very niche cases, such as when you need to make special use of the persistent state between metrics.

Here is an example of what I think would work for you. The regex picks out all characters after a | up to the next space (tested on regexr.com). In your case the field value is an energy reading; you want to store it as an integer or float and not keep the “Wh” at the end, which would make it a string and useless for decent time-series analysis.

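[[processors.regex]]
  alias = "trim prefix"
  # Note: metric_rename acts on the metric name, not on field values.
  [[processors.regex.metric_rename]]
    pattern = '^(.*?)\|'

  [[processors.regex.fields]]
    key = "*"
    # Go regular expressions have no lookbehind, so the part after "|"
    # is captured and written back via the replacement instead.
    pattern = '^[^|]*\|(\S+).*$'
    replacement = "${1}"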
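
Stripping the prefix and unit with a regex still leaves the value as a string. As a rough illustration of the point about storing a number, the sketch below parses both input forms into a float. It is plain Python, `to_number` is a hypothetical helper name, and it assumes every value starts (after the optional prefix) with a number followed by whitespace and a unit.

```python
def to_number(value):
    # Drop everything up to the first "|", if there is one.
    _, sep, tail = value.partition("|")
    text = tail if sep else value
    # Keep the first whitespace-separated token and parse it as a float.
    return float(text.split()[0])
```

An empty string or a value whose first token is not numeric raises an error here, so real input would need a guard around the call.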

Hope this helps. Let me know if it worked for you.
