Hey everyone, I have a question.
Say I have some data along the lines of:
2021-11-16T14:45:04.959918682Z,start
2021-11-16T14:45:22.888960013Z,stop
2021-11-17T09:17:34.493371966Z,start
2021-11-17T09:18:41.713114886Z,stop
2021-11-17T11:06:31.444954418Z,stop
Where name is the name of the field and its value is either a start or a stop event. But I want to remove duplicate events, so in the end I would only like the start and stop events to be interleaved [start, stop, start, ...], and not multiple of the same event after each other. This can be seen in the example data as the two stop events at the end. I tried something along the lines of:
import "dict"

l = ["name": "None"]

dedup = (lastValue, newName) => {
    result = dict.get(dict: lastValue, key: "name", default: "None") != newName
    lastValue = dict.insert(dict: lastValue, key: "name", value: newName)
    return result
}
from(bucket: "sr-data")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "events" and r._field == "name")
|> filter(fn: (r) => dedup(lastValue: l, newName: r._value))
But this is mutable, imperative code, and we are not allowed to modify lastValue. Which I understand, but to do this we need some kind of look-ahead or backtracking, as in: have we seen a stop event immediately before the current one we are processing? Any idea how to do this with Flux?
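To make the intended behaviour concrete, here is the consecutive-dedup logic sketched in Python (just an illustration of what I want the Flux filter to do; dedup_consecutive is a made-up helper name, not anything from Flux):

```python
def dedup_consecutive(events):
    """Keep an event only when its name differs from the previously kept one."""
    result = []
    last = None
    for ts, name in events:
        if name != last:
            result.append((ts, name))
            last = name
    return result

events = [
    ("2021-11-16T14:45:04.959918682Z", "start"),
    ("2021-11-16T14:45:22.888960013Z", "stop"),
    ("2021-11-17T09:17:34.493371966Z", "start"),
    ("2021-11-17T09:18:41.713114886Z", "stop"),
    ("2021-11-17T11:06:31.444954418Z", "stop"),  # repeated stop: should be dropped
]
print(dedup_consecutive(events))  # last duplicate stop is filtered out
```

In Python this is trivial because the closure can mutate `last` between rows, which is exactly what Flux's pure functions disallow.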
The use-case is that I want to calculate the time between a start and a stop event, either with a duration or with the contrib events.duration function. But because the program can crash, for example, there can be a start without a corresponding stop.
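For clarity, the duration calculation I'm after, including dropping a crashed start that never got a matching stop, could be sketched in Python like this (parse_ts and durations are hypothetical helper names; I truncate the nanosecond timestamps to microseconds because Python's datetime only supports microsecond precision):

```python
from datetime import datetime, timezone

def parse_ts(ts):
    # Truncate the nanosecond fraction to microseconds so strptime accepts it.
    return datetime.strptime(ts[:26], "%Y-%m-%dT%H:%M:%S.%f").replace(tzinfo=timezone.utc)

def durations(events):
    """Pair each start with the next stop. A start that is followed by another
    start (e.g. after a crash) is overwritten; a stop without a pending start
    is ignored."""
    out = []
    pending = None  # timestamp of an unmatched start, if any
    for ts, name in events:
        if name == "start":
            pending = ts  # drop any earlier orphaned start
        elif name == "stop" and pending is not None:
            out.append(parse_ts(ts) - parse_ts(pending))
            pending = None  # the pair is consumed
    return out

events = [
    ("2021-11-16T14:45:04.959918682Z", "start"),
    ("2021-11-16T14:45:22.888960013Z", "stop"),
    ("2021-11-17T09:17:34.493371966Z", "start"),
    ("2021-11-17T09:18:41.713114886Z", "stop"),
    ("2021-11-17T11:06:31.444954418Z", "stop"),  # extra stop: ignored
]
for d in durations(events):
    print(d.total_seconds())
```

This is what I'd like Flux to compute for me, ideally without pulling the raw rows out of InfluxDB first.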
Thanks! (edited)