Conditional Filter to find table

Hello @scott,
I want to use a conditional filter in combination with tableFind() to find the table, within a stream of tables, whose first value is 0, and then extract the value of a tag from that table. How would I do that?
What I have so far:

check = data
  |> first()                          // first record of each input table
  |> tableFind(fn: (key) => true)     // extract the first table in the stream
  |> getColumn(column: "_value")      // return its _value column as an array

if check[0] == 0 then
  // Do something with the extracted tag
else
  // Do something different with the extracted tag

It extracts the column whose first value I want to check, but only from the first table in the table stream…
How can I check all tables (they vary only by one tag) and extract the value of the tag that separates them?

You’re on the right track. I’d suggest, after your first() call, filtering for rows with a _value of 0. Then in getColumn(), get the tag column to retrieve the value of the tag.

check = data
  |> first()                            // first record of each table
  |> filter(fn: (r) => r._value == 0)   // keep only tables whose first value is 0
  |> tableFind(fn: (key) => true)       // extract the first matching table
  |> getColumn(column: "yourTag")       // return the tag column as an array

Flux currently doesn’t guarantee the order of tables in a stream of tables, so it isn’t possible to check whether a given table is the first one in the stream.

Conditional expressions (if, then, else if, else) can only be used inside variable declarations or record values. You can’t use them to conditionally fork a function pipeline.
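For example, a conditional can compute a variable or a record field, but it can’t gate a chain of transformations. A quick sketch (the state labels and the threshold here are made up for illustration):

state = if check[0] == 0 then "off" else "on"

labeled = data
  |> map(fn: (r) => ({ r with level: if r._value > 100.0 then "warn" else "ok" }))

What exactly are you trying to do once you have the tag value?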

@scott I would like to use conditional execution to make sure I don’t miss any values. I have “on” and “off” values and I am trying to calculate the duration between them. When I use a flexible time window, the first value returned can be an “off” value, and the way I calculate the durations then produces negative values: the stray “off” value gets paired with a “start” timestamp that actually occurs later, so the subtraction comes out negative. For example, if the window opens with a stop at 10:00 and the first start follows at 10:05, they are paired by index and the duration is 10:00 minus 10:05, i.e. negative. How do I make sure I get only positive durations?

What I was trying to do was to check for which tag set the first value is zero, and then query the previous value for that tag to complete the series…
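Roughly what I had in mind (a sketch only; yourTag stands in for the actual tag name and the 30d lookback is arbitrary):

previous = from(bucket: "plcview_4/autogen")
  |> range(start: -30d, stop: 2021-01-29T00:00:00.000Z)  // look back before the main window
  |> filter(fn: (r) => r._measurement == "FAULT_LASER" and r.yourTag == check[0])
  |> last()  // the point just before the window starts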

The code I use to calculate the durations:

data = from(bucket: "plcview_4/autogen")
  |> range(start: 2021-01-29T00:00:00.000Z, stop: now())
  |> filter(fn: (r) => r._measurement == "FAULT_LASER")

// Label each point as a "start" (fault raised) or "stop" (fault cleared) event
events = data
  |> map(fn: (r) => ({ r with event: if r._value == 1 then "start" else "stop", index: 1 }))
  |> group(columns: ["event"])

// Number the events so the nth start can be joined to the nth stop
startEvents = events |> filter(fn: (r) => r.event == "start") |> cumulativeSum(columns: ["index"])
stopEvents = events |> filter(fn: (r) => r.event == "stop") |> cumulativeSum(columns: ["index"])

// Pair each start with its stop, keeping one row per fault
faultEvents = join(tables: {start: startEvents, stop: stopEvents},
    on: ["index", "equipmentNumber", "_measurement", "_field", "workplace"])
  |> drop(columns: ["event_start", "event_stop", "index", "_start_start", "_start_stop", "_stop_start", "_stop_stop"])
  |> rename(columns: {_time_start: "start", _time_stop: "stop"})

// Duration in milliseconds (timestamps are nanosecond integers)
faultDurations = faultEvents
  |> map(fn: (r) => ({ r with duration: (int(v: r.stop) - int(v: r.start)) / 1000000 }))

Would filtering out negative durations accomplish what you need? Or do you need to somehow still include that first row in the calculation?

// ...
faultDurations = faultEvents
  |> map(fn: (r) => ({ r with duration: (int(v: r.stop) - int(v: r.start)) / 1000000 }))
  |> filter(fn: (r) => r.duration >= 0)   // discard the negative durations

Yes, I do need that first row, and filtering out negative durations would not work: if the query returns a “stop” value as the first row, every start/stop pairing is shifted by one and every duration comes out negative. I am trying to find a way to still calculate the durations while being able to use a varying time window…

I am currently using an “ugly” solution with the elapsed() function. It does calculate the durations correctly, but it attaches them to the wrong timestamps: because it works on subsequent rows, I get the “fault_duration” with the timestamp of the “Fault-Stop” value. For calculating the sum of the durations or similar aggregates this isn’t a problem, but it isn’t the most elegant way.
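The workaround looks roughly like this (a sketch; the unit and the column name are my own placeholders):

faultDurations = data
  |> elapsed(unit: 1ms, columnName: "fault_duration")  // time since the previous row
  |> filter(fn: (r) => r._value == 0)  // keep the "stop" rows, which now carry the duration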

By the way: is it possible to test query performance for Flux queries in v1.8.4? When trying to import the profiler package with import "profiler", I receive: error: unknown import path: "profiler"

Hmm, this is a tricky problem. I think this use case merits a proposal for a new function, trim() or something similar: a function that would let you trim rows off the beginning or end of input tables. You may consider submitting a feature request on the Flux repo.
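In the meantime, if you know a table begins with a stray “stop” row, one blunt workaround is limit() with an offset, which skips leading rows (a sketch against your events variable; the large n just means “no upper bound”):

stopEvents = events
  |> filter(fn: (r) => r.event == "stop")
  |> limit(n: 2147483647, offset: 1)  // drop the stray leading "stop" row
  |> cumulativeSum(columns: ["index"])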

The version of Flux packaged with InfluxDB 1.8.4 doesn’t support the profiler package. 1.8.5 was recently released and I believe it includes a newer version of Flux, but don’t quote me on that.
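Once you’re on a Flux version that does include it, the profiler is enabled with an option, roughly like this:

import "profiler"

// Append profiling results to the query output as extra tables
option profiler.enabledProfilers = ["query", "operator"]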