Compare values in 2 columns

Hey,

i have 2 time series which return Workorder IDs.
The same IDs do not have the same timestamp within the two time series.
Also only a few IDs from the first time series can be found in the second.
Is there a way to compare the values in the two columns and post only those which cannot be found in both time series?

I tried using a conditional mapping function like this:

|>map(fn:(r) => ({

ID: if r.ID == r.ID2
then r.ID
else “false”

}))

Thank you in advance for your help

Hi @Patse,
Can you explain a little more about why the above does not work for your use case? Here is an example I run comparing my two CPU cores:

from(bucket: "Jetson")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "exec_jetson_stats")
  |> filter(fn: (r) => r["_field"] == "jetson_CPU1" or r["_field"] == "jetson_CPU2")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
  |> map(fn: (r) => ({ r with conditional: if r.jetson_CPU1 == r.jetson_CPU2 then true else false}))
  |> yield(name: "mean")

Many thanks,
Jay

The difference between our usecases is, that i have the data stored in 2 measurements.
I have to create 2 seperate tables and use the union() function before I can use the map() function. Can this be the problem?

@Jay_Clifford : Your example of comparing two CPU cores is very close to my query, where I wish to compare setpoint vs actual. However, I get the following error, despite that my query syntax generally mirrors yours. Any idea why mine won’t work?

I would potentially consider using a join() instead. Check out this example here:
https://docs.influxdata.com/flux/v0.x/stdlib/universe/union/#:~:text=union()%20versus%20join()
This would potentially remove the need for a pivot. I see no issues doing either dependant on your schema of each.

Can you show me the output of your table after the pivot (remove the map function for now)? I want to see what the raw tables look like and the data types they hold. Many thanks :slight_smile:

As my first assumption based upon your query is that you are trying to compare two values that actually where tags:

r["MeasType"] == "actual" #filtering for tag
r["_field"] == "actual" #filtering for field

This means the less than expression is invalid as you are trying to compare two string values (since they are tags). You do have a few options:

  1. Prepare your data so actual and setpoint come in as fields. This would be the easiest option though I am not sure what your data ingest process looks like.

  2. wrap your r.actual and r.setpoint in a int() transformer. This should convert your string values to numerical values for processing

Thank you Jay. Very helpful info!

Just to answer your original request, here is the output of the table without the map function (but with the pivot):

I am still in the planning / testing phases of using Flux and can also change the data ingestion process according to what makes sense. And you are correct, I am trying to compare two tags (setpoint and actual). But having watched & read all of the info / webinars on data schema, I believe that when I ingest two values at the same time (setpoint: 1580, actual: 1577), and I assign each a MeasType tag as setpoint or actual, then my data is more compact and I can then perform operations comparing these two values.

So I can indeed change the data ingest and have the actual & setpoint (which are being ‘grabbed’ at the same instant) come in as two fields, but I presumed that was not the right approach.

PS: Here is a list of my tags and fields. If I make then proposed change of ingesting actual & setpoint, then the list of fields would double to temperature_setpoint, temperature_actual, feedrate_setpoint, feedrate_actual, rotational_speed_setpoint, rotational_speed_actual, etc. This is what seemed unnecessarily complex to me, so I created the MeasType field.

@grant1 sorry this makes alot more sense and you are correct to use actual and setpoint as tags like you intended. My confusion came from the fact that you were trying to compare the values of two tags i.e

setpoint = "1000" < actual = "500"

What you have done makes sense but sadly your query won’t based on the pivot you have created. Lets look a smaller example:

import "array"

data = array.from(rows: [
{_time: 2021-09-23T15:58:02.017Z, _value: 19.0, _field : "temperature", MeasType: "actual" },
{_time: 2021-09-23T15:58:02.017Z, _value: 16.0, _field : "temperature", MeasType: "setpoint" },
{_time: 2021-09-23T15:38:02.017Z, _value: 19.0, _field : "temperature", MeasType: "actual"  },
{_time: 2021-09-23T15:38:02.017Z, _value: 20.0, _field : "temperature", MeasType: "setpoint"},
])

data
|> pivot(rowKey:["_time"], columnKey: ["MeasType"], valueColumn: "_value")
|> map(fn: (r) => ({ r with conditional: if r.actual < r.setpoint then true else false}))

In this case, I have a temperature field with the tags either: actual or setpoint. We pivot on the tag column MeasType rather than the field key. This will provide us with the two columns for comparison like so:

Note: you must include 1 field to compare which belong to the value column. The above will not work if you try comparing more than field (temperature, pressure) etc.

@Jay_Clifford I pivoted on the tag column MeasType and I can indeed now perform conditional math on the setpoint vs. the actual. Thank you for explaining.

One last question: How can one get the absolute value of the difference between setpoint and actual? The map function below works, but when I try to enter math.abs I get syntax errors.

import "math"
from(bucket: "bucket2")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "measurement2")
  |> filter(fn: (r) => r["EquipNumber"] == "6")
  |> filter(fn: (r) => r["EquipElement"] == "zone1")
  |> filter(fn: (r) => r["MeasType"] == "actual" or r["MeasType"] == "setpoint")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> pivot(rowKey:["_time"], columnKey: ["MeasType"], valueColumn: "_value")
  |> map(fn: (r) => ({ r with alarm: if (r.setpoint - r.actual) > 5 then true else false}))
  |> yield(name: "mean")

I figured it out. Just needed to sleep on this…

  |> map(fn: (r) => ({ r with alarm: if (math.abs(x: (r.setpoint - r.actual))) > 5 then true else false}))

This works for me in my original example:

|> map(fn: (r) => ({ r with conditional: if math.abs(x: r.actual - r.setpoint) > 5 then true else false}))

I believe the only changes you need to make is include x: variable is included within the abs function

1 Like

hahaha @grant1 literally posted at the same time. Glad you got it sorted

The solution to both issues included using the map function with conditional arguments. Example:

import "math"
from(bucket: "bucket2")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "measurement2")
  |> filter(fn: (r) => r["EquipNumber"] == "6")
  |> filter(fn: (r) => r["EquipElement"] == "zone1")
  |> filter(fn: (r) => r["MeasType"] == "actual" or r["MeasType"] == "setpoint")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> pivot(rowKey:["_time"], columnKey: ["MeasType"], valueColumn: "_value")
 |> map(fn: (r) => ({ r with alarm: if (math.abs(x: (r.setpoint - r.actual))) > 5 then true else false}))
  |> yield(name: "mean")

@Jay_Clifford Returning to the pivot and conditional arguments about a month ago, how would one adjust the above query so that only the absolute difference of setpoint - actual is returned?

I need to use this in a Grafana alert, and if I can get the query to only return the absolute difference, then I should be good to go.

I tried this:

import "math"
from(bucket: "bucket2")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "measurement2")
  |> filter(fn: (r) => r["EquipType"] == "generator")
  |> filter(fn: (r) => r["MeasType"] == "actual" or r["MeasType"] == "setpoint")
  |> filter(fn: (r) => r["_field"] == "temperature")
  |> filter(fn: (r) => r["EquipNumber"] == "302")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> pivot(rowKey:["_time"], columnKey: ["MeasType"], valueColumn: "_value")
  |> map(fn: (r) => ({ r with difference: math.abs(x: r.setpoint - r.actual)}))
  |> yield(name: "mean")

but I get the values for actual, difference, and setpoint: (remember that I want only the difference)
image

In Grafana (where I am trying to write the same query for an alert), it is displaying actual and setpoint (the blue and green lines) and then the difference (the yellow line). I want only the yellow line (values) to use in the query.

@grant1,
You could use a drop() function to drop actual and setpoint.

import "math"
from(bucket: "bucket2")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "measurement2")
  |> filter(fn: (r) => r["EquipType"] == "generator")
  |> filter(fn: (r) => r["MeasType"] == "actual" or r["MeasType"] == "setpoint")
  |> filter(fn: (r) => r["_field"] == "temperature")
  |> filter(fn: (r) => r["EquipNumber"] == "302")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> pivot(rowKey:["_time"], columnKey: ["MeasType"], valueColumn: "_value")
  |> map(fn: (r) => ({ r with difference: math.abs(x: r.setpoint - r.actual)}))
  |> drop(columns: ["actual", "setpoint"])
  |> yield(name: "difference")

Thank you Jay. That works perfectly. Once I see the solution, I wonder how I missed such an easy thing.

No worries at all @grant1. I am personally still learning all the tricks Flux has to offer aswell :wink: . Just drop me a message if you need anymore help

@Jay_Clifford and anyone who can help…

The query we developed back in Dec 2021 still works fine…

import "math"
from(bucket: "AMPdata")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "GeneratorZoneData")
  |> filter(fn: (r) => r["EquipNumber"] == "303")
  |> filter(fn: (r) => r["_field"] == "DewPoint")
  |> filter(fn: (r) => r["MeasType"] == "actual" or r["MeasType"] == "setpoint")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> pivot(rowKey:["_time"], columnKey: ["MeasType"], valueColumn: "_value")
  |> map(fn: (r) => ({ r with difference: float(v:math.abs(x: r.setpoint - r.actual))}))
  |> drop(columns: ["actual", "setpoint"])
  |> yield(name: "difference")

and returns data:

However, in Grafana, I get this error:

where it does not seem to like whenever there is a large difference value (it can be ~250 or so every 12 hours). On a small time window, Grafana does NOT throw an error whenever the difference is small (and apparently when there are no floats calculated).

@Jay_Clifford and all: No further action required. This query is solid and works great in InfluxDB. Sometimes these same queries go haywire in Grafana, and the solutions are elusive. I think the error it was displaying was somehow related to spanning a larger time window, and then some value in that window returned an error. Again, it works fine in InfluxDB.

Here is a slight variation of the same query with a now-12h to now time window and no errors:

1 Like