How to filter out unchanged values (with difference() == 0) in Flux?

virtualdj · January 28, 2023, 1:10pm

Hi, I’ve made my first query with Flux to extract data that comes from Home Assistant, like so:

from(bucket: "homeassistant")
  |> range(start: 2022-08-10T00:00:00Z)
  |> filter(fn: (r) => r["entity_id"] == "previous_car_charge_distance")
  |> filter(fn: (r) => r["_field"] == "value")
  |> drop(columns: ["_measurement", "_field", "_start",
   "_stop", "domain", "entity_id", "friendly_name"])
  |> filter(fn: (r) => r["_value"] > 0)
  |> group()
  |> sort(columns: ["_time"], desc: false)

This query gives these results:

table _value _time
0	  42	 2022-08-11T02:10:08.111Z
0	  27	 2022-08-12T23:13:04.456Z
0	  27	 2022-08-13T14:15:15.366Z <-- remove this
0	  15	 2022-08-14T00:00:55.586Z
0	  119	 2022-08-14T10:01:26.947Z
0	  122 	 2022-08-15T03:52:06.659Z
0	  122	 2022-08-20T08:21:56.017Z <-- remove this

My _value tag is something that is not really time-related; basically it is a variable that holds how many kilometers a car run with a single charge. So I cannot use a time-window to aggregate those values because it might be an hour, a day or a week…

My goal is keeping only the values that are changed from the last one and so I want to remove those two records marked above. How can I do that in Flux?

I’ve tried with difference() which gaves 0 for the unchanged values, but I cannot understand how to use that function in Flux to filter out the records. I suppose I must add the difference as a new column and then filter that column out?

grant1 · January 28, 2023, 1:39pm

Welcome @virtualdj

Nice job on the query and clear explanation. What result does this give you?

from(bucket: "homeassistant")
  |> range(start: 2022-08-10T00:00:00Z)
  |> filter(fn: (r) => r["entity_id"] == "previous_car_charge_distance")
  |> filter(fn: (r) => r["_field"] == "value")
  |> drop(columns: ["_measurement", "_field", "_start",
   "_stop", "domain", "entity_id", "friendly_name"])
  |> difference()
  |> filter(fn: (r) => r._value >= 0)
  |> yield(name: "ChangeInKilometers")

virtualdj · January 28, 2023, 4:24pm

Thanks. Your query gives:

table _value _time
0	  0	     2022-08-13T14:15:15.366Z
0	  104	 2022-08-14T10:01:26.947Z
0	  3	     2022-08-15T03:52:06.659Z
0	  0	     2022-08-20T08:21:56.017Z
0	  48	 2022-09-08T03:29:43.150Z
0	  9	     2022-09-11T05:25:16.556Z
0	  0	     2022-09-19T16:10:27.721Z

So it displays the difference (but it excluded the first line 2022-08-11); now how can I correlate this to the original results?

EDIT: It doesn’t seem also to pick up the correct _time.

grant1 · January 28, 2023, 6:35pm

Let’s try this (I did not check the syntax of the map() function, so an error may pop up). You should see a new column called ChangeInKilometers as well as the original column.

from(bucket: "homeassistant")
  |> range(start: 2022-08-10T00:00:00Z)
  |> filter(fn: (r) => r["entity_id"] == "previous_car_charge_distance")
  |> filter(fn: (r) => r["_field"] == "value")
  |> drop(columns: ["_measurement", "_field", "_start",
   "_stop", "domain", "entity_id", "friendly_name"])
  |> map(fn: (r) => ({r with ChangeInKilometers: (difference(v: r._value))}))
  |> filter(fn: (r) => r["ChangeInKilometers"] >== 0) 
  |> yield(name: "ChangeInKilometers")

virtualdj · January 28, 2023, 7:17pm

It complains of an error on the line:

|> map(fn: (r) => ({r with ChangeInKilometers: (difference(v: r._value))}))

The error is:

error @7:65-7:73: found unexpected argument v
error @7:51-7:74: missing required argument tables

grant1 · January 28, 2023, 11:22pm

Yep, sorry about that. Still learning all this Flux stuff myself. Try this:

from(bucket: "homeassistant")
  |> range(start: 2022-08-10T00:00:00Z)
  |> filter(fn: (r) => r["entity_id"] == "previous_car_charge_distance")
  |> filter(fn: (r) => r["_field"] == "value")
  |> drop(columns: ["_measurement", "_field", "_start",
   "_stop", "domain", "entity_id", "friendly_name"])
  |> duplicate(column: "_value", as: "OriginalValue")
  |> difference()
  |> rename(columns: {_value: "Difference"})
  |> filter(fn: (r) => r["Difference"] > 0) 
  |> yield(name: "ChangeInKilometers")

virtualdj · January 29, 2023, 11:08am

I’m away from my home (and Influxdb instance) now, so you should settle for a picture as Influxdb is not very mobile-friendly.
However it doesn’t seem the query gives the correct results… It skips so much data?

grant1 · January 29, 2023, 1:32pm

The above is what is causing the some results to get filtered out. Delete it and then run again and see how the data looks, then we’ll go from there…

grant1 · January 29, 2023, 2:01pm

OK, I think this will work…

from(bucket: "homeassistant")
  |> range(start: 2022-08-10T00:00:00Z)
  |> filter(fn: (r) => r["entity_id"] == "previous_car_charge_distance")
  |> filter(fn: (r) => r["_field"] == "value")
  |> drop(columns: ["_measurement", "_field", "_start",
   "_stop", "domain", "entity_id", "friendly_name"])
  |> duplicate(column: "_value", as: "diff")
  |> difference(
    columns: ["diff"],
    keepFirst: true
    )
  |> filter(fn: (r) => not exists r.diff or r.diff != 0)
  |> yield(name: "ChangeInKilometers")

Borrowed mostly from here.

virtualdj · January 29, 2023, 7:16pm

Yes! You’ve done it!

Results

So basically I had to duplicate the column and apply difference only to that column and then filter it out. Clear! Thank you very much for your help!

virtualdj · January 29, 2023, 7:17pm

So it is not necessary to sort explicitly?

grant1 · January 29, 2023, 8:52pm

Glad it’s working.

I do not think the sort() by time function is required, since your data already appears to be sorted by time.

Hukondora · February 1, 2023, 12:20pm

Hi @grant1,

I think I have a similar issue and hope that you might be able to help with that too, so I will post it here:

Thank you!

Topic		Replies	Views
Flux and math operations Fluxlang flux	1	574	January 2, 2022
Filter out rows based on the row directly before and after that InfluxDB 2 influxdb , query , flux	4	746	February 2, 2023
How to filter data to show only changes in InfluxDB? InfluxDB 2 influxdb	0	55	November 21, 2024
Difference between two cumulative values InfluxDB 2 influxdb , query , flux	5	1210	May 30, 2023
Need help. Influxdb1 vs. influxdb2 with flux InfluxDB 2	3	263	October 5, 2023

How to filter out unchanged values (with difference() == 0) in Flux?

Related topics