Find difference over groups of time for a specific tag

Please, consider this set of data separated in 3 groups to reflect the 3 different groups of time

_measurement  _time                  _field  _value  tag
test          2021-01-01T00:00:00Z   fld	 1	     a
test          2021-01-01T00:00:00Z   fld	 1 	     b
test          2021-01-01T00:00:00Z   fld	 1 	     c
test          2021-01-01T00:00:00Z   fld 	 1 	     d	    
test          2021-01-01T00:00:00Z   fld	 1 	     e
test          2021-01-01T00:00:00Z   fld	 1	     f

Next time, tag elements c and e are removed ; tag elements g, h and i are added:

_measurement  _time                  _field  _value  tag
test          2021-01-02T00:00:00Z   fld	 1	     a
test          2021-01-02T00:00:00Z   fld 	 1	     b
test          2021-01-02T00:00:00Z   fld 	 1	     d
test          2021-01-02T00:00:00Z   fld 	 1 	     f	    
test          2021-01-02T00:00:00Z   fld 	 1	     g
test          2021-01-02T00:00:00Z   fld	 1	     h
test          2021-01-02T00:00:00Z   fld	 1	     i

Next time, tag elements d, g and h are removed ; tag element j is added.

_measurement  _time                  _field  _value  tag
test          2021-01-03T00:00:00Z   fld	 1	     a
test          2021-01-03T00:00:00Z   fld 	 1	     b
test          2021-01-03T00:00:00Z   fld 	 1 	     f	    
test          2021-01-03T00:00:00Z   fld	 1	     i
test          2021-01-03T00:00:00Z   fld	 1	     j

Now, I would like to find the right query to display the difference in tags between consecutive sets of times like this:

Time                  Tag_elmt_total  Tag_elmt_removed  Tag_elmt_added
2021-01-01T00:00:00Z  6               -                 -
2021-01-02T00:00:00Z  7               2                 3
2021-01-03T00:00:00Z  5               3                 1

I figured out how to count the total number of “tag” elements for each time, but I am not sure how to calculate the number of removed and added tag elements between each groups of time to produce a table like the one above…

from(bucket: "myBucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "test" and r._field == "fld")
  |> group(columns:["_time"])
  |> sum()
  |> group()

Thank you for any insight…

Hello @bolemo,

  1. You’d have to use timeShift to reposition the rows.
    timeShift() function | Flux 0.x Documentation
  2. Then you join on time so you can compare
    Joins in Flux | Additional resources | InfluxData Documentation
join.time(left: data_from_02, right: shifted_data_from_02_that_now_has_01, as: (l, r) => ({l with _value_02: r._value}))
  1. Then you’d use conditional filtering to compare values from 01 and 02 to see if they’re the same or new
    Query using conditional logic in Flux | InfluxDB Cloud Documentation
  2. Then you’d sum

I hope that helps!

Sounds promising.
The timeShift() might not be exactly the function I need, as the interval between two sets of measurements might not be a constant duration (I need to compare between a specific timestamp from the previous one (whatever the duration between them might be).
It seems to be however a tool I could definitely use with an interval slightly smaller than the smallest one I would expect, and discard all consecutive duplicates tables or something like that.

This is already helping considerably. I will experiment with this and come back here with my results.

Thank you for your help :slightly_smiling_face:

@bolemo Thank you!! Yah let me know how it goes :slight_smile: