Query Issue: Cumulative Values to Simple Values in InfluxDB

Hello everyone, I’m new here and I’m facing an issue with my query. Here’s an example of the result I have:

_value sessionID
10        1
12        1
8         2
8         2
7         3

My query is supposed to calculate the total consumption of a session. However, since the incoming data is in cumulative form, I need to use the difference() function to transform it and then store it in a new measurement (then later using this new measurement to calculate the total consumption and other stuff). I have several problems:

  1. When there is only one value, I’m unable to make difference() work.
  2. For session 1, the difference is 2. So when I sum it up, I get 2, but the total consumption should be 12.
  3. For session 2, the difference is 0. However, I would like to obtain 8 as the result.

One idea I had was to add a value of 0 before each session, but it seems a bit complicated and unusual.

I retrieve the data from an API, and therefore, I would like to process the data directly in Influx as much as possible.

If my problem is not clear, please feel free to ask for further clarification.

Any help or suggestions would be greatly appreciated. Thank you in advance!

I think for you |> increase() will work better than using difference

I think increase() returns the cumulative sum which is not what I am looking for. I need to first convert the data from a measurement (which is in a cumulative form) into another one (not in a cumulative form). The difference() function works most of the time, but in some cases, it does not.

No, Increase does what you wanted to do (Is the cumulative sum of the non-negative differences of consecutive values), however for two consecutive data points with the same value the increase function will return 0 now that I am trying to remember how it works :thinking: may be not exactly what you were looking for

Yes I agree with you on what increase() does but in the first place I need to calculate the difference but with certain conditions. I’m currently stuck there. If I manage to calculate the difference as I am expecting, I will be able to just sum() and then get the result I wanted. Hope you see my point :slightly_smiling_face:

:roll_eyes:… asuming your numbers start with 0 just add a 0 row then use Increase()

import "array"

rows1 = [

{ _value:10, sessionID: 1, _time: 2023-05-23T07:59:40.901Z},
{ _value:12, sessionID: 1, _time: 2023-05-23T07:59:41.901Z},
{ _value:8, sessionID: 2, _time: 2023-05-23T07:59:40.901Z},
{ _value:8, sessionID: 2, _time: 2023-05-23T07:59:41.901Z},
{ _value:7, sessionID: 3, _time: 2023-05-23T07:59:40.901Z},



]

rows2 = [


{ _value:0, sessionID: 1, _time: 2023-05-23T07:59:39.901Z},
{ _value:0, sessionID: 2,_time: 2023-05-23T07:59:39.901Z},
{ _value:0, sessionID: 3,_time: 2023-05-23T07:59:39.901Z},



]


A = 
union(tables: [array.from(rows: rows1), array.from(rows: rows2)])
  |> sort(columns: ["_value"], desc: false)
  |> group(columns: ["sessionID"], mode:"by")
|> increase()
|> last()
|> group()
|> yield(name: "custom-name")


If you want the total consumption this works, you can also use spread () if you want the difference between minimum and maximum values, in this code you can also add a 0 column.

Do you know a way I could automatically put a 0 at the start of each session ?

Like I did. create a row from an array then make a union, since it is timeseries what we are working with it gets tricky you could use an static timestamp? Or use your start time minus something like 10s to make sure is the very first entry. Dont forget to sort by time if you do unions, joins, or operate on different tables

Yes but my timeseries is not static… :joy:

Again :roll_eyes:

I don’t know your query or your data schema, so I cannot give you the specific solution but you could do something like this:

import "experimental"
import "array"

startTime = experimental.subDuration(d: 10s, from: v.timeRangeStart)
rows1= [

{ _value:0, sessionID: 1, _time: startTime},
{ _value:0, sessionID: 2, _time: startTime},
{ _value:0, sessionID: 3, _time: startTime},

]

A  = array.from(rows: rows1)


B = from(bucket: "BucketName")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_field"] == "fieldName")
  |> filter(fn: (r) => r["tag"] == "sessionID")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)



C = union(tables: [A, B])

  |> yield(name: "mean")

other ways could be branch from your query result , then use first() or any reduce function, then use map with r with _time: startTime, and _value: 0.

Manually creating a table from an “static array” allows you to have something when there is no result from the query, which may or may not be useful for your use case, but again it is hard to give you a specific solution… the limit is your creativity (and whatever the documentation says)

Ok, I understand now. Thank you for your assistance. I found a rather unconventional way to add “0” before each session, but I appreciate your help nonetheless.

there is no standard function to add zero.

well, you can also create a custom “reduce” function and use an identity parameter (initial value) to zero.

now that I am thinking about it this will be the best solution for you, you basically iterate across rows and take the previous result. you could do the difference subtracting, and If condition to preserve previous value if difference is negative.

I’ll do an example whenever I have the chance.