Difference between two cumulative values

So, I have a dataset which has time stamp column and the cumulative value column, what I want to do is calculate the difference of the current cumulative value from the previous one so I can get the actual value and the difference of all those values need to be summed up and get a single value, example C2=B3-B2, C3=B4-B3 and so on and at last C2+C3+C4 and so on. I have attached my dataset sample below.

My query is as follows:

from(bucket: “Bacnet_Network”)
|> range(start: 2022-01-01T00:00:00Z, stop: 2022-12-31T23:59:59Z)
|> filter(fn: (r) => r[“_measurement”] == “048_STEM”)
|> filter(fn: (r) => r[“sensor”] == “472200_AV_210_STEM_HRC2_GJ_EXPORT_AV”)
|> difference(nonNegative: false, columns: [“_value”])
|> aggregateWindow(every: 1mo, fn: sum, createEmpty: false)
|> group(columns: [“sensor”])
|> sum(column: “_value”)
|> drop(columns: [“host”, “system”, “unit”, “topic”, “objects_type”, “name”, “_measurement”])

The answer that I am getting is not correct and does not match with the raw data. Any suggestions?

what you described is easy to do but you have nothing in your query which makes me question:

is the screenshot the data that you are pulling in InfluxDB, or is an example on what you are trying to accomplish:
If the picture is the data source values then:
|>aggregateWindow(every: 1mo, fn: sum, createEmpty: false)

this line Downsamples in 1 month intervals and gets the sum of all the data points contained in each “group” and after that you sum again, so if you have 1 datapoint each month just delete that line.

I’m editing because I read wrong the first time:\ the above comment still applies, however if you do not have 1 datapoint each month, it is wise to down sample, move that line just below the last filter, and use fn: last instead of sum.

you may consider instead of using difference() + sum(), just increase() function.

assuming your data only increases and does not fluctuates
increase() returns the cumulative sum of non-negative differences between subsequent values.
increase() function | Flux 0.x Documentation (influxdata.com)

1 Like

So, the data is coming from sensor every 5 mins and I have taken the reading at the end of every month calculating the difference and then summing it up to see the total.

I tried to replace sum with last in aggregateWindow and put it after the last filter but the value still mismatches and it is still lower than the actual sum calculated from raw data by about 800 points.

sorry, I misunderstood you.

|> aggregateWindow(every: 1mo, fn: sum, createEmpty: false)

this is correct but should be after the filter and before the difference.
this line will group all data points on each month and sum all of those, fn:last will get you the last one of the data points on each month “window”.

also, If you only have one sensor you don’t need to group by sensor, If sensor is the tag which seems to be the case it is already grouped by “sensor”.

other thing to keep in mind is that influxDB uses UTC timestamps, I am not sure if on your excel spread sheet you are using UTC or Local time, this may affect the “window” aggregate results.

Thanks for the response @fercasjr

I made the changes as mentioned by you and it seems like that it’s has completed ignored the data for the month of January even though I have given the start range from January 1st (see below)

I was actually having 2 sensors but because the values were not matching so I was verifying the data for each sensor.

Also, the time stamp on Excel is also in UTC.

This is the query that I am using where January data has gone missing.

from(bucket: “Bacnet_Network”)
|> range(start: 2022-01-01T00:00:00Z, stop: 2022-12-31T23:59:59Z)
|> filter(fn: (r) => r[“_measurement”] == “048_STEM”)
|> filter(fn: (r) => r[“sensor”] == “472200_AV_210_STEM_HRC2_GJ_EXPORT_AV”)
|> aggregateWindow(every: 1mo, fn: last, createEmpty: false)
|> difference(nonNegative: false, columns: [“_value”])
//|> group(columns: [“sensor”])
//|> map(fn: (r) => ({_value:r._value}))
//|> sum(column: “_value”)
|> drop(columns: [“host”, “system”, “unit”, “topic”, “objects_type”, “name”, “_measurement”])

I actually figured out about the missing data for the month of January @fercasjr