Comparing data from different years, months, ... time ranges

Hello,

I would like to compare time series values from multiple years contained in the same measurement. Specifically, compare my last year’s gas and electricity usage to this year’s.
I think this is achievable by removing the year from the _time column and then using window(every: 1y) to separate the measurement data into multiple tables. But I’m stuck how to synchronize the year from the datetime specification (and put it into the field name to have Grafana display it in the legend).
Any ideas would be welcome!
Pseudocode:

from(bucket: "iobroker/global")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement =~ /gaszaehler.0.absolute/ and
    r._field == "value"
  )
  |> window(every: 1y)
  |> modDate(column: ["_time"], year: now().getYear())   // this doesn't exist yet

Or would a Pivot function be more suitable here, separating the years into columns? The problem is the timestamps during the years don’t exactly match up.

Also, to make this comparison feasible the (absolute) counter values displayed would each need to start at 0 again. So I’d need a function which, for every table, subtracts the first field value from all subsequent values. I know I can write a custom function to manipulate data but how can I force every measurement displayed to start at 0?

Something like this (Pseudocode):

foreach(value in measurement_values) {
    value = value - measurement_values[0];
}

Thanks :slight_smile:

Hello @Jens,
Have you tried using the timeShift() function?
https://docs.influxdata.com/flux/v0.x/stdlib/universe/timeshift/
I also find this doc pretty useful:
https://docs.influxdata.com/influxdb/cloud/query-data/flux/manipulate-timestamps/

You can also use an offset in the window() function.

I would query for the two tables in separate variables with different time ranges:
One for year 1 and one for year 2. Then I would shift the time column so they have the same timestamp so you can and then join the data. Now you can compare values across timestamps.

Like:

time1 = from(bucket: "systemstats")
  |> range(start: 2021-10-21T15:49:20.000Z, stop: 2021-10-21T15:50:20.000Z)
  |> filter(fn: (r) => r["_measurement"] == "cpu")
  |> filter(fn: (r) => r["_field"] == "usage_user")
  |> filter(fn: (r) => r["cpu"] == "cpu-total")
  
  |> timeShift(duration: 1m, columns: ["_start", "_stop", "_time"])
//   |> drop(columns: ["_start", "_stop"])
  |> yield(name: "time1")

time2 = from(bucket: "systemstats")
  |> range(start:	2021-10-21T15:50:20.000Z, stop: 2021-10-21T15:51:20.000Z)
  |> filter(fn: (r) => r["_measurement"] == "cpu")
  |> filter(fn: (r) => r["_field"] == "usage_user")
  |> filter(fn: (r) => r["cpu"] == "cpu-total")
//   |> drop(columns: ["_start", "_stop"])
  |> yield(name: "time2")


join(tables: {key1: time1, key2: time2}, on: ["_start", "_stop", "_time","_measurement", "_field", "cpu", "host" ], method: "inner")

This means you’ll have to keep track of which value was in the past but the use of intelligent naming can help with that.

Hello Anaisdg,

I found timeShift but this shifts by a fixed amount, and I have “N” years (dynamic), not just two.
I’d need a function that replaces parts of a date with a fixed value, not adds or subtracts a fixed value. Also the datapoints from different years aren’t synchronized so 99% of the time I won’t have e.g. a measurement at 2019-06-06 10:30:00, 2020-06-06 10:30:00, etc. since my sensors send data only when there’s a change in value, not at regular intervals.

Basically I now have this:

from(bucket: "iobroker/global")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement =? "gaszaehler.0.absolute" and
    r._field == "value"
  )
  |> aggregateWindow(every: 1w, fn: max, timeSrc: "_start")
  |> derivative(unit: 1w, nonNegative: true)
  |> window(every: 1y, offset: 6mo)

and I’d like the differently colored graphs overlaid (not side by side), the year removed from the X axis, and put into the graph legend (instead of “value”), and the X axis scaled to show only a single year (preferably, July … June, and not Jan … Dec). Without having to change my data or copy & paste a new query every year.

This should be possible using Flux … right? :slight_smile:

What about using date.yearDay() as the X-axis value? This would work since the year is not contained in the value, and I’d get values from 0…366 on the X axis for each year, which I can pivot on multiple columns and then display as overlaid graphs.
The only thing that’s now missing is that Grafana seems unable to plot non-time values on the X axis.
Any ideas?

Or, alternatively I could replace all years of all datapoint timespecs with the current year, but I’d probably need to convert date to string and back to do that - or a custom date.modify() function to set year, month, day, etc to custom values. I haven’t found such a function in the documentation, does it exist?

Hello @Jens,
The following worked for me:

from(bucket: "systemstats")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "cpu")
  |> filter(fn: (r) => r["_field"] == "usage_system")
  |> filter(fn: (r) => r["cpu"] == "cpu-total")
  |> window(every: 5m)
  |> map(fn: (r) => ({ r with _startUnix: uint(v: r._start)}))
  |> map(fn: (r) => ({ r with _time: experimental.addDuration(d: duration(v: uint(v: now()) - r._startUnix), to: r._time)}))

I just shifted everything to now() but you could shift it to a time that works for you.