Flux query leaves me without usable timestamps

cville · April 14, 2022, 10:40am

I’m trying to collect monthly settings for 4 counters called cT1, cT2, fT1, fT2 and calculate the difference between the end-of-month and start-of-month value and use these to compute the result “yld”. The problem is that the query twice offsets the timestamps by 1 month: once as a result of aggregateWindow() and once as a result of difference()

I’m running the following query:

import "timezone"
option location = timezone.location(name: "Europe/Amsterdam")

data24h = from(bucket: "e-counters")
  |> range(start: 2022-01-01)
  |> filter(fn: (r) => r["_measurement"] == "electricity")
  |> filter(fn: (r) => r["agg_type"] == "last24h")

pre_process = (tables=<-, window) => tables 
  |> filter(fn: (r) => r["_field"] =~ /[cf]T[1-2]/ )  // cT1, cT2, fT1, fT2
  |> aggregateWindow(every: window, fn: first, createEmpty: false)
  |> drop(columns: ["_start", "_stop", "agg_type", "_measurement"])

data_1mo = data24h
  |> pre_process(window: 1mo)
//   |> yield(name: "data_1mo")

yld = data_1mo
  |> group()
  |> sort(columns: ["_field", "_time"], desc: false)
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
//   |> duplicate(column: "_time", as: "original_time")
  |> yield(name: "intermediate_result")
  |> difference(columns: ["fT1", "fT2", "cT1", "cT2"])
  |> map(fn: (r) => ({ r with yld: (r.fT1 + r.fT2) - (r.cT1 + r.cT2) })) 
//   |> keep(columns: ["yld", "_time"])
  |> yield(name: "yld")

with the following result:

table
intermediate_result
	_time
no group
dateTime:RFC3339
	cT1
no group
double
	cT2
no group
double
	fT1
no group
double
	fT2
no group
double
0	2022-01-31T23:00:00.000Z	11513.767	6662.066	12434.695	28019.407
0	2022-02-28T23:00:00.000Z	11672.555	6789.526	12469.196	28091.143
0	2022-03-31T22:00:00.000Z	11796.889	6873.731	12610.354	28312.171
0	2022-04-14T10:10:02.217Z	11935.733	6953.828	12853.836	29036.859
table
yld
	_time
no group
dateTime:RFC3339
	cT1
no group
double
	cT2
no group
double
	fT1
no group
double
	fT2
no group
double
	yld
no group
double
0	2022-02-28T23:00:00.000Z	158.78800000000047	127.46000000000004	34.501000000000204	71.73600000000079	-180.0109999999995
0	2022-03-31T22:00:00.000Z	124.33399999999892	84.20499999999993	141.15799999999945	221.02799999999843	153.64699999999903
0	2022-04-14T10:10:02.217Z	138.84400000000096	80.09700000000066	243.48199999999997	724.6880000000019	749.2290000000003

As an example: the correct value for the difference in counter fT2 for the month of March is 749.23. I expect the timestamp for this to be 2022-03-31T22:00:00.000Z but it’s returning 2022-04-14T10:10:02.217Z which is the current time. I would also expect this timestamp to be used for the partial month of April, but there’s no result at all for April.

How can this query be changed to deliver the correct values for FEB, MAR & APR with either end-of-month or start-of-month (preferred) timestamps?

cville · April 21, 2022, 3:50pm

Is there anyone who can help me out?

Anaisdg · April 22, 2022, 7:28pm

Hello @cville,
You can use the timeshift function to change the _time value.

I was also able to use this approach to calculate the diff between the first and last value in a month:

import "array"

data = array.from(rows: [{_time: 2020-01-01T00:00:00Z, _value: 1.0 }, {_time: 2020-01-31T00:00:00Z, _value: 1.1 }
{_time: 2020-02-01T00:00:00Z, _value: 2.0}, {_time: 2020-02-28T00:00:00Z, _value: 2.2},
{_time: 2020-03-01T00:00:00Z, _value: 3.0}, {_time: 2020-03-31T00:00:00Z, _value: 3.3}
{_time: 2020-04-01T00:00:00Z, _value: 4.0}])
  |> range(start: 2020-01-01T00:00:00Z, stop: 2020-04-01T00:00:00Z)

data |> yield(name: "raw")

data_first = data
  |> aggregateWindow(every: 1mo, fn: first)
  |> yield(name: "first")


data_last = data
  |> aggregateWindow(every: 1mo, fn: last)
  |> yield(name: "last")

join(tables: {data_first: data_first, data_last: data_last}, on: ["_time"], method: "inner")
|> map(fn: (r) => ({ r with _value: r._value_data_first - r._value_data_last}))

cville · April 26, 2022, 2:03pm

Thanks for your reply. Your approach avoids using the somewhat problematic difference() function, which is nice. I had to rework the code because my data is more complicated than the simple array you constructed. I still need to do more work because I haven’t figured out how to calculate the days per month (or partial month). Do you?.

I also tried to use timeShift() which reconstructs the correct timestamp for a complete month but unfortunately not for the current partial month:

result: 2022-03-26T13:51:08.239Z
expected: 2022-03-31T22:00:00.000Z

Is this the desired behavior?

CODE:

import "timezone"
option location = timezone.location(name: "Europe/Amsterdam")

// NOTE: this version lacks the "days" functionality; need days per month for this

data = from(bucket: "e-counters")
  |> range(start: 2022-02-01)
  |> filter(fn: (r) => r["_measurement"] == "electricity")
  |> filter(fn: (r) => r["agg_type"] == "last24h")
  |> drop(columns: ["_measurement", "agg_type"])
  
bop = data  // begin of period
  |> aggregateWindow(every: 1mo, fn: first)
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
//   |> yield(name: "bop")

eop = data  // end of period
  |> aggregateWindow(every: 1mo, fn: last)
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
//   |> yield(name: "eop")

join(tables: {eop: eop, bop: bop}, on: ["_time"], method: "inner")
|> drop(fn: (column) => column =~ /_start*|_stop*/)  // drops all time columns except _time
|> map(fn: (r) => ( { r with imp: ((r.cT1_eop - r.cT1_bop) + (r.cT2_eop - r.cT2_bop)) }))  // import
|> map(fn: (r) => ( { r with exp: ((r.fT1_eop - r.fT1_bop) + (r.fT2_eop - r.fT2_bop)) }))  // export
|> map(fn: (r) => ( { r with yld_kWh: r.imp - r.exp } ))  // total yield per period
|> drop(fn: (column) => column =~ /[cf]T[12]/)  // drops all counter columns
|> map(fn: (r) => ({ r with yld_eur: r.yld_kWh * -0.45845}))

RESULT:

table
_result
	_time
no group
dateTime:RFC3339
	exp
no group
double
	imp
no group
double
	yld_eur
no group
double
	yld_kWh
no group
double
0	2022-02-28T23:00:00.000Z	325.84500000000116	202.95799999999963	56.33754515000071	-122.88700000000154
0	2022-03-31T22:00:00.000Z	967.6580000000031	214.3140000000003	345.3705568000013	-753.3440000000028
0	2022-04-26T13:56:50.050Z	894.460000000001	120.86200000000008	354.6560031000004	-773.5980000000009

Thank you for taking the time to respond; very helpful.

Anaisdg · April 27, 2022, 9:23pm

I’m not sure. How are you using timeshift? I don’t see it.

cville · April 28, 2022, 8:27am

Sorry about that. It is commented out in the bop and eop sections and was used like so:

bop = data  // begin of period
  |> aggregateWindow(every: 1mo, fn: first)
  |> timeShift(duration: -1mo)  // undo timeshift caused by aggregateWindow
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")//

In the mean time I found a workable solution to calculating the number of days per month. See lines 30 & 31 below.

import "math"
import "timezone"
option location = timezone.location(name: "Europe/Amsterdam")

  DAILY_RATE = -1.32934  // EUR/day
FEED_IN_RATE = -0.45845  // EUR/kWh exported

data = from(bucket: "e-counters")
  |> range(start: 2022-01-01)
  |> filter(fn: (r) => r["_measurement"] == "electricity")
  |> filter(fn: (r) => r["agg_type"] == "last24h")
  |> drop(columns: ["_measurement", "agg_type"])
  
bop = data  // begin of period
  |> aggregateWindow(every: 1mo, fn: first)
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
//   |> yield(name: "bop")

eop = data  // end of period
  |> aggregateWindow(every: 1mo, fn: last)
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
//   |> yield(name: "eop")
  
join(tables: {eop: eop, bop: bop}, on: ["_time"], method: "inner")
|> drop(fn: (column) => column =~ /_start*|_stop*/)  // drops all time columns except _time
|> map(fn: (r) => ( { r with imp: ((r.cT1_eop - r.cT1_bop) + (r.cT2_eop - r.cT2_bop)) }))  // import
|> map(fn: (r) => ( { r with exp: ((r.fT1_eop - r.fT1_bop) + (r.fT2_eop - r.fT2_bop)) }))  // export
|> map(fn: (r) => ( { r with yld_kWh: r.imp - r.exp } ))  // total yield per period
|> drop(fn: (column) => column =~ /[cf]T[12]/)  // drops all counter columns
|> elapsed(unit: 1h, timeColumn: "_time", columnName: "hours")  // compute the number of hours per (partial) month
|> map(fn: (r) => ({ r with days: math.round(x: float(v: r.hours)/24.0 )}))  //compute the number of days per (partial) month; rounded
// |> yield(name: "intermediate_result")
|> map(fn: (r) => ({ r with yld_eur: r.days * DAILY_RATE - r.yld_kWh * FEED_IN_RATE}))
|> keep(columns: ["_time", "yld_eur"])

Topic		Replies	Views
Wrong Results flux query funciton difference() Fluxlang influxdb , time-series , grafana , flux	9	352	February 21, 2024
Month aggregateWindow doesn't work with negative offset Fluxlang query , flux	18	3616	October 12, 2021
Migrate query to calculate energy (kWh) Fluxlang influxdb , flux	30	12319	July 4, 2023
Timestamp/duration mathematics Fluxlang	1	2346	November 13, 2019
Flux: Monthly values of data that represents a total value Fluxlang time-series , query , flux	11	1344	March 19, 2023

Flux query leaves me without usable timestamps

Related topics