Topic: Downsampling is not foolproof.
I enthusiastically started with Flux and use it with custom functions in a project that includes:

Energy meter readings (smart meter), 1-second sample.

Weather data, 10-minute sample.

Solar panel data (real time), 10-minute sample.

Central heating installation temperature (water-side temperature control), 30-second sample.

Degree-day calculations (KNMI), custom Flux function.

Variable hourly rate calculation (custom Flux function); source: European ENTSO-E database.
All six sources come in through MQTT and are written to a raw bucket for downsampling.
So far this works perfectly.
Downsample sum-up issue
The following downsampling issue is theoretical.
When downsampling, the last value always goes to the next sample.
That is not a problem over a long run, in graphs, or in an average calculation
(as long as my salary doesn't depend on it),
but it has a great effect in a downsampling chain.

You always miss the last sample.

If you downsample in a chain from an already downsampled source, the error becomes much larger.
Example worked out
Downsample chain: 10m => 1h => 1d.
There are 1440 minutes in a day.

If I have a value of 1 every minute, I expect 1440 units in a day; I get 1439, and the remaining 1 goes to the next day.

If I first sum up per hour into a new bucket and then sum that bucket up to a day, I miss 60 minutes and get 1380 units; 60 goes to the next day.

If I downsample from 10 minutes to hours and then from hours to days, I miss 10 minutes plus 60 minutes = 70 minutes, leaving 1370 units.
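The chain losses above can be reproduced with a small Python toy model. This is only a sketch of my understanding of aggregateWindow: I assume each window covers [start, stop) and its result is timestamped at the window's *end*, so the last aggregate of every period rolls over into the next period when you aggregate again.

```python
from collections import defaultdict

def aggregate(points, width, fn=sum):
    """Group (time, value) points into windows of `width` minutes.

    Modeled on (my assumption of) Flux aggregateWindow: window k covers
    [k*width, (k+1)*width) and the result is timestamped at the window's
    END, which is what pushes it into the next period when you aggregate
    the output again.
    """
    buckets = defaultdict(list)
    for t, v in points:
        buckets[t // width].append(v)
    return [((k + 1) * width, fn(vs)) for k, vs in sorted(buckets.items())]

# One unit per minute for one day (times in minutes, day 1 = [0, 1440)).
minutes = [(t, 1) for t in range(1440)]

# Chain 1m -> 1h -> 1d: the hourly result stamped at 24:00 lands in day 2,
# so day 1 sums to 1380 and the missing 60 shows up in day 2.
daily_from_hours = aggregate(aggregate(minutes, 60), 1440)
print(daily_from_hours)

# Chain 1m -> 10m -> 1h -> 1d: the losses add up, day 1 sums to 1370
# and 70 units land in day 2.
daily_chain = aggregate(aggregate(aggregate(minutes, 10), 60), 1440)
print(daily_chain)
```

The model confirms the 1380 and 1370 totals for the chained cases; the exact single-stage figure (1439 vs 1440) depends on boundary details of generate.from and range that this sketch does not capture.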
Here is a Flux script to demonstrate this.
Experiment with the script: add or remove the // comments
and see what happens.
import "generate"

start_proc = 2023-01-01T00:00:00Z
stop_proc = 2023-01-02T00:00:00Z
// for a 2-day window
//stop_proc = 2023-01-03T00:00:00Z
// for testing the day range
//stop_proc = 2023-01-01T23:59:00Z

minuut_data =
    generate.from(
        // 24 * 60 = 1440 minutes
        count: 1440,
        // 2 * 24 * 60 = 2880 for a 2-day window
        //count: 2880,
        //fn: (n) => n / 60,
        fn: (n) => 1,
        //fn: (n) => n,
        start: start_proc,
        stop: stop_proc
    )
m10_data =
    minuut_data
        |> range(start: start_proc, stop: stop_proc)
        |> aggregateWindow(every: 10m, fn: sum, createEmpty: false)
        |> yield(name: "10 minute info")

uur_data =
    m10_data
        |> range(start: start_proc, stop: stop_proc)
        |> aggregateWindow(every: 1h, fn: sum, createEmpty: false)
        |> yield(name: "hour info")

uur_data
    |> range(start: start_proc, stop: stop_proc)
    |> aggregateWindow(every: 1d, fn: sum, createEmpty: false)
    |> keep(columns: ["_time", "_value"])
    |> yield(name: "test_output")
Meter reading with spread demo
Downsampling error with spread (or diff)
The downsampling error increases if you use spread (or diff) to read meter readings.
Example: the meter increments 1 unit every minute.

Then you have a spread of 1439 units in one day; 1 unit goes to the next day.

If you take the spread per hour and sum that for the whole day, you get a total of 1357 units instead of 1440, missing 23 + 60.

If you take the spread per 10 minutes, sum that per hour, and sum the hours for the whole day, you get 1233 units: you miss 6 units every hour, plus the last hour.
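The same toy model (a Python sketch, assuming window results are timestamped at the window's end) also reproduces these spread numbers when spread is modeled as max minus min per window:

```python
from collections import defaultdict

def aggregate(points, width, fn=sum):
    # Window k covers [k*width, (k+1)*width); the result is timestamped
    # at the window's end, so it rolls into the next period when chained.
    buckets = defaultdict(list)
    for t, v in points:
        buckets[t // width].append(v)
    return [((k + 1) * width, fn(vs)) for k, vs in sorted(buckets.items())]

def spread(vs):
    # Like Flux spread(): difference between max and min in the window.
    return max(vs) - min(vs)

# Meter counts up 1 unit per minute: reading n at minute n.
meter = [(t, t) for t in range(1440)]

# Spread per hour (59 each), then summed per day: only 23 hourly results
# stay inside day 1, giving 23 * 59 = 1357.
daily_from_hourly_spread = aggregate(aggregate(meter, 60, fn=spread), 1440)
print(daily_from_hourly_spread)

# Spread per 10 min (9 each) -> sum per hour -> sum per day: 1233 in day 1.
daily_chain = aggregate(aggregate(aggregate(meter, 10, fn=spread), 60), 1440)
print(daily_chain)
```

Each spread window drops the step between itself and the previous window, and each chained stage shifts its last result into the next period, which is why the loss grows at every level.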
Demo Flux script:
import "generate"

start_proc = 2023-01-01T00:00:00Z
stop_proc = 2023-01-02T00:00:00Z

minuut_data =
    generate.from(
        count: 1440,
        fn: (n) => n,
        start: start_proc,
        stop: stop_proc
    )
m10_data =
    minuut_data
        |> range(start: start_proc, stop: stop_proc)
        |> aggregateWindow(every: 10m, fn: spread, createEmpty: false)
        |> yield(name: "10 minute info")

uur_data =
    m10_data
        |> range(start: start_proc, stop: stop_proc)
        |> aggregateWindow(every: 1h, fn: sum, createEmpty: false)
        |> yield(name: "hour info")

uur_data
    |> range(start: start_proc, stop: stop_proc)
    |> aggregateWindow(every: 1d, fn: sum, createEmpty: false)
    |> keep(columns: ["_time", "_value"])
    |> yield(name: "test_output")
This was a learning moment for me. My conclusion:
downsample as close to the source as possible.
I wish you lots of ‘flux’ fun.