How do I group data hourly, daily, monthly, etc.?

Hi,
I want to get an overview of how much electricity I use, therefore I want the following:

  1. Be able to see, in an hourly graph over 24 h, how much has been used
  2. Be able to dig into an hour and see usage by minute.
  3. Be able to see totals by day, month, and year.

I get my data from Home Assistant and I run InfluxDB 2.0, but I don't know exactly how to aggregate the data and what the best practice is to achieve what I want.

I can't add attachments, but here is a snippet of the data in CSV format that I have uploaded to WeTransfer.

The following query shows an increasing curve of the data:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> aggregateWindow(every: 1s, fn: last, createEmpty: false)
  |> yield(name: "last")

Note that I just have a counter of the total kWh used, which only ever increases, so I somehow need to calculate how much was used during each window.

/donnib

Hi,
You can use the difference() function to get the consumption between two points, something like this:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> aggregateWindow(every: 10m, fn: last, createEmpty: false)
  |> difference()
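
To get hourly or daily totals instead, only the window size should need to change. An untested sketch with the same bucket and entity names as above; nonNegative: true is an assumption on my part to guard against counter resets:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> aggregateWindow(every: 1h, fn: last, createEmpty: false)
  |> difference(nonNegative: true)

With fn: last, the differences of consecutive window values telescope, so they sum to the overall last-minus-first reading.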

Regards


I have tried your suggestion, but I can't get Influx to give me correct data:

#group TRUE TRUE FALSE FALSE TRUE TRUE
#datatype dateTime:RFC3339 dateTime:RFC3339 dateTime:RFC3339 double string string
_start _stop _time _value _field _measurement
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:00:26Z 20090.51 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:01:31Z 20090.52 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:02:06Z 20090.53 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:02:51Z 20090.54 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:03:41Z 20090.55 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:04:31Z 20090.56 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:05:16Z 20090.57 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:05:51Z 20090.58 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:06:41Z 20090.59 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:07:26Z 20090.6 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:08:16Z 20090.61 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:09:22Z 20090.62 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:09:41Z 20090.63 value kWh

and I have a query like this:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> aggregateWindow(every: 9m, fn: mean, createEmpty: false)
  |> difference()

and I would expect to see the difference between the first number, 20090.51, and the last, 20090.63, but the numbers don't add up; I get an incorrect value. I should get 0.12 but I get 0.065. I have a lot of data not shown above; this is just a small portion of it. I would normally use a 24h/1d window. I tried mean/sum/count, but no matter what, I never get the right result. What am I doing wrong?

Hello,
try this example on your data:

from(bucket: "myBucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r["_measurement"] == "myMeasurement")
  |> filter(fn: (r) => r["_field"] == "myField")
  |> reduce(
    fn: (r, accumulator) => ({
      min: if r._value < accumulator.min then r._value else accumulator.min,
      max: if r._value > accumulator.max then r._value else accumulator.max
    }),
    identity: {min: 10000000000.0, max: -10000000000.0}
  )
  |> map(fn: (r) => ({ r with range: r.max - r.min }))
  |> yield(name: "powerRange/[kWh]")

Greetings
Harald

Thank you, I got it to work:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> reduce(
    fn: (r, accumulator) => ({
      min: if r._value < accumulator.min then r._value else accumulator.min,
      max: if r._value > accumulator.max then r._value else accumulator.max
    }),
    identity: {min: 10000000000.0, max: -10000000000.0}
  )
  |> map(fn: (r) => ({ r with range: r.max - r.min }))
  |> yield(name: "powerRange/[kWh]")

And I get min, max, and range, but how can I make it so I only get range, so I can show it in a graph? Also, I guess if I change the range at the start, I get the usage per hour, day, week, etc.?

Hi,
use an additional
|> drop(columns: ["min", "max"])
before yield.

Greetings
Harald

@Harald_Tillmanns Thank you very much. What I meant was I want a graph with time on the X axis (the first time in the range) and the value, e.g. range, on the Y axis. Is that possible?

In the end I want one graph showing the usage for each hour, then one graph showing each week, and maybe one showing the usage at night (between a time range), all coming from the same raw data.

Update:
I am guessing I somehow need the aggregateWindow command merged into the solution you made, so I can choose a range and then use aggregateWindow to get a number for each hour/day/etc.? I am not sure.
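
One thing I noticed in the Flux docs (untested): there is a built-in spread() aggregate that returns max minus min per window, which sounds like it might replace the whole reduce/map pair with a single aggregateWindow call:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> aggregateWindow(every: 1h, fn: spread, createEmpty: false)

That would give one max-minus-min value per hour directly, with the window size controlling hour/day/week grouping.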

Hello,
okay, I hope I understand your problem now. What you need is windowing first. Here is an example with my test data:
from(bucket: "veo")
  |> range(start: -1h)
  |> filter(fn: (r) => r["_measurement"] == "procdata")
  |> filter(fn: (r) => r["_field"] == "GT1_TNH")
  |> window(every: 5m)
  |> reduce(
    fn: (r, accumulator) => ({
      min: if r._value < accumulator.min then r._value else accumulator.min,
      max: if r._value > accumulator.max then r._value else accumulator.max
    }),
    identity: {min: 10000000000.0, max: -10000000000.0}
  )
  |> map(fn: (r) => ({ r with range: r.max - r.min }))
  |> duplicate(column: "_stop", as: "_time")
  |> window(every: inf)
  |> drop(columns: ["_start", "_stop", "min", "max", "_measurement", "_field"])

The duplicated column is necessary to make the unwindowing window(every: inf) work.

Greetings
Harald

@Harald_Tillmanns thank you! That did indeed do the trick. It's correct now and exactly what I needed. Last question: how can I filter out some outliers? For example, I have some points with enormous values because I get errors in the data once in a while, so I want to filter out values below 0 and above, for example, 100. The value I want to filter on is the newly calculated range if possible; if not, filtering directly on _value is also fine, same result I guess.

Hi,
I haven't tested it, but I think you would reach your goal with something like
|> filter(fn: (r) => r["range"] > 0.0 and r["range"] < 100.0)
at the end of the pipe.

Let me know if it works.

Greetings
Harald

@Harald_Tillmanns Indeed it does work; however, I moved the filter to before the windowing, so I filter on the raw data and throw away as little as possible. I also tried to use a filter to take out the data from a whole day, but it seems I don't have the syntax correct:
|> filter(fn: (r) => r["_time"] != "2020-11-20")
How is that done? I find the documentation quite sparse on many of these subjects.

Hello,
I think the date/time constant is the problem. Look into the documentation or try something like 2018-01-01T00:00:00Z. The next problem is that 2020-11-20 means 2020-11-20T00:00:00Z, which is a single point in time and not a range;
r["_time"] < 2020-11-20T00:00:00Z or r["_time"] > 2020-11-20T23:59:59Z is better.

Sorry, but I forgot to say: with "" you always have a string and not a date/time literal, so please remove the "" around all date/time literals.
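
Putting those points together, an untested sketch of the full day-excluding filter with unquoted time literals; note it needs or, since no timestamp can lie both before midnight and after 23:59:59 of the same day:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._time < 2020-11-20T00:00:00Z or r._time > 2020-11-20T23:59:59Z)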