How do I group data hourly, daily, monthly, etc.?

Hi,
I want to get an overview of how much electricity I use, therefore I want the following:

  1. Be able to see, in an hourly graph over 24 h, how much has been used
  2. Be able to dig into an hour and see usage by minute.
  3. Be able to see totals by day, month, and year.

I get my data from Home Assistant and I run InfluxDB 2.0, but I don't know exactly how to aggregate the data and what the best practice is to achieve what I want.

I can't add attachments, but here is a snippet of the data in CSV format that I have uploaded to WeTransfer.

The following query shows an increasing curve of the data:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> aggregateWindow(every: 1s, fn: last, createEmpty: false)
  |> yield(name: "last")

Note that I just have a counter of the total kWh used, which only ever increases, so I somehow need to calculate how much was used during each window.

/donnib

Hi,
You can use the difference() function to get the consumption between two points, something like this:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> aggregateWindow(every: 10m, fn: last, createEmpty: false)
  |> difference()
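
To get hourly or daily totals instead, only the window size should need to change. An untested sketch with the same bucket and entity names as above; nonNegative: true is an assumption on my part to guard against counter resets:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> aggregateWindow(every: 1h, fn: last, createEmpty: false)
  |> difference(nonNegative: true)

With fn: last, the differences of consecutive window values telescope, so they sum to the overall last-minus-first reading.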

Regards


I have tried your suggestion, but I can't get Influx to give me correct data:

#group TRUE TRUE FALSE FALSE TRUE TRUE
#datatype dateTime:RFC3339 dateTime:RFC3339 dateTime:RFC3339 double string string
_start _stop _time _value _field _measurement
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:00:26Z 20090.51 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:01:31Z 20090.52 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:02:06Z 20090.53 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:02:51Z 20090.54 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:03:41Z 20090.55 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:04:31Z 20090.56 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:05:16Z 20090.57 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:05:51Z 20090.58 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:06:41Z 20090.59 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:07:26Z 20090.6 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:08:16Z 20090.61 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:09:22Z 20090.62 value kWh
2021-01-15T23:00:00Z 2021-01-16T22:59:59Z 2021-01-15T23:09:41Z 20090.63 value kWh

and I have a query like this:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> aggregateWindow(every: 9m, fn: mean, createEmpty: false)
  |> difference()

and I would expect to see the difference between the first number, 20090.51, and the last, 20090.63, but the numbers don't add up; I get an incorrect value. I should get 0.12 but I get 0.065. I have a lot of data not shown above; this is just a small portion of it. I would normally use a 24h/1d window. I tried mean/sum/count, but no matter what, I never get the right result. What am I doing wrong?

Hello,
try this example on your data:

from(bucket: "myBucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r["_measurement"] == "myMeasurement")
  |> filter(fn: (r) => r["_field"] == "myField")
  |> reduce(
    fn: (r, accumulator) => ({
      min: if r._value < accumulator.min then r._value else accumulator.min,
      max: if r._value > accumulator.max then r._value else accumulator.max
    }),
    identity: {min: 10000000000.0, max: -10000000000.0}
  )
  |> map(fn: (r) => ({ r with range: r.max - r.min }))
  |> yield(name: "powerRange/[kWh]")

Greetings
Harald

Thank you, I got it to work:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> reduce(
    fn: (r, accumulator) => ({
      min: if r._value < accumulator.min then r._value else accumulator.min,
      max: if r._value > accumulator.max then r._value else accumulator.max
    }),
    identity: {min: 10000000000.0, max: -10000000000.0}
  )
  |> map(fn: (r) => ({ r with range: r.max - r.min }))
  |> yield(name: "powerRange/[kWh]")

And I get min, max, and range, but how can I make it so I only get range, so I can show it in a graph? Also, I guess if I change the range at the start, I get the usage per hour, day, week, etc.?

Hi,
use an additional
|> drop(columns: ["min", "max"])
before yield.

Greetings
Harald

@Harald_Tillmanns Thank you very much. What I meant was I want a graph with time on the X axis (the first time in the range) and the value, e.g. range, on the Y axis. Is that possible?

In the end I want one graph showing the usage for each hour, then one graph showing each week, and maybe one showing the usage at night (between a time range), all coming from the same raw data.

Update:
I am guessing I somehow need the aggregateWindow command merged into the solution you made, so I can choose a range and then use aggregateWindow to get a number for each hour/day/etc.? I am not sure.
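
One thing I noticed in the Flux docs (untested): there is a built-in spread() aggregate that returns max minus min per window, which sounds like it might replace the whole reduce/map pair with a single aggregateWindow call:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["entity_id"] == "filtered_kamstrup_counter")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["_measurement"] == "kWh")
  |> aggregateWindow(every: 1h, fn: spread, createEmpty: false)

That would give one max-minus-min value per hour directly, with the window size controlling hour/day/week grouping.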

Hello,
okay, I hope I understand your problem now. What you need is windowing first. Here is an example with my test data:
from(bucket: "veo")
  |> range(start: -1h)
  |> filter(fn: (r) => r["_measurement"] == "procdata")
  |> filter(fn: (r) => r["_field"] == "GT1_TNH")
  |> window(every: 5m)
  |> reduce(
    fn: (r, accumulator) => ({
      min: if r._value < accumulator.min then r._value else accumulator.min,
      max: if r._value > accumulator.max then r._value else accumulator.max
    }),
    identity: {min: 10000000000.0, max: -10000000000.0}
  )
  |> map(fn: (r) => ({ r with range: r.max - r.min }))
  |> duplicate(column: "_stop", as: "_time")
  |> window(every: inf)
  |> drop(columns: ["_start", "_stop", "min", "max", "_measurement", "_field"])

The duplicated column is necessary to make the unwindowing window(every: inf) work.

Greetings
Harald

@Harald_Tillmanns thank you! That did indeed do the trick. It's correct now and exactly what I needed. Last question: how can I filter out some outliers? For example, I have some points with enormous values because I get errors in the data once in a while, so I want to filter out values below 0 and above, for example, 100. The value I want to filter on is the newly calculated range if possible; if not, filtering directly on _value is also fine, same result I guess.

Hi,
I haven't tested it, but I think you would reach your goal with something like
|> filter(fn: (r) => r["range"] > 0.0 and r["range"] < 100.0)
at the end of the pipe.

Let me know if it works.

Greetings
Harald

@Harald_Tillmanns Indeed it does work; however, I moved the filter to before the windowing, so I filter on the raw data and throw away as little as possible. I also tried to use a filter to take out the data from a whole day, but it seems I don't have the syntax correct:
|> filter(fn: (r) => r["_time"] != "2020-11-20")
How is that done? I find the documentation quite sparse on many of these subjects.

Hello,
I think the date/time constant is the problem. Look into the documentation or try something like 2018-01-01T00:00:00Z. The next problem is that 2020-11-20 means 2020-11-20T00:00:00Z, which is a single point in time and not a range;
r["_time"] < 2020-11-20T00:00:00Z or r["_time"] > 2020-11-20T23:59:59Z is better.

Sorry, but I forgot to say: with "" you always have a string and not a date/time literal, so please remove the "" around all date/time literals.
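
Putting those points together, an untested sketch of the full day-excluding filter with unquoted time literals; note it needs or, since no timestamp can lie both before midnight and after 23:59:59 of the same day:

from(bucket: "Home Assistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._time < 2020-11-20T00:00:00Z or r._time > 2020-11-20T23:59:59Z)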