Flux: Get total energy by month with an ever-increasing value

Hi!

I have an InfluxDB + Grafana setup, using Flux. I have Tasmota devices that report their current energy usage as well as a lifetime total.

Now I want a visualisation of the sum of all devices’ total energy consumption in a monthly aggregate window, reading that “lifetime total” value.

I believe I’d just need to get the first entry from each month and the last entry from each month. Then I subtract the “Total” value of the first entry of the month from the “Total” value of the last entry of the month, and that would give the absolute energy usage for that month.

So far so good, but I can’t wrap my mind around how I would build such a Flux query. It must be an easy task, but I searched everywhere and couldn’t find Flux code for this (I did find a lot of InfluxQL, but I sadly went the Flux route and feel pot-committed now).

Best regards,
Garvin

Hi @fe-hicking and welcome to the InfluxDB forum.

I tried to do what you outlined using some data of my own. Hopefully you get some ideas from what I have written below.

My data arrives with usage values per hour (not cumulative)…

…so to get the “always increasing” view, I use the cumulativeSum() function:

from(bucket: "Bucket1")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "HomeMetrics")
  |> filter(fn: (r) => r["_field"] == "EnergyUsage")
  |> cumulativeSum(columns: ["_value"])
  |> yield(name: "sum")

And then, I added the window() function to break it out into months, and the spread() function to return the difference between the max and min values for each of those months.

from(bucket: "Bucket1")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "HomeMetrics")
  |> filter(fn: (r) => r["_field"] == "EnergyUsage")
  |> cumulativeSum(columns: ["_value"])
  |> window(every: 1mo)
  |> spread()
  |> yield(name: "sum")

gives this:

As a spot check, I queried my data this way:

from(bucket: "Bucket1")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "HomeMetrics")
  |> filter(fn: (r) => r["_field"] == "EnergyUsage")
  |> aggregateWindow(every: 1mo, fn: sum)
  |> yield(name: "sum")

which gives values that closely match the other method (the small difference is probably because spread() returns max minus min of the cumulative series, which leaves out each window’s first hourly value: e.g. hourly usage 1, 2, 3 cumulates to 1, 3, 6, so spread() gives 6 - 1 = 5 while sum gives 6):

Hi @grant1 !

I was secretly hoping you would be able to reply to my question. I saw your efforts here for newcomers like me, and you are a powerhouse. I hugely appreciate what you are doing!

Your main tip of using spread(), and apparently window() instead of aggregateWindow(), got me a few steps further.

Before I get to that, let me circle back to my initial setup, which I may not have described well enough.

My tracking yields data like this:


The charts/bars I want to generate should in the end look like this (the right one; the left one is the actual data I have right now):

https://docs.google.com/spreadsheets/d/1ojL08r9G5qo4ZARubAOeBilrBoTEovWop3sGh5DQ_Rc/edit#gid=313074819

I tried your query:

from(bucket: "homeassistant")
  |> range(start: -12mo, stop: -1d)
  |> filter(fn: (r) => r["_measurement"] == "tasmota")
  |> filter(fn: (r) => r["_field"] == "Total")
  |> filter(fn: (r) => r["host"] == "192.168.0.42")
  |> window(every: 1mo)
  |> spread()
  |> yield(name: "sum")

and this does seem to yield the properly spread values for each month per device. I restricted that to “office” and made it “per day” and fetched the last 7 days:


Looking at “_start”, the data seems properly fetched; however, I now fail at visualising it because it is missing a “_time” column. I tried to search for this problem, and it seems to stem from aggregation (or the lack thereof). However, when I use aggregateWindow() instead of window() in my query and apply e.g. a “max” aggregate, my raw data gets reduced to only one row, and there is still no “_time” attribute.

I know this must be a stupid newbie mistake, but I’m running in circles. If you have the time to point out my error here, that would be awesome. :slight_smile:

@fe-hicking

Thanks for the additional info (and for the kind words!).

Was the above using the InfluxDB charting tools, or did you try putting the query into Grafana to see what it yields? Sometimes the same query behaves a bit differently in Grafana.

I will play around with this more today and report back.
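
In the meantime, one thing that might be worth trying: window() restructures the data so that after spread() there is no _time column anymore, but the _start and _stop window boundaries are still there. You can copy one of those back into _time with the duplicate() function. A sketch based on your query above (untested against your data):

from(bucket: "homeassistant")
  |> range(start: -12mo, stop: -1d)
  |> filter(fn: (r) => r["_measurement"] == "tasmota")
  |> filter(fn: (r) => r["_field"] == "Total")
  |> filter(fn: (r) => r["host"] == "192.168.0.42")
  |> window(every: 1mo)
  |> spread()
  // spread() leaves _start/_stop but no _time; copy the window end into _time
  |> duplicate(column: "_stop", as: "_time")
  |> yield(name: "sum")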

Hi @grant1 ,

Thanks for getting back so quickly. The above was done via the InfluxDB explorer. When I put the query into Grafana, I just get the error about a missing time field (“Data is missing a time field”).

Through both the Grafana query inspector and the InfluxDB explorer I can see in the raw data that no “_time” attribute exists (the second screenshot shows the InfluxDB explorer). I failed at simply adding it in the query:

  |> map(fn: (r) =>  ({ _time: r._start }))

because then this error happens:

 runtime error @13:6-13:14: spread: column "_value" does not exist

If it matters, I’m using InfluxDB 2.1.1, Server: 657e183, Frontend: cc65325 and Grafana v8.3.1 (b9eacd93e9).

Hi @grant1

I got one step further. I found my mistake: I need to re-add the _time column AFTER the spread() function (and use “r with” in map() so the other columns are kept, which is where my earlier attempt went wrong). Now I have this query worked out; for now I restrict everything to a single device (“tele/tasmota_office/SENSOR”). I also added a step to convert kWh back to Wh to spread the points over larger numbers.

import "strings"

from(bucket: "homeassistant")
  |> range(start: -12mo, stop: -1d)
  |> filter(fn: (r) => r["_measurement"] == "tasmota")
  |> filter(fn: (r) => r["_field"] == "Total")
  |> filter(fn: (r) => r["topic"] == "tele/tasmota_office/SENSOR")
  |> map(fn: (r) =>  ({ r with topic: strings.replaceAll(v: strings.replaceAll(v: r.topic, t:"/SENSOR", u:""), t:"tele/tasmota_", u:"") }))
  |> window(every: 1d)
  |> spread()
  |> map(fn: (r) =>  ({ r with _time: r._start }))
  |> map(fn: (r) =>  ({ r with _value: r._value * 1000.0 }))
  |> yield(name: "Total Usage")

In the raw data output via influxdb Data Explorer, this data looks good to me:

However, when I use this query in either the InfluxDB Data Explorer “Graph” visualisation or Grafana, I don’t see a proper graph.

Data Explorer shows me a chart with proper X and Y axes showing the proper ranges for Wh and the date, but there are no points or bars. I set “Data X Column” to “time” and “Y Column” to “_value”.

Funnily enough, if I put this into Grafana, I get a similar issue with an empty graph, but I can see that Grafana isn’t properly “merging” all the data points. Maybe that’s the issue, but again I don’t really know how to fix it; see this screenshot:

I would actually expect there to be only one entry in the legend, labelled “office”, with the first value being that of the first displayed day and the last value being that of the last day. I am sure something like a group function is missing here? I did add |> group() after the |> yield(), but that didn’t work.

In Grafana the graph is defined as “Time series” with no real configuration, except that I specified the Y-axis unit and changed Display name to ${__field.labels.topic}; there are no value mappings, overrides or data links here.

Here’s another screenshot (sorry for the multiple posts, but I am apparently only allowed to add 2 screenshots per post?!) where I changed the Grafana output to bars. It shows that each bar weirdly seems to span 16 days instead of 1 day. I don’t really understand this; the _start and _stop columns really only span 24 hours. So why does this data seem to cross boundaries when plotted?

Hi again @fe-hicking

I am afraid I led you down the wrong path; there is a better way to do it. No need for the window() or spread() functions. Just aggregateWindow() every 1 month and then the difference() function, which does essentially the same thing as spread() in this case, because aggregateWindow() has already downsampled the data by grouping it into fixed windows of time and applying an aggregate or selector function to each window.

Try this:

from(bucket: "homeassistant")
  |> range(start: -12mo, stop: -1d)
  |> filter(fn: (r) => r["_measurement"] == "tasmota")
  |> filter(fn: (r) => r["_field"] == "Total")
  |> filter(fn: (r) => r["host"] == "192.168.0.42")
  |> aggregateWindow(every: 1mo, fn: sum)
  |> difference()
  |> yield(name: "sum")

With my data, I am getting something like this in Grafana:

I believe that if you expand your host to query more than just the office (and include living, kitchen, etc.) then you should get stacked bars like the example below (from play.grafana.org), but not sure.
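
For example (sketched, not tested against your tags), simply dropping the host filter should keep one result table, and thus one series, per device:

from(bucket: "homeassistant")
  |> range(start: -12mo, stop: -1d)
  |> filter(fn: (r) => r["_measurement"] == "tasmota")
  |> filter(fn: (r) => r["_field"] == "Total")
  // no host filter: one series per device, so Grafana can stack them
  |> aggregateWindow(every: 1mo, fn: sum)
  |> difference()
  |> yield(name: "sum")

With stacking set to “Normal” on the Grafana Time series (or Bar chart) panel, each device should then become one segment of the stacked bars.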

Hi @grant1

OMG! Yes! This is working now!

I had to change the aggregate function from “sum” to “max”, because in my understanding I don’t want to sum up cumulative values, but rather use the maximum data point in each window, right?

On the day from 09/01 to 10/01 I had these values:

kitchen1: 4.37 -> 4.72  = 0.35 kWh
kitchen2: 16.4 -> 17.2  = 0.8 kWh
living:   24.2 -> 24.7  = 0.5 kWh
office:   53.7 -> 54.9  = 1.2 kWh

Current query:

import "strings"

from(bucket: "homeassistant")
  |> range(start: -30d)
  |> filter(fn: (r) => r["_measurement"] == "tasmota")
  |> filter(fn: (r) => r["_field"] == "Total")
  |> map(fn: (r) =>  ({ r with topic: strings.replaceAll(v: strings.replaceAll(v: r.topic, t:"/SENSOR", u:""), t:"tele/tasmota_", u:"") }))
  |> map(fn: (r) =>  ({ r with _value: r._value * 1000.0 }))
  |> aggregateWindow(every: 1d, fn: max)
  |> difference()
  |> yield(name: "Total Usage")
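
Since “Total” only ever increases, I believe fn: last would behave the same as max here; a variant of the query above (untested beyond my own data):

from(bucket: "homeassistant")
  |> range(start: -30d)
  |> filter(fn: (r) => r["_measurement"] == "tasmota")
  |> filter(fn: (r) => r["_field"] == "Total")
  |> map(fn: (r) =>  ({ r with _value: r._value * 1000.0 }))
  // "last" instead of "max": equivalent for a counter that never decreases
  |> aggregateWindow(every: 1d, fn: last)
  |> difference()
  |> yield(name: "Total Usage")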

I’ll fiddle around with this some more now. I learnt a lot - I hope it’ll be of use for others struggling similarly. Many, many thanks to you for staying on it and teaching me.

Best regards,
Garvin
