How much data per measurement and day

Hello,
I am using influxdb version 1.8.9.
I would like to find out how much data is written per measurement and day. For a single measurement it works like this:

SELECT count(my_field_value) FROM "my_measurement_1" group by time(1d) 

Is this also possible for all measurements in a database without having to query each measurement individually?
Can this data be written into a new measurement?
The name of the measurement would have to be added to enable interpretation.

With this evaluation, one could recognise missing data and also identify candidates for continuous queries.

thanks for your help in advance

Hello @crop,

I believe you can do this with Flux. Is it enabled on your instance? Are you willing to try that?

Hello @Anaisdg,
Thanks for the tip. I have heard of Flux, but have not used it yet. I found out via the docs how to turn it on, and Google helped me get rid of the 404 error.

For version 1.8.9, is Flux only accessible via the CLI?

I start like this:

influx -precision rfc3339 -type=flux -path-prefix /api/v2/query

…and can then make a first successful query:

from(bucket: "iobroker/")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "knx.0.weather.station.WS_wind_speed")
  |> yield()

Does this fit so far?

Hello @crop,
I forgot that these functions are only available to cloud users.

Hello @Anaisdg,
Is there a way to dynamically form and then execute statements in influxdb?

Hello @crop,
You could use this function. Is this what you’re envisioning?

import "experimental/array"

// One record per bucket/measurement/field combination to query.
sources = [
    {bucket: "b1", measurement: "m1", field: "f1"},
    {bucket: "b2", measurement: "m2", field: "f2"},
    {bucket: "b3", measurement: "m3", field: "f3"},
]

timeRange = {start: -1h, stop: now()}

// Build one stream of tables per source entry.
tables = array.map(
    arr: sources,
    fn: (x) => from(bucket: x.bucket)
        |> range(start: timeRange.start, stop: timeRange.stop)
        |> filter(fn: (r) => r._measurement == x.measurement and r._field == x.field),
)

// Merge all streams into a single output.
union(tables: tables)

@Anaisdg Thanks for the quick answer. I’m on vacation for some days. Afterwards I will try to continue with this topic.

Hello @Anaisdg,
The very first call, import "experimental/array", already fails.

I use InfluxDB 1.8.9 and I would like to find out which Flux version I have.
The attempt via runtime.version() fails with the message
"Error: error calling function "version": build info is not present".

Another user has already asked about the cause of this error, but got no answer. Is it not possible?

@Anaisdg
Is the question too simple or is it impossible to determine the version?

Hello @crop,
Not all Flux functions are available in 1.x.
I would expect this to work, though, for you to see your version number:

import "runtime"
import "array"

array.from(rows: [{_value: "${runtime.version()}"}])

You can see the version of Flux required for the function you want to use in the docs.
experimental/array requires Flux 0.79.0+
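If the plain "array" package is unavailable on your Flux version, the experimental path might work instead. This is just a sketch assuming your Flux is at least 0.79.0, where experimental/array was introduced; the call itself is the same:

import "runtime"
import "experimental/array"

// array.from builds a table from a list of records;
// runtime.version() returns the Flux version string.
array.from(rows: [{_value: runtime.version()}])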

Hello @Anaisdg,
The version prerequisites mentioned in the documentation are exactly why I need to know my version.
Since the import of "array" does not work, it is evidently older than 0.79.0.

> import "array"
Error: type error 1:1-1:15: unknown import path: "array"

import "runtime" was successful, so: 0.38.0 <= my version < 0.79.0.
Is there any other way to find out the version?

Is it possible to upgrade Flux without changing the InfluxDB version?

I am using InfluxDB 1.8.9. Is the Flux version included in it really unknown?

And of course thanks for the help.

That’s strange. I thought you should be able to run that command.

@Anaisdg Thank you very much for the help. It is a pity that such basic information is not ascertainable.

Hello @crop,
Yes. :frowning:
I think our only recourse then is to find out what version of InfluxDB it is and look up the release notes for it.

I think your version is v0.65

Hello @Anaisdg,
I think that should fit. Thank you, the flux version is certainly v0.65.

Back to my actual request. In the meantime I found out that I can determine the number of records per day for a single field with this query:

from(bucket: "iobroker/")
  |> range(start: 2022-04-12T00:00:00Z, stop: 2022-04-15T00:00:00Z)
  |> filter(fn: (r) => r._measurement == "fritzdect.0.DECT_087610355081.energy")
  |> filter(fn: (r) => r._field == "value")
  |> aggregateWindow(every: 1d, fn: count)

The result looks like this, for example:

Result: _result
Table: keys: [_start, _stop, _field, _measurement]
                   _start:time                      _stop:time           _field:string                   _measurement:string                  _value:int                      _time:time  
------------------------------  ------------------------------  ----------------------  ------------------------------------  --------------------------  ------------------------------  
2022-04-12T00:00:00.000000000Z  2022-04-15T00:00:00.000000000Z                   value  fritzdect.0.DECT_087610355081.energy                          22  2022-04-13T00:00:00.000000000Z  
2022-04-12T00:00:00.000000000Z  2022-04-15T00:00:00.000000000Z                   value  fritzdect.0.DECT_087610355081.energy                          24  2022-04-14T00:00:00.000000000Z  
2022-04-12T00:00:00.000000000Z  2022-04-15T00:00:00.000000000Z                   value  fritzdect.0.DECT_087610355081.energy                          24  2022-04-15T00:00:00.000000000Z 

Since I want the count per day regardless of when the query is executed, I added an explicit “|> range(…)”. Can this be done more elegantly?

In the “_time” column, the day shown is already the following day. What is the reason for this? Can it be changed? The first day should be 2022-04-12, shouldn’t it?

I’m not entirely sure what you mean by this. You can do |> range(start: 0) to query all of your data but do so with caution.

The windows are labeled by either the start or the stop boundary of the window you’re aggregating on.
The Flux team decided it made more sense to label each window by its stop boundary, so the timestamp describes all the points the window encompasses.
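If your Flux version supports it, aggregateWindow also takes a timeSrc parameter that labels each window with its start boundary instead; whether 0.65 accepts the parameter is an assumption you would need to verify:

from(bucket: "iobroker/")
  |> range(start: 2022-04-12T00:00:00Z, stop: 2022-04-15T00:00:00Z)
  |> filter(fn: (r) => r._measurement == "fritzdect.0.DECT_087610355081.energy")
  |> filter(fn: (r) => r._field == "value")
  // timeSrc: "_start" stamps each daily count with the window's start boundary
  |> aggregateWindow(every: 1d, fn: count, timeSrc: "_start")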

Hello @Anaisdg,
To make it clearer, here is the raw data:

> from(bucket: "iobroker/")
  |> range(start: -1d)
  |> filter(fn: (r) => r._measurement == "fritzdect.0.DECT_087610355081.energy")
  |> filter(fn: (r) => r._field == "value")

Result: _result
Table: keys: [_start, _stop, _field, _measurement]
                   _start:time                      _stop:time           _field:string                   _measurement:string                      _time:time                  _value:float  
------------------------------  ------------------------------  ----------------------  ------------------------------------  ------------------------------  ----------------------------  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-18T19:50:59.204000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-18T19:55:22.517000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-18T21:50:59.205000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-18T21:55:22.518000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-18T23:50:59.205000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-18T23:55:22.519000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T01:50:59.205000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T01:55:22.519000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T03:50:59.206000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T03:55:22.519000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T05:50:59.206000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T05:55:22.520000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T07:50:59.206000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T07:55:22.520000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T09:50:59.206000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T09:55:22.521000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T11:50:59.206000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T11:55:22.523000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T13:50:59.205000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T13:55:22.524000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T15:50:59.207000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T15:55:22.525000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T17:50:59.209000000Z                             0  
2022-04-18T18:02:32.337591336Z  2022-04-19T18:02:32.337591336Z                   value  fritzdect.0.DECT_087610355081.energy  2022-04-19T17:55:22.525000000Z                             0 

If I query with the range -1d, the data is relative to the time at which I run the query. That’s why I used an absolute range like:

 |> range(start: 2022-04-12T00:00:00Z, stop: 2022-04-15T00:00:00Z)

The result is about 24 rows per day. I am not interested in the data itself, but in the number of records per day: I want to find out where too much data is generated and then compress it (CONTINUOUS QUERY).
Therefore I append:

  |> aggregateWindow(every: 1d, fn: count)
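Since Flux outputs one table per series, I suppose the same pipeline without the measurement filter would give me the count for every measurement at once; this is just a guess on my side:

from(bucket: "iobroker/")
  |> range(start: 2022-04-12T00:00:00Z, stop: 2022-04-15T00:00:00Z)
  // no _measurement filter: Flux keeps one table per measurement,
  // so each table carries its measurement name in the group key
  |> filter(fn: (r) => r._field == "value")
  |> aggregateWindow(every: 1d, fn: count)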

I hope this makes my concern clearer. If I query for today, a timestamp of tomorrow in the _time column is not what I expect, but I understand the reasoning. Is it possible to “convert” this value?
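Maybe something like timeShift could relabel the timestamps after the aggregation; I have not verified this on 0.65:

  |> aggregateWindow(every: 1d, fn: count)
  // shift the stop-boundary labels back one day so each count
  // carries the date of the day it describes
  |> timeShift(duration: -1d, columns: ["_time"])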