crop
February 9, 2022, 8:45am
1
Hello,
I am using InfluxDB version 1.8.9.
I would like to find out how much data is written per measurement per day. For a single measurement it works like this:
SELECT count(my_field_value) FROM "my_measurement_1" GROUP BY time(1d)
Is this also possible for all measurements in a database without having to query each measurement individually?
Can this data be written into a new measurement?
The name of the measurement would have to be included so the results can be interpreted.
With this evaluation one could spot missing data and also identify candidates for continuous queries.
thanks for your help in advance
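For reference, InfluxQL itself allows a regular expression in the FROM clause, which may partially cover the "all measurements" case; a sketch, assuming the default retention policy:

```sql
-- Count points for every field of every measurement, per day
-- (the /.*/ regex matches all measurements in the database)
SELECT COUNT(*) FROM /.*/ WHERE time > now() - 7d GROUP BY time(1d)
```

Each measurement comes back as its own series, so the measurement name is part of the result.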
Anaisdg
February 15, 2022, 12:09am
2
Hello @crop ,
I believe you can do this with Flux. Is it enabled on your instance? Are you willing to use it?
crop
February 15, 2022, 2:29pm
3
Hello @Anaisdg ,
Thanks for the tip. I have heard of Flux but have not used it yet. The documentation showed me how to enable it, and Google helped me get rid of the 404 error.
For version 1.8.9, is access only possible via the CLI?
I start like this:
influx -precision rfc3339 -type=flux -path-prefix /api/v2/query
…and can then make a first successful query:
from(bucket: "iobroker/")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "knx.0.weather.station.WS_wind_speed")
|> yield()
Does this fit so far?
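To later cover all measurements without naming each one, the v1 Flux package can list them; a sketch, assuming the influxdata/influxdb/v1 package is available in this Flux build:

```flux
// List all measurements in the given bucket
import "influxdata/influxdb/v1"

v1.measurements(bucket: "iobroker/")
```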
Anaisdg
February 18, 2022, 8:59pm
4
Hello @crop ,
I forgot that these functions are only available to cloud users.
crop
February 25, 2022, 6:39pm
5
Hello @Anaisdg ,
Is there a way to dynamically form and then execute statements in influxdb?
Anaisdg
February 25, 2022, 6:52pm
6
Hello @crop ,
You could use this function. Is this what you're envisioning?
import "experimental/array"

sources = [
    {bucket: "b1", measurement: "m1", field: "f1"},
    {bucket: "b2", measurement: "m2", field: "f2"},
    {bucket: "b3", measurement: "m3", field: "f3"},
]

timeRange = {start: -1h, stop: now()}

tables = array.map(
    arr: sources,
    fn: (x) => from(bucket: x.bucket)
        |> range(start: timeRange.start, stop: timeRange.stop)
        |> filter(fn: (r) => r._measurement == x.measurement and r._field == x.field),
)

union(tables: tables)
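Tying this back to the per-day count, the same pattern can carry an aggregateWindow step; a sketch with illustrative bucket/measurement/field names (note that experimental/array requires Flux 0.79.0 or newer):

```flux
// Count points per day for each configured source, then merge the results
import "experimental/array"

sources = [
    {bucket: "b1", measurement: "m1", field: "f1"},
    {bucket: "b2", measurement: "m2", field: "f2"},
]

tables = array.map(
    arr: sources,
    fn: (x) => from(bucket: x.bucket)
        |> range(start: -7d)
        |> filter(fn: (r) => r._measurement == x.measurement and r._field == x.field)
        |> aggregateWindow(every: 1d, fn: count),
)

union(tables: tables)
```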
crop
February 25, 2022, 7:16pm
7
@Anaisdg Thanks for the quick answer. I’m on vacation for some days. Afterwards I will try to continue with this topic.
crop
March 10, 2022, 12:23pm
8
Hello @Anaisdg ,
Already the import statement `import "experimental/array"` fails.
I use InfluxDB 1.8.9 and would like to find out which Flux version it includes.
The attempt via “runtime.version()” fails with the message
“Error: error calling function “version”: build info is not present”.
Another user has already asked about the cause but got no answer. Is it not possible?
crop
March 14, 2022, 6:07pm
9
@Anaisdg
Is the question too simple or is it impossible to determine the version?
Hello @crop ,
Not all flux functions are available in 1.x
I would expect this to work though for you to see your version number:
import "runtime"
import "array"
array.from(rows: [{_value: "${runtime.version()}"}])
You can see the version of Flux required for the function you want to use in the docs.
experimental/array requires Flux 0.79.0+
crop
March 21, 2022, 6:19pm
11
Hello @Anaisdg ,
It is exactly because of the version prerequisites stated in the documentation that I need to know my version.
Since the import of "array" does not work, my version is obviously older than 0.79.0.
> import "array"
Error: type error 1:1-1:15: unknown import path: "array"
import "runtime" was successful, so: 0.38.0 <= my version < 0.79.0
Is there any other way to find out the version?
Is it possible to upgrade Flux without changing the InfluxDB version?
I am using InfluxDB 1.8.9. Is the Flux version bundled with it really unknown?
And of course thanks for the help.
That’s strange. I thought you should be able to run that command.
crop
March 30, 2022, 5:44pm
13
@Anaisdg Thank you very much for the help. It is a pity that such a basic piece of information cannot be determined.
Hello @crop ,
Yes.
I think our only recourse then is to find out what version of InfluxDB it is and look up the release notes for it.
I think your version is v0.65.
crop
April 15, 2022, 3:40pm
16
Hello @Anaisdg ,
I think that should fit. Thank you; the Flux version is then v0.65.
Back to my actual request. In the meantime I found out that I can determine the number of data points per day for a single measurement with this query:
from(bucket: "iobroker/")
|> range(start: 2022-04-12T00:00:00Z, stop: 2022-04-15T00:00:00Z)
|> filter(fn: (r) => r._measurement == "fritzdect.0.DECT_087610355081.energy")
|> filter(fn: (r) => r._field == "value")
|> aggregateWindow(every: 1d, fn: count)
The result looks like this, for example:
Result: _result
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _value:int _time:time
------------------------------ ------------------------------ ---------------------- ------------------------------------ -------------------------- ------------------------------
2022-04-12T00:00:00.000000000Z 2022-04-15T00:00:00.000000000Z value fritzdect.0.DECT_087610355081.energy 22 2022-04-13T00:00:00.000000000Z
2022-04-12T00:00:00.000000000Z 2022-04-15T00:00:00.000000000Z value fritzdect.0.DECT_087610355081.energy 24 2022-04-14T00:00:00.000000000Z
2022-04-12T00:00:00.000000000Z 2022-04-15T00:00:00.000000000Z value fritzdect.0.DECT_087610355081.energy 24 2022-04-15T00:00:00.000000000Z
Since I want the count per day regardless of when the query is executed, I added the explicit "|> range(…)". Can this be done more elegantly?
In the "_time" column the day shown is already the following day. What is the reason for this? Can it be changed? The first day should be 2022-04-12, shouldn't it?
I'm not entirely sure what you mean by this. You can do |> range(start: 0) to query all of your data, but do so with caution.
The windows have to be defined by either the start or the stop of the window you're aggregating on.
The Flux team decided it made more sense to define each window by its stop boundary, so that the label describes all the points the window encompasses.
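If the start boundary is preferred as the label, aggregateWindow accepts a timeSrc parameter; a sketch, assuming your Flux version supports it:

```flux
// Label each 1d window with its start time instead of its stop time
from(bucket: "iobroker/")
    |> range(start: 2022-04-12T00:00:00Z, stop: 2022-04-15T00:00:00Z)
    |> filter(fn: (r) => r._measurement == "fritzdect.0.DECT_087610355081.energy")
    |> filter(fn: (r) => r._field == "value")
    |> aggregateWindow(every: 1d, fn: count, timeSrc: "_start")
```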
crop
April 19, 2022, 6:10pm
19
Hello @Anaisdg ,
To make it more clear here the raw data:
> from(bucket: "iobroker/")
|> range(start: -1d)
|> filter(fn: (r) => r._measurement == "fritzdect.0.DECT_087610355081.energy")
|> filter(fn: (r) => r._field == "value")
Result: _result
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ------------------------------------ ------------------------------ ----------------------------
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-18T19:50:59.204000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-18T19:55:22.517000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-18T21:50:59.205000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-18T21:55:22.518000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-18T23:50:59.205000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-18T23:55:22.519000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T01:50:59.205000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T01:55:22.519000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T03:50:59.206000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T03:55:22.519000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T05:50:59.206000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T05:55:22.520000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T07:50:59.206000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T07:55:22.520000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T09:50:59.206000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T09:55:22.521000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T11:50:59.206000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T11:55:22.523000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T13:50:59.205000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T13:55:22.524000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T15:50:59.207000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T15:55:22.525000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T17:50:59.209000000Z 0
2022-04-18T18:02:32.337591336Z 2022-04-19T18:02:32.337591336Z value fritzdect.0.DECT_087610355081.energy 2022-04-19T17:55:22.525000000Z 0
crop
April 19, 2022, 6:20pm
20
If I query with range -1d, then the window boundaries depend on the time at which I run the query. That's why I used an explicit range like:
|> range(start: 2022-04-12T00:00:00Z, stop: 2022-04-15T00:00:00Z)
The result is about 24 rows per day. I am not interested in the data itself, but in the number of records per day. I want to find out where too much data is generated and then downsample it (continuous query).
Therefore I supplement:
|> aggregateWindow(every: 1d, fn: count)
I hope this makes my concern clearer. If I query for today, then a date of tomorrow in the _time column is not what I expect, though I understand the reasoning. Is it possible to "convert" this value?
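One way to "convert" the stop-labeled timestamps is to shift them back by the window length after counting; a sketch using timeShift, assuming daily windows:

```flux
// Shift the _time labels back by one day so each count is dated
// with the day it covers rather than the window's stop boundary
from(bucket: "iobroker/")
    |> range(start: 2022-04-12T00:00:00Z, stop: 2022-04-15T00:00:00Z)
    |> filter(fn: (r) => r._measurement == "fritzdect.0.DECT_087610355081.energy")
    |> filter(fn: (r) => r._field == "value")
    |> aggregateWindow(every: 1d, fn: count)
    |> timeShift(duration: -1d, columns: ["_time"])
```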