GROUP BY time leaving tag - mixing aggregate and non-aggregate queries is not supported

Hi all, could you be so kind to help with the next issue. I have InfluxDB database with the measurement values from sensors, placed in the machines. Measurement schema has field “value”, timestamp, and two tags - “device_id” and “zone”. I have used queries “SELECT value, zone from temperature WHERE device_id=‘123456789’ AND time >= now() -50m AND time <now() -3s GROUP BY zone;” for the visualization of data. It works perfect, I can show all zones on one chart, but it is ok when the data flow is not interrupted.

Let’s imagine, that device worked from -50m to -30m (last value was 100), than it was switched off until -10m, first value after the device was on in -10m e.g. 200, and then device worked until now. Such query as above has drawback - on chart you can see line from (-50m, 100) to (-10, 200), but it should be NaN or 0 on the chart from -50m to -10m… So, I need somehow to show the time, when devise was not sent the data.

I decided to use query with aggregation: “select MEAN(value) AS mean_value, zone from temperature WHERE device_id=‘123456789’ AND time >= now() -50m AND time <now() -3s GROUP BY zone, time(1m);”), but received “mixing aggregate and non-aggregate queries is not supported” error. In case if I use “select MEAN(value) AS mean_value from temperature WHERE device_id=‘123456789’ AND time >= now() -50m AND time <now() -3s GROUP BY zone, time(1m)” query result consist of “mean_value” and “time”, but there are no zone. I have seen solution for continuous queries with INTO (
ERR: mixing aggregate and non-aggregate queries is not supported) - but I don’t need to store aggregated values.

So, how can I query the data with aggregation by time, but also leave tags (id of zone)?

Looks like

SELECT zone, mean_value FROM (SELECT mean(value) AS mean_value FROM temperature WHERE time > now() - 3h GROUP BY time(1m), zone)

works (query returned averaged value and also zone id), but it doesn’t help with the timestamps, where no any data was stored in InfluxDB. I’ve thought that MEAN for such time intervals will return 0, but that points just missed in the result. Interesting, in Grafana visualisation of similar query looks like it should be (empty space instead of line, connecting points), but in Giraffe visualisation library (that I’m using) it just connect points…

So, the question in the topic is closed, but the problem is not solved yet…

This is how the visualisation is looks like in Giraffe for the query from the previous post.

Not sure I got the latest request 100% right but looks like you got issues with filling gaps.
To do so there is a function called fill (see docs), which should require a group by time.

From what I’ve seen you are not grouping by time but I don’t think it will be an issue, you could even group the data by just one second GROUP BY time(1s) or whatever time unit suits your need. (As far as I can tell you won’t even notice the difference).

With fill you can fill with nothing (this is the default), fixed values (,0,10,100 whatever you want), and more, just have a look at the docs linked above

1 Like

Thank you for your help. Yes, current issue is a filling gaps. I have used grouping by tag:

SELECT zone, mean_value FROM (
        SELECT mean(value) AS mean_value FROM temperature WHERE time > now() - 5h 
        GROUP BY time(1m), zone
)

In this case I have a result such as

{"time": "2021-07-08T07:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "002temp", "mean_value": 24.7},
{"time": "2021-07-08T08:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T08:00:00Z", "zone": "002temp", "mean_value": 25.6},
{"time": "2021-07-08T09:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T09:00:00Z", "zone": "002temp", "mean_value": 26.0},
{"time": "2021-07-08T07:00:00Z", "zone": "001set", "mean_value": 195.5},
{"time": "2021-07-08T07:00:00Z", "zone": "002set", "mean_value": 195.6},
{"time": "2021-07-08T08:00:00Z", "zone": "001set", "mean_value": 200.0},
{"time": "2021-07-08T08:00:00Z", "zone": "002set", "mean_value": 200.0},
{"time": "2021-07-08T09:00:00Z", "zone": "001set", "mean_value": 200.0},
{"time": "2021-07-08T09:00:00Z", "zone": "002set", "mean_value": 200.0}

I’ve tried to use fill(0):

SELECT zone, mean_value FROM (
        SELECT mean(value) AS mean_value FROM temperature WHERE time > now() - 5h 
        GROUP BY time(1m), zone fill(0)
)

I assumed that result should be like above plus additional 0 points for time 6:00:00 and 10:00:00, when device was switched off:

{"time": "2021-07-08T06:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T06:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "002temp", "mean_value": 24.7},
{"time": "2021-07-08T08:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T08:00:00Z", "zone": "002temp", "mean_value": 25.6},
{"time": "2021-07-08T09:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T09:00:00Z", "zone": "002temp", "mean_value": 26.0},
{"time": "2021-07-08T010:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T010:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T06:00:00Z", "zone": "001set", "mean_value": 0.0},
{"time": "2021-07-08T06:00:00Z", "zone": "002set", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "001set", "mean_value": 195.5},
{"time": "2021-07-08T07:00:00Z", "zone": "002set", "mean_value": 195.6},
{"time": "2021-07-08T08:00:00Z", "zone": "001set", "mean_value": 200.0},
{"time": "2021-07-08T08:00:00Z", "zone": "002set", "mean_value": 200.0},
{"time": "2021-07-08T09:00:00Z", "zone": "001set", "mean_value": 200.0},
{"time": "2021-07-08T09:00:00Z", "zone": "002set", "mean_value": 200.0}
{"time": "2021-07-08T010:00:00Z", "zone": "001set", "mean_value": 0.0},
{"time": "2021-07-08T010:00:00Z", "zone": "002set", "mean_value": 0.0},

But in this case it returns result that looks like “decart mutiplication” in SQL - for each result instance (row) from above results InfluxDB add additional points for each time interval (in the example below - from 6:00:00 to 10:00:00 with the step 1:00:00):

{"time": "2021-07-08T06:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T08:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T09:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T10:00:00Z", "zone": "001temp", "mean_value": 0.0},

{"time": "2021-07-08T06:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "002temp", "mean_value": 24.7},
{"time": "2021-07-08T08:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T09:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T10:00:00Z", "zone": "002temp", "mean_value": 0.0},

{"time": "2021-07-08T06:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T08:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T09:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T10:00:00Z", "zone": "001temp", "mean_value": 0.0},

{"time": "2021-07-08T06:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T08:00:00Z", "zone": "002temp", "mean_value": 25.6},
{"time": "2021-07-08T09:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T10:00:00Z", "zone": "002temp", "mean_value": 0.0},

{"time": "2021-07-08T06:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T08:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T09:00:00Z", "zone": "001temp", "mean_value": 0.0},
{"time": "2021-07-08T10:00:00Z", "zone": "001temp", "mean_value": 0.0},

{"time": "2021-07-08T06:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T07:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T08:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T09:00:00Z", "zone": "002temp", "mean_value": 26.0},
{"time": "2021-07-08T10:00:00Z", "zone": "002temp", "mean_value": 0.0},
{"time": "2021-07-08T11:00:00Z", "zone": "002temp", "mean_value": 0.0},

etc

Queries that used as a subqueries (with and without fill) returned totally identical results!

I forgot to add GROUP BY zone in the end of the first query:

SELECT zone, mean_value FROM (
     SELECT mean(value) AS mean_value FROM temperature 
            WHERE time > now() - 10h AND time < now() 
            GROUP BY zone, time(10m) fill(0)
) GROUP BY zone

So, the problem solved. Thanks to Giovanni_Luisotto one more time - your comment about fill() has really helped me!

The query you posted should fill all the data gaps with zeroes, the subquery is not needed.

SELECT mean(value) AS mean_value FROM temperature WHERE time > now() - 5h 
        GROUP BY time(1m), zone fill(0)

Therefore, a filler will be created for each “member” of your GROUP BY.
In this case time and zone.
As a sample concept, if my data is

Time | Zone | Value
2021-07-08T11:00:00Z | zone1 | 10
2021-07-08T11:03:00Z | zone1 | 11
2021-07-08T11:04:00Z | zone1 | 12
2021-07-08T11:06:00Z | zone1 | 12

You will get something like this.

Time | Zone | Value
2021-07-08T11:00:00Z | zone1 | 10
2021-07-08T11:01:00Z | zone1 | 0
2021-07-08T11:02:00Z | zone1 | 0
2021-07-08T11:03:00Z | zone1 | 11
2021-07-08T11:04:00Z | zone1 | 12
2021-07-08T11:05:00Z | zone1 | 0
2021-07-08T11:06:00Z | zone1 | 12

for every single combination of minute (because I specified 1m as time grouping) and zone a row has been “created”.
The filler will cover the whole time range you select. (in my sample I limited it)

1 Like

Thank you!
Yes, your query is also do almost the same, but I’m using python influxdb client, where ResultSet by means of list(result.get_points()) is represented as a list of dictionaries, and than Giraffe is used for visualization. This is why I need to have a zone number in “flat” response (as represented below), and this is why I need to aggregate mean function and tag.

("zone": "001temp"): [
     {"time": "2021-07-08T12:47:00Z", "zone": "001temp", "mean_value": 10.0}, 
     {"time": "2021-07-08T12:48:00Z", "zone": "001temp", "mean_value": 0.0},
     ...
],
("zone": "002temp"): [
     {"time": "2021-07-08T12:47:00Z", "zone": "002temp", "mean_value": 0.0}, 
     {"time": "2021-07-08T12:48:00Z", "zone": "002temp", "mean_value": 0.0},
     ...
]

May be there is a better way to process the data in Python, but right now it is my implementation.