Having trouble with influxdb subqueries

Jonaus · October 31, 2017, 9:28am

Can’t figure out how to create a query for my needs (don’t even know if it’s possible).

Test data:

# DDL

CREATE DATABASE filltest

# DML

# CONTEXT-DATABASE: filltest

a,mytag=x b=1 1506816000000000000
a,mytag=x b=1 1506902400000000000
a,mytag=x b=1 1506988800000000000
a,mytag=x b=1 1507075200000000000
a,mytag=x b=1 1507161600000000000
a,mytag=x b=1 1507248000000000000
a,mytag=x b=1 1507334400000000000
a,mytag=x b=1 1507420800000000000
a,mytag=x b=1 1507507200000000000
a,mytag=x b=1 1507593600000000000
a,mytag=y b=2 1506816000000000000
a,mytag=y b=2 1506988800000000000
a,mytag=y b=2 1507161600000000000
a,mytag=y b=2 1507248000000000000
a,mytag=y b=2 1507420800000000000
a,mytag=y b=2 1507507200000000000
a,mytag=y b=2 1507593600000000000

There are 10 measurements with tag=x and 7 with tag=y (values are spaced 1 day apart). I need to take the average of all tags grouped by 1 day, filling with previous value for days where any tag didn’t have any value.

Measurements in influx (grouped by tag):

> SELECT mean(b) FROM a WHERE time >= 1506816000000000000 and time <= 1507593600000000000 GROUP BY time(1d), mytag
name: a
tags: mytag=x
time                 mean
----                 ----
2017-10-01T00:00:00Z 1
2017-10-02T00:00:00Z 1
2017-10-03T00:00:00Z 1
2017-10-04T00:00:00Z 1
2017-10-05T00:00:00Z 1
2017-10-06T00:00:00Z 1
2017-10-07T00:00:00Z 1
2017-10-08T00:00:00Z 1
2017-10-09T00:00:00Z 1
2017-10-10T00:00:00Z 1

name: a
tags: mytag=y
time                 mean
----                 ----
2017-10-01T00:00:00Z 2
2017-10-02T00:00:00Z 
2017-10-03T00:00:00Z 2
2017-10-04T00:00:00Z 
2017-10-05T00:00:00Z 2
2017-10-06T00:00:00Z 2
2017-10-07T00:00:00Z 
2017-10-08T00:00:00Z 2
2017-10-09T00:00:00Z 2
2017-10-10T00:00:00Z 2

I can create influx query to group by tag and fill with previous value:

> SELECT mean(b) FROM a WHERE time >= 1506816000000000000 and time <= 1507593600000000000 GROUP BY time(1d), mytag fill(previous)
name: a
tags: mytag=x
time                 mean
----                 ----
2017-10-01T00:00:00Z 1
2017-10-02T00:00:00Z 1
2017-10-03T00:00:00Z 1
2017-10-04T00:00:00Z 1
2017-10-05T00:00:00Z 1
2017-10-06T00:00:00Z 1
2017-10-07T00:00:00Z 1
2017-10-08T00:00:00Z 1
2017-10-09T00:00:00Z 1
2017-10-10T00:00:00Z 1

name: a
tags: mytag=y
time                 mean
----                 ----
2017-10-01T00:00:00Z 2
2017-10-02T00:00:00Z 2
2017-10-03T00:00:00Z 2
2017-10-04T00:00:00Z 2
2017-10-05T00:00:00Z 2
2017-10-06T00:00:00Z 2
2017-10-07T00:00:00Z 2
2017-10-08T00:00:00Z 2
2017-10-09T00:00:00Z 2
2017-10-10T00:00:00Z 2

My end result should look like this (but I need each value to be 1.5):

> SELECT mean(b) FROM a WHERE time >= 1506816000000000000 and time <= 1507593600000000000 GROUP BY time(1d)
name: a
time                 mean
----                 ----
2017-10-01T00:00:00Z 1.5
2017-10-02T00:00:00Z 1
2017-10-03T00:00:00Z 1.5
2017-10-04T00:00:00Z 1
2017-10-05T00:00:00Z 1.5
2017-10-06T00:00:00Z 1.5
2017-10-07T00:00:00Z 1
2017-10-08T00:00:00Z 1.5
2017-10-09T00:00:00Z 1.5
2017-10-10T00:00:00Z 1.5

I tried something like this:

> SELECT mean("a") FROM (SELECT mean(b) as "a" FROM a WHERE time >= 1506816000000000000 and time <= 1507593600000000000 GROUP BY time(1d), mytag fill(previous)) WHERE time >= 1506816000000000000 and time <= 1507593600000000000 GROUP BY time(1d)

But that doesn’t return what I need. I couldn’t find a similar problem anywhere, help is appreciated.

Jonaus · November 2, 2017, 12:54pm

I created CQ:

> CREATE CONTINUOUS QUERY foo ON filltest BEGIN SELECT mean(b) AS "g" INTO foo FROM a GROUP BY time(1d), mytag fill(previous) END

With it and some backfilling I was able to get the results I needed:

> SELECT mean("g") FROM foo WHERE time >= 1506816000000000000 and time <= 1507593600000000000 GROUP BY time(1d)

name: foo
time                 mean
----                 ----
2017-10-01T00:00:00Z 1.5
2017-10-02T00:00:00Z 1.5
2017-10-03T00:00:00Z 1.5
2017-10-04T00:00:00Z 1.5
2017-10-05T00:00:00Z 1.5
2017-10-06T00:00:00Z 1.5
2017-10-07T00:00:00Z 1.5
2017-10-08T00:00:00Z 1.5
2017-10-09T00:00:00Z 1.5
2017-10-10T00:00:00Z 1.5

But I need to be able to use different time periods for grouping (representing the data in grafana) so CQ is not an option. Still don’t understand why I can’t achieve the same using subquery.

EDIT: to me this looks like a bug/unexpected behaviour with fill() command in subquery. I opened an issue on github.

Topic		Replies	Views
Group by and fill(previous) creating wrong result influxdb , time-series , query	2	2580	November 12, 2019
GROUP BY tag but use last(time) InfluxDB 1 query	0	334	February 8, 2022
InfluxDB 1.7.9 - Fill function in a subquery when outer query groups by time bug influxdb , influxql , query	0	795	February 21, 2020
Version 1.7 group by time does not work Welcome & Getting Started influxdb , query	1	494	January 13, 2021
Subquery group by time influxdb , query	2	2121	June 16, 2020

Having trouble with influxdb subqueries

Related topics