Strange Behaviour of MEAN() function

Hi Team, @scott , @Anaisdg , @rickspencer3 , @Jay_Clifford

Below are two simple IQL queries.

InfluxQL Query 1: SELECT mean(“quantity”) as mean_quantity FROM (SELECT mean(“quantity”) as “quantity” FROM “quantity_management”…“group-products” WHERE “id” = ‘YAZOO’ GROUP BY “product_id”,“product_name” ) WHERE time > :dashboardTime: and time < :upperDashboardTime:

InfluxQL Query 2: SELECT mean(“quantity”) as mean_quantity FROM (SELECT mean(“quantity”) as “quantity” FROM “quantity_management”…“group-products” WHERE “id” = ‘YAZOO’ GROUP BY “product_id”,“product_name”, time(1d) ) WHERE time > :dashboardTime: and time < :upperDashboardTime:

InfluxQL Query 1 works perfectly but InfluxQL Query 2 (where the only difference is the addition of time in group by) does not yield any results, I get the warning “Your query or command is syntactically correct but returned no results”. Why does InfluxQL Query 2 not work? any ideas please? :pray:

@ajetsharwin What tool(s) are you using to run these queries? The tool is what actually populates the placeholder variables (:dashboardTime: and :upperDashboardTime:) with timestamps before sending the query to InfluxDB.

@scott Hey Scott, I was using Chronograf and the placeholder variables defined a period of 30d. It’s the equivalent of saying time > now() - 30d

Ok, did a little testing myself and learned a new quirk about InfluxQL. If you use GROUP BY time() in a subquery and include time bounds in the WHERE clause, the time bounds have to be in the inner query before the GROUP BY clause, not the outer query :man_shrugging:. This does result in a slightly slower query execution, but it will at least return results:

SELECT
  mean("quantity") as mean_quantity
FROM
  (
    SELECT
      mean("quantity") as "quantity"
    FROM
      "quantity_management".."group-products"
    WHERE
      "id" = 'YAZOO'
    WHERE
      time > :dashboardTime:
      and time < :upperDashboardTime:
    GROUP BY
      "product_id",
      "product_name",
      time(1d)
  )

I’ve added an issue to document this behavior: Add info about InfluxQL GROUP BY time() with time bounds behavior in subqueries · Issue #5269 · influxdata/docs-v2 · GitHub

@scott ah, thanks Scott, noted on the quirk!

I have three follows ups, sorry for the long message but I’m hopeful some useful findings will come out of them, here goes :crossed_fingers:

Mean Series: Follow up 1
Do you know if it’s possible to calculate a “mean series”?

Please allow me to expand on the question.

Here is the query from my original question with some tweaks and accounting for the quirk by putting the where clause related to time inside the sub query itself.

InfluxQL Query 2:
SELECT mean(“daily_quantity”) as mean_daily_quantity FROM
(SELECT sum(“quantity”) as “daily_quantity” FROM “quantity_management”…“group-products” WHERE time > now() - 30d and “id” = ‘YAZOO’ GROUP BY “product_id”,“product_name”, time(1d) )

The series that is returned after the sub-query is processed is the daily_quantity series. My requirement is now to calculate the mean of this daily_quantity series. More explicitly, see the below example:

date daily_quantity mean_daily_quantity
01-Nov-2023 10 10
02-Nov-2023 12 11
03-Nov-2023 11 11
04-Nov-2023 10 10.75
05-Nov-2023 14 11.4
06-Nov-2023 15 12
07-Nov-2023 12 12
08-Nov-2023 13 12.125
09-Nov-2023 13 12.22222222
10-Nov-2023 14 12.4

any idea how I can produce mean_daily_quantity from the series?

Application of Moving Average:Follow up 2

Stangely, applying a MOVING_AVERAGE to the below query is also resulting in the warning:
“Your query or command is syntactically correct but returned no results”

InfluxQL Query 2:
SELECT mean(“daily_quantity”) as mean_daily_quantity FROM
(SELECT sum(“quantity”) as “daily_quantity” FROM “quantity_management”…“group-products” WHERE time > now() - 30d and “id” = ‘YAZOO’ GROUP BY “product_id”,“product_name”, time(1d) )

Applying MOVING_AVERAGE:

SELECT MOVING_AVERAGE(mean(“daily_quantity”),5) as ma5_mean_daily_quantity FROM
(SELECT sum(“quantity”) as “daily_quantity” FROM “quantity_management”…“group-products” WHERE time > now() - 30d and “id” = ‘YAZOO’ GROUP BY “product_id”,“product_name”, time(1d) )
GROUP BY time(1d)

Forward Looking Python: Follow up 3

We’ve journeyed along with InfluxQL queries which have been valuable and effective in visualising our time series data. We are now at a business stage where we want to visualise time series of more sophisticated financial ratios and metrics.
Currently our data lives in InfluxDB 2 and we visualise via Grafana using InfluxQL queries. Do you know if there is a way in which we can use the data living in InfluxQL, visualise on Grafana, but the code used in Grafana is not InfluxQL but it’s Python? Calculating the sophisticated financial ratios and metrics in InfluxQL will not be possible and there is a real business need to further process data living in InfluxDB to visualise on our dashboards. I’d really appreciate some guidance on a suitable way forward here :pray:

tagging the wider team for thoughts and input on Follow up 3, @rickspencer3 @Jay_Clifford @Anaisdg :pray:

@scott gentle ping on if you had any thoughts on the above please :crossed_fingers:

I think your application of MOVING_AVERAGE is the way to go.

I don’t know why this wouldn’t return results. I’ve tested a similar query and it returns results as I’d expect.

Forward Looking Python: Follow up 3

I don’t know if Grafana has something like a Python data source, but if it’s possible to add a Python script that Grafana can visualize the results of, you could use the InfluxDB python client library to query InfluxDB and return the results as a dataframe, then process the dataframe however you need to.

Thank you Scott, I’ll get back on this :pray: