Flux performance compared to similar influxQL query

Am I missing something obvious when translating the influxQL query below to Flux? From Grafana the influxQL query executes in 191ms, the Flux one takes about 38s??
This is for the same time range, last 2 weeks with one record every 30s (so somewhere around 2.5M records)

SELECT  max(chargeur_velo) as chargeur_velo,  max(plaque) as plaque,  max(cuisine) as cuisine,  max(lave_vaisselle) as lave_vaisselle,  max(livebox) as livebox FROM ${db}.."power" WHERE $timeFilter GROUP BY time($watts_interval) fill(linear)

into

fields = ["chargeur_velo", "plaque", "cuisine", "lave_vaisselle", "livebox"]
from(bucket: "iota-oneweek")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "power")
  |> filter(fn: (r) => contains(value: r._field, set: fields))
  |> aggregateWindow(every: 1d, fn: mean, createEmpty: false )

Do you have any tags in your data that Flux is querying? I found this webinar from Influxdb about Schema Design for IoT to be extremely helpful.

My impression is that this data should be re-organized as follows:

  • Tag: Device ( chargeur_velo | plaque | cuisine | lave_vaisselle | livebox )
  • Field: Power
  • Measurement would be something other than Power, for example, “Maison”

Hi @grant1, I will look into your suggestion, but I think I found the main culprit: using the "contains"function in the second filter seems to prevent the push down.
If I simply use a static “or” expression performance is an order of magnitude better.

Franck

OK, thanks for the additional info. I had never seen the contains() function, so was not aware of how it was used.

Yes, it is the contains(). It is known to be impacting a lot the perfromance. If you are using grafana, you can bypassing it using carefully some variables and some regex. Read the full discussion here

2 Likes

There is a bug report for poor contains performance.