We are attempting to use Flux’s window() and/or windowaggregate() functions to plot the mean of a series of data points over time. We are converting an existing InfluxQL query performing the same logic on this dataset using the MEAN() function and a GROUP BY. The speed of the Flux query (over 30 seconds) is significantly worse compared to the InfluxQL one (5-6 seconds).
The slowness disappears if we remove the window() function from the query.
We noticed this Github issue that described this problem and provided a link to a merged PR. However, we have a version that already should already have the changes merged - currently running InfluxDB version 2.1.1.
Are there any other improvements happening to speed up the performance of the window() function?
Hi @elleanora,
Would you mind sending us your flux query? We can take a look and check your query is fully optimised. Here is what I will be looking for: push down optimisation .
Hi,
Will take a look at the push down optimization.
The Flux query - attempts to get the max, min, and mean temperatures for a given window period (ex: 1 sec):
data_temp = () => from(bucket: "system_telemetry")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) =>
r.system == "${system}" and
r.component == "asic_temp" and
r.agg_func == "NONE")
|> group(columns: ["system"])
|> window(every: v.windowPeriod)
mean_temp = data_temp
|> mean()
|> set(key: "_field", value: "mean_die_temp")
|> map(fn: (r) => ({
_time: r._start,
_field: r._field,
_value: r._value
}))
max_temp = data_temp
|> max()
|> set(key: "_field", value: "max_die_temp")
|> map(fn: (r) => ({
_time: r._start,
_field: r._field,
_value: r._value
}))
min_temp = data_temp
|> min()
|> set(key: "_field", value: "min_die_temp")
|> map(fn: (r) => ({
_time: r._start,
_field: r._field,
_value: r._value
}))
union(tables: [mean_temp, max_temp, min_temp])
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({
_time: r._time,
"Mean Die Temperature": r.mean_die_temp,
"Max Die Temperature": r.max_die_temp,
"Min Die Temperature": r.min_die_temp
}))
While the InfluxQL queries that are currently doing the same calculations on the same dataset are significantly faster:
SELECT mean("value")
FROM "sixty_days"."telemetry"
WHERE ("system" =~ /^$system$/
AND "component" = 'asic_temp'
AND "agg_func" = 'NONE')
AND $timeFilter
GROUP BY time($__interval)
SELECT max("value")
FROM "sixty_days"."telemetry"
WHERE ("system" =~ /^$system$/
AND "component" = 'asic_temp'
AND "agg_func" = 'NONE')
AND $timeFilter
GROUP BY time($__interval)
SELECT min("value")
FROM "sixty_days"."telemetry"
WHERE ("system" =~ /^$system$/
AND "component" = 'asic_temp'
AND "agg_func" = 'NONE')
AND $timeFilter
GROUP BY time($__interval)```
@Anaisdg can I get some of your insight on this one ?