Slow window function

We are attempting to use Flux’s window() and/or windowaggregate() functions to plot the mean of a series of data points over time. We are converting an existing InfluxQL query performing the same logic on this dataset using the MEAN() function and a GROUP BY. The speed of the Flux query (over 30 seconds) is significantly worse compared to the InfluxQL one (5-6 seconds).
The slowness disappears if we remove the window() function from the query.

We noticed this Github issue that described this problem and provided a link to a merged PR. However, we have a version that already should already have the changes merged - currently running InfluxDB version 2.1.1.

Are there any other improvements happening to speed up the performance of the window() function?

Hi @elleanora,
Would you mind sending us your flux query? We can take a look and check your query is fully optimised. Here is what I will be looking for: push down optimisation .

Hi,

Will take a look at the push down optimization.

The Flux query - attempts to get the max, min, and mean temperatures for a given window period (ex: 1 sec):

data_temp = () => from(bucket: "system_telemetry")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => 
    r.system == "${system}" and 
    r.component == "asic_temp" and 
    r.agg_func == "NONE")
  |> group(columns: ["system"])
  |> window(every: v.windowPeriod)
 
 mean_temp = data_temp
  |> mean()
  |> set(key: "_field", value: "mean_die_temp") 
  |> map(fn: (r) => ({
  _time: r._start,
  _field: r._field,
  _value: r._value
}))

max_temp = data_temp
  |> max()
  |> set(key: "_field", value: "max_die_temp") 
  |> map(fn: (r) => ({
  _time: r._start,
  _field: r._field,
  _value: r._value
}))

min_temp = data_temp
  |> min()
  |> set(key: "_field", value: "min_die_temp") 
  |> map(fn: (r) => ({
  _time: r._start,
  _field: r._field,
  _value: r._value
}))

union(tables: [mean_temp, max_temp, min_temp]) 
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({
  _time: r._time,
  "Mean Die Temperature": r.mean_die_temp,
  "Max Die Temperature": r.max_die_temp,
  "Min Die Temperature": r.min_die_temp
}))

While the InfluxQL queries that are currently doing the same calculations on the same dataset are significantly faster:

SELECT mean("value") 
FROM "sixty_days"."telemetry" 
WHERE ("system" =~ /^$system$/ 
AND "component" = 'asic_temp' 
AND "agg_func" = 'NONE') 
AND $timeFilter 
GROUP BY time($__interval)

SELECT max("value") 
FROM "sixty_days"."telemetry" 
WHERE ("system" =~ /^$system$/ 
AND "component" = 'asic_temp' 
AND "agg_func" = 'NONE') 
AND $timeFilter 
GROUP BY time($__interval)

SELECT min("value") 
FROM "sixty_days"."telemetry" 
WHERE ("system" =~ /^$system$/ 
AND "component" = 'asic_temp' 
AND "agg_func" = 'NONE') 
AND $timeFilter 
GROUP BY time($__interval)```

@Anaisdg can I get some of your insight on this one ? :slight_smile: