I’ve modified the query and
- collapsed the filter stages, used keep instead of drop (mostly cosmetics)
- used group() to collapse the second table set, and re-grouped later
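// offmst: per-window mean of the "offabs" field for host cbmin00y, reduced to _time / off_mst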
offmst = from(bucket: "dca")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "TfcMonitor" and
r["host"] == "cbmin00y" and
r["_field"] == "offabs")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
|> keep(columns: ["_time","_value"])
|> rename(columns: {_value: "off_mst"})
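// offend: per-window mean of "offabs" for all other hosts, merged into one table via group() and sorted by _time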
offend = from(bucket: "dca")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "TfcMonitor" and
r["host"] != "cbmin00y" and
r["_field"] == "offabs")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
|> keep(columns: ["_time","_value", "host", "oid"])
|> group()
|> sort(columns: ["_time"])
|> rename(columns: {_value: "off_end"})
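// join the two streams on _time, compute off_end - off_mst, then re-group by host/oid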
join(tables: {mst:offmst, end:offend}, on: ["_time"])
|> map(fn: (r) => ({ r with _value: r.off_end - r.off_mst}))
|> drop(columns: ["off_mst","off_end"])
|> group(columns: ["host", "oid"], mode:"by")
|> sort(columns: ["_time"])
|> yield()
The performance issue is still the same:
Window   #rec   cpu_tot   mem_tot     cpu_join
10m      2017   2.983     598445952   2.720
20m      1009   0.974     150529920   0.759
60m       337   0.227      17273856   0.100
I’ve tried this query directly in InfluxDB Explorer and see a query time that grows roughly with the square of the row count.
I’ve attached the profile logs generated with influxdb_client for detailed inspection by experts.
profile_10m.txt (9.3 KB)
profile_20m.txt (9.3 KB)
profile_60m.txt (9.3 KB)
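For anyone who wants to reproduce the numbers: operator-level statistics like the ones above can be obtained by enabling the Flux profiler on the query. A minimal sketch (assuming the standard profiler package; prepended to the query above, not necessarily exactly how the attached files were produced):

import "profiler"

// return query- and operator-level profiling tables alongside the normal results
option profiler.enabledProfilers = ["query", "operator"]

// ... the query from above follows unchanged ...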