Poor performance for join(): CPU and memory grow quadratically with row count

wfjm · January 12, 2022, 6:44pm

@MzazM,
yes, I’ve tried experimental.join() and ran into a stability issue.

experimental.join() joins on the group key and _time. That forced me to flatten the table stream with group() and regroup afterwards. Because it panics when the input starts with empty rows, I have to take care of that too. After all of this it does work, and query times are linear in row count.

So we have join() with a performance issue and experimental.join() with a stability issue.
And I repeat what I stated earlier:

Why does a join of two tables by a unique key (here _time) have a complexity of O(N*N)?
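
To make the question concrete, here is a minimal Python sketch of the two textbook strategies for joining on a unique key. It is not a description of how Flux implements join() or experimental.join(); the function names, the row layout and the work counters are assumptions made up for illustration. It only shows that a pairwise comparison costs N*N steps, while an index on the unique key needs about N steps for the same result.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def nested_loop_join(left: List[Row], right: List[Row], key: str = "_time") -> Tuple[List[Row], int]:
    """Compare every left row with every right row: len(left) * len(right) comparisons."""
    out: List[Row] = []
    comparisons = 0
    for l in left:
        for r in right:
            comparisons += 1
            if l[key] == r[key]:
                out.append({**l, **r})
    return out, comparisons


def hash_join(left: List[Row], right: List[Row], key: str = "_time") -> Tuple[List[Row], int]:
    """Index the right side by key, then probe once per left row.

    Assumes the key is unique on the right side (as _time is in the question).
    """
    index = {r[key]: r for r in right}  # one pass over the right side
    out: List[Row] = []
    lookups = 0
    for l in left:
        lookups += 1
        r = index.get(l[key])
        if r is not None:
            out.append({**l, **r})
    return out, lookups


# Hypothetical data: two tables with 1000 rows each and a unique _time per row.
n = 1000
left = [{"_time": t, "a": t * 10} for t in range(n)]
right = [{"_time": t, "b": t * 100} for t in range(n)]

rows_nl, work_nl = nested_loop_join(left, right)
rows_h, work_h = hash_join(left, right)

print(len(rows_nl), work_nl)  # row count and number of pairwise comparisons
print(len(rows_h), work_h)    # row count and number of index lookups
```

Both functions return the same joined rows; only the amount of work differs. Doubling n doubles the lookups of the indexed variant but quadruples the comparisons of the pairwise one, which matches the kind of scaling described above.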

Is experimental.join() an attempt to bypass this problem (instead of solving it)?
Maybe one of the developers can comment on this and on the strategy to resolve it.
