Poor performance for join(): CPU and memory grow quadratically with row count

@MzazM,
yes, I’ve tried experimental.join(), and ran into a stability issue.

experimental.join() joins on the group key and _time. That forced me to flatten the table stream with group() and regroup it afterwards. Because it panics when the input starts with empty rows, I have to handle that case too. With all of that in place it does work, and query times are linear in row count.

So we have join() with a performance issue and experimental.join() with a stability issue.
And I repeat what I asked earlier:

Why does a join of two tables by a unique key (here _time) have a complexity of O(N*N)?
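
To make the question concrete, here is a minimal Python sketch (not Flux) of the two ways a join on a unique key can behave. It says nothing about how join() or experimental.join() are actually implemented; the function names and the sample rows are made up for illustration. The nested-loop version compares every pair of rows, while the hash version indexes one side by its key and does one lookup per row on the other side.

```python
def nested_loop_join(left, right):
    """Join (time, value) rows by comparing every pair: N * M comparisons."""
    out = []
    comparisons = 0
    for lt, lv in left:
        for rt, rv in right:
            comparisons += 1
            if lt == rt:
                out.append((lt, lv, rv))
    return out, comparisons


def hash_join(left, right):
    """Join (time, value) rows via a dict on the unique key: about N + M steps."""
    index = {rt: rv for rt, rv in right}  # assumes the key is unique in `right`
    return [(lt, lv, index[lt]) for lt, lv in left if lt in index]


# Hypothetical sample data: two series sharing the same unique timestamps.
n = 200
left = [(t, t * 2) for t in range(n)]
right = [(t, t * 3) for t in range(n)]

slow_rows, comparisons = nested_loop_join(left, right)
fast_rows = hash_join(left, right)
```

Both functions return the same rows, but the first needs n * n comparisons and the second only one dictionary build plus n lookups, which is the linear behaviour I would expect for a unique key.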

Is experimental.join() an attempt to bypass this problem (instead of solving it)?
Maybe one of the developers can comment on this and on the strategy to resolve it.
