I have two tables: T1 and T2. I pivot them, do some additional filtering, and then experimental.unpivot() them so that they have the same schema. Then I apply |> group() to both of them. The “_value” column contains int64s exclusively. Producing these two tables takes a few seconds, and they contain tens of entries. I cannot union them. Or at least, I cannot within 100 seconds, which is when I restarted the InfluxDB service. What am I doing wrong?
I see a few posts on union performance over the last few years with similar experiences but no resolutions.
@Arjun I’ve never seen this behavior in union(), but I’m not discounting it either. Instead of unioning the streams together and outputting a single yield, what if you just yielded both streams? (I also added some optimizations here.)
I need the union because I have to merge and sort the tables by time. However, using your optimized query, removing the yields and adding a union at the end, it works! Thanks! …But I can’t figure out what is functionally different between your query and mine.
Structuring this stream of data as a function rather than a variable is the Flux equivalent of what some would call a “thunk.” It delays the planning of push-down functions until the function is invoked.
For example, if I were to define the following variable, call that variable, and then follow it with other push-down-capable functions:
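A minimal sketch of that variable pattern, assuming a hypothetical bucket, measurement, and field (none of these names come from the original query):

```flux
// Hypothetical names, for illustration only.
// The stream is bound to a variable, so its push-downs are planned here.
data = from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement")

// This second filter follows the variable rather than the source.
data
    |> filter(fn: (r) => r._field == "example-field")
```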
The push-downs would be scoped to the variable and wouldn’t carry through to the next push-down-capable function (the filter by field). Flux would therefore have to load all the data returned by the first filter into memory before processing the second filter.
By structuring the stream as a function, you delay the planning of push-downs until after the function is invoked, which allows subsequent push-downs to be used. For example:
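A minimal sketch of the function form, again with hypothetical bucket, measurement, and field names. The only change from the variable form is the `() =>` in the definition and the `()` at the call site:

```flux
// Hypothetical names, for illustration only.
// The stream is wrapped in a function, so nothing is planned until it is called.
data = () => from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement")

// Invoking the function here lets the following filter be planned with it.
data()
    |> filter(fn: (r) => r._field == "example-field")
```

The same shape should carry over to the two-table case from the question. Assuming `t1` and `t2` are hypothetical functions defined this way, the merge and sort by time might look like:

```flux
// t1 and t2 are assumed to be functions that each return a stream of tables.
union(tables: [t1(), t2()])
    |> group()
    |> sort(columns: ["_time"])
```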