Unions are slow

scott · February 6, 2023, 10:45pm

Structuring this stream of data as a function rather than a variable is the Flux equivalent of what some would call a “thunk.” It delays the planning of pushdowns until the function is invoked.
For example, if I were to define the following variable, reference it, and then follow it with another pushdown-capable function:

exampleVar = 
    from(bucket: "example-bucket")
        |> range(start: -1d)
        |> filter(fn: (r) => r._measurement == "example-m")

exampleVar |> filter(fn: (r) => r._field == "example-f")

The pushdowns would be scoped to the variable and wouldn’t continue through to the next pushdown-capable function (the filter by field). Flux would therefore have to load all the data returned by the first filter into memory before processing the second filter.

By structuring the stream as a function, you delay the planning of pushdowns until after the function is invoked, which allows subsequent pushdowns to be used. For example:

exampleFn = () =>
    from(bucket: "example-bucket")
        |> range(start: -1d)
        |> filter(fn: (r) => r._measurement == "example-m")

exampleFn() |> filter(fn: (r) => r._field == "example-f")

This allows the planner to apply pushdowns past the invocation of the function, so both filters get pushed down and only the data returned by the second filter needs to be loaded into memory.
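
Since this thread is about union(), here is a minimal sketch of how the same pattern might be applied to the streams being unioned. The names baseData, a, and b and the field names "example-a" and "example-b" are hypothetical and only for illustration. It assumes the planner behaves as described above, so that each invocation of the function lets its trailing field filter be pushed down:

```flux
// Hypothetical base stream, defined as a function (thunk) rather than a variable
baseData = () =>
    from(bucket: "example-bucket")
        |> range(start: -1d)
        |> filter(fn: (r) => r._measurement == "example-m")

// Each invocation is planned on its own, so the field filter that follows it
// can be pushed down along with the range and measurement filters
a = baseData() |> filter(fn: (r) => r._field == "example-a")
b = baseData() |> filter(fn: (r) => r._field == "example-b")

// Only the already-filtered streams are combined in memory
union(tables: [a, b])
```

Each invocation still reads from storage separately, so whether this helps in a given case is worth checking against your own data.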
