Difference between universe and execute type operations

Is there any detailed documentation on the profiler? I want to know the difference, at least between a universe and execute type operation in the operator profiler. It seems that an execute type only follows an influx type, and that their durations overlap.

Hello @simon38,
This is all the documentation and tutorials on the profiler:

It also helps to understand how Flux works:

For any of you advanced Flux users wanting more detail about how Flux operates, the image below is a visualization of the way that Flux executes a query. First, you write a query, then the Flux planner checks if your query matches existing pushdown patterns. If there’s a match, the planner writes a plan for how to perform your query. Next, the Flux executor executes your query by invoking hidden operations that initiate a storage read API call. These hidden operations differ based on your pushdown pattern. For example, if we execute a query with from() |> range() |> filter() |> group() |> max(), a corresponding hidden storage operation, called ReadGroupMax, initiates data transformation on the storage side. The data is then streamed back to Flux via gRPC, where Flux can convert it to Flux tables or Annotated CSV.

~~~~~

I want to know the difference, at least between a universe and execute type operation in the operator profiler.

I think it has to do with the structure of Flux packages in the Flux repo:

Most Flux queries begin with pushdown patterns that are part of the universe directory.

It seems that an execute type only follows an influx type, and that their durations overlap.

Can you please give me an example of what you mean? Maybe with a query and a screenshot of the profiler output?

It’s possible the durations overlap because of time spent in the queue, but I’m not sure.

I’ll also

Thank you, those were very good resources. I found more information in the github docs.

The question I have about the profiler is this. If I run the following query:
from(bucket: "real_data")
|> range(start: -1h)
|> filter(fn: (r) => r["_measurement"] == "xxxxx")
|> filter(fn: (r) => r["_field"] == "xxxx")
|> stateDuration(fn: (r) => r._value > 500000, unit: 1m)
|> filter(fn: (r) => r.stateDuration > 0)
// |> map(fn: (r) => ({r with stateDuration: float(v: r.stateDuration)}))
// |> histogram(column: "stateDuration", upperBoundColumn: "le", countColumn: "_value", bins: linearBins(start: 0.0, width: 1.0, count: 30, infinity: true), normalize: false)
|> yield()

The query profiler tells me that the total duration is 1273636658 ns. The operator profiler gives me the following values for DurationSum:

stateTracking4: 681093693
merged_ReadRange7_filter2_filter3: 805930153
filter5: 350534043

The total of these values is 1837557889, more than the total duration of the query.

It seems to me that the functions operate on the values as they are streamed from the previous functions, so the cost in time is swallowed by each function, until the end when it has to actually return the data, which itself incurs a cost. This is what I’d like to understand.

An interesting thing happens when I uncomment the line for “map”. The total time goes up to 3.2 seconds, but when I uncomment the histogram line, it goes down to 2.9 seconds. The numbers change each time I run the query, but it’s usually the case that it is higher without the histogram line. It seems to me, again, that the data is passed in its entirety to map before it is processed, but because histogram significantly reduces the amount of data returned to the user, the runtime is less using histogram.

This might be too much for this thread, but if you have any insight, that would be great. A guide to writing queries with performance in mind would be useful. I’m trying to write such a thing, but without enough technical knowledge. I also need to focus on other tasks now.

Your intuition is correct here. The total duration can be less than the sum of the durations of the individual operations, because Flux executes queries as a pipeline, and some operations may execute concurrently.

The unit of work for Flux transformations is somewhat of a tricky concept, but you would not be far off if you think of it as a table with a unique group key. So for example, in your query, stateDuration will process a table from its predecessor, and pass on the transformed table to its successor—in this case, a filter. Then stateDuration can process the next table (which will have a different group key) while filter works on the table it just received.
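To make the overlap concrete, here is a small deterministic model of a streaming pipeline, written in Python rather than Flux. It is only a sketch under stated assumptions: the function name simulate_pipeline, the per-table costs, and the table count are all made up for illustration, and it models each operation as handling one table at a time. It is not how the Flux execution engine is implemented.

```python
def simulate_pipeline(stage_costs, n_tables):
    """Model a pipeline where each stage handles one table at a time.

    stage_costs: hypothetical cost (arbitrary time units) that each stage
                 spends on a single table, in pipeline order.
    n_tables:    number of tables (one per group key) flowing through.

    Returns (wall_time, duration_sums), where duration_sums[i] is the total
    busy time of stage i, loosely analogous to a per-operation DurationSum.
    """
    # finish[s] is the time at which stage s finished its most recent table.
    finish = [0] * len(stage_costs)
    for _ in range(n_tables):
        ready = 0  # all source tables are assumed available at time 0
        for s, cost in enumerate(stage_costs):
            # A stage starts a table once the table has arrived and the
            # stage itself is free.
            start = max(ready, finish[s])
            finish[s] = start + cost
            ready = finish[s]
    wall_time = finish[-1]
    duration_sums = [cost * n_tables for cost in stage_costs]
    return wall_time, duration_sums


# Three stages standing in for read+filter, stateDuration, and filter.
wall_time, duration_sums = simulate_pipeline([8, 7, 3], n_tables=10)
print("wall time:", wall_time)
print("sum of per-stage durations:", sum(duration_sums))
```

In this toy model the stages overlap in time, so the sum of the per-stage durations comes out larger than the wall time, which is the same shape as the numbers reported by the two profilers.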

As to why you see a big increase when you uncomment “map”: this operation is special. It’s possible that a call to “map” will alter the group key, and the execution engine only wants to release a table to its successor after it has seen all the rows for that group key that will be produced by “map”. Unfortunately, this means that “map” cannot release any of the tables it will produce to its successor until it has seen every input table that will be produced by its predecessor.

As a result of this behavior of “map” the concurrency of the query will be reduced, and the total duration will go up.
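The effect of a stage that holds back its output can be sketched with the same kind of toy model, again in Python and again purely illustrative. The function name simulate, the costs, and the extra downstream stage are assumptions, and the "blocking" flag is just a stand-in for the behavior of “map” described above, not the engine's real mechanism.

```python
def simulate(stages, n_tables):
    """Return the wall time of a pipeline of (per_table_cost, blocking) stages.

    A blocking stage still processes tables as they arrive, but releases
    nothing to its successor until it has processed every input table.
    """
    # available[i] is the time at which table i reaches the current stage.
    available = [0] * n_tables
    for cost, blocking in stages:
        t = 0
        done = []
        for arrival in available:
            # Start when the table has arrived and the stage is free.
            t = max(t, arrival) + cost
            done.append(t)
        if blocking:
            # Everything is released only after the last table is finished.
            done = [t] * n_tables
        available = done
    return available[-1]


# Hypothetical costs: read, stateDuration, filter, map, downstream consumer.
streaming = [(8, False), (7, False), (3, False), (5, False), (4, False)]
with_blocking_map = [(8, False), (7, False), (3, False), (5, True), (4, False)]

print("map streams its output:", simulate(streaming, 10))
print("map blocks its output: ", simulate(with_blocking_map, 10))
```

With identical per-table costs, the only difference between the two runs is whether the fourth stage releases tables as it goes. In the blocking case the downstream stage cannot overlap with anything upstream, so the wall time is longer.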

As to why uncommenting “histogram” makes the result smaller, I think again that you are right. The histogram will reduce the data that needs to be serialized back to the client, and we reap some performance benefit as a result.

Thank you both for your answers. This was all very useful and helped me clarify a lot of what I’m seeing.