Migrating InfluxQL with multiple aggregations to Flux

Hi, we are in the midst of migrating a query that generates a data export.

Our data is made up of a measurement that represents the customer, a single tag that represents the device the data comes from, and a number of fields (in the tens) that represent various sensor and cumulative values.
An example would be:

11EB768FB98C4270B6D16BAEE17CF0DD,deviceId=78767 EM1_Energy=124642,EM1_Energy_Tariff1=0,EM1_Energy_Tariff2=0,EM1_Flow=361,EM1_Flow_Temperature=64.18,EM1_Power=8.1,EM1_Return_Temperature=44.44,EM1_Serial=224212,EM1_Uptime=3192,EM1_Volume=4237,HC1_FlowTemperature=34.64,HC1_FlowTemperature_Ref=11,HC1_ReturnTemperature=30.96,HC1_ReturnTemperature_Extra=192,HC1_ReturnTemperature_Ref=40,HC1_RoomTemperature=192,HC1_RoomTemperature_Limit=18,HC2_FlowTemperature=49.3,HC2_FlowTemperature_Ref=52,HC2_ReturnTemperature=192,HC2_ReturnTemperature_Extra=0.01,HC2_ReturnTemperature_Ref=50,Outdoor_Temperature=27.64,Primary1_FlowTemperature=192,Primary1_ReturnTemperature=192 1625492641215000000

Over this series, we run a query that performs a variety of aggregations over several of the fields for a subset of devices. The aggregations can be min, max, first, last, mean, and delta (the value of a field at the end of the period minus its value at the beginning of the period).
The query in InfluxQL could look like:

select Last("EM1_Energy") as "0",Mean("Outdoor_Temperature") as "1",(Last("EM1_Energy") - First("EM1_Energy")) AS "2" from "11111111111111111111111111111111" where ("deviceId" = '179474' or "deviceId" = '237129' or "deviceId" = '237140' or "deviceId" = '254415' or "deviceId" = '323126' or "deviceId" = '323127' or "deviceId" = '5312' or "deviceId" = '53753' or "deviceId" = '999000037') and time >= 1617228000000ms and time <= 1619819999999ms group by "deviceId"

What we could come up with is this, arguably much more complex, Flux query to get the same data:

doRename = (t=<-, name= "") => t
        |> map(fn: (r) => ({ r with _field: name }))

data = from(bucket: "development_readouts/high")
        |> range(start: time(v: "2021-03-31T22:00:00.000Z"), stop: time(v: "2021-04-30T21:59:59.999Z"))
        |> filter(fn: (r) => r._measurement == "11111111111111111111111111111111")
        |> filter(fn: (r) => r.deviceId == "179474" or r.deviceId == "237129" or r.deviceId == "237140" or r.deviceId == "254415" or r.deviceId == "323126" or r.deviceId == "323127" or r.deviceId == "5312" or r.deviceId == "53753" or r.deviceId == "999000037")

t0 = data
        |> filter(fn: (r) => r._field == "EM1_Energy")
        |> last()
        |> doRename(name: "0")
t1 = data
        |> filter(fn: (r) => r._field == "Outdoor_Temperature")
        |> mean()
        |> doRename(name: "1")
priv2_1 = data
        |> filter(fn: (r) => r._field == "EM1_Energy")
        |> last()
priv2_2 = data
        |> filter(fn: (r) => r._field == "EM1_Energy")
        |> first()
t2 = union(tables: [priv2_2, priv2_1])
        |> difference(nonNegative: false, columns: ["_value"])
        |> doRename(name: "2")

union(tables: [t0, t1, t2])
        |> pivot(
                rowKey:["deviceId"],
                columnKey: ["_field"],
                valueColumn: "_value"
        )
        |> group(columns: ["deviceId"], mode:"by")
        |> drop(columns: ["_start", "_stop", "_measurement"])
        |> yield(name: "default")

Even though we get the same data from both queries, the performance of the Flux query is far worse and varies heavily with the number of columns and the kind of aggregation.
Both queries (InfluxQL and Flux) are executed against our InfluxCloud instance and measured from our application:

  • 1 column:
    InfluxQL 97ms (using MEAN agg) 93ms (using DELTA agg)
    Flux 93ms (using MEAN agg) 997ms (using DELTA agg)
  • 2 columns:
    InfluxQL 100.5ms (using MEAN+FIRST agg) 108ms (using MEAN+DELTA agg)
    Flux 948.1ms (using MEAN+FIRST agg) 1382ms (using MEAN+DELTA agg)
  • 3 columns:
    InfluxQL 116.25ms (using MEAN+FIRST+MAX agg) 98.39ms (using MEAN+FIRST+DELTA agg)
    Flux 1325ms (using MEAN+FIRST+MAX agg) 1801ms (using MEAN+FIRST+DELTA agg)

As one can see, the very moment we have more than one aggregation, the Flux query gives us noticeably worse execution times.
Flux being a “new”, more capable language, one can expect a little overhead, but 10x for such a small query is not acceptable.

Are we doing anything obviously wrong?

Thanks in advance for all the help

Hello @danielgonnet,
First, I notice that you created a custom function called doRename, which does the same thing as the set() function.
You might want to try using that instead to improve performance.

Also, I would try reusing as many variables as possible:

priv2_1 = data
        |> filter(fn: (r) => r._field == "EM1_Energy")
        |> last()

t0 = priv2_1
        |> set(key: "_field", value: "0")

As for the first union, you might get better performance by using the difference() and sum() functions:

t2 = data
        |> filter(fn: (r) => r._field == "EM1_Energy")
        |> difference(nonNegative: false, columns: ["_value"])
        |> sum(column: "_value")
        |> set(key: "_field", value: "2")

This blog could be useful to you:

Please let me know if any of the changes help.

I will give this a try and see if that improves the performance.
Thanks for the tip. I’ll let you know how it goes.


@danielgonnet,
Please give the Flux profiler a look (mentioned in that blog); I’m hoping it can help you identify the slow parts of your query so you can try alternative approaches.
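For reference, enabling the profiler only takes a couple of lines at the top of the query; the timing data comes back as extra tables alongside the normal results:

import "profiler"

// Return per-query and per-operator timing tables with the results
option profiler.enabledProfilers = ["query", "operator"]

The "operator" profiler is the one that breaks execution time down by individual transformation (filter, pivot, union, etc.), which should show where the time is going.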

Hi again,

I was able to perform some of the changes you suggested (and more) and have a look at the profiling output of the query.

I saw little to no improvement by using set instead of our function (of course, we will be using set from now on).
With the change you suggested for calculating deltas we did not see much improvement either; the query does look cleaner, though, so we are adopting it as well.
Some improvement came from the reuse of fields and upfront filtering of fields: in the part of the query in which we range and filter measurement and tags, we filter by the set of fields we are going to query. Then, for each different field, we store the data in a variable that will be reused when performing the aggregation. This makes quite a difference when performing different aggregations on the same field, but we cannot tell if that is going to be the majority of the queries.
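As a sketch, the restructuring described above looks roughly like this (bucket, device list, and field names abbreviated from the earlier query):

data = from(bucket: "development_readouts/high")
        |> range(start: 2021-03-31T22:00:00.000Z, stop: 2021-04-30T21:59:59.999Z)
        |> filter(fn: (r) => r._measurement == "11111111111111111111111111111111")
        |> filter(fn: (r) => r.deviceId == "179474" or r.deviceId == "237129")
        // Filter down, up front, to only the fields the query will aggregate
        |> filter(fn: (r) => r._field == "EM1_Energy" or r._field == "Outdoor_Temperature")

// One variable per field, reused by every aggregation on that field
energy = data
        |> filter(fn: (r) => r._field == "EM1_Energy")

t0 = energy
        |> last()
        |> set(key: "_field", value: "0")

t2 = energy
        |> difference(nonNegative: false, columns: ["_value"])
        |> sum(column: "_value")
        |> set(key: "_field", value: "2")

This way the measurement/tag/field filtering is expressed once, and each per-field variable feeds every aggregation performed on that field.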

Looking at the profiler, we could see that quite a bit of time is spent in the union, pivot, and group, which get the data into a shape our program can work with comfortably.
Of course, removing that part made our query run faster (though nowhere near as fast as our previous InfluxQL query), at the expense of our program having to deal with data that is harder to work with and having to do some expensive preprocessing.

So, after spending many hours trying to migrate our simple InfluxQL query, we are left with one that is more sensitive to the number of columns than the old query, performs at best 3x worse, and returns data in a format that is very awkward to work with and requires manipulation on the caller side.
On top of that, it is several times the size of the original query (not an overly interesting metric) and much more difficult to generate from our program.

Of course, I could still use the compatibility endpoint.
Unfortunately, since it is implemented as a GET request, we have found that some of our customers have such long lists of devices to filter that the query becomes too big to execute. If it were implemented as a POST, the way it is in v1.x, we would not be forced to execute multiple queries in those cases.

Even though I can see the power that Flux may bring to more complex scenarios, it makes simple scenarios like ours a living nightmare, which would be easier to accept if performance were better, which is not the case at all.

Very disappointed and not positive that we can make this query work as it should.

Sorry for the rant and thanks for your help.

Hello @danielgonnet,
I’m sorry that the performance isn’t what you expect yet and that my suggestions didn’t help.
The Flux team is continuously making performance improvements; additionally, v2 will support SQL and InfluxQL out of the box with IOx in the future.
I encourage you to take a look at this blog if you haven’t already:

and to attend these tech talks for an opportunity to ask the engineers directly about the challenges you’re facing now.

Finally, I’ve shared your question with the Flux team, but you’re welcome to comment on the issue as well:


Don’t be sorry.

I know they do make performance improvements and for that I am thankful.

However, Flux is not a beta or “an alternative” anymore. It is the current default and best-supported way of querying your cloud offering. And it is hard to accept such an abysmal difference in performance against a language that is “on its way out” (and you are very welcome to call this one out by saying that the language will get new features in the future) after a long time in beta and a nice production run.

I could understand a duality (soon to be a trio) if there were performance parity and equal library support. There is neither.
The new JavaScript libraries are quite nice, yet they only support Flux. In order to use the compatibility endpoint (which, surprisingly, offers hands-down the best performance), I am forced to use an older library that does not have the features or the support of the new one. On top of that, there is the GET-only implementation of the query endpoint and the annoyance of having to create DBRP mappings if InfluxQL needs to be used.

Flux may be nicer for other (possibly the majority of) users, but it has really made interfacing with our time series data much worse, and for that I am not thankful.

Anyhow, I will keep trying to modernize my queries and will switch to Flux when performance is in the same ballpark.
Thanks for trying to get us there.

@danielgonnet,
Sure thing. I hope the Flux team can think of something I’m missing. I think your concern brings up good questions about whether the data should be pivoted by field by default and whether the UI should support visualizing multiple columns simultaneously.