Limit / top ignored in flux queries

Fabian_Schneider · November 28, 2023, 1:24pm

Hello,

I’m using InfluxDB (v2.7.4, git: 19e5c0e1b7) as storage for my data crawlers. Yesterday I noticed a large performance impact on some jobs and dug deeper.

I reduced the query to the following proof of concept:

from(bucket:"statistics")
 |> range(start: -1y)
 |> filter(fn: (r) => r["_measurement"] == "app_PlaytimeForever")
 |> top(n: 250, columns: ["_value"])
 |> limit(n: 250)

The intent: return the top 250 records of app_PlaytimeForever, and only those 250. As I understand it, limit() is not even required, because the documentation says:

top() sorts each input table by specified columns and keeps the top n records in each table
https://docs.influxdata.com/flux/v0/stdlib/universe/top/
limit() returns the first n rows after the specified offset from each input table.
https://docs.influxdata.com/flux/v0/stdlib/universe/limit/

Nevertheless, whether I use limit(), top(), both, or even remove the filter, I get all ~400k results, both in the Data Explorer and when executing the query manually.

Searching for this topic did not get me any further, so I’m asking for your help: which part of the Flux language have I misunderstood?

Kind regards,
Fabian

scott · November 28, 2023, 5:38pm

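@Fabian_Schneider By default, from() |> range() |> filter() returns data grouped by _measurement, _field, and each tag, so every unique combination of measurement, field, and tag values is its own table in your results. top() and limit() operate on each input table separately, so n: 250 keeps up to 250 rows per table rather than 250 rows overall. If most of your tables hold fewer than 250 rows, nearly every row survives.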

What you can do is ungroup all your tables into a single table before you apply top():

from(bucket:"statistics")
    |> range(start: -1y)
    |> filter(fn: (r) => r["_measurement"] == "app_PlaytimeForever")
    |> group()
    |> top(n: 250, columns: ["_value"])
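
To see the per-table behaviour in isolation, here is a small sketch that needs no bucket. It builds a few rows with array.from() and runs top() once on grouped tables and once after group(). The host tag and the values are made up for illustration; only the functions themselves come from the Flux standard library.

```flux
import "array"

// Hypothetical sample rows; "host" stands in for any tag in your data.
data = array.from(
    rows: [
        {host: "a", _value: 5},
        {host: "a", _value: 9},
        {host: "b", _value: 7},
        {host: "b", _value: 1},
    ],
)

// Grouped by tag, similar to what from() |> range() |> filter() returns:
// top() keeps n rows in EACH table, so two tables yield two rows.
data
    |> group(columns: ["host"])
    |> top(n: 1, columns: ["_value"])
    |> yield(name: "per_table")

// Merged into a single table first: top() keeps n rows overall.
data
    |> group()
    |> top(n: 1, columns: ["_value"])
    |> yield(name: "single_table")
```

The per_table result should contain one row per host, while single_table should contain a single row with the overall highest value.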

Fabian_Schneider · November 28, 2023, 5:55pm

Thank you, adding group() did indeed solve the issue.
