How to select top100 by value with original tags and tag the sum of ther rest with tag='others'?

yuyu · March 21, 2017, 7:34pm

I have a high cardinality measurement with traffic data stored every 5 minutes:

fieldKeys: bytes,packets (both integer) and tagKeys: device, asnum.

Tag “asnum” (AS number) is the main contributor to series cardinality.

It looks ok so far to keep all series for short term retention policy.
For longer periods I’d like to create another “longterm” measurement with the same tag “device” values, top(100) _asnum_s with original asnum tag values and tag the sum over all other asnum tags on device with asnum=‘others’.
By top(100) here I mean asnums with highest bytes per time slot on device.

The goal is to reduce longterm series cardinality and speed up queries without loosing longtail data traffic completely. Another goal is to get more compact dataset for grafana piechartorheatmap panel plugins.

Can this goal be achieved with influxql query or a sequence of queries?

My guess is to use “select into” queries for this.
Later on I can wrap them into continuous queries or just run batch job periodically.

Query like

select top(bytes,“asnum”,100) as ‘topbytes’, top(packets,“asnum”,100) from “measurementName” where <some_time_range> group by time(5m), “device”, “asnum”

should do the first part.

The questions I have:

How to do the second part - get sum(bytes) for remaining (not top100) asnums and assign tag asnum=‘others’ for this sum?
Are there other better approaches to the problem?

Topic		Replies	Views
Influxdb 1.2.2 subquery for top X influxdb , influxql	1	921	December 18, 2017
Top X tags over time	1	584	February 18, 2019
Flux query get top 5 of a tag influxdb , grafana , query , flux	2	494	January 6, 2023
Timeseries get top 10 results of sum query struggles InfluxDB 1 flux	7	680	August 18, 2023
Getting result of TOP() with tags influxdb	0	347	June 4, 2019

How to select top100 by value with original tags and tag the sum of ther rest with tag='others'?

Related topics