InfluxQL Select top N per timeperiod

I am trying to emulate a bit of functionality of some of the better known SAAS metric visualization providers using InfluxDB and Grafana.

I am capturing process stats and inserting them into InfluxDB, I have pasted the Influx line format I am using and a simple aggregation that give me the rss of all the process over a period of time. Of course I do not want to graph all of the process, I only want to graph the top 5 or 10 process for each aggregation period.

I have done a little bit of google fu and I some people are using the TOP function and sub query to get the top N of something from the overall period, but what I want to get the the top N of something from each period.

My inclination is that I am going to have to use continuous queries to sample each collection period take the top N of the metric I care about and place that in a new measurement that I graph. I would have to create different measurements foe each thing I want to graph, rss, cputime, pcpu, 


Can any suggest a way/ single query that I can use that will give me what I want.

All/Any help is much appreciated.

pslist,pid=579,user=root,comm=rpc.gssd cputime=11i,pcpu=0.0,pmem=0.0,rss=3992i,vsz=31388i
pslist,pid=584,user=root,comm=systemd-logind cputime=195i,pcpu=0.0,pmem=0.0,rss=3380i,vsz=22988i
pslist,pid=586,user=root,comm=accounts-daemon cputime=2187i,pcpu=0.0,pmem=0.0,rss=7352i,vsz=275284i
pslist,pid=643,user=root,comm=rngd cputime=1363i,pcpu=0.0,pmem=0.0,rss=140i,vsz=9512i
pslist,pid=688,user=root,comm=irqbalance cputime=530i,pcpu=0.0,pmem=0.0,rss=264i,vsz=19568i
pslist,pid=733,user=root,comm=agetty cputime=0i,pcpu=0.0,pmem=0.0,rss=1844i,vsz=9716i
pslist,pid=736,user=root,comm=agetty cputime=0i,pcpu=0.0,pmem=0.0,rss=1604i,vsz=9900i
pslist,pid=767,user=ntp,comm=ntpd cputime=1126i,pcpu=0.0,pmem=0.0,rss=5256i,vsz=106064i
pslist,pid=803,user=nslcd,comm=nslcd cputime=2060i,pcpu=0.0,pmem=0.0,rss=7348i,vsz=453680i
pslist,pid=809,user=root,comm=cron cputime=56i,pcpu=0.0,pmem=0.0,rss=2704i,vsz=22940i
pslist,pid=813,user=daemon,comm=atd cputime=0i,pcpu=0.0,pmem=0.0,rss=1932i,vsz=19716i
pslist,pid=960,user=root,comm=master cputime=85i,pcpu=0.0,pmem=0.0,rss=4484i,vsz=61228i
pslist,pid=975,user=postfix,comm=qmgr cputime=19i,pcpu=0.0,pmem=0.0,rss=4420i,vsz=65444i
pslist,pid=2140,user=root,comm=systemd-journal cputime=1299i,pcpu=0.0,pmem=1.8,rss=154380i,vsz=248928i
pslist,pid=4410,user=root,comm=kworker/1:2H cputime=1i,pcpu=0.0,pmem=0.0,rss=0i,vsz=0i
pslist,pid=4411,user=root,comm=kworker/2:2H cputime=2i,pcpu=0.0,pmem=0.0,rss=0i,vsz=0i
pslist,pid=4412,user=root,comm=kworker/0:2H cputime=3i,pcpu=0.0,pmem=0.0,rss=0i,vsz=0i
pslist,pid=4413,user=root,comm=kworker/3:2H cputime=3i,pcpu=0.0,pmem=0.0,rss=0i,vsz=0i
pslist,pid=6874,user=root,comm=sshd cputime=50i,pcpu=0.0,pmem=0.0,rss=6208i,vsz=61340i
pslist,pid=7991,user=nagios,comm=nrpe cputime=148i,pcpu=0.0,pmem=0.0,rss=2504i,vsz=19788i
pslist,pid=9008,user=statd,comm=rpc.statd cputime=1i,pcpu=0.0,pmem=0.0,rss=2892i,vsz=35368i
pslist,pid=9009,user=root,comm=rpcbind cputime=36i,pcpu=0.0,pmem=0.0,rss=3440i,vsz=47628i

SELECT mean(rss) AS rss FROM "30days"."pslist" WHERE time > now() - 5m AND "host" = 'xxxxxxxx' GROUP BY time(1m),"comm" SLIMIT 5

name: pslist
tags: comm=smartd
time                rss
----                ---
1518792300000000000 3548
1518793200000000000 3548
1518794100000000000 3548
1518795000000000000 3548


name: pslist
tags: comm=sshd
time                rss
----                ---
1518792300000000000 3288
1518793200000000000 3288
1518794100000000000 3288
1518795000000000000 3288


name: pslist
tags: comm=systemd
time                rss
----                ---
1518792300000000000 4824.8
1518793200000000000 4824.8
1518794100000000000 4824.8
1518795000000000000 4824.354838709677

name: pslist
tags: comm=systemd-journal
time                rss
----                ---
1518792300000000000 64906.4406779661
1518793200000000000 63610.2
1518794100000000000 63362.73333333333
1518795000000000000 64452

name: pslist
tags: comm=systemd-logind
time                rss
----                ---
1518792300000000000 3532
1518793200000000000 3532
1518794100000000000 3532
1518795000000000000 3532


name: pslist
tags: comm=systemd-udevd
time                rss
----                ---
1518792300000000000 3336
1518793200000000000 3336
1518794100000000000 3336
1518795000000000000 3336

image

1 Like

You should be able to get this by using a nested influx query. e.g.

select top(mean,5) from (SELECT mean(rss) AS rss FROM “30days”.“pslist” WHERE time > now() - 5m AND “host” = ‘xxxxxxxx’ GROUP BY time(1m),“comm” ) WHERE time > now() - 5m AND “host” = ‘xxxxxxxx’ GROUP BY time(1m),“comm”

Thanks for your response.

I have tried your query and it does not quite give me what I want. It seems that the group by “comm” in the outer select is trumping the top() selector and returning all of the comm and not just the top 5.

If the outer query is restricted by comms only does it give the desired results?

select top(mean,5) from (SELECT mean(rss) AS rss FROM “30days”.“pslist” WHERE time > now() - 5m AND “host” = ‘xxxxxxxx’ GROUP BY time(1m),“comm” ) GROUP BY comms

I think what I want to do may not be possible in a single query.

The 1st query takes my series and give me ALL of the process grouped by 1m interval by taking the mean of the smaller intervals. In the case of my query I am getting 6 points for each process

The sub query that you provided takes these top 6 points and returns the top 5 points for each process. So I am not getting the top 5 process overall.

What I want to it is take each time period bucket and get the the top 5 process by rss. The idea being that at any given time I can see on my system the top N process that were using rss or memory or pcpu. Like the image that I pasted in the 1st post.

I am beginning the think that I need some type of continuous query or kapacitor job that takes the data from the original measurement and computes what I want into a new one.

> SELECT mean(rss) FROM "30days"."pslist" WHERE time > now() - 5m AND "host" = 'vnet001.sv3' AND "comm" =~ /^[ijr]/ GROUP BY time(1m),"comm"
name: pslist
tags: comm=influxd
time                mean
----                ----
1518911880000000000 
1518911940000000000 1754619
1518912000000000000 1682508
1518912060000000000 1782947
1518912120000000000 1927172
1518912180000000000 2054742

name: pslist
tags: comm=iotop
time                mean
----                ----
1518911880000000000 
1518911940000000000 9090
1518912000000000000 8636
1518912060000000000 8595
1518912120000000000 8593
1518912180000000000 8595

name: pslist
tags: comm=irqbalance
time                mean
----                ----
1518911880000000000 
1518911940000000000 2364
1518912000000000000 2364
1518912060000000000 2364
1518912120000000000 2364
1518912180000000000 2364

name: pslist
tags: comm=jbd2/md2-8
time                mean
----                ----
1518911880000000000 
1518911940000000000 0
1518912000000000000 0
1518912060000000000 0
1518912120000000000 0
1518912180000000000 0

name: pslist
tags: comm=rcu_sched
time                mean
----                ----
1518911880000000000 
1518911940000000000 0
1518912000000000000 0
1518912060000000000 0
1518912120000000000 0
1518912180000000000 0

name: pslist
tags: comm=rngd
time                mean
----                ----
1518911880000000000 
1518911940000000000 1108
1518912000000000000 1108
1518912060000000000 1108
1518912120000000000 1108
1518912180000000000 1108

name: pslist
tags: comm=rpc.gssd
time                mean
----                ----
1518911880000000000 
1518911940000000000 2360
1518912000000000000 2360
1518912060000000000 2360
1518912120000000000 2360
1518912180000000000 2360

name: pslist
tags: comm=rpc.statd
time                mean
----                ----
1518911880000000000 
1518911940000000000 2352
1518912000000000000 2352
1518912060000000000 2352
1518912120000000000 2352
1518912180000000000 2352

name: pslist
tags: comm=rpcbind
time                mean
----                ----
1518911880000000000 
1518911940000000000 3936
1518912000000000000 3936
1518912060000000000 3936
1518912120000000000 3936
1518912180000000000 3936

name: pslist
tags: comm=rsyslogd
time                mean
----                ----
1518911880000000000 
1518911940000000000 59440
1518912000000000000 56452
1518912060000000000 56884
1518912120000000000 55030
1518912180000000000 55374


SELECT TOP(mean,5) FROM (SELECT mean(rss) FROM "30days"."pslist" WHERE time > now() - 5m AND "host" = 'vnet001.sv3' AND "comm" =~ /^[ijr]/ GROUP BY time(1m),"comm") GROUP BY "comm"
name: pslist
tags: comm=influxd
time                top
----                ---
1518912060000000000 1782947
1518912120000000000 1927172
1518912180000000000 2054742
1518912240000000000 2059850
1518912300000000000 2129757.3333333335

name: pslist
tags: comm=iotop
time                top
----                ---
1518912060000000000 8595
1518912120000000000 8593
1518912180000000000 8595
1518912240000000000 7803
1518912300000000000 8582.666666666666

name: pslist
tags: comm=irqbalance
time                top
----                ---
1518912060000000000 2364
1518912120000000000 2364
1518912180000000000 2364
1518912240000000000 2364
1518912300000000000 2364

name: pslist
tags: comm=jbd2/md2-8
time                top
----                ---
1518912060000000000 0
1518912120000000000 0
1518912180000000000 0
1518912240000000000 0
1518912300000000000 0

name: pslist
tags: comm=rcu_sched
time                top
----                ---
1518912060000000000 0
1518912120000000000 0
1518912180000000000 0
1518912240000000000 0
1518912300000000000 0

name: pslist
tags: comm=rngd
time                top
----                ---
1518912060000000000 1108
1518912120000000000 1108
1518912180000000000 1108
1518912240000000000 1108
1518912300000000000 1108

name: pslist
tags: comm=rpc.gssd
time                top
----                ---
1518912060000000000 2360
1518912120000000000 2360
1518912180000000000 2360
1518912240000000000 2360
1518912300000000000 2360

name: pslist
tags: comm=rpc.statd
time                top
----                ---
1518912060000000000 2352
1518912120000000000 2352
1518912180000000000 2352
1518912240000000000 2352
1518912300000000000 2352

name: pslist
tags: comm=rpcbind
time                top
----                ---
1518912060000000000 3936
1518912120000000000 3936
1518912180000000000 3936
1518912240000000000 3936
1518912300000000000 3936

name: pslist
tags: comm=rsyslogd
time                top
----                ---
1518912060000000000 56884
1518912120000000000 55030
1518912180000000000 55374
1518912240000000000 55628
1518912300000000000 55628

I believe you are looking for a column based values rather than grouped values.

As far as reducing the number of entries returned you can simply switch the “Top” function with “Max”, it will reduce it to single entry.

@mselby, do you find any solution for this problem? thanks.