Procstat top 10


#1

I’m having an issue aggregating multiple processes into one metric.
I’m using an expression to match the running process, and it’s returning multiple instances.
I could pull the procstat by the PID, but I would rather not.

I want to get the sum of the metrics grouped by the ‘process_name’ tag, so that the output looks something like this:

procstat,host=FatRabbit,process_name=dansguardian-av memory_rss=11567104i,cpu_time_user=0 1543284200000000000

Currently, it’s returning multiple values:

procstat,host=FatRabbit,process_name=dansguardian-av memory_rss=11567104i,cpu_time_user=0 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0.01,memory_rss=11804672i 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av memory_rss=11812864i,cpu_time_user=0.02 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av memory_rss=11812864i,cpu_time_user=0.02 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av memory_rss=11767808i,cpu_time_user=0.01 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0,memory_rss=11575296i 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0.02,memory_rss=11812864i 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av memory_rss=11812864i,cpu_time_user=0.01 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0.02,memory_rss=11816960i 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0.01,memory_rss=11821056i 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av memory_rss=11812864i,cpu_time_user=0.01 1543284200000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_usage=0,memory_rss=11812864i,cpu_time_user=0.02 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0.02,memory_rss=11816960i,cpu_usage=0 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0,cpu_usage=0,memory_rss=11575296i 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_usage=0,memory_rss=11821056i,cpu_time_user=0.01 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0.01,memory_rss=11812864i,cpu_usage=0 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_time_user=0,cpu_usage=0,memory_rss=11567104i 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_usage=0,memory_rss=11812864i,cpu_time_user=0.02 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_usage=0,memory_rss=11804672i,cpu_time_user=0.01 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_usage=0,cpu_time_user=0.01,memory_rss=11767808i 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av memory_rss=11812864i,cpu_time_user=0.01,cpu_usage=0 1543284210000000000
procstat,host=FatRabbit,process_name=dansguardian-av cpu_usage=0,cpu_time_user=0.02,memory_rss=11812864i 1543284210000000000

This is my Telegraf configuration:

[[inputs.procstat]]
  pattern = "."
  tagexclude = ["pattern", "user"]
  pid_tag = false
  fieldpass = [
    "cpu_time_user",
    "cpu_usage",
    "memory_rss",
  ]

[[processors.converter]]
  namepass = ["*procstat*"]
  [processors.converter.fields]
    # Convert memory_rss from an integer to a float.
    float = [
      "memory_rss",
    ]

[[processors.topk]]
  namepass = ["*procstat*"]
  k = 10
  group_by = ['process_name']
  fields = [
      "cpu_time_user",
      "cpu_usage",
      "memory_rss",
  ]
  aggregation = "sum"


#2

I think you probably want to replace the topk processor with the basicstats aggregator. The topk processor limits which metrics are output to only the top values, while the basicstats aggregator will actually sum up the values for you.
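
As a rough sketch of that suggestion, the block below could stand in for the [[processors.topk]] section. The period value is an assumption (it is set to match the 10-second spacing of the timestamps in the sample output, so that each sum covers a single collection), and the summed fields come out with a "_sum" suffix rather than their original names. Series are grouped by measurement name and tag set, so this relies on pid_tag = false and the excluded tags leaving only host and process_name.

```toml
# Sketch only, not a tested drop-in: sums procstat fields per series.
[[aggregators.basicstats]]
  # Only aggregate the procstat measurement.
  namepass = ["procstat"]
  # Assumed to equal the agent's collection interval (10s in the sample output).
  period = "10s"
  # Emit only the aggregated metrics, not the per-process originals.
  drop_original = true
  # Compute sums only; fields are emitted as e.g. memory_rss_sum, cpu_usage_sum.
  stats = ["sum"]
```

Note that this only sums; it does not limit the output to the top 10 process names.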


#3

I made a script to pull the data a different way, using the exec input.

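#!/bin/bash
# Timestamp in nanoseconds: epoch seconds padded with nine zeros.
TIME=$(date +%s)
TIME="${TIME}000000000"
# Sum %CPU ($3) and %MEM ($4) per command ($11); NR>1 skips the ps header row.
# "Z" stands in for the space between the tag set and the field set.
RESULT=$(ps aux | awk 'NR>1 {cpu[$11]+=$3; mem[$11]+=$4} END{for (i in cpu) print i,"Zcpup="cpu[i],",memp="mem[i]}' | sed "s/ //g")
for r in $RESULT
do
 echo "ps,process_name=$r $TIME" | sed "s/Z/ /g"
done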

The code could probably be cleaner, but it works for what I need. It also has the benefit of rejecting any process that is using 0.1% CPU or memory utilization.
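
For anyone wanting to try the same approach, hooking a script like the one above into Telegraf through the exec input might look roughly like this. The script path is a placeholder and the timeout is an arbitrary choice; neither comes from my actual setup.

```toml
# Sketch only: the script path below is hypothetical.
[[inputs.exec]]
  # Script that prints one line-protocol record per process name.
  commands = ["/usr/local/bin/ps_by_name.sh"]
  # Abort the script if it runs longer than this.
  timeout = "5s"
  # The script already emits InfluxDB line protocol, so no extra parsing is needed.
  data_format = "influx"
```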


#4

Here is a better script

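#!/bin/sh
# Timestamp in nanoseconds: epoch seconds padded with nine zeros.
TIME=$(date +%s)
TIME="${TIME}000000000"
# Sum %CPU ($3), %MEM ($4) and RSS ($6) per command ($11); NR>1 skips the ps header row.
# "Z" stands in for the space between the tag set and the field set.
RESULT=$(ps aux | awk 'NR>1 {cpu[$11]+=$3; mem[$11]+=$4; memrss[$11]+=$6} END{for (i in cpu) print i,"Zcpup="cpu[i],",memp="mem[i],",mem_rss="memrss[i]}' | sed "s/ //g")
for r in $RESULT
do
 echo "ps,process_name=$r $TIME" | sed "s/Z/ /g"
done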