I am currently evaluating Telegraf as a collector for our monitoring. It works great (with InfluxDB), but its steadily rising CPU usage worries me: it rises by 1% every two days, and there is no end in sight.
2017-03-31: 0.3% after a restart (configuration change of the flush interval)
RAM usage looks okay.
(Grid has 1% steps)
    [agent]
      ## Default data collection interval for all inputs
      interval = "20s"
      round_interval = true
      metric_batch_size = 1000
      metric_buffer_limit = 10000
      collection_jitter = "0s"
      flush_interval = "60s"
      flush_jitter = "0s"
      precision = ""
      ## Logging configuration:
      debug = false
      quiet = false
      logfile = ""
      hostname = ""
      omit_hostname = false

    [[inputs.cpu]]
      percpu = true
      totalcpu = true
      collect_cpu_time = false

    [[inputs.disk]]
      ignore_fs = ["tmpfs", "devtmpfs"]

    [[inputs.diskio]]

    # Get kernel statistics from /proc/stat
    [[inputs.kernel]]

    # Read metrics about memory usage
    [[inputs.mem]]

    # Get the number of processes and group them by status
    [[inputs.processes]]

    # Read metrics about swap memory usage
    [[inputs.swap]]

    # Read metrics about system load & uptime
    [[inputs.system]]

    # # Read TCP metrics such as established, time wait and sockets counts.
    [[inputs.netstat]]

    # # Monitor process cpu and memory usage
    [[inputs.procstat]]
      exe = "vnstatd"

    [[inputs.procstat]]
      exe = "influxd"

    [[inputs.procstat]]
      exe = "grafana-server"

    [[inputs.procstat]]
      exe = "telegraf"
Is this normal? I don’t get why it uses more and more CPU when it does the same task every few seconds. RAM usage is always under 33MB.
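To cross-check the graphed values independently of Telegraf's own procstat input, something like the following could be run on the host. It is only a minimal sketch, assuming a Linux `/proc` filesystem; the helper names (`parse_cpu_ticks`, `cpu_percent`, `sample`) and the 60-second window are made up for illustration and are not part of Telegraf. It reads user and system CPU ticks for one process twice and converts the difference into a percentage of one core.

```python
import os
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split after the LAST ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime is field 14 and stime is field 15.
    utime, stime = int(rest[11]), int(rest[12])
    return utime + stime


def cpu_percent(ticks_before: int, ticks_after: int,
                elapsed_s: float, ticks_per_s: int) -> float:
    """CPU usage as a percentage of one core over the sampled window."""
    cpu_seconds = (ticks_after - ticks_before) / ticks_per_s
    return 100.0 * cpu_seconds / elapsed_s


def sample(pid: int, interval_s: float = 60.0) -> float:
    """Measure one process's CPU usage over interval_s seconds (Linux only)."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")

    def read_ticks() -> int:
        with open(f"/proc/{pid}/stat") as f:
            return parse_cpu_ticks(f.read())

    before = read_ticks()
    time.sleep(interval_s)
    after = read_ticks()
    return cpu_percent(before, after, interval_s, ticks_per_s)
```

Calling `sample(pid)` with the PID of the telegraf process once a day and noting the result would show whether the upward trend also appears outside of the Telegraf/InfluxDB pipeline.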