After six months of normal operation, InfluxDB (version 1.2.4) started to consume far more memory and CPU cycles about ten days ago.
Altogether, InfluxDB occupies about 390 MB, split into
360 MB for /var/lib/influxdb/data,
30 MB for /var/lib/influxdb/wal, and less than
1 MB for /var/lib/influxdb/meta.
There are on average 3 write ops and 1 read op. The InfluxDB processes allocate about 400 MB of memory.
Taking a deeper look at the system calls, I can see, per minute:
140,000 futex (fast user-space locking) calls and
360,000 clock_gettime calls.
Is this normal behaviour?
Any pointers on how to dig deeper are appreciated!
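For anyone wanting to reproduce this measurement: the per-minute counts above can be gathered with strace's summary mode and normalized to per-second rates. A sketch, where the attach command in the comment and the sample summary below are illustrative (the call counts are the ones from this post, not real captured data):

```shell
# In practice you would capture a one-minute summary with something like:
#   timeout 60 strace -c -f -p "$(pgrep -x influxd)" 2> /tmp/strace_summary.txt
# Here we use a fabricated summary with the counts reported above.
cat <<'EOF' > /tmp/strace_summary.txt
% time     seconds  usecs/call     calls    errors syscall
 62.10    4.112233          29    140000     35000 futex
 30.45    2.001122           5    360000           clock_gettime
EOF

# Column 4 of the summary is the call count; divide by the 60s window
# to get calls per second for the hot syscalls.
awk '/futex|clock_gettime/ { printf "%s: %.0f calls/sec\n", $NF, $4/60 }' /tmp/strace_summary.txt
```

With the numbers from the post this works out to roughly 2,333 futex and 6,000 clock_gettime calls per second, which gives you a baseline to compare against a healthy instance.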
This is an interesting problem. I'm also looking for a similar performance check whenever an upgrade is performed (a version upgrade of InfluxDB or even an OS update). If logging of the default performance stats is enabled (default interval 10s), you can gather interesting details on memory and goroutines, e.g. how many goroutines it uses and how much memory it allocates, including the heap. That can be a good starting point: compare before- and after-upgrade stats to see where the extra CPU utilization is happening.
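In InfluxDB 1.x those internal stats are controlled by the [monitor] section of influxdb.conf and are written to the _internal database. A minimal fragment (the values shown are the 1.x defaults):

```toml
[monitor]
  # Store runtime stats (goroutines, heap, GC, ...) in a local database
  store-enabled = true
  store-database = "_internal"
  store-interval = "10s"
```

You can then snapshot the runtime measurement before and after an upgrade, e.g. with influx -database _internal -execute 'SELECT last("NumGoroutine"), last("HeapAlloc") FROM "runtime"' (field names as exposed by the 1.x runtime measurement).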
That said, I would start by looking under /var/log/messages to make sure it is not stuck on something. In my personal experience I've seen continuous queries (CQs) and TSM file access issues (after a backup restore) cause the database to eat up resources.