Hi, I’m running InfluxDB as a Docker container on an Amazon Linux AMI (4.9.20-11.31.amzn1.x86_64), a t2.medium instance with 4 GB of memory and 2 CPUs. I’m using InfluxDB as the remote read and write database for Prometheus. I’m new to InfluxDB and have been using it for the last 5 days. For the first 3 days there was no issue, but since the 4th day I have been facing the issue below. Please help.
Error message:
fatal error: runtime: out of memory
Thank you for looking into the issue. As per your advice, I will look into the hardware sizing guidelines.
Just to let you know, for the last few days I was testing the database after removing the remote read and write options from Prometheus. Basically no new data was being written to the database and no read queries were happening. Still, InfluxDB was going down every 1-2 minutes with the out-of-memory error. After I added 6 GB of swap space, it became stable and has been running for the last 15 hours without going down.
Today I enabled the remote read/write options in Prometheus again, and within a few minutes the database crashed with "fatal error: runtime: out of memory".
Sonia, is there any query/doc through which I can find out the following details:
1. Number of fields written per second
2. Number of queries per second
3. Number of unique series
Hi, my prototyping and testing with InfluxDB stopped for a few months (but I will start again in a few days for a real system). I experienced some problems with memory management too, and I didn’t succeed in getting deterministic information about them.
(Check my post: Memory usage forever growing with INF RP? )
I can tell you to check both your DB structure and your Retention Policies.
In my understanding, memory usage always grows slowly because indexes are kept partially in RAM (to make querying faster), but the “big steps” in memory needs come from:
new series (keep in mind that not only a new measurement creates a new series, but also a value used for the first time in a tag, so if you have tags with many distinct values, they should probably be turned into fields); you can check your cardinality with the commands sketched below.
creation of new data shards driven by Retention Policies (shorter retention policies generally mean shorter shard group durations, so the same “samples” stored with a retention policy of 2 days can consume more memory than if stored with a retention policy of 2 weeks). So use a long retention period, or a short one, but then move old data elsewhere as soon as possible and delete it from the machine that is collecting new data.
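If it helps, here is a minimal sketch of the checks I mean, using the influx CLI; the database name "prometheus" and the tag key "instance" are just placeholders for your setup, and the CARDINALITY statements need InfluxDB 1.4 or later:
# how long data is kept and how the shards are being cut
influx -execute 'SHOW RETENTION POLICIES ON "prometheus"'
# total number of unique series in the database
influx -execute 'SHOW SERIES CARDINALITY ON "prometheus"'
# distinct values for a suspect tag (replace "instance" with your own tag key)
influx -execute 'SHOW TAG VALUES CARDINALITY ON "prometheus" WITH KEY = "instance"'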
Hi Sumit, are you using Chronograf? If so, you should be able to see an InfluxDB dashboard that monitors queries and writes. You can see it if you navigate to the Host List and then click the influxdb app for the Telegraf host listed there.
Hey Sumit, no problem. Try running this curl command and look for pointsWrittenOK to see the number of points written, if you’re running locally (in your setup the endpoint corresponds to http://influxdb:8086/debug/vars): curl http://localhost:8086/debug/vars
There is a lot of other information in that JSON blob that may prove useful to you.
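If you want to turn that counter into a rough writes-per-second figure, here is a small sketch (just an illustration, not from the docs) that samples it twice; it assumes jq is installed and the same localhost:8086 endpoint as above:
# read pointsWrittenOK from the httpd statistics (selected by name, since
# the top-level key layout can vary a little between versions)
get_points() {
  curl -s http://localhost:8086/debug/vars \
    | jq '[to_entries[] | select(.value.name? == "httpd") | .value.values.pointsWrittenOK] | add'
}
A=$(get_points); sleep 10; B=$(get_points)
echo "points written per second (approx): $(( (B - A) / 10 ))"
The same httpd block also carries a queryReq counter, which you can sample the same way to estimate queries per second.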
Another ask would be whether you are using the TSI (disk-based) index or the default in-memory index. The in-memory index is the default, and it grows with series cardinality, which increases the memory InfluxDB needs to run. If you are running out of memory, consider migrating to TSI.
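In case it helps, a minimal sketch of that migration on InfluxDB 1.x (the paths below are the usual package defaults and will differ depending on how your Docker volume is mounted):
# 1. In influxdb.conf, under the [data] section, set:
#      index-version = "tsi1"
# 2. Stop influxd, then rebuild the index for the existing shards:
influx_inspect buildtsi -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal
# 3. Restart influxd; new shards will then be created with the TSI index.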