I’ve had some of my InfluxDB systems repeatedly experience OOMs and shard corruption, apparently resulting from increases in data ingest that prevent timely compaction, from disk outages (perhaps themselves caused by failing compactions), or from both.
Are there general recommendations on how to avoid this sort of problem? I assume one answer is “don’t put too much stuff into your DB too fast”. Is that, if you will, a cardinal sin? And is the way to deal with it simply to control your data sources?
Are there other recommendations to help build resilience in the face of difficult-to-control data sources?
Any docs on best practices generally or specifically regarding this issue?
Thanks for your thoughts on this.
Hello @Raymond_Keller,
Generally, yes, I’d say you’re right. It’s important to remember that InfluxDB will use whatever memory is available to it in order to optimise reads and writes. That said, I can think of the following recommendations (rough sketches for each follow the list):
- make sure TSI is enabled, so the series index lives on disk instead of entirely in memory
- monitor your InfluxDB instance with another instance, so you still have visibility when the primary one is struggling
- take a look at metric_buffer_limit if you’re using Telegraf, so metrics are buffered through short outages rather than dropped
- set retention policies that match how long you actually need the data, so old shards are expired automatically
- reduce cardinality where possible, and cap it so a hard-to-control data source can’t grow the index unbounded
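
For TSI, here’s a minimal sketch of the relevant influxdb.conf setting, assuming InfluxDB 1.x OSS:

```toml
[data]
  # Use the disk-based TSI index instead of the default in-memory
  # ("inmem") index, so the series index no longer has to fit in RAM.
  index-version = "tsi1"
```

Note that existing shards keep their old index until rebuilt; with the server stopped, you can convert them with `influx_inspect buildtsi -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal` (adjust the paths to your install).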
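For the “monitor it with another instance” point, one common pattern is a Telegraf agent that scrapes the primary’s /debug/vars endpoint and writes to a separate monitoring instance. A sketch, with the hostnames and database name as placeholders:

```toml
# Scrape internal runtime stats (heap size, write stats, etc.)
# from the primary instance's /debug/vars endpoint.
[[inputs.influxdb]]
  urls = ["http://primary-influxdb:8086/debug/vars"]

# Ship those stats to a *separate* InfluxDB instance, so the
# metrics survive when the primary is OOMing or down.
[[outputs.influxdb]]
  urls = ["http://monitor-influxdb:8086"]
  database = "influxdb_monitoring"
```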
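On metric_buffer_limit: each Telegraf output buffers up to that many metrics in memory while its destination is unreachable, so a brief InfluxDB outage means delayed writes rather than lost ones. A sketch of the [agent] section, with illustrative numbers:

```toml
[agent]
  interval = "10s"
  flush_interval = "10s"
  # Metrics are sent to outputs in batches of this size.
  metric_batch_size = 1000
  # Per-output in-memory buffer used when the destination is
  # unreachable; the oldest metrics are dropped once it's full.
  metric_buffer_limit = 10000
```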
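For retention policies, a minimal InfluxQL sketch (“mydb” and the duration are placeholders):

```sql
-- Keep raw data for 30 days; shards older than that are dropped
-- automatically, which also bounds disk usage and compaction work.
CREATE RETENTION POLICY "thirty_days" ON "mydb" DURATION 30d REPLICATION 1 DEFAULT

-- Verify what is currently configured.
SHOW RETENTION POLICIES ON "mydb"
```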
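And for cardinality, start by measuring it, then consider the [data] limits, which reject writes that would create new series past a threshold instead of letting the index grow until the process OOMs. The database name, tag key, and limit values below are placeholders:

```sql
-- Total distinct series in the database.
SHOW SERIES CARDINALITY ON "mydb"

-- Drill into a suspect tag to see how many distinct values it has.
SHOW TAG VALUES CARDINALITY ON "mydb" WITH KEY = "host"
```

```toml
[data]
  # Refuse writes that would push past these limits, rather than
  # letting an unruly source grow the index without bound.
  max-series-per-database = 1000000
  max-values-per-tag = 100000
```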