InfluxDB not working (cache-max-memory)

Luv · April 13, 2017, 1:37pm

I am sending data to influx via telegraf. Suddenly, my influxdb has stopped.

These are the Telegraf error logs from the time when InfluxDB stopped.

2017-04-13T09:23:00Z E! Error writing to output [file]: failed to write message: kernel,host=DHARI-Inspiron-3542 processes_forked=196837i,interrupts=150752942i,context_switches=265819737i,boot_time=1491987066i 1492075370000000000 , invalid argument

2017-04-13T09:23:00Z E! InfluxDB Output Error: {"error":"engine: cache-max-memory-size exceeded: (1073763734/1073741824)"}

2017-04-13T09:23:00Z E! Error writing to output [influxdb]: Could not write to any InfluxDB server in cluster

Then, when I tried to restart the influxdb service, it did not start, and telegraf's error logs showed this:

2017-04-13T13:23:00Z E! Error writing to output [file]: failed to write message: pg_stat_all_tables,host=DHARI-Inspiron-3542,server=db01,db=postgres seq_tup_read=0i,n_tup_ins=0i,n_dead_tup=0i,autovacuum_count=0i,n_tup_hot_upd=0i,n_tup_upd=0i,relname="sql_features",n_tup_del=0i,relid="12242",schemaname="information_schema",analyze_count=0i,n_live_tup=0i,seq_scan=0i,n_mod_since_analyze=0i,vacuum_count=0i,autoanalyze_count=0i 1492089100000000000 , invalid argument

2017-04-13T13:23:00Z E! InfluxDB Output Error: Post http://127.0.0.1:8086/write?consistency=any&db=telegraf&precision=ns&rp=: dial tcp 127.0.0.1:8086: getsockopt: connection refused

2017-04-13T13:23:00Z E! Error writing to output [influxdb]: Could not write to any InfluxDB server in cluster

When I try to launch the influx command line, I get

luvpreet@DHARI-Inspiron-3542:/etc/influxdb$ influx
Failed to connect to http://localhost:8086: Get http://localhost:8086/ping: dial tcp 127.0.0.1:8086: getsockopt: connection refused
Please check your connection settings and ensure 'influxd' is running.

But the status shows that it is running! Have a look:

luvpreet@DHARI-Inspiron-3542:/etc/influxdb$ sudo service influxdb status
● influxdb.service - InfluxDB is an open-source, distributed, time series database
Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2017-04-13 19:05:20 IST; 43ms ago
Docs: https://docs.influxdata.com/influxdb/
Main PID: 32211 (influxd)
Tasks: 6
Memory: 3.4M
CPU: 8ms
CGroup: /system.slice/influxdb.service
└─32211 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
Apr 13 19:05:20 DHARI-Inspiron-3542 influxd[32211]: [I] 2017-04-13T13:35:20Z InfluxDB starting, version 1.2.2, branch master, commit 1bcf3ae74c
Apr 13 19:05:20 DHARI-Inspiron-3542 influxd[32211]: [I] 2017-04-13T13:35:20Z Go version go1.7.4, GOMAXPROCS set to 4
Apr 13 19:05:20 DHARI-Inspiron-3542 influxd[32211]: [I] 2017-04-13T13:35:20Z Using configuration at: /etc/influxdb/influxdb.conf

After reading the telegraf logs, I thought it might be occurring because my cache is full, so I roughly doubled my cache limit:
cache-max-memory-size = 2048576000

But it still does not start.
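For reference, the numbers in the error message and in the new setting can be checked with a few lines of Python. This is only an arithmetic sketch: the variable names are illustrative, and the values are copied from the log line and the config line above.

```python
# Values copied from the Telegraf error and the config change in this post.
cache_used = 1073763734    # bytes in the cache when the write was rejected
cache_limit = 1073741824   # cache-max-memory-size in effect at that time
new_limit = 2048576000     # value set afterwards

GIB = 1024 ** 3

overshoot = cache_used - cache_limit   # how far past the limit the cache was
ratio = new_limit / cache_limit        # new limit relative to the old one

print(f"old limit: {cache_limit / GIB:.2f} GiB")
print(f"overshoot: {overshoot} bytes")
print(f"new limit is {ratio:.2f}x the old one")
```

So the old limit was exactly 1 GiB, the cache was only slightly over it when the write failed, and the new value is a little under twice the old one.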

The biggest problem is that the log file seems to have been deleted automatically, and without logs I have no clue about what happened.

Also, I would like to discuss the fact that increasing the cache size will take more RAM and can hinder system performance.

And how does influx ensure persistence of the data held in the cache?

tkiraly · April 13, 2017, 3:01pm

you could check that influxd listens on port 8086 with a tool like nmap
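If nmap is not installed, a plain TCP connect does a similar job. The following is a minimal sketch using only the Python standard library; the function name `is_listening` is made up for this example, and the host and port are the ones from the error messages above.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8086 open:", is_listening("127.0.0.1", 8086))
```

A result of False here corresponds to the "connection refused" seen by telegraf and the influx CLI: the process may exist, but nothing is accepting connections on that port yet.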

Luv · April 13, 2017, 4:16pm

Yes, I know that. Thanks.

I ran it using the /etc/influxdb/influxdb.conf file and found that collectd was not working, which was causing this problem.

Still, I want to know why the log file was deleted automatically.

And I still have the questions about the cache.

jason · April 13, 2017, 9:38pm

The influxdb log would be in systemd journal. Try journalctl -f -u influxdb.service.

If you hit the cache-max-memory-size limit, either you are writing very large payloads or your system is not able to keep up with your write load. Your system may be slow to start up after it hits that limit because of the large number of WAL segments that need to be reloaded. The process would be running, but the DB is not ready for writes or queries until it has finished reloading the WAL.

If you have large payloads, increasing that limit might be appropriate.
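To tell "still reloading" apart from "really down", one option is to poll the /ping endpoint (the same URL the influx CLI tried above) until it answers. This is a rough sketch, not an official tool: the function names, the attempt count and the delay are arbitrary choices for illustration, and it assumes the HTTP endpoint only starts answering once startup has finished.

```python
import time
import urllib.error
import urllib.request


def ping(url: str = "http://localhost:8086/ping", timeout: float = 2.0) -> bool:
    """Return True if the /ping endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready.
        return False


def wait_until_ready(probe=ping, attempts: int = 30, delay: float = 10.0) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_ready()` right after `sudo service influxdb restart` would then return True once the server accepts requests, or False if it never comes up within the chosen window, in which case the journal is the next place to look.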

Luv · April 14, 2017, 6:47am

cache-snapshot-write-cold-duration = "10m"

#The cache snapshot memory size is the size at which the engine will snapshot the cache and write it to a TSM file, freeing up memory.

So my cache will be cleared every 10 minutes and all the data written to a .tsm file, right?

Which means I am using up my cache-max-memory-size limit in less than 10 minutes, is that so?

Luv · April 14, 2017, 6:49am

Also, can you help with this, please?

In "inputs.logparser", what if there are 2 files, files=['file1.log','file2.log'], and I want 2 different measurements for these 2 files?

That is, 2 measurements should be created, "file1" and "file2". In the influxdb.conf file, we can only give 1 measurement name, which means both files will go under 1 measurement. How can I get 2 different measurements?