Timeout error at 8am each day

I’m running InfluxDB on Raspbian on a Raspberry Pi 4

$ influx --version
InfluxDB shell version: 1.6.4

Each morning there’s a timeout error and the application stops writing to InfluxDB. I noticed that 8am local time corresponds to UTC midnight. After several days of experimentation, the only workaround I’ve found is to reboot.
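To double-check the timezone correlation, here is a minimal Python sketch. It assumes the +0800 offset shown in the httpd log lines and uses the time of the first failed write; the helper name to_utc is only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Local zone taken from the "+0800" offset in the httpd log lines.
LOCAL = timezone(timedelta(hours=8))


def to_utc(local_dt):
    """Convert a timezone-aware local timestamp to UTC."""
    return local_dt.astimezone(timezone.utc)


# The writes start failing at 08:00 local time on 26 Aug 2020.
first_failure_local = datetime(2020, 8, 26, 8, 0, 0, tzinfo=LOCAL)
print(to_utc(first_failure_local).isoformat())
```

As far as I know, InfluxDB aligns shard group boundaries to UTC, so anything that rolls over once a day would do so at this moment.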

Please advise how to resolve this daily timeout. What further information would be useful in diagnosing this issue?

The log messages follow (newest first):

Aug 26 08:01:04 s2 influxd[579]: ts=2020-08-26T00:01:04.282632Z lvl=error msg="[500] - “timeout”" log_id=0OqCj28W000 service=httpd
Aug 26 08:01:04 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:54 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 500 20 “-” “-” 35f302cc-e72f-11ea-b29d-000000000000 10000657
Aug 26 08:00:58 s2 influxd[579]: ts=2020-08-26T00:00:58.535609Z lvl=error msg="[500] - “timeout”" log_id=0OqCj28W000 service=httpd
Aug 26 08:00:58 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:48 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 500 20 “-” “-” 32861e15-e72f-11ea-b29c-000000000000 10000462
Aug 26 08:00:54 s2 influxd[579]: ts=2020-08-26T00:00:54.847455Z lvl=error msg="[500] - “timeout”" log_id=0OqCj28W000 service=httpd
Aug 26 08:00:54 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:44 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 500 20 “-” “-” 305356e6-e72f-11ea-b29b-000000000000 10000510
Aug 26 08:00:35 s2 influxd[579]: ts=2020-08-26T00:00:35.415906Z lvl=error msg="[500] - “timeout”" log_id=0OqCj28W000 service=httpd
Aug 26 08:00:35 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:25 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 500 20 “-” “-” 24be5223-e72f-11ea-b29a-000000000000 10000519
Aug 26 08:00:18 s2 influxd[579]: ts=2020-08-26T00:00:18.912116Z lvl=error msg="[500] - “timeout”" log_id=0OqCj28W000 service=httpd
Aug 26 08:00:18 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:08 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 500 20 “-” “-” 1ae80aa7-e72f-11ea-b299-000000000000 10000527
Aug 26 08:00:13 s2 influxd[579]: ts=2020-08-26T00:00:13.571164Z lvl=error msg="[500] - “timeout”" log_id=0OqCj28W000 service=httpd
Aug 26 08:00:13 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:03 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 500 20 “-” “-” 17b91548-e72f-11ea-b298-000000000000 10000493
Aug 26 08:00:11 s2 influxd[579]: ts=2020-08-26T00:00:11.632361Z lvl=error msg="[500] - “timeout”" log_id=0OqCj28W000 service=httpd
Aug 26 08:00:11 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:01 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 500 20 “-” “-” 16912490-e72f-11ea-b297-000000000000 10000578
Aug 26 08:00:11 s2 influxd[579]: ts=2020-08-26T00:00:11.412188Z lvl=info msg=“failed to store statistics” log_id=0OqCj28W000 service=monitor error=timeout
Aug 26 08:00:01 s2 influxd[579]: ts=2020-08-26T00:00:01.406173Z lvl=info msg=“failed to store statistics” log_id=0OqCj28W000 service=monitor error=timeout
Aug 26 08:00:00 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:00 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 1612f73d-e72f-11ea-b296-000000000000 2127
Aug 26 08:00:00 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:00 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 1612aa0a-e72f-11ea-b295-000000000000 3098
Aug 26 08:00:00 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:00 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 1612a53c-e72f-11ea-b294-000000000000 3028
Aug 26 08:00:00 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:00 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 16128c1e-e72f-11ea-b293-000000000000 2307
Aug 26 08:00:00 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:08:00:00 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 16127084-e72f-11ea-b292-000000000000 2083
Aug 26 07:59:58 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:58 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 14c8c614-e72f-11ea-b291-000000000000 1320
Aug 26 07:59:57 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:57 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 1405fe27-e72f-11ea-b290-000000000000 2138
Aug 26 07:59:57 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:57 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 1405c648-e72f-11ea-b28f-000000000000 1481
Aug 26 07:59:51 s2 influxd[579]: ts=2020-08-25T23:59:51.401129Z lvl=info msg=“failed to store statistics” log_id=0OqCj28W000 service=monitor error=timeout
Aug 26 07:59:50 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:50 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 1031b537-e72f-11ea-b28e-000000000000 1153
Aug 26 07:59:50 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:50 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 100f0ab6-e72f-11ea-b28d-000000000000 1324
Aug 26 07:59:47 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:47 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 0e24a0f5-e72f-11ea-b28c-000000000000 1202
Aug 26 07:59:47 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:47 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 0de5f27b-e72f-11ea-b28b-000000000000 1491
Aug 26 07:59:45 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:45 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 0d22ac73-e72f-11ea-b28a-000000000000 1324
Aug 26 07:59:41 s2 influxd[579]: ts=2020-08-25T23:59:41.395981Z lvl=info msg=“failed to store statistics” log_id=0OqCj28W000 service=monitor error=timeout
Aug 26 07:59:40 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:40 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 09efce65-e72f-11ea-b289-000000000000 1220
Aug 26 07:59:39 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:39 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 098c7cda-e72f-11ea-b288-000000000000 1222
Aug 26 07:59:35 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:35 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 07017f41-e72f-11ea-b287-000000000000 1301
Aug 26 07:59:34 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:34 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 063e737b-e72f-11ea-b286-000000000000 1464
Aug 26 07:59:34 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:34 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 063e6781-e72f-11ea-b285-000000000000 1282
Aug 26 07:59:33 s2 influxd[579]: [httpd] 127.0.0.1 - gateway [26/Aug/2020:07:59:33 +0800] “POST /write?db=gateway&p=%5BREDACTED%5D&precision=n&rp=&u=gateway HTTP/1.1” 204 0 “-” “-” 05b85125-e72f-11ea-b284-000000000000 1290

InfluxDB needs fast storage. Most likely the retention policy is kicking in and trying to delete and compact old records. Can you confirm how big your DB is and what the retention policies on all of your DBs are?

How many series are in your DBs?

Where is the DB sitting on the Pi? If it’s on the SD card, try moving it onto a USB SSD. (Per GB, they are cheaper than USB sticks from what I’ve been seeing, and they perform better.)
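If it helps, the retention policies and the series count can be pulled from the 1.x HTTP /query endpoint. The sketch below only builds the request URLs and does not send anything. The base URL (default port 8086) and the helper name build_query_url are assumptions; the database name gateway is taken from the log lines. The u and p credentials would need to be added in the same way as in the /write requests.

```python
from urllib.parse import urlencode

# Assumption: InfluxDB 1.x listening on its default HTTP port on the Pi itself.
BASE_URL = "http://127.0.0.1:8086"


def build_query_url(query, db="gateway", base_url=BASE_URL):
    """Return a /query URL for the given InfluxQL statement."""
    return f"{base_url}/query?{urlencode({'db': db, 'q': query})}"


# InfluxQL statements answering the two questions above.
QUERIES = [
    'SHOW RETENTION POLICIES ON "gateway"',
    'SHOW SERIES CARDINALITY ON "gateway"',
]

for q in QUERIES:
    # Each URL can be fetched with curl or urllib.request.urlopen.
    print(build_query_url(q))
```

The same statements can also be typed directly into the influx shell.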

Thank you, there’s some food for thought.

  • I reviewed all of the series and got rid of those that are of no use (slightly more than 50%).
  • The retention policy is ‘infinite’, no deletion.
  • The data is stored on an NFS mount. While that is faster than an SD card, I suspect that the write latency occasionally spikes. I’ll look into getting a reasonably priced USB drive to store the data.
  • It appears that influx_inspect is not packaged by Raspbian/Debian, so it’s not possible to compact the data files.

My use case is that InfluxDB is a component in my home automation system, used to store data and graph it with Grafana. While I can control which series are stored, I can’t easily change the details of those series to match the optimisation recommendations without rewriting others’ code, and I am currently busy writing and debugging other areas. So further recommendations will be very welcome.

That’s very interesting. It seems unlikely that InfluxDB would be causing the hang, then. Until you’re on some kind of faster local storage, maybe try monitoring the NFS file share (NAS?) and what else is happening on that end around that time.
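One rough way to do that monitoring from the Pi side is to time a small synced write at regular intervals and see whether it spikes around 8am. This is only a sketch: the function names, the 4 KiB payload, and the 1-second threshold are arbitrary choices, and the example call probes the system temp directory. To test the share, point it at a directory on the NFS mount instead.

```python
import os
import tempfile
import time


def timed_write(directory, payload=b"x" * 4096):
    """Write payload to a temp file in directory, fsync it, return seconds taken."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.monotonic()
        os.write(fd, payload)
        os.fsync(fd)  # force the data out to the storage backend
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)


def probe(directory, samples=5, interval=1.0, threshold=1.0):
    """Take several timed writes and flag any slower than threshold seconds."""
    results = []
    for _ in range(samples):
        elapsed = timed_write(directory)
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S%z")
        flag = " SLOW" if elapsed > threshold else ""
        print(f"{stamp} write+fsync took {elapsed:.4f}s{flag}")
        results.append(elapsed)
        time.sleep(interval)
    return results


# Example run against the temp directory; replace with the NFS-backed path.
probe(tempfile.gettempdir(), samples=3, interval=0.1)
```

Running it from cron or a systemd timer shortly before UTC midnight, with more samples, would show whether the latency spike starts on the storage side.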