InfluxDB 2.7.5 on Linux RHEL 8.9. We had an influxd client (Grafana) whose complex queries (GET or POST) failed whenever they took more than 16-18 seconds to complete: after ~16 seconds the client would get EOF on the HTTP channel to influxd. Simpler queries (14 seconds or less) all worked OK. We confirmed the problem with local queries via the “influx” shell command: same symptoms, same timings. We turned on all log options, up to debug level: the failing queries were reported as completed OK after 16+ seconds, with no trace of problems, but the client would still get the EOF.
To make a long story short (several days of investigation), we noticed that the “http-write-timeout” configuration parameter was set to 15 seconds, very close to the point at which our queries started failing. After increasing it, everything worked as expected; problem solved.
Questions:
- Why does a write-related parameter affect read operations? Note: the docs strongly recommend setting this parameter according to local working conditions (which we did).
- Why was this timeout firing, and closing the query channel (but not the query itself, which completed OK, at least according to the logs), not mentioned at all in the influxd logs?
Take care.