Python query results in "Read timed out"

Using:

  • InfluxDB 1.7
  • Python client libraries (using data frame client)
  • Windows 10 or Ubuntu 18.04

Trying:
A rather simple but long-running query: some aggregations per second over one day.

Observation:

  • I don’t mind the query taking 91s or more. This could be ok.
  • Unfortunately, at 91 seconds after the launch of the query, I get this exception:

HTTPConnectionPool(host='MyInfluxServer', port=8086): Read timed out. (read timeout=30)

Already tried:

  • As a guess, I tried increasing the timeout on the urllib3 pool manager in my main code:
    urllib3.PoolManager(timeout=urllib3.Timeout(connect=5.0, read=600.0))
    This doesn’t seem to change anything.
  • Taking a smaller period in order to reduce the query time.
    That works fine, but I would prefer not to chop up the query too much.
    Handling a query of a few minutes should not be a problem, should it?
  • I also looked into other timeouts mentioned in the spec, like the subscriber timeout, but that doesn’t seem to be the right one, is it?
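
For the "smaller period" approach mentioned above, a minimal sketch of splitting one day into contiguous windows and building one InfluxQL query per window could look like this. The measurement name `my_measurement`, the `MEAN(*)` aggregation, and the helper names `time_windows` and `window_query` are assumptions for illustration, not taken from the original query.

```python
from datetime import datetime, timedelta, timezone


def time_windows(start, end, step):
    """Yield contiguous (window_start, window_end) pairs covering [start, end)."""
    cursor = start
    while cursor < end:
        nxt = min(cursor + step, end)
        yield cursor, nxt
        cursor = nxt


def window_query(measurement, start, end, interval="1m"):
    """Build an InfluxQL aggregation query restricted to one time window."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC3339 timestamps in UTC
    return (
        f'SELECT MEAN(*) FROM "{measurement}" '
        f"WHERE time >= '{start.strftime(fmt)}' AND time < '{end.strftime(fmt)}' "
        f"GROUP BY time({interval})"
    )


if __name__ == "__main__":
    day_start = datetime(2020, 1, 1, tzinfo=timezone.utc)
    day_end = day_start + timedelta(days=1)
    # Four 6-hour windows instead of one 24-hour query.
    for w_start, w_end in time_windows(day_start, day_end, timedelta(hours=6)):
        print(window_query("my_measurement", w_start, w_end))
```

Each query string can then be passed to the client's `query()` call and the resulting frames concatenated; the window size is a trade-off between the number of round trips and the time each request takes.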

Question:
How can this timeout be prevented?
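
One possible reading of the numbers above: the 30 in the message is a per-attempt read timeout, and if the client retries a timed-out request (as far as I know, `InfluxDBClient` in the `influxdb` 5.x package defaults to `retries=3`), three attempts add up to roughly 90s, close to the observed 91s. A separately created `urllib3.PoolManager` is presumably not the pool the client uses internally, which would explain why it had no effect. The sketch below assumes the `influxdb` 5.x package, whose client constructor accepts `timeout` and `retries`; `worst_case_wait` and `make_client` are hypothetical helper names, and the database name is a placeholder.

```python
def worst_case_wait(timeout_s, retries):
    """Upper bound on the wait if every attempt runs into the read timeout."""
    return timeout_s * retries


def make_client(host, database, timeout_s=600.0, retries=3):
    """Build a DataFrameClient with an explicit per-request timeout.

    Assumes the `influxdb` package (5.x) is installed. The import is done
    lazily so the arithmetic helper above also works without it.
    """
    from influxdb import DataFrameClient

    return DataFrameClient(
        host=host,
        port=8086,
        database=database,
        timeout=timeout_s,  # seconds, applied to each HTTP request
        retries=retries,    # attempts before the exception is raised
    )


if __name__ == "__main__":
    # 3 attempts of 30 s each: close to the 91 s seen before the exception.
    print(worst_case_wait(30, 3))
    # client = make_client("MyInfluxServer", "my_database", timeout_s=600)
```

If this reading is right, the place to look is wherever the client is constructed with `timeout=30`, rather than urllib3 itself.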

Hello @qootec,
I’m not sure why that’s happening.
How many points are you trying to query approximately?
Just to make sure, is this the client that you’re using?

Hi Anais,

Thanks for your answer.

Replies to your questions:

  • No, I have been using influxdata/influxdb-python, installed through the Anaconda distribution of Python.
    This installs the Python client package influxdb 5.3.0 (build pyh9f0ad1d_0) from conda-forge. I chose that version because I’m not using InfluxDB 2.0. I assumed the client you point to is specific to version 2.0. Or can I also use it with my InfluxDB 1.7 and 1.8 installations?
  • The result size depends a bit on the source dataset, but the query I was trying before returns 1300 fields and 1440 rows (one row for every minute of a day). It aggregates per minute over fields, some of which update very frequently (periods down to 20 ms) and some very slowly (minutes to hours).

Tried additionally:

  • To rule out that the issue is related to the conda-forge build of the influxdb client, I also installed it through pipenv and ran it outside my “normal” test environment, Visual Studio Code. It then works fine and the query completes within 20 seconds.
  • I then retried with the conda-forge version, from the command line (so also outside of Visual Studio Code’s debug environment) and it also runs fine: 19 seconds.
  • I then re-ran it from within Visual Studio Code’s debugger and it now also runs fine, finishing at about 60s. The difference from the original run can probably be explained by the time of day: the InfluxDB server is cloud hosted (Azure VM) and could therefore suffer reduced transfer rates during peak hours.
  • I retried a few different queries and was able to get one result that took 160s, so well above the 91s that made it fail consistently earlier.

Conclusion:

  • I have the impression that the version/flavour of influxdb-python doesn’t make a real difference (not really confirmed, just an impression).
  • The 91s does not seem to be a hard upper limit, but the failure might be related to slow traffic?

Remaining questions:

  • What else could cause such behavior? Can I debug it in more detail (e.g. debug-level logging of InfluxDB or the client library)?
  • Is it better to try the InfluxDB 2.0 client library that you proposed?
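
On the client side, one way to get more detail is Python's standard `logging` module: urllib3 (which the HTTP layer of the client builds on, as the exception message suggests) emits connection and request lines at DEBUG level. This is a minimal sketch; `enable_http_debug_logging` and `timed` are hypothetical helper names, and it does not cover server-side logging.

```python
import logging
import time
from contextlib import contextmanager


def enable_http_debug_logging(level=logging.DEBUG):
    """Print urllib3/requests log records, including new connections and requests."""
    logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")
    for name in ("urllib3", "requests"):
        logging.getLogger(name).setLevel(level)


@contextmanager
def timed(label):
    """Log how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logging.getLogger("timing").warning("%s took %.1f s", label, elapsed)


if __name__ == "__main__":
    enable_http_debug_logging()
    with timed("query"):
        pass  # run the client query here
```

With this enabled, repeated request lines for the same query before the exception would hint at retries, while a single request line would point elsewhere.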

Regards,
Johan