Unexpected response times for async HTTP requests

Hi,

I have been observing a very strange behaviour with async requests: the response time increases linearly with the number of concurrent HTTP requests I send from my client.

I set up my InfluxDB using docker-compose with the following entries in my docker-compose.yml.

services:
   influxdb:
      image: influxdb:latest
      container_name: influxdb
      restart: always
      ports:
         - 8086:8086
      volumes:
         - shared_volumes/influxdb:/home/influxdb/shared_volume
      environment:
         DOCKER_INFLUXDB_INIT_MODE: setup
         DOCKER_INFLUXDB_INIT_USERNAME: admin
         DOCKER_INFLUXDB_INIT_PASSWORD: password
         DOCKER_INFLUXDB_INIT_ORG: MyOrg
         DOCKER_INFLUXDB_INIT_BUCKET: test-bucket
         DOCKER_INFLUXDB_INIT_ADMIN_TOKEN: test-token
         INFLUXD_QUERY_CONCURRENCY: 200
         INFLUXD_QUERY_QUEUE_SIZE: 1
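
As a quick sanity check that the container is reachable on the mapped port, hitting the `/health` endpoint of the InfluxDB 2.x HTTP API works (a minimal snippet):

import requests

# Expect HTTP 200 with {"status": "pass", ...} once the server is ready.
resp = requests.get('http://127.0.0.1:8086/health')
print(resp.status_code, resp.json().get('status'))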

To finish setting up my DB and insert the data (I can provide the data if necessary; as a new user I cannot post attachments), I use the following script:

import os

path_to_file = 'shared_volume/cac.lp'

# Parse the ID of 'test-bucket' out of the CLI's tab-separated listing
# (fragile, but good enough for this one-off setup).
bucket_id = os.popen('sudo docker exec influxdb influx bucket list').read().split('\t')[25].split('\n')[1]
print(f'bucket has id {bucket_id}')

# Write the line-protocol data, then create a v1 DBRP mapping and a v1
# authorization so the bucket can be queried through the 1.x /query endpoint.
os.system(f'sudo docker exec influxdb influx write -b test-bucket -f {path_to_file} --format=lp')
os.system(f'sudo docker exec influxdb influx v1 dbrp create --bucket-id {bucket_id} --db test-bucket --rp test-retention')
os.system(f'sudo docker exec influxdb influx v1 auth create --read-bucket {bucket_id} --write-bucket {bucket_id} --username admin --no-password --token test-token')

Finally, to send my concurrent HTTP requests I use this script:

# Source (modified version of this): 
# https://stackoverflow.com/questions/57126286/fastest-parallel-requests-in-python

import asyncio
import time

import aiohttp
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

VENUES = 2
BOOKS = 2

HEADERS = {
    'Host': '127.0.0.1:8086',
    'User-Agent': 'python-requests/2.26.0',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept': 'application/x-msgpack',
    'Connection': 'keep-alive',
    'Content-Type': 'application/json',
    'Authorization': 'Token test-token'
}

BASE_URL = 'http://127.0.0.1:8086'


async def get(n, url, session):
    start = time.time()
    try:
        async with session.get(url=url) as response:
            resp = await response.read()
        duration = time.time() - start
        print(f"Success -- URL {url} -- resp length {len(resp)} -- duration {duration} seconds -- job {n}.")
    except Exception as e:
        # Keep `duration` defined on failure too, so the caller can unpack it.
        duration = time.time() - start
        resp = f"Failure -- URL {url} -- except {e.__class__} -- job {n}"
        print(resp)
    return len(resp), n, duration


async def retrieve_queries(urls, base_url=BASE_URL, headers=HEADERS):
    async with aiohttp.ClientSession(base_url=base_url, headers=headers) as session:
        ret = await asyncio.gather(*[get(n, url, session) for n, url in enumerate(urls)])
    print(f"Finalized all. Return is a list of {len(ret)} outputs.")
    return ret


def make_query(N_venues, N_books):
    fields = ['open', 'high', 'low', 'close', 'adjustedclose', 'volume']
    filters = [
        f'("venue"=\'{np.random.randint(0, i)}\'+and+"book"=\'{np.random.randint(0, j)}\')'
        for i in range(1, N_venues + 1)
        for j in range(1, N_books + 1)
    ]
    where_clause = f'+where+{"+or+".join(filters)}' if filters else ''
    # URL-encoded InfluxQL: SELECT <fields> FROM daily [WHERE ...];
    query = '/query?q=Select+' + '%2C+'.join(fields) + '+from+daily'
    query += where_clause
    query += '%3B'
    query += '&db=test-bucket'
    return query


async def compute_query_stats(urls, n):
    ret = await retrieve_queries(urls[:n])
    durations = np.array([duration for _, _, duration in ret])
    lengths = np.array([length for length, _, _ in ret])
    return (n, durations.mean(), durations.std(), lengths.mean(), lengths.std())


# Top-level await works in a Jupyter/IPython session, which already runs
# an event loop; in a plain script, wrap this loop in an `async def main()`
# and call `asyncio.run(main())`.
n_max = 100
urls = n_max * [make_query(0, 0)]
data = ()
for n in (10, 20, 30, 40, 50, 60, 70, 80, 90, 100):
    data += (await compute_query_stats(urls, n), )
data = pd.DataFrame(data, columns=['Number_of_queries', 'Avg_duration', 'Std_duration', 'Avg_length', 'Std_length'])
data = data.set_index('Number_of_queries')

plt.plot(data.index, data.Avg_duration, 'k-')
plt.plot(data.index, data.Avg_duration + data.Std_duration, 'k--')
plt.plot(data.index, data.Avg_duration - data.Std_duration, 'k--')
plt.show()

The result I get is the following:

[plot: average batch response time grows roughly linearly with the number of concurrent queries]

I would not have expected my async query batches’ response times to be linear in the batch size. In layman’s terms: if I send 10 queries, it takes about 3 seconds to get a response for all of them; if I send 100, it takes about 30 seconds.
(Note that all of this is consistent with the slope of the above curve, roughly 0.3 seconds/query.)

What I would have expected instead is for each query to take about 0.3 seconds and arrive whenever it can.

Could you please shed some light on this? Is there a way to avoid this batching of responses?

Many thanks in advance and best wishes,
M0nk

Hi @M0nk,
So this does make sense when you are batching writes. Essentially, the InfluxDB server will handle the request for each metric in the batch and send back a collated response. Have you considered using the InfluxDB Python client?

It gives you more control over how you write data to InfluxDB, including batching: with the batch size set to 1, each metric is sent individually.
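
For illustration, a minimal sketch with the influxdb-client package (the url/token/org/bucket values mirror your docker-compose file, and the sample point is made up, so adjust as needed):

from influxdb_client import InfluxDBClient, Point, WriteOptions

client = InfluxDBClient(url="http://127.0.0.1:8086", token="test-token", org="MyOrg")

# batch_size=1 flushes every point individually; larger values buffer
# points client-side and send them to the server in one request.
write_api = client.write_api(write_options=WriteOptions(batch_size=1))
write_api.write(bucket="test-bucket", record=Point("daily").field("close", 101.5))
write_api.close()  # flush anything still buffered
client.close()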

Hello Jay,

Many thanks for looking into this and giving me some insights. I knew about the Python client, which is pretty awesome by the way. :slight_smile:

I am mostly interested in queries here, queries that I want to send at a rather high frequency.

So, when I send a single query to InfluxDB, I get an answer after 0.3 s. However, when I send 100 queries concurrently, the first one doesn’t return until 30 s later instead of after 0.3 s (or so).

Why is that? Is there a way I can disable this behaviour so that each of my queries returns as soon as possible?

Many thanks again for your help,
M0nk

Hi Jay,

Thinking about your reply, I wonder whether I missed its point…

Were you implying that if I wrote in batches, then queries would be served with the behaviour I am seeing?

As far as writing is concerned, I used the influx CLI inside the Docker container to write to the bucket.

Looking forward to your inputs on this.

Best wishes,
M0nk

Hi @M0nk,
My apologies for misreading your goal. I assumed you were using an async method to write data rather than to query it. Could you describe in a little more detail the context around your use case for querying InfluxDB asynchronously? This one is really interesting to us :slight_smile:

Also, does async gather act as a task group? Essentially, in your case, will it wait for all query responses to return before allowing the rest of your code to continue? I am wondering if this makes it look like InfluxDB is batching queries.
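
For reference, a minimal sketch of the distinction in plain asyncio (`job` here is just a hypothetical stand-in for one of your HTTP queries):

import asyncio

async def job(n):
    await asyncio.sleep(n)  # stand-in for a query taking n seconds
    return n

async def main():
    # gather: the awaiting code resumes only once *all* jobs are done,
    # though each coroutine itself finishes (and could print) earlier.
    print(await asyncio.gather(*[job(n) for n in (3, 1, 2)]))  # [3, 1, 2] after ~3 s

    # as_completed: results are handed back as soon as each job finishes.
    for fut in asyncio.as_completed([job(n) for n in (3, 1, 2)]):
        print(await fut)  # 1, then 2, then 3

asyncio.run(main())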

@bednar do you have any ideas about how we could achieve the requirement using our python client?

@Jay_Clifford Thank you very much for following up with this.

Just to clarify, this work attempts to reproduce what we have already seen with another technology (asynchronous querying in C++). When sending multiple queries asynchronously, we would get a batched-like response from InfluxDB, even though we use different querying sockets. And with the Python modules asyncio and aiohttp, I seem to be finding exactly the same behaviour.

I’ll try the Python client, as it is wise to leave no stone unturned. :slight_smile:

However, our end goal is to query InfluxDB from C++, so sadly the Python client will not be a viable solution for us (although if there are specificities in its implementation, I am all ears!).

@Jay_Clifford I had a look at the docs of the Python API. It looks like all queries are sent synchronously.

[screenshot: the Python client’s query API docs, showing synchronous query methods]

So my take is that I’ll probably see the same thing if I make several asynchronous calls using the Python client. :cry:
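
If it helps, one workaround I may try is dispatching the blocking query_api calls to a thread pool from asyncio, so several queries can be in flight at once. A rough sketch (the Flux query is a made-up placeholder):

import asyncio
from concurrent.futures import ThreadPoolExecutor

from influxdb_client import InfluxDBClient

client = InfluxDBClient(url="http://127.0.0.1:8086", token="test-token", org="MyOrg")
query_api = client.query_api()
flux = 'from(bucket: "test-bucket") |> range(start: -30d) |> limit(n: 10)'

async def main(n_queries=10):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=n_queries) as pool:
        # Each blocking query_api.query call runs in its own thread.
        tasks = [loop.run_in_executor(pool, query_api.query, flux)
                 for _ in range(n_queries)]
        for fut in asyncio.as_completed(tasks):
            tables = await fut
            print(f"got {len(tables)} tables")

asyncio.run(main())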

Hi @M0nk,
I must admit you have truly stumped me on this one. In a good way! :wink: I have passed your query on to our engineering team for insight. Hopefully they will get in touch to explain the query engine’s internal mechanics further.

My apologies that I could not be of more help. I will keep pushing internally for you.

@Jay_Clifford Thank you very much for your kind help. :slight_smile:

Looking forward to hearing from the engineering team about this.

Best wishes,
M0nk

Hello @Jay_Clifford,

I hope this message finds you well. Have you been able to get some info on this matter?

Best wishes,
M0nk

Hi @M0nk,
None yet, it’s been quite a busy few weeks for our engineering team :slight_smile:. I have given another nudge internally, so hopefully someone will take a look soon.

Thanks,
Jay

Awesome, thank you. Apologies to the engineering team for loading up their plate even more!
I’ll be really grateful, though, if I can get some help. :slight_smile:
Best wishes,
M0nk

I don’t believe there is any kind of server-side batching mechanism for queries. Each query is a separate HTTP request, and the request/response cycle for each one is entirely independent of the others with respect to batching.

You should see that, as you perform multiple simultaneous queries from the client, the response time for each one increases slightly (but not linearly) as the server becomes more loaded. Here’s an example of a very simple Go program that illustrates this: Go Playground - The Go Programming Language

Running influxd locally with that program, I get results along the lines of:

took "24.643063ms"
took "24.681479ms"
took "24.428102ms"
took "27.188396ms"
took "27.335884ms"
took "27.404516ms"
... etc

I’m not intimately familiar with Python like your example, but I’d guess there is some kind of idiosyncrasy in that code that is leading to the unexpected results. Since your end goal is not to use Python, perhaps this little Go program will help you see the behavior you are interested in, since it confirms that influxd does not do any kind of batching of queries on the server.

@wbaker Thank you very much for taking the time on my issue and for providing this (very useful!) piece of code.

I modified it so it queries the data in the DB I am investigating. The results are below. I can provide you with the code and data if needed.

As you can see, the response time for the first query to come back increases drastically with the number of requests.

  • 1 request: 355 ms
  • 100 requests: 12 s
  • 1000 requests: 1m33s, and loss of connection with the host after 2m35s (this could be another matter, but I thought it’d be worth mentioning).

What do you think is causing this behaviour? Is it expected? I would have thought that the concurrency would mean I get the first response after roughly 500 ms. Am I misunderstanding something?

@Jay_Clifford Thank you :hugs:

(base) ubuntu@db-tests:~/scripts/go$ go run main_v2.go 1 # number of requests
took "355.202466ms"
(base) ubuntu@db-tests:~/scripts/go$ go run main_v2.go 10
took "1.408498588s"
...
took "1.611987346s"
(base) ubuntu@db-tests:~/scripts/go$ go run main_v2.go 100
took "12.329331088s"
...
took "20.479085691s"
(base) ubuntu@db-tests:~/scripts/go$ go run main_v2.go 1000
took "1m33.093039531s"
...
took "2m35.626336134s" 
# Here the machine I am working on stopped responding
# (either started swapping - I doubt it as I set the 
# swappiness to 0 - or simply crashed)

It seems like the queries you are testing with may be resource-intensive enough that, before the first query can return, additional queries start to be processed by the server, which increases server load and extends how long it takes to return the initial query as well.

If you’re trying to minimize the total amount of time it takes to make N requests, you might find that making fewer simultaneous requests is advantageous. So if you want to get 100 queries completed, running a maximum of 5 at a time might be best, for example.
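
I’m no Python expert, but I believe something along these lines would cap the number of in-flight requests using asyncio.Semaphore (a sketch reusing the BASE_URL/HEADERS from your earlier script; aiohttp’s TCPConnector(limit=5) would achieve a similar effect at the connection level):

import asyncio
import aiohttp

async def bounded_get(sem, session, url):
    async with sem:  # at most `limit` requests are in flight at once
        async with session.get(url=url) as response:
            return await response.read()

async def run_bounded(urls, limit=5):
    sem = asyncio.Semaphore(limit)
    async with aiohttp.ClientSession(base_url=BASE_URL, headers=HEADERS) as session:
        return await asyncio.gather(*[bounded_get(sem, session, u) for u in urls])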

@wbaker Thank you for spending the time looking into this. We’ll do as advised then. :slight_smile:

Just to drill a little further though, and please forgive me if I am being too pushy: is there really no part of the code we could tweak on our side (we’d be happy to make a pull request if we find something and you deem the change useful) to avoid this kind of behaviour?

Just to make things very clear: we do expect responses to be affected by the increase in load. The surprise is that what seems to be happening is a sort of ‘batching’, where we have to wait for the last request to be processed before receiving the first one.

Are you specifically seeing batching on the queries? There isn’t any mechanism for doing that on the server; it’s just not something the server does…each query is treated completely separately. I’m not seeing batching in the latest data you posted.

From your data, it looks like in the case of 10 simultaneous queries, the “fastest” query takes 1.4 seconds, and the “slowest” takes 1.6 seconds? So the duration from starting the first query (all the queries will start within ~ns of each other) to the last query being complete is 1.6 seconds. That would mean to run 10 queries, it takes longer than 1 query - but not 10 times longer. Each query is individually slower but the duration to get them all done is much faster than doing 10 queries sequentially. It’s not that the server completed the “first” query in ~355 ms and then waited another second for other queries to be complete…it just took the “first” query 1.4 seconds to complete, because the server is running 9 other queries simultaneously with it.

What I am interpreting from the data you have is along those lines. When you run 1000 simultaneous queries, even the fastest query is being run on the server loaded with 999 other queries, so it takes a long time. There’s a big range between the fastest and the slowest query because of those 999 other queries, many of them probably don’t do any work until some of the others are complete due to hardware limitations. This wouldn’t be because of any kind of batching code, but rather what would be expected from doing asynchronous queries in this way.

Hopefully this helps…happy to continue to discuss if you have further questions! It’s always possible that I’m misunderstanding something, but from what I’m seeing everything is working as I would expect.


@wbaker Ah awesome. Thank you so much for taking the time to explain all of this. :slight_smile:

Crystal clear now!

Best wishes,
M0nk