InfluxQL truncated chunked queries

Hello!

I am having some trouble understanding exactly how chunking works when querying data from Influx.

According to the documentation at the following link: Querying data with the InfluxDB API | InfluxDB OSS 1.7 Documentation, one can set the query parameter ‘chunked’ to true, meaning the data will be returned in streamed batches rather than in a single response. Alongside it there is the ‘chunk_size’ parameter, which specifies how many data points each batch should contain.
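
My reading of that page is that each streamed chunk should be a self-contained response holding at most chunk_size data points, so that a consumer can parse every chunk on its own. As a toy illustration of what I expected (the payload below is hypothetical, written by hand, not real server output):

import json

# Hypothetical chunked JSON stream: one complete document per chunk,
# each holding at most chunk_size rows ("partial" marking non-final chunks).
stream = (
    '{"results":[{"series":[{"name":"m","values":[[1,"a"],[2,"b"]],"partial":true}]}]}\n'
    '{"results":[{"series":[{"name":"m","values":[[3,"c"]]}]}]}\n'
)
for line in stream.splitlines():
    doc = json.loads(line)  # every chunk parses on its own
    print(str(len(doc["results"][0]["series"][0]["values"])) + " rows in this chunk")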

This seems to be a very useful feature, especially when dealing with large amounts of data, as it allows processing a smaller subset of the data points batch by batch.

However, in my experiments so far I have not managed to get the desired behavior of receiving the data batch by batch, with each batch containing exactly the number of data points specified in the ‘chunk_size’ query parameter.

I started an InfluxDB 2.4 server as a Docker container, by running:

docker run --detach --rm --name influx -p 8086:8086 influxdb:2.4.0
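
To make sure the container is reachable before doing anything else, one can hit the /ping endpoint first; on a 2.x server it should answer with status 204:

import requests

# Quick reachability check against the container; 204 means the server is up.
print(requests.get("http://172.19.0.2:8086/ping").status_code)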

Using a Python script, I wrote some dummy data into the database. Python script:

import influxdb_client
from influxdb_client.client.write_api import SYNCHRONOUS

bucket = "bucket2"
organization = "example-org"
token = "<access-token>"
url = "http://172.19.0.2:8086"  # the client needs a full URL, including the scheme

client = influxdb_client.InfluxDBClient(url=url, org=organization, token=token)
write_api = client.write_api(write_options=SYNCHRONOUS)

# Write 8000 points, one synchronous write call each; no explicit timestamp
# is set, so every point gets a timestamp on arrival.
for i in range(8000):
    point = influxdb_client.Point("test-measurement").field("Data", "ABCDEFGHI")
    write_api.write(bucket=bucket, org=organization, record=point)
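
For context: InfluxQL queries against a 2.x server only find the data once the bucket is mapped to a 1.x database/retention-policy pair, which is what the ‘db’ and ‘rp’ parameters in the query script below refer to. Roughly how such a mapping is created through the v2 DBRP API (the bucket and org IDs are placeholders, and the field names are from my reading of the /api/v2/dbrps docs):

import requests

# Map bucket2 to database "bucket2" / retention policy "retpolicythree"
# so the 1.x-compatible /query endpoint can resolve it.
requests.post(
    "http://172.19.0.2:8086/api/v2/dbrps",
    headers={"Authorization": "Token <access-token>"},
    json={
        "bucketID": "<bucket-id>",
        "orgID": "<org-id>",
        "database": "bucket2",
        "retention_policy": "retpolicythree",
        "default": True,
    },
)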

Then, using another Python script, I queried the data in chunks with a small chunk size and printed each batch together with its index and size. Python script:

import requests

params = {'rp': 'retpolicythree', 'db': 'bucket2',
          'q': 'SELECT * from "test-measurement" limit 100',
          'chunked': 'true', 'chunk_size': '10'}
headers = {'Accept': 'application/csv', 'Authorization': 'Token <access-token>'}

response = requests.get("http://172.19.0.2:8086/query",
                        params=params, headers=headers, stream=True)

# chunk_size=None makes requests yield the data in whatever pieces it
# arrives from the transport, without re-buffering it on the client side.
chunk_index = 1
for chunk in response.iter_content(chunk_size=None):
    print("Received a new batch number: " + str(chunk_index))
    data = chunk.decode("utf-8")
    size = len(data)
    print("Size of data: " + str(size))
    print(data)
    print("\n")
    chunk_index = chunk_index + 1

print("Finished the request")

My expectation was to see a total of 10 chunks printed, since I query only 100 data points and request them in chunks of 10. The output I actually get does not match that at all. Below is the result of the Python script:

Received a new batch number: 1
Size of data: 2048
name,tags,time,Data
test-measurement,,1662635480885321600,ABCDEFGHI
test-measurement,,1662635480959275200,ABCDEFGHI
test-measurement,,1662635480963779500,ABCDEFGHI
test-measurement,,1662635480968483500,ABCDEFGHI
test-measurement,,1662635480972424000,ABCDEFGHI
test-measurement,,1662635480976543400,ABCDEFGHI
test-measurement,,1662635480980748900,ABCDEFGHI
test-measurement,,1662635480985150800,ABCDEFGHI
test-measurement,,1662635480989368100,ABCDEFGHI
test-measurement,,1662635480993390000,ABCDEFGHI
test-measurement,,1662635480997737300,ABCDEFGHI
test-measurement,,1662635481001886500,ABCDEFGHI
test-measurement,,1662635481005886600,ABCDEFGHI
test-measurement,,1662635481010275800,ABCDEFGHI
test-measurement,,1662635481014228900,ABCDEFGHI
test-measurement,,1662635481018313000,ABCDEFGHI
test-measurement,,1662635481022479000,ABCDEFGHI
test-measurement,,1662635481026632700,ABCDEFGHI
test-measurement,,1662635481030988600,ABCDEFGHI
test-measurement,,1662635481035078700,ABCDEFGHI
test-measurement,,1662635481039244900,ABCDEFGHI
test-measurement,,1662635481043543100,ABCDEFGHI
test-measurement,,1662635481047518600,ABCDEFGHI
test-measurement,,1662635481051595800,ABCDEFGHI
test-measurement,,1662635481055863900,ABCDEFGHI
test-measurement,,1662635481059889800,ABCDEFGHI
test-measurement,,1662635481063919600,ABCDEFGHI
test-measurement,,1662635481068072900,ABCDEFGHI
test-measurement,,1662635481072286100,ABCDEFGHI
test-measurement,,1662635481076525300,ABCDEFGHI
test-measurement,,1662635481080720400,ABCDEFGHI
test-measurement,,1662635481084895800,ABCDEFGHI
test-measurement,,1662635481089509000,ABCDEFGHI
test-measurement,,1662635481094136300,ABCDEFGHI
test-measurement,,1662635481100058200,ABCDEFGHI
test-measurement,,1662635481104223500,ABCDEFGHI
test-measurement,,1662635481108549500,ABCDEFGHI
test-measurement,,1662635481112550800,ABCDEFGHI
test-measurement,,1662635481116461500,ABCDEFGHI
test-measurement,,1662635481120262400,ABCDEFGHI
test-measurement,,1662635481124179600,ABCDEFGHI
test-measurement,,1662635481128495900,ABCDEFGHI
test-measure


Received a new batch number: 2
Size of data: 2048
ment,,1662635481132203800,ABCDEFGHI
test-measurement,,1662635481136201100,ABCDEFGHI
test-measurement,,1662635481140127400,ABCDEFGHI
test-measurement,,1662635481144331700,ABCDEFGHI
test-measurement,,1662635481148258300,ABCDEFGHI
test-measurement,,1662635481152288700,ABCDEFGHI
test-measurement,,1662635481156528200,ABCDEFGHI
test-measurement,,1662635481161137800,ABCDEFGHI
test-measurement,,1662635481165871000,ABCDEFGHI
test-measurement,,1662635481170329700,ABCDEFGHI
test-measurement,,1662635481174913500,ABCDEFGHI
test-measurement,,1662635481179287400,ABCDEFGHI
test-measurement,,1662635481183449000,ABCDEFGHI
test-measurement,,1662635481187639300,ABCDEFGHI
test-measurement,,1662635481192294800,ABCDEFGHI
test-measurement,,1662635481196977100,ABCDEFGHI
test-measurement,,1662635481201258400,ABCDEFGHI
test-measurement,,1662635481205685600,ABCDEFGHI
test-measurement,,1662635481210099300,ABCDEFGHI
test-measurement,,1662635481214382400,ABCDEFGHI
test-measurement,,1662635481218648000,ABCDEFGHI
test-measurement,,1662635481222958200,ABCDEFGHI
test-measurement,,1662635481227320900,ABCDEFGHI
test-measurement,,1662635481231261800,ABCDEFGHI
test-measurement,,1662635481235774400,ABCDEFGHI
test-measurement,,1662635481240051100,ABCDEFGHI
test-measurement,,1662635481244478700,ABCDEFGHI
test-measurement,,1662635481249234700,ABCDEFGHI
test-measurement,,1662635481253759300,ABCDEFGHI
test-measurement,,1662635481257980800,ABCDEFGHI
test-measurement,,1662635481262297700,ABCDEFGHI
test-measurement,,1662635481266422100,ABCDEFGHI
test-measurement,,1662635481270546200,ABCDEFGHI
test-measurement,,1662635481274299700,ABCDEFGHI
test-measurement,,1662635481278596300,ABCDEFGHI
test-measurement,,1662635481282977100,ABCDEFGHI
test-measurement,,1662635481286874400,ABCDEFGHI
test-measurement,,1662635481291093700,ABCDEFGHI
test-measurement,,1662635481295437900,ABCDEFGHI
test-measurement,,1662635481299431600,ABCDEFGHI
test-measurement,,1662635481303185300,ABCDEFGHI
test-measurement,,1662635481306945900,ABCDEFGHI
test-measurement,,1662635481311057700,ABCDEF


Received a new batch number: 3
Size of data: 724
GHI
test-measurement,,1662635481315062400,ABCDEFGHI
test-measurement,,1662635481318992100,ABCDEFGHI
test-measurement,,1662635481322786400,ABCDEFGHI
test-measurement,,1662635481326972500,ABCDEFGHI
test-measurement,,1662635481330995800,ABCDEFGHI
test-measurement,,1662635481334649900,ABCDEFGHI
test-measurement,,1662635481338256200,ABCDEFGHI
test-measurement,,1662635481342069200,ABCDEFGHI
test-measurement,,1662635481345796700,ABCDEFGHI
test-measurement,,1662635481349454300,ABCDEFGHI
test-measurement,,1662635481353189400,ABCDEFGHI
test-measurement,,1662635481357029300,ABCDEFGHI
test-measurement,,1662635481361336800,ABCDEFGHI
test-measurement,,1662635481364963600,ABCDEFGHI
test-measurement,,1662635481368512200,ABCDEFGHI



Finished the request

As can be seen from the output, the batches do not contain 10 data points each. Every chunk is exactly 2048 bytes (except the last one, at 724), which looks like a transport buffer size rather than a chunk_size boundary, and each chunk cuts a .csv record off mid-line.

Running the same query without the ‘Accept: application/csv’ header returns the data as JSON, but even that response arrives truncated in the same way.
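
To make the JSON truncation concrete, the sketch below is the same loop without the Accept header, feeding each received chunk to json.loads; a chunk that has been cut off mid-document fails to parse, which is what I observe:

import json
import requests

params = {'rp': 'retpolicythree', 'db': 'bucket2',
          'q': 'SELECT * from "test-measurement" limit 100',
          'chunked': 'true', 'chunk_size': '10'}
# No Accept header, so the server responds with JSON.
headers = {'Authorization': 'Token <access-token>'}

response = requests.get("http://172.19.0.2:8086/query",
                        params=params, headers=headers, stream=True)

for chunk in response.iter_content(chunk_size=None):
    text = chunk.decode("utf-8")
    try:
        json.loads(text)
        print("complete JSON document: " + str(len(text)) + " bytes")
    except json.JSONDecodeError:
        print("truncated JSON fragment: " + str(len(text)) + " bytes")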

I would very much appreciate some clarification on this topic. Am I doing something wrong (a misconfiguration, a wrong query, the way I read the stream), or does chunked querying in InfluxDB simply not delimit its output by data points?
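
As far as I understand requests, a line-oriented fallback like the one below would reassemble complete CSV records no matter how the network splits the stream, since iter_lines() buffers until it has a full line; but it yields one row at a time rather than chunk_size-sized batches, so it does not really answer my question either:

import requests

params = {'rp': 'retpolicythree', 'db': 'bucket2',
          'q': 'SELECT * from "test-measurement" limit 100',
          'chunked': 'true', 'chunk_size': '10'}
headers = {'Accept': 'application/csv', 'Authorization': 'Token <access-token>'}

response = requests.get("http://172.19.0.2:8086/query",
                        params=params, headers=headers, stream=True)

# iter_lines() never cuts a record in half, but it also hides the
# chunk boundaries entirely.
for row_number, row in enumerate(response.iter_lines(decode_unicode=True), start=1):
    print(str(row_number) + ": " + row)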

Thank you!