Queries for most-recent records return random null fields

With one Python client (example at the bottom of this post) we’re inserting ~15 fields into InfluxDB at 10-20 Hz, and with another client (Python, JavaScript via HTTP, or curl; the client doesn’t seem to matter) we’re querying the data back at a regular interval (every second or so).

The query results are showing random NULL field values in random records that we know are not NULL, and this only occurs in the most recent record. If you go back a few seconds later and requery that record (using the timestamp) all fields will be non-NULL as expected. Also, if you change the query to return the TWO most recent records (“ORDER BY time DESC LIMIT 2”), then you’ll only see the NULLs in the most-recent record, never the 2nd-most-recent record.
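
To make the two checks above concrete, here is a minimal sketch of the queries involved. The helper names (`requery_point`, `latest_two`, `query_url`) and the `http://localhost:8086` base URL are illustrative assumptions and not part of our application; only the measurement `test`, the tag `id`, and the database `testdata` come from the examples in this post. The sketch only builds query strings and a URL, so it should work with any HTTP client.

```python
from urllib.parse import urlencode


def requery_point(measurement, tag_key, tag_value, timestamp_ns):
    """InfluxQL that re-reads a single point by its nanosecond epoch timestamp."""
    return (
        'SELECT * FROM "{m}" WHERE time = {t} AND "{k}" = \'{v}\''
        .format(m=measurement, t=int(timestamp_ns), k=tag_key, v=tag_value)
    )


def latest_two(measurement, tag_key):
    """InfluxQL for the two most recent points per tag value."""
    return (
        'SELECT * FROM "{m}" GROUP BY "{k}" ORDER BY time DESC LIMIT 2'
        .format(m=measurement, k=tag_key)
    )


def query_url(base_url, db, q):
    """URL for the InfluxDB 1.x /query endpoint; epoch=ns requests integer timestamps."""
    return base_url.rstrip("/") + "/query?" + urlencode({"db": db, "epoch": "ns", "q": q})


if __name__ == "__main__":
    # Hypothetical usage: re-read the ALPHA point that came back with NULLs.
    q = requery_point("test", "id", "ALPHA", 1549362628735095040)
    print(q)
    print(query_url("http://localhost:8086", "testdata", q))
    print(latest_two("test", "id"))
```

Running the first query a few seconds after seeing a NULL record is how we confirm that the stored point is actually complete.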

The number of fields and the frequency of INSERTs seem to make a difference. When I started writing the example Python script below, I began with 3 fields at 10 Hz and could not reproduce the problem. Only once I got up to 15 fields at 100 Hz did I start seeing it, and even then it took ~15 seconds on my machine. In our real application the problem reproduces constantly.

Is it possible that InfluxDB is returning records that are not yet fully constructed?

X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.2

Shell script to monitor data:

while true
do
    curl -G HOST:PORT/query -u USER:PASS --data-urlencode "db=testdata" --data-urlencode "q=SELECT * FROM test GROUP BY id ORDER BY time DESC LIMIT 1" -H 'Accept: application/csv'
    sleep 1
done

Example Output (note the 2nd and 7th results):

name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362627625793024,3281,3281,2821.7526728879693,3281.2938092214545,3281,NUMBER3281,76,876,9567,0
test,id=ALPHA,1549362627625793024,3281,3281,2821.7526728879693,3281.2938092214545,3281,NUMBER3281,76,876,9567,0
name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362628735095040,3337,3337,2739.900803680287,3337.987289242315,3337,NUMBER3337,75,190,1176,0
test,id=ALPHA,1549362628735095040,,,,,,,,,,0
name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362629847493888,3394,3394,1794.644406422681,3394.137331619533,3394,NUMBER3394,27,7,72,0
test,id=ALPHA,1549362629847493888,3394,3394,1794.644406422681,3394.137331619533,3394,NUMBER3394,27,7,72,0
name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362630926076928,3447,3447,3088.427124800288,3447.1667128580702,3447,NUMBER3447,1,225,3706,0
test,id=ALPHA,1549362630926076928,3447,3447,3088.427124800288,3447.1667128580702,3447,NUMBER3447,1,225,3706,0
name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362632039686912,3502,3502,226.2890415850697,3502.8870562305306,3502,NUMBER3502,32,535,8675,0
test,id=ALPHA,1549362632039686912,3502,3502,226.2890415850697,3502.8870562305306,3502,NUMBER3502,32,535,8675,0
name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362633150441984,3559,3559,14.748148441622998,3559.527007017778,3559,NUMBER3559,72,867,5978,0
test,id=ALPHA,1549362633150441984,3559,3559,14.748148441622998,3559.527007017778,3559,NUMBER3559,72,867,5978,0
name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362634248436992,3614,3614,668.0670182831518,3614.584586236862,3614,NUMBER3614,63,978,6921,0
test,id=ALPHA,1549362634248436992,,,,,,NUMBER3614,63,978,6921,0
name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362635328133120,3668,3668,2697.5671731272537,3668.6300554540594,3668,NUMBER3668,54,249,1678,0
test,id=ALPHA,1549362635328133120,3668,3668,2697.5671731272537,3668.6300554540594,3668,NUMBER3668,54,249,1678,0
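
For anyone who wants to spot these records automatically instead of eyeballing the output, here is a rough sketch that scans the CSV returned by the monitor loop and reports rows with blank fields. The function name `rows_with_missing_fields` is made up for this example, and it assumes the column layout shown above (`name,tags,time` followed by the fields). Note that a field that is legitimately an empty string looks the same as a NULL in CSV, so this is only a heuristic.

```python
import csv
import io


def rows_with_missing_fields(csv_text):
    """Return (tags, time_ns, [blank column names]) for each row with blank fields."""
    header = None
    hits = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        if row[0] == "name":
            # Each response in the monitor output repeats the header line.
            header = row
            continue
        if header is None:
            continue
        # Columns 0-2 are name, tags, time; everything after is a field.
        missing = [col for col, val in zip(header[3:], row[3:]) if val == ""]
        if missing:
            hits.append((row[1], int(row[2]), missing))
    return hits


# Two rows copied from the example output above.
SAMPLE = """\
name,tags,time,a,b,c,d,e,f,g,h,i,k
test,id=BRAVO,1549362634248436992,3614,3614,668.0670182831518,3614.584586236862,3614,NUMBER3614,63,978,6921,0
test,id=ALPHA,1549362634248436992,,,,,,NUMBER3614,63,978,6921,0
"""

if __name__ == "__main__":
    for tags, time_ns, missing in rows_with_missing_fields(SAMPLE):
        print(tags, time_ns, "missing:", ",".join(missing))
```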

Demo script to insert data:

import time, datetime, math, random
from influxdb import InfluxDBClient

# HOST, PORT, USER, and PASS are placeholders; define them for your server before running.

BASETIME = datetime.datetime(1970, 1, 1)

def createInfluxClient(database, measurement):
	client = InfluxDBClient(HOST, PORT, USER, PASS)

	if database not in [db['name'] for db in client.get_list_database()]:
		client.create_database(database)

	client.switch_database(database)

	if measurement in [ms['name'] for ms in client.get_list_measurements()]:
		client.drop_measurement(measurement)
	return client

def getTimestamp():
	# Use UTC so the result is a true Unix epoch offset, in nanoseconds.
	d = datetime.datetime.utcnow() - BASETIME
	return int(math.floor(d.total_seconds() * 1e+9))

if __name__ == '__main__':
	client = createInfluxClient('testdata', 'test')

	try:
		count = 0
		while True:
			timestamp = getTimestamp()
			fields = {
				"a": count,
				"b": count,
				"c": count * random.uniform(0, 1),
				"d": count + random.uniform(0, 1),
				"e": str(count),
				"f": 'NUMBER' + str(count),
				"g": random.randint(0, 100),
				"h": random.randint(0, 1000),
				"i": random.randint(0, 10000),
				"j": "",
				"k": 0.0,
				"l": 0,
				"m": 'asdf',
				"n": 42,
				"o": 0,
			}
			client.write_points([
				{
					"measurement": "test",
					"time": timestamp,
					"tags": {
						"id": "ALPHA",
					},
					"fields": fields,
				},
				{
					"measurement": "test",
					"time": timestamp,
					"tags": {
						"id": "BRAVO",
					},
					"fields": fields,
				}
			])
			print(count)
			count = count + 1
			time.sleep(0.01)
	except KeyboardInterrupt:
		pass
	finally:
		client.close()

I’m having the exact same issue in a production application. I’ve also reproduced it with a simple Node client example, using InfluxDB v1.8.2, on both Ubuntu 18.04.1 and Windows 10.

Higher request frequency and larger data size both seem to make the null values appear more often. I generally have to let it run for a few minutes before the issue occurs. If I run 3 or 4 instances of the code below simultaneously, it starts happening frequently.

Any insight or solution is appreciated.

Code to Reproduce:

const { InfluxDB } = require("influx");

const influx = new InfluxDB({
  host: "localhost",
  database: "testDb",
});

const createDb = async () => {
  await influx.createDatabase("testDb");
};

const read = async () => {
  const res = await influx.query(`
  select * from livedata
  ORDER BY time desc
  limit 1
`);

  console.log(res[0]);
};

const write = async () => {
  await influx.writeMeasurement("livedata", [
    {
      tags: {
        id: "site",
      },
      fields: {
        site1: Math.random() * 1000,
        site2: Math.random() * 1000,
        site3: Math.random() * 1000,
        site4: Math.random() * 1000,
        site5: Math.random() * 1000,
        site6: Math.random() * 1000,
        site7: Math.random() * 1000,
        site8: Math.random() * 1000,
      },
    },
  ]);
};

createDb();

setInterval(() => {
  write();
  read();
}, 300);

Example output:

[
  [
    '2021-01-07T21:40:11.4031559Z',
    'site',
    830.4042230769617,
    522.1830877694142,
    698.8789904008146,
    678.305459618109,
    118.82269436309988,
    631.6295948279627,
    376.3112870744887,
    830.4872612882643
  ]
]
[
  [
    '2021-01-07T21:40:11.7034901Z',
    'site',
    null,
    null,
    null,
    65.3968316403697,
    680.7946463560837,
    330.7338852317838,
    872.7936919556367,
    145.03057994702618
  ]
]
[
  [
    '2021-01-07T21:40:12.0036893Z',
    'site',
    901.031149970251,
    501.1825877093237,
    99.38758592260699,
    78.79549874505165,
    403.8558500935323,
    545.085784401504,
    969.637642068842,
    51.657735620841194
  ]
]