Influxdb failing to load all data points via HTTP

influxdb
#1

Hi all,
I have 5,300,000 data points, all in the same series, to be loaded into InfluxDB. The data points are split into files of 5,000 lines each. I use a bash script that submits one file via curl every 30 seconds. It takes quite a few hours to run (!) but every request returns an HTTP 204 response code.
When I go into influx and run
SELECT COUNT(*) FROM "digital"
I get a count of 654,363, not 5,300,000.
The database was created with a retention policy duration of 36500d, as the data I have is old (2011-2017). I am using a precision of ‘s’.
I am running InfluxDB OSS 1.7.4 in a docker container with a persisted volume for /var/lib/influxdb.
Can anyone give me any idea where I should look for the problem or any ideas how to fix this?
Thanks,
Stephen

UPDATE:
I have looked at the influxdb logs and there are no lines that look like errors.

#2

Hi @setheridge,

What’s your shard duration? For this kind of upload, it’s worth noting that the default shard group duration for an RP of that length (https://v2.docs.influxdata.com/v2.0/reference/flux/functions/built-in/transformations/selectors/top/#function-definition) would be 7 days. This means that uploading data spanning ~7 years would open up ~350 shard groups…which is probably too many.

For this upload, I’d suggest setting a shard duration of up to 7 years (not less than 1 year, probably) prior to upload.
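
To put rough numbers on that, here is a minimal Python sketch that estimates how many shard groups a 2011-2017 data set needs at different shard durations, and builds the InfluxQL statement to change it. The database name "mydb" is a placeholder, "autogen" assumes the 1.x default retention policy is in use, and the date range is taken loosely from the original post. The count is approximate because it ignores how shard groups align to boundaries.

```python
import math
from datetime import datetime, timedelta


def shard_group_count(start, end, shard_duration):
    """Approximate number of shard groups needed to cover [start, end)."""
    return math.ceil((end - start) / shard_duration)


def alter_shard_duration_statement(database, retention_policy, weeks):
    """Build an InfluxQL statement that sets the shard group duration in weeks."""
    return (
        f'ALTER RETENTION POLICY "{retention_policy}" ON "{database}" '
        f"SHARD DURATION {weeks}w"
    )


# Assumed span of the data: the years 2011 through 2017 inclusive.
start = datetime(2011, 1, 1)
end = datetime(2018, 1, 1)

default_count = shard_group_count(start, end, timedelta(days=7))
yearly_count = shard_group_count(start, end, timedelta(weeks=52))
print(f"7-day shards: about {default_count} shard groups")
print(f"52-week shards: about {yearly_count} shard groups")

# "mydb" is a hypothetical database name; substitute your own.
print(alter_shard_duration_statement("mydb", "autogen", 52))
```

The printed statement can be run in the influx shell. A changed shard duration only applies to shard groups created afterwards, so it needs to be in place before the data is loaded.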

#3

Thanks Sam, I will change that and see what happens. It is the default at the moment. I’ll let you know what happens.

#4

Well Sam, that fixed the issue almost perfectly. I now load most of the data (most means 99.99999%), but a couple of hundred data points out of the 32 million get skipped. I have no idea why; there does not seem to be any similarity between the ones that are skipped.
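
One possible cause worth ruling out (an assumption, nothing in this thread confirms it): InfluxDB treats two points with the same measurement, tag set and timestamp as the same point, so a later write replaces the earlier field value. With a precision of ‘s’, two readings that fall within the same second would collide in this way. A minimal Python sketch for counting such collisions in line-protocol input follows; the tag "channel" and field "value" in the sample are made up, and the parser assumes no spaces inside tag or field values.

```python
from collections import Counter


def point_key(line):
    """Return (series key, timestamp) for one line-protocol line.

    Assumes the line has exactly three space-separated parts:
    measurement plus tags, field set, timestamp.
    """
    series_key, _fields, timestamp = line.strip().split(" ")
    return series_key, timestamp


def count_overwritten(lines):
    """Count points that share a series key and timestamp with an earlier point."""
    keys = Counter(
        point_key(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    )
    return sum(n - 1 for n in keys.values() if n > 1)


# Hypothetical sample: the first two points share a series and timestamp.
sample = [
    "digital,channel=1 value=0 1300000000",
    "digital,channel=1 value=1 1300000000",
    "digital,channel=1 value=1 1300000001",
]
print(count_overwritten(sample))
```

If the number this reports across all the input files matches the number of missing points, duplicate timestamps are the likely explanation.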