Writing CSVs to an InfluxDB bucket through the CLI

Hi all,

I’m writing CSVs to a bucket I have in InfluxDB v2.3 using the write command in the CLI. Each CSV has 100000 records. The first 9 CSVs are written successfully; I check this by querying the bucket to show all the points. At that point there are 895440 points in the bucket. I believe the reason there aren’t 900000 points is that InfluxDB replaces points that have the same measurement, tag set, and timestamp. However, after I write the 10th CSV there are only 941622 points in the bucket. I would expect there to be at least 990000 points (there are only about 19000 points across all my CSVs that InfluxDB would treat as duplicates). If I write more CSVs, the number of points does not change much and sometimes it even decreases. I am not getting any errors in the CLI.
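
For reference, the commands I’m running look roughly like this (bucket, org, and file names are placeholders):

```
influx write --bucket my-bucket --org my-org --file data_10.csv --format csv
```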

Is there a max number of points a bucket can take?
Are there configurations I need to change?

Any help would be much appreciated. Thank you.

Hello @kouter,
Welcome!
You’re correct. You will overwrite existing points if the lines contain the same measurement, tag set, field keys, and timestamp.
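For example (made-up data), two writes that share the same measurement, tag set, field keys, and timestamp collapse into a single point:

```
# first write
influx write -b my-bucket 'cpu,host=server01 usage=0.5 1657000000000000000'
# second write with the same series key and timestamp replaces the field value
influx write -b my-bucket 'cpu,host=server01 usage=0.9 1657000000000000000'
# the bucket now holds one point for this series/timestamp, with usage=0.9
```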
You’re using OSS?
No, there isn’t a limit on the number of points a bucket can hold.
Is it possible that your estimation is off? How are you making that estimate?

@mhall119 do you have any suggestions here? Have you seen this?

Hi @Anaisdg,

Thank you very much for your reply.

Yes, I am using OSS.

Estimation for duplicates: in SQL I group by the timestamp, measurement, and tag set, count the rows in each group, and use a HAVING clause to keep groups with a count greater than 1. This gives me roughly 19000 rows.
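
The query looks roughly like this (table and tag column names are just for illustration):

```sql
-- count groups that share the same timestamp, measurement, and tag values
SELECT ts, measurement, tag1, tag2, COUNT(*) AS n
FROM source_table
GROUP BY ts, measurement, tag1, tag2
HAVING COUNT(*) > 1;
```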

Estimation for the number of points in the bucket: I run a Flux query from my bucket with a time range that encapsulates all my data, filter on my measurement, and I removed the aggregate window. I believe this should return all the data in the bucket, but I stand to be corrected.
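
The query is roughly this (bucket name, range, and measurement are placeholders):

```
from(bucket: "my-bucket")
  |> range(start: 2022-01-01T00:00:00Z, stop: 2023-01-01T00:00:00Z)
  |> filter(fn: (r) => r._measurement == "my_measurement")
```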

One thing I’ve noticed in the UI when I submit the Flux query is that the query takes some time and buffers. Before the query finishes, a red box pops up for a split second; in order to read what it says I needed to record my screen. It says "Large response truncated to first 100.1 MB."

[Screenshot of the "Large response truncated to first 100.1 MB" message in the UI]

Does this mean all of my data is in the bucket and the UI is only showing the first 100.1 MB? If so, is there a way to bypass this limitation and view all of my data?

Just to give an update on this issue: it seems like all of my data is in my bucket, I just can’t view all of it at once.

Is there something I can configure that will make it possible to view all of my data at once?