I’m writing CSV files to a bucket in InfluxDB v2.3 using the write command in the CLI. Each CSV has 100000 records. The first 9 CSVs are written successfully; I check this by querying the bucket to show all the points. At that stage there are 895440 points in the bucket. I believe the reason there aren’t 900000 points is that Influx replaces points that have the same measurement, tag set, and timestamp. However, when I write the 10th CSV there are only 941622 points in the bucket. I would expect at least 990000 points (there are only about 19000 points that Influx would see as duplicates across all my CSVs). If I write more CSVs, the number of points does not change much and sometimes it even decreases. I am not getting any errors in the CLI.
Is there a max number of points a bucket can take?
Are there configurations I need to change?
Any help would be much appreciated. Thank you.
You’re correct. You will overwrite existing points if the lines contain the same measurement, tag set, field keys, and timestamp.
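To make the overwrite behaviour concrete, here is a toy model in Python. It is only a sketch of the rule described above, not how InfluxDB stores data: points are keyed by measurement, tag set, and timestamp, and a later write with the same key replaces the value of any field key it shares with the earlier one. The names `write_points` and `store` are made up for illustration.

```python
# Toy model of the overwrite rule (an illustration, not InfluxDB internals).
# A point is identified by (measurement, tag set, timestamp).

def write_points(store, points):
    """Write points into a dict, overwriting fields on identical keys."""
    for measurement, tags, fields, timestamp in points:
        key = (measurement, frozenset(tags.items()), timestamp)
        # Same key: existing field values with the same field key are replaced.
        store.setdefault(key, {}).update(fields)
    return store

store = {}
write_points(store, [
    ("cpu", {"host": "a"}, {"usage": 1.0}, 1000),
    ("cpu", {"host": "a"}, {"usage": 2.0}, 1000),  # same key: overwrites usage
    ("cpu", {"host": "b"}, {"usage": 3.0}, 1000),  # different tag set: new point
])

print(len(store))  # 3 rows written, 2 points stored
```

So three lines written can legitimately end up as two points without any error being reported.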
You’re using OSS?
No, there isn’t a limit.
Is it possible that your estimation is off? How are you making that estimate?
@mhall119 do you have any suggestions here? Have you seen this?
Thank you very much for your reply.
Yes, I am using OSS.
Estimation for duplicates: In SQL I am grouping by the timestamp, measurement and tag set, while doing a count and using a HAVING clause > 1. Here I get roughly 19000 rows.
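One thing worth checking in an estimate like this: a GROUP BY with HAVING count > 1 returns one row per duplicated key, which is not the same as the number of rows that disappear on write. A key that occurs three times is one group but loses two rows. The sketch below shows the difference in Python on a handful of made-up rows (the values are purely illustrative, not taken from the real CSVs).

```python
from collections import Counter

# Hypothetical rows: (measurement, tag set, timestamp).
rows = [
    ("cpu", "host=a", 1000),
    ("cpu", "host=a", 1000),
    ("cpu", "host=a", 1000),
    ("cpu", "host=b", 1000),
    ("cpu", "host=b", 2000),
    ("cpu", "host=b", 2000),
]

counts = Counter(rows)

# What GROUP BY ... HAVING COUNT(*) > 1 reports: number of duplicated keys.
duplicate_groups = sum(1 for n in counts.values() if n > 1)

# Rows that are actually lost when duplicates collapse into one point each.
rows_lost = sum(n - 1 for n in counts.values())

# Points expected in the bucket after writing every row.
expected_points = len(counts)

print(duplicate_groups, rows_lost, expected_points)  # 2 3 3
```

If many keys repeat more than twice, the expected point count will be lower than "total rows minus number of duplicate groups".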
Estimation for number of points in bucket: I am using a Flux query that reads from my bucket, uses a time range that covers all my data, and filters on my measurement; I also removed the aggregate window. I believe this should return all the data in the bucket, but I stand to be corrected.
One thing I’ve noticed in the UI when I submit the Flux query is that the query takes some time and buffers. Before the query finishes, a red box pops up for a split second. In order to read what it says I needed to record my screen. It says "Large response truncated to first 100.1 MB".
Does this mean all of my data is in the bucket and only showing the first 100.1 MB? If so, is there a way to bypass this limitation and show all of my data?
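One way to check the total without pulling every row back into the UI is to have the query return a count instead of the data. Below is a minimal sketch, assuming the Python client library `influxdb-client` is installed. The bucket name `my-bucket`, the measurement `my_measurement`, and the URL, token, and org arguments are placeholders, not values from this thread. The Flux counts each series and then sums the per-series counts. Note that Flux returns one row per field value, so a point with several fields is counted once per field.

```python
def build_count_query(bucket: str, measurement: str) -> str:
    """Build a Flux query that returns a single total row count."""
    return (
        f'from(bucket: "{bucket}")\n'
        # range() stops at now() by default, so future timestamps are excluded.
        '  |> range(start: 1970-01-01T00:00:00Z)\n'
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")\n'
        '  |> count()\n'   # one count per series
        '  |> group()\n'   # merge all series into one table
        '  |> sum()\n'     # add the per-series counts together
    )

def count_rows(url: str, token: str, org: str, bucket: str, measurement: str) -> int:
    """Run the count query against a server (requires influxdb-client)."""
    from influxdb_client import InfluxDBClient  # imported lazily on purpose

    with InfluxDBClient(url=url, token=token, org=org) as client:
        tables = client.query_api().query(build_count_query(bucket, measurement))
        return sum(rec.get_value() for table in tables for rec in table.records)

query = build_count_query("my-bucket", "my_measurement")
print(query)
```

The printed Flux can also be pasted into the UI's script editor. Because the response is a single number rather than every row, it should stay far below the size at which the UI truncates.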
Just to give an update on this issue. It seems like all of my data is in my bucket, I just can’t view all of my data at once.
Is there something I can configure that will make it possible to view all of my data at once?