Writing CSVs to an InfluxDB bucket through the CLI

Hi all,

I’m writing CSVs to a bucket I have in InfluxDB v2.3 using the write command in the CLI. Each CSV has 100000 records. The first 9 CSVs are written successfully; I check this by querying the bucket to show all the points. At that point there are 895440 points in the bucket. I believe the reason there aren’t 900000 points is that InfluxDB replaces points that have the same measurement, tag set, and timestamp. However, after I write the 10th CSV there are only 941622 points in the bucket. I would expect there to be at least 990000 points (there are only about 19000 points across all my CSVs that InfluxDB would treat as duplicates). If I write more CSVs, the number of points does not change much and sometimes it even decreases. I am not getting any errors in the CLI.
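
For reference, the commands I’m running look roughly like this (bucket, org, and file names are placeholders):

```
influx write --bucket my-bucket --org my-org --file data_10.csv --format csv
```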

Is there a max number of points a bucket can take?
Are there configurations I need to change?

Any help would be much appreciated. Thank you.

Hello @kouter,
Welcome!
You’re correct. You will overwrite existing points if the lines contain the same measurement, tag set, field keys, and timestamp.
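For example (made-up data), two writes that share the same measurement, tag set, field keys, and timestamp collapse into a single point:

```
# first write
influx write -b my-bucket 'cpu,host=server01 usage=0.5 1657000000000000000'
# second write with the same series key and timestamp replaces the field value
influx write -b my-bucket 'cpu,host=server01 usage=0.9 1657000000000000000'
# the bucket now holds one point for this series/timestamp, with usage=0.9
```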
You’re using OSS?
No, there isn’t a limit on the number of points a bucket can hold.
Is it possible that your estimation is off? How are you making that estimate?

@mhall119 do you have any suggestions here? Have you seen this?

Hi @Anaisdg,

Thank you very much for your reply.

Yes, I am using OSS.

Estimation for duplicates: in SQL I group by the timestamp, measurement, and tag set, count the rows in each group, and use a HAVING clause to keep groups with a count greater than 1. This gives me roughly 19000 rows.
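
The query looks roughly like this (table and tag column names are just for illustration):

```sql
-- count groups that share the same timestamp, measurement, and tag values
SELECT ts, measurement, tag1, tag2, COUNT(*) AS n
FROM source_table
GROUP BY ts, measurement, tag1, tag2
HAVING COUNT(*) > 1;
```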

Estimation for the number of points in the bucket: I run a Flux query from my bucket with a time range that encapsulates all my data, filter on my measurement, and I removed the aggregate window. I believe this should return all the data in the bucket, but I stand to be corrected.
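
The query is roughly this (bucket name, range, and measurement are placeholders):

```
from(bucket: "my-bucket")
  |> range(start: 2022-01-01T00:00:00Z, stop: 2023-01-01T00:00:00Z)
  |> filter(fn: (r) => r._measurement == "my_measurement")
```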

One thing I’ve noticed in the UI when I submit the Flux query is that the query takes some time and buffers. Before the query finishes, a red box pops up for a split second; in order to read what it says I needed to record my screen. It says "Large response truncated to first 100.1 MB."

[Screenshot of the "Large response truncated to first 100.1 MB" message in the UI]

Does this mean all of my data is in the bucket and the UI is only showing the first 100.1 MB? If so, is there a way to bypass this limitation and view all of my data?

Just to give an update on this issue: it seems like all of my data is in my bucket, I just can’t view all of it at once.

Is there something I can configure that will make it possible to view all of my data at once?