Overview:
InfluxDB (v1.5.2) runs on Azure, and I am inserting records into it from Windows/CentOS. I have written a Python script that reads data from a CSV file and writes it to InfluxDB. The ingestion rate is quite good on a newly created database, but once the record count crosses the 1.5 million mark the ingestion rate deteriorates, to the point that not all records get ingested. If I take the same CSV file and insert the records into a freshly created database, more of them make it in. The data loss varies between 40% and 80%.
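For context, the ingestion script follows roughly the pattern below. This is a minimal sketch, assuming the influxdb-python client; the host, database, measurement, and column names are placeholders, not the exact values from my script.

    import csv
    from influxdb import InfluxDBClient  # influxdb-python client, talking to InfluxDB 1.5.2

    # Placeholder connection details; the real instance runs on the Azure VM.
    client = InfluxDBClient(host="my-azure-host", port=8086, database="mydb")

    # Illustrative column names: 8 regular tags + 1 unique tag, and 3 numeric fields.
    TAG_COLUMNS = ["tag1", "tag2", "tag3", "tag4", "tag5", "tag6", "tag7", "tag8", "uniq_id"]
    FIELD_COLUMNS = ["field1", "field2", "field3"]

    def ingest_csv(path):
        points = []
        with open(path) as f:
            for row in csv.DictReader(f):
                points.append({
                    "measurement": "my_measurement",                      # single measurement
                    "tags": {c: row[c] for c in TAG_COLUMNS},
                    "time": int(row["timestamp"]),                        # epoch seconds in the CSV
                    "fields": {c: float(row[c]) for c in FIELD_COLUMNS},
                })
        # One CSV (up to 10K rows) is pushed every 5 minutes; write_points splits
        # the payload into HTTP requests of batch_size points each.
        client.write_points(points, time_precision="s", batch_size=5000)

    ingest_csv("data.csv")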
Configuration Details:
The Azure instance we signed up for uses Premium SSD Managed Disks.
P4 tier: disk size = 32 GB, IOPS per disk = 120, throughput per disk = 25 MB/sec, RAM = 8 GB.
No problem was observed until the record count passed 1.5 million; after that the ingestion rate began to choke.
Azure disk details:
SDA1 >> Size = 976 MB; Used = 46 MB (5%); Available = 879 MB
SDA2 >> Size = 29 GB; Used = 9.2 GB (35%); Available = 18 GB
InfluxDB (v1.5.2) instance size allocated = ~30 GB.
My InfluxDB database size = ~2.0 GB.
InfluxDB schema and query details:
Measurements = 1
No. of tags = 9 (8 + 1 unique tag)
No. of fields = 3
Number of series = 999,952
One CSV file with at most 10K rows (it can be fewer, but never more) is ingested every 5 minutes (288,000 records per day). The queries used are simple, with no regular expressions.
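On the wire, a single point therefore looks roughly like the following line-protocol entry (tag and field names are illustrative; the last tag is the unique one that drives the series count):

    my_measurement,tag1=a,tag2=b,tag3=c,tag4=d,tag5=e,tag6=f,tag7=g,tag8=h,uniq_id=row000123 field1=1.2,field2=3.4,field3=5.6 1526385600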
For every 10K records ingested:
Time taken for ingestion = 5-10 seconds (depending on the size of the database).
Field writes per second ≈ 100.
Total unique series = 10,000
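For completeness, this is roughly how I check how much of the data actually landed after a run. Again a sketch with placeholder names, using the same Python client; SHOW SERIES CARDINALITY is available from InfluxDB 1.4 onward.

    # Compare the stored point count against the number of rows sent, and watch series growth.
    result = client.query('SELECT COUNT("field1") FROM "my_measurement"')
    print(list(result.get_points()))

    cardinality = client.query('SHOW SERIES CARDINALITY')
    print(list(cardinality.get_points()))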
Any suggestions are welcome!
Thanks in advance.