I have been using InfluxDB 1.8 for a couple of years now to store data coming from IoT sensors.
I recently increased the sampling frequency of the sensors from 1 Hz to 10 Hz.
Since then, I have noticed that data is being written to the database with MAC addresses that don't exist. With 12 actual sensors, I see 75 sensors in InfluxDB, so 63 of them are “fake”.
MAC is one of the tags in the data, which I write using line protocol.
I have narrowed the problem down to the actual client_write function of the Python library. If I inspect the data right up to the moment it is sent, it looks perfectly fine, with only 12 sensors.
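For reference, a minimal sketch of the kind of check that can be run on the payload just before the write call: it counts points per MAC tag value in the line protocol strings. The measurement name, field name and MAC values in the sample are made up, the tag key is assumed to be literally "MAC", and the parsing is simplified (it handles backslash-escaped commas and spaces, but not every escaping corner case of line protocol).

```python
import re
from collections import Counter

# Split on commas / spaces that are not escaped with a backslash.
_UNESCAPED_COMMA = re.compile(r"(?<!\\),")
_UNESCAPED_SPACE = re.compile(r"(?<!\\) ")


def mac_counts(lines, tag_key="MAC"):
    """Count points per value of `tag_key` in line protocol strings."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # The first unescaped space separates "measurement,tags" from the fields.
        head = _UNESCAPED_SPACE.split(line, maxsplit=1)[0]
        # head looks like "measurement,tag1=v1,tag2=v2"; skip the measurement name.
        for tag in _UNESCAPED_COMMA.split(head)[1:]:
            key, _, value = tag.partition("=")
            if key == tag_key:
                counts[value] += 1
    return counts


# Made-up sample payload (measurement, field and MAC values are hypothetical).
sample = [
    "sensors,MAC=AA:BB:CC:00:00:01 value=1.0 1700000000000000000",
    "sensors,MAC=AA:BB:CC:00:00:02 value=2.0 1700000000000000000",
    "sensors,MAC=AA:BB:CC:00:00:01 value=1.5 1700000000100000000",
]
counts = mac_counts(sample)
print(len(counts), dict(counts))
```

Run on the real payload, len(counts) is the number of distinct sensors about to be sent, which in my case is always 12.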
If I write a couple of hours of this data, InfluxDB reports only 12 sensors, as expected. But if I keep writing more and more data, the extra 63 “fake” sensors appear with a few datapoints here and there, and these “fake” sensors always appear to report at a fraction of 1 Hz.
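One way to list the sensors InfluxDB knows about is the InfluxQL statement SHOW TAG VALUES FROM "sensors" WITH KEY = "MAC" and then a comparison of the result against the known MACs. The sketch below assumes the measurement is called "sensors" (a hypothetical name) and that the result arrives as the JSON structure returned by the 1.x /query endpoint. The response and MAC values shown are made up for illustration.

```python
def macs_from_show_tag_values(response):
    """Collect tag values from an InfluxDB 1.x /query JSON response."""
    macs = set()
    for result in response.get("results", []):
        for series in result.get("series", []):
            # SHOW TAG VALUES returns the columns ["key", "value"].
            value_idx = series["columns"].index("value")
            for row in series["values"]:
                macs.add(row[value_idx])
    return macs


# Hypothetical set of real sensors (made-up MAC addresses).
expected = {"AA:BB:CC:00:00:01", "AA:BB:CC:00:00:02"}

# Made-up response in the shape the /query endpoint returns.
response = {
    "results": [{
        "statement_id": 0,
        "series": [{
            "name": "sensors",
            "columns": ["key", "value"],
            "values": [
                ["MAC", "AA:BB:CC:00:00:01"],
                ["MAC", "AA:BB:CC:00:00:02"],
                ["MAC", "AA:BB:CC:00:00:0"],
            ],
        }],
    }]
}

reported = macs_from_show_tag_values(response)
unexpected = sorted(reported - expected)
print(len(reported), unexpected)
```

Comparing the unexpected values against the real MACs (for example, whether they are truncated or merged versions of real ones) may hint at where the payload gets damaged.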
At what point could the corruption / buffer overflow happen, and what could I do to limit it?
I have tried sending the data in batches of 5000 lines, and I have tried changing the timestamp precision from nanoseconds to milliseconds.
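Roughly, the two attempts look like the sketch below. It assumes the payload is a list of line protocol strings that each end in a nanosecond timestamp. The function names and the sample line are hypothetical, and the actual write call is left out. Note that when millisecond timestamps are sent, the write itself must also be told the precision (precision=ms on the 1.x /write endpoint), otherwise they are interpreted as nanoseconds.

```python
def batches(lines, size=5000):
    """Yield consecutive slices of at most `size` lines."""
    for start in range(0, len(lines), size):
        yield lines[start:start + size]


def ns_to_ms(line):
    """Rewrite the trailing nanosecond timestamp of a line as milliseconds.

    Assumes the line ends with " <integer timestamp>".
    """
    body, _, timestamp = line.rpartition(" ")
    return f"{body} {int(timestamp) // 1_000_000}"


# Made-up payload: 12001 points from one hypothetical sensor at 10 Hz.
start_ns = 1_700_000_000_000_000_000
lines = [
    f"sensors,MAC=AA:BB:CC:00:00:01 value={i} {start_ns + i * 100_000_000}"
    for i in range(12_001)
]

batch_sizes = [len(batch) for batch in batches(lines)]
ms_line = ns_to_ms(lines[1])
print(batch_sizes, ms_line)
```

Each batch would then be handed to the write function in turn. Neither change made a difference to the number of “fake” sensors.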
The weird thing is that even if I delete the data and write it again multiple times, I always end up with exactly 75 sensors, even when I write data from different days.