I have a requirement to store baseband data received from a UWB sensor.
Each frame contains around 80 complex samples.
Since field values in InfluxDB do not support an array datatype, I analyzed two possible schemas:
- A single string field containing all 80 complex samples of a frame, concatenated into one string:
Complex_data: "[0.4, 5.55].........[0.3, -2.1]"
- 80 float fields for the real values and another 80 float fields for the imaginary values:
real_0: 0.4, imaginary_0: 5.55, real_1: 3.7, imaginary_1: -2.0 ………… real_79: 0.3, imaginary_79: -2.1
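To make the two layouts concrete, here is a minimal sketch of how one frame could be turned into an InfluxDB line protocol record under each schema. The measurement name `uwb_baseband`, the helper names, and the choice to concatenate the `[real, imag]` pairs with no separator are assumptions for illustration only; they are not taken from my actual writer code.

```python
from typing import List


def string_schema_line(measurement: str, samples: List[complex], timestamp_ns: int) -> str:
    """Schema 1: one string field holding every sample of the frame."""
    # Assumption: pairs are concatenated directly, e.g. "[0.4, 5.55][0.3, -2.1]".
    payload = "".join(f"[{s.real}, {s.imag}]" for s in samples)
    # Line protocol string field values are double-quoted; escape backslashes
    # and double quotes inside the value.
    escaped = payload.replace("\\", "\\\\").replace('"', '\\"')
    return f'{measurement} Complex_data="{escaped}" {timestamp_ns}'


def float_schema_line(measurement: str, samples: List[complex], timestamp_ns: int) -> str:
    """Schema 2: one float field per real part and one per imaginary part."""
    fields = []
    for i, s in enumerate(samples):
        fields.append(f"real_{i}={s.real}")
        fields.append(f"imaginary_{i}={s.imag}")
    return f"{measurement} {','.join(fields)} {timestamp_ns}"


if __name__ == "__main__":
    # Two samples instead of 80 to keep the output readable.
    frame = [complex(0.4, 5.55), complex(0.3, -2.1)]
    print(string_schema_line("uwb_baseband", frame, 1))
    print(float_schema_line("uwb_baseband", frame, 1))
```

With a full frame of 80 samples, the first function produces one field per point and the second produces 160 fields per point.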
I analyzed these two schemas on disk writes and query time, and the results are counter-intuitive.
I expected both the disk I/O and the storage space to be much higher for the string schema than for the float schema. Instead, the disk I/O is the same for both schemas, while the string schema occupies much less storage.
Disk I/O (write_bytes from Telegraf): For storing as String:
Disk I/O (write_bytes from Telegraf): For storing as 80x2 different float fields (row_0, row_1…row_79, imaginary_0, imaginary_1…imaginary_79):
I am not able to upload more pictures, but the Disk I/O did not differ much between the two schemas. Actually, it is the same.
File Storage: for storing as String:
The starting storage occupied was 30 MB.
After storing around 600,000 data points, the storage rose to 220 MB.
(I am running du -sh in the /influxdb/engine directory for reference.)
File Storage: for storing as 80x2 float fields (row_0, row_1…row_79, imaginary_0, imaginary_1…imaginary_79):
The starting storage occupied was 21 MB.
After storing around 600,000 data points, the storage rose to 569 MB.
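So storing the data as 160 float fields ends up occupying roughly 2.6x the total storage of the string schema (569 MB vs. 220 MB), and the growth over the starting size is almost 3x (548 MB vs. 190 MB).
I suspect this is happening because InfluxDB maps each field to a timestamp.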
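For reference, the storage numbers above can be turned into a rough per-point cost. This is only a back-of-the-envelope sketch: it assumes that the sizes reported by du -sh are in units of 1024 x 1024 bytes, that exactly 600,000 points were written in each run, and that each data point corresponds to one frame of 80 complex samples. The function and variable names are my own.

```python
MIB = 1024 * 1024          # assumption: du -sh "M" means mebibytes
POINTS = 600_000           # approximate number of points written per run
VALUES_PER_POINT = 160     # 80 real + 80 imaginary values per frame (assumed)


def growth_bytes(start_mb: float, end_mb: float) -> float:
    """Bytes added on disk between the start and end measurements."""
    return (end_mb - start_mb) * MIB


def bytes_per_point(start_mb: float, end_mb: float, points: int = POINTS) -> float:
    """Average on-disk bytes consumed per written point."""
    return growth_bytes(start_mb, end_mb) / points


# String schema: 30 MB -> 220 MB; float schema: 21 MB -> 569 MB.
string_cost = bytes_per_point(30, 220)
float_cost = bytes_per_point(21, 569)
growth_ratio = growth_bytes(21, 569) / growth_bytes(30, 220)

if __name__ == "__main__":
    print(f"string schema: {string_cost:.1f} bytes per point")
    print(f"float schema:  {float_cost:.1f} bytes per point")
    print(f"float schema:  {float_cost / VALUES_PER_POINT:.2f} bytes per stored value")
    print(f"growth ratio (float / string): {growth_ratio:.2f}")
```

These figures include whatever else lives in the engine directory (WAL, index, and so on), so they are an upper bound on the cost of the field data itself.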
Is there any explanation for
- why the Disk I/O remains the same regardless of the schema?
- why the storage for the 160 float fields is so high when fields are not indexed?