Hello,
I generated synthetic IoT data and inserted it into an influxdb2 bucket.
Each record contains a “timestamp” (collection time), “coordinates” (collection location), and some other details about the IoT device. (All of these values are generated.)
In detail:
- I specify _field attribute like this → “Detail_info”: “”
- and among the time attributes (_start, _stop, _time), I specify _time as .
- Finally, because timestamps can be duplicated, I store latitude and longitude as tags so that points sharing a timestamp are not overwritten.
In addition, all the points are stored in a single measurement in a single influxdb2 bucket.
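For reference, the write layout described above can be sketched as line protocol. This is a minimal illustration using only the Python standard library, not the actual generator script: the measurement name `iot`, the tag keys `lat` and `lon`, and the sample values are all assumptions; only the field key `Detail_info` comes from the description above.

```python
def escape(value: str, chars: str) -> str:
    """Backslash-escape each character in `chars` that occurs in `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(measurement, tags, field_key, field_value, timestamp_ns):
    """Build one line-protocol record with a single string field."""
    # Tag keys/values and field keys escape comma, equals sign and space.
    tag_part = ",".join(
        f"{escape(k, ',= ')}={escape(v, ',= ')}" for k, v in sorted(tags.items())
    )
    # String field values escape backslash and double quote.
    value = field_value.replace("\\", "\\\\").replace('"', '\\"')
    # Measurement names escape comma and space.
    return (
        f"{escape(measurement, ', ')},{tag_part} "
        f'{escape(field_key, ",= ")}="{value}" {timestamp_ns}'
    )


# Two hypothetical devices reporting at the same nanosecond timestamp.
ts = 1_700_000_000_000_000_000
line_a = to_line_protocol("iot", {"lat": "37.5665", "lon": "126.9780"},
                          "Detail_info", "device A", ts)
line_b = to_line_protocol("iot", {"lat": "35.1796", "lon": "129.0756"},
                          "Detail_info", "device B", ts)
print(line_a)
print(line_b)
```

Because the two records differ in their tag set, they are kept as separate points even though the timestamp is identical. The cost is that every distinct latitude/longitude pair becomes its own series.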
After the insertion, I read the points in the influxdb2 UI, as shown below.
However, as shown in the image above, I realized that the result tables are split into individual groups based on their tags, and this hurts read performance.
So I used the group() function to try to improve read performance, but I found that using this built-in function as described in the influxdb2 documentation actually makes read performance worse.
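The splitting follows from how Flux shapes its output: rows are partitioned into tables by group key, and for data read with `from()` the group key includes every tag column, so each distinct latitude/longitude pair gets its own table. The stdlib-only sketch below imitates that partitioning on a few made-up rows. It is a mental model rather than InfluxDB's implementation, and the rows and the column names `lat` and `lon` are assumptions.

```python
from collections import defaultdict


def partition(rows, group_columns):
    """Split rows into tables keyed by their values in `group_columns`."""
    tables = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_columns)
        tables[key].append(row)
    return dict(tables)


# Hypothetical query result: four points from three locations.
rows = [
    {"_time": 1, "lat": "37.5", "lon": "126.9", "_value": "a"},
    {"_time": 2, "lat": "37.5", "lon": "126.9", "_value": "b"},
    {"_time": 1, "lat": "35.1", "lon": "129.0", "_value": "c"},
    {"_time": 3, "lat": "33.4", "lon": "126.5", "_value": "d"},
]

# Tag columns in the group key: one table per distinct tag set.
by_series = partition(rows, ["lat", "lon"])
# Empty group key, which is what group() with no arguments produces: one table.
ungrouped = partition(rows, [])

print(len(by_series), "tables when grouped by tags")
print(len(ungrouped), "table after ungrouping")
```

With generated coordinates that are almost all unique, the number of tables approaches the number of points. Merging them with group() also means the server has to regroup every row, which may be part of why it did not help here.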
This is why I thought it would be nice if I could apply the built-in function or a specific query to the data in temporary storage (in memory, not on SSD or HDD) and show the client the same data as is stored in the actual bucket.
So my questions are:
- Is there a library or package that implements the temporary storage described above? If so, I would like to know what it is.
- If there is no such feature, I would like advice on implementing it with a specific language tool (e.g. pandas in Python).
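One way to approach the second question is to fetch the data once and serve later reads from RAM. The sketch below is a stdlib-only stand-in for that idea, assuming the rows have already been fetched and fit in memory. The class name `InMemorySeries` and the row layout `(timestamp, lat, lon, detail)` are hypothetical; a pandas `DataFrame` with a sorted time index would play the same role.

```python
from bisect import bisect_left


class InMemorySeries:
    """Keep rows sorted by timestamp so time-range reads are binary searches."""

    def __init__(self, rows):
        # Each row is a tuple whose first element is the timestamp.
        self._rows = sorted(rows, key=lambda row: row[0])
        self._times = [row[0] for row in self._rows]

    def between(self, start, stop):
        """Return rows with start <= timestamp < stop."""
        lo = bisect_left(self._times, start)
        hi = bisect_left(self._times, stop)
        return self._rows[lo:hi]


# Hypothetical rows; two of them share the timestamp 20.
cache = InMemorySeries([
    (30, "33.4", "126.5", "d"),
    (10, "37.5", "126.9", "a"),
    (20, "35.1", "129.0", "b"),
    (20, "37.5", "126.9", "c"),
])
recent = cache.between(10, 30)
print(recent)
```

The inclusive start and exclusive stop mirror the behaviour of Flux `range()`. Whether roughly 170 million points fit in memory this way depends on the row size and the machine, so that would need to be measured.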
I realize that this differs from the original purpose of influxdb. However, I am running this experiment to see whether it can be used as a time series database for large-scale data processing (I will be using about 170 million data points as experimental data).