Searching for libraries in influxdb2 to apply large data processing techniques

ksunbum97 · March 13, 2024, 7:14am

Hello,

I generated synthetic IoT data and inserted it into an influxdb2 bucket.
Each record contains a “timestamp” (collection time), “coordinates” (collection location), and some other detail information about the IoT device. (All of these values are generated.)

In detail:

I specify the _field attribute like this → “Detail_info”: “”
and among the time attributes (_start, _stop, _time), I specify _time as .
Finally, because timestamps can be duplicated, I store latitude and longitude as tags so that points sharing a timestamp stay distinct.

In addition, all the points are stored in a single measurement in a single influxdb2 bucket.
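
To make the layout above concrete, here is a minimal sketch of how one such point would look in InfluxDB line protocol, built with plain Python string handling. The measurement name `iot`, the tag keys `latitude` and `longitude`, the sample values, and the helper names are assumptions for illustration only; they are not taken from the actual dataset.

```python
def escape_tag(value: str) -> str:
    # Line protocol requires commas, equals signs and spaces
    # in tag values to be backslash-escaped.
    return value.replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def escape_field_string(value: str) -> str:
    # String field values are double-quoted; backslashes and
    # double quotes inside them must be escaped.
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_line_protocol(measurement, lat, lon, detail_info, timestamp_ns):
    # Hypothetical helper: coordinates become tags, Detail_info is the
    # only field, and the collection time is the point timestamp (ns).
    tags = f"latitude={escape_tag(str(lat))},longitude={escape_tag(str(lon))}"
    field = f'Detail_info="{escape_field_string(detail_info)}"'
    return f"{measurement},{tags} {field} {timestamp_ns}"


ts = 1710314040000000000  # made-up timestamp in nanoseconds
line_a = to_line_protocol("iot", 37.5665, 126.978, "device ok", ts)
line_b = to_line_protocol("iot", 35.1796, 129.0756, "device ok", ts)

# Same timestamp, different tag values: two distinct series,
# so neither point replaces the other.
print(line_a)
print(line_b)
```

Because every distinct coordinate pair creates its own series, this layout also explains why a query returns many small tables rather than one.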

After the insertion, I read the points in the influxdb2 UI, as shown below.

However, as shown in the image above, I realized that the result is split into separate tables, one per unique combination of tag values, and this hurts read performance.
So I used the group() function to merge them and improve read performance a little, but I found that applying the built-in function as described in the influxdb2 documentation actually makes reads slower.
That is why I thought it would be nice to apply the built-in function or a specific query in temporary in-memory storage (not on SSD or HDD), so that the client sees the data in the same form as it is stored in the actual bucket.

So my questions are,

if there is a library or package that implements the temporary storage mentioned above, I would like to know what it is,
and if there is no such feature, I would like advice on implementing it with a specific language tool (e.g. pandas in Python).
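
As a rough illustration of the second option, the sketch below merges per-series result tables into a single time-ordered list in memory, using only the Python standard library instead of pandas. It assumes each table arrives already sorted by `_time` and is represented as a list of dicts; the record layout, the sample values, and the `merge_tables` name are hypothetical, and this is not an existing influxdb2 feature.

```python
import heapq
from operator import itemgetter


def merge_tables(tables):
    # k-way merge of tables that are each already sorted by _time.
    # Rows are compared on _time only; ties keep the input table order.
    return list(heapq.merge(*tables, key=itemgetter("_time")))


# Made-up result: one small table per (latitude, longitude) tag pair.
tables = [
    [
        {"_time": 1, "latitude": "37.5", "longitude": "126.9", "_value": "a"},
        {"_time": 4, "latitude": "37.5", "longitude": "126.9", "_value": "d"},
    ],
    [
        {"_time": 2, "latitude": "35.1", "longitude": "129.0", "_value": "b"},
    ],
    [
        {"_time": 2, "latitude": "33.4", "longitude": "126.5", "_value": "c"},
        {"_time": 5, "latitude": "33.4", "longitude": "126.5", "_value": "e"},
    ],
]

merged = merge_tables(tables)
for row in merged:
    print(row["_time"], row["latitude"], row["longitude"], row["_value"])
```

The merge never sorts the full dataset at once, which matters at the scale of this experiment, but the whole result still has to fit in client memory unless the rows are consumed lazily from `heapq.merge` instead of being collected into a list.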

I realize that this differs from the original purpose of influxdb. However, I am running this experiment to see whether it can be used as a time series database for large-scale data processing (I will use about 170 million data points as experimental data).
