Best practices for storing high-velocity data

Thanks so much for the resources!

A typical example of a query that has proven to be problematic would be:

from(bucket: "logger")
|> range(start: 2020-10-23T16:55:28.3905490Z, stop: 2020-10-23T16:55:38.3905490Z)
|> filter(fn: (r) => r.JobID == "#14011")
|> limit(n: 1, offset: 0)

This is a ‘trick’ I use to determine which fields are present for a given ‘JobID’ (JobID is a tag here), since we can guarantee that all fields in use will be present in any 10-second interval. A query like this can take anywhere from a few seconds to tens of minutes, and sometimes takes the server down due to out-of-memory issues.
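A possibly cheaper alternative I have been considering is the schema package, which lists field keys without returning raw rows. This is only a sketch, and I have not verified that it avoids the memory problems; the predicate and time window here just mirror the query above:

import "influxdata/influxdb/schema"

// List the field keys written for this JobID over the last 10 seconds,
// without materializing the underlying rows.
schema.fieldKeys(
    bucket: "logger",
    predicate: (r) => r.JobID == "#14011",
    start: -10s,
)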

Other queries that tend to take a long time are ones in which we drop columns (like drop(columns: ["SomeTag"])) or use group(). These will sometimes cause a nearly 10x slowdown in query time; a minimal sketch of the pattern follows.
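For concreteness, something like this (the tag name is a placeholder):

from(bucket: "logger")
|> range(start: -10s)
|> filter(fn: (r) => r.JobID == "#14011")
// Dropping a column that is part of the group key makes the engine
// regroup and merge the affected tables, which I suspect is the cost.
|> drop(columns: ["SomeTag"])

My understanding is that drop() on a group-key column forces a regroup of every matching table, but I would appreciate confirmation that this is what explains the slowdown.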