High Sample Rates

I’m trying to test collecting high-sample-rate data (e.g. 100 Hz, or even 500 or 1000 Hz).

Typical data recording rates seem to be closer to 1 Hz.

From the data collection/recording side, the first hurdle is the protocol for sending data to InfluxDB. Line protocol over HTTP is far too verbose: why send a full timestamp with every single sample? And UDP drops too many packets at these rates.
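To illustrate where the overhead comes from, here is a minimal sketch of per-sample line protocol at 1 kHz (the `vibration` measurement, tag, and values are made up for the example):

```python
# Sketch: per-sample line protocol at 1 kHz. Every point repeats the
# measurement name, tag set, field key, and a full 19-digit ns timestamp.
start_ns = 1_700_000_000_000_000_000  # arbitrary epoch-ns start time
rate_hz = 1000
samples = [0.12, 0.15, 0.11, 0.09]    # made-up sensor readings

lines = [
    f"vibration,sensor=s1 value={v} {start_ns + i * 1_000_000_000 // rate_hz}"
    for i, v in enumerate(samples)
]
print("\n".join(lines))
# Each ~50-byte line carries only a single float of actual payload.
```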

Has there ever been talk of a protocol that allows sending multiple samples at once, with a single timestamp (for the first sample) and a sample rate, so that all other timestamps are interpolated from the first? Better still, a compressed binary format for these samples?
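As far as I know no such wire format exists in InfluxDB itself, but the interpolation idea is easy to sketch client-side: pack one header timestamp, a rate, and raw samples into a binary blob, then expand per-sample timestamps on the receiving end. Everything here (the block layout, function names) is hypothetical:

```python
import struct

def pack_block(start_ns: int, rate_hz: int, samples: list) -> bytes:
    """Hypothetical binary block: 8-byte start timestamp, 4-byte rate,
    4-byte count, then raw float64 samples -- no per-sample timestamps."""
    header = struct.pack("<qII", start_ns, rate_hz, len(samples))
    return header + struct.pack(f"<{len(samples)}d", *samples)

def unpack_block(blob: bytes) -> list:
    """Expand the block into (timestamp_ns, value) pairs by interpolating
    from the start timestamp and the sample rate."""
    start_ns, rate_hz, n = struct.unpack_from("<qII", blob)
    samples = struct.unpack_from(f"<{n}d", blob, 16)  # header is 16 bytes
    step_ns = 1_000_000_000 // rate_hz
    return [(start_ns + i * step_ns, v) for i, v in enumerate(samples)]

blob = pack_block(1_700_000_000_000_000_000, 1000, [0.12, 0.15, 0.11])
points = unpack_block(blob)
```

At 1 kHz this stores 8 bytes per sample plus a 16-byte header per block, versus ~50 bytes per sample in text line protocol.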

Our company has used packet stores instead of a traditional or time-series database for this reason: we store roughly 2 s of data (depending on compression) in each packet and retrieve packets of data from the packet store rather than raw samples. We would like to try InfluxDB for this purpose instead, if it can handle the load.

This may be a non-starter if this is only the first of several issues we’ll encounter. My larger concern is that the InfluxDB data model is designed for many fields at roughly 1 Hz, not for fewer fields at high sample rates.

@BenTatham A single instance running on commodity hardware (8 CPUs, 16 GB RAM) can easily ingest 1M points per second. Data is normally ingested via the HTTP API in batches of 5k-10k field values. In your case that would mean 1k sources generating data at 1 kHz. We also support nanosecond timestamps precisely for cases like this. Is your use case higher volume than that?
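A minimal sketch of batching along those lines, assuming the 1.x-style `/write?precision=ns` endpoint (the measurement and tag names are made up): group points into bodies of a few thousand lines, one HTTP POST per body.

```python
def make_batches(points, batch_size=5000):
    """Group (ts_ns, value) pairs into line-protocol bodies, each sized
    for a single POST to /write?precision=ns."""
    for i in range(0, len(points), batch_size):
        chunk = points[i:i + batch_size]
        yield "\n".join(f"vibration,sensor=s1 value={v} {ts}" for ts, v in chunk)

# One source at 1 kHz for 10 s -> 10,000 points -> two 5k batches.
start = 1_700_000_000_000_000_000
pts = [(start + i * 1_000_000, i * 0.001) for i in range(10_000)]
bodies = list(make_batches(pts))
```

Each body would then be sent as the request payload of a single HTTP write, so the per-request overhead is amortized over thousands of field values.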

We were actually originally designed to ingest Graphite points, so the few-fields use case is very well supported. We have discussed binary protocols for both the write and query paths. The issue to track that progress is here: Proposal: New response structure · Issue #8450 · influxdata/influxdb · GitHub
