At work, we’re migrating from a custom time-series solution to InfluxDB 2.0.
We have a number of devices (currently 8, but eventually likely around 64) that all push high-velocity data to the InfluxDB server. Each device logs about 80 values at a rate of about 1 kHz, so roughly 80,000 values/sec/device. That figure is an average; the actual data arrives in bursts every few minutes. This data rate is necessary and down-sampling is not an option.
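To put numbers on the total load, here is a rough back-of-the-envelope sketch using only the figures above. It gives the average rate; since the data arrives in bursts, the peak ingest rate will be higher.

```python
# Average ingest rate derived from the figures in this post.
VALUES_PER_SAMPLE = 80   # values logged per device per sample
SAMPLE_RATE_HZ = 1_000   # about 1 kHz
DEVICES_NOW = 8
DEVICES_LATER = 64       # expected eventual device count


def values_per_second(devices: int) -> int:
    """Average number of values written per second across all devices."""
    return devices * VALUES_PER_SAMPLE * SAMPLE_RATE_HZ


if __name__ == "__main__":
    for label, n in (("now", DEVICES_NOW), ("later", DEVICES_LATER)):
        print(f"{label}: {n} devices -> {values_per_second(n):,} values/sec")
```

That works out to about 640,000 values/sec with 8 devices and about 5.12 million values/sec with 64.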
The current ‘schema’ is that all data is stored in one single bucket, and we have one measurement per device.
Each measurement has 2 tags associated with it: one can take only 4 unique values, the other roughly 50.
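To make the schema concrete, here is a minimal sketch of what one point looks like in line protocol, plus an upper bound on series cardinality. The measurement, tag, and field names (`device_01`, `mode`, `channel`, `v00`, ...) are placeholders I made up for illustration, not our real names. The cardinality estimate assumes that every tag combination actually occurs and that each of the 80 values is its own field, which, as I understand it, counts toward the series key in InfluxDB 2.x.

```python
def to_line(measurement: str, tags: dict, fields: dict, ts_ns: int) -> str:
    """Format one point as InfluxDB line protocol.

    Simplified sketch: float fields only, and no escaping of spaces,
    commas, or equals signs in names or values.
    """
    # Tags are sorted by key, which the line protocol docs recommend.
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)!r}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {ts_ns}"


def max_series(devices: int, tag_a_values: int, tag_b_values: int, field_keys: int) -> int:
    """Upper bound on series count: measurements x tag combinations x field keys."""
    return devices * tag_a_values * tag_b_values * field_keys


if __name__ == "__main__":
    # Hypothetical point: one device, both tags, two of the ~80 fields.
    print(to_line("device_01", {"mode": "a", "channel": "c07"},
                  {"v00": 1.5, "v01": 2.0}, 1_700_000_000_000_000_000))
    print(max_series(8, 4, 50, 80))   # 128000
    print(max_series(64, 4, 50, 80))  # 1024000
```

If not all 4 x 50 tag combinations occur in practice, the real series count would be correspondingly lower.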
My question is: would this be a reasonably efficient way of storing the data, and what hardware setup would you recommend? We’re currently running this in a VM with 48 GB of RAM and 8 cores of a Xeon(R) Gold 5118. The filesystem, however, is mounted on a NAS over a 10G network, which I assume might be a huge bottleneck.
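For context on the NAS concern, here is a rough estimate of the raw payload rate against the nominal link speed. It assumes 8 bytes per value (64-bit floats) and ignores timestamps, compression, WAL and compaction overhead, and protocol overhead, so it is only an order-of-magnitude sketch. It also says nothing about latency or IOPS on network storage, which may matter more than bandwidth.

```python
BYTES_PER_VALUE = 8          # assumption: each value is a 64-bit float
VALUES_PER_DEVICE_PER_S = 80 * 1_000
LINK_GBIT_PER_S = 10         # nominal 10G network to the NAS


def payload_mb_per_s(devices: int) -> float:
    """Average uncompressed payload in MB/s (1 MB = 10**6 bytes)."""
    return devices * VALUES_PER_DEVICE_PER_S * BYTES_PER_VALUE / 1e6


def link_mb_per_s(gbit_per_s: float = LINK_GBIT_PER_S) -> float:
    """Nominal link capacity in MB/s, ignoring protocol overhead."""
    return gbit_per_s * 1e9 / 8 / 1e6


if __name__ == "__main__":
    for n in (8, 64):
        share = payload_mb_per_s(n) / link_mb_per_s()
        print(f"{n} devices: {payload_mb_per_s(n):.2f} MB/s "
              f"({share:.1%} of nominal link capacity)")
```

Under these assumptions the average payload is about 5 MB/s for 8 devices and about 41 MB/s for 64, against a nominal 1,250 MB/s link. Burst peaks and write amplification would push the real figure higher.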
Finally, what are the options when it comes to redundancy? The virtual machines are currently duplicated among multiple nodes, but if we choose to run InfluxDB on a dedicated box, we’d like some redundant solution here as well.