I have a question about the recommended number of buckets. The current usage plans state that you can create an unlimited number of buckets.
Does the InfluxDB V3/IOx engine actually handle large numbers of buckets well? Is there any impact on query performance with very large bucket counts, say >10K?
Any help is appreciated, thanks.
The main recommendations are:
How you structure your schema within a measurement can affect underlying performance. Here are three schema rules to consider:
- Avoid wide schemas
- Avoid sparse schemas
- Measurement schemas should be homogeneous
The final golden rule of schema design is creating homogeneous measurements:
This means each row should have the same tag and field keys, which prevents sparse schemas from forming. For instance, two industrial machines outputting the same field and tag keys may live within the same measurement. By contrast, an industrial machine and server health data will in most cases have different fields and tags, and should live in their own measurements. Data from each can be joined at query time if required.
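To make the homogeneous-measurement rule concrete, here is a small sketch that builds InfluxDB line protocol points. The measurement, tag, and field names (`machine_metrics`, `server_health`, etc.) are illustrative assumptions, not from any official schema:

```python
# Hypothetical example: machine metrics and server health written to
# separate measurements so each measurement stays homogeneous.
# All measurement/tag/field names below are made up for illustration.

def to_line_protocol(measurement, tags, fields, timestamp):
    """Build a line protocol point: measurement,tags fields timestamp."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {timestamp}"

# Two industrial machines share the same tag and field keys, so they
# belong in one measurement -- every row has an identical schema.
machine_points = [
    to_line_protocol("machine_metrics", {"machine_id": "m1"},
                     {"temperature": 71.3, "rpm": 1200}, 1),
    to_line_protocol("machine_metrics", {"machine_id": "m2"},
                     {"temperature": 68.9, "rpm": 1150}, 1),
]

# Server health has different keys, so it lives in its own measurement
# rather than creating a sparse schema inside machine_metrics.
server_point = to_line_protocol("server_health", {"host": "srv1"},
                                {"cpu_pct": 42.5, "mem_pct": 63.0}, 1)

print(machine_points[0])
# machine_metrics,machine_id=m1 rpm=1200,temperature=71.3 1
```

If the two sources were forced into one measurement, half the fields would be empty on every row, which is exactly the sparse schema the rules above warn against.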
I think you should be fine with a large number of buckets.
InfluxDB 3.0 is up to 45x Faster for Recent Data Compared to InfluxDB Open Source | InfluxData.
But I'll ask around to verify.