I have read the blog post "Data Layout and Schema Design Best Practices for InfluxDB" (InfluxData) and I am now considering how I should set up my database.
According to the blog post “Series cardinality is the number of unique bucket, measurements, tag sets, and field keys combinations in an organization”.
There is this example formula:
I wonder if it makes sense to include the number of buckets in the series cardinality calculation in the way it is presented in the formula? Shouldn’t the series cardinality be calculated separately for each bucket?
For example, suppose I have two buckets, bucket_A and bucket_B:
Within bucket_A I have one measurement and one field: measurement_A, field_A
Within bucket_B I also have one measurement and one field: measurement_B, field_B
So in total I have 2 buckets, 2 measurements and 2 fields.
The way I interpret the example formula, the series cardinality should be calculated:
SC = number_of_buckets * number_of_measurements * number_of_field_keys = 2 * 2 * 2 = 8
Based on the descriptions in the blog post, however, it sounds to me like the series cardinality should be calculated separately for each bucket, because a combination such as bucket_B with measurement_A and field_A can never exist:
SC = number_of_measurements * number_of_field_keys
SC_bucket_A = 1 * 1 = 1
SC_bucket_B = 1 * 1 = 1
SC_total = SC_bucket_A + SC_bucket_B = 1 + 1 = 2
Have I understood this correctly?
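
To make the two interpretations concrete, here is a small Python sketch of both ways of counting for my example. This is only my own illustration: the `schema` dictionary and both function names are made up for this post, tags are left out for simplicity, and none of this is meant to show how InfluxDB itself computes cardinality.

```python
# Hypothetical schema from my example: bucket -> measurement -> field keys.
schema = {
    "bucket_A": {"measurement_A": ["field_A"]},
    "bucket_B": {"measurement_B": ["field_B"]},
}


def cardinality_as_global_product(schema):
    """Interpretation 1: multiply the global counts of buckets,
    measurements and field keys, as I read the example formula."""
    buckets = set(schema)
    measurements = {m for ms in schema.values() for m in ms}
    field_keys = {f for ms in schema.values() for fs in ms.values() for f in fs}
    return len(buckets) * len(measurements) * len(field_keys)


def cardinality_per_bucket(schema):
    """Interpretation 2: count only the (measurement, field key)
    combinations that actually exist inside each bucket."""
    return {
        bucket: sum(len(field_keys) for field_keys in ms.values())
        for bucket, ms in schema.items()
    }


global_product = cardinality_as_global_product(schema)
per_bucket = cardinality_per_bucket(schema)

print("global product:", global_product)
print("per bucket:", per_bucket)
print("sum over buckets:", sum(per_bucket.values()))
```

The first function counts combinations such as bucket_B with measurement_A, which cannot occur in my setup, while the second only counts the combinations that are really present in each bucket.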