Query regarding cardinality. Empty field vs multiple measurements

Hello, I am trying to collect data from several Zigbee sensors of different types, each of which sends different data.
Should I store the data from each kind of sensor in a separate measurement, or is it better to store it all in one measurement? Which option would result in higher cardinality?

Measurement: All in one

Device ID Sensor ID Field 1 Field 2 Field … Field N
D1 T1 123
D2 T2 123 123
D3 T3 123 123
D25 T4 123 123

vs

Measurement: Type 1

Device ID Sensor ID Field 1
D1 T1 123

Measurement: Type 2

Device ID Sensor ID Field 1 Field 2
D2 T2 123 123

and so on.

I don’t think this will affect storage or cardinality either way, but if you plan on performing queries or calculations that combine different devices and sensors, it’ll be best to keep them all in the same measurement.

Hello @tintin ,
I agree with @mhall119, but you might also find the following blogs and resources useful:

Hello,
Thank you both for the response.
I do plan to run queries that combine those fields, which is why I am planning to keep them together.
I looked at https://docs.influxdata.com/influxdb/v2.0/reference/glossary/#series-cardinality. It says that series cardinality is the number of unique measurement, tag set, and field key combinations in an InfluxDB bucket. I wasn’t sure how the field key is accounted for here.

I meant that the example below that definition (https://docs.influxdata.com/influxdb/v2.0/reference/glossary/#series-cardinality) doesn’t mention fields at all, but the post you mentioned, Data Layout and Schema Design Best Practices for InfluxDB | InfluxData, does count fields.

What I want to know, then, is whether the cardinality will be 2 or 4 given the following setup.

Device ID Sensor ID Field 1 Field 2
D1 T1 123
D1 T2 123

@tintin my understanding is that the field count counts towards the cardinality (so in your example it would be 4). I believe this is more explicitly so in 2.x than in 1.x, but the net effect is similar. This thread from a while ago might have some relevant info: Maximum number of fields and tags

Btw the series cardinality definition that you linked to in the docs does say

The number of unique measurement, tag set, and field key combinations in an InfluxDB bucket.

But yes, those examples don’t talk about fields at all, so it’s easy to come away with the conclusion that they’re not relevant.

Thanks. I’ll stick with saving them in separate measurements then.

If you have N fields, it wouldn’t make a difference whether you have them all in one measurement (M * N) or half in one measurement and half in another (2M * N/2): your cardinality calculation would be the same.
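A minimal sketch of that arithmetic, assuming M stands for the number of unique tag sets and that cardinality is counted as unique measurement, tag set, and field key combinations (the 2.x definition quoted earlier in the thread). The measurement names, field names, and the `count_series` helper are made up for illustration, and the sketch assumes every tag set writes every field in its measurement.

```python
from itertools import product


def count_series(layout: dict[str, list[str]], tag_sets: list[tuple[str, str]]) -> int:
    """Count unique (measurement, tag set, field key) combinations."""
    combos = {
        (measurement, tag_set, field)
        for measurement, fields in layout.items()
        for tag_set, field in product(tag_sets, fields)
    }
    return len(combos)


# Hypothetical data: M = 3 tag sets, N = 4 fields.
tag_sets = [("D1", "T1"), ("D2", "T2"), ("D3", "T3")]
fields = ["f1", "f2", "f3", "f4"]

one_measurement = {"sensors": fields}
two_measurements = {"sensors_a": fields[:2], "sensors_b": fields[2:]}

single = count_series(one_measurement, tag_sets)  # M * N
split = count_series(two_measurements, tag_sets)  # 2M * N/2
print(single, split)
```

Both layouts give the same count here because splitting the fields across measurements only regroups the same combinations.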

I’m still not completely sure. I tested this on v1.8, and the number of fields doesn’t seem to matter. I’ll test on v2 and add the results later.


Device ID | Sensor ID | Field 1 | Field 2 | Field 3
D1 | T1 | 123 | 124 | -
D1 | T2 | - | 125 | 126

SHOW SERIES CARDINALITY = 2


Device ID Sensor ID Field 1 Field 2 Field 3
D1 T1 123
D1 T2 124 125

SHOW SERIES CARDINALITY = 2


Device ID Sensor ID Field 1 Field 2 Field 3 Field 4 Field 5
D1 T1 123
D1 T2 124 125 126 127

SHOW SERIES CARDINALITY = 2


Hello @tintin
Yes, that makes sense; the fields are still being counted towards the series.
If you had also written the following point in the first example:
D1 T2 Field1=123
then you’d have 3 series, even though its two tags are the same as those of the D1 T2 Field2=125 Field3=126 point.

Also, please note that dependent tags don’t increase your cardinality. A dependent tag is one whose value is always paired with the same value of another tag (example: email and first name).
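A small sketch of why a dependent tag adds nothing, counting unique tag sets in plain Python. The tag names, values, and the `unique_tag_sets` helper are hypothetical; `region` is included only as a contrasting independent tag.

```python
# Hypothetical points: first_name is fully determined by email,
# while region varies independently of email.
points = [
    {"email": "a@example.com", "first_name": "Ann", "region": "eu"},
    {"email": "a@example.com", "first_name": "Ann", "region": "us"},
    {"email": "b@example.com", "first_name": "Bob", "region": "eu"},
    {"email": "c@example.com", "first_name": "Ann", "region": "us"},
]


def unique_tag_sets(points: list[dict[str, str]], keys: list[str]) -> int:
    """Count distinct combinations of the given tag keys."""
    return len({tuple(p[k] for k in keys) for p in points})


by_email = unique_tag_sets(points, ["email"])
with_dependent = unique_tag_sets(points, ["email", "first_name"])
with_independent = unique_tag_sets(points, ["email", "region"])
print(by_email, with_dependent, with_independent)
```

Adding `first_name` leaves the count unchanged because each email always carries the same first name, whereas `region` multiplies the combinations.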

Series cardinality is tricky!

Perhaps it wasn’t clear: the output of SHOW SERIES CARDINALITY is 2 for all of those cases.
I understand tags; I want to understand how fields are accounted for.

Hello @tintin,
Yes that was clear.
But in every instance you only have two series.
So I believe that if you wrote the following three points
Measurement,DeviceID=D1,SensorID=T1 Field1=1
Measurement,DeviceID=D1,SensorID=T1 Field2=1
Measurement,DeviceID=D1,SensorID=T1 Field3=1

Then your cardinality would equal 3.
Try SHOW FIELD KEY EXACT CARDINALITY instead.
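To make the two ways of counting concrete, here is a rough sketch that parses the three points above and counts them both by measurement plus tag set and by measurement plus tag set plus field key. It is not how InfluxDB computes cardinality internally; the `parse_point` helper is hypothetical and handles only simple line protocol without escaped characters.

```python
def parse_point(line: str) -> tuple[str, tuple[str, ...], list[str]]:
    """Split a simple line protocol point into measurement, tag set, field keys."""
    parts = line.split(" ")
    series_part, field_part = parts[0], parts[1]
    measurement, *tags = series_part.split(",")
    tag_set = tuple(sorted(tags))
    field_keys = [field.split("=", 1)[0] for field in field_part.split(",")]
    return measurement, tag_set, field_keys


lines = [
    "Measurement,DeviceID=D1,SensorID=T1 Field1=1",
    "Measurement,DeviceID=D1,SensorID=T1 Field2=1",
    "Measurement,DeviceID=D1,SensorID=T1 Field3=1",
]

series_keys = set()        # measurement + tag set only
series_field_keys = set()  # measurement + tag set + field key

for line in lines:
    measurement, tag_set, field_keys = parse_point(line)
    series_keys.add((measurement, tag_set))
    for field_key in field_keys:
        series_field_keys.add((measurement, tag_set, field_key))

print(len(series_keys), len(series_field_keys))
```

The tags-only count stays at 1 while the field-inclusive count is 3, which is the gap being discussed in this thread.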

1.x doesn’t include field keys in the series key, so the cardinality data doesn’t include the field key either. However, field keys are actually part of the series key at the storage level.
Yes, that’s confusing, but this discrepancy was fixed in 2.x. The different commands were there to help users dig into their runaway cardinality problems, but now there are better solutions:

Thanks a lot. So my original understanding was correct. I would suggest adding a reference to FIELD KEY CARDINALITY in the docs for SERIES CARDINALITY, for anyone who is still on v1.x, and maybe examples like the following.

Device ID | Sensor ID | Field 1 | Field 2 | Field 3
D1 | T1 | 123 | 124 | -
D1 | T2 | - | 125 | 126

CARDINALITY = 4

Device ID Sensor ID Field 1 Field 2 Field 3
D1 T1 123
D1 T2 124 125

CARDINALITY = 3

Device ID Sensor ID Field 1 Field 2 Field 3 Field 4 Field 5
D1 T1 123
D1 T2 124 125 126 127

CARDINALITY = 5
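The three counts above can be checked with a short sketch. Which column each value falls under is an assumption here (the flattened tables do not show it), but only the number of distinct field keys per sensor affects the result.

```python
# Each example maps a (device, sensor) tag set to the field keys written for it.
# The column assignment is assumed for illustration.
examples = [
    {("D1", "T1"): ["Field1", "Field2"], ("D1", "T2"): ["Field2", "Field3"]},
    {("D1", "T1"): ["Field1"], ("D1", "T2"): ["Field2", "Field3"]},
    {("D1", "T1"): ["Field1"], ("D1", "T2"): ["Field2", "Field3", "Field4", "Field5"]},
]

# Counting tag sets only, as SHOW SERIES CARDINALITY reported on v1.8 in this thread.
tags_only = [len(example) for example in examples]

# Counting tag set plus field key combinations.
with_fields = [
    sum(len(set(field_keys)) for field_keys in example.values())
    for example in examples
]

print(tags_only)
print(with_fields)
```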

Hello @tintin,
Thanks for your examples! I’ll pass that info along to the docs team. @scott can you please take a look?