Cardinality and Data Series

Hi, I am a total noob to InfluxDB, but I have a good understanding of the concepts of sharding, indexing, and consistency from other databases.

I am building a large system that is supposed to collect data from hundreds of millions of devices. I am trying to understand how to design my system to avoid high cardinality and allow for good query times.

Writing: My devices will report data from different sensors in real time:
<1:12:01, device1, temperature = 24C>
<1:12:03, device1, speed = 1m/s>
<1:12:05, device1, noise = 1dB>
<1:12:02, device2, temperature = 12C>
<1:12:04, device2, speed = 3m/s>
<1:12:04, device2, noise = 5dB>

My select queries will be something along the lines of:
Give me all of the measurements from “device1” for the last minute.
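
For illustration, a read like this could look roughly as follows in InfluxQL, built here from Python. This is only a sketch: the measurement name "readings" and the helper last_minute_query are made up for the example, and it assumes device_id is stored as a tag.

```python
def last_minute_query(device_id: str, measurement: str = "readings") -> str:
    """Build an InfluxQL query for one device's points from the last minute.

    Assumptions (not from the original post): a measurement called
    "readings" and a tag called "device_id".
    """
    # Keep the sketch simple: refuse ids that would need escaping
    # inside a single-quoted InfluxQL string literal.
    if "'" in device_id or "\\" in device_id:
        raise ValueError("device_id must not contain quotes or backslashes")
    return (
        f'SELECT * FROM "{measurement}" '
        f"WHERE \"device_id\" = '{device_id}' AND time > now() - 1m"
    )


print(last_minute_query("device1"))
```

Because the filter is on a tag, the database can use its index to find the matching series instead of scanning every point.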

I am going to have 10s of millions of devices.
Each device can have one of 10s of sensors (noise/speed/temperature).
The list of sensors needs to be expandable in the future (although the number will remain in the 10s)

So what kind of data schema should I use?
I should probably create a single database.
device_id should probably be a “tag”, right?
For sensors: I am debating between separate time series and a sensor_type tag.
(My writes point me towards separate time series, but I am not sure about my reads; perhaps a sensor_type tag?)

I agree that device_id should be a tag. You’re going to end up with relatively high cardinality because of the ids, but that’s just the nature of the dataset. Make sure you have TSI enabled on InfluxDB (it helps performance when cardinality is high). I would only add a sensor_type tag if you need it for the kinds of analysis/queries you’ll be doing. The more separate series you have, the higher your cardinality will be. The best path is usually to decide what attributes you will want to group by and make those tags.
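
To make the cardinality trade-off concrete, here is a rough back-of-the-envelope sketch. It treats the series count within one measurement as bounded by the product of the distinct values of each tag. The figures (20 million devices, 30 sensor types) are illustrative stand-ins for "10s of millions" and "10s of sensors", not numbers from this thread, and the real count depends on which tag combinations are actually written.

```python
from math import prod
from typing import Mapping


def series_upper_bound(distinct_tag_values: Mapping[str, int]) -> int:
    """Upper bound on series in one measurement: the product of the
    number of distinct values per tag (every combination occurring)."""
    return prod(distinct_tag_values.values())


# Illustrative numbers only (assumptions, not measurements).
device_only = series_upper_bound({"device_id": 20_000_000})
with_sensor_tag = series_upper_bound(
    {"device_id": 20_000_000, "sensor_type": 30}
)

print(device_only)      # 20000000
print(with_sensor_tag)  # 600000000
```

Under these assumptions, adding a sensor_type tag multiplies the worst-case series count by the number of sensor types, which is why it is only worth adding if the queries need it.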

Thanks @katy,
So I should ignore the Series part, and just have a general “measurement” series with a bunch of tags (as few as possible with respect to my queries). Right?

Yeah, I think that’s a great start
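
As a sketch of what that could look like on the write side, the snippet below formats one reading as InfluxDB line protocol with a single measurement and device_id as the only tag. Storing each sensor reading as a field is an assumption made for this example, as are the measurement name "readings" and the helper names; nobody in this thread specified them.

```python
from typing import Mapping


def _escape(text: str) -> str:
    """Backslash-escape commas, equals signs, and spaces, as line
    protocol requires in tag keys, tag values, and field keys."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(
    device_id: str,
    fields: Mapping[str, float],
    timestamp_ns: int,
    measurement: str = "readings",  # hypothetical measurement name
) -> str:
    """Format one point: a single measurement, device_id as a tag,
    sensor readings as float fields, nanosecond timestamp."""
    if not fields:
        raise ValueError("a point needs at least one field")
    field_part = ",".join(
        f"{_escape(key)}={float(value)!r}"
        for key, value in sorted(fields.items())
    )
    return (
        f"{measurement},device_id={_escape(device_id)} "
        f"{field_part} {timestamp_ns}"
    )


print(to_line("device1", {"temperature": 24}, 1700000000000000000))
```

With this layout a new sensor type is just a new field key, so it does not add a tag value and does not multiply the number of tag combinations.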