We are evaluating Time Series databases for our sensor data. We have a use case to capture sensor data from up to 6 million sensors. Each sensor has the following information that we need to capture:
“deviceid, location, units, status, value, readTimestamp”.
DeviceID - up to 6 million
Location - a million different locations
Units - 10-20 units
Status - 5-10 status codes
Queries will be grouped by deviceId, by deviceId+units, or by deviceId+units+status, and in every case grouped by timestamp. One of the requirements is to be able to create graphs and charts.
I am aware of the series cardinality limits in InfluxDB. I love everything about InfluxDB, but is it the right solution for my problem? I would appreciate any insight so I can decide whether to pursue this line of thought.
Thanks
Srini
@srini-raju This use case is exactly what InfluxDB was meant to solve!
A single instance of InfluxDB can handle between 5 and 10 million series. Beyond that you will need a cluster. One thing to note about those tags is that some depend on others. For example, Location will not add series cardinality, because DeviceID has more values and a device does not change Location. Status sounds like it varies with DeviceID, so that will add additional series. Can you give me some more information about units? Is that tag dependent on DeviceID?
Jack
@jackzampolin, thanks for the response. As you mentioned, a device will be in one location, so only the device cardinality comes into play there. Each device, though, may have a few statuses at any given point, so that might add to cardinality. Units is tied to deviceId: a device will have one unit at any given time, so it might not add to cardinality.
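To make the reasoning above concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that a tag fixed per device (location, units) contributes a factor of 1, and that only status multiplies the count. The function name and the statuses-per-device figures are hypothetical, chosen only to bracket the range. This is not an InfluxDB API.

```python
from math import prod

def estimate_series(devices, values_per_device):
    """Rough upper bound on the number of series.

    devices: number of distinct DeviceID values.
    values_per_device: for each remaining tag, how many distinct values a
    single device can take (1 means the tag is fixed for that device).
    """
    return devices * prod(values_per_device.values())

DEVICES = 6_000_000  # figure from the thread

# Location and units are fixed per device, so each contributes a factor of 1.
# The statuses-per-device values are assumptions, not measured numbers.
low = estimate_series(DEVICES, {"location": 1, "units": 1, "status": 1})
high = estimate_series(DEVICES, {"location": 1, "units": 1, "status": 10})

print(f"{low:,} to {high:,} series")
```

Under these assumptions the estimate ranges from 6 million (one status per device) to 60 million (every device cycling through ten statuses), so the number of statuses each device actually reports is the figure that matters most.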
So, 5-10 million series is fine for the latest version of InfluxDB? I remember it was around 100k in version 0.9.
Another question: is the cardinality limit of 5-10 million per measurement or across the whole database?
Thanks
Srini
@srini-raju It is across an instance.
Hi @jackzampolin,
Does the retention policy have an effect on series cardinality? That is, if series are dropped because of the RP in effect, is the series cardinality reduced as well? If so, could the series cardinality be controlled through the retention policy?
Thanks,
Flavio
@monfla00 Yes it can! That's how RPs are designed to work.
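A simplified way to picture this, sketched in Python: a series only counts while it still has data inside the retention window. The series keys, dates, and helper name below are made up for illustration. Real InfluxDB expires data by shard group rather than point by point, so treat this as a model of the idea, not of the implementation.

```python
from datetime import datetime, timedelta

def series_within_retention(last_write, now, retention):
    """Return the series keys whose newest point is inside the retention window."""
    cutoff = now - retention
    return {key for key, ts in last_write.items() if ts >= cutoff}

# Hypothetical series keys mapped to the time of their most recent point.
now = datetime(2017, 1, 31)
last_write = {
    "sensor,deviceid=1,status=ok": datetime(2017, 1, 30),
    "sensor,deviceid=1,status=fault": datetime(2016, 12, 1),
    "sensor,deviceid=2,status=ok": datetime(2017, 1, 29),
}

active = series_within_retention(last_write, now, timedelta(days=30))
print(len(active))
```

With a 30-day window, the `status=fault` series falls out because its last point is older than the cutoff, so a shorter retention period keeps fewer rarely written series alive.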
@jackzampolin, does adding more nodes to an InfluxDB cluster reduce the per-node series cardinality? For example, if a single-node InfluxDB has a series cardinality of 10 million and I create a cluster of 10 nodes with the same data and a replication factor of 2, would the per-node series cardinality drop to 1/10th (1 million per node)?
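The arithmetic behind that question can be sketched as follows, under a strong assumption that this thread does not confirm: every series is stored on exactly as many nodes as the replication factor, and series are spread evenly across nodes. The function name is hypothetical.

```python
def per_node_series(total_series, nodes, replication_factor):
    """Even-spread estimate of the series held by each node.

    Assumes each series lives on exactly `replication_factor` nodes and that
    the copies are balanced perfectly across the cluster.
    """
    copies = total_series * replication_factor
    return -(-copies // nodes)  # ceiling division

estimate = per_node_series(10_000_000, nodes=10, replication_factor=2)
print(f"{estimate:,} series per node")
```

Under that assumption the replication factor offsets part of the gain: 10 million series with two copies each over 10 nodes comes to about 2 million per node rather than 1 million. How a real cluster places series may differ from this even-spread model.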