I’m trying to figure out a database schema for a large and growing IOT sensor network. We install boxes onsite at industrial facilities that can monitor upwards to 200 sensors each; things like temperature sensors, pressure, boolean equipment switch states, etc. Currently we have 5 boxes reading around 200 sensors each, so 1000 sensors. In the next year we’ll probably be up to 10k sensors, and I imagine on the road to 100k sensors in the years to follow. We’re currently using TimescaleDB, but I’m finding it a bit hard to manage for our use case and thinking of switching to influx.
We currently create a table for each field device, which has a time column and then a column (named with sensors id) for each sensor (all are read at roughly the same timestamp), so very wide tables that are hard to manage/query even programmatically. We also create continuous views on those tables to get 1 minute, 1 hour, and 1 day aggregate data. All data is cached locally on our field devices in a mirrored table, and then uploaded to our main cloud database that houses all the data.
For influx, I was thinking it might be easiest to use a single measurement for everything, with simply a uuid tag that maps to a sensor metadata row in our postgres database and a single field named value, either float or bool. I don’t think we’d need any other tags, because the query process generally begins with someone finding a set of sensors through our front end, so we already have the set of UUIDs for those sensors, and can easily build the influx query. This seems like it will be WAY easier to manage than what we do now. Any thoughts on if this is the correct approach or if there are any performance drawbacks to using a single measurement. Based on what I’ve read, it seemed okay to me, because using the single id tag our cardinality will likely stay in the 10k-100k range, which seemed to be under where I saw large performance and RAM use penalties.
Another quick question is, is it advised to only record data when it changes from the previous time we read the sensor, or should we just always record when we read and let the compression engine take care of things for us? Thanks for the help