Help with scheme for sensor data

Hi!

I‘m looking for help with the database scheme for weather sensor data.
One of the major issues I have here is, that there are a bunch of different sensor types I want to support, which all support a different range of data types (99% of them do temperature, some of them also measure wind speed and/or humidity and/or illumination, etc.).

The best approach I came up with, is to create a measurement “sensor_data”, with only one field called “value” (containing the measured value) and two tags, which are sensor_id (id of sensor that tracked “value”) and type (identification for the type of the value, like “temperature”, “humidity”, …).

That structure would solve all my problems, but what about performance? Will this scale properly, or is it better to go with a measurement for each of the datapoint type? Or would it be even better to add fields for all the different data point types (and leave those fields empty that are not supported by a sensor)?

What do you think about it? Or do you have another idea?

Thanks for your help!

Hi tek!

TL;DR
Your schema would work just fine

Remember to turn on the TSI engine option if you think you’ll have thousands and thousands of sensor ids.

I typically add a “units” field as well just for documentation purposes (units=celsius)

As an aside I’ve abused tags before when my ADC was 12bit… feel free to shield your eyes!
I would have 12 tags with one for each bit as well as the full value for the field. Because there are really only 4096 possible steps this wasn’t much cardinality.

This way I could get very fast filter queries for values of the sensor that were less than 2048 by checking that the tag bit_12=‘0’

I could do a “band-pass” filter by doing a
WHERE "bit_11"='1' OR "bit_10"='1' OR "bit_9"='1' to get 512 - 2048.

After doing a fast, course bit filter like that I’d do math on the value field itself.

One other tip, if you know the sensor calibration at insertion, it can be really nice to store those as fields as well.

3 Likes

Hi @chris

Thanks for your in depth response! Glad to hear that I‘m not completely off with the scheme. :wink: Also the comment about the TSI option is really helpful, thanks!

About your “aside note”: that is a great workaround to improve speed for these kind of queries, but as I won‘t use these kind of queries I definitely stick to the value=… option. :wink: