Is it more efficient to store metadata about points (unit of measure, variable) in a relational DB instead of repeating it for every point?


I have a system with a bunch of sensors on it. Originally I used MySQL to organize the data, with the following schema:

The important tables for this discussion are:

  1. System
  2. Tag (caution: this is my own use of “tag” and shouldn’t be confused with the InfluxDB definition)
  3. Tag usage
  4. Unit of measure (volt, celsius, etc.)
  5. Variable (temperature, pressure, etc.)
  6. Data point

A system must be modeled with one or more tags. A tag must be the source of a tag usage. A tag usage must be expressed in terms of a unit of measure, as well as a variable. A data point must come from a tag usage.

For example, consider a temperature sensor on the system. The tag “T1” would model that sensor in the database, and two tag usages could be 1=“actual temperature” and 2=“temperature setpoint”, with 1 and 2 being primary keys.

Therefore, data being inserted into the datapoint table would look like:
tag_usage=1,value=“64”,timestamp=“2019-02-13 14:00:02”
tag_usage=2,value=“65”,timestamp=“2019-02-13 14:00:02”
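
To make the relationships above concrete, here is a minimal sketch of the normalized layout, using Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names are assumptions based on the description (the original schema diagram is not reproduced here), the System table is left out for brevity, and "celsius" is only a placeholder unit.

```python
import sqlite3

# In-memory database as a stand-in for MySQL; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE unit_of_measure (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE variable        (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tag             (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tag_usage (
    id          INTEGER PRIMARY KEY,
    tag_id      INTEGER NOT NULL REFERENCES tag(id),
    variable_id INTEGER NOT NULL REFERENCES variable(id),
    unit_id     INTEGER NOT NULL REFERENCES unit_of_measure(id),
    description TEXT NOT NULL
);
CREATE TABLE datapoint (
    tag_usage INTEGER NOT NULL REFERENCES tag_usage(id),
    value     TEXT NOT NULL,
    timestamp TEXT NOT NULL
);
""")

# Metadata is written once.
conn.execute("INSERT INTO unit_of_measure VALUES (1, 'celsius')")  # placeholder unit
conn.execute("INSERT INTO variable VALUES (1, 'temperature')")
conn.execute("INSERT INTO tag VALUES (1, 'T1')")
conn.executemany(
    "INSERT INTO tag_usage VALUES (?, 1, 1, 1, ?)",
    [(1, "actual temperature"), (2, "temperature setpoint")],
)

# Each data point carries only the tag_usage key.
conn.executemany(
    "INSERT INTO datapoint VALUES (?, ?, ?)",
    [(1, "64", "2019-02-13 14:00:02"), (2, "65", "2019-02-13 14:00:02")],
)

# A join recovers the metadata for every point.
rows = conn.execute("""
    SELECT t.name, u.description, v.name, m.name, d.value, d.timestamp
    FROM datapoint d
    JOIN tag_usage u       ON u.id = d.tag_usage
    JOIN tag t             ON t.id = u.tag_id
    JOIN variable v        ON v.id = u.variable_id
    JOIN unit_of_measure m ON m.id = u.unit_id
    ORDER BY d.tag_usage
""").fetchall()

for row in rows:
    print(row)
```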

I’d like to use InfluxDB to store the time series data (i.e. the datapoint table), as that is what it is built to do. The documentation seems to recommend that I store a tag “variable=temperature” with every data point representing a temperature measurement, and so on. That seems like a violation of normal form, and I would expect the metadata to be stored elsewhere.
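
For illustration, this is roughly what the two example points would look like if the metadata were sent as InfluxDB tags in line protocol (`measurement,tag_set field_set timestamp`). It is only a sketch: the measurement name `sensor`, the tag keys, the `celsius` unit, and the assumption that the timestamps are UTC are all mine, and `to_line` is a hypothetical helper, not part of any client library.

```python
from datetime import datetime, timezone


def escape_tag(text):
    # Line protocol needs commas, equals signs and spaces in tag keys
    # and tag values to be escaped with a backslash.
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, value, timestamp):
    """Build one line-protocol line with a single float field named 'value'."""
    tag_part = ",".join(
        f"{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    # Assumption: the source timestamps are UTC; precision is nanoseconds.
    dt = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    ns = int(dt.timestamp()) * 10**9
    return f"{measurement},{tag_part} value={float(value)} {ns}"


common = {"tag": "T1", "variable": "temperature", "unit": "celsius"}
lines = [
    to_line("sensor", {**common, "usage": "actual_temperature"}, "64", "2019-02-13 14:00:02"),
    to_line("sensor", {**common, "usage": "temperature_setpoint"}, "65", "2019-02-13 14:00:02"),
]

for line in lines:
    print(line)
```

Every line repeats the same tag set, which is exactly the repetition I am asking about.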

Related questions:
Schema for Plants, Devices and Signals
Schema Considerations - 1 tag key instead of multiple field keys
Using Influx to monitor environmental data
Schema advice for commercial solar monitoring

EDIT 1: Add related questions.
EDIT 2: Change voltage to temperature in last paragraph.
EDIT 3: Add another related question.
EDIT 4: Add another related question.


Tag keys and values are stored only once per series, so sending them with every data point does not create redundant data.
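
As a rough mental model only (this is a toy, not how InfluxDB's storage engine is actually implemented), you can picture the measurement plus tag set as a series key that is recorded once, with each point stored against a small series id. `ToySeriesStore` and all the names in it are hypothetical.

```python
class ToySeriesStore:
    """Toy model: the series key is kept once, points refer to it by id."""

    def __init__(self):
        self.series_ids = {}  # series key -> integer id (one entry per series)
        self.points = []      # (series_id, timestamp, value)

    def write(self, measurement, tags, timestamp, value):
        # The series key is the measurement plus the sorted tag set.
        key = (measurement, tuple(sorted(tags.items())))
        # A new id is assigned only the first time this key is seen.
        series_id = self.series_ids.setdefault(key, len(self.series_ids))
        self.points.append((series_id, timestamp, value))


store = ToySeriesStore()
actual = {"tag": "T1", "variable": "temperature", "usage": "actual"}
setpoint = {"tag": "T1", "variable": "temperature", "usage": "setpoint"}

# The same tags are sent with every write...
store.write("sensor", actual, "2019-02-13 14:00:02", 64.0)
store.write("sensor", actual, "2019-02-13 14:00:03", 64.1)
store.write("sensor", actual, "2019-02-13 14:00:04", 64.2)
store.write("sensor", setpoint, "2019-02-13 14:00:02", 65.0)

# ...but only two series keys are recorded for the four points.
print(len(store.series_ids), "series,", len(store.points), "points")
```

In this picture the tag set plays much the same role as the tag_usage row in the relational schema: the point itself only carries a reference to it.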