Model "data_valid" as tag or as field?


Imagine a time series with a temperature measurement.

However, the system can sometimes detect that the temperature readings are not valid, for example because the temperature is normally measured in a tank containing water, but the tank is not filled. Technically the readings are not “valid” because there is no water, but they are still available because the sensor then measures the temperature of the air. There is a boolean flag named “data_valid” to model this situation.

So my question is: which is the better solution, recording the “data_valid” flag as a tag or as a field in InfluxDB? I can find arguments for both.

What’s your opinion?

Thanks in advance!

Hello @michael2,
That’s a good question.
I think I would ask:

  1. Do you expect to commonly query only the data that is valid? If so, make it a tag.
  2. How many tanks are you measuring? Is it in the 1000s? If so, make it a field.
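
To make the two options concrete, here is a minimal sketch of what a single point would look like in line protocol in each case. The helper `to_line`, the tag key `tank_id`, and the field key `value` are made-up names for illustration, not anything from your schema, and the helper skips the escaping a real client library would do. The general difference is that tag values are stored as strings and indexed, while field values keep their type (here a boolean) and are not indexed.

```python
def to_line(measurement, tags, fields, ts_ns):
    """Format one point as InfluxDB line protocol (sketch: no escaping)."""
    # Tags follow the measurement name, comma-separated; values are strings.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))

    def fmt(v):
        # Boolean fields are written as true/false; floats as plain numbers.
        # (Integer fields would need an "i" suffix, not handled here.)
        if isinstance(v, bool):
            return "true" if v else "false"
        return str(v)

    field_part = ",".join(f"{k}={fmt(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {ts_ns}"


ts = 1700000000000000000  # example timestamp in nanoseconds

# Option 1: data_valid as a tag (part of the series key, stored as a string)
as_tag = to_line("temperature",
                 {"tank_id": "tank42", "data_valid": "false"},
                 {"value": 21.5}, ts)

# Option 2: data_valid as a field (a boolean value next to the reading)
as_field = to_line("temperature",
                   {"tank_id": "tank42"},
                   {"value": 21.5, "data_valid": False}, ts)

print(as_tag)
print(as_field)
```

With the tag variant, the valid and invalid readings of one tank end up in two different series; with the field variant, each tank stays a single series.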

Hello @Anaisdg,

thanks for your reply.

  • Currently, the main queries on the data are still a little vague because we don’t have a lot of experience with data at all.
  • The number of tanks is expected to increase by about 250 per year. So in about 4 to 5 years there will definitely be 1000 tanks, and the number will continue to grow steadily.

However, in the future there will be more temperature readings beyond the temperatures in tanks. Additionally, there may be more states like “data_valid”. So it sounds like going with a field is the way to go.
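
A rough back-of-the-envelope sketch of that reasoning, assuming one tag per tank and that every additional state would be a boolean tag with two possible values. The function name `max_series` is made up for illustration, and the result is only an upper bound on tag combinations per measurement, not a measured cardinality.

```python
def max_series(tanks, boolean_state_tags):
    """Upper bound on tag-set combinations for one measurement.

    Each boolean tag can double the number of combinations per tank;
    states stored as fields do not add any.
    """
    return tanks * 2 ** boolean_state_tags


for n_states in (0, 1, 3):
    print(n_states, "state tags:", max_series(1000, n_states), "series at most")
```

With all states stored as fields, the count stays at one series per tank no matter how many states are added later.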