Limitation of fields and field keys?

Hi,

We’re planning to use various field keys, like “chart_linear_temperature[°C]”, “chart_linear_humidity[%]”, “chart_bar_dewpoint[°C]”, etc. for our measurements.
Is there any limit on the number of field keys (we expect more than 200), or is there a recommended approach for handling a wide range of sensor data that carries a display type, label and unit?

Thanks & best regards,
Michael

@h2n
If you’re planning on having a large number of sensors, I don’t recommend putting the sensor into the field keys.
That’s because fields aren’t indexed; a field only stores your value.
However, you can use tags:
A tag is what you filter on in your “where” clause; in your case, that’s your sensor.

I hope I’ve answered your question.

Nicolas

There aren’t many limitations on naming; some special characters such as commas, equals signs and spaces need to be escaped (see the docs).
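
To make the escaping concrete, here is a minimal sketch in Python. The helper name `escape_key` is my own and not part of any client library (most clients do this for you); it backslash-escapes commas, equals signs and spaces as line protocol expects for tag keys, tag values and field keys.

```python
def escape_key(key: str) -> str:
    """Backslash-escape commas, equals signs and spaces.

    Hypothetical helper: intended for tag keys, tag values and field keys
    written as raw line protocol. Measurement names and string field
    values follow slightly different rules, so check the docs for those.
    """
    for ch in (",", "=", " "):
        key = key.replace(ch, "\\" + ch)
    return key


# Brackets, "%" and "°" are left untouched; only the three characters above change.
print(escape_key("chart_linear_temperature[°C]"))
print(escape_key("dew point"))  # prints: dew\ point
```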

Having more than 200 metrics to track should not cause any problems as long as you don’t exceed certain limits; see “General limitations” below (this only happens if you have huge strings).

From what I’ve seen so far, there are two main approaches:

1. The Normal Structure

This is probably the one you are already planning: tags provide the context for the metrics, and each metric has its own “field” (column).

Sample:

Time  Machine   Sensor   Temperature[C]  Something
x     Machine1  Sensor1  10              12
y     Machine1  Sensor1  11              13
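
As a rough sketch of how one such row could be written as line protocol, assuming a measurement called `sensors` and float values only (both are my assumptions, and `to_line` is a hypothetical helper, not a client API):

```python
def to_line(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Build one line-protocol line: measurement,tags fields timestamp.

    Sketch only: assumes keys and values need no escaping and that all
    field values are floats (integers and strings need extra syntax).
    """
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


# One point carries every metric of the sensor as its own field.
line = to_line(
    "sensors",  # hypothetical measurement name
    {"Machine": "Machine1", "Sensor": "Sensor1"},
    {"Temperature[C]": 10.0, "Something": 12.0},
    1700000000000000000,
)
print(line)
```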

2. A sort of EAV (entity-attribute-value) structure

Here you still use tags to track the context of the metric (so far everything is as usual), plus an extra tag, “metric” or “counter”, that says what the value in each “row” represents. In the end you have only one field, “value” (if you need different data types, more fields will be needed).

Sample:

Time  Machine   Sensor   counter         value
x     Machine1  Sensor1  Temperature[C]  10
x     Machine1  Sensor1  Something       12
y     Machine1  Sensor1  Temperature[C]  11
y     Machine1  Sensor1  Something       13
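
To show how the two layouts relate, here is a small sketch that turns one “normal” row into EAV rows. The function name and the dict layout are only for illustration; the tag name “counter” is the one from the sample above.

```python
def wide_to_eav(time, tags: dict, fields: dict) -> list:
    """Split one wide row into one EAV row per metric.

    Each output row keeps the original tags, adds a "counter" tag with
    the metric name, and stores the number in the single "value" field.
    """
    return [
        {"time": time, **tags, "counter": name, "value": value}
        for name, value in fields.items()
    ]


rows = wide_to_eav(
    "x",
    {"Machine": "Machine1", "Sensor": "Sensor1"},
    {"Temperature[C]": 10, "Something": 12},
)
for row in rows:
    print(row)
```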

Pros:

  • Extremely flexible if you need to add more metrics; you simply end up with more points (rows)
  • Renaming “counters” causes fewer issues because you manage different rows instead of different columns; the structure stays “clean”, since “old” rows eventually disappear because of the retention policy (RP)

Cons:

  • Not always “comfortable” to query; you might end up with one query for each metric you want to show in your chart
  • It is not immediately clear what the measurement contains
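
To illustrate the first con: if your tool cannot pivot for you, you may have to regroup the EAV rows yourself after querying. A minimal client-side sketch, assuming the query result arrives as a list of dicts shaped like the sample above (that shape is my assumption, not something a specific client guarantees):

```python
from collections import defaultdict


def eav_to_wide(rows: list) -> dict:
    """Group EAV rows back into one dict of metrics per (time, Machine, Sensor)."""
    wide = defaultdict(dict)
    for row in rows:
        key = (row["time"], row["Machine"], row["Sensor"])
        wide[key][row["counter"]] = row["value"]
    return dict(wide)


eav_rows = [
    {"time": "x", "Machine": "Machine1", "Sensor": "Sensor1", "counter": "Temperature[C]", "value": 10},
    {"time": "x", "Machine": "Machine1", "Sensor": "Sensor1", "counter": "Something", "value": 12},
    {"time": "y", "Machine": "Machine1", "Sensor": "Sensor1", "counter": "Temperature[C]", "value": 11},
    {"time": "y", "Machine": "Machine1", "Sensor": "Sensor1", "counter": "Something", "value": 13},
]
wide_rows = eav_to_wide(eav_rows)
print(wide_rows[("x", "Machine1", "Sensor1")])
```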

General limitations

  • maximum key size: the key is built from the measurement, tag set and field key (the timestamp is not part of it, as far as I know); it cannot exceed 64 KB
  • maximum body size: the size of the HTTP request; the limit is configurable and can be disabled (see docs)
  • Encoding: Telegraf and InfluxDB encode strings in UTF-8; before using special chars, check that they can actually be represented in UTF-8
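
A quick way to sanity-check a key against the encoding and size points is to encode it yourself. This is only a sketch: the constant reflects my reading of the “64k” limit as 65535 bytes, and the check covers a single string rather than the full key the database builds.

```python
MAX_KEY_BYTES = 65535  # assumption: the "64k" limit above, expressed in bytes


def key_size_bytes(key: str) -> int:
    """Return the UTF-8 encoded size of a key; raises if it cannot be encoded."""
    return len(key.encode("utf-8"))


field_key = "chart_linear_temperature[°C]"
size = key_size_bytes(field_key)
# "°" takes two bytes in UTF-8, so the byte size is one more than the character count.
print(len(field_key), size, size <= MAX_KEY_BYTES)
```

Keys like the ones in the question are tiny compared to the limit; it only becomes relevant with very long tag values.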

Here is a list of my personal suggestions:

  • Avoid special chars when possible. They make querying harder in general
  • Put the “unit of measure” (um) inside the field name, so that it is immediately clear what the data represents. (Tools like Grafana can automatically display the data with the best “um” once you tell them the “um” of the field)
  • If the metrics are of several “types”, like cumulative value | instantaneous value | etc., it might be useful to put this information in the field name itself, so you know how to handle it correctly without having to inspect the data in each field
  • If you have that many metrics (over 200), it might not be easy to navigate/find them. Consider using different measurements (tables) if appropriate
  • I’m not a fan of the EAV design since it puts some limits on querying data, which can be more or less mitigated by the tool you use. If you can decide the structure of your data, have a “field” for each metric
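
If you do keep display type, label and unit in the field name, a small parser on the reading side keeps things manageable. The sketch below assumes the naming convention from the question, i.e. `<kind>_<style>_<label>[<unit>]`; the group names `kind` and `style` are my own guesses at what “chart” and “linear” mean.

```python
import re

# Assumed convention, taken from the keys in the question:
#   chart_linear_temperature[°C] -> kind=chart, style=linear, label=temperature, unit=°C
FIELD_KEY_RE = re.compile(
    r"^(?P<kind>[a-z]+)_(?P<style>[a-z]+)_(?P<label>[^\[\]]+)\[(?P<unit>[^\[\]]*)\]$"
)


def parse_field_key(key: str):
    """Split a field key into its parts, or return None if it does not match."""
    match = FIELD_KEY_RE.match(key)
    return match.groupdict() if match else None


for key in ("chart_linear_temperature[°C]", "chart_bar_dewpoint[°C]", "chart_linear_humidity[%]"):
    print(key, "->", parse_field_key(key))
```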