Starting to get stuck into learning Flux. After following a tutorial I find myself confused about the difference between fields and tags.
The person giving the tutorial helpfully explained that it's important to always filter by tags rather than fields where possible, for performance reasons. The person did, however, filter by _field, which I gather is a field?
I'm now confused about the _field column, as it felt like it was being used as a tag in the tutorial, and in my experience I do want to filter on it. Also, in the InfluxDB UI query builder, _field is the default second thing to filter on.
A couple of questions that may help clear up my confusion:
Can I confirm that _field is not indexed?
Should I be avoiding filtering on _field?
If I do avoid it, does it make sense to have a tag column that mirrors the _field?
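Filtering on _field is fine, and you don't need to avoid it. The usual "fields are not indexed" advice refers to field values (the _value column), not to the field key. In InfluxDB 1.x and 2.x the field key (_field) is part of the series key, together with the measurement and the tag set, so a filter such as r._field == "temperature" narrows the set of series in the same way a tag filter does. That is why the UI query builder offers _field right after _measurement, and why people filter by it all the time. What can hurt on large datasets is filtering on _value, because the engine has to read the values of every matching series to evaluate the condition. So there is no need for a tag column that mirrors _field.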
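For illustration, here is roughly what a query looks like when it narrows by time range, measurement, tag, and field key before it filters on values. This is only a sketch: the bucket `example-bucket`, the measurement `cpu`, the tag `host`, and the field `usage_user` are hypothetical names, not anything from the tutorial.

```flux
// Hypothetical bucket, measurement, tag, and field names.
from(bucket: "example-bucket")
    |> range(start: -1h)
    // Narrow the set of series first: measurement, tag, and field key
    |> filter(fn: (r) => r._measurement == "cpu")
    |> filter(fn: (r) => r.host == "server01")
    |> filter(fn: (r) => r._field == "usage_user")
    // Only then filter on the stored values themselves
    |> filter(fn: (r) => r._value > 90.0)
```

Keeping range() and the simple equality filters at the top of the query is also what the InfluxDB docs recommend for query optimization, since those steps can be handled by the storage layer rather than in memory.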
Tags should typically hold metadata that doesn't change often (e.g. location, host). If you often filter on the value of a particular field, performance is a concern, and that field has low cardinality (i.e. a small number of distinct values), storing it as a tag instead might improve performance. However, every distinct tag value creates additional series, so adding too many tags, or tags with many distinct values, increases the index size and can hurt performance.
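Before converting a field to a tag, it may help to check how many distinct values it actually holds. A rough sketch, assuming a hypothetical bucket `example-bucket`, measurement `cpu`, and a string field `status`; the 7-day window is arbitrary:

```flux
// Hypothetical names; counts the distinct values stored in one field.
from(bucket: "example-bucket")
    |> range(start: -7d)
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "status")
    // Merge all series into one table so values are compared across series
    |> group()
    |> distinct(column: "_value")
    |> count()
```

This has to read the field's values over the whole range, so it's best run once over a limited window rather than on a schedule. A small count suggests the field is a reasonable tag candidate; a large or steadily growing one suggests leaving it as a field.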