In my system, clients send multiple metrics (cpu, memory, etc.). Each client has some meta-information associated with it (e.g. account, platform, browser, network conditions, application used) and, after looking at the documentation, I was thinking of storing it as tags.
Is that a good idea? How many tags am I allowed to have, and what are the best practices? I'm worried that I may end up with 20-30 tags, causing excessive storage overhead.
On the other hand, the alternative would be to use a client id and store the id->meta-data association in a separate database, but that would complicate the queries, since I would have to retrieve the ids in advance.
I looked at the schema design suggestions in the docs, but I don't have much experience with time-series databases and I want to make the right choice. Thanks!
@eloparco The thing to consider here is series cardinality – the total number of unique tag value combinations across all data. 20-30 tags isn’t necessarily something to worry about, but 20-30 tags with 1000s of unique values each can quickly become a problem. How many unique values do your tags have?
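To make the cardinality point concrete, here is a rough Python sketch (not an InfluxDB API; the function names and the sample tag counts are made up for illustration). It counts the distinct measurement/tag-set combinations in a sample of points, and computes the worst-case upper bound you would hit if every tag value could occur with every other one.

```python
from math import prod


def series_cardinality(points):
    """Count distinct (measurement, tag set) combinations in a sample of points.

    Each point is assumed to be a dict with a "measurement" string and a
    "tags" dict; this layout is hypothetical, chosen only for the sketch.
    """
    series = {
        (p["measurement"], tuple(sorted(p["tags"].items())))
        for p in points
    }
    return len(series)


def cardinality_upper_bound(unique_values_per_tag):
    """Worst case: every tag value combined with every other tag value."""
    return prod(unique_values_per_tag.values())


# Hypothetical numbers: a few low-cardinality tags are harmless...
low = cardinality_upper_bound({"platform": 5, "browser": 6, "network": 4})

# ...but two tags with 1000s of values each multiply quickly.
high = cardinality_upper_bound({"account": 5000, "application": 2000})
```

The real number of series is usually well below the upper bound, because tag values tend to be correlated, so it is worth measuring on a sample of your own data.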
If you’re using InfluxDB Cloud or InfluxDB OSS 2.0, storing metadata in a separate database such as Postgres or MySQL wouldn’t be a bad approach. You can use join() to join data in InfluxDB with data in one of these external DBs at query time. Here’s an example: Join data with Flux.
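For anyone unsure what joining at query time amounts to, this is a plain-Python sketch of the idea only. It is not Flux and not how join() is implemented; the `client_id` key and the row layout are assumptions for the example.

```python
def join_on_client_id(metric_rows, metadata_rows):
    """Inner-join metric rows with metadata rows on a shared "client_id" key.

    metric_rows stand in for rows queried from the time-series database,
    metadata_rows for rows fetched from an external SQL database.
    """
    meta_by_id = {m["client_id"]: m for m in metadata_rows}
    joined = []
    for row in metric_rows:
        meta = meta_by_id.get(row["client_id"])
        if meta is None:
            continue  # inner join: drop metrics that have no metadata
        joined.append({**meta, **row})
    return joined


metrics = [
    {"client_id": "c1", "cpu": 12.5},
    {"client_id": "c2", "cpu": 80.0},
]
metadata = [
    {"client_id": "c1", "platform": "linux", "browser": "firefox"},
]
rows = join_on_client_id(metrics, metadata)
```

With this layout only the client id needs to be a tag, which keeps the tag set small at the cost of an extra lookup per query.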
…I was reading that it is better to use one field per measurement instead of multiple ones.
Can you point me to where you read that? Because that definitely isn't true. Really, on disk, measurements act as another tag, one that associates related points, so the more measurements you have, the more cardinality you have.
If you want to store all of those associated metrics in a single measurement, I think that’s totally fine.
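As an illustration, here is a small Python sketch that builds one line-protocol point carrying several fields in a single measurement. The measurement name `client_metrics`, the tag and field names, and the helper itself are made up for the example; it writes every field as a float and does no escaping of spaces, commas or equals signs, which real data would need.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=v,... field=v,... timestamp.

    Simplified sketch: all field values are written as floats and no
    special characters are escaped.
    """
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "client_metrics",
    {"platform": "linux", "browser": "firefox"},
    {"cpu": 12.5, "memory": 2048},
    1700000000000000000,
)
# line == "client_metrics,browser=firefox,platform=linux cpu=12.5,memory=2048.0 1700000000000000000"
```

Both metrics share one measurement and one tag set here, so they add a single series per tag combination rather than one per metric.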