Hi, I’m evaluating a migration from InfluxDB 1.8 to 2.1 and I need some help understanding the differences, how to properly migrate my structures, and what the limitations are.
Current structure (v1.8):
As of now I have 42 databases. Some are inactive and could be deleted, but realistically the number might grow beyond 42. For the sake of this discussion, let’s say I’ve got 20 DBs, but I must be able to scale by creating more databases.
- Each DB identifies a customer
- All the databases have the very same structure
- Each DB has its own read-only user
Proposed conversion steps
The conversion steps proposed by the docs for 1.8 → 2.1 are the following:
- Create one organization
- Create buckets to replace DBs and RPs, with a naming convention such as db\rp (ex: Customer_1\temp)
- Create users and assign proper authorization tokens
- Rewrite CQs to Tasks
Strictly speaking about data, it should look like this: what was once a single DB is now made of 3 buckets.
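To make the naming concrete, here is a minimal Python sketch of how I picture the DB/RP to bucket mapping. The customer names are hypothetical, and the RP names (temp, standard, long) are the three areas I describe further down; this only builds the names, it doesn't talk to InfluxDB.

```python
# Sketch: map each 1.8 database/retention-policy pair to a 2.x bucket name.
# Customer names are hypothetical; RP names are the three areas I use.

RETENTION_POLICIES = ["temp", "standard", "long"]


def bucket_names(databases, rps=RETENTION_POLICIES, sep="\\"):
    """Return {database: [bucket, ...]} using the db<sep>rp convention."""
    return {db: [f"{db}{sep}{rp}" for rp in rps] for db in databases}


customers = [f"Customer_{i}" for i in range(1, 21)]  # the 20 DBs assumed above
buckets = bucket_names(customers)

# One bucket name per line for the first customer.
print("\n".join(buckets["Customer_1"]))

# Total number of buckets across all customers (20 DBs * 3 RPs).
total_buckets = sum(len(names) for names in buckets.values())
print(total_buckets)
```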
Here are my doubts
About Organizations
- What’s an organization supposed to represent exactly? Is it proper to make it represent a customer?
- What’s the gain of having only one Org with multiple customer DBs vs multiple Orgs, each with a single customer’s DBs (buckets)?
About Tasks
I know tasks reside inside an organization. Right now I have 5 identical CQs in each database.
- In the case of a single org, can I have just one task that loops over all the buckets, or do I still need to create a task for each “customer”?
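In case a task per customer turns out to be the answer, this is roughly how I’d generate them from one template instead of maintaining them by hand. It’s only a sketch: the measurement name, the 1h interval and the mean aggregation are placeholders rather than my real CQs, and the backslash escaping reflects my understanding of Flux string literals.

```python
# Sketch: render one Flux task per customer from a single template.
# Measurement, interval and aggregation are placeholders, not my real CQs.
from string import Template

TASK_TEMPLATE = Template('''option task = {name: "downsample_$customer", every: 1h}

from(bucket: "$source")
    |> range(start: -task.every)
    |> filter(fn: (r) => r._measurement == "example_measurement")
    |> aggregateWindow(every: 1h, fn: mean)
    |> to(bucket: "$target")
''')


def flux_string(value):
    """Escape backslashes and double quotes for use inside a Flux string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def render_task(customer, source_rp="standard", target_rp="long", sep="\\"):
    """Fill the template with the source and target buckets of one customer."""
    return TASK_TEMPLATE.substitute(
        customer=flux_string(customer),
        source=flux_string(f"{customer}{sep}{source_rp}"),
        target=flux_string(f"{customer}{sep}{target_rp}"),
    )


# Hypothetical customers; each one gets its own copy of the same task.
tasks = {c: render_task(c) for c in ["Customer_1", "Customer_2"]}
print(tasks["Customer_1"])
```

With 5 CQs per customer this would mean 5 templates and one rendered task per customer per template, which is exactly the duplication I’d like to avoid if a single task can do the job.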
Other
Bucket limitation - A single InfluxDB 2.1 OSS instance supports approximately 20 buckets actively being written to or queried across all organizations depending on the use case
I’ve got basically 3 areas, each with a different write workload:
- temp → continuous low activity with small recurring peaks (every 5min)
- standard → continuous activity, most of the writing is here + some CQs results (heavy calculation, but small result)
- long → CQs result only, computed every few hours or daily, very few data points
I’m not worried about read activity as it is pretty low (mostly in case of troubleshooting)
Given this kind of workload:
- How bad is it to go over 20 buckets per instance? (With the current layout I’d have ~20*3 = 60 buckets.)
- Should I count my temp and long buckets even if their activity is pretty low?
Scaling
So far I’ve only used vertical scaling, meaning adding DBs on the existing instances, and if needed adding resources to the machine.
- Given the “20 buckets maximum” recommendation, what’s the best way to scale? It looks like vertical scaling is not as good an option as it was before…
- It’s easy and flexible to route data into the proper DB & RP inside a single Influx instance (using Telegraf), but with multiple nodes/instances it becomes way less pleasant, as you have to filter the output plugins to “route” the data to the correct instance (meaning hardcoded values inside the config).
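To show what I mean by hardcoded values, here is a small Python sketch that generates the kind of Telegraf output sections I’d end up with, one per instance, each filtered with tagpass. The URLs, the org name, the "customer" tag and the customer-to-instance assignment are all hypothetical, and bucket selection is left out to keep the focus on instance routing.

```python
# Sketch: generate one Telegraf [[outputs.influxdb_v2]] block per instance,
# filtered by a "customer" tag. URLs, org name, tag name and the
# customer-to-instance assignment are all hypothetical.

INSTANCES = {
    "http://influx-a.example:8086": ["Customer_1", "Customer_2"],
    "http://influx-b.example:8086": ["Customer_3"],
}


def output_block(url, customers, org="my-org"):
    """Return the config text for one output, limited to the given customers."""
    quoted = ", ".join(f'"{c}"' for c in customers)
    return "\n".join([
        "[[outputs.influxdb_v2]]",
        f'  urls = ["{url}"]',
        '  token = "$INFLUX_TOKEN"',  # placeholder, read from the environment
        f'  organization = "{org}"',
        # Bucket selection omitted on purpose; see the note above.
        "  [outputs.influxdb_v2.tagpass]",
        f"    customer = [{quoted}]",
    ])


config = "\n\n".join(output_block(url, names) for url, names in INSTANCES.items())
print(config)
```

Every new customer or instance means touching that mapping and redeploying the config, which is the part I find unpleasant.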
Thanks for reading this far.
I’d appreciate any suggestions and/or additional considerations.