Schema design - Multi-tenancy: multiple measurements vs one measurement with different tags for the same type of device

influxdb

We have several clients who all use the same kind of networking device (ECVs), provisioned and configured by us. Usually I would have just a single measurement (“ecv”), use the device “hostname” as the tag, and store the rest of the data as fields. However, because there are multiple independent customers, the same “hostname” can appear under different customers (e.g. two customers can each have a London-based device named “london-01”).

For me, the most elegant way (in terms of writing queries and using it in conjunction with Chronograf) would be to use the customer name as the measurement and the ECV hostname as the tag, which would then be unique within each customer (and hence easily queryable), as illustrated:

As you can see, this design makes it very easy to navigate in Chronograf to a specific device belonging to a customer, and it keeps queries for a specific customer simple.

However, I understand this is the equivalent of making 100 tables (assuming 100 customers), each containing exactly the same columns. Would this be an issue in Influx (recommended? not recommended?)? The equivalent Python code is:

return {
    "measurement": customer,
    "tags": {
        "device": all_appliances_dict[device]["hostName"]
    },

    "fields": {
        "device_id": device,
        "ip": all_appliances_dict[device]["IP"],
        "serial": all_appliances_dict[device]["serial"],
        ...
        ...
    },
}
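To make the uniqueness argument concrete, here is a minimal sketch of how two customers with the same hostname end up in separate series under this schema. It assumes a series is identified by its measurement plus tag set (ignoring retention policy and, in newer versions, the field key). The helper names `series_key` and `per_customer_point`, and the customer names and IPs, are hypothetical and only for illustration.

```python
def series_key(point):
    """Build an identifier from the measurement and the sorted tag set."""
    tags = ",".join(f"{k}={v}" for k, v in sorted(point["tags"].items()))
    return f'{point["measurement"]},{tags}' if tags else point["measurement"]


def per_customer_point(customer, hostname, fields):
    """Schema 1: the customer name is the measurement, the hostname is the only tag."""
    return {
        "measurement": customer,
        "tags": {"device": hostname},
        "fields": fields,
    }


# Two hypothetical customers that both have a device called "london-01".
point_a = per_customer_point("customer_a", "london-01", {"ip": "10.0.0.1"})
point_b = per_customer_point("customer_b", "london-01", {"ip": "10.1.0.1"})

print(series_key(point_a))
print(series_key(point_b))
```

The two keys differ only in the measurement part, which is what keeps the duplicated hostname from colliding, at the cost of one measurement per customer.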

If this is not a good design, the alternative would be to change the schema to the following, so that everything falls under a single measurement named “ecv” and the tags are the customer name and the device hostname, which together identify a unique series for each network appliance:

Although this seems more in tune with the way data is supposed to be ingested by Influx, it makes querying a bit less convenient and Chronograf navigation more tedious. If I select a customer, Chronograf still lists all the devices in the entire measurement, whereas I am only interested in the devices for that specific customer:

Equivalent Python code:

return {
    "measurement": "ecv",
    "tags": {
        "orchestrator": self.orchestrator.ip,
        "hostname": self.all_appliances[device]["hostName"]
    },

    "fields": {
        "softwareVersion": self.all_appliances[device]["softwareVersion"],
        "ip": self.all_appliances[device]["IP"],
        "serial": self.all_appliances[device]["serial"],
        ...
        ...
    },
}
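For comparison, here is a rough sketch of the single-measurement variant with an explicit `customer_name` tag, as described in the text (the snippet above tags by `orchestrator` instead), plus an InfluxQL query that lists only one customer's hostnames. The function names, the `customer_name` tag key, and the sample values are assumptions for illustration, and the query builder simply rejects names containing quotes or backslashes rather than attempting to escape them.

```python
def ecv_point(customer_name, hostname, fields):
    """Schema 2: one shared measurement, customer and hostname both stored as tags."""
    return {
        "measurement": "ecv",
        "tags": {"customer_name": customer_name, "hostname": hostname},
        "fields": fields,
    }


def hostnames_for_customer_query(customer_name):
    """Return an InfluxQL statement listing the hostnames of a single customer."""
    if "'" in customer_name or "\\" in customer_name:
        # Keep the sketch simple: refuse values that would need escaping.
        raise ValueError("customer_name must not contain quotes or backslashes")
    return (
        'SHOW TAG VALUES FROM "ecv" WITH KEY = "hostname" '
        f"WHERE \"customer_name\" = '{customer_name}'"
    )


point = ecv_point("customer_a", "london-01", {"ip": "10.0.0.1"})
query = hostnames_for_customer_query("customer_a")
print(point["tags"])
print(query)
```

A filter like this could also back a dashboard template variable, so that the device list is narrowed to the selected customer even though all devices share one measurement.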