I have a schema design question for you.
Imagine data flowing in from satellites, where each record contains a satellite name, subsystem name, metric name, metric value, and timestamp.
Basically, telemetry that could be restructured to look something like this:
telemetry,satellite=Hubble,subsystem=gps lat=90,lng=10 1465839830100400200
telemetry,satellite=Hubble,subsystem=camera fov=90 1465839830100400300
telemetry,satellite=ISS,subsystem=gps lat=10,lng=20 1465839830100400400
etc.
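For concreteness, here is a minimal Python sketch of the restructuring I have in mind: metrics that share a satellite, subsystem, and timestamp are grouped into a single line with multiple fields. The `records` tuples and the `to_multi_field_lines` helper are hypothetical names made up for illustration, and the sketch assumes names contain no spaces or commas, so it skips line protocol escaping.

```python
from collections import defaultdict

# Hypothetical raw records: (satellite, subsystem, metric, value, timestamp_ns)
records = [
    ("Hubble", "gps", "lat", 90, 1465839830100400200),
    ("Hubble", "gps", "lng", 10, 1465839830100400200),
    ("Hubble", "camera", "fov", 90, 1465839830100400300),
    ("ISS", "gps", "lat", 10, 1465839830100400400),
    ("ISS", "gps", "lng", 20, 1465839830100400400),
]


def to_multi_field_lines(records, measurement="telemetry"):
    """Group metrics that share satellite, subsystem, and timestamp into one line."""
    grouped = defaultdict(dict)
    for satellite, subsystem, metric, value, ts in records:
        grouped[(satellite, subsystem, ts)][metric] = value

    lines = []
    for (satellite, subsystem, ts), fields in grouped.items():
        # Assumption: no escaping needed (no spaces, commas, or '=' in names).
        field_set = ",".join(f"{name}={value}" for name, value in fields.items())
        lines.append(
            f"{measurement},satellite={satellite},subsystem={subsystem} {field_set} {ts}"
        )
    return lines


for line in to_multi_field_lines(records):
    print(line)
```

Run on the five sample records, this yields the three lines shown above.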
Users would want to be able to compare metrics (such as the gps values above) across satellites and subsystems.
My questions are:
- Is it better to store a single field or multiple fields per point in InfluxDB? That is, we could store multiple metrics per point by using each metric name as a field key, as in the line protocol example above. Alternatively, we could add a metric tag and a single value field, like this:
telemetry,satellite=Hubble,subsystem=gps,metric=lat value=90 1465839830100400200
telemetry,satellite=Hubble,subsystem=gps,metric=lng value=10 1465839830100400200
Note that the data is sparse: many metrics (fields) will exist on only a single subsystem. Is one approach better than the other? Are there performance implications?
- Would it make sense to replace the measurement name (currently “telemetry”) with the satellite name (e.g., “Hubble”, “ISS”), creating many more measurements that each have lower cardinality? Could users still compare across satellites? E.g.:
hubble,subsystem=gps lat=90,lng=10 1465839830100400200 # with many fields
hubble,subsystem=gps,metric=lat value=90 1465839830100400200 # or with metric as a tag instead of a field
Again, are there performance implications to consider?
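To put the four candidate layouts side by side, here is a small Python sketch that writes the same gps reading in each of them. The `layouts` function and its dictionary keys are hypothetical, made up for this post, and it again assumes names that need no line protocol escaping.

```python
def layouts(satellite, subsystem, metrics, ts):
    """Return one reading rendered in the four schema layouts under discussion."""
    field_set = ",".join(f"{name}={value}" for name, value in metrics.items())
    per_sat = satellite.lower()  # measurement name when using one measurement per satellite
    return {
        # One shared measurement, one field per metric
        "shared_multi_field": [
            f"telemetry,satellite={satellite},subsystem={subsystem} {field_set} {ts}"
        ],
        # One shared measurement, metric name as a tag, single "value" field
        "shared_metric_tag": [
            f"telemetry,satellite={satellite},subsystem={subsystem},metric={name} value={value} {ts}"
            for name, value in metrics.items()
        ],
        # One measurement per satellite, one field per metric
        "per_satellite_multi_field": [
            f"{per_sat},subsystem={subsystem} {field_set} {ts}"
        ],
        # One measurement per satellite, metric name as a tag
        "per_satellite_metric_tag": [
            f"{per_sat},subsystem={subsystem},metric={name} value={value} {ts}"
            for name, value in metrics.items()
        ],
    }


reading = layouts("Hubble", "gps", {"lat": 90, "lng": 10}, 1465839830100400200)
for name, lines in reading.items():
    print(name)
    for line in lines:
        print("  " + line)
```

The metric-as-tag layouts write one line per metric, so the same reading produces two lines instead of one.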
Any thoughts are much appreciated!