Best practices for naming measurements and not repeating myself

schema
time-series
influxdb
telegraf

As I start instrumenting my software to send data to Telegraf and on to InfluxDB, I’ve been searching for tips on how best to name things. I found this good advice for a start.

A bit too often, I end up sending data where the measurement name and the field name are identical, which suggests I’m doing something wrong.

Here’s an example:

cycle_milliseconds,program_name=tyrannosaur cycle_milliseconds=2.991858

(There are some other tags that I’m skipping here, like the hostname and the git revision the program was built from.)

The thing I’m measuring is a property of the program called tyrannosaur. Perhaps I should instead name the measurement tyrannosaur and drop the program_name tag? After all, it never really makes sense to plot cycle_milliseconds for different programs on the same curve. I then get a very wide measurement, tyrannosaur, with lots of fields (as I measure other things about that program).

tyrannosaur cycle_milliseconds=2.991858

(This is what I’m currently doing.) On the other hand, the canonical example is a temperature time series, looking something like this:

temperature,room=10-250 celsius=17.9,relative_humidity=0.42

which would argue for

cycle,program_name=tyrannosaur cycle_milliseconds=2.991858

Often I have only one field to report at a time (like time through a loop or across a function, number of things processed in a batch, etc.).
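To make the three candidate layouts easier to compare, here is a minimal Python sketch that builds each line from a measurement name, a tag dict, and a field dict. The helper names (`escape_key`, `escape_measurement`, `to_line`) are my own invention for illustration, not part of any InfluxDB or Telegraf client library, and the sketch assumes float-valued fields only and omits the timestamp.

```python
def escape_key(text: str) -> str:
    # Tag keys, tag values, and field keys need commas, equals signs,
    # and spaces backslash-escaped in line protocol.
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def escape_measurement(text: str) -> str:
    # Measurement names only need commas and spaces escaped.
    for ch in (",", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement: str, tags: dict, fields: dict) -> str:
    """Hypothetical helper: format one point (float fields only, no timestamp)."""
    head = escape_measurement(measurement)
    # Sorting tag keys is optional, but it keeps the output deterministic.
    for key in sorted(tags):
        head += "," + escape_key(key) + "=" + escape_key(tags[key])
    body = ",".join(
        escape_key(name) + "=" + repr(float(value))
        for name, value in fields.items()
    )
    return head + " " + body


fields = {"cycle_milliseconds": 2.991858}

# Layout A: measurement name and field name are identical.
layout_a = to_line("cycle_milliseconds", {"program_name": "tyrannosaur"}, fields)
# Layout B: one wide measurement per program, no program_name tag.
layout_b = to_line("tyrannosaur", {}, fields)
# Layout C: measurement names the thing observed, the tag names the program.
layout_c = to_line("cycle", {"program_name": "tyrannosaur"}, fields)

for line in (layout_a, layout_b, layout_c):
    print(line)
```

Seen this way, the three options differ only in where the program name lives (tag or measurement) and in how much of the field name gets repeated in the measurement name.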

Or perhaps I’m thinking of this entirely in the wrong way.

This is intentionally open ended: suggestions and pointers most welcome.