Storing additional timestamp-type information as a field?

I’m storing some time-series data from find3 in InfluxDB.

One of the fields will be a “first_seen” field, which is a timestamp of when a given device was last-seen.

Example might be:

      "devices": [
        {
          "device": "wifi-fa:d2:44:55:d0:dd",
          "vendor": "?",
          "timestamp": "2018-07-21T22:22:52.833Z",
          "probability": 0.21,
          "randomized": true,
          "num_scanners": 1,
          "active_mins": 1,
          "first_seen": "2018-07-21T22:22:52.833Z"
        },

The plan is to have device, probability, active_mins and first_seen be fields, and vendor, randomized, num_scanners be tags.

According to the line protocol reference, fields can be of type float, integers, strings or booleans. But there’s no timestamp type?

Ideally I’d like to be able to do timestamp operations in InfluxDB on this field (e.g. earlier/later than comparisons etc.).

How do you do this in InfluxDB?

2 Likes

You are correct that fields can be floats, integers, strings, or booleans. At present, there is no timestamp type for fields. The docs give some good general recommendations for your InfluxDB schema design.

Could you also give an example of some of the timestamp operations you’d like to perform on InfluxDB using the “first_seen” field?

Example - I’d like to do deltas between first_seen and say, the current time (i.e. now), or the timestamp of that particular point.

I would store it as a nanoseconds since epoch integer.

1 Like

I’d love to know why the available field types doesn’t include timestamp. Is it a technical limitation? Or a design decision? Is there a discussion on this in an Influx Github repo somewhere? I’m asking solely out of curiosity. Given the nature of a time series database it does seem natural that a field could include a timestamp and that it could be operated upon similarly as the timestamp of a point itself (e.g. WHERE clause, etc.). I get that an integer field of nanoseconds is essentially equivalent to a timestamp, but without recognizing it as a timestamp there are various limitations in truly time-based queries and operations with it.

Hi @mkarlesky,

It is an interesting question , for me it does not seem natural.
I wonder what would be the added value of an extra timestamp field …
Do you have an example in mind ?

I can think of a couple cases I’ve encountered recently. The first is only inspired by some recent work; the second is an actual set of circumstances we are still working out because timestamps aren’t field types.

  1. Coordination of time keeping sources (a la NTP). The point in Influx would be timestamped with reference time while the point fields would include time as measured by remote devices. The math between point timestamp and field timestamp as well as between field timestamps in multiple points would allow tracking drifts, corrections, lags, etc. Perhaps there’s a way to do this solely with point timestamps. And, certainly there is a way to hack it by storing integer tick counts as fields. But, it would surely be easier and more effective to do this with time as a native field type and using Influx’s native time math and query abilities (especially as Flux comes online).
  2. Recording a “batch” that has a beginning time stamp and ending timestamp. Without timestamps as a field type, we may well have to separate the details of a batch into two different measurements. This necessitates some sort of unique Batch ID to correlate the measurements. But, since batches are unique and numerous, storing that unique ID as a tag causes cardinality issues. At least one good solution to this schema challenge would instead be the ability to store a single batch with both timestamps in a single point. Of course, we can store tick counts or durations instead of timestamps or use a field instead of a tag to store a Batch ID, but both of these solutions are probably a bit clunkier than storing well managed timestamps as fields.

Is there an update on this? This is essential for things like temperature forecasts where you need to store when the forecast was made and what day the forecast is for.

I second this request, In my case, there is a downstream application which receives stream of events. Those events have start and end times and they need to be stored so that users can later look up events in a specific time frame. This requires querying against start time and end time fields. Breaking it into two points makes code more complex and wastes storage.

Thanks!

Is there any update on this?

I would love that feature to have multiple timestamps in a measurement. In case you’re wondering for a valid use case .This grafana plugin (Gantt plugin for Grafana | Grafana Labs )
expects start and end time to be returned in order for it to create a gantt chart. Now the absence of multiple timestamps makes it impossible for this plugin to go well with influxdb.

That plug-in docs imply that you can customize it to suit the data you have, even telling it which fields contain the start and end times.

This means that although influxdb might not store the data natively in the way you want it, you should be able to use a query (from graphana ) to pull the data out and rearrange it into the structure this gantt module wants.