Timestamp uniqueness workarounds

@jackzampolin This is a follow-up question from the 30/3/2017 webinar Optimizing TICK Stack.

The question was whether there are plans to "fix" the problem of storing two points with the same timestamp, either by

  • Adding another dimension to the timestamp field, e.g. 1490893591.1, 1490893591.2 … 1490893591.n
  • Aggregating the fields instead of merging them, e.g. 120 and 130 become 125 via a mean, median, or whatever makes sense
  • Any other solution?

This happens on our servers, where around 50 concurrent requests per server each generate one point in Influx. We use the random-ns workaround to avoid possible collisions, since the unique-tag workaround might create a series cardinality issue.
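
For reference, the workaround looks roughly like this (a simplified sketch, not our exact code; the class and method names are made up):

```csharp
using System;

// Simplified sketch of the random-ns workaround: keep the real event
// time at ms precision and scatter points that land in the same ms
// across the 1,000,000 ns slots inside it.
static class JitteredTimestamp
{
    private static readonly Random Rng = new Random(); // note: not thread-safe

    // Returns a Unix timestamp in nanoseconds: the event time truncated
    // to the ms, plus a random 0..999999 ns offset.
    public static long UnixNanos(DateTimeOffset eventTime)
    {
        long unixMs = eventTime.ToUnixTimeMilliseconds();
        return unixMs * 1000000L + Rng.Next(0, 1000000);
    }
}
```

Something like JitteredTimestamp.UnixNanos(DateTimeOffset.UtcNow) then feeds the ns-precision write.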

By the way, does Telegraf handle this problem when ingesting HAProxy logs, which have ms accuracy, when a lot of requests happen in the same ms?

thanks

@mantzas We normally tag by host, or in a multi-tenant situation by process_id, and have the client set the timestamps. With ns timestamps I've never heard of people seeing collisions in this situation. A couple of questions:

  • How many requests per second is each of your server instances handling?
  • Also, how many servers are you monitoring? This should help you establish how many series will be created with proper tagging; series cardinality is roughly the product of the distinct values per tag, e.g. 50 hosts × 10 response codes = 500 series. If that number is below 1M, there is no need to be concerned about cardinality.

@jackzampolin

  • around 25 req/s
  • Series cardinality is already high, since we tag with HTTP response code, host, and some other tags, which raises the bar.

With 1B ns in 1 second, the likelihood of a timestamp collision in that case would be negligible. It also sounds like you have tagged properly, so the percentage of collisions would be even smaller. Are you seeing issues with this in your production environment?

Yes, I had some, due to the fact that I add random nanoseconds to the timestamp. The number is small; around 40 in 20M was a number I calculated.

The reason for this is probably the random number generator. If Influx did not have this restriction it would be much nicer, since I would not have to tamper with the time at all.
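
For scale, a rough birthday-problem estimate with the numbers from this thread shows how rare collisions should be with a uniform RNG:

```csharp
using System;

// Birthday-problem estimate: 25 points per second, each given a
// uniformly random ns offset within that second (1e9 slots).
class CollisionEstimate
{
    static void Main()
    {
        double n = 25;                    // points sharing one second
        double d = 1e9;                   // ns slots per second
        double pPerSecond = 1 - Math.Exp(-n * (n - 1) / (2 * d));
        double seconds = 20e6 / n;        // seconds needed for 20M points
        Console.WriteLine(pPerSecond);            // ~3.0e-7 per second
        Console.WriteLine(pPerSecond * seconds);  // ~0.24 expected in 20M points
    }
}
```

Seeing ~40 collisions instead of ~0.24 suggests the jitter is not uniform over the full ns range, which fits the RNG explanation.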

What does Telegraf do with sources that have millisecond resolution, like HAProxy logs? Does it add random ns to them?

Just to be clear, the decision to use last write wins and to make the timestamp part of the primary key is one of the decisions that was made when designing the database. I don’t anticipate that will change anytime soon.

Telegraf generates ns timestamps for all points that it creates. In the case of the logparser, where it pulls the timestamp from the source, it passes that timestamp through to the database.

In your case, properly tagging these values to differentiate them should take care of the timestamp collisions. Is there anything preventing you from generating ns timestamps for this data at the host? What level of tolerance do you have for this behavior?
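
For example, two points at the exact same ns timestamp land in different series, and both survive, as long as their tag sets differ (a hypothetical sketch; the measurement and tag names are made up):

```csharp
using System;

class TagDifferentiation
{
    static void Main()
    {
        // InfluxDB only overwrites a point when measurement, tag set, and
        // timestamp all match; the differing "worker" tag keeps these two
        // points apart despite the shared timestamp.
        const long ts = 1490893591000000000;
        Console.WriteLine($"requests,host=web01,worker=1 duration_ms=120 {ts}");
        Console.WriteLine($"requests,host=web01,worker=2 duration_ms=130 {ts}");
    }
}
```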

I have already implemented a component (C#) which returns a unique timestamp with tick precision every time I call the get-timestamp function. (On MS Windows there are 10,000 ticks in a millisecond.) This is the smallest amount of time the OS has to offer, so I am covered per machine. On Linux I have to check how to do the same thing.
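
The core of it looks roughly like this (a simplified sketch, not the actual component):

```csharp
using System;
using System.Threading;

// Hand out a strictly increasing tick value so no two calls on this
// machine ever return the same timestamp. One .NET tick = 100 ns.
static class UniqueTimestamp
{
    private static long _lastTicks;

    public static long NextTicks()
    {
        while (true)
        {
            long last = Interlocked.Read(ref _lastTicks);
            long now = DateTime.UtcNow.Ticks;
            // If the clock has not moved past the last value handed out,
            // bump by one tick instead of reusing it.
            long next = now > last ? now : last + 1;
            if (Interlocked.CompareExchange(ref _lastTicks, next, last) == last)
                return next;
        }
    }
}
```

Converting to Unix ns for the write is then (ticks - epochTicks) * 100, where epochTicks is new DateTime(1970, 1, 1).Ticks.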

Now I have to use this component everywhere I can to get unique timestamps per machine.

Since Telegraf handles this on its own, I think we are OK.
