Telegraf logfmt timestamp

I spun up a configuration that uses the tail input plugin with the “logfmt” input data format. The input plugin appears to be working as expected. However, I have not been able to figure out how to specify which field contains the timestamp and what format it is in. Can this be specified with the logfmt input data format? Or is there a way to use a processor to handle the timestamp?
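For reference, the input side of the configuration looks roughly like this (a sketch only; the file path and tail options are mirrored from the grok variant shown later in this thread):

[[inputs.tail]]
  files = ["/motescan/scannet03/log/ag_history.log"]
  from_beginning = false
  pipe = false
  watch_method = "inotify"
  data_format = "logfmt"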

Here is a sample of the logfmt data:

ts=2019-03-18T19:09:56.018 id=agg level=info uid=00010000161994C4 addr=31482 msg="request log success"
ts=2019-03-18T19:09:56.022 id=agg level=info uid=00010000161994C4 addr=31482 msg="device aggregation complete"
ts=2019-03-18T19:09:56.027 id=network level=info addr=0 msg="request status"
ts=2019-03-18T19:09:56.811 id=zigbee level=info uid=0001000013B6AECF addr=0 addr_list="[34422, 4777, 31482]" msg="coordinator children"
ts=2019-03-18T19:13:57.340 id=zigbee level=info uid=0001000013B6AECF addr=0 addr_list="[34422, 4777, 31482]" msg="coordinator children"
ts=2019-03-18T19:13:57.352 id=agg level=info uid_list="['0001000016198841', '00010000167A201B', '00010000161994C4']" msg="devices processed" motes_processed="3"
ts=2019-03-18T19:13:57.355 id=cmd level=info msg="network discovery done"

There currently isn’t anything in the logfmt parser that lets you specify which field contains the timestamp; it seems to always use the time at processing. If that’s not ideal, you could open a feature request to add an option like the json parser’s json_time_key. Otherwise, you should at least be able to drop the ts tag.
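For comparison, this is roughly what the json parser’s timestamp options look like (a sketch only; the file input and path are placeholders, and the logfmt parser currently has no equivalent options):

[[inputs.file]]
  files = ["/path/to/sample.json"]
  data_format = "json"
  # json_time_key names the field that holds the timestamp and json_time_format
  # gives its layout; a hypothetical logfmt equivalent is what the feature request would ask for.
  json_time_key = "ts"
  json_time_format = "2006-01-02T15:04:05.999"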

Just to verify, is ts being output as a tag or a field?

I attempted to use a parser processor plugin to do this as follows:

[[processors.parser]]
  namepass = [ "test2" ]
  order = 1
  # run the grok parser over the value of the "ts" field
  parse_fields = [ "ts" ]
  drop_original = false
  merge = "override"
  data_format = "grok"
  # the ts-"..." modifier should parse the capture with this Go time layout and use it as the metric timestamp
  grok_patterns = ['^%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02T15:04:05.999"}']

However, the timestamp is not being used for the metric. Here is the output:

test2,application_name=logfmt,city=edmonton,country=ca,device_type=gateway,device_version=2v0,host=ip-xxx,hostname=scannet03,id=coord_reset,level=info,organization=scanimetrics,path=/motescan/scannet03/log/ag_history.log,project=P100 msg="Coordinator reset",delay=1i,ts="2019-03-18T20:05:26.229" 1552941714177679876

Hi @glinton, ts is a field. I thought using a parser processor would let me extract the timestamp, but it doesn’t seem to do the trick.

@glinton There is a workaround for this, but it is kind of ugly. If I reverse the order of what I tried earlier, I can almost make it work. That is, I use a tail input plugin configured with the grok input data format: the timestamp is captured from the log line, and the remainder of the line is captured into a field (I called it “message”) as follows:

[[inputs.tail]]
  files = ["/motescan/scannet03/log/ag_history.log"]
  name_override = "test3" 
  from_beginning = false
  pipe = false
  watch_method = "inotify"
  data_format = "grok"
  # capture the leading timestamp as the metric time; the rest of the line goes into the "message" field
  grok_patterns = ['^ts=%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02T15:04:05.999"} %{GREEDYDATA:message:string}']
  [inputs.tail.tags]
    application_name = "grok"
    device_type = "gateway"
    device_version = "2v0"
    sensor_type = ""
    country = "ca"
    city = "edmonton"
    organization = "scanimetrics"
    project = "P100"
    hostname = "scannet03"

Then, I used a parser processor configured to parse the logfmt data captured in the “message” field as follows:

[[processors.parser]]
  namepass = [ "test3" ]
  order = 1
  parse_fields = [ "message" ]
  drop_original = false
  merge = "override"
  data_format = "logfmt"

Here is the output:

test3,application_name=grok,city=edmonton,country=ca,device_type=gateway,device_version=2v0,host=ip-xxx,hostname=scannet03,id=coord_reset,level=info,organization=scanimetrics,path=/motescan/scannet03/log/ag_history.log,project=P100 message="level=info id=coord_reset msg=\"Coordinator reset\" delay=1",msg="Coordinator reset",delay=1i 1552939526229000000

The unfortunate consequence is that the “message” field passes through to the output. I had a look at the “fielddrop” filter, but it appears you cannot use it inside an output. And if I try to use it in the parser processor, nothing passes to the output, suggesting the drop happens before the parser processes the field.

Is there another way I can drop the unwanted field?

I may be incorrect, but I think if you set drop_original = true, the original message field will not get passed through to the output. If that doesn’t work, setting fielddrop = ["message"] should do the trick; see the metric filtering section of the Telegraf configuration docs.
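In other words, something like this (a sketch of that suggestion applied to the processor from the previous post):

[[processors.parser]]
  namepass = [ "test3" ]
  order = 1
  parse_fields = [ "message" ]
  # intent: emit only the parsed metric so the raw "message" field does not pass through
  drop_original = true
  merge = "override"
  data_format = "logfmt"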

I also noted many of the following error messages in telegraf.log:

`2019-03-18T23:05:25Z E! [processors.parser] could not parse field message: logfmt syntax error at pos 44 on line 1: unterminated quoted value`

I did not encounter these errors when I configured the tail plugin to use the logfmt input data format.

There are a few lines similar to the following in the log. Could these lines be the ones causing the problem with the parser?

`ts=2019-03-18T23:05:29.914 level=info id=zigbee uid=0001000013B6AECF addr=0 msg="config liveMode=0"`

Note the liveMode=0 inside the quoted string.

Hi @glinton

Unfortunately, the drop_original = true parameter drops everything, including the tags that I added with the input plugin’s [inputs.tail.tags] table.

Regarding fielddrop, the documentation indicates that you cannot use it in the output plugin. Is that correct?

Regarding the error message in the telegraf log, the equals sign inside the quoted string is not the problem. I think I have found a bug in the grok input data format parser. Here is an example:

input:
ts=2019-03-19T13:05:27.679 level=info id=cmdopts nwkwait=480 msg="ok"

output:
test3,application_name=grok,city=edmonton,country=ca,device_type=gateway,device_version=2v0,host=ip-xxx,hostname=scannet03,organization=scanimetrics,path=/motescan/scannet03/log/ag_history.log,project=P100 message="level=info id=cmdopts nwkwait=480 msg=\"ok" 1553000727679000000

Note the quoted string value for the “msg” field in the input line. Also note the position of this field in the input line and particularly the double quote at the end of the line.

In the output line there are two issues. First, the “message” field is missing the second escaped double quote that goes with the “msg” field value. Second, the “message” field has not been parsed by the logfmt parser. I now believe this is what is throwing the logfmt syntax error message in the telegraf log (unterminated quoted value). Accordingly, the parser processor ignores the line which explains why the parsed fields and tags do not appear in the output.

I believe I have tracked this problem down and isolated it to log lines that end with a double quote. That is, lines ending with msg="some text" are mangled and are missing the second escaped double quote. I have confirmed that any input line that does not end with a double quote is parsed correctly by the grok data format parser and passed along to the logfmt parser processor. This appears to be a bug, and I think it is in the grok input data format.

This problem appears to be the same as the one reported in issue 4998.