Tail and count log file with tags

dataviruset · September 5, 2018, 4:05am

I’m trying to use the valuecounter aggregator to count occurrences of certain values in my tail input (I used logparser before, but changed to tail since logparser is deprecated in Telegraf 1.8+).
It works as expected, but I want the output separated by tags (level: WARN, level: INFO, etc.) instead of multiple fields called level_WARN and level_INFO, because that makes it easier to do GROUP BY and so on.
But the valuecounter aggregator only seems to support outputting to new fields. So I tried the converter processor, but it seems to work only with the raw data and not with the counts output by valuecounter.

Any clues on how I should proceed?
This is my telegraf config.

[[inputs.tail]]
  files = [
    "/data/rocketmq/logs/rocketmqlogs/broker_default.log",
    "/data/rocketmq/logs/rocketmqlogs/broker.log",
    "/data/rocketmq/logs/rocketmqlogs/namesrv_default.log",
    "/data/rocketmq/logs/rocketmqlogs/namesrv.log",
    "/data/rocketmq/logs/rocketmqlogs/remoting.log",
    "/data/rocketmq/logs/rocketmqlogs/rocketmq_client.log",
    "/data/rocketmq/logs/rocketmqlogs/stats.log",
    "/data/rocketmq/logs/rocketmqlogs/storeerror.log",
    "/data/rocketmq/logs/rocketmqlogs/store.log",
    "/data/rocketmq/logs/rocketmqlogs/tools_default.log",
    "/data/rocketmq/logs/rocketmqlogs/transaction.log",
  ]
  data_format = "grok"
  grok_patterns = ["%{MQ_LOG_GENERIC}"]
  grok_custom_patterns = '''
MQ_LOG_GENERIC %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{WORD:class} -%{GREEDYDATA:data:drop}
    '''

[[aggregators.valuecounter]]
  period = "10s"
  drop_original = false
  fields = ["level", "class"]
  namepass = ["tail"]

# Tried using this, but seems to only work with the tail raw output (for example reading the "level" field)
[[processors.converter]]
  [processors.converter.fields]
    tag = ["class_*", "level_*"]

This is the error message I got when I tried to use the converter processor:

Sep 04 08:32:40 rocketmq1.testcluster.local telegraf[16064]: 2018-09-05T02:32:40Z E! [serializers.influx] could not serialize metric: "tail,class_NettyServerCodecThread_1=2,host=rocketmq1.testcluster.local,level_INFO=2,path=/data/rocketmq/logs/rocketmqlogs/namesrv.log": no serializable fields; discarding metric

Example input data:

2018-08-29 16:55:14,014 INFO RocketmqClient - RebalanceService service end

I want to be able to use queries like these to get the data out:

SELECT mean("level") AS "mean_level" FROM "mq"."autogen"."tail" WHERE time > now() - 1h GROUP BY time(:interval:), "level" FILL(null)

daniel · September 5, 2018, 9:46pm

It might be best to do this at query time, unless you really need to remove the original metric. Just change your grok pattern to save level as a tag, %{LOGLEVEL:level:tag}, and then you can query it from InfluxDB:

select count(class) from tail where time > now() - 1h group by level

You can pick any field for the count function, so long as it is always set for what you want to count.

You wouldn’t want to add the counts as a tag because it would create very high series cardinality, which would add a lot of load to the server. Additionally, it is nicer to keep counts as fields because then you can do mathematical operations and comparisons on them, while tags are always strings.
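
A minimal sketch of the suggestion above, assuming the same log layout as in the original post. Only the level capture changes, gaining the :tag modifier, while class stays a field so that there is something to count. The single file path is one entry from the original list, used here for brevity.

```toml
[[inputs.tail]]
  # One file from the original list, for brevity
  files = ["/data/rocketmq/logs/rocketmqlogs/broker.log"]
  data_format = "grok"
  grok_patterns = ["%{MQ_LOG_GENERIC}"]
  # level is stored as a tag; class has no modifier, so it remains
  # a string field that count() can be applied to at query time
  grok_custom_patterns = '''
MQ_LOG_GENERIC %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level:tag} %{WORD:class} -%{GREEDYDATA:data:drop}
'''
```

With a pattern like this the valuecounter aggregator should not be needed for per-level counts, since the counting happens in the query.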

dataviruset · September 6, 2018, 3:43am

What if I wanted to add both the level and class fields as tags and drop the timestamp from the log (in most cases I can use the time at which the row was inserted into InfluxDB)? And just save a log row count with these two tags in InfluxDB?

I can’t do this as there will be only tags and no field:

MQ_LOG_GENERIC %{TIMESTAMP_ISO8601:timestamp:drop} %{LOGLEVEL:level:tag} %{WORD:class:tag} -%{GREEDYDATA:data:drop}

Result:

Sep 06 11:40:38 rocketmq1.testcluster.local telegraf[18426]: 2018-09-06T03:40:38Z E! Error in plugin [inputs.tail]: E! Malformed log line in /data/rocketmq/logs/rocketmqlogs/remoting.log: [2018-09-06 11:40:38 INFO NettyServerCodecThread_6 - NETTY SERVER PIPELINE: channelInactive, the channel[123.4.5.6:43398]], Error: grok: must have one or more fields

It works if I keep one field, such as timestamp, level or class.
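
A sketch of that working variant, assuming it is acceptable to store the timestamp string as the one remaining field. The timestamp capture carries no modifier, so it is kept as a string field, while level and class become tags. The file path is one entry from the original list.

```toml
[[inputs.tail]]
  files = ["/data/rocketmq/logs/rocketmqlogs/remoting.log"]
  data_format = "grok"
  grok_patterns = ["%{MQ_LOG_GENERIC}"]
  # timestamp is kept as a string field (no modifier);
  # level and class become tags, and the message text is dropped
  grok_custom_patterns = '''
MQ_LOG_GENERIC %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level:tag} %{WORD:class:tag} -%{GREEDYDATA:data:drop}
'''
```

Counting the timestamp field grouped by level and class would then give a row count per tag combination. Note that every distinct class value creates its own series, so this only makes sense if the set of class names is small and bounded.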

daniel · September 6, 2018, 6:02pm

It’s not possible to store tags in InfluxDB without at least one field. We do have plans to support adding static fields in Telegraf for cases where you want everything to be a tag: Create static fields parsing log files · Issue #2564 · influxdata/telegraf · GitHub

dataviruset · September 7, 2018, 2:25am

Well, in my specific case I only care about the count, so the only field could be a count (basically an int field). One solution would be to let inputs.tail count the lines instead of saving the raw contents. I guess this would be a good feature request, unless it can be done with the existing processors or other plugins.
