Add tag if grok pattern matched

Hello,

I have logs coming from a syslog input, and some of them match a specific grok pattern. I have to record matched and unmatched log entries to different storage, so I want to add a tag depending on whether the grok pattern matched. But it seems that [processors.parser.tags] doesn’t work (the tag is not added). Did I do something wrong?

My config file:

[[inputs.syslog]]
  server = "udp://:514"
  name_override = "nginx_errors"
  tagexclude = ["appname", "facility", "host", "severity", "source"]
  fielddrop = ["facility_code", "severity_code", "version"]

[[processors.parser]]
  parse_fields = ["message"]
  merge = "override-with-timestamp"
  drop_original = false
  data_format = "grok"
  grok_patterns = ............
  grok_custom_patterns = ............

  [processors.parser.tags]
    nginx_lua_errors = "1"

[[outputs.file]]
  files = ["stdout"]

I doubt [__.tags] can be used in processor plugins, except for processors.override.

To achieve that, I would add a processors.override that filters (via tagpass) only the relevant metrics and adds that static tag.
You may also set it to 1 by default (in the input itself) and override it to 0 in the processors.override.
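
As a rough sketch of that default-then-override idea: the input attaches the tag with a default value, and a filtered processors.override replaces it on a subset of metrics. The tag name some_tag used in the filter is purely hypothetical here; it stands in for whatever tag actually distinguishes the two kinds of metrics in your data.

```toml
[[inputs.syslog]]
  server = "udp://:514"
  name_override = "nginx_errors"

  ## default value, attached to every metric from this input
  [inputs.syslog.tags]
    nginx_lua_errors = "1"

[[processors.override]]
  ## runs only on metrics carrying the hypothetical tag "some_tag"
  [processors.override.tagpass]
    some_tag = ["*"]

  ## replaces the default on the metrics selected above
  [processors.override.tags]
    nginx_lua_errors = "0"
```

Without a tag to filter on, the override applies to every metric, so this only helps once such a tag exists.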

Hello,

The problem is that my syslog input (nginx_errors) has both raw nginx errors and Lua module errors. Lua module errors follow a specific grok pattern, and non-Lua errors don’t.

So if I add the tag at the input or in processors.override, it would be present on both, as processors.override will tag them without any condition.

The main goal is to add (or remove) a tag depending on whether the grok pattern matched.

Do you have a pair of samples to share?
Any processor can apply filters, meaning it will work only on a subset of “filtered” points… what kind of filter you can apply depends on your data structure

2023/12/15 10:00:00 [error] 586902#586902: *233904 connect() failed (111: Connection refused) while connecting to upstream, client: 1.2.3.4, server: _, request: "GET / HTTP/2.0", upstream: "http://127.0.0.1:7080/", host: "example.com", referrer: "https://example.com/"
2023/12/15 10:00:00 [error] 586899#586899: *234010 [lua] example.lua:1: Log text from my LUA script, client: 1.2.3.4, server: _, request: "GET / HTTP/2.0", host: "example.com"

The first one is from nginx directly, the second one is from the Lua module. Both are sent by the nginx error_log directive to a syslog server, which in this instance is the Telegraf syslog input.

So my goal is to tag them differently, to be able to send them to different outputs. Unfortunately, I can’t predict the format of nginx error logs (for example, “upstream:” may not appear if this is not an upstream error). But I have a grok pattern for all Lua errors.

processors.parser can parse them, but it can’t add additional tags. So if I set drop_original = true, it removes the nginx errors, and if I set drop_original = false, I don’t have a special tag to detect whether the grok pattern matched.

My grok:

grok_patterns = ['%{NGINXDATE:timestamp:ts-"2006/01/02 15:04:05"} \[%{DATA:loglevel}\] %{NUMBER:pid}#%{NUMBER:tid}\: \*%{NUMBER:rid} \[lua\] %{DATA:message}\, client\: %{IPORHOST:clientip}, server\: %{DATA:server_block}, request\: \"%{DATA:http_request}\", host\: \"%{DATA:vhost}\"']
grok_custom_patterns = 'NGINXDATE %{YEAR}[/-]%{MONTHNUM}[/-]%{MONTHDAY} %{TIME}'

Since you can only filter based on tags, you need to:

  1. convert (processors.converter) one of your parsed fields to a tag
    • processors.parser only creates fields, and metric filters (tagpass/tagdrop) can’t match on fields
    • processors.converter won’t complain if a field doesn’t exist, so it won’t generate errors for the series without that data
  2. use that tag as a filter (tagpass) for processors.override
    2a. add/override the static tag
  3. set up your outputs to filter the data based on that tag (tagpass)
    • or you can just set them up to filter on the tag obtained at step 1 (skipping step 2 entirely): if a tag such as the client address exists only in the Lua data, then it’s a valid filter for Lua data
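
For reference, the full three-step variant (including step 2) might look roughly like the sketch below. It assumes the field converted to a tag is clientip, which is the name used in the grok pattern posted above, and that processors.parser is given order = 1 so the processors run in the intended sequence. The inputs and the parser section stay as they are.

```toml
## Step 1: turn a parsed field into a tag (only Lua lines have "clientip")
[[processors.converter]]
  order = 2            # assumes processors.parser has order = 1
  [processors.converter.fields]
    tag = ["clientip"]

## Step 2: add the static tag only where the converted tag exists
[[processors.override]]
  order = 3
  [processors.override.tagpass]
    clientip = ["*"]
  [processors.override.tags]
    nginx_lua_errors = "1"

## Step 3: route on the static tag
[[outputs.file]]
  files = ["stdout"]
  [outputs.file.tagpass]
    nginx_lua_errors = ["1"]

[[outputs.file]]
  files = ["stdout"]
  [outputs.file.tagdrop]
    nginx_lua_errors = ["1"]
```

The extra override step is only worth it if you want a stable, explicit marker tag in storage rather than relying on the presence of a parsed value.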

The simplest way (the one with the fewest steps) is the following:

[[inputs.syslog]]
  server = "udp://:514"
  name_override = "nginx_errors"
  tagexclude = ["appname", "facility", "host", "severity", "source"]
  fielddrop = ["facility_code", "severity_code", "version"]

[[processors.parser]]
  parse_fields = ["message"]
  merge = "override-with-timestamp"
  drop_original = false
  data_format = "grok"
  grok_patterns = ............
  grok_custom_patterns = ............

[[processors.converter]]
  [processors.converter.fields]
    ## the grok pattern above stores the client address in the field "clientip"
    tag = ["clientip"]

## any point containing the tag "clientip" will pass here
[[outputs.file]]
  files = ["stdout"]
  [outputs.file.tagpass]
    clientip = ["*"]

## any point NOT containing the tag "clientip" will pass here (set up a different output type)
[[outputs.file]]
  files = ["stdout"]
  [outputs.file.tagdrop]
    clientip = ["*"]