Custom log parsing with the latest Tail plugin, GROK and InfluxDB (Grafana-ready)

Telegraf now uses inputs.tail instead of inputs.logparser.

This post shares examples of the new grok_custom_patterns parameter.

I mostly found old examples from 2017; I hope this catches the eye of people looking for examples in 2020. I struggled to get Telegraf inputs.tail working due to the changes in config style and a lack of clarity in the InfluxDB/Telegraf/Tail/Logparser/GROK docs.

The docs reference the following example, but without giving a full, working one. They also don't explain how grok_custom_patterns relates to grok_patterns.

Textbook example

files = ["/var/log/apache/access.log"]
from_beginning = false
grok_patterns = ["%{COMBINED_LOG_FORMAT}"]
name_override = "apache_access_log"
grok_custom_pattern_files = []
grok_custom_patterns = '''
'''
grok_timezone = "Canada/Eastern"
data_format = "grok"

Here is my explanation of the latest config format:

Example Log Line

RESULT,2020-06-25 15:34:34,UNXPRD01,PROD_INSTANCE01,Running


  ## file(s) to tail:
  files = ["c:\\magicparser\\customlogs\\host01.log"]
  from_beginning = false

  #name of the "Metric" (which I want to see in Grafana eventually)
  name_override = "magicparser"
  grok_patterns = ["%{CUSTOM_LOG}"]
  grok_custom_patterns = '''
CUSTOM_LOG %{MAGICDATE:date},%{WORD:log_entry_hostname:tag},%{WORD:log_entry_service:tag},%{WORD:log_entry_state}
'''
  data_format = "grok"


  1. "files" - should be clear: the file you want to monitor. Only one file per input plugin instance.
  2. "from_beginning" - only enable this if you are sure you want to process your whole log file. I typically only used it while testing.
  3. "name_override" - sets the metric name. For my use case, this is the name of the metric I will use in Grafana. InfluxDB has metrics (measurements), tags and fields: the metric is the primary name, tags are indexed for high performance, and fields are for frequently changing data.
  4. "grok_patterns" - the required parameter (according to my tests) that inputs.tail cannot work without. You define your main pattern here; in this case I indicate that I will use a CUSTOM_LOG pattern.
  5. "grok_custom_patterns" - this is the tricky part that the docs did not explain well. This parameter lets you use the triple-quote approach to define a series of custom pattern lines. It spells out the details of the pattern I listed above in "grok_patterns"; the two work together. What is nice about GROK patterns is that you can use one pattern inside another, breaking the problem into manageable pieces. See how I defined "MAGICDATE" as my own custom pattern and then used it inside my "CUSTOM_LOG" pattern.
  6. "data_format = "grok"" - tells the tail plugin that we are using the GROK data format.


  7. A note on data types - You will see that I didn't just leave GROK to decide data types for me. I overrode the string default by appending the "tag" keyword to my element definition. This forces InfluxDB to store the field as a tag, which means it will be indexed. In my use case it also means the field now shows up in the Grafana query builder as a WHERE parameter. Note that you should not simply make all fields tags; see this post about the performance impact of tags.
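To illustrate, here is roughly the InfluxDB line-protocol point that the example log line above should produce (hand-written for illustration, not captured from Telegraf): the two ":tag" captures land in the indexed tag set, while the untagged captures become fields.

```
magicparser,log_entry_hostname=UNXPRD01,log_entry_service=PROD_INSTANCE01 date="2020-06-25 15:34:34",log_entry_state="Running"
```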


  8. Use a timestamp to force InfluxDB to use your log's time - I didn't override the Telegraf timestamp with my date; I just store my date as a string field. If I had used a timestamp pattern, like the following, I could have told Telegraf to use my log entry's time as the timestamp instead of the time the log file was read:

%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"}
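For example, a variant of the config above could look like this (an untested sketch; note that I am guessing the literal "RESULT," prefix from the sample line and matching the date directly instead of via the MAGICDATE pattern):

```toml
grok_custom_patterns = '''
CUSTOM_LOG RESULT,%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"},%{WORD:log_entry_hostname:tag},%{WORD:log_entry_service:tag},%{WORD:log_entry_state}
'''
```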

  9. Use command line output for testing - Don't waste time trying to send data to InfluxDB immediately; first try your work locally against standard output using the file output plugin. Remember you can only have one output at a time (as far as I know), so comment out your existing output plugin before adding this:

files = ["stdout"]
data_format = "influx"
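Alternatively, Telegraf's --test flag gives you the same one-shot check without editing your outputs: it gathers once from the inputs, prints the resulting line protocol to stdout, and exits:

```shell
# gather once from all configured inputs, print metrics to stdout, then exit
telegraf --config telegraf.conf --test
```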

  10. Build and test one field at a time - Pattern matching is an incredibly tricky and painful exercise. It's recommended to start with just one parameter in a test log file and then add your other fields one by one. Trust me. I read that advice on another blog I can't find now, and it was a great tip.

For example, with the above log file, just start with a test.log file simply containing:


And get that to match, making sure everything is working, before tackling the rest.
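As a sketch of that incremental approach (my own illustration; these field names are hypothetical, not from my final config), you might begin by capturing only the first column and lumping the rest together:

```toml
grok_patterns = ["%{FIRST_FIELD}"]
grok_custom_patterns = '''
FIRST_FIELD %{WORD:record_type},%{GREEDYDATA:rest}
'''
data_format = "grok"
```

Once this matches, replace %{GREEDYDATA:rest} with the next real field and repeat until the whole line is parsed.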

Good luck with your pattern matching.


Hey @eclements,

I am using the following input config to transfer data from a log file to InfluxDB, but it is not working:


files = ["/var/log/testlog.log"]
from_beginning = true
grok_patterns =  ["%{ERROR_LOG}", "%{CUSTOM_LOG}"]
grok_custom_patterns = '''
ERROR_LOG  level=%{LOGLEVEL:severity:tag} msg="%{GREEDYDATA:value:tag}"
'''

data_format = "grok"

but if I remove the tag modifiers, it works and I can see the data in my InfluxDB, something like this:


files = ["/var/log/testlog.log"]
from_beginning = true
grok_patterns =  ["%{ERROR_LOG}", "%{CUSTOM_LOG}"]
grok_custom_patterns = '''
ERROR_LOG  level=%{LOGLEVEL:severity} msg="%{GREEDYDATA:value}"
'''

data_format = "grok"

Do you have any idea why this is happening?
I need those tags in my InfluxDB.

At least one value must be captured as a field.


@daniel would it be possible to store the original log content as well as the grok fields? Like in the above case, I'd like to store level and msg, and also store "value" (which is the original log content emitted by tail).

@prashanthjbabu You can, but I know about a bug you will probably run into. Let’s say you have:

grok_patterns =  ["%{A:a}"]
grok_custom_patterns = '''
    A %{NUMBER:x} %{NUMBER:y}
    '''

And a document:

1 2
3 4

The results would be:

> tail a="1 2",x="1",y="2"
> tail a="3 4",x="3",y="4"

The issue you might run into is that nested patterns with named captures at multiple levels like this don't save the child "modifiers": you won't be able to specify whether x or y are tags or integers unless you remove the a capture. You can work around this for now using the converter processor.
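A sketch of that converter workaround might look like this (assuming the x and y captures from the example above; see the processors.converter plugin docs for the full option list):

```toml
[[processors.converter]]
  [processors.converter.fields]
    # promote the string field "x" to an indexed tag
    tag = ["x"]
    # re-parse the string field "y" as an integer field
    integer = ["y"]
```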

@daniel Thanks for your reply… I also found out this method which seems to be working as well

    grok_patterns =  ["%{ERROR_LOG}"]
    grok_custom_patterns = '''
    ERROR_LOG %{SYSLOGTIMESTAMP:syslog_timestamp:drop} %{SYSLOGHOST:syslog_hostname:drop} %{DATA:syslog_program:string}(?:\[%{POSINT:syslog_pid:drop}\])?: level=%{LOGLEVEL:severity:tag} msg="%{GREEDYDATA:value:drop}"
    '''
    data_format = "grok"

The processors.parser plugin takes the "value" field generated by tail and runs grok on it, appending the extracted fields while retaining the original "value". This should work as well, right?
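For reference, a minimal sketch of that processors.parser setup (the grok pattern here is illustrative; adjust it to your log format):

```toml
[[processors.parser]]
  # run a second parse pass over the "value" field produced by inputs.tail
  parse_fields = ["value"]
  # merge the extracted fields back into the original metric
  merge = "override"
  data_format = "grok"
  grok_patterns = ["level=%{LOGLEVEL:severity} msg=\"%{GREEDYDATA:msg}\""]
```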


Hi @mrAbhishek, I found the same thing @daniel answered: if I made all my values tags, I would not see the data in InfluxDB. I had to make at least one capture not a tag; in other words, you must have at least one field value.
