Inputs Tail (Grok) - Would This Configuration Work?

Hello,

Just looking for some advice if that’s OK. I haven’t used this plugin before and I’m unsure about grok patterns.

Goal

Parse the logfile below to plot data usage per user on OpenVPN.

March 05 14:58:51, user1, 10.10.10.10, 192.192.192.192, 22000, 20682, 182694
March 05 14:59:51, user2, 10.10.10.11, 193.193.193.193, 45000, 40682, 882694

Plugin

[[inputs.tail]]
  files = ["/var/log/ovpnlog.log"]
  from_beginning = false
  grok_patterns = ["%{CUSTOM_LOG}"]
  grok_custom_patterns = '''
CUSTOM_LOG %{SYSLOGTIMESTAMP:timestamp:ts-syslog},%{IPORHOST:vpnip:tag},%{IPORHOST:source:tag},%{NUMBER:duration:tag},%{NUMBER:received:tag},%{NUMBER:sent:tag}
'''
  data_format = "grok"

I am also unsure of how it runs. If, say, two users write to the log at once, producing two new lines, how does it react? Does it only read the final line of the log per collection interval?

Thank you.

Tested, doesn’t seem to work unfortunately.

I also tried this to no success:

[[inputs.logparser]]
  files = ["/var/log/ovpnlog.log"]
  interval = "60s"
  from_beginning = true
  [inputs.logparser.grok]
    measurement = "ovpn_log"
    patterns = ["^%{SYSLOGTIMESTAMP:timestamp:ts-syslog},%{IPORHOST:vpnip:tag},%{IPORHOST:source:tag},%{NUMBER:duration:tag},%{NUMBER:received:tag},%{NUMBER:sent:tag}"]
    timezone = "Local"

Yes, you can do that with grok, and maybe even with csv or another input format.
The grok pattern is just a bit tricky.
In your case I think the pattern doesn’t quite fit: the line has 7 entries, but you parse only 6. The user is missing?

Thanks Franky.

I did find a CSV example but even that doesn’t seem to send any metrics (or run at all).

You’re right, I totally missed the user! It still doesn’t work, but it never would have worked without that fix, so thanks for pointing it out.

Does incorrect grok syntax prevent the plugin from running completely?

OK, this seems to work, albeit with a “malformed log” error due to the timestamp. Now I just have to figure out how to specify what time format it is.

[[inputs.tail]]
  files = ["/home/ubuntu/ovpnlog.log"]
  from_beginning = true
  pipe = false
  watch_method = "inotify"
  data_format = "csv"
  csv_delimiter = ","
  csv_column_names = ["time","user","vpn-ip","source-ip","duration","received","sent"]
  csv_header_row_count = 0
  csv_skip_rows = 0
  csv_timestamp_column = "time"
  name_override = "ovpnlog"

Yes, parsing the timestamp is always the biggest headache… :unamused:

I have fiddled a bit with the grok format and I think I have found a solution that works.

A snippet with your data as input file:

March 05 14:58:51, user1, 10.10.10.10, 192.192.192.192, 22000, 20682, 182694
March 05 14:59:51, user2, 10.10.10.11, 193.193.193.193, 45000, 40682, 882694

The configuration snippet of Telegraf:

[[inputs.file]]
  files = ["openvpn.log"]
  data_format = "grok"
  grok_patterns = ['%{SYSLOGTIMESTAMP:timestamp:ts-"January 02 15:04:05"}, %{DATA:user:tag}, %{IPORHOST:vpnip:string}, %{IPORHOST:source:string}, %{INT:duration:int}, %{INT:received:int}, %{INT:sent:int}']
  grok_timezone = "Local"
  name_override = "openvpn"

[[outputs.file]] # only for debugging
  files = ["openvpn.out"]
  influx_sort_fields = true

The snippet with the output lines of Telegraf in influx line protocol format:

openvpn,user=user1 duration=22000i,received=20682i,sent=182694i,source="192.192.192.192",vpnip="10.10.10.10" 1614952731000000000
openvpn,user=user2 duration=45000i,received=40682i,sent=882694i,source="193.193.193.193",vpnip="10.10.10.11" 1614952791000000000

I would say that looks good :wink:

I tried out with CSV again and got this working:

[[inputs.tail]]
  files = ["/home/ubuntu/ovpnlog.log"]
  from_beginning = true
  pipe = false
  watch_method = "inotify"
  data_format = "csv"
  csv_delimiter = ","
  csv_column_names = ["time","user","vpn-ip","source-ip","duration","received","sent"]
  csv_header_row_count = 0
  csv_skip_rows = 0
  csv_timestamp_column = "time"
  name_override = "ovpnlog"
  csv_timestamp_format = "2006-01-02 15:04:05"

I did have to change the formatting on my logfile creator, but the data did come through. I think I need some of the data to be “tags” in order to use it properly; I need it to behave like “select user from ovpnlog and display usage over time”, if that makes sense.

Many thanks for your grok version, it’s preferable! I’ll give it a try now.

If you can influence how the log is written, that is often the easier option, especially to get a standard timestamp.

If you have changed the log format in the meantime, you must of course adjust the grok pattern again, otherwise it will not work.

Yeah, I can influence how the logfile is written. The new pattern is as in that CSV config above. I’d probably rather use the grok version you created; would that be as simple as doing:

'%{SYSLOGTIMESTAMP:timestamp:ts-"2006-01-02 15:04:05"}

Um, not sure. Try it out; maybe it works with this instead:

'%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"}

I tested it with the “March 05” version; it plots just fine on the graph. My original attempt was based on my DNSBL block log, which I already feed into Grafana (I didn’t write that logparser config, hence my struggling).

I imagine both versions should work fine. Thank you for your help, time to test it for real!

Maybe you’re right after all re: the date/time: it works perfectly on my test bench, but in production Grafana isn’t plotting the data by time (or at all).

Odd, there’s no difference between the test bench and the real one.

EDIT: The date format for the logs I was generating used “h” (12-hour clock). After modifying it to “H” (24-hour clock), I’m getting the correct plots.

Thank you very much for your help Franky!