LogParser does not work after logrotate

telegraf
#1

I just started testing the LogParser input for a couple of HAProxy servers. I know that I could use the HAProxy input instead, but I wanted to compare the two inputs. I have LogParser set up for an edge HAProxy instance (EXT) and for the associated internal HAProxy (INT).

The problem is that when the log rotates after reaching its maximum size, I no longer get output from LogParser. The EXT HAProxy rotates logs every day and continues to emit log data after each rotation, but the INT HAProxy always stops emitting data after each rotation until I manually reload telegraf (service telegraf reload). I even tried adding that reload to the haproxy logrotate postrotate script, but that seems to fail as well. There are no differences in the telegraf configuration between the two servers, and the same pattern works for both log files.

I’m running a somewhat older version of Ubuntu (Ubuntu 12.04.5 LTS) with the latest release of telegraf (1.2.1). Below is the latest from the telegraf.log (INT) …

2017/04/24 10:43:57 Seeked /var/log/haproxy.log - &{Offset:0 Whence:2}
2017/04/24 11:17:01 Re-opening moved/deleted file /var/log/haproxy.log ...
2017/04/24 11:17:01 Successfully reopened /var/log/haproxy.log
2017/04/24 11:17:01 Re-opening moved/deleted file /var/log/haproxy.log ...
2017/04/24 11:17:01 Successfully reopened /var/log/haproxy.log

I manually reloaded telegraf at 10:43:57, and haproxy.log rotated at 11:17:01. There have been no stats since the rotation, and I do not see a “Seeked” line after it.
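For readers unfamiliar with what "Re-opening moved/deleted file" involves, here is a minimal Python sketch of one common way a tailer can notice a rotation: compare the inode of the file it has open with the inode the path currently points to, and reopen when they differ. This is only an illustration under that assumption, not Telegraf's actual implementation, and `follow_once` is a hypothetical helper name. The "Offset:0 Whence:2" in the "Seeked" line corresponds to seeking to the end of the file, which is what `from_beginning = false` asks for.

```python
import os


def follow_once(handle, path):
    """Read any new lines, reopening `path` if it was rotated.

    Returns the (possibly new) file handle and the list of lines read.
    """
    # Drain whatever is left in the file we already have open.
    lines = handle.readlines()

    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        # Rotated away and not recreated yet; the caller should retry later.
        return handle, lines

    opened = os.fstat(handle.fileno())
    if (on_disk.st_dev, on_disk.st_ino) != (opened.st_dev, opened.st_ino):
        # The path now names a different file: reopen it and read it
        # from offset 0, since everything in the new file is unseen.
        handle.close()
        handle = open(path, "r")
        lines += handle.readlines()

    return handle, lines
```

A caller would invoke `follow_once` in a polling loop. If the reopen happens but the new file is then read from the wrong offset or never read again, the symptom would look like the one above: a "Successfully reopened" message followed by silence.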

The telegraf log on EXT looks the same …

2017/04/24 06:25:02 Re-opening moved/deleted file /var/log/haproxy.log ...
2017/04/24 06:25:02 Successfully reopened /var/log/haproxy.log

Notice there is no “Seeked” line there either, but telegraf continues to emit data for EXT.

I see no differences in permissions for haproxy.log. Both are owned by syslog:adm and are mode 640. The Telegraf user has been added to the adm group. The only difference I see is that INT rotates because it reaches 25M, whereas EXT rotates every morning on schedule. Both INT and EXT have identical logrotate configurations.

/var/log/haproxy.log {
	rotate 2
	size 25M
	missingok
	notifempty
	nocompress
	create 640 syslog adm
	sharedscripts
	postrotate
		reload rsyslog >/dev/null 2>&1 || true
	endscript
}

Both INT and EXT have identical Telegraf configurations for logparser …

[[inputs.logparser]]
  ## file(s) to tail:
  files = ["/var/log/haproxy.log"]
  from_beginning = false
  name_override = "haproxy_log"
  ## For parsing logstash-style "grok" patterns:
  [inputs.logparser.grok]
    patterns = ["%{HAPROXYHTTP}"]
    custom_patterns = '''
        HAPROXYTIME %{HOUR}:%{MINUTE}:%{SECOND}
        HAPROXYDATE %{MONTH:haproxy_month} %{MONTHDAY:haproxy_day} %{HAPROXYTIME:haproxy_time}
        HAPROXYHTTP %{HAPROXYDATE} %{IPORHOST:syslog_server} %{SYSLOGPROG}: %{IP:client_ip}( {[[:graph:]]*})? %{WORD:method} %{URIPATHPARAM:http_request} HTTP/%{NUMBER} %{NUMBER:status:int} %{NUMBER:request_bytes:int} %{INT}
'''

I wasn’t sure about the from_beginning parameter. I was worried that setting from_beginning = true would risk duplicate data, so both servers have it set to false.

The haproxy.log files on both servers use a very similar custom format.

Example from INT …

Apr 24 19:09:27 localhost haproxy[22412]: 10.10.2.124 POST /ws/rest/trip/search HTTP/1.1 201 512 21

Example from EXT …

Apr 24 19:11:05 localhost haproxy[18489]: 52.35.167.170 {AHC/2.0} GET /ws/rest/trip/search/84ca38dcd6274f9cb0151c1d2c1378fa HTTP/1.1 200 6229 1157
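As a quick sanity check that one pattern can cover both sample lines, including the optional `{AHC/2.0}` block that only appears on EXT, here is a rough Python translation of the HAPROXYHTTP pattern. It is a simplified approximation and not the grok definitions themselves: the sub-expressions standing in for IPORHOST, SYSLOGPROG, IP and URIPATHPARAM are looser than the real ones, and `HAPROXY_LINE` and `parse_line` are hypothetical names.

```python
import re

# Simplified stand-in for the HAPROXYHTTP grok pattern.
HAPROXY_LINE = re.compile(
    r"(?P<haproxy_month>[A-Z][a-z]{2}) +(?P<haproxy_day>\d{1,2}) "
    r"(?P<haproxy_time>\d{2}:\d{2}:\d{2}) "
    r"(?P<syslog_server>\S+) "
    r"(?P<program>[\w./-]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3})"
    r"(?: \{\S*\})? "              # optional captured header, e.g. {AHC/2.0}
    r"(?P<method>\w+) "
    r"(?P<http_request>\S+) "
    r"HTTP/\d+(?:\.\d+)? "
    r"(?P<status>\d+) (?P<request_bytes>\d+) \d+$"
)


def parse_line(line):
    """Return a dict of fields for a matching line, or None if it does not match."""
    match = HAPROXY_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Mirror the :int casts used in the grok pattern.
    fields["status"] = int(fields["status"])
    fields["request_bytes"] = int(fields["request_bytes"])
    return fields


INT_LINE = ("Apr 24 19:09:27 localhost haproxy[22412]: 10.10.2.124 "
            "POST /ws/rest/trip/search HTTP/1.1 201 512 21")
EXT_LINE = ("Apr 24 19:11:05 localhost haproxy[18489]: 52.35.167.170 {AHC/2.0} "
            "GET /ws/rest/trip/search/84ca38dcd6274f9cb0151c1d2c1378fa "
            "HTTP/1.1 200 6229 1157")
```

Both `parse_line(INT_LINE)` and `parse_line(EXT_LINE)` return a field dict, which is consistent with the pattern itself not being the reason INT goes quiet after a rotation.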

I can’t seem to figure this out, so I’m thinking I may go back to using the haproxy input but strip it down, since I don’t need all the data it produces. But I do like being able to drill down in Grafana to the specific log entry produced by LogParser (using a table to display it).

Anyone have any ideas as to what I may have missed?

#2

@Kipland_Iles I’ve had this issue before. Setting from_beginning to true will not create duplicate points, because the logparser uses the timestamps from the logs. In the past I have set it to true and worked around the rotation problem with a little cron hack.
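The reasoning behind "no duplicate points" can be sketched as follows. InfluxDB identifies a point by its measurement, tag set and timestamp, so writing the same point again overwrites it instead of adding a second one. The toy model below only illustrates that rule and is not InfluxDB or Telegraf code; `write_point` and the tag and field values are made up. It also assumes the timestamp really is taken from the log line. If a point is stamped with the time it was parsed, re-reading the file would produce new timestamps and therefore new points.

```python
def write_point(store, measurement, tags, timestamp, fields):
    """Toy model of a write: same measurement + tags + timestamp means same point."""
    key = (measurement, tuple(sorted(tags.items())), timestamp)
    # Re-writing an existing key updates its fields; no new point is created.
    store.setdefault(key, {}).update(fields)


store = {}
tags = {"host": "int"}

# The same log line read twice, with the timestamp taken from the line itself.
write_point(store, "haproxy_log", tags, 1493060967, {"status": 201})
write_point(store, "haproxy_log", tags, 1493060967, {"status": 201})

# A later log line has a different timestamp and becomes a separate point.
write_point(store, "haproxy_log", tags, 1493061065, {"status": 200})

point_count = len(store)  # two points, not three
```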

#3

The good news is that we merged a fix for this; it will be in 1.3.
