[[inputs.logparser]] no output in --test mode and no metrics

influxdb
telegraf
#1

Hi,

The internet and GitHub are full of examples and issues on this topic.
Could someone kindly guide me toward the best approach?
I've tried a lot of approaches, and since the telegraf --test mode doesn't work for this case, I'm at a dead end with this topic.

–Logs–
/var/log/nginx/access.log

127.0.0.1 - - [06/Apr/2018:15:03:15 +0000] "GET /nginx_status HTTP/1.1" 200 112 "-" "Go-http-client/1.1" "-"
127.0.0.1 - - [06/Apr/2018:15:03:17 +0000] "POST /elasticsearch/_msearch HTTP/1.1" 200 172 "https://127.0.0.1/app/kibana" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0" "-"
127.0.0.1 - - [06/Apr/2018:15:03:24 +0000] "POST /elasticsearch/_msearch HTTP/1.1" 200 172 "https://127.0.0.1/app/kibana" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0" "-"
127.0.0.1 - - [06/Apr/2018:15:03:30 +0000] "GET /nginx_status HTTP/1.1" 200 112 "-" "Go-http-client/1.1" "-"
127.0.0.1 - - [06/Apr/2018:15:03:45 +0000] "GET /nginx_status HTTP/1.1" 200 112 "-" "Go-http-client/1.1" "-"
127.0.0.1 - - [06/Apr/2018:15:04:00 +0000] "GET /nginx_status HTTP/1.1" 200 112 "-" "Go-http-client/1.1" "-"

–Telegraf–
telegraf-1.5.3-1.x86_64

[[inputs.logparser]]
files = ["/var/log/nginx/access.log"]
from_beginning = true
name_override = "nginx_access_log"
[inputs.logparser.grok]
patterns = ["%{CUSTOM_LOG}"]
custom_patterns = '''
CUSTOM_LOG %{COMBINED_LOG_FORMAT} %{NUMBER:response_time_us:float} %{NUMBER:request_time:float}
'''

–InfluxDB–
influxdb-1.5.0-1.x86_64

Kind Regards,

#2

What metrics do you want to extract from the logs or see in Grafana (number of response codes, bytes received, ...)? I did a lot of work on this topic in the last month, and I found many obstacles for a busy site, so it is not easy to say exactly what you have to do.
What is your actual problem now, and which metrics are you missing? What metrics does the logparser extract, and what do you already have in your DB?

#3

Hi Bolek,

Thank you for the support.
My main focus is on the KPIs below:
Response codes (e.g. HTTP 1xx, 2xx, 3xx, 4xx, 5xx, etc.);
TCP, UDP, dropped packets, errors and traffic are already monitored.
I don't know if I could extract other relevant information from the open source version of NGINX.
I've also added a screenshot of what I already have in the dashboard regarding NGINX.

There are many examples using non_negative_derivative.


***Old post: https://community.influxdata.com/t/collecting-metrics-with-inputs-logparser-no-output-on-test/4560

  1. Regarding logparser, nothing for the moment: I've tried many implementations of [[inputs.logparser]], but there is no output in --test mode and nothing in the InfluxDB database. I'm trying to understand better what I could do to get those metrics into InfluxDB.
  2. What I want to have is related to response codes, if possible.

I started from this telegraf.conf example from the internet:

[[inputs.nginx]]
urls = ["http://localhost/nginx_status"]
[[inputs.logparser]]
files = ["/var/log/nginx/access.log"]
from_beginning = true
name_override = "nginx_access_log"
[inputs.logparser.grok]
patterns = ["%{COMBINED_LOG_FORMAT}"]

and went through many possible variants, but there is no output yet.

Kind Regards,

#4

In fact you can extract any information from the access log, but then you need to write your own pattern matching for your log_format. We have added fields to the nginx log_format, so my patterns will not solve all your problems, but they are a start. I had a steep learning curve with the Grok stuff. You can also add the response code as a value or as a tag, depending on what you want to achieve. I use the response code as a tag and have values for response_time etc.
So first you need to check whether your nginx log_format has all the information you need,
e.g.:
log_format compression '$remote_addr - $remote_user [$time_local] '
'"$request" $status $bytes_sent '
'"$http_referer" "$http_user_agent" +$request_time $upstream_response_time $pipe+ "$gzip_ratio" '
'"$host~$is_mobile $is_bot $sent_http_x_cache"';

A (not complete) parser pattern can look like this for my use case:
%{CLIENT:client_ip:drop} %{NOTSPACE:ident:drop} %{NOTSPACE:auth:drop} \[%{HTTPDATE:ts:ts-httpd}\] "%{WORD:http_method:drop} %{PATHLEVEL1:pathlevel1:tag}(/|\?)?.* HTTP/%{NUMBER:http_version:drop}" %{RESPONSE_CODE} (?:%{NUMBER:resp_bytes:int}|-) "%{DATA:referer:drop}" "%{DATA:user_agent:drop}" +(?:%{NUMBER:request_time:float}|-)

Then you need to write a regex pattern to extract the relevant information, which can be a bit tiresome because you need to test it thoroughly.
Here are some links to pages that helped me:
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/logparser/grok/patterns/influx-patterns


https://regexr.com/
https://www.influxdata.com/blog/telegraf-correlate-log-metrics-data-performance-bottlenecks/
For testing I use the telegraf file output (and disable the InfluxDB output) to see what gets sent by telegraf. Then you have to test pattern by pattern from the beginning to see if it matches correctly. In the debug output of telegraf.log you can see “Grok No Match Lines”, but it gives no specific error message, so you need to find out yourself what the problem with the pattern is... Regex hell ;)

So I extract for example domain names, parts of URLs and so on, depending on what information I need from the logs.

In Grafana I created tables with a query like this:

SELECT count("resp_bytes") FROM "proxy_access_log" WHERE ("domain" =~ /^domain/ AND "cache_status" =~ /^cache_status/ AND "mobile" =~ /^mobile/ AND "bot" =~ /^bot/) AND $timeFilter GROUP BY "response_code"

I found that the basicstats aggregator could help with some basic counts and aggregations, but not on response codes.
Another way would be to use response_code as a value and a value_counter plugin like in this pull request:


So far I haven't managed to test this aggregator, because I went the way of counting the resp_bytes field and grouping it by response_code to get the desired information.
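
To make the count-and-group idea concrete outside of InfluxDB, here is a small Python sketch that counts already parsed log records per response code class (2xx, 4xx, and so on). The record shape (a dict with a `response_code` key) and the function names are assumptions made for illustration; this is not how Telegraf or the query above is implemented.

```python
from collections import Counter


def status_class(code):
    """Map a status code such as 404 (or "404") to a class label such as "4xx"."""
    return f"{int(code) // 100}xx"


def count_by_class(records):
    """Count records per response code class.

    `records` is assumed to be an iterable of dicts that each carry a
    "response_code" key, e.g. the fields produced by some log parser.
    """
    return Counter(status_class(r["response_code"]) for r in records)


if __name__ == "__main__":
    parsed = [
        {"response_code": 200, "resp_bytes": 112},
        {"response_code": 200, "resp_bytes": 172},
        {"response_code": "404", "resp_bytes": 0},
        {"response_code": 502, "resp_bytes": 0},
    ]
    for label, count in sorted(count_by_class(parsed).items()):
        print(label, count)
```

Grouping by class rather than by exact code keeps the number of series small, which is one reason to store the response code as a tag and derive the class at query or dashboard time.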

#5

Many thanks for this approach and support, one little question regarding nginx.conf.

$ cat /etc/nginx/nginx.conf
access_log /var/log/nginx/access.log main;

I've noticed that in some cases there is: “access_log /var/log/nginx/access.log combined;”. Is this also important for logparser?

Something very useful: https://github.com/lebinh/ngxtop

#6

Thanks for the link, I didn't know it yet. This is also my first deeper dive into nginx.

I think the type of log is important as “combined” is a predefined log “style”.

The configuration always includes the predefined “combined” format:

log_format combined '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';

from:
http://nginx.org/en/docs/http/ngx_http_log_module.html

So if your format is not “combined”, your log fields, the type of timestamp or the like may look different, so you need to check whether the predefined Grok patterns really match your log style.
Read the bottom lines of the telegraf Grok document and you will see that COMBINED_LOG_FORMAT and COMMON_LOG_FORMAT differ:

You or other sysops might also have configured other custom log_formats. I am not too experienced here, but as you can see in the example above, my log_format is called “compression”, not “combined”, and is referenced like this:
access_log /var/log/nginx/wordpress.access.log compression

So you need to find the “log_format main” in your configs, e.g.:
grep -R "log_format main" /etc/nginx/*
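
As an alternative to grepping by hand, the following Python sketch scans nginx configuration text and reports which log_format name each access_log directive refers to. It is a simplification under stated assumptions: it uses line-based regexes rather than a real nginx parser, it ignores include directives, and the function name `formats_in_use` is made up for this example. When an access_log directive names no format, the sketch reports "combined", since nginx falls back to that predefined format.

```python
import re

# Line-based approximations; a real nginx config parser would be more robust.
LOG_FORMAT_RE = re.compile(r"^[ \t]*log_format[ \t]+(\S+)", re.MULTILINE)
ACCESS_LOG_RE = re.compile(
    r"^[ \t]*access_log[ \t]+([^\s;]+)(?:[ \t]+([^\s;]+))?", re.MULTILINE
)


def formats_in_use(config_text):
    """Return (defined, used): the set of log_format names defined in the
    text, and a dict mapping each access_log path to its format name."""
    defined = set(LOG_FORMAT_RE.findall(config_text))
    used = {}
    for path, fmt in ACCESS_LOG_RE.findall(config_text):
        # No explicit format means nginx uses the predefined "combined" one.
        used[path] = fmt or "combined"
    return defined, used


if __name__ == "__main__":
    example = """
http {
    log_format  main  '$remote_addr - $remote_user [$time_local]';
    access_log  /var/log/nginx/access.log  main;
    access_log  /var/log/nginx/other.log;
}
"""
    defined, used = formats_in_use(example)
    print(sorted(defined))
    for path, fmt in used.items():
        print(path, "->", fmt)
```

Note that a directive such as `access_log off;` would show up here with the path "off", so the output still needs a quick manual check.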

#7

Yes, “ngxtop” is a good tool. Unfortunately, open source NGINX has no status page and not many metrics.

$ cat /etc/nginx/nginx.conf

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;
#8

I’ve managed somehow.