The internet / GitHub is full of examples and issues.
Could someone kindly guide me to the best approach?
I've tried a lot of approaches, and since telegraf --test mode doesn't work for this case, I'm at a dead end with this topic.
What metrics do you want to extract from the logs or see in Grafana (number of response codes, bytes received, ...)? I did a lot of work on this topic in the last month, but I found many obstacles for a busy site, so it is not easy to say what exactly you have to do...
What is your actual problem now, and what metrics are you missing? What metrics does the logparser extract, and what do you already have in your DB?
Thank you for the support.
My main focus is on the KPIs below: response codes (e.g. HTTP 1xx, 2xx, 3xx, 4xx, 5xx, etc.).
TCP, UDP, dropped packets, errors and traffic are already monitored.
I don't know if I can extract other relevant information from the NGINX open source version.
I've also attached a screenshot of what I already have in the dashboard regarding NGINX.
There are many examples using non_negative_derivative.
Regarding the logparser, nothing so far, because I've tried many implementations of [[inputs.logparser]] with no output in --test mode and nothing in the InfluxDB database, and I'm trying to understand better what I could do to get those metrics into InfluxDB.
What I want to have is related to response codes, if possible.
I started from this telegraf.conf example from the internet:
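(The example itself isn't reproduced here, but for illustration a minimal [[inputs.logparser]] block for the stock "combined" access log format usually looks something like the sketch below; the file path and measurement name are my assumptions, so adjust them to your setup.)

[[inputs.logparser]]
  ## nginx access log(s) to tail
  files = ["/var/log/nginx/access.log"]
  from_beginning = false
  [inputs.logparser.grok]
    ## predefined Telegraf pattern for the default "combined" format
    patterns = ["%{COMBINED_LOG_FORMAT}"]
    measurement = "nginx_access_log"

If even this produces nothing, the format configured in nginx is probably not the plain "combined" one.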
In fact you can extract every piece of information from the access log, but then you need to write your own pattern matching for your log_format. We have added fields to the nginx log_format, so my patterns will not solve all your problems, but they are a start. I had a steep learning curve with the Grok stuff. Also, you can add the response code as a value or as a tag, depending on what you want to achieve. I use the response code as a tag and have values for response_time etc.
So first you need to check if your nginx log_format has all the information you need:
e.g.:
log_format compression '$remote_addr - $remote_user [$time_local] '
                       '"$request" $status $bytes_sent '
                       '"$http_referer" "$http_user_agent" +$request_time $upstream_response_time $pipe+ "$gzip_ratio" '
                       '"$host~$is_mobile $is_bot $sent_http_x_cache"';
An (incomplete) parser pattern for my use case can look like this:
%{CLIENT:client_ip:drop} %{NOTSPACE:ident:drop} %{NOTSPACE:auth:drop} \[%{HTTPDATE:ts:ts-httpd}\] "%{WORD:http_method:drop} %{PATHLEVEL1:pathlevel1:tag}(/|\?)?.* HTTP/%{NUMBER:http_version:drop}" %{RESPONSE_CODE} (?:%{NUMBER:resp_bytes:int}|-) "%{DATA:referer:drop}" "%{DATA:user_agent:drop}" \+(?:%{NUMBER:request_time:float}|-)
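(For context on where such a pattern lives in telegraf.conf: it goes into the grok section of the logparser input, and any sub-pattern Telegraf does not ship, like PATHLEVEL1 above, has to be declared in custom_patterns. CLIENT, HTTPDATE and RESPONSE_CODE are already predefined by Telegraf's grok parser. The sketch below is only an assumption about how that might look; the PATHLEVEL1 regex in particular depends on which part of the URL you want as a tag.)

  [inputs.logparser.grok]
    measurement = "proxy_access_log"
    ## paste the full pattern from above here (shortened in this sketch)
    patterns = ['%{CLIENT:client_ip:drop} ... %{RESPONSE_CODE} ...']
    ## sub-patterns not shipped with Telegraf; this PATHLEVEL1 regex is an assumption
    custom_patterns = '''
    PATHLEVEL1 /[^/?\s]*
    '''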
For testing I use the Telegraf file output (and disable the InfluxDB output) to see what gets sent by Telegraf. Then you have to test pattern by pattern from the beginning to see if it matches correctly. In the debug output of telegraf.log you can see the "Grok no match" lines, but it gives no specific error message, so you need to find out yourself what the problem with the pattern is... Regex hell.
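(For what it's worth, the debug setup I mean is nothing more than a block like the one below; set debug = true in the [agent] section as well if you want to see the grok "no match" lines in telegraf.log. The output file path is just an example.)

## temporary debug output; disable [[outputs.influxdb]] while testing
[[outputs.file]]
  files = ["stdout", "/tmp/telegraf-test.out"]
  data_format = "influx"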
So I extract for example domain names, parts of URLs and so on, depending on what information I need from the logs.
In Grafana I created tables with a query like this:
SELECT count("resp_bytes") FROM "proxy_access_log" WHERE ("domain" =~ /^$domain$/ AND "cache_status" =~ /^$cache_status$/ AND "mobile" =~ /^$mobile$/ AND "bot" =~ /^$bot$/) AND $timeFilter GROUP BY "response_code"
I found that the basicstats aggregator could help in making some basic counts and aggregations for you, but not on response codes.
Another way would be to use response_code as a value together with a value_counter plugin, like in this pull request:
So far I haven't managed to test this aggregator, because I went the way of counting the resp_bytes field and grouping it by response_code to get the desired information.
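(If that aggregator ever lands in your Telegraf build, my understanding is that it would be configured roughly like the sketch below. Note that it can only count response_code if it is stored as a field/value, not as a tag, and the plugin and option names here follow the pull request as I read it, so treat them as assumptions.)

[[aggregators.valuecounter]]
  ## emit one count per distinct field value every period
  period = "30s"
  drop_original = false
  ## field whose distinct values are counted (must be a field, not a tag)
  fields = ["response_code"]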
So if your format is not "combined", your log fields, the type of timestamp and the like may look different, so you need to check whether the predefined Grok patterns really match your log style.
Read the bottom lines of the Telegraf Grok document and you will see that COMBINED_LOG_FORMAT and COMMON_LOG_FORMAT differ:
Also, you or other sysops might have configured other custom log_formats; I am not too experienced here, but as you can see in the example above my log_format is called "compression", not "combined", and is referenced like this:
access_log /var/log/nginx/wordpress.access.log compression;
So you need to find the "log_format main" in your configs, e.g.:
grep -R "log_format main" /etc/nginx/*
@fchiorascu
I see that this issue is really old, however I am stuck in exactly the same position... Your last post is also about a totally different aspect of monitoring: it only ships the nginx engine status, i.e. how many connections are established, handled and waiting; nothing related to log parsing, which is a totally different realm.
I wonder if anyone out there has had the same issue as me and has successfully log-parsed a specific part of the data for monitoring, in my case the response code, for a simple graph to watch the values and set alerts based on them...