Telegraf: count field values (processing http response codes)

piotr1212 · November 27, 2017, 3:33pm

I’m using Telegraf’s logparser input plugin to process an HTTP access log and want to count/aggregate the occurrences of response codes.

So basically my input is a file like this:
/somepath 200
/somepath 302
/somepath 200

I want as output:
http_accesslog, response_200=2
http_accesslog, response_302=1

I don’t see an obvious way to accomplish this with Telegraf. In Logstash I would use the metrics filter and pass a variable to the metric name, basically like this: How can I plot Apache HTTPd status counts in Graphite without Statsd using Logstash? - Server Fault

I’ve checked for this functionality in the logparser input, basicstats aggregator, histogram aggregator but don’t see it. Is this currently not possible or am I missing something?

I’m OK with writing some code if this is not currently a feature, but I have no clue whether it should go into the logparser input, an aggregator, or a processor plugin.
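
For illustration, the counting described above could be sketched in Go (the language Telegraf is written in) roughly as follows. This is a standalone sketch that does not use any Telegraf plugin API; the function name countResponseCodes and the "response_" key prefix are hypothetical, chosen only to mirror the desired output.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// countResponseCodes tallies the last whitespace-separated token of each
// access-log line (the response code in the sample input above).
// Hypothetical helper: the name and the "response_" prefix are illustrative.
func countResponseCodes(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // skip blank or malformed lines
		}
		counts["response_"+fields[len(fields)-1]]++
	}
	return counts
}

func main() {
	counts := countResponseCodes([]string{
		"/somepath 200",
		"/somepath 302",
		"/somepath 200",
	})
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort for stable output
	for _, k := range keys {
		fmt.Printf("http_accesslog %s=%d\n", k, counts[k])
	}
}
```

No buckets need to be predefined here, because each distinct value becomes its own counter.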

Rik_Wasmus · November 27, 2017, 5:19pm

I would normally prefer a method like GitHub - jib/mod_statsd: Apache module to send statistics to Statsd, which can forward it as statsd metrics without any heavy-handed log parsing. If you prefer using logs, there’s a specific Apache example in the git documentation. Have you tried that one, and if you did, what problems are you having?

daniel · November 27, 2017, 7:00pm

If you send along the raw data you could calculate this at query time, but I think you need one query per group:

select count(status_code) from http_accesslog where time > now() - 1m and status_code >= 200 and status_code < 300 group by time(10s)
select count(status_code) from http_accesslog where time > now() - 1m and status_code >= 300 and status_code < 400 group by time(10s)
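
As a rough picture of what these two queries compute, the sketch below groups raw samples by 10-second window and by status class in Go. It is only an illustration of the grouping, not how InfluxDB executes the query; the sample and key types and the countPerWindow function are hypothetical, and timestamps are assumed to be non-negative Unix seconds.

```go
package main

import "fmt"

// sample is one raw access-log event (hypothetical type for this sketch).
type sample struct {
	unixSec int64 // event time in Unix seconds
	status  int   // HTTP status code
}

// key identifies one (time window, status class) group.
type key struct {
	windowStart int64
	class       string
}

// countPerWindow counts samples per window of windowSec seconds and per
// status class ("2xx", "3xx", ...), similar in spirit to
// count(status_code) ... group by time(10s) with one range filter per class.
func countPerWindow(samples []sample, windowSec int64) map[key]int {
	out := make(map[key]int)
	for _, s := range samples {
		k := key{
			windowStart: s.unixSec - s.unixSec%windowSec,
			class:       fmt.Sprintf("%dxx", s.status/100),
		}
		out[k]++
	}
	return out
}

func main() {
	counts := countPerWindow([]sample{
		{100, 200}, {105, 204}, {109, 302}, {110, 200},
	}, 10)
	fmt.Println(counts[key{100, "2xx"}], counts[key{100, "3xx"}], counts[key{110, "2xx"}])
}
```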

The mod_statsd idea sounds good as well: you would get the standard statsd aggregations in Telegraf, and I imagine it would be much more performant.

That said, you should be able to do this with the histogram aggregator: define a bucket for 2xx, 3xx, 4xx, 5xx and select only the status_code field.

[[aggregators.histogram]]
  period = "30s"
  drop_original = false

  [[aggregators.histogram.config]]
    buckets = [200.0, 300.0, 400.0, 500.0]
    measurement_name = "http_accesslog"
    fields = ["status_code"]

This should produce measurements like so:

http_accesslog,le=200.0 status_code_bucket=2i
http_accesslog,le=300.0 status_code_bucket=1i

When you write your query, use the non_negative_derivative function to view the change in value.

piotr1212 · November 28, 2017, 9:16am

Unfortunately I cannot add modules.

I did, the parsing works. Problem is that the logs are all sent as a single event, there is no aggregation (counting of status codes) happening.

I am not using InfluxDB here; I understand other TSDBs are probably not the number one use case. But even if I were using InfluxDB, how would this perform with hundreds of requests per minute? If they are all stored unaggregated, I can imagine this would create quite a load.

I’ve tried the histogram plugin; the problem is that it accumulates the lower buckets. The output is:
http_accesslog,le=200.0 status_code_bucket=2i
http_accesslog,le=300.0 status_code_bucket=3i
Notice the 3i in the second line.

I basically need a histogram plugin without the accumulation. Not having to predefine the buckets would also be nice.
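
To make the difference concrete, here is a small Go sketch that turns cumulative "le" counts like the ones above into per-bucket counts by subtracting the previous bucket's total. It is a post-processing illustration only, not an option of the histogram aggregator; the perBucket function name is hypothetical.

```go
package main

import "fmt"

// perBucket converts cumulative counts (ordered by ascending upper bound)
// into per-bucket counts by subtracting the previous bucket's total.
// Hypothetical helper for illustration; not part of Telegraf.
func perBucket(cumulative []int64) []int64 {
	out := make([]int64, len(cumulative))
	var prev int64
	for i, c := range cumulative {
		out[i] = c - prev
		prev = c
	}
	return out
}

func main() {
	bounds := []float64{200, 300}
	cumulative := []int64{2, 3} // the values from the output above
	for i, n := range perBucket(cumulative) {
		fmt.Printf("le=%.1f cumulative=%d per_bucket=%d\n", bounds[i], cumulative[i], n)
	}
	// Prints:
	// le=200.0 cumulative=2 per_bucket=2
	// le=300.0 cumulative=3 per_bucket=1
}
```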

daniel · November 28, 2017, 5:59pm

“how will this perform if you would have hundreds of requests per minute. If they are all stored not aggregated I can imagine this would create quite a load.”

I guess it depends on how often you need to perform the query, but with InfluxDB you could use Continuous Queries or a Kapacitor task to do this kind of aggregation and store it back into the database.

“I basically need a histogram plugin without the accumulation. Not having to predefine the buckets would also be nice.”

I think it could be nice if the histogram aggregator had an option to not make cumulative histograms, but I’m not sure how we could get away from needing to define the buckets.

piotr1212 · November 29, 2017, 10:05am

Created this: https://github.com/influxdata/telegraf/pull/3523
Think it might be useful for others as well.

bolek2000 · December 1, 2017, 10:50am

Hey piotr, this is very useful. I was just searching for this functionality, so I hope it will find its way into Telegraf soon. Thank you!

piotr1212 · December 1, 2017, 3:41pm

Good to hear!

I think that if you were able to test it (and share your experience in a comment on GitHub), it would be more likely to get included.

bolek2000 · December 1, 2017, 4:20pm

I would like to try, but my development skills are not very advanced. I am confident using Linux, shell, bash, and make commands, but have never used golang. I guess I need to compile the Go code you wrote and integrate it somehow into the existing Telegraf. Could you please give me some hints on the steps I need to take to get this running and test it?

daniel · December 4, 2017, 8:41pm

Just want to second @piotr1212’s comment about sharing your experience and how you would use the plugin, as this helps us determine whether the plugin will be generally useful enough to be included in the official Telegraf builds.

Compiling is fairly easy: install Go from your distro’s repo and then follow the compiling-from-source steps. You can skip the “set up your GOPATH” step if you like, and all files will be placed in the default path ~/go.

andyhorng · January 19, 2018, 7:46am

Hi @daniel,

I think this valuecounter plugin is very useful. My use case needs to parse a log and count how many log entries are labeled ERROR, INFO and so on. Currently the built-in basicstats aggregator only supports fields of integer or float type.

github.com/influxdata/telegraf/blob/master/plugins/aggregators/basicstats/basicstats.go#L231

        parsed.max = true
    case "mean":
        parsed.mean = true
    case "s2":
        parsed.variance = true
    case "stdev":
        parsed.stdev = true
    case "sum":
        parsed.sum = true
    case "diff":
        parsed.diff = true
    case "non_negative_diff":
        parsed.nonNegativeDiff = true
    case "rate":
        parsed.rate = true
    case "non_negative_rate":
        parsed.nonNegativeRate = true
    case "interval":
        parsed.interval = true
    default:
        b.Log.Warnf("Unrecognized basic stat %q, ignoring", name)
