Originally published at: Telegraf Update - 1.3 and How to Write Plugins | InfluxData
We recently released the 1.3 version of Telegraf.
Have a Python script that grabs some custom metrics? Use the exec plugin. Have some log files that contain metrics? Use the logparser plugin. Need to measure API response time? The ping plugin has your back. With over 300 different committers and over 100 plugins in Telegraf, it’s the data Swiss Army Knife you need.
What makes Telegraf such an active project? The ease of plugin development. Below I’ll briefly describe the plugin architecture and show how easy it is to contribute to Telegraf.
Telegraf Architecture
Telegraf is a configuration-driven agent. Each of the input plugins satisfies a simple golang interface:

    type Input interface {
        // SampleConfig returns the default configuration of the Input
        SampleConfig() string

        // Description returns a one-sentence description on the Input
        Description() string

        // Gather takes in an accumulator and adds the metrics that the Input
        // gathers. This is called every "interval"
        Gather(Accumulator) error
    }

The SampleConfig() and Description() functions are used when generating the configuration file. The real magic happens in the Gather() function. The Accumulator that gets passed in is shared by all the plugins. It takes a representation of a single metric from the plugin and gives it to Telegraf:

    Accumulator.AddFields(measurement, fields, tags, time)

The Gather function gets called on an interval that is set in the configuration file ([agent] interval). Every ‘interval’ Telegraf calls the Gather function for each of its plugins and stores the resulting metrics from all the AddFields calls in its metric buffer.
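To make the input side concrete, here is a minimal sketch of a toy input plugin. It is an illustration under stated assumptions, not code from the Telegraf repository: the Accumulator and Input types are local stand-ins that model only what is described above, and Counter, Point, and BufferAccumulator are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"time"
)

// Accumulator is a local stand-in for the accumulator described above.
// Only AddFields is modeled here (assumption: the real interface has more).
type Accumulator interface {
	AddFields(measurement string, fields map[string]interface{}, tags map[string]string, t ...time.Time)
}

// Input mirrors the interface quoted in the article.
type Input interface {
	SampleConfig() string
	Description() string
	Gather(Accumulator) error
}

// Point is a hypothetical record of one AddFields call.
type Point struct {
	Measurement string
	Fields      map[string]interface{}
	Tags        map[string]string
	Time        time.Time
}

// BufferAccumulator is a hypothetical accumulator that appends to a slice,
// standing in for the agent's metric buffer.
type BufferAccumulator struct {
	Points []Point
}

func (b *BufferAccumulator) AddFields(measurement string, fields map[string]interface{}, tags map[string]string, t ...time.Time) {
	ts := time.Now() // default to "now" when the plugin passes no timestamp
	if len(t) > 0 {
		ts = t[0]
	}
	b.Points = append(b.Points, Point{measurement, fields, tags, ts})
}

// Counter is a toy input that reports how many times it has been gathered.
type Counter struct {
	Name  string
	count int64
}

func (c *Counter) SampleConfig() string {
	return "  ## Tag value attached to every metric\n  name = \"demo\"\n"
}

func (c *Counter) Description() string {
	return "Report how many times Gather has been called"
}

func (c *Counter) Gather(acc Accumulator) error {
	c.count++
	acc.AddFields("counter",
		map[string]interface{}{"value": c.count},
		map[string]string{"name": c.Name})
	return nil
}

func main() {
	var in Input = &Counter{Name: "demo"}
	acc := &BufferAccumulator{}
	for i := 0; i < 3; i++ { // stands in for three "interval" ticks
		if err := in.Gather(acc); err != nil {
			fmt.Println("gather error:", err)
		}
	}
	for _, p := range acc.Points {
		fmt.Println(p.Measurement, p.Fields, p.Tags)
	}
}
```

The plugin itself is only the three methods on Counter. Everything else in the sketch plays the role of the agent.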
The output interface is slightly more complex than the input interface as it needs to deal with database connections and writes:

    type Output interface {
        // Connect to the Output
        Connect() error

        // Close any connections to the Output
        Close() error

        // Description returns a one-sentence description on the Output
        Description() string

        // SampleConfig returns the default configuration of the Output
        SampleConfig() string

        // Write takes in group of points to be written to the Output
        Write(metrics []Metric) error
    }

The metric buffer then gets cleared by the configured output plugins every flush interval ([agent] flush_interval) by a Write() call. Connect() and Close() help manage the connection to the metric output and are called on startup and shutdown by Telegraf.
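The output lifecycle described above (Connect on startup, Write on every flush, Close on shutdown) can be sketched as follows. This is a simplified illustration, not Telegraf's implementation: Metric is a stripped-down stand-in for Telegraf's richer metric type, and MemoryOutput and flush are hypothetical names. In particular, keeping the buffer when a write fails is an assumption made for this sketch, not a statement about how Telegraf handles failed writes.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Metric is a hypothetical, simplified metric used only in this sketch.
type Metric struct {
	Name   string
	Fields map[string]interface{}
}

// Output mirrors the interface quoted in the article.
type Output interface {
	Connect() error
	Close() error
	Description() string
	SampleConfig() string
	Write(metrics []Metric) error
}

// MemoryOutput is a toy output that "writes" lines into an in-memory sink.
type MemoryOutput struct {
	connected bool
	sink      strings.Builder
}

func (m *MemoryOutput) Connect() error       { m.connected = true; return nil }
func (m *MemoryOutput) Close() error         { m.connected = false; return nil }
func (m *MemoryOutput) Description() string  { return "Keep written metrics in memory" }
func (m *MemoryOutput) SampleConfig() string { return "  # no configuration\n" }

func (m *MemoryOutput) Write(metrics []Metric) error {
	if !m.connected {
		return errors.New("write called before Connect")
	}
	for _, mt := range metrics {
		fmt.Fprintf(&m.sink, "%s %v\n", mt.Name, mt.Fields)
	}
	return nil
}

// flush hands the buffered metrics to every output. It returns an emptied
// buffer only if all writes succeeded (assumption made for this sketch).
func flush(buffer []Metric, outputs []Output) ([]Metric, error) {
	for _, o := range outputs {
		if err := o.Write(buffer); err != nil {
			return buffer, err
		}
	}
	return buffer[:0], nil
}

func main() {
	out := &MemoryOutput{}
	if err := out.Connect(); err != nil { // startup
		fmt.Println("connect error:", err)
		return
	}
	defer out.Close() // shutdown

	buffer := []Metric{
		{Name: "cpu", Fields: map[string]interface{}{"usage": 12.5}},
		{Name: "mem", Fields: map[string]interface{}{"free": 2048}},
	}
	buffer, err := flush(buffer, []Output{out}) // one "flush_interval" tick
	fmt.Println("remaining:", len(buffer), "err:", err)
	fmt.Print(out.sink.String())
}
```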
This simple, yet powerful architecture is very extensible and makes Telegraf easy to contribute to.
Config Generation
One of the cool features the above architecture allows for is config file generation, which makes Telegraf self-documenting. I've found it particularly useful for getting monitoring started on a new project, or for seeing what kind of metrics I can get out of existing systems. For example, if you are trying to see what kind of metrics Telegraf can generate from your databases, you might generate the following config:
$ telegraf -sample-config -input-filter elasticsearch:mysql:mongodb -output-filter influxdb
And if you have a RabbitMQ instance you would like to route the data through on the way to InfluxDB, Telegraf makes that easy too:
$ telegraf -sample-config -input-filter elasticsearch:mysql:mongodb -output-filter amqp
$ telegraf -sample-config -input-filter amqp_consumer -output-filter influxdb
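To show why SampleConfig() and Description() are enough to make this kind of generation possible, here is a rough sketch of a generator that filters a plugin table by a colon-separated list of names. It is not how Telegraf itself is implemented: the Plugin interface, the stub type, the sampleConfig function, and the demo_* plugin names with their settings are all hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Plugin holds the two self-documenting methods shared by inputs and outputs.
type Plugin interface {
	SampleConfig() string
	Description() string
}

// stub is a hypothetical plugin used only for this illustration.
type stub struct{ desc, sample string }

func (s stub) SampleConfig() string { return s.sample }
func (s stub) Description() string  { return s.desc }

// sampleConfig renders a config section for every plugin whose name appears
// in the colon-separated filter. An empty filter selects all plugins.
func sampleConfig(kind string, plugins map[string]Plugin, filter string) string {
	wanted := map[string]bool{}
	for _, name := range strings.Split(filter, ":") {
		if name != "" {
			wanted[name] = true
		}
	}

	names := make([]string, 0, len(plugins))
	for name := range plugins {
		if len(wanted) == 0 || wanted[name] {
			names = append(names, name)
		}
	}
	sort.Strings(names) // stable output regardless of map iteration order

	var b strings.Builder
	for _, name := range names {
		p := plugins[name]
		fmt.Fprintf(&b, "# %s\n[[%s.%s]]\n%s\n", p.Description(), kind, name, p.SampleConfig())
	}
	return b.String()
}

func main() {
	// Hypothetical plugin table; names and settings are made up.
	inputs := map[string]Plugin{
		"demo_db":    stub{"Read metrics from a demo database", "  address = \"localhost:1234\"\n"},
		"demo_queue": stub{"Read metrics from a demo queue", "  url = \"demo://localhost\"\n"},
		"demo_cache": stub{"Read metrics from a demo cache", "  servers = [\"localhost\"]\n"},
	}
	fmt.Print(sampleConfig("inputs", inputs, "demo_db:demo_queue"))
}
```

Because every plugin carries its own description and sample settings, a generator like this needs no per-plugin knowledge, which is what keeps the generated file in step with the plugins.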
I could go on about processor and aggregator plugins, queuing integrations, input and output service plugins, and the countless things that make Telegraf cool, but I’ll leave those topics for another day!