Processor.lookup dynamic updates

The docs for the lookup processor state: "The lookup is static as the files are only used on startup". Is there a better way to deal with changing lookup tables other than to `kill -HUP` the telegraf process? That is not ideal, as it reloads all state.

I build my lookups from a database, but I would be fine with either a file watcher built into processors.lookup that detects file changes, or some other mechanism to signal a processor to reload.

Thoughts?

If you need to build dynamic data and want to continue to use Telegraf, then my suggestion is to use the execd processor (processors.execd). That way you can run arbitrary code to perform your dynamic lookups however you want.
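A minimal sketch of what that could look like (the script path is a placeholder for whatever program does your lookups; processors.execd keeps the program running, feeds it metrics on stdin, and reads the processed metrics back from stdout):

```toml
# Minimal sketch; /usr/local/bin/enrich-lookup is a hypothetical script
# that reads metrics in influx line protocol on stdin, enriches them from
# your database, and writes the result back to stdout.
[[processors.execd]]
  command = ["/usr/local/bin/enrich-lookup"]
```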

As you have seen, the lookup processor is for static lookups, not dynamic.

@richarde I think the right approach would be either to extend the lookup processor (e.g. add a `sql` format with the necessary parameters and logic) or to create a new processor (based on lookup) that can handle the requests to the database.
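Purely to illustrate the idea, such an extension could look roughly like this; note that the `sql` format and all DB-related options below are invented for this sketch and do not exist in the current plugin:

```toml
# Hypothetical configuration for the proposed extension -- not implemented.
# format = "sql" and the dsn/query/refresh_interval options are invented
# names; only "key" works like this in today's lookup processor.
[[processors.lookup]]
  format = "sql"
  dsn = "postgres://telegraf@localhost/lookups"
  query = "SELECT key, customer, site FROM enrichment"
  refresh_interval = "5m"   # re-query so the mapping can change at runtime
  key = '{{.Tag "host"}}'
```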

What do you think? As always, please create a feature request describing the expected behavior and PRs are welcome! :wink:

Hi @srebhan,

I really tried to make the lookup processor work, but I moved away from it in favor of sending the data to a database and doing the lookups there via outputs.sql.

My logic is as follows: metrics are sent to a DB table, and a trigger on that table does the lookups (something like this: TRIGGER events_enrichment BEFORE INSERT ON events_processor FOR EACH ROW BEGIN). An inputs.sql plugin then reads from the same table while deleting the rows (something like this: DELETE FROM events_processor RETURNING *).
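The Telegraf side of that round trip looks roughly like this (a sketch only; the DSN and table names are from my setup, the enrichment trigger itself lives in the database, and the exact driver names for your database are listed in the outputs.sql and inputs.sql READMEs):

```toml
# Metrics go into the database, where the events_enrichment trigger runs.
[[outputs.sql]]
  driver = "pgx"    # PostgreSQL; check the README for your database's driver
  data_source_name = "postgres://telegraf@localhost/pipeline"

# The enriched rows are then drained back into Telegraf in one statement.
[[inputs.sql]]
  driver = "pgx"
  dsn = "postgres://telegraf@localhost/pipeline"

  [[inputs.sql.query]]
    query = "DELETE FROM events_processor RETURNING *"
```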

Unfortunately this requires two round trips to the DB, but it solves a number of issues for me:

  1. I can change the lookup data without having to restart Telegraf
  2. The current data is actually replaced rather than new data being created (fields are updated, instead of tags being created that I then have to manually mangle back into the fields)

An ideal lookup processor for me would have the following properties:

  1. Ability to replace data in place (like processors.enum does; see the sketch after this list)
  2. Ability to dynamically update the lookup data (either from a DB in a timed, cached fashion, as you suggested, or from a static file with a watcher looking for changes)
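To make point 1 concrete, here is a minimal processors.enum mapping (the field name and values are just examples) where the value is replaced in place instead of a new tag being created:

```toml
[[processors.enum]]
  [[processors.enum.mapping]]
    ## no "dest" option, so the field's own value is rewritten in place --
    ## the replace behavior I would like lookups to have as well
    field = "status"
    [processors.enum.mapping.value_mappings]
      green = 1
      amber = 2
      red = 3
```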

Some background on what I am trying to do is in order. I am building a multi-purpose telemetry pipeline that uses Telegraf as the collector, feeding into Apache Kafka, which then persists the data into Apache Druid. A number of our custom applications consume the data from Druid. It works well and will scale for us going forward. One of our best-practice design goals is to enrich data at the start of the pipeline so that the data is idempotent and does not change at query time on the Druid side: details like customer information, network state, etc. need to reflect the state at the time of message creation. Doing lookups at query time also has performance implications, especially with large amounts of data, and is not advised. Here is some info on how the Druid lookup features work (scroll down to the JDBC section for DB-backed lookups): Druid

I would love to contribute code and learn some Go (I am not a Go programmer), but for now I have to move my project along. Once I have all my features in place and working, I can revisit some aspects (like lookups) and improve some bits in the future.

My journey has been long and frustrating, and I have encountered many potholes along the way. Some of them:

Lookups: described above
ping plugin (inputs.ping): cannot dynamically change the ping list (a restart is required) and does not scale with long ping lists even when using native ping; the numbers were so far off that I switched to fping, which created more issues of its own with the exec plugin
exec plugin (inputs.exec): cannot ignore exit codes or read from stderr (looking at the source code, ignoring exit codes and reading from stderr is hardcoded for the nagios parser; these should be config settings, not nagios-only behavior); a workaround sketch follows this list
Aggregation idempotence: I wrote a Telegraf feature request but closed it after I changed direction: https://github.com/influxdata/telegraf/issues/14107
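For the fping exit-code problem, the only workaround I can see is a wrapper along these lines (a sketch only; fping-wrapper.sh is a hypothetical script that runs something like `fping -c 1 -q $HOSTS 2>&1; exit 0` and reformats the per-host summary into influx line protocol, since inputs.exec treats any non-zero exit code as a failure unless the nagios data format is used):

```toml
# Sketch only: the hypothetical wrapper script hides fping's non-zero exit
# codes and stderr output from inputs.exec and emits influx line protocol.
[[inputs.exec]]
  commands = ["/usr/local/bin/fping-wrapper.sh"]
  data_format = "influx"
  timeout = "10s"
```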

Sorry for the long post, and thanks for listening. Telegraf is great, and hopefully I will be able to use it 100% at the start of my pipeline for all my collection purposes. Keep up the great work.
