Is it possible to merge (or combine) .conf files with an override behavior?

I started testing Telegraf today. I understand the systemd service loads a default .conf file plus all .conf files present in /etc/telegraf/telegraf.d.

As far as I understand, telegraf treats each .conf as a separate set of instructions - saving measurements to each specified output, for example - if each .conf file defines its own output.

As an example, right now I have an untouched /etc/telegraf/telegraf.conf and a /etc/telegraf/telegraf.d/custom.conf with the correct parameters for connecting to InfluxDB, and I’m seeing both connection errors in the systemd service log AND entries being added to InfluxDB.

I would like to ask if it would be possible to load the default .conf and then have other smaller files override just the settings they include. The default .conf is quite big and such an override behavior in my mind would facilitate editing and managing settings.

The conf.d directory actually works a bit differently: you can think of it as if all the config files are combined into a single config. This means that if you want to connect a specific input plugin to a specific output plugin, you need to use metric filtering. In the absence of any filtering, all metrics produced are sent to all outputs.
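A minimal sketch of what that filtering can look like, using the `namepass` selector on an output. The plugin choices, URL, database name, and file path are assumptions picked for illustration, not anything from the setup described above.

```toml
# Sketch only: plugin choices, URL, and file path are illustrative assumptions.
[[inputs.cpu]]
[[inputs.mem]]

# No filter here, so this output receives every metric from every input.
[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "telegraf"

# namepass keeps only metrics whose measurement name matches,
# so this output receives "cpu" but not "mem".
[[outputs.file]]
  files = ["/tmp/cpu-metrics.out"]
  namepass = ["cpu"]
```

It makes no difference whether these tables live in one file or are spread across several files in telegraf.d, since they end up in the same combined config.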

Override behavior isn’t possible, and it’s not something the current TOML could support. A new plugin instance is created for each plugin table found in the configuration files. Recall that any [[table]] with double brackets is TOML syntax for a list of [table] entries, so you can read double brackets as meaning the item can be specified multiple times. This allows you to have as many instances of a plugin as you need.
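To illustrate, here is roughly what the combined config would look like in the situation from the question. This is a sketch under assumptions: it presumes the shipped telegraf.conf enables an InfluxDB output with its settings commented out, and the host name is hypothetical.

```toml
# From /etc/telegraf/telegraf.conf (assumed shipped default; settings left
# commented out, so the plugin falls back to its built-in defaults).
[[outputs.influxdb]]
  # urls = ["http://127.0.0.1:8086"]

# From /etc/telegraf/telegraf.d/custom.conf (hypothetical host name).
# This does NOT override the table above; it adds a second instance.
[[outputs.influxdb]]
  urls = ["http://influx.example.com:8086"]
  database = "telegraf"
```

With two output instances, one can fail to connect while the other writes successfully, which would match seeing connection errors and new entries at the same time.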

It’s not required, but when using the conf.d directory and a split configuration, I usually remove, or disable, everything other than the [global_tags] and [agent] table from the main telegraf.conf and place all plugins in the conf.d directory.
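A stripped-down main telegraf.conf along those lines might look like the sketch below. The tag and the interval values are assumptions for illustration; any setting left out falls back to Telegraf's defaults.

```toml
# /etc/telegraf/telegraf.conf, reduced to agent-level settings only.
# All input and output plugins live in /etc/telegraf/telegraf.d/*.conf.

[global_tags]
  # Hypothetical tag, added to every metric.
  dc = "example-dc"

[agent]
  interval = "10s"
  flush_interval = "10s"
  # An empty hostname makes Telegraf use the OS hostname.
  hostname = ""
```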

Ok, thank you for your quick and clear reply.

Creating a static file with all the modifications and just copying it over to the server being monitored would mean maintaining slightly different template versions of the file for each server. If the default telegraf.conf changes significantly in future releases, all the templates will need to be re-edited, and that could be very cumbersome and time-consuming.

But I’m actually trying to automate the editing of the telegraf.conf, using regexp to find the relevant lines. The issue with scripted regexp editing is that some lines are not unique, making for really ugly rules to find (for example) the exact " # # password = “” " line you want - there are 7 of them.

Hey take a look here https://grafana.com/docs/installation/configuration/

Grafana overrides config settings with environment variables.

Telegraf has something similar, but there are limitations. Most notably that it isn’t possible to assign an array of strings using it.
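As a rough sketch of the environment variable substitution: the variable names below are made up for illustration, and they would need to be set in the environment of the Telegraf service (for example in /etc/default/telegraf on packaged installs).

```toml
# Sketch only: INFLUX_URL, INFLUX_PASSWORD, and INFLUX_SKIP_DB_CREATION
# are hypothetical variable names.
[[outputs.influxdb]]
  # String values: the ${...} reference goes inside the quotes.
  urls = ["${INFLUX_URL}"]
  password = "${INFLUX_PASSWORD}"
  # Booleans and numbers: the reference is left unquoted.
  skip_database_creation = ${INFLUX_SKIP_DB_CREATION}
```

Each variable fills in a single value, so `urls` above still needs one variable per list element; a whole array cannot be supplied from one variable.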

We have considered adding more functionality like this, but are holding off for now. I do recommend using a templating library, perhaps the one that comes with your configuration management software.

Personally, I’m using ansible, and its built-in Jinja templating provides better integration than I would have if we added a template library directly to Telegraf itself. I don’t usually use the lineinfile module, which is similar to your regexp editing, and instead stick to full plugins as the smallest unit. Merging the top level telegraf.conf isn’t much of a problem for me either, since I only use it for the agent table. In the next major release of Telegraf, we may do this split as default, but for now we don’t want to upend anyone’s process.

Hope that adds a little color to the current setup.

I’m also using ansible and I replaced some code using lineinfile with replace module as I was working on the .conf settings.

But I actually don’t understand what you mean by ‘full plugins as the smallest unit’.

Since you are using ansible too, let me show you some examples of what I have. Instead of editing installed files on a per line basis I use templates for everything. I create them for either individual plugins or a group of related plugins, making the template configurable with any values that I want to vary across my systems.

roles/telegraf/templates/inputs.github.conf.j2:

[[inputs.github]]
  interval = "20m"
  repositories = ["influxdata/telegraf"]
  access_token = "{{ github_access_token }}"

Then in roles/telegraf/tasks/main.yml I have something like this:

- name: Enable Telegraf github input plugin
  template:
    src: inputs.{{ item }}.conf.j2
    dest: /etc/telegraf/telegraf.d/inputs.{{ item }}.conf
    owner: telegraf
    group: telegraf
    mode: "0600"
  loop:
    - github
  notify:
    - reload telegraf
  when: telegraf_github|bool
  become: yes

I have a template for the main configuration file as well, which removes all the plugins from it. As a starting point, the file could even be empty to use the defaults.

Then I use ansible variables to configure the hosts that the template is deployed on and the settings. I just set everything as an inventory variable, because my setup is very small, but I believe if you take advantage of everything ansible offers you can create a solution that will work for even very large deployments.

Ok, I see, I think.

But the .conf in your example doesn’t have outputs. This means that outputs (or any other setting?) defined in /etc/telegraf/telegraf.conf apply to all .conf files in /etc/telegraf/telegraf.d/? The “combined into a single config” you mentioned?

But .conf files in /etc/telegraf/telegraf.d/ work as separate entities? So if I had a bunch of .conf files in /etc/telegraf/telegraf.d/ with full workflow settings, do they work in isolation or all [tables] will be combined in one big mix?

One case: a running config is already in production and I want to test another input/filter in parallel. The testing results I want in a local file, not committed to the database where everything else is being recorded.
In this case, I need to apply metric filtering? Or can I have a .conf file work in isolation? The filter will need to run for a few days. I see metric filtering being a bigger complication than simply duplicating settings for jobs that don’t share the same outputs.

But the .conf in your example doesn’t have outputs. This means that outputs (or any other setting?) defined in /etc/telegraf/telegraf.conf apply to all .conf files in /etc/telegraf/telegraf.d/? The “combined into a single config” you mentioned?

I have my outputs added in another ansible task, but yeah, telegraf.conf and the various files in telegraf.d are all merged into one big config.

But .conf files in /etc/telegraf/telegraf.d/ work as separate entities? So if I had a bunch of .conf files in /etc/telegraf/telegraf.d/ with full workflow settings, do they work in isolation or all [tables] will be combined in one big mix?

It’s one big mix. By default, all data produced by all inputs is sent to all outputs, so in your example you must use metric filtering. TBH, I wasn’t so sure about metric filtering either when I started working on Telegraf, but I’ve come to find that it works quite well: it’s flexible and adds hardly any runtime overhead, unlike running multiple workflows would. I try to think about routing metrics based on their properties and less about which plugin produced them.
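For the parallel test case, one possible approach is to tag the metrics from the input under test and route on that tag. The sketch below assumes a tag named `route`, a ping input standing in for whatever is being tested, and a local output path; all three are illustrative choices.

```toml
# Input under test: every metric it produces carries route = "test".
[[inputs.ping]]
  urls = ["example.org"]
  [inputs.ping.tags]
    route = "test"

# Local file output: accepts only metrics tagged route = "test".
[[outputs.file]]
  files = ["/tmp/telegraf-test.out"]
  [outputs.file.tagpass]
    route = ["test"]

# Production output: rejects anything tagged route = "test",
# so the test data never reaches the database.
[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "telegraf"
  [outputs.influxdb.tagdrop]
    route = ["test"]
```

The `tagpass` and `tagdrop` sub-tables have to come last within their plugin definition, after the plugin's plain key/value settings. When the test is over, removing the test input file and the file output leaves the production output untouched apart from the now-unused `tagdrop` rule.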

Ok thanks. This helped clarify a few things and now I have a much better understanding of how to make Telegraf work to my needs.