Potential bug in telegraf where if one conf fails others are ignored

I have multiple Telegraf config files in C:\Program Files\Telegraf\Config, one of which includes an inputs.win_services section. I added a new file to this folder and confirmed that it works using .\telegraf --test ...., but it wasn’t writing to InfluxDB.

I realised that the inputs.win_services section was failing because a recent change had caused the account I was using to lose its privileges on the service manager. Once I resolved this, the data from my additional config began writing to InfluxDB.

Is this expected behaviour, or a bug? It seems to me that one config file being in an error state should not stop all the other configs from working.

Thanks,
Lee

It depends on what kind of error you received.

I don’t receive any error for the additional config; it either fails silently or is ignored. For the config that is in an error state, I get an "access denied to the service manager" error.

Where do you get that error? And can you share that actual message?

The error is in the Windows Event Logs.

[inputs.win_services] Error in plugin: Could not open service manager: Access is denied.

The [inputs.win_services] part indicates this comes from telegraf.

Which user is the Telegraf service running as? Does that user have administrator privileges, as mentioned in the docs?

Yeah this isn’t my issue, maybe I wasn’t clear in my first post. My problem is I have multiple config files - some work, one fails and one is failing silently or being ignored. I suspect it’s an ordering problem.

  1. windows_server_performance_counters.conf - works
  2. windows_server_web_services.conf - works
  3. windows_server_services.conf - fails and records error
  4. windows_server_application_metrics.conf - fails silently or is ignored

I can fix or remove the 3rd config and then the 4th one would work, but I would expect the 4th to continue working despite any of the previous ones failing. Changes in an environment could cause one to fail and it may not be identified (or fixed) immediately so the rest should continue to work.

The file order is not an issue: the files are concatenated into a single config, and plugins are then executed logically according to their type (inputs, processors, aggregators, outputs).
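To get a quick view of which plugins each file contributes, grouped by type, a rough Python sketch like the one below might help. It only scans for top-level plugin headers of the form [[inputs.name]] and is not how Telegraf itself loads config. The sample file contents, including the inputs.exec section, are hypothetical stand-ins for the real files.

```python
import re
from collections import defaultdict

# Matches top-level plugin headers such as [[inputs.win_services]].
# Nested sub-tables like [[inputs.win_perf_counters.object]] are not matched.
HEADER_RE = re.compile(
    r"^\s*\[\[\s*(inputs|processors|aggregators|outputs)\.([A-Za-z0-9_]+)\s*\]\]"
)

def plugins_by_type(configs):
    """configs maps file name -> file text.

    Returns {plugin type: [(plugin name, file name), ...]}.
    """
    grouped = defaultdict(list)
    for name in sorted(configs):
        for line in configs[name].splitlines():
            match = HEADER_RE.match(line)
            if match:
                kind, plugin = match.groups()
                grouped[kind].append((plugin, name))
    return dict(grouped)

# Hypothetical contents; the real files will differ.
sample = {
    "windows_server_services.conf": "[[inputs.win_services]]\n",
    "windows_server_application_metrics.conf": (
        "[[inputs.exec]]\n  commands = []\n[[outputs.influxdb]]\n"
    ),
}

for kind, entries in plugins_by_type(sample).items():
    print(kind, entries)
```

For real files, the dictionary could be built with pathlib, for example by reading every *.conf file in the config directory into a name-to-text mapping.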

A possible issue might be something missing in the config itself, the equivalent of a quote that is opened but never closed… making everything that comes after a simple string…

About the failures, this is my experience so far:

  • The agent fails to start, or fails to work at all, only when there is a config parsing error
  • A plugin failure is always scoped to the plugin itself, as plugins are independent of each other

Therefore I expect the issue to be one of:

  • the file is not loaded at all
  • something weird in the config once merged, that is syntactically valid but not actually working

I suggest you run the agent with more detailed logging, written to a file, to get the overall picture:

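[agent]
  # debug = true enables debug-level log messages
  debug = true
  quiet = false
  # write the log to a file instead of stderr
  logtarget = "file"
  logfile = "_whateverfile_"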

At the beginning of the log, Telegraf lists all the loaded plugins. From there you will see whether the ones you configured in the fourth file have been loaded.
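If the log is long, a small Python sketch along these lines could do that check for you. It assumes the startup lines look like "<timestamp> I! Loaded inputs: name1 name2", which is how they typically appear but may vary between Telegraf versions. The sample log and the expected plugin name exec are hypothetical.

```python
import re

# Assumed line shape: "<timestamp> I! Loaded inputs: name1 name2"
LOADED_RE = re.compile(r"Loaded (inputs|aggregators|processors|outputs):\s*(.*)$")

def loaded_plugins(log_text):
    """Return {plugin type: [plugin names]} parsed from startup log lines."""
    found = {}
    for line in log_text.splitlines():
        match = LOADED_RE.search(line)
        if match:
            kind, names = match.groups()
            # Skip repeat-count tokens such as "(2x)" if the log contains them.
            found[kind] = [n for n in names.split() if not n.startswith("(")]
    return found

def missing_plugins(log_text, kind, expected):
    """Return the expected plugin names that the log does not list as loaded."""
    loaded = set(loaded_plugins(log_text).get(kind, []))
    return sorted(set(expected) - loaded)

# Hypothetical log excerpt; a real one would be read from the configured logfile.
sample_log = """\
2024-01-01T00:00:00Z I! Loaded inputs: win_perf_counters win_services
2024-01-01T00:00:00Z I! Loaded outputs: influxdb
"""

print(missing_plugins(sample_log, "inputs", ["win_services", "exec"]))
```

Any name it prints was expected but never loaded, which would point to the file not being read at all rather than to a plugin failing at runtime.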
