I’ve hunted around for some info on the Kafka output plugin, but have a couple of questions.
- When multiple endpoints are specified in the brokers field, how does Telegraf determine which endpoint to use? Does it pick the first one, select a random one, or send to all of them?
[[outputs.kafka]]
  brokers = ["10.1.1.1:9092", "10.1.1.2:9092", "10.1.1.3:9093"]
- If a single broker is down, how does the plugin handle it? Does it stop, or does it retry one of the others? If the latter, how does it pick the next one? Does it “mark” the failed instance as down and not retry it again, or does it reset on the next push?
UPDATE
I have done some testing and the behaviour appears to be as follows, but I would like confirmation:
- On startup, the plugin reads the list of brokers from the “brokers” field and selects the LAST one in the list that is contactable. The plugin uses this broker from then on, i.e. it’s sticky.
- If that broker goes down, the agent moves to the next broker in the list that is online. That broker is then used from then on.
- If this “new” broker also fails, the process repeats, wrapping around to the first entry in the list once the final entry has been tried.
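To make the behaviour I’m describing concrete, here is a rough Python sketch. It only models what I observed in testing; it is not taken from Telegraf or its Kafka client library, and the names `pick_initial`, `pick_next` and the `is_up` callback are made up for illustration.

```python
from typing import Callable, List, Optional


def pick_initial(brokers: List[str], is_up: Callable[[str], bool]) -> Optional[int]:
    """Observed startup behaviour: choose the LAST reachable broker in the list."""
    for i in range(len(brokers) - 1, -1, -1):
        if is_up(brokers[i]):
            return i
    return None  # nothing reachable


def pick_next(brokers: List[str], current: int, is_up: Callable[[str], bool]) -> Optional[int]:
    """Observed failover behaviour: walk forward from the current broker,
    wrapping to the start of the list, until a reachable broker is found."""
    n = len(brokers)
    for step in range(1, n + 1):
        i = (current + step) % n
        if is_up(brokers[i]):
            return i
    return None  # nothing reachable


if __name__ == "__main__":
    brokers = ["10.1.1.1:9092", "10.1.1.2:9092", "10.1.1.3:9093"]
    down = set()  # simulated outages

    def is_up(broker: str) -> bool:
        return broker not in down

    current = pick_initial(brokers, is_up)
    print("startup:", brokers[current])

    down.add(brokers[current])  # the sticky broker fails
    current = pick_next(brokers, current, is_up)
    print("after first failure:", brokers[current])

    down.add(brokers[current])  # the replacement fails too
    current = pick_next(brokers, current, is_up)
    print("after second failure:", brokers[current])
```

If my reading is right, every agent with the same broker list ends up on the same broker, which is why I’m asking about randomisation.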
If this is the behaviour, is there any way to randomise which broker is used at startup and each time the plugin looks for a new one? We’d like to add a level of balancing throughout our estate, and this would be simpler than adding a load balancer.
Thanks