powerdns_recursor plugin fails collection with I/O timeout

Hi,

I’m trying to use Telegraf to collect metrics from a local PDNS Recursor instance with the powerdns_recursor input plugin. However, collection fails with the following error:

E! [inputs.powerdns_recursor] Error in plugin: read unixgram /run/pdns-recursor/pdns_recursor_telegraf5344709737374851411->/run/pdns-recursor/pdns_recursor.controlsocket: i/o timeout

The plugin configuration:

[[inputs.powerdns_recursor]]
    unix_sockets = ["/run/pdns-recursor/pdns_recursor.controlsocket"]
    socket_dir = "/run/pdns-recursor"
    socket_mode = "0666"

The telegraf user belongs to the pdns-recursor group, which owns the socket directory, so Telegraf can write to the control socket and create its own socket in the socket directory. When I test the control socket directly with sudo -u telegraf rec_control get-all, Recursor immediately returns a response. I can also see in Recursor’s logs that it does receive the control command from the plugin:

Received rec_control command 'get-all
' from control socket

However, the plugin’s command includes a trailing newline, whereas a command sent from rec_control does not. Maybe that has something to do with the problem?
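
To narrow this down outside of Telegraf, the exchange can be reproduced by hand. The following is only a minimal sketch, not the plugin’s actual code: it assumes the control socket speaks datagrams (as the "unixgram" in the error suggests), and the function name query_control_socket and the probe_ file prefix are made up for illustration. Calling it once with command=b"get-all\n" and once with b"get-all" would show whether the trailing newline matters.

```python
import os
import socket


def query_control_socket(control_path, socket_dir, command=b"get-all\n", timeout=5.0):
    """Send one datagram to a Unix datagram control socket and wait for the reply."""
    # A datagram client has to bind its own path, otherwise the server
    # has no address to send the reply to. (Hypothetical file name.)
    local_path = os.path.join(socket_dir, "probe_%d" % os.getpid())
    if os.path.exists(local_path):
        os.unlink(local_path)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.bind(local_path)
        os.chmod(local_path, 0o666)  # mirrors socket_mode = "0666"
        sock.settimeout(timeout)     # raises socket.timeout if no reply arrives
        sock.sendto(command, control_path)
        data, _ = sock.recvfrom(65536)
        return data
    finally:
        sock.close()
        if os.path.exists(local_path):
            os.unlink(local_path)
```

If this times out as well while the log still shows the command arriving, the request is getting through and the reply is what gets lost, which would point at how the server answers rather than at the command text.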

  • OS: CentOS 8 in an LXC container running in Proxmox
  • Telegraf version: 1.18.2 (git: HEAD a6143722)
  • PDNS Recursor version: 4.5.0-rc1

Thanks!

Note that /run/pdns-recursor has 751 permissions (at least in older versions of PDNS Recursor), so maybe that’s the issue? Try creating a new directory such as /run/telegraf-pdns-recursor, owned by root:pdns-recursor with 771 permissions, and set socket_dir in Telegraf’s configuration to /run/telegraf-pdns-recursor. This configuration works for me with PDNS Recursor 4.4.3 and Telegraf 19.1. Of course, the telegraf user must be added to the pdns-recursor group.
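
To illustrate why the mode matters: creating a socket file in a directory requires both write and search permission for the creating user’s class. A quick sketch of the difference between the two modes for group members (the helper name group_can_create_in is made up for this example):

```python
import stat


def group_can_create_in(mode):
    """True if group members may create entries in a directory with this mode."""
    # Creating a file needs write permission plus search (execute) permission.
    return bool(mode & stat.S_IWGRP) and bool(mode & stat.S_IXGRP)


for mode in (0o751, 0o771):
    print(oct(mode), group_can_create_in(mode))
# 0o751 False
# 0o771 True
```

To check a real directory, pass os.stat(path).st_mode instead of a literal mode.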