I have had to rebuild my monitoring server (the backups were hit with the same corruption), and I'm having issues monitoring apcupsd with Telegraf.
The APC agent is running on my monitoring server and can query the UPS. Telegraf is installed and other instances are working fine, but for some reason it errors every time I try to monitor the apcupsd daemon.
I have tried moving the APC daemon to another computer and also running Telegraf from another machine, but each time I receive this error. I cannot find what is wrong, as the telegraf.conf file is very simple and minimal for apcupsd.
It basically has this for the input plugin (the default for apcupsd):
[[inputs.apcupsd]]
  # A list of running apcupsd server to connect to.
  # If not provided will default to tcp://127.0.0.1:3551
  servers = ["tcp://192.168.0.1:3551"]
  ## Timeout for dialing server.
  # timeout = "5s"
I have tried adding the timeout with various durations, but this error always remains. Has anyone seen this before, or does anyone know how to debug Telegraf?
Ubuntu 20.05
Telegraf 1.14.5 & 1.16.2
apcupsd 3.14.4
  ## Collection jitter is used to jitter the collection by a random amount.
  collection_jitter = "0s"
  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  flush_jitter = "0s"
  ## By default or when set to "0s", precision will be set to the same
  precision = ""
  ## Logging configuration:
  ## Run telegraf with debug log messages.
  debug = false
  ## Run telegraf in quiet mode (error log messages only).
  quiet = false
  ## Specify the log file name. The empty string means to log to stderr.
  logfile = ""
  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  omit_hostname = false
# Configuration for sending metrics to InfluxDB
[[outputs.influxdb]]
  urls = ["http://netmon:8086"]
[[inputs.apcupsd]]
  # A list of running apcupsd server to connect to.
  # If not provided will default to tcp://127.0.0.1:3551
  servers = ["tcp://192.168.0.1:3551"]
  ## Timeout for dialing server.
  # timeout = "5s"
I have tried running it with all inputs commented out and get the same error, both locally and via a remote server.
This should be simple, but for some reason it will not work on this new server, which is frustrating. The only difference is that this is Ubuntu 20 and the old server was 16.04.
Seems like this is a bug in the upstream package used for parsing the apcupsd status. Perhaps it expects some field to be a time duration of the form “10 Seconds”, but that’s not what it’s reading from the field value.
Could you open an issue for this? It’s not user error.
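In the meantime, one way to see which field the parser might be tripping over is to read the raw status straight from the apcupsd network server and look at the values yourself. The following is only a rough sketch, not what Telegraf does internally. It assumes the usual apcupsd NIS framing (each message prefixed with a 2-byte big-endian length, with a zero-length record ending the reply), and the function names are made up for illustration.

```python
import socket
import struct


def frame_command(command):
    """Prefix a command with its length as a 2-byte big-endian integer (assumed NIS framing)."""
    payload = command.encode("ascii")
    return struct.pack(">H", len(payload)) + payload


def _recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-record")
        buf += chunk
    return buf


def read_records(sock):
    """Collect length-prefixed records until a zero-length record ends the reply."""
    records = []
    while True:
        (length,) = struct.unpack(">H", _recv_exact(sock, 2))
        if length == 0:
            return records
        records.append(_recv_exact(sock, length).decode("ascii", errors="replace"))


def parse_status(records):
    """Split 'KEY : value' records into a dict, splitting on the first colon only."""
    status = {}
    for record in records:
        key, sep, value = record.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def fetch_status(host, port=3551, timeout=5.0):
    """Send the 'status' command to an apcupsd server and return the parsed fields."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(frame_command("status"))
        return parse_status(read_records(sock))
```

Calling `fetch_status("192.168.0.1")` with the address from your config and printing each key and value should show whether any of the time-like fields (for example the ones ending in "Seconds" or "Minutes") have an unexpected format. That output would also be useful to attach to the issue.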