Hi
In a nutshell: when collecting syslog data in Docker Swarm, the "host" tag is wrong for the data coming from one host.
System layout:
I have multiple hosts. Docker Swarm is started from Server1 on all servers. My config is pretty much the same as in this thread (@glass_willis thanks for solving this). The full compose file is:
version: "3.7"
services:
  telegraf:
    image: telegraf:1.30
    user: "telegraf:999"
    hostname: "{{.Node.Hostname}}"
    volumes:
      - /:/hostfs:ro
      - /var/run/docker.sock:/var/run/docker.sock
      - /data/stack/telegraf_configurations_test_syslog:/etc/telegraf/telegraf_configurations:ro
      - /data/stack/telegrafOutput:/tmp:rw
    command:
      - '--config-directory'
      - '/etc/telegraf/telegraf_configurations'
      - '--watch-config'
      - 'notify'
    environment:
      - HOST_ETC=/hostfs/etc
      - HOST_PROC=/hostfs/proc
      - HOST_SYS=/hostfs/sys
      - HOST_VAR=/hostfs/var
      - HOST_RUN=/hostfs/run
      - HOST_MOUNT_PREFIX=/hostfs
      - HOSTNAME={{.Node.Hostname}}
    networks:
      - proxy-net
    deploy:
      mode: global
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=proxy-net"
        - "traefik.tcp.services.telegrafSyslog.loadbalancer.server.port=6514"
        - "traefik.tcp.routers.telegrafSyslog.entrypoints=telegrafSyslog"
        - "traefik.tcp.routers.telegrafSyslog.rule=HostSNI(`*`)"
        - "traefik.tcp.routers.telegrafSyslog.service=telegrafSyslog"
networks:
  proxy-net:
    external: true
To be able to filter across the many different services collecting data, the "host" tag needs to be correct. In InfluxDB, the "host" tag I see is always "Server2", while the "hostname" tag is reported correctly (Server1 or Server2). Server1 is where I execute the docker commands.
I tried rewriting the tag with a processor plugin (processors.override), to no avail. My current telegraf.conf looks like this:
[global_tags]
[agent]
# The agent table configures Telegraf and the defaults used across all plugins.
interval = "2s"
round_interval = true
metric_batch_size = 10000
metric_buffer_limit = 100000
collection_jitter = "1s"
flush_interval = "2s"
flush_jitter = "1s"
precision = "1ms"
# debug: Run Telegraf in debug mode.
debug = true
# quiet: Run Telegraf in quiet mode (error messages only).
quiet = false
# logfile: Specify the log file name. The empty string means to log to stderr. The directory has to exist in advance, else no logfile gets written.
logfile = "/var/log/telegraf/Telegraf.log"
# logtarget: Control the destination for logs. Can be one of "file", "stderr" or, on Windows, "eventlog". When set to "file", the output file is determined by the "logfile" setting.
logtarget = "file"
# logfile_rotation_interval: Rotates logfile after the time interval specified. When set to 0 no time based rotation is performed.
logfile_rotation_interval = 0
# logfile_rotation_max_size: Rotates logfile when it becomes larger than the specified size. When set to 0 no size based rotation is performed.
logfile_rotation_max_size = "100KB"
# logfile_rotation_max_archives: Maximum number of rotated archives to keep, any older logs are deleted. If set to -1, no archives are removed.
logfile_rotation_max_archives = 50
# log_with_timezone: Set a timezone to use when logging or type "local" for local time. Example: "America/Chicago". See this page for options/formats.
# hostname: Override default hostname, if empty use os.Hostname().
hostname = "${HOSTNAME}"
# omit_hostname: If true, do not set the host tag in the Telegraf agent.
omit_hostname = false
[[inputs.syslog]]
alias = "Log_System"
name_override = "Log_System"
interval = "1s" #value is ignored by "tail" plugin as it is event driven
## Protocol, address and port to host the syslog receiver.
server = "tcp4://localhost:6514"
## Framing technique used for messages transport
## Available settings are:
## octet-counting -- see RFC5425#section-4.3.1 and RFC6587#section-3.4.1
## non-transparent -- see RFC6587#section-3.4.2
framing = "octet-counting"
# To avoid disconnects and reconnects, which can create many warnings in syslog, read_timeout and keep_alive_period should be set as follows.
## Zero means unlimited.
read_timeout = "0s"
## Zero disables keep alive probes. Defaults to the OS configuration.
keep_alive_period = "20s"
# best_effort tries to handle even malformed syslog entries.
best_effort = true
[inputs.syslog.tags]
_in = "LogSystemTest"
[[processors.override]]
[processors.override.tags]
host = "${HOSTNAME}"
[[outputs.influxdb]]
alias = "InfluxDB_PCM_Log_System_Test"
tagexclude = ["_in"]
urls = ["https://192.168.102.109:8087"]
insecure_skip_verify = true
database = "InfluxDB_PCM_Log_System_Test"
username = "telegraf_writer"
password = "Write@InfluxDB"
[outputs.influxdb.tagpass]
_in = ["LogSystemTest"]
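To see which tags each Telegraf instance actually emits before the data reaches InfluxDB, a file output along the following lines could be added next to the InfluxDB output. This is only a debugging sketch and not part of my current config: the file name is hypothetical, and it assumes the /tmp volume from the compose file is writable by the telegraf user on every node.

```toml
# Debugging sketch (assumption, not part of the running config):
# write the same syslog metrics to a per-node file in line protocol.
[[outputs.file]]
  alias = "Debug_Log_System_Test"
  # Hypothetical file name; /tmp is the read-write volume from the compose file.
  # ${HOSTNAME} is the environment variable set in the compose file.
  files = ["/tmp/syslog_debug_${HOSTNAME}.out"]
  # Line protocol shows every tag (host, hostname, ...) on each metric.
  data_format = "influx"
  # Same routing filter as the InfluxDB output.
  [outputs.file.tagpass]
    _in = ["LogSystemTest"]
```

Comparing the "host" and "hostname" tags in the files written on Server1 and Server2 should show which Telegraf instance received each message.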
Any help on this highly appreciated! Thanks to all those folks out there helping out!