Hi,
I have just reproduced it. The output config looks like this:
```toml
[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "telegraf"
  namedrop = ["_test"]
  [outputs.influxdb.tagdrop]
    influxdb_database = [""]

[[outputs.influxdb]]
  urls = ["http://192.168.x.y:8086/"]
  database = "telegraf"
  namedrop = ["_test"]
  metric_buffer_limit = 100000
  [outputs.influxdb.tagdrop]
    influxdb_database = [""]

[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "pmi"
  tagexclude = ["influxdb_database"]
  [outputs.influxdb.tagpass]
    influxdb_database = ["pmi"]

[[outputs.influxdb]]
  urls = ["http://192.168.x.y:8086/"]
  database = "pmi"
  metric_buffer_limit = 100000
  tagexclude = ["influxdb_database"]
  [outputs.influxdb.tagpass]
    influxdb_database = ["pmi"]

[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "telegraf"
  tagexclude = ["influxdb_database"]
  [outputs.influxdb.tagpass]
    influxdb_database = ["idc"]

[[outputs.influxdb]]
  urls = ["http://192.168.x.y:8086/"]
  database = "telegraf"
  tagexclude = ["influxdb_database"]
  [outputs.influxdb.tagpass]
    influxdb_database = ["idc"]
```
When I configure the firewall in front of 192.168.x.y to silently drop the packets, the following happens:
```
Nov 18 11:01:00 telegraf[11714]: 2019-11-18T10:01:00Z D! [outputs.influxdb] Wrote batch of 51 metrics in 61.231974ms
Nov 18 11:01:00 telegraf[11714]: 2019-11-18T10:01:00Z D! [outputs.influxdb] Buffer fullness: 1 / 100000 metrics
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z E! [outputs.influxdb] when writing to [http://192.168.x.y:8086/]: Post http://192.168.x.y:8086/write?db=telegraf: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z D! [outputs.influxdb] Buffer fullness: 787 / 100000 metrics
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z E! [outputs.influxdb] when writing to [http://192.168.x.y:8086/]: Post http://192.168.x.y:8086/write?db=telegraf: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z D! [outputs.influxdb] Buffer fullness: 946 / 100000 metrics
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z E! [outputs.influxdb] when writing to [http://192.168.x.y:8086/]: Post http://192.168.x.y:8086/write?db=pmi: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z D! [outputs.influxdb] Buffer fullness: 1000 / 100000 metrics
Nov 18 11:01:05 telegraf[11714]: 2019-11-18T10:01:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
Nov 18 11:01:10 telegraf[11714]: 2019-11-18T10:01:10Z D! [outputs.influxdb] Buffer fullness: 0 / 100000 metrics
Nov 18 11:01:10 telegraf[11714]: 2019-11-18T10:01:10Z D! [outputs.influxdb] Wrote batch of 23 metrics in 96.375817ms
Nov 18 11:01:10 telegraf[11714]: 2019-11-18T10:01:10Z D! [outputs.influxdb] Buffer fullness: 17 / 100000 metrics
```
This keeps happening, and no data is written to any of the outputs; I see a gap in the graphs.
When the unreachable output comes back online, I see this:
```
Nov 18 11:03:40 telegraf[11714]: panic: channel is full
Nov 18 11:03:40 telegraf[11714]: goroutine 10357 [running]:
Nov 18 11:03:40 telegraf[11714]: github.com/influxdata/telegraf/agent.(*trackingAccumulator).onDelivery(0xc000292780, 0x2c11e80, 0xc002bff860)
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/agent/accumulator.go:167 +0x7a
Nov 18 11:03:40 telegraf[11714]: github.com/influxdata/telegraf/metric.(*trackingData).notify(...)
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/metric/tracking.go:73
Nov 18 11:03:40 telegraf[11714]: github.com/influxdata/telegraf/metric.(*trackingMetric).decr(0xc00174ea60)
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/metric/tracking.go:163 +0x9e
Nov 18 11:03:40 telegraf[11714]: github.com/influxdata/telegraf/metric.(*trackingMetric).Accept(0xc00174ea60)
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/metric/tracking.go:144 +0x3a
Nov 18 11:03:40 telegraf[11714]: github.com/influxdata/telegraf/internal/models.(*Buffer).metricWritten(0xc0001c2fa0, 0x2c72240, 0xc00174ea60)
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/internal/models/buffer.go:93 +0x72
Nov 18 11:03:40 telegraf[11714]: github.com/influxdata/telegraf/internal/models.(*Buffer).Accept(0xc0001c2fa0, 0xc002092000, 0x30, 0x30)
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/internal/models/buffer.go:179 +0xa6
Nov 18 11:03:40 telegraf[11714]: github.com/influxdata/telegraf/internal/models.(*RunningOutput).Write(0xc0001aa280, 0x0, 0xc000560660)
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/internal/models/running_output.go:190 +0xf7
Nov 18 11:03:40 telegraf[11714]: github.com/influxdata/telegraf/agent.(*Agent).flushOnce.func1(0xc001755b00, 0xc0016d7bc0)
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/agent/agent.go:597 +0x27
Nov 18 11:03:40 telegraf[11714]: created by github.com/influxdata/telegraf/agent.(*Agent).flushOnce
Nov 18 11:03:40 telegraf[11714]: #011/go/src/github.com/influxdata/telegraf/agent/agent.go:596 +0xc8
Nov 18 11:03:40 systemd[1]: telegraf.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 18 11:03:40 systemd[1]: telegraf.service: Unit entered failed state.
Nov 18 11:03:40 systemd[1]: telegraf.service: Failed with result 'exit-code'.
Nov 18 11:03:40 systemd[1]: telegraf.service: Service hold-off time over, scheduling restart.
```
Telegraf panics, and systemd restarts it. After that, data flows in again.
Thanks for looking into this!
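For context, here is a minimal sketch of how a "panic: channel is full" can arise in Go. This is an illustration of the general pattern (a non-blocking send into a bounded delivery-notification channel that panics in the default branch), not Telegraf's actual code; the `accumulator` type and `onDelivery` method below are hypothetical stand-ins:

```go
package main

import "fmt"

// accumulator is a hypothetical stand-in for a component that collects
// delivery notifications in a bounded buffered channel.
type accumulator struct {
	delivered chan string
}

// onDelivery performs a non-blocking send. If nothing drains the channel
// fast enough (e.g. a burst of acknowledgements after an outage), the
// default branch fires and the process panics.
func (a *accumulator) onDelivery(info string) {
	select {
	case a.delivered <- info:
	default:
		panic("channel is full")
	}
}

func main() {
	a := &accumulator{delivered: make(chan string, 2)}
	a.onDelivery("metric-1")
	a.onDelivery("metric-2")
	fmt.Println("buffered:", len(a.delivered))
	// With nobody reading, a third notification would hit the default
	// branch and panic:
	// a.onDelivery("metric-3") // panic: channel is full
}
```

That would be consistent with the timeline above: while the remote endpoint is down, undelivered metrics pile up, and when it comes back, a burst of delivery notifications overwhelms a fixed-size channel.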