Telegraf inputs.net not correctly reporting bytes_sent on single interface (FreeBSD)

influxdb
telegraf

#1

I’m using Telegraf on an OPNsense system (FreeBSD 11.1-RELEASE-p6) with six Ethernet interfaces. One interface, em0, is sending bad data to InfluxDB for bytes_sent: the value never changes between samples. The other interfaces work as expected, and bytes_recv is reported correctly on em0.

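I tried creating a new InfluxDB database, but that made no difference, so the problem is probably on the Telegraf side. The output of telegraf --test below shows the same stuck value before anything is written to InfluxDB.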

I also saw the same issue on a pfSense system, which is also based on FreeBSD.

> SELECT "bytes_sent" FROM "net" WHERE ("interface" = 'em0' AND "host" =~ /^opnsense-1$/) AND time >= now() - 1m
name: net
time                bytes_sent
----                ----------
1520636495000000000 8628839
1520636505000000000 8628839
1520636515000000000 8628839
1520636525000000000 8628839
1520636535000000000 8628839
1520636545000000000 8628839


% netstat -I em0 -b
    Name    Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
    em0    1500 <Link#1>      00:0c:29:30:b0:13    77935     0     0 14337154278    52496     0    8628839     0
    em0       - <1.2.3.4>/    <1.2.3.4>         19197387     -     - 10626667850 34283153     - 17564605980    -

% telegraf --input-filter net --test | grep em0
> net,interface=em0,host=opnsense-1 err_out=0i,drop_in=0i,drop_out=0i,bytes_sent=8628839i,bytes_recv=14906910189i,packets_sent=52496i,packets_recv=77935i,err_in=0i 1520637784000000000

% cat telegraf.conf

[global_tags]

[agent]
  interval = "10s"
  round_interval = false
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = true
  logfile = "/var/log/telegraf.log"
  hostname = "opnsense-1"
  omit_hostname = false

[[outputs.influxdb]]
  urls = ["https://influxdb:8086"]
  database = "telegraf"
  retention_policy = ""
  write_consistency = "any"
  timeout = "5s"
  username = "<redacted>"
  password = "<redacted>"

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false

[[inputs.disk]]

[[inputs.diskio]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.swap]]

[[inputs.system]]

[[inputs.net]]

#2

I get the same issue using collectd. This seems to be a FreeBSD thing.


#3

We use gopsutil to gather this information; here is the function we call. It looks like it runs netstat -ibdnW and takes the first entry for each interface. If we can come up with a better method, we could open a pull request to have it modified.
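
To illustrate why "first entry for each interface" matters here, this is a minimal sketch of that selection rule. It is not gopsutil's actual code: the function name firstRowPerInterface and the struct ifaceCounters are made up for this example, and the column positions are assumed from the netstat -ibdnW header shown elsewhere in this thread. The sample rows use the counter values posted in this thread, with the redacted address replaced by a documentation placeholder.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// ifaceCounters holds the two byte counters discussed in this thread.
// Hypothetical type, not part of gopsutil.
type ifaceCounters struct {
	BytesRecv uint64
	BytesSent uint64
}

// firstRowPerInterface keeps only the first parsable netstat row seen for
// each interface name. On FreeBSD that is normally the <Link#N> row, so the
// per-address rows that follow it are ignored.
func firstRowPerInterface(out string) map[string]ifaceCounters {
	result := make(map[string]ifaceCounters)
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 10 || f[0] == "Name" {
			continue // skip the header and short lines
		}
		name := f[0]
		if _, seen := result[name]; seen {
			continue // a row for this interface was already taken
		}
		// Count columns from the right so an empty Address column does not
		// shift the indexes: ... Ibytes Opkts Oerrs Obytes Coll Drop
		n := len(f)
		recv, errIn := strconv.ParseUint(f[n-6], 10, 64)
		sent, errOut := strconv.ParseUint(f[n-3], 10, 64)
		if errIn != nil || errOut != nil {
			continue // counters shown as "-" or otherwise unparsable
		}
		result[name] = ifaceCounters{BytesRecv: recv, BytesSent: sent}
	}
	return result
}

// Counter values are from this thread; 192.0.2.x is a placeholder address.
const sample = `Name Mtu Network Address Ipkts Ierrs Idrop Ibytes Opkts Oerrs Obytes Coll Drop
em0 1500 <Link#1> 00:0c:29:30:b0:13 29895 0 0 17052983977 46293 0 41289211 0 0
em0 - 192.0.2.0/24 192.0.2.1 12001664 - - 15325736646 16406071 - 1390516328 - -
`

func main() {
	c := firstRowPerInterface(sample)["em0"]
	fmt.Println(c.BytesRecv, c.BytesSent)
}
```

With this rule, whatever the <Link#1> row reports is what ends up in bytes_sent, so a link-level counter that stops incrementing produces exactly the flat series seen in the query above.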


#4

Interesting. Running netstat -ibdnW shows that the link-level counters are the ones that are stuck: Opkts and Obytes on the <Link#1> row never change, while the same counters on the IPv4 row increment normally.

% netstat -ibdnW
Name              Mtu Network                  Address                                    Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll  Drop
em0              1500 <Link#1>                 00:0c:29:30:b0:13                          29895     0     0 17052983977    46293     0   41289211     0     0
em0                 - xx.xxx.xxx.x/xx          xx.xxx.xxx.xxx                          12001664     -     - 15325736646 16406071     - 1390516328     -     -
<snip>

Odder still, there are more than half a dozen interfaces in this system, all working normally except for this one. I’ll keep poking at it, and thank you for pointing me at the collection method.
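
One way to spot affected interfaces from a single netstat -ibdnW snapshot might be to compare the two rows. The sketch below assumes the link-level row should count at least as many sent bytes as any per-address row of the same interface, and that the <Link#N> row is printed before the address rows; both are assumptions, not documented guarantees. The function name suspectInterfaces is hypothetical, and the sample again uses the counters posted above with a placeholder address.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// suspectInterfaces returns the names of interfaces whose link-level Obytes
// is lower than the Obytes of one of their address rows. That is treated
// here as a hint that the link-level counter is not being updated.
func suspectInterfaces(out string) []string {
	linkSent := make(map[string]uint64) // Obytes from each <Link#N> row
	flagged := make(map[string]bool)    // avoid listing a name twice
	var suspects []string
	for _, line := range strings.Split(out, "\n") {
		f := strings.Fields(line)
		if len(f) < 10 || f[0] == "Name" {
			continue // skip the header and short lines
		}
		name, n := f[0], len(f)
		// Obytes is the third column from the right when -d adds Drop.
		sent, err := strconv.ParseUint(f[n-3], 10, 64)
		if err != nil {
			continue
		}
		if strings.HasPrefix(f[2], "<Link#") {
			linkSent[name] = sent
			continue
		}
		link, ok := linkSent[name]
		if ok && sent > link && !flagged[name] {
			flagged[name] = true
			suspects = append(suspects, name)
		}
	}
	return suspects
}

// Counter values are from this thread; 192.0.2.x is a placeholder address.
const sample = `Name Mtu Network Address Ipkts Ierrs Idrop Ibytes Opkts Oerrs Obytes Coll Drop
em0 1500 <Link#1> 00:0c:29:30:b0:13 29895 0 0 17052983977 46293 0 41289211 0 0
em0 - 192.0.2.0/24 192.0.2.1 12001664 - - 15325736646 16406071 - 1390516328 - -
`

func main() {
	fmt.Println(suspectInterfaces(sample))
}
```

For the em0 rows above, the IPv4 row reports 1390516328 bytes sent against 41289211 on the link row, so em0 is flagged. This only detects the symptom; it does not explain why the driver or kernel stops updating the link-level counter.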


#5

I ran into the same issue today on OPNsense 18.7 (FreeBSD 11.1-RELEASE-p11). em0 is reporting the same value for bytes_sent over and over.

netstat -ibdnW shows the same behavior that @kendokan described: the link-level output counters for em0 are stuck. The other interface on the same host reports correct values.