Network data rate over subset of interfaces

#1

Typical InfluxDB/Telegraf user here (v1.5.2) – for each monitored host I’d like to obtain the amount of mean incoming network traffic per second for a given period of time (e.g. 15min), but limited to a subset of interfaces (e.g. eth1 and eth2). Basically, what I’d like to do is:

  1. sum up the net.bytes_recv values over all relevant interfaces for the host
  2. downsample to 10s or 1m
  3. use non_negative_derivative(first(), 1s) on the resulting values
  4. then use mean() over the resulting series to obtain a single value

I assumed this would be a fairly common and straightforward thing to ask, but apparently it’s either not or I’m thinking in a too complicated way. I’ve tried:

SELECT mean(received) as "mean_received" 
  FROM (SELECT non_negative_derivative(first("bytes_recv"), 1s) AS "received" 
          FROM "telegraf"."autogen"."net" 
          WHERE time > now() - 15m 
            AND ("interface"='ens5f0' OR "interface"='ens5f1' OR "interface"='ens5f2' OR "interface"='ens5f3') 
            AND "host" = 'myhost'
          GROUP BY time(1m))

but apparently step 1, the summing up of values across interfaces, is not working and my feeling is that I am doing something wrong. For example, it can be that two interfaces do not see any traffic, but they still influence the result. That is, if I remove them from the interface-limiting WHERE…AND clause, then the result changes (increases). I am suspecting that the zero entries in the series are somehow still counted as no summing up is done. Any ideas on how to approach this?

Thanks a lot in advance!
S.

#2

I have now done it like this, by doing the summing up across interface tags and the calculation of the mean in one go:

SELECT sum(bytes_derivative)/15
  FROM (SELECT non_negative_derivative(first(bytes_recv), 1s) AS "bytes_derivative"
          FROM "telegraf"."autogen"."net"
          WHERE time > now() - 15m
            AND host='myhost'
            AND interface =~ /^en[ps]/ GROUP BY time(1m), interface);

We just sum up values across all interfaces that match the regex and then divide by the expected number of items in each subset defined by the interface tags (15 – one per minute). This is functionally equivalent to calculating the arithmetic mean of the sums of the individual lines, which is what we want here, at the expense of having to adjust more than one part of the query if we want to modify the time period considered.
I feel it’s a bit hacky for my taste… is there a cleaner way to do it?

I also tried combining multiple subqueries:

SELECT non_negative_derivative(total_recv) 
  FROM (SELECT bytes_recv1 + bytes_recv2 + bytes_recv3 + bytes_recv4 AS total_recv 
    FROM (SELECT bytes_recv AS "bytes_recv1" 
          FROM "telegraf"."autogen"."net" 
          WHERE time > now() - 15m  
          AND "interface"='ens5f0' 
          AND "host" = 'myhost'),
         (SELECT bytes_recv AS "bytes_recv2"
          FROM "telegraf"."autogen"."net"
          WHERE time > now() - 15m
          AND "interface"='ens5f1' 
          AND "host" = 'myhost'), 
         (SELECT bytes_recv AS "bytes_recv3" 
          FROM "telegraf"."autogen"."net" 
          WHERE time > now() - 15m  
          AND "interface"='ens5f2' 
          AND "host" = 'myhost'),
         (SELECT bytes_recv AS "bytes_recv4" 
          FROM "telegraf"."autogen"."net" 
          WHERE time > now() - 15m  
          AND "interface"='ens5f3' 
          AND "host" = 'myhost') 
    fill(0))

but unfortunately this caused the individual lines in the subquery to ‘misalign’ sometimes, giving a wrong result. Any comments on this approach?