Hi there,
I have the following usage scenario. Iptables counters from different servers are stored in an influxdb table. My goal is to visualize traffic graphs, in Bits/second, based on these counters. I also want to be able to handle the case when a telegraf instance on given server may be stopped/restarted and thus few samples from given server could be missing but the iptables counters wont be reset and will still increase on the server in the mean time. Last but not least some colums come only from given server.
So, here is an example values from the table. The data are collected every 60 seconds from telegraf running on given server.
select (8 * non_negative_derivative(“class:http:local:out:bytes”)) as http, (8 * non_negative_derivative(“class:p3hit:in:bytes”)) as p2p from xtables where host =~ /^exmh-/ and time >= 1564394279563ms and time <= 1564394642693ms group by host
name: xtables
tags: host=exmh-a
time http p2p
---- ---- ---
1564394340000000000 1038037408.1333333
1564394400000000000 1080546471.4666667
1564394460000000000 1171599072.4
1564394520000000000 1079397346.6666667
1564394580000000000 1202226452.5333333
1564394640000000000 1338852115.7333333
name: xtables
tags: host=exmh-http
time http p2p
---- ---- ---
1564394340000000000 1664832943.0666666
1564394400000000000 1613622968.5333333
1564394460000000000 1671066106.6666667
1564394580000000000 1701544141.4
1564394640000000000 1684389083.7333333
Note that the http is missing from the exmh-a server, the p2p is missing from exmh-http server and there are no entry for time 1564394520000000000 from exmh-http server because the telegraf was restarted there.
And so I’m trying to sum the internet and http and p2p columns and present them in one graph. So, I’m using nested select statements.
select sum(http) as HTTP, sum(p2p) as P2P from (select (8 * non_negative_derivative(“class:http:local:out:bytes”)) as http, (8 * non_negative_derivative(“class:p3hit:in:bytes”)) as p2p from xtables where host =~ /^exmh-/ and time >= 1564394279563ms and time <= 1564394642693ms group by host) where time >= 1564394279563ms and time <= 1564394642693ms group by time(1s) fill(none)
name: xtables
time HTTP P2P
---- ---- ---
1564394340000000000 1664832943.0666666 1038037408.1333333
1564394400000000000 1613622968.5333333 1080546471.4666667
1564394460000000000 1671066106.6666667 1171599072.4
1564394520000000000 1079397346.6666667
1564394580000000000 1701544141.4 1202226452.5333333
1564394640000000000 1684389083.7333333 1338852115.7333333
Few things to note here which I’m not sure why work in this way.
- I need secod ‘time>= … and time<=’ with the same values otherwise it returns all timestamps up to now. I would expect the outer select to work only on the data returned from the inner select but probably I misunderstand how the things work internally?
- The ‘group by time’ needs to be for 1s instead of 60s altough the inner query results are only for every 60s. This is probably because the non_negative_derivative by default is rate per 1s? Also this grouping per 1s gives me a lot of empty values between the real ones and I skip them using fill(none).
And here comes the strangest thing for me. I would like to present HTTP, P2P and Total (HTTP + P2P) on single grafana graph. However, the missing HTTP value from 1564394520000000000 makes the HTTP and Total to be removed from the graph at this time point. Grafana feature ‘null as zero’ doesn’t help because then the Total becomes zero at this time point which is incorrect.
So, I tried to fill zeroes in the inner most query, like this:
select sum(http) as HTTP, sum(p2p) as P2P from (select (8non_negative_derivative(“class:http:local:out:bytes”)) as http, (8non_negative_derivative(“class:p3hit:in:bytes”)) as p2p from xtables where host =~ /^exmh-/ and time >= 1564394279563ms and time <= 1564394642693ms group by host fill(0)) where time >= 1564394279563ms and time <= 1564394642693ms group by time(1s) fill(none)
name: xtables
time HTTP P2P
---- ---- ---
1564394340000000000 0 1038037408.1333333
1564394400000000000 0 1080546471.4666667
1564394460000000000 0 1171599072.4
1564394520000000000 0 1079397346.6666667
1564394580000000000 0 1202226452.5333333
1564394640000000000 0 1338852115.7333333
1564394340000000000 1664832943.0666666 0
1564394400000000000 1613622968.5333333 0
1564394460000000000 1671066106.6666667 0
1564394580000000000 1701544141.4 0
1564394640000000000 1684389083.7333333 0
I can’t understand why fill(0) leads to doubling the values per time point. Shouldn’t the outer sum + group by time lead to summing all values for given time point?
So to summarize my questions:
- Why the outer ‘time >= 1564394279563ms and time <= 1564394642693ms’ is needed to limit the returned values which seems already limitted by the inner such clause?
- Why the ‘group by time(60s)’ leads to wrong results even though the actual values from the inner select are set every 60s?
- Why adding ‘fill(0)’ in the inner select leads to double values from the full query? Is there a way to avoid this behavior?
Thanks,
Pavel.