I am metering network connections. I would like to know how many unique devices are connected to my network.
To achieve this, I create connection event every time a device connects to my system.
It produces an entry such as:
[ timestamp, mac_address="ab_cd_ef_12", amount =1 ]
In grafana, I would like to graph over time the total amount of distinct mac addresses connected to my system from the beginning of each day. I use this query:
select cumulative_sum(*) from (
select * from (
select first(amount),mac_address from (
SELECT *,mac_address FROM radius_mac
) where $timeFilter group by mac_address
)
)
In my next query, I would like to create a (histogram) bar chart that will display the same number, aggregated hourly. So I thought I would take the result of the previous cumulative_sum query, and select the maximum value for each hour.
My logic: If I had seen a total of 3 unique devices at 5:05pm, and then later, 2 more devices joined for a total of 5 devices at 5:10pm, then my histogram should show 5 devices for 5pm.
And so I constructed the folowing query:
select max(*) from (
select cumulative_sum(*) from (
select * from (
select first(amount),mac_address from (
SELECT *,mac_address FROM radius_mac
) where $timeFilter group by mac_address
)
)
) group by (time(1h))
However, to my surprise, I found that the second aggregation did not work as expected. What I got instead was: [5pm , 10 connections]. Somehow the max(cumulative_sum) summed up 3+5=8 instead of max(3,5)=5.
So my questions:
- Why am I getting max(3,7)=10 instead of 7?
- Is there another way to get this histogram with a completely different query?
- Would it help if I use FluxQL? (I don’t know FluxQL instead of SQL)
PS: I am using InfluxDB 1.6 with Grafana 5.4.2
Here is the good result of the first query:
{
"xhrStatus": "complete",
"request": {
"method": "GET",
"url": "api/datasources/proxy/1/query",
"params": {
"db": "auth_service",
"q": "select cumulative_sum(*) from ( select * from (select first(amount) from (SELECT *,mac_address FROM radius_mac ) where time >= 1550068046504ms and time <= 1550073154203ms group by mac_address) )",
"epoch": "ms"
},
"data": null,
"precision": "ms"
},
"response": {
"results": [
{
"statement_id": 0,
"series": [
{
"name": "radius_mac",
"columns": [
"time",
"cumulative_sum_first"
],
"values": [
[
1550068574350,
1
],
[
1550068608823,
2
],
[
1550072112129,
3
],
[
1550072351825,
4
]
]
}
]
}
]
}
}
Here is the bad result of the second query:
{
"xhrStatus": "complete",
"request": {
"method": "GET",
"url": "api/datasources/proxy/1/query",
"params": {
"db": "auth_service",
"q": ";select max(*) from (\n\tselect cumulative_sum(*) from ( \n\t\tselect * from (\n\t\t\tselect first(amount),mac_address from (\n\t\t\t\tSELECT *,mac_address FROM radius_mac\n\t\t\t) where time >= 1550068046504ms and time <= 1550073154203ms group by mac_address\n\t\t) \n\t)\n) where time >= 1550068046504ms and time <= 1550073154203ms group by (time(1h))",
"epoch": "ms"
},
"data": null,
"precision": "ms"
},
"response": {
"results": [
{
"statement_id": 0,
"series": [
{
"name": "radius_mac",
"columns": [
"time",
"max_cumulative_sum_first"
],
"values": [
[
1550066400000,
2
],
[
1550070000000,
6
]
]
}
]
}
]
}
}
As you can see, the cumulitive_sum looks like it is run doubly…