Aggregating cumulative_sum

query
influxdb
#1

I am metering network connections. I would like to know how many unique devices are connected to my network.
To achieve this, I create connection event every time a device connects to my system.
It produces an entry such as:
[ timestamp, mac_address="ab_cd_ef_12", amount =1 ]

In grafana, I would like to graph over time the total amount of distinct mac addresses connected to my system from the beginning of each day. I use this query:

select cumulative_sum(*) from ( 
  select * from (
    select  first(amount),mac_address from (
      SELECT *,mac_address FROM radius_mac
    ) where $timeFilter group by mac_address
  ) 
)

In my next query, I would like to create a (histogram) bar chart that will display the same number, aggregated hourly. So I thought I would take the result of the previous cumulative_sum query, and select the maximum value for each hour.
My logic: If I had seen a total of 3 unique devices at 5:05pm, and then later, 2 more devices joined for a total of 5 devices at 5:10pm, then my histogram should show 5 devices for 5pm.
And so I constructed the folowing query:

select max(*) from (
	select cumulative_sum(*) from ( 
		select * from (
			select  first(amount),mac_address from (
				SELECT *,mac_address FROM radius_mac
			) where $timeFilter group by mac_address
		) 
	)
) group by (time(1h))

However, to my surprise, I found that the second aggregation did not work as expected. What I got instead was: [5pm , 10 connections]. Somehow the max(cumulative_sum) summed up 3+5=8 instead of max(3,5)=5.

So my questions:

  1. Why am I getting max(3,7)=10 instead of 7?
  2. Is there another way to get this histogram with a completely different query?
  3. Would it help if I use FluxQL? (I don’t know FluxQL instead of SQL)

PS: I am using InfluxDB 1.6 with Grafana 5.4.2

Here is the good result of the first query:

{
  "xhrStatus": "complete",
  "request": {
    "method": "GET",
    "url": "api/datasources/proxy/1/query",
    "params": {
      "db": "auth_service",
      "q": "select cumulative_sum(*) from ( select * from (select  first(amount) from (SELECT *,mac_address FROM radius_mac  ) where time >= 1550068046504ms and time <= 1550073154203ms group by mac_address) )",
      "epoch": "ms"
    },
    "data": null,
    "precision": "ms"
  },
  "response": {
    "results": [
      {
        "statement_id": 0,
        "series": [
          {
            "name": "radius_mac",
            "columns": [
              "time",
              "cumulative_sum_first"
            ],
            "values": [
              [
                1550068574350,
                1
              ],
              [
                1550068608823,
                2
              ],
              [
                1550072112129,
                3
              ],
              [
                1550072351825,
                4
              ]
            ]
          }
        ]
      }
    ]
  }
}

Here is the bad result of the second query:

{
  "xhrStatus": "complete",
  "request": {
    "method": "GET",
    "url": "api/datasources/proxy/1/query",
    "params": {
      "db": "auth_service",
      "q": ";select max(*) from (\n\tselect cumulative_sum(*) from ( \n\t\tselect * from (\n\t\t\tselect  first(amount),mac_address from (\n\t\t\t\tSELECT *,mac_address FROM radius_mac\n\t\t\t) where time >= 1550068046504ms and time <= 1550073154203ms group by mac_address\n\t\t) \n\t)\n) where time >= 1550068046504ms and time <= 1550073154203ms group by (time(1h))",
      "epoch": "ms"
    },
    "data": null,
    "precision": "ms"
  },
  "response": {
    "results": [
      {
        "statement_id": 0,
        "series": [
          {
            "name": "radius_mac",
            "columns": [
              "time",
              "max_cumulative_sum_first"
            ],
            "values": [
              [
                1550066400000,
                2
              ],
              [
                1550070000000,
                6
              ]
            ]
          }
        ]
      }
    ]
  }
}

As you can see, the cumulitive_sum looks like it is run doubly…