Sum col A where col B = X

Hey all,

I am new to time series data and very new to Influx, but really excited about v2 OSS. I’m sure there is a simple way to do this but I have not been able to find a solution in the past few days.

Data:
_field = ‘ip’ where _value is the ip address
_field = ‘10sec’ where _value is bytes

I’ll outline what I’m trying to accomplish:

  1. take ‘ip’ and reduce it to distinct [or unique?] values
  2. Sum ‘10sec’ per those distinct values in ‘ip’
  3. Ideally this query would also have a window function to sum ‘10sec’ on ‘ip’ criteria each hour for the past 12 hours. Or something like that, but just getting items 1 & 2 would be great.

So the result would be total bytes transmitted by ip each hour.

I have been able to get the distinct values for ‘ip’, but I am having a terrible time trying to sum.

import "influxdata/influxdb/schema"

from(bucket: "monitor")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_field"] == "ip")
  |> schema.fieldsAsCols()
  |> distinct(column: "ip")

I apologize in advance if I am not using the correct terminology. Let me know if I can provide any additional context.

Thank you!

Hi there,

I haven’t tested this but I think it will help get you closer.
Firstly, I think you need to filter for the “10sec” field and then I think you want to group instead of distinct.

import "influxdata/influxdb/schema"

from(bucket: "monitor")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_field"] =~ /ip|10sec/)
  |> schema.fieldsAsCols()
  |> group(columns: ["ip"])
// I often find grouping messes with the time order
  |> sort(columns:["_time"])
  |> sum(columns:["10sec"])

This should give you the sum of 10sec for each ip over the selected time range.

Thanks,
Tom

@thopewell - This is great! Thanks for the insight. Exactly what I needed.

One point for anyone using this as reference: change the last line to

|> sum(column:"10sec")

Great Flux community!