Calculate percentage using two queries

Hello everyone.

I’m stuck in trying to create a percentage of a single table, but with different values.

I have this telegraf plugin.exec which is uploading data to table patch_all.
patch_all has one single field called value which match with the hostname and timestamp.

    from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value >= 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Total")
  |> group(columns: ["tag"])
  |> count()

    from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Bad")
  |> group(columns: ["tag"])
  |> count()```


So now I have two queries with return the **total** of hosts and another one with the total of hosts with value >0.

Now I do need to get the percentage of BAD out of TOTAL.
$BAD / $TOTAL * 100 = 

How do I can achieve that with two queries into to the same table/bucket?


Thanks

Hello @jcorrea,
I’m assuming you want second query/ (first query) * 100?
Based off the set.
But I am confused because the values that the queries return are identical so you should get 100…because the queries are basically identical.
If you’re grouping by hosts do you expect multiple values from each total and bad for each host? Or just one?

For just one result from each query I’d do:

import "experimental/array"

 q1 = (from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value >= 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Total")
  |> group(columns: ["tag"])
  |> count()
  |> findRecord( fn: (key) => true, idx: 0))._value

  q2 = (from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Bad")
  |> group(columns: ["tag"])
  |> count()
  |> findRecord( fn: (key) => true, idx: 0))._value

perc = q2/q1 * 100.0

array.from(rows: [{percentage: perc},])

If you’re expecting multiple results from each query I’d do:

q1 =  from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Total")
  |> group(columns: ["tag"])
  |> count()

q2 =  from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Bad")
  |> group(columns: ["tag"])
  |> count()

join(tables: {q1: q1, q2: q2}, on: ["_time", "host"])
    |> map(fn: (r) => ({r with _value: r._value_q2/r._value_q1 * 100.0}))

I think. it’s hard be 100% sure without seeing the shape of your data.

@Anaisdg thank you for replying.

The query works… but still cannot make the join or the percentage to work.

q1 = from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /p/)
  |> filter(fn: (r) => r._value >= 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Total")
  |> group(columns: ["tag"])
  |> count()
  |> yield(name: "Total")
  

q2 = from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /p/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Bad")
  |> group(columns: ["tag"])
  |> count()
  |> yield(name: "Bad")

I’m about to lost my mind over this… it seems to be so simple, but can’t make it work.

Thanks for the help

my data look like this:

So I basically have two results.
value >= 0 (which means is all hosts)
value > 0 (which means is hosts with patch available)

I do need to find the percentage of hosts with patches available.

$BAD / $TOTAL * 100

@jcorrea There are a couple of different ways you can do this. Here’s an option that is pretty well optimized and should give you what you’re looking for:

import "array"

data = () =>
    from(bucket: "telegraf")
        |> range(start: -6h)
        |> filter(fn: (r) => r["_measurement"] == "patch_all")
        |> filter(fn: (r) => r["_field"] == "value")
        |> filter(fn: (r) => r["host"] =~ /p/)
        |> group()

total =
    (data()
        |> count()
        |> findColumn(fn: (key) => true, column: "_value")
    )[0]

bad =
    (data()
        |> filter(fn: (r) => r._value > 0)
        |> count()
        |> findColumn(fn: (key) => true, column: "_value")
    )[0]

array.from(rows: [{percentage: float(v: bad) / float(v: total) * 100.0}])