Calculate percentage using two queries

jcorrea · November 11, 2022, 2:37pm

Hello everyone.

I’m stuck in trying to create a percentage of a single table, but with different values.

I have this telegraf plugin.exec which is uploading data to table patch_all.
patch_all has one single field called value which match with the hostname and timestamp.

    from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value >= 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Total")
  |> group(columns: ["tag"])
  |> count()

    from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Bad")
  |> group(columns: ["tag"])
  |> count()```


So now I have two queries with return the **total** of hosts and another one with the total of hosts with value >0.

Now I do need to get the percentage of BAD out of TOTAL.
$BAD / $TOTAL * 100 = 

How do I can achieve that with two queries into to the same table/bucket?


Thanks

Anaisdg · November 15, 2022, 5:33pm

Hello @jcorrea,
I’m assuming you want second query/ (first query) * 100?
Based off the set.
But I am confused because the values that the queries return are identical so you should get 100…because the queries are basically identical.
If you’re grouping by hosts do you expect multiple values from each total and bad for each host? Or just one?

For just one result from each query I’d do:

import "experimental/array"

 q1 = (from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value >= 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Total")
  |> group(columns: ["tag"])
  |> count()
  |> findRecord( fn: (key) => true, idx: 0))._value

  q2 = (from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Bad")
  |> group(columns: ["tag"])
  |> count()
  |> findRecord( fn: (key) => true, idx: 0))._value

perc = q2/q1 * 100.0

array.from(rows: [{percentage: perc},])

If you’re expecting multiple results from each query I’d do:

q1 =  from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Total")
  |> group(columns: ["tag"])
  |> count()

q2 =  from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /q/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Bad")
  |> group(columns: ["tag"])
  |> count()

join(tables: {q1: q1, q2: q2}, on: ["_time", "host"])
    |> map(fn: (r) => ({r with _value: r._value_q2/r._value_q1 * 100.0}))

I think. it’s hard be 100% sure without seeing the shape of your data.

jcorrea · December 8, 2022, 6:58pm

@Anaisdg thank you for replying.

The query works… but still cannot make the join or the percentage to work.

q1 = from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /p/)
  |> filter(fn: (r) => r._value >= 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Total")
  |> group(columns: ["tag"])
  |> count()
  |> yield(name: "Total")
  

q2 = from(bucket: "telegraf")
  |> range(start: -6h)
  |> filter(fn: (r) => r["_measurement"] == "patch_all")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] =~ /p/)
  |> filter(fn: (r) => r._value > 0)
  |> aggregateWindow(every: 6h, fn: last, createEmpty: false)
  |> set(key: "tag", value: "Bad")
  |> group(columns: ["tag"])
  |> count()
  |> yield(name: "Bad")

I’m about to lost my mind over this… it seems to be so simple, but can’t make it work.

Thanks for the help

jcorrea · December 8, 2022, 6:59pm

my data look like this:

jcorrea · December 8, 2022, 8:35pm

So I basically have two results.
value >= 0 (which means is all hosts)
value > 0 (which means is hosts with patch available)

I do need to find the percentage of hosts with patches available.

$BAD / $TOTAL * 100

scott · December 8, 2022, 10:35pm

@jcorrea There are a couple of different ways you can do this. Here’s an option that is pretty well optimized and should give you what you’re looking for:

import "array"

data = () =>
    from(bucket: "telegraf")
        |> range(start: -6h)
        |> filter(fn: (r) => r["_measurement"] == "patch_all")
        |> filter(fn: (r) => r["_field"] == "value")
        |> filter(fn: (r) => r["host"] =~ /p/)
        |> group()

total =
    (data()
        |> count()
        |> findColumn(fn: (key) => true, column: "_value")
    )[0]

bad =
    (data()
        |> filter(fn: (r) => r._value > 0)
        |> count()
        |> findColumn(fn: (key) => true, column: "_value")
    )[0]

array.from(rows: [{percentage: float(v: bad) / float(v: total) * 100.0}])

Topic		Replies	Views
Count values for percentage Fluxlang grafana , flux	2	505	July 8, 2021
FluxQL query with percentage calculation for table with time on columns Fluxlang influxdb , flux	2	361	August 2, 2023
Divide 2 queries results in InfluxDB Telegraf	2	2179	October 16, 2020
Flux - Append calculated field (percentage) and combine with results from different datasets Fluxlang flux	3	1539	October 20, 2021
Count() sum() on the same measurement InfluxDB 2	2	560	October 7, 2022

Calculate percentage using two queries

Related topics