Flux - Append calculated field (percentage) and combine with results from different datasets

I’m struggling with an Influx 2 query in Flux on how to join and map data from two differents sets (tables) into a specific desired output.

My current Flux query is this:

data = from(bucket: "foo")
    |> range(start:-1d)
    |> filter(fn: (r) => r._measurement == "io")
    |> filter(fn: (r) => r["device_id"] == "12345")
    |> filter(fn: (r) => r._field == "status_id" )

    # count the total points
    totals = data
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "total_count")

    # calculate the amount of onlines points (e.g. status = '1')
    onlines = data
    |> filter(fn: (r) => r._value == 1)
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "online_count")

    union(tables: [totals, onlines])

This returns as output:

    [{'online_count': 58.0}, {'total_count': 60.0}]

I would like to have appended to this output a percentage calculated from this. Something like:

    [{'online_count': 58.0}, {'total_count': 60.0}, {'availability': 0.96666667}]    

I’ve tried combining this using .map(), but to no avail:

    # It feels like the map() is what I need, but can't find the right 
    # combination with .join/union(), .map(), .set()., .keep() etc.
    union(tables: [totals, onlines])
    |> map(fn: (r) => ({ r with percentage_online: r.onlines.online_count / r.totals.total_count * 100 }))

How can I append the (calculated) percentage as new field ‘availability’ in this Flux query?

Or, alternatively, is there a different Flux query approach to achieve this outcome?

N.B. I am aware of the Calculate percentages with Flux article from the docs, which I can’t get working into this specific scenario. But it’s close.

Please note, this exact question has also been posted on StackOverflow.

1 Like

I’ve also noticed the question [How to calculate difference between min and max with Flux?](How to calculate difference between min and max with Flux?], which uses a join().

But my poor attempt below results in an undesired empty array:

        join(tables: {{totals: totals, onlines: onlines}}, on: ["total_count", "online_count"])
        |> map(fn: (r) => ({{ r with _value: r._value_online_count / r._value_total_count * 100 }}))
1 Like

Hello @moriarty,
You’re so very close:

data = from(bucket: "foo")
    |> range(start:-1d)
    |> filter(fn: (r) => r._measurement == "io")
    |> filter(fn: (r) => r["device_id"] == "12345")
    |> filter(fn: (r) => r._field == "status_id" )

    # count the total points
    totals = data
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "total_count")

    # calculate the amount of onlines points (e.g. status = '1')
    onlines = data
    |> filter(fn: (r) => r._value == 1)
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "online_count")

    union(tables: [totals, onlines])
// we have to pivot on _start instead of _time because count() is a bare aggregator and there isn't a timestamp associated wtih the count. 
    |> pivot(rowKey:["_start"], columnKey: ["_field"], valueColumn: "_value")
    |> map(fn: (r) => ({ r with percentage_online: r.online_count / r.total_count * 100 }))

OR

data = from(bucket: "foo")
    |> range(start:-1d)
    |> filter(fn: (r) => r._measurement == "io")
    |> filter(fn: (r) => r["device_id"] == "12345")
    |> filter(fn: (r) => r._field == "status_id" )

    # count the total points
    totals = data
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "total_count")

    # calculate the amount of onlines points (e.g. status = '1')
    onlines = data
    |> filter(fn: (r) => r._value == 1)
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "online_count")

join(tables: {co: co, hum: hum}, on: ["_measurement", "device_id", "_start", "_stop"], method: "inner")
|> map(fn: (r) => ({ r with percentage_online: r._value_count / r._value_total_count * 100 }))

OR

data = from(bucket: "foo")
    |> range(start:-1d)
    |> filter(fn: (r) => r._measurement == "io")
    |> filter(fn: (r) => r["device_id"] == "12345")
    |> filter(fn: (r) => r._field == "status_id" )

    # count the total points
    totals = data
    |> count(column: "_value")
    |> toFloat()
    |> findRecord(fn: (key) => true, idx: 0)

    # calculate the amount of onlines points (e.g. status = '1')
    onlines = data
    |> filter(fn: (r) => r._value == 1)
    |> count(column: "_value")
    |> toFloat()
    |> findRecord(fn: (key) => true, idx: 0)

percentage_online = totals._value/online._value 

data |> map(fn: (r) => ({ r with percentage_online: percentage_online}))

Let me know if you need anymore help :slight_smile:

Thanks a lot on the effort, @Anaisdg. :pray: Unfortunately, after a lot of puzzling on this I’m still unable to achieve the desired result. I could not get any of your examples working.

Example 1:

	# error: > KeyError: '_value'
    union(tables: [totals, onlines])
    |> pivot(rowKey:["_start"], columnKey: ["_field"], valueColumn: "_value")
    |> map(fn: (r) => ({{ r with percentage_online: r.onlines_count / r.total_count * 100.0 }}))

Throws an error on the pivot:

> KeyError: '_value'

I assume due to field _value not actually being available in the totals and onlines dataset?


Example 2:

    join(tables: {{totals: totals, onlines: onlines}}, on: ["_measurement", "device_id"], method: "inner")
    |> map(fn: (r) => ({{ r with percentage_online: r._value_onlines_count / r._value_totals_count * 100.0 }}))

Returns an empty array: [].


Example 3:

	percentage_online = totals._value / onlines._value 
	data |> map(fn: (r) => ({{ r with percentage_online: percentage_online}}))

Returns a huge list of statuses, either 0 or 1, and has the original values all omitted:

[{'status_id': 0}, {'status_id': 0}, {'status_id': 0}, {'status_id': 1}, {'status_id': 1}, {'status_id': 1}, {'status_id': 1}, ... etc.

N.B. I’ve also noticed and applied the change in the totals + onlines dataset here:

|> findRecord(fn: (key) => true, idx: 0)


Lastly, my updated examples above also include the following changes:

  • In two cases I also got the error “type conflict: int != float”; this could be fixed by changing the constant 100 => 100.0
  • In the examples renaming r.online_count => r.onlines_count (plural) seemed to also be required

I’m still in the dark on this. Do you have any further tips or suggestions on how to fix my problem?