Flux - Append calculated field (percentage) and combine with results from different datasets

moriarty · October 8, 2021, 10:02am

I’m struggling with an Influx 2 query in Flux on how to join and map data from two differents sets (tables) into a specific desired output.

My current Flux query is this:

data = from(bucket: "foo")
    |> range(start:-1d)
    |> filter(fn: (r) => r._measurement == "io")
    |> filter(fn: (r) => r["device_id"] == "12345")
    |> filter(fn: (r) => r._field == "status_id" )

    # count the total points
    totals = data
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "total_count")

    # calculate the amount of onlines points (e.g. status = '1')
    onlines = data
    |> filter(fn: (r) => r._value == 1)
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "online_count")

    union(tables: [totals, onlines])

This returns as output:

    [{'online_count': 58.0}, {'total_count': 60.0}]

I would like to have appended to this output a percentage calculated from this. Something like:

    [{'online_count': 58.0}, {'total_count': 60.0}, {'availability': 0.96666667}]

I’ve tried combining this using .map(), but to no avail:

    # It feels like the map() is what I need, but can't find the right 
    # combination with .join/union(), .map(), .set()., .keep() etc.
    union(tables: [totals, onlines])
    |> map(fn: (r) => ({ r with percentage_online: r.onlines.online_count / r.totals.total_count * 100 }))

How can I append the (calculated) percentage as new field ‘availability’ in this Flux query?

Or, alternatively, is there a different Flux query approach to achieve this outcome?

N.B. I am aware of the Calculate percentages with Flux article from the docs, which I can’t get working into this specific scenario. But it’s close.

Please note, this exact question has also been posted on StackOverflow.

moriarty · October 8, 2021, 10:06am

I’ve also noticed the question [How to calculate difference between min and max with Flux?](How to calculate difference between min and max with Flux?], which uses a join().

But my poor attempt below results in an undesired empty array:

        join(tables: {{totals: totals, onlines: onlines}}, on: ["total_count", "online_count"])
        |> map(fn: (r) => ({{ r with _value: r._value_online_count / r._value_total_count * 100 }}))

Anaisdg · October 13, 2021, 3:16pm

Hello @moriarty,
You’re so very close:

data = from(bucket: "foo")
    |> range(start:-1d)
    |> filter(fn: (r) => r._measurement == "io")
    |> filter(fn: (r) => r["device_id"] == "12345")
    |> filter(fn: (r) => r._field == "status_id" )

    # count the total points
    totals = data
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "total_count")

    # calculate the amount of onlines points (e.g. status = '1')
    onlines = data
    |> filter(fn: (r) => r._value == 1)
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "online_count")

    union(tables: [totals, onlines])
// we have to pivot on _start instead of _time because count() is a bare aggregator and there isn't a timestamp associated wtih the count. 
    |> pivot(rowKey:["_start"], columnKey: ["_field"], valueColumn: "_value")
    |> map(fn: (r) => ({ r with percentage_online: r.online_count / r.total_count * 100 }))

OR

data = from(bucket: "foo")
    |> range(start:-1d)
    |> filter(fn: (r) => r._measurement == "io")
    |> filter(fn: (r) => r["device_id"] == "12345")
    |> filter(fn: (r) => r._field == "status_id" )

    # count the total points
    totals = data
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "total_count")

    # calculate the amount of onlines points (e.g. status = '1')
    onlines = data
    |> filter(fn: (r) => r._value == 1)
    |> count(column: "_value")
    |> toFloat()
    |> set(key: "_field", value: "online_count")

join(tables: {co: co, hum: hum}, on: ["_measurement", "device_id", "_start", "_stop"], method: "inner")
|> map(fn: (r) => ({ r with percentage_online: r._value_count / r._value_total_count * 100 }))

OR

data = from(bucket: "foo")
    |> range(start:-1d)
    |> filter(fn: (r) => r._measurement == "io")
    |> filter(fn: (r) => r["device_id"] == "12345")
    |> filter(fn: (r) => r._field == "status_id" )

    # count the total points
    totals = data
    |> count(column: "_value")
    |> toFloat()
    |> findRecord(fn: (key) => true, idx: 0)

    # calculate the amount of onlines points (e.g. status = '1')
    onlines = data
    |> filter(fn: (r) => r._value == 1)
    |> count(column: "_value")
    |> toFloat()
    |> findRecord(fn: (key) => true, idx: 0)

percentage_online = totals._value/online._value 

data |> map(fn: (r) => ({ r with percentage_online: percentage_online}))

Let me know if you need anymore help

moriarty · October 20, 2021, 9:17pm

Thanks a lot on the effort, @Anaisdg. Unfortunately, after a lot of puzzling on this I’m still unable to achieve the desired result. I could not get any of your examples working.

Example 1:

	# error: > KeyError: '_value'
    union(tables: [totals, onlines])
    |> pivot(rowKey:["_start"], columnKey: ["_field"], valueColumn: "_value")
    |> map(fn: (r) => ({{ r with percentage_online: r.onlines_count / r.total_count * 100.0 }}))

Throws an error on the pivot:

> KeyError: '_value'

I assume due to field _value not actually being available in the totals and onlines dataset?

–
Example 2:

    join(tables: {{totals: totals, onlines: onlines}}, on: ["_measurement", "device_id"], method: "inner")
    |> map(fn: (r) => ({{ r with percentage_online: r._value_onlines_count / r._value_totals_count * 100.0 }}))

Returns an empty array: [].

–
Example 3:

	percentage_online = totals._value / onlines._value 
	data |> map(fn: (r) => ({{ r with percentage_online: percentage_online}}))

Returns a huge list of statuses, either 0 or 1, and has the original values all omitted:

[{'status_id': 0}, {'status_id': 0}, {'status_id': 0}, {'status_id': 1}, {'status_id': 1}, {'status_id': 1}, {'status_id': 1}, ... etc.

N.B. I’ve also noticed and applied the change in the totals + onlines dataset here:

|> findRecord(fn: (key) => true, idx: 0)

Lastly, my updated examples above also include the following changes:

In two cases I also got the error “type conflict: int != float”; this could be fixed by changing the constant 100 => 100.0
In the examples renaming r.online_count => r.onlines_count (plural) seemed to also be required

I’m still in the dark on this. Do you have any further tips or suggestions on how to fix my problem?

Topic		Replies	Views
Flux Assistance - Divide Two Integrals Fluxlang grafana , flux	5	1322	January 6, 2022
Count() sum() on the same measurement InfluxDB 2	2	559	October 7, 2022
Calculate Percentage with Union Fluxlang flux	1	759	November 17, 2021
Flux piping intermediate results Welcome & Getting Started	1	30	September 26, 2024
Calulating fields from multiple tags InfluxDB 2 query , flux	5	264	December 22, 2023

Flux - Append calculated field (percentage) and combine with results from different datasets

Related topics