Pivot on a custom row id column and assign the minimum timestamp of all the values in row to time_

ValentineO · May 7, 2024, 3:03pm

Hi team,
My application consists of multiple computation cycles. Within each cycle, I calculate the value of N fields which I post to InfluxDB. Note that each of the fields computed within a cycle will have a slightly different time stamp since they are computed sequentially in time.

My task is display all the fields computed within a cycle on a single row. I have been reading about the pivot function and have decided to add a cycle_id to link all fields within each cycle and pivot against this cycle_id column.

However, I will also like to set the timestamp of the row to the earliest timestamp within each cycle. Is this operation possible? If not, I am open to suggestions on how to achieve this requirement.
Thanks.

scott · May 7, 2024, 3:51pm

@ValentineO Yes, I think there are a couple of ways to do this.

Query the earliest timestamp for each cycle separately

With this approach, you’d query the earliest time associated with each cycle ID, remap the earliest time as a field, union that data set with the other results, and then pivot the union’d data:

baseData = 
    from(bucket: "example-bucket")
        |> range(...)
        |> filter(...)

earliestTimes =
    baseData
        |> group(columns: ["cycle_id"])
        |> first()
        |> map(fn: (r) => ({_field: "earliest_time", _value: r._time, cycle_id: r.cycle_id}))

union(tables: [baseData, earliestTimes]
    |> pivot(rowKey: ["cycle_id"], columnKey: ["_field"], valueColumn: ["_value"])

Query earliest timestamps using a custom function

With this approach, you define a custom function that returns the earliest time in a cycle for a specified cycle ID, then use that to map values into each row.

baseData = 
    from(bucket: "example-bucket")
        |> range(...)
        |> filter(...)

getEarliest = (cycle_id) => {
    earliestTimes =
        baseData
            |> group(columns: ["cycle_id"])
            |> first()
            |> keep(columns: ["_time", "cycle_id"])

    cycleTime = 
        earliestTimes
            |> findRecord(fn: (key) => key.cycle_id == cycle_id, idx: 0)

    return cycleTime._time
}

baseData
     |> pivot(rowKey: ["cycle_id"], columnKey: ["_field"], valueColumn: ["_value"])
     |> map(fn: (r) => ({r with earliest_time: getEarliest(cycle_id: r.cycle_id)}))

ValentineO · May 7, 2024, 4:12pm

Thanks @scott. Let me give this a go.

ValentineO · May 8, 2024, 10:49am

Hi @scott,
I also wanted to rule out another idea based on the concept that influx data is essentially presented as the so called “Stream of tables”. Can the row id within the stream of tables be used instead of explicitly adding a cycle_id? In other words, is the row id within the stream of tables exposed in a column?

scott · May 8, 2024, 2:07pm

Flux doesn’t really structure rows with an ID. You’d have to manually add that ID. But that ID wouldn’t help with the pivot operation since every row would have a unique ID. If you pivoted on a row ID, you’d end up with the same number of rows, just differently structured columns.

The problem here is identifying what rows belong to each cycle. I don’t really see another way other than added the cycle ID.

ValentineO · May 8, 2024, 2:53pm

Cheers @scott. Noted!

Topic		Replies	Views
Using Pivot in Flux on InfluxDB data - what am I doing wrong? Fluxlang influxdb , flux	6	1672	November 20, 2022
Pivot function columnKey: [ "_time"] - changes Timestamp and format InfluxDB 2	1	402	May 7, 2021
Flux Pivot() to Value vs Hours grouped by Day Dashboards influxdb	1	998	November 19, 2018
Iteration between rows InfluxDB 2 time-series , query , flux	3	536	May 30, 2023
Python Client write_api.write() Client SDKs influxdb , python	1	474	April 4, 2023

Pivot on a custom row id column and assign the minimum timestamp of all the values in row to time_

Query the earliest timestamp for each cycle separately

Query earliest timestamps using a custom function

Related topics