Pivot on a custom row id column and assign the minimum timestamp of all the values in row to time_

Hi team,
My application consists of multiple computation cycles. Within each cycle, I calculate the value of N fields which I post to InfluxDB. Note that each of the fields computed within a cycle will have a slightly different time stamp since they are computed sequentially in time.

My task is display all the fields computed within a cycle on a single row. I have been reading about the pivot function and have decided to add a cycle_id to link all fields within each cycle and pivot against this cycle_id column.

However, I will also like to set the timestamp of the row to the earliest timestamp within each cycle. Is this operation possible? If not, I am open to suggestions on how to achieve this requirement.
Thanks.

@ValentineO Yes, I think there are a couple of ways to do this.

Query the earliest timestamp for each cycle separately

With this approach, you’d query the earliest time associated with each cycle ID, remap the earliest time as a field, union that data set with the other results, and then pivot the union’d data:

baseData = 
    from(bucket: "example-bucket")
        |> range(...)
        |> filter(...)

earliestTimes =
    baseData
        |> group(columns: ["cycle_id"])
        |> first()
        |> map(fn: (r) => ({_field: "earliest_time", _value: r._time, cycle_id: r.cycle_id}))

union(tables: [baseData, earliestTimes]
    |> pivot(rowKey: ["cycle_id"], columnKey: ["_field"], valueColumn: ["_value"])

Query earliest timestamps using a custom function

With this approach, you define a custom function that returns the earliest time in a cycle for a specified cycle ID, then use that to map values into each row.

baseData = 
    from(bucket: "example-bucket")
        |> range(...)
        |> filter(...)

getEarliest = (cycle_id) => {
    earliestTimes =
        baseData
            |> group(columns: ["cycle_id"])
            |> first()
            |> keep(columns: ["_time", "cycle_id"])

    cycleTime = 
        earliestTimes
            |> findRecord(fn: (key) => key.cycle_id == cycle_id, idx: 0)

    return cycleTime._time
}

baseData
     |> pivot(rowKey: ["cycle_id"], columnKey: ["_field"], valueColumn: ["_value"])
     |> map(fn: (r) => ({r with earliest_time: getEarliest(cycle_id: r.cycle_id)}))

Thanks @scott. Let me give this a go.

Hi @scott,
I also wanted to rule out another idea based on the concept that influx data is essentially presented as the so called “Stream of tables”. Can the row id within the stream of tables be used instead of explicitly adding a cycle_id? In other words, is the row id within the stream of tables exposed in a column?

Flux doesn’t really structure rows with an ID. You’d have to manually add that ID. But that ID wouldn’t help with the pivot operation since every row would have a unique ID. If you pivoted on a row ID, you’d end up with the same number of rows, just differently structured columns.

The problem here is identifying what rows belong to each cycle. I don’t really see another way other than added the cycle ID.

Cheers @scott. Noted!