Flux: Table of records with only the first and last record with the same tag

I want to have a table of records within a specific time in which by planID I only have the first and last record. Since then I want to understand the value of the JLCA column and create a new column to compare whether the value of the first record is the same or not and put false or true. But for that I first need to have the table with the filtered records… but I still only managed to get the list of all the records and not by group(columns: [“plan_id”]).

Note: the plan_id column is a tag.

from(bucket: “Aplanner”)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => (r._measurement == “success”) and r[“step”] == “operation”)
|> pivot(rowKey:[“_time”], columnKey: [“_field”], valueColumn: “_value”)
|> map(fn: (r) => ({
logId: r[“log_id”],
_time: r._time,
measurementName: r[“measurement”],
planIdValue: r[“plan_id”],
userIdValue: r[“userId”],
stepName: r[“stepName”],
procedureName: r[“procedure”],
nameJLCA: r[“JLCA”]
}))

Can anyone help me?
Thanks

@Claudia_Silva Is JCLA a tag or a field? Can your provide the schema of your raw data as well as the structure of the query result you’re looking for?

JLCA is a tag. Basically I am collecting logs from a user where this user has an associated plan_id and can do several tasks associated with the same plan_id. At the end of the day, each user has several plans and this plan_id always has a procedure and an associated JLCA as well.

Tags: plan_id, userId, procedure and JLCA

The log_id value is always unique, the rest may have the same values.

Current table with the query I placed:

log_id _time plan_id userId procedure JLCA
1234 2024-04-16 17:36:15 123ea8f399d58194fc37456d 10 RH True
7868 2024-04-16 17:57:04 123ea8f399d58194fc37456d 10 RH False
4567 2024-04-16 18:03:19 123ea8f399d58194fc37456d 10 RH False
9654 2024-04-16 18:16:41 789ea8f187d58194fc37456d 5 RH False
3456 2024-04-17 15:08:07 880fd82f21185288a590edc4 5 RH False

Problems with the current table: As we can see for the same plan_id → 123ea8f399d58194fc37456d I have 3 records so I just wanted to keep the oldest and the most recent and also in the new column that I want to create it will have the result True because there was a change in value from the first record to the last record if it is not false (JLCA)

Here is the expected result:

log_id _time plan_id userId procedure JLCA New Column
1234 2024-04-16 17:36:15 123ea8f399d58194fc37456d 10 RH True True
4567 2024-04-16 18:03:19 123ea8f399d58194fc37456d 10 RH False
9654 2024-04-16 18:16:41 789ea8f187d58194fc37456d 5 RH False False
3456 2024-04-17 15:08:07 880fd82f21185288a590edc4 5 RH False False

Ok, and which columns are fields?

Everything is tags except for _time. Is there any problem with everything being “tag”? I’m referring to the columns I need in this case. Because inside the “Aplanner” bucket I have the response, payload and response_status columns for example as fields…

Ok, I see. So using the current query and the intermediate results you provided, in order to create the new column, you’re actually going to have to pivot the JLCA first and last values so the both exist in the same row. That’s really the only way to compare them. So the closest I could come would be an end result would actually look something like:

plan_id* userId* procedure* JCLA_first JCLA_last JCLA_diff
123ea8f399d58194fc37456d 10 RH True False true
789ea8f187d58194fc37456d 5 RH False False false
880fd82f21185288a590edc4 5 RH False False false

Here’s the query that would get you there:

data =
    from(bucket: "Aplanner")
        |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
        |> filter(fn: (r) => r._measurement == "success" and r["step"] == "operation")
        |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
        |> group(columns: ["plan_id", "userId", "procedure"])

first =
    data
        |> first(column: "JCLA")
        |> set(key: "context", value: "JCLA_first")

last =
    data
        |> last(column: "JCLA")
        |> set(key: "context", value: "JCLA_last")

union(tables: [first, last])
    |> pivot(rowKey: ["plan_id"], columnKey: ["context"], valueColumn: "JCLA")
    |> map(fn: (r) => ({r with JCLA_diff: r.JCLA_first != r.JCLA_last}))

Thanks for the solution Scott! At this moment the result is correct in terms of records and JCLA value, the only issue is that each answer is appearing in a dropdown and not a table in which each record is on a line…
For example: JLCA_first {plan_id = “123ea8f399d58194fc37456d”, procedure=“KO”, userId=“10”}, JLCA_last{plan_id = “123ea8f399d58194fc37456d”, procedure=“KO”, userId=“10”},JLCA_diff {plan_id=“123ea8f399d58194fc37456d”, procedure=“KO”, userId=“10”}

This is the table at the moment… And there is the dropdown below to select the record…

JLCA_first {plan_id = “123ea8f399d58194fc37456d” procedure=“KO” userId=“10”} JLCA_last {plan_id = “123ea8f399d58194fc37456d” procedure=“KO” userId=“10”} JLCA_diff {plan_id = “123ea8f399d58194fc37456d” procedure=“KO” userId=“10”}
false false false

In the ideal situation it would be:

plan_id user_id procedure JCLA_first JCLA_last JCLA_diff
123ea8f399d58194fc37456d 10 KO false false false
679ea8f399d58194fc367890 5 KO false false false

I added a new map, after some tests everything seems ok!!

union(tables: [first, last])
|> pivot(rowKey: [“plan_id”], columnKey: [“context”], valueColumn: “JLCA”)
|> map(fn: (r) => ({r with JLCA_diff: r.JLCA_first != r.JLCA_last}))
|> map(fn: (r) => ({
planId: r[“plan_id”],
userID: r[“userId”],
procedureName: r[“procedure”],
firstJLCA: r[“JLCA_first”],
lasttJLCA: r[“JLCA_last”],
difftJLCA: r[“JLCA_diff”]
}))

One last question Scott, can I get this column “JLCA_diff” and get the total number of “false and true” answers? I duplicated the table dashboard and now in this new dashboard my objective was to know the distribution of false and true responses in the “JLCA_diff” column…

@Claudia_Silva Add the following to your existing query:

//... existing query
    |> group(columns: ["JLCA_diff"])
    |> set(key: "_value", value: "")
    |> count()

Worked perfectly. Thank you for the help @scott !