Flux: Table of records with only the first and last record with the same tag

Claudia_Silva · April 18, 2024, 2:44pm

I want to have a table of records within a specific time in which by planID I only have the first and last record. Since then I want to understand the value of the JLCA column and create a new column to compare whether the value of the first record is the same or not and put false or true. But for that I first need to have the table with the filtered records… but I still only managed to get the list of all the records and not by group(columns: [“plan_id”]).

Note: the plan_id column is a tag.

from(bucket: “Aplanner”)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => (r._measurement == “success”) and r[“step”] == “operation”)
|> pivot(rowKey:[“_time”], columnKey: [“_field”], valueColumn: “_value”)
|> map(fn: (r) => ({
logId: r[“log_id”],
_time: r._time,
measurementName: r[“measurement”],
planIdValue: r[“plan_id”],
userIdValue: r[“userId”],
stepName: r[“stepName”],
procedureName: r[“procedure”],
nameJLCA: r[“JLCA”]
}))

Can anyone help me?
Thanks

scott · April 18, 2024, 2:53pm

@Claudia_Silva Is JCLA a tag or a field? Can your provide the schema of your raw data as well as the structure of the query result you’re looking for?

Claudia_Silva · April 18, 2024, 3:19pm

JLCA is a tag. Basically I am collecting logs from a user where this user has an associated plan_id and can do several tasks associated with the same plan_id. At the end of the day, each user has several plans and this plan_id always has a procedure and an associated JLCA as well.

Tags: plan_id, userId, procedure and JLCA

The log_id value is always unique, the rest may have the same values.

Current table with the query I placed:

log_id	_time	plan_id	userId	procedure	JLCA
1234	2024-04-16 17:36:15	123ea8f399d58194fc37456d	10	RH	True
7868	2024-04-16 17:57:04	123ea8f399d58194fc37456d	10	RH	False
4567	2024-04-16 18:03:19	123ea8f399d58194fc37456d	10	RH	False
9654	2024-04-16 18:16:41	789ea8f187d58194fc37456d	5	RH	False
3456	2024-04-17 15:08:07	880fd82f21185288a590edc4	5	RH	False

Problems with the current table: As we can see for the same plan_id → 123ea8f399d58194fc37456d I have 3 records so I just wanted to keep the oldest and the most recent and also in the new column that I want to create it will have the result True because there was a change in value from the first record to the last record if it is not false (JLCA)

Here is the expected result:

log_id	_time	plan_id	userId	procedure	JLCA	New Column
1234	2024-04-16 17:36:15	123ea8f399d58194fc37456d	10	RH	True	True
4567	2024-04-16 18:03:19	123ea8f399d58194fc37456d	10	RH	False
9654	2024-04-16 18:16:41	789ea8f187d58194fc37456d	5	RH	False	False
3456	2024-04-17 15:08:07	880fd82f21185288a590edc4	5	RH	False	False

scott · April 18, 2024, 3:32pm

Ok, and which columns are fields?

Claudia_Silva · April 18, 2024, 3:34pm

Everything is tags except for _time. Is there any problem with everything being “tag”? I’m referring to the columns I need in this case. Because inside the “Aplanner” bucket I have the response, payload and response_status columns for example as fields…

scott · April 18, 2024, 7:40pm

Ok, I see. So using the current query and the intermediate results you provided, in order to create the new column, you’re actually going to have to pivot the JLCA first and last values so the both exist in the same row. That’s really the only way to compare them. So the closest I could come would be an end result would actually look something like:

plan_id*	userId*	procedure*	JCLA_first	JCLA_last	JCLA_diff
123ea8f399d58194fc37456d	10	RH	True	False	true
789ea8f187d58194fc37456d	5	RH	False	False	false
880fd82f21185288a590edc4	5	RH	False	False	false

Here’s the query that would get you there:

data =
    from(bucket: "Aplanner")
        |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
        |> filter(fn: (r) => r._measurement == "success" and r["step"] == "operation")
        |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
        |> group(columns: ["plan_id", "userId", "procedure"])

first =
    data
        |> first(column: "JCLA")
        |> set(key: "context", value: "JCLA_first")

last =
    data
        |> last(column: "JCLA")
        |> set(key: "context", value: "JCLA_last")

union(tables: [first, last])
    |> pivot(rowKey: ["plan_id"], columnKey: ["context"], valueColumn: "JCLA")
    |> map(fn: (r) => ({r with JCLA_diff: r.JCLA_first != r.JCLA_last}))

Claudia_Silva · April 19, 2024, 10:02am

Thanks for the solution Scott! At this moment the result is correct in terms of records and JCLA value, the only issue is that each answer is appearing in a dropdown and not a table in which each record is on a line…
For example: JLCA_first {plan_id = “123ea8f399d58194fc37456d”, procedure=“KO”, userId=“10”}, JLCA_last{plan_id = “123ea8f399d58194fc37456d”, procedure=“KO”, userId=“10”},JLCA_diff {plan_id=“123ea8f399d58194fc37456d”, procedure=“KO”, userId=“10”}

This is the table at the moment… And there is the dropdown below to select the record…

JLCA_first {plan_id = “123ea8f399d58194fc37456d”	procedure=“KO”	userId=“10”}	JLCA_last {plan_id = “123ea8f399d58194fc37456d”	procedure=“KO”	userId=“10”}	JLCA_diff {plan_id = “123ea8f399d58194fc37456d”	procedure=“KO”	userId=“10”}
false	false	false

In the ideal situation it would be:

plan_id	user_id	procedure	JCLA_first	JCLA_last	JCLA_diff
123ea8f399d58194fc37456d	10	KO	false	false	false
679ea8f399d58194fc367890	5	KO	false	false	false

Claudia_Silva · April 19, 2024, 10:35am

I added a new map, after some tests everything seems ok!!

union(tables: [first, last])
|> pivot(rowKey: [“plan_id”], columnKey: [“context”], valueColumn: “JLCA”)
|> map(fn: (r) => ({r with JLCA_diff: r.JLCA_first != r.JLCA_last}))
|> map(fn: (r) => ({
planId: r[“plan_id”],
userID: r[“userId”],
procedureName: r[“procedure”],
firstJLCA: r[“JLCA_first”],
lasttJLCA: r[“JLCA_last”],
difftJLCA: r[“JLCA_diff”]
}))

Claudia_Silva · April 19, 2024, 2:18pm

One last question Scott, can I get this column “JLCA_diff” and get the total number of “false and true” answers? I duplicated the table dashboard and now in this new dashboard my objective was to know the distribution of false and true responses in the “JLCA_diff” column…

scott · April 21, 2024, 2:22am

@Claudia_Silva Add the following to your existing query:

//... existing query
    |> group(columns: ["JLCA_diff"])
    |> set(key: "_value", value: "")
    |> count()

Claudia_Silva · April 22, 2024, 8:18am

Worked perfectly. Thank you for the help @scott !

Topic		Replies	Views
Flux: List all records of the most recent values (not only the last one) Fluxlang grafana , query , flux	1	1936	December 23, 2021
Select Last From Different Tag Grafana influxdb , query	1	594	July 5, 2022
Query only a tag in flux InfluxDB 2 influxdb , flux	3	922	April 12, 2021
How to efficiently get the timestamp of the first and last records in a table? Fluxlang	2	2673	February 18, 2022
How do I get the last timestamp of all measurements filtered by a tag? Fluxlang	1	771	September 12, 2020

Flux: Table of records with only the first and last record with the same tag

Related topics