Joining tables. New to Influx

data table
and
calculatedValue table

I’m trying to bring the meanElapsed column to merge with the data table.
So it will be an extra column called meanElapsed with its same value on each row.
How?
I also added an x column on both tables to make them both have a common column and values. I don’t know if it helps.

Here is a query that I tried.

join.left(
    left: data,
    right: calculatedValue,
    on: (l, r) => l.x == r.x,
    as: (l, r) => 
    ({
    l with meanElapsed: r.meanElapsed
    })
)

But the output table has the meanElapsed column empty

Welcome @AAron to the forum.

Can you please post your full Flux queries for data table and calculatedValue table? You may not need to do a join, but instead use a pivot(). Will help tremendously to see the two queries.

Hi yes, here you go

import "join"

calculatedValue = from(bucket: "apd")
    |> range(start: 2023-03-29T17:04:08Z, stop: 2023-04-30T17:04:08Z)
    |> filter(fn: (r) => r._measurement == "dryerAmbient" and r._field == "temperature" and r.dryerId == "IBD09" and r.deviceId == "I1")
    |> pivot(rowKey: ["_time", "dryerId"], columnKey: ["deviceId"], valueColumn: "_value")
    |> elapsed(unit: 1s)
    |> mean(column: "elapsed")
    |> rename(columns: {elapsed: "meanElapsed"})
    |> map(fn: (r) => ({ r with meanElapsed: r.meanElapsed * 1.3 }))
    |> map(fn: (r) => ({ r with x: 1.0}))
    |> keep(columns: ["meanElapsed", "x"])


data = from(bucket: "apd")
    |> range(start: 2023-03-29T17:04:08Z, stop: 2023-04-30T17:04:08Z)
    |> filter(fn: (r) => r._measurement == "dryerAmbient" and r._field == "temperature" and r.dryerId == "IBD09" and r.deviceId == "I1")
    |> map(fn: (r) => ({r with x: 1.0}))


join.left(
    left: data,
    right: calculatedValue,
    on: (l, r) => l.x == r.x,
    as: (l, r) => 
    ({
    l with meanElapsed: r.meanElapsed
    })
)

Let’s try something like this…

joinedData = join(
    tables: {calculatedValue: calculatedValue, data: data},
    on: ["_time", "dryerId", "x"]
)

There were no results. Also, I had to remove the import join statement at the top.

calculatedValue = from(bucket: "apd")
    |> range(start: 2023-03-29T17:04:08Z, stop: 2023-04-30T17:04:08Z)
    |> filter(fn: (r) => r._measurement == "dryerAmbient" and r._field == "temperature" and r.dryerId == "IBD09" and r.deviceId == "I1")
    |> pivot(rowKey: ["_time", "dryerId"], columnKey: ["deviceId"], valueColumn: "_value")
    |> elapsed(unit: 1s)
    |> mean(column: "elapsed")
    |> rename(columns: {elapsed: "meanElapsed"})
    |> map(fn: (r) => ({ r with meanElapsed: r.meanElapsed * 1.3 }))
    |> map(fn: (r) => ({ r with x: 1.0}))
    |> keep(columns: ["meanElapsed", "x"])


data = from(bucket: "apd")
    |> range(start: 2023-03-29T17:04:08Z, stop: 2023-04-30T17:04:08Z)
    |> filter(fn: (r) => r._measurement == "dryerAmbient" and r._field == "temperature" and r.dryerId == "IBD09" and r.deviceId == "I1")
    |> map(fn: (r) => ({r with x: 1.0}))


joinedData = join(
    tables: {calculatedValue: calculatedValue, data: data},
    on: ["_time", "dryerId", "x"]
)

joinedData

I am sure this is possible. I did it a while back but cannot seem to find my notes or test example. I will have a look when I am back at my PC.

Hi @AAron your first code doesn’t look wrong but try adding a |> group() at the end of each table, and just to br clear on what to expect, join.left will take all the rows on the left column and if there is a match based on the column condition used it will ad the entries of the right column.

Hard to tell what to expect whitout actually seeing both tables, and also there are 2 types of joins.

If you use import “join” there are more functions in that library (left, right, inner, outter), but you can not use the simpler join() wich if i remember correctly is an inner join

The explanation:

In the join functions you need to have the same schema usinf |> group () removes any grouping.

If I am correct, when you used keep you dropped some grouping columns so now your 2 data sets have different schemas.

Thanks! The meanElapsed column is populated after adding the group() function to the data table. Still not entirely sure why it started to work, since there is no keep() function in my data table.

import "join"

calculatedValue = from(bucket: "apd")
    |> range(start: 2023-03-29T17:04:08Z, stop: 2023-04-30T17:04:08Z)
    |> filter(fn: (r) => r._measurement == "dryerAmbient" and r._field == "temperature" and r.dryerId == "IBD09" and r.deviceId == "I1")
    |> pivot(rowKey: ["_time", "dryerId"], columnKey: ["deviceId"], valueColumn: "_value")
    |> elapsed(unit: 1s)
    |> mean(column: "elapsed")
    |> rename(columns: {elapsed: "meanElapsed"})
    |> map(fn: (r) => ({ r with meanElapsed: r.meanElapsed * 1.3 }))
    |> map(fn: (r) => ({ r with x: 1.0}))
    |> keep(columns: ["meanElapsed", "x"])

data = from(bucket: "apd")
    |> range(start: 2023-03-29T17:04:08Z, stop: 2023-04-30T17:04:08Z)
    |> filter(fn: (r) => r._measurement == "dryerAmbient" and r._field == "temperature" and r.dryerId == "IBD09" and r.deviceId == "I1")
    |> map(fn: (r) => ({r with x: 1.0}))
    |> group()

join.left(
    left: data,
    right: calculatedValue,
    on: (l, r) => l.x == r.x,
    as: (l, r) => 
    ({
    l with meanElapsed: r.meanElapsed
    })
)

Awesome @AAron. Please mark above as solution so others can find it.

Nice job @fercasjr

Nooo. so, you are joining 2 tables: data, and calculatedValue.

Both should have the same scheme (achieved with |> group() ) before joining them , you can also achieve this inside the join function :

join.left(
    left: data |> group(),
    right: calculatedValue |> group(),
    on: (l, r) => l.x == r.x,
    as: (l, r) => 
    ({
    l with meanElapsed: r.meanElapsed
    })
)

What is this then :
|> keep(columns: [“meanElapsed”, “x”])