Join two table streams issue with flux

I have two table streams
one (left) is

table _value source_ip
0 5 15.185.142.23
0 3 15.185.200.105

another (right) is

table _measurement _start _stop _time host path source_country source_ip
0 tail 2023-07-06T05:44:21.844Z 2023-07-06T05:59:21.844Z 2023-07-06T05:45:06.437Z 3532c312ea2e /firewall/log/untrust-2-trust.log BH 15.185.142.23

try to join two tables stream with following flux

join.left(
    left: leftdata,
    right: rightdata,
    on: (l, r) => l.source_ip == r.source_ip,
    as: (l, r) => ({C: r.source_country,IP:l.source_ip}),
)

but can not get anything, should be retrun country information, is there problem? thanks

table C IP
0 15.185.142.23

Hey, make sure that group keys are equals for both measurements.

join.left(
    left: leftdata |> group(),
    right: rightdata |> group(),
    on: (l, r) => l.source_ip == r.source_ip,
    as: (l, r) => ({C: r.source_country,IP:l.source_ip}),
)

Hi guys,
thanks, as this suggestion, can get source_country value, but join result amount is not right. My leftdata amount is 5 with different source_ip, rightdata amount is more than 5, and has same source_ip. I used left.inner, but still is not right. can you give me suggestion, how to fix it? and about group key, I am not sure how to check group keys?

Hey,

Look at the following example prepared for u:

import "array"
import "join"


left =
    array.from(
        rows: [
            {source_ip: "A", _value: 5,},
            {source_ip: "B", _value: 6},
        ],
    )

right =
    array.from(
        rows: [
            {source_ip: "A", source_country: "IT", host: "host3", extra: 121.2},
            {source_ip: "C", source_country: "GER", host: "host4", extra: 234.3},
        ],
    )

join.inner(
    left: left |> group(columns: ["source_ip"]),
    right: right |> group(columns: ["source_ip"]),
    on: (l, r) => l.source_ip == r.source_ip,
    as: (l, r) => ({r with dataFromLeft: l._value}),
)

In general: group your data for left and right streams to have matchin group keys, and decide which stream you want to keep or extend from the other stream. Use ‘with’ keyword to extend the selected stream from coulms from the other one. Make sure that column names are not overwritten.

I hope it helps!

Join types: Join data with Flux | Flux 0.x Documentation
Join().as parameter: Join data with Flux | Flux 0.x Documentation
Group keys: Group data in InfluxDB with Flux | InfluxDB Cloud (TSM) Documentation

1 Like

Hi d60,

thanks your reply. actully the right stream like this
right =
    array.from(
        rows: [
            {source_ip: "A", source_country: "IT", host: "host3", extra: 121.2},
            {source_ip: "C", source_country: "GER", host: "host4", extra: 234.3},
            {source_ip: "A", source_country: "GER", host: "host4", extra: 221.3},
        ],
    )

I just want keep source_ip and source_country. and as your join, the will be generated duplication record

@d60 and @tbs575,
Thanks for helping each other out! I really appreciate your contribution to the community.

1 Like