Question: Join two batch nodes

I am using kapacitor for the first time and would like to perform a MYSQL like join between a data and a meta table in my influxDB database.

data -> contains health check data for a specific server
meta -> contains meta-data (such as environment info, app info etc.) regarding all servers in the environment.

Sample data:

data

time host status
1521557874323003855 serverA OK
1521557877633920451 serverB CRITICAL

meta-data

time app environment host
1521556804421339962 XYZ PROD serverA
1521556810571560708 ABC PROD serverB

I tried using the following TICK script to achieve this but the output was not as expected:

TICK script

dbrp "test"."autogen"

var hostinfo = batch
  |query('SELECT host, environment, app from "test"."autogen"."meta"')
    .period(100000m)
    .every(5s)

var hc = batch
  |query('SELECT status from "test"."autogen"."data"')
    .period(100000m)
    .every(5s)

hc
  |join(hostinfo)
    .as('hc', 'hostinfo')
    .tolerance(1w)
  |eval(lambda: "hc.status",
        lambda: "hostinfo.environment",
        lambda: "hostinfo.app",
        lambda: "hostinfo.host")
      .as('status',
          'env',
          'app',
          'host')
  |influxDBOut()
    .database('test')
    .measurement('final')

Expected Output:

time host status environment app
1521557874323003855 serverA OK PROD XYZ
1521557877633920451 serverB CRITICAL PROD ABC

Actual Output:

time app env host status
1521417600000000000 ABC PROD serverB CRITICAL

As you can see above it does not output only one row.
I tried adding the distinct parameter but that also didnt help.

influxDB version:

influx -version
InfluxDB shell version: 1.4.3

Kapacitor version:

kapacitor version
Kapacitor 1.4.0 (git: HEAD fcce3ee9e6abcee5595fd61066bfc904edb1e113)

Can you please let me know if I am missing something or this is not possible at all in kapacitor/influxdb??

Update:

I was able to make the joins work by adding host as a tag in meta-data table

data

time host status
1521741575077651109 serverA OK
1521741581813515231 serverB CRITICAL

meta-data

time app environment host value
1521745072550031135 XYZ PROD serverA serverA
1521745078199571772 XYZ PROD serverB serverB

TICK script

dbrp "test"."autogen"

var hostinfo = batch
  |query('SELECT host, environment, app, value from "test"."autogen"."meta"')
    .groupBy(*)
    .period(100000m)
    .every(5s)

var hc = batch
  |query('SELECT status from "test"."autogen"."data"')
    .groupBy(*)
    .period(100000m)
    .every(5s)

hc
  |join(hostinfo)
    .as('hc', 'hostinfo')
    .tolerance(100w)
    .on('host')
  |eval(lambda: "hc.status",
        lambda: "hostinfo.environment",
        lambda: "hostinfo.app",
        lambda: "hostinfo.host",
        lambda: "hostinfo.value")
      .as('status',
          'env',
          'app',
          'host',
          'value')
  |influxDBOut()
    .database('test')
    .measurement('final')

Actual Output:

time app app_1 env environment host host_1 status value
1549843200000000000 XYZ XYZ PROD PROD serverA serverA OK serverA
1549843200000000000 XYZ XYZ PROD PROD serverB serverB CRITICAL serverB
1 Like