JOINS to non-timeseries data/enrichment

bheadley · June 7, 2018, 7:21pm

I’m collecting metrics on multiple servers. But for the front end, I want to be able to filter the servers based on what data center they’re in, or what region. Or possibly, by what applications they run. That would be non-timeseries metadata; maybe it’d come from postgres or some other data source.

So: I realize JOINS are not going to happen with a non-TS table. But, when I look at Kapacitor enrichment, I see examples of it making columns based on timestamps. Question: can Kapacitor TICKscripts load, say, a map of hostnames to data center info? From, let’s say, a traditional DB, or a flat file, or some other source? And use that to create columns that’ll allow me to put together useful queries from the front end? If not the Kapacitor route, what approach do you suggest?

Florian_Haas · February 10, 2023, 11:46am

Yes, Kapacitor can be used to load data from external sources and use it to enrich time-series data. However, Kapacitor is designed to work with time-series data, so you will need to structure your non-time-series metadata in a way that can be joined with your time-series data based on some shared identifier, such as a hostname.

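One way to achieve this is Kapacitor's sideload node, which reads small YAML or JSON files from a source directory and adds their values to each incoming point as tags or fields. The file to read is chosen per point from a path template, so a hostname tag can select the matching metadata file. For example (the directory path and file layout here are placeholders, not required names):

var dataCenter = stream
    |from()
        .measurement('server_metrics')
    |sideload()
        // Directory holding one metadata file per host (hypothetical path)
        .source('file:///etc/kapacitor/sideload')
        // Pick the file from the point's hostname tag
        .order('hosts/{{.hostname}}.yml')
        // Add a datacenter tag; 'unknown' is used when no file or key matches
        .tag('datacenter', 'unknown')
    |…

In this example, each file under hosts/ holds the metadata for one host, for instance a single line such as datacenter: dc1. As far as I know, sideload reads from files rather than querying a database directly, so metadata that lives in a flat file, a traditional database (such as PostgreSQL), or a REST API has to be exported into that file layout first, and re-exported whenever it changes.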
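The population step could be sketched roughly as follows, assuming the hostname-to-data-center mapping has already been fetched (from PostgreSQL, a CSV, or anywhere else) and that the target is one small YAML file per host, which is the layout Kapacitor's sideload node reads. The function name, the hosts/ subdirectory, and the datacenter key are illustrative choices, not anything Kapacitor mandates.

```python
import json
from pathlib import Path


def write_sideload_files(host_to_datacenter, root):
    """Write one '<hostname>.yml' file per host under '<root>/hosts/'.

    host_to_datacenter: dict mapping hostname -> data center name
    (hypothetical input; in practice it might come from a DB query).
    Returns the list of paths written, sorted by hostname.
    """
    hosts_dir = Path(root) / "hosts"
    hosts_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for hostname, datacenter in sorted(host_to_datacenter.items()):
        # Hostnames are used as file names, so they are assumed to
        # contain no path separators.
        path = hosts_dir / f"{hostname}.yml"
        # json.dumps yields a double-quoted string, which is also a
        # valid YAML scalar, so unusual characters stay safely quoted.
        path.write_text(f"datacenter: {json.dumps(datacenter)}\n", encoding="utf-8")
        written.append(path)
    return written
```

Calling write_sideload_files({"web01": "dc-east"}, "/some/dir") would produce /some/dir/hosts/web01.yml containing a single datacenter line. Running it on a schedule keeps the files in step with the source of truth.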

If Kapacitor is not the best fit for your use case, you may consider other solutions for data enrichment, such as integrating the data directly in your front-end application, or using a data processing tool like Apache NiFi to pre-process the data before storing it in a time-series database like InfluxDB.
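For the front-end route, the idea is simply to merge the metadata onto query results in application code and filter there. A minimal sketch, assuming the time-series rows arrive as dictionaries with a hostname key and the metadata is a dictionary keyed by hostname; all names and sample values are hypothetical:

```python
def enrich_rows(rows, host_metadata, default="unknown"):
    """Return copies of rows with datacenter and region attached.

    rows: list of dicts, each with a 'hostname' key.
    host_metadata: dict of hostname -> {'datacenter': ..., 'region': ...}.
    Hosts missing from host_metadata get the default value.
    """
    enriched = []
    for row in rows:
        meta = host_metadata.get(row["hostname"], {})
        enriched.append({
            **row,
            "datacenter": meta.get("datacenter", default),
            "region": meta.get("region", default),
        })
    return enriched


def filter_rows(rows, **criteria):
    """Keep only rows whose fields equal every given criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]
```

For example, filter_rows(enrich_rows(rows, host_metadata), region="eu") would keep only the rows for hosts whose metadata places them in the eu region. This keeps the metadata out of the time-series store entirely, at the cost of doing the join on every request.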
