JOINS to non-timeseries data/enrichment

I’m collecting metrics on multiple servers. But for the front end, I want to be able to filter the servers based on what data center they’re in, or what region. Or possibly by what applications they run. That would be non-timeseries metadata; maybe it’d come from Postgres or some other data source.

So: I realize JOINs are not going to happen with a non-timeseries table. But when I look at Kapacitor enrichment, I see examples of it making columns based on timestamps. Question: can Kapacitor TICKscripts load, say, a map of hostnames to data center info? From, let’s say, a traditional DB, or a flat file, or some other source? And use that to create columns that’ll allow me to put together useful queries from the front end? If not the Kapacitor route, what approach do you suggest?

Yes, Kapacitor can be used to load data from external sources and use it to enrich time-series data. However, Kapacitor is designed to work with time-series data, so you will need to structure your non-time-series metadata in a way that can be joined with your time-series data based on some shared identifier, such as a hostname.

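One way to achieve this is the sideload node, which reads values from YAML or JSON files and adds them to each incoming point as tags or fields. The file to read is chosen per point from a template over the point’s tags, so a hostname tag can select a per-host metadata file. For example:

var dataCenter = stream
    |from()
        .measurement('server_metrics')
    // Placeholder directory holding one metadata file per host
    |sideload()
        .source('file:///etc/kapacitor/sideload')
        .order('host/{{.hostname}}.yml')
        .tag('datacenter', 'unknown')
    |…

In this example, the directory /etc/kapacitor/sideload (a placeholder path) would hold one file per host, such as host/web01.yml, each containing a datacenter key. Points whose host has no file, or whose file lacks the key, get the default value 'unknown'. Kapacitor only reads these files, so a separate job has to generate them from wherever the mapping lives, such as a flat file, a traditional database (such as PostgreSQL), or a REST API.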
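As a rough sketch of that export step, the Python below writes one small YAML file per host into a host/ subdirectory, which is the layout Kapacitor’s sideload node can read when its order template is 'host/{{.hostname}}.yml'. The HOST_METADATA mapping, the function names, and the directory layout are all assumptions for illustration; in practice the mapping would come from a database query or a parsed file.

```python
import json
from pathlib import Path

# Hypothetical hostname -> metadata mapping; in practice this might be
# the result of a PostgreSQL query or a parsed flat file.
HOST_METADATA = {
    "web01": {"datacenter": "dc-east", "region": "us-east"},
    "db01": {"datacenter": "dc-west", "region": "us-west"},
}


def render_yaml(metadata: dict) -> str:
    """Render a flat dict as simple YAML, one 'key: "value"' line per key."""
    # json.dumps yields a double-quoted string, which is also a valid YAML
    # double-quoted scalar, so special characters in values are escaped.
    # Keys are assumed to be plain identifiers and are written unquoted.
    return "".join(
        f"{key}: {json.dumps(str(value))}\n"
        for key, value in sorted(metadata.items())
    )


def write_sideload_files(hosts: dict, root: Path) -> list:
    """Write <root>/host/<hostname>.yml for every host; return the paths."""
    host_dir = Path(root) / "host"
    host_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for hostname, metadata in sorted(hosts.items()):
        path = host_dir / f"{hostname}.yml"
        path.write_text(render_yaml(metadata), encoding="utf-8")
        written.append(path)
    return written
```

Running write_sideload_files(HOST_METADATA, Path("/etc/kapacitor/sideload")) on a schedule would keep the files in step with the source of truth; the target path here is the same placeholder as above, not a Kapacitor default.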

If Kapacitor is not the best fit for your use case, you may consider other solutions for data enrichment, such as integrating the data directly in your front-end application, or using a data processing tool like Apache NiFi to pre-process the data before storing it in a time-series database like InfluxDB.
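To illustrate the front-end option, here is a minimal sketch in which the application keeps the hostname metadata itself, resolves a filter such as "data center = dc-east" to a list of hostnames, and then builds an InfluxQL query restricted to those hosts. The metadata, the measurement name server_metrics, the tag name hostname, and both function names are assumptions for illustration, not part of any library.

```python
import re

# Hypothetical metadata held by the application, e.g. loaded from
# PostgreSQL at startup and refreshed periodically.
HOST_METADATA = {
    "web01": {"datacenter": "dc-east", "region": "us-east"},
    "web02": {"datacenter": "dc-east", "region": "us-east"},
    "db01": {"datacenter": "dc-west", "region": "us-west"},
}

# Conservative whitelist so hostnames can be embedded in a query string.
_SAFE_HOSTNAME = re.compile(r"^[A-Za-z0-9._-]+$")


def hosts_matching(metadata: dict, key: str, value: str) -> list:
    """Return the sorted hostnames whose metadata[key] equals value."""
    return sorted(h for h, m in metadata.items() if m.get(key) == value)


def build_query(hosts: list, measurement: str = "server_metrics",
                tag: str = "hostname") -> str:
    """Build an InfluxQL query limited to the given hosts (last hour)."""
    if not hosts:
        raise ValueError("no hosts match the filter")
    for host in hosts:
        if not _SAFE_HOSTNAME.match(host):
            raise ValueError(f"unsafe hostname: {host!r}")
    # Tag values are single-quoted string literals in InfluxQL.
    condition = " OR ".join(f"\"{tag}\" = '{host}'" for host in hosts)
    return (f'SELECT * FROM "{measurement}" '
            f"WHERE ({condition}) AND time > now() - 1h")
```

For example, build_query(hosts_matching(HOST_METADATA, "datacenter", "dc-east")) yields a query covering web01 and web02 only. This keeps the metadata out of the time-series database entirely, at the cost of longer queries when a filter matches many hosts.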