Strategies for linking time-series and relational data

I have two different use cases that share a need to link time-series data and relational data. I’m interested in strategies on how to best accomplish this.

The particulars aren’t especially important. But to paint a picture with an invented example… Imagine a manufacturing line where certain metrics are clearly time-series — e.g. defects, widgets produced — but these metrics are also tied to clearly relational data — e.g. a widget is built from a particular bill of material filled by specific lots of parts from different vendors. Say we wanted to correlate defects to vendors. It’s best to avoid forcing Influx to store complex relationships it’s not designed for. Perhaps it could be done in simple cases, but I think our use cases are over that threshold of simplicity.

At first blush it seems we need one or more unique IDs to tie things together. Along with this comes necessary but manageable complications in maintaining unique IDs and linked records in disconnected systems.

Generally, are there other ways to accomplish something like the above? Entirely different strategies than a unique ID to link records in different databases? If linking IDs are the best option, are there existing good practices or even tools to help accomplish this?

I’m interested in the larger philosophical thinking and design than drilling down into the specifics of our use cases (which are not yet well defined anyhow).

Thanks in advance for any guidance or resources.

1 Like

Hello @mkarlesky,

Good question! I’m not entirely sure what the best practices are. You might want to take a look at these: SideloadNode | Kapacitor 1.5 Documentation
telegraf/plugins/inputs/postgresql_extensible at master · influxdata/telegraf · GitHub

You’re on the right track though. You would probably need some unique ID’s to tie things together. This reminds me of how trucking companies use geohashes with influxdb to get an approximation for the trucks’ position. Maybe you could apply a similar logic? Maybe you could have tags that describe a range of vendors and then use sideload to add the exact ID once you’ve identified a default? Using some process of elimination logic? Again, great question!

Here’s a more eloquent answer from a coworker “People will either store extra data needed as tags or do a DIY implementation to add a tag that points back to data in another system”.

Thank you, @Anaisdg. The more eloquent answer is the approach we’re taking. It’s nice to have some confirmation. It was hard to envision any other way to do it, but we thought it best to ask.

UPDATE: Flux’s ability to join multiple data sources is ultimately one of the best approaches for the use case I described.