I’m looking for a way to downsample a set of points using a specific timestamp. I’d like to know whether that’s possible with a continuous query (CQ) or Kapacitor before falling back to a custom solution (a scheduled script or something similar).
I need this to reduce the overall amount of data. Here is the plan:
- I fetch data every 15 seconds (including some large text values)
- I split the gathered data and store the text and its key (a hash) in a separate RP, “temp”, with a duration of 1h
- Using a built-in solution like a CQ or Kapacitor, I would:
  - read the data from the “temp” RP every few minutes
  - write it with a fixed timestamp to a different RP
As time passes, my “temp” RP will contain data like this:
time | hash | text |
---|---|---|
2020-10-19 11:00 | 001 | “aaa” |
2020-10-19 11:00 | 002 | “bbb” |
2020-10-19 11:15 | 001 | “aaa” |
2020-10-19 11:15 | 002 | “bbb” |
2020-10-19 11:30 | 001 | “aaa” |
2020-10-19 11:30 | 003 | “ccc” |
2020-10-19 11:45 | 004 | “ddd” |
2020-10-19 11:45 | 005 | “eee” |
… | … | … |
2020-10-19 14:00 | 001 | “aaa” |
2020-10-19 14:00 | 002 | “bbb” |
2020-10-19 14:15 | 005 | “eee” |
2020-10-19 14:15 | 006 | “fff” |
The final result I want stores every distinct hash once, all at the same exact timestamp, so that the text is no longer repeated every 15 seconds but stored once a day. Writing a point with the same key (time + hash) just updates the existing value, which leaves me with fewer points:
time | hash | text |
---|---|---|
2020-10-19 00:00 | 001 | “aaa” |
2020-10-19 00:00 | 002 | “bbb” |
2020-10-19 00:00 | 003 | “ccc” |
2020-10-19 00:00 | 004 | “ddd” |
2020-10-19 00:00 | 005 | “eee” |
2020-10-19 00:00 | 006 | “fff” |
Ideally, this upsert (insert/update) is performed every few minutes (every 1 to 5 minutes).
Is this possible using a CQ or Kapacitor (without a UDF)?
From what I’ve found so far, the answer is “no”, sadly.
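To make the intended behaviour concrete, here is a minimal sketch of the upsert in plain Python, independent of any InfluxDB client. It only illustrates the logic a scheduled script would need. The names `day_start` and `upsert_daily` are hypothetical, the dictionary stands in for the target RP, and it assumes that `hash` is stored as a tag so that (time, hash) identifies a point.

```python
from datetime import datetime

def day_start(ts: datetime) -> datetime:
    """Truncate a timestamp to midnight of the same day (the fixed timestamp)."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

def upsert_daily(store: dict, points: list) -> dict:
    """Write each (time, hash, text) point under the key (day_start, hash).

    A later point with the same key overwrites the earlier one, which mimics
    writing a point with an identical timestamp and tag set.
    """
    for ts, hash_key, text in points:
        store[(day_start(ts), hash_key)] = text
    return store

# Sample rows taken from the "temp" RP table in the question.
temp_points = [
    (datetime(2020, 10, 19, 11, 0), "001", "aaa"),
    (datetime(2020, 10, 19, 11, 0), "002", "bbb"),
    (datetime(2020, 10, 19, 11, 15), "001", "aaa"),
    (datetime(2020, 10, 19, 11, 15), "002", "bbb"),
    (datetime(2020, 10, 19, 11, 30), "001", "aaa"),
    (datetime(2020, 10, 19, 11, 30), "003", "ccc"),
    (datetime(2020, 10, 19, 11, 45), "004", "ddd"),
    (datetime(2020, 10, 19, 11, 45), "005", "eee"),
    (datetime(2020, 10, 19, 14, 0), "001", "aaa"),
    (datetime(2020, 10, 19, 14, 0), "002", "bbb"),
    (datetime(2020, 10, 19, 14, 15), "005", "eee"),
    (datetime(2020, 10, 19, 14, 15), "006", "fff"),
]

daily = upsert_daily({}, temp_points)
for (ts, hash_key), text in sorted(daily.items()):
    print(ts.strftime("%Y-%m-%d %H:%M"), hash_key, text)
```

Running the function again on a later batch from “temp” with the same dictionary only adds hashes that have not been seen yet that day, so the twelve sample rows collapse to six.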