Noob here. I work at a financial services company, and we receive a stream of transactional data that we have to process in near real time. Specifically, data arrives on the order of 15,000 transactions per second off a Kafka topic, and we have to run comparison operations on the records as they roll by. The records are very wide (1,900 columns or thereabouts), but the comparisons touch very few columns (~10-20). Our comparison window is about a minute.
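To put rough numbers on that, here is a back-of-envelope calculation in Python. It uses only the approximate figures above (15,000 tx/s, a 60-second window, ~1,900 columns, up to ~20 compared columns); the variable names are mine, and the results are estimates, not measurements.

```python
# Back-of-envelope sizing from the approximate figures in the question.
TX_PER_SECOND = 15_000     # approximate arrival rate off the Kafka topic
WINDOW_SECONDS = 60        # comparison window of about a minute
TOTAL_COLUMNS = 1_900      # approximate record width
COMPARED_COLUMNS = 20      # upper end of the ~10-20 columns actually compared

# Records that are "live" inside the window at any moment.
records_in_window = TX_PER_SECOND * WINDOW_SECONDS

# Field values held if every column is stored vs. only the compared ones.
all_values = records_in_window * TOTAL_COLUMNS
compared_values = records_in_window * COMPARED_COLUMNS

print(f"records in window:        {records_in_window:,}")
print(f"values, all columns:      {all_values:,}")
print(f"values, compared columns: {compared_values:,}")
```

So the window holds roughly 900,000 records at a time, and storing every column means about 95 times more field values than storing only the compared ones.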
What we would like to be able to do is insert all of the records into InfluxDB, execute the comparison operations, and then do a continuous read on the back end of the window. Something like (again, noob, so bear with me):
Insert into COMPARISON_WINDOW (kafka data)
Create Continuous Query (get_results) on data
BEGIN
Select *
from COMPARISON_WINDOW
where time <= now() - 60s
END
My theory is that this approach would drain the desired data from InfluxDB after the comparison window is up on a continual basis. The output of that query could then be piped to a Kafka topic for downstream consumption.
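To make the intended behaviour concrete, here is a minimal Python sketch of the "hold for a minute, then drain" logic, independent of InfluxDB or Kafka. Everything in it is hypothetical illustration: the `ComparisonWindow` class, its `insert` and `drain` methods, and the record layout are names I made up, and it assumes records arrive in timestamp order.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Tuple

Record = Dict[str, Any]


class ComparisonWindow:
    """Hypothetical in-memory stand-in for the COMPARISON_WINDOW idea."""

    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        # (timestamp, record) pairs, oldest on the left.
        # Assumption: inserts arrive in non-decreasing timestamp order.
        self._buffer: Deque[Tuple[float, Record]] = deque()

    def insert(self, timestamp: float, record: Record) -> None:
        """Add a record to the window (the 'insert' step)."""
        self._buffer.append((timestamp, record))

    def drain(self, now: float) -> List[Record]:
        """Remove and return records with time <= now - window_seconds."""
        cutoff = now - self.window_seconds
        expired: List[Record] = []
        while self._buffer and self._buffer[0][0] <= cutoff:
            expired.append(self._buffer.popleft()[1])
        return expired


# Usage: timestamps are plain seconds for readability.
window = ComparisonWindow(window_seconds=60.0)
window.insert(0.0, {"id": 1})
window.insert(30.0, {"id": 2})
window.insert(61.0, {"id": 3})

# At now=90 the cutoff is 30, so the records stamped 0 and 30 are drained.
drained = window.drain(now=90.0)
print([r["id"] for r in drained])
```

In the real pipeline, whatever `drain` returns is what would be published to the downstream Kafka topic; the comparison step itself is left out of the sketch.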
Is this reasonable, or am I barking up the wrong tree?