I’m trying to figure out whether InfluxDB is a good way to store the time series data we are generating, and whether it can run the queries we want. One common query pattern we have is to select all elements from a table where a particular field’s value is a member of a largish set of other items.
For example, one set of data that we might store is a record of all files accessed on a system. I want to be able to get all such records where the file is one of 10,000 or so files in another set (e.g., all files on a system that have known security issues).
What is the best way to do this? There doesn’t seem to be a good query for it: a group by on the time series data could produce an enormous list of entities (especially if the data has been logged for a long time). Running 10k+ queries with different where clauses also seems less than ideal, since you’d really want to check each time series data point against the set of files (which can probably be memory resident) while that point is resident in memory.
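To make the idea concrete, here is a rough sketch in plain Python of the check I have in mind: keep the set of files in memory and test each record against it as the records stream past. It does not use InfluxDB at all. The record layout, the `path` field name, the sample paths, and the `filter_by_membership` helper are all made up for illustration.

```python
from typing import Dict, Iterable, Iterator, Set


def filter_by_membership(
    records: Iterable[Dict[str, object]],
    flagged: Set[str],
    field: str = "path",  # hypothetical field name holding the file path
) -> Iterator[Dict[str, object]]:
    """Yield only the records whose `field` value is in the flagged set."""
    for record in records:
        # Set lookup is O(1) on average, so a 10,000-entry set costs
        # about the same per record as a 10-entry one.
        if record.get(field) in flagged:
            yield record


# Hypothetical in-memory set of files with known security issues.
flagged_files = {"/usr/lib/libfoo.so", "/opt/app/bin/tool"}

# Hypothetical stand-in for the stream of file-access records.
access_log = [
    {"time": 1, "path": "/usr/lib/libfoo.so", "user": "alice"},
    {"time": 2, "path": "/home/bob/notes.txt", "user": "bob"},
    {"time": 3, "path": "/opt/app/bin/tool", "user": "carol"},
]

matches = list(filter_by_membership(access_log, flagged_files))
```

Doing this client-side would mean pulling every record out of the database, which is why I’m hoping the same single-pass membership test can happen on the server.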
Any good solutions to this problem?