Recently I tried using InfluxDB OSS to query a timestamp-based dataset (1.5 million points). The timestamps are in seconds, and all points fall within a fixed time range (1994 to 2018).
A single point has this form:
category: Food, categoryId: 1234, rating: 3.3, timestamp: 1147880044
The only InfluxDB tag is categoryId.
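To make the schema concrete, here is a minimal sketch of how such a point could be serialized to InfluxDB line protocol, with categoryId as the tag and the remaining values as fields. The measurement name "ratings" and the helper names are hypothetical (the measurement is not given above), and the timestamp is assumed to be written with second precision.

```python
def escape_tag(value: str) -> str:
    """Escape the characters that are special in tag keys and values."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def format_field(value) -> str:
    """Render a field value using line-protocol type syntax."""
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    if isinstance(value, bool):  # must be checked before int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    return repr(float(value))


def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp: int) -> str:
    """Build one line: measurement,tag_set field_set timestamp."""
    tag_set = ",".join(f"{escape_tag(k)}={escape_tag(str(v))}" for k, v in tags.items())
    field_set = ",".join(f"{escape_tag(k)}={format_field(v)}" for k, v in fields.items())
    return f"{measurement},{tag_set} {field_set} {timestamp}"


# "ratings" is a hypothetical measurement name used only for illustration.
line = to_line_protocol(
    "ratings",
    {"categoryId": "1234"},
    {"category": "Food", "rating": 3.3},
    1147880044,
)
print(line)
```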
I noticed a slow response when I ran the following query (via curl) with the default RP configuration. It took 12.9 s:
SELECT MEAN(rating), LAST(category) FROM "db0"."autogen" WHERE time >= '1994-01-01' AND time <= '2008-01-01' GROUP BY categoryId
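For anyone who wants to reproduce the request, this is a rough sketch of the URL such a query would be sent to on the InfluxDB 1.x /query endpoint. The host and port (localhost:8086) and the measurement name "ratings" are assumptions for illustration, not values from my setup.

```python
from urllib.parse import urlencode

# Assumption: a local InfluxDB 1.x instance on the default port.
INFLUX_QUERY_ENDPOINT = "http://localhost:8086/query"

# Assumption: "ratings" is a hypothetical measurement name.
QUERY = (
    'SELECT MEAN(rating), LAST(category) FROM "db0"."autogen"."ratings" '
    "WHERE time >= '1994-01-01' AND time <= '2008-01-01' "
    "GROUP BY categoryId"
)


def build_query_url(endpoint: str, db: str, query: str, epoch: str = "s") -> str:
    """Return the GET URL for an InfluxQL query; epoch sets the timestamp unit."""
    params = urlencode({"db": db, "q": query, "epoch": epoch})
    return f"{endpoint}?{params}"


url = build_query_url(INFLUX_QUERY_ENDPOINT, "db0", QUERY)
print(url)
```

The printed URL can be passed to curl directly, for example with `time curl -G '<url>'`, to measure the response time.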
I inspected the shards that were created and found that the default RP configuration had produced a large number of them, so I changed the shard duration (CREATE DATABASE "db0" WITH SHARD DURATION 2190d NAME "shard_years") and re-ingested the data. When I reran the query, the response came back in 2 s.
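A back-of-the-envelope sketch of why the shard duration matters here. It assumes the default autogen RP uses a 7-day shard group duration (the documented InfluxDB 1.x default for a retention policy with infinite duration), and the start and end dates are placeholders for the 1994 to 2018 range. The function gives only an upper bound, since a shard group is created only when a point is written into its time window.

```python
import math
from datetime import datetime, timedelta, timezone

# Placeholder bounds for the dataset's time range (1994 to 2018).
START = datetime(1994, 1, 1, tzinfo=timezone.utc)
END = datetime(2018, 1, 1, tzinfo=timezone.utc)


def max_shard_groups(start: datetime, end: datetime, duration: timedelta) -> int:
    """Upper bound on fixed-width shard groups overlapping [start, end].

    An interval can touch at most ceil(span / duration) + 1 aligned windows,
    whatever the alignment of the windows is.
    """
    return math.ceil((end - start) / duration) + 1


default_groups = max_shard_groups(START, END, timedelta(days=7))    # assumed default
custom_groups = max_shard_groups(START, END, timedelta(days=2190))  # "shard_years"

print(f"7d shard groups:    at most {default_groups}")
print(f"2190d shard groups: at most {custom_groups}")
```

Under these assumptions the default configuration can spread the data over more than a thousand shard groups, while the 2190d setting needs only a handful, which is consistent with the drop from 12.9 s to 2 s.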
For both tests, I used the InfluxDB Docker image on a MacBook Pro with 16 GB of RAM and 6 cores.
I'd like to get sub-second queries for my scenario. Am I missing something in the DB configuration?
Any tips are welcome.
Thanks in advance,