I have the following three buckets:
Bucket redis_stats_1h receives data from another service, and I have downsampling tasks that pass data from redis_stats_1h to redis_stats_2h and from redis_stats_2h to redis_stats_4h. The problem is that data in each bucket is kept way longer than the retention period + shard group duration. In Data Explorer, for the following query:
from(bucket: "redis_stats_1h")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "redis_cmdstats")
|> filter(fn: (r) => r["_field"] == "calls")
|> filter(fn: (r) => r["clusterName"] == "<cluster_name>")
|> filter(fn: (r) => r["cmdstat_name"] == "info")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
|> yield(name: "mean")
I can see data from more than 24 hours ago, while there should only be data from the last 2 hours (retention period + shard group duration). The same thing happens in buckets redis_stats_2h and redis_stats_4h. Here is a plot for bucket redis_stats_2h that was generated using the same query:
The total disk size of bucket redis_stats_1h is 147M.
Note that I created these buckets from scratch; that is, I did not modify their retention periods or shard group durations.
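To check how far back the stored data actually goes, independent of the time range selected in Data Explorer, a query along the following lines could be used. This is only a sketch: the 7-day lookback is an arbitrary assumption chosen to be much longer than the expected 2 hours, and the measurement and field names are taken from the query above.

```
// Sketch: find the oldest point still stored in the bucket.
// The -7d lookback is an assumption, picked to be far longer than
// the expected retention period + shard group duration.
from(bucket: "redis_stats_1h")
    |> range(start: -7d)
    |> filter(fn: (r) => r["_measurement"] == "redis_cmdstats")
    |> filter(fn: (r) => r["_field"] == "calls")
    // Merge all series into one table so a single oldest row is returned.
    |> group()
    |> sort(columns: ["_time"])
    |> limit(n: 1)
    |> keep(columns: ["_time"])
```

If the returned `_time` is older than the retention period plus the shard group duration, the old data is really still on disk and is not just an artifact of how the plot is windowed.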
I could not find any information in the InfluxDB docs on why this could be happening. Is this a bug, or am I doing something wrong?
I use InfluxDB 2.0.8.