Tag values not expiring

Hello,

I am experimenting with InfluxDB 1.4 and I’m noticing that tag values still show up in show tag values from "raw_data" with key=route despite some of those values having no data points in the raw_data measurement (I have a short two-day retention policy).

Here’s roughly how I have things set up:

I have three RPs:

  • Autogen (not used)
  • 2_days
  • 90_days

Data comes into the raw_data measurement in the 2_days RP. The data has a route tag and a value.

Every 15 minutes, a Kapacitor script downsamples the data and writes the new samples into a different measurement (say data_15m) in the 90_days RP. These new samples are also tagged by route.

I’m assuming that I should be able to use show tag values from "raw_data" with key=route to get a list of routes that have been seen over the past 2ish days, but I definitely see routes in that list that return no results when I try to select them from the raw_data measurement.

Restarting InfluxDB seems to prune these orphaned tag values. As such, I have two questions for the group:

  1. Is this behavior intended?
  2. Is there a way I can trigger this pruning without having to restart InfluxDB?
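For anyone who wants to confirm the mismatch on their own instance, here is a minimal sketch that compares what SHOW TAG VALUES lists against the routes that still have points in the 2_days RP. It assumes the InfluxDB 1.x /query HTTP endpoint; the base URL and database name in the usage note are placeholders, and the helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Statements based on the post; the 2_days RP and route tag come from the setup above.
SHOW_ROUTES = 'SHOW TAG VALUES FROM "raw_data" WITH KEY = "route"'
LIVE_ROUTES = 'SELECT count(*) FROM "2_days"."raw_data" GROUP BY "route"'


def run_query(base_url, db, statement):
    """Send one read-only InfluxQL statement to the 1.x /query endpoint."""
    params = urllib.parse.urlencode({"db": db, "q": statement})
    with urllib.request.urlopen(f"{base_url}/query?{params}") as resp:
        return json.load(resp)


def _series(response):
    """Yield every series object in a /query JSON response."""
    for result in response.get("results", []):
        yield from result.get("series", [])


def indexed_routes(show_response):
    """Values listed by SHOW TAG VALUES; each row is [key, value]."""
    return {row[1] for s in _series(show_response) for row in s["values"]}


def live_routes(select_response):
    """Routes that still have points; GROUP BY puts the tag in s["tags"]."""
    return {s["tags"]["route"] for s in _series(select_response) if "tags" in s}


def stale_routes(show_response, select_response):
    """Routes the index still lists but that no longer have any points."""
    return sorted(indexed_routes(show_response) - live_routes(select_response))
```

Usage would look like stale_routes(run_query(url, "mydb", SHOW_ROUTES), run_query(url, "mydb", LIVE_ROUTES)), where url and "mydb" stand in for your own server and database. A non-empty result is the set of orphaned tag values described above.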

That’s an interesting observation.
I don’t think this behavior was intended; it was probably overlooked.

Looking at store.go on the master branch of influxdata/influxdb on GitHub,
it appears that deleting a shard checks whether its series are still used in other shards and, if they are not, deletes them from the series file.
However, it doesn’t delete them from memory and doesn’t kick off a rebuild of the in-memory index.
This also explains why the series from the dropped shard no longer show up after restarting InfluxDB.

Workaround 1.
I found that the in-memory index Rebuild is triggered by DeleteSeriesRange (see https://github.com/influxdata/influxdb/blob/master/tsdb/index/inmem/inmem.go and engine.go in the same repository),
but I didn’t find any other direct way to force the Rebuild.
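One way to reach that code path from the outside might be an InfluxQL DELETE for a route that is already known to be stale. This is only a sketch under assumptions: I have not verified that a DELETE ends up calling DeleteSeriesRange and rebuilding the index on 1.4, so treat it as something to test on a non-production instance. As far as I know, DELETE cannot be scoped to a retention policy, so it removes that route's points from raw_data in every RP (data_15m is a different measurement and is not touched). The helper names, URL, and database are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def quote_ident(name):
    """Double-quote an InfluxQL identifier, escaping backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def quote_string(value):
    """Single-quote an InfluxQL string literal, escaping backslashes and quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def delete_route_statement(measurement, route):
    """Build a DELETE that targets a single route tag value in one measurement."""
    return (
        f"DELETE FROM {quote_ident(measurement)} "
        f"WHERE {quote_ident('route')} = {quote_string(route)}"
    )


def post_statement(base_url, db, statement):
    """POST a modifying statement to the 1.x /query endpoint (q in the form body)."""
    url = f"{base_url}/query?" + urllib.parse.urlencode({"db": db})
    body = urllib.parse.urlencode({"q": statement}).encode()
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return json.load(resp)
```

For example, post_statement("http://localhost:8086", "mydb", delete_route_statement("raw_data", "42")) would send DELETE FROM "raw_data" WHERE "route" = '42'. Re-running show tag values afterwards tells you whether the stale value was actually pruned.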

Workaround 2.
As a quick workaround, it is possible to switch to the TSI1 index so that the index is no longer stored in memory.
With TSI1 the index lives on disk and is mapped into memory by the OS, so deleting series from the series file should be reflected without a restart.

I would also recommend opening an issue in the influxdata/influxdb issue tracker on GitHub.

Thank you very much!