I’ve been using InfluxDB without any issues, but it suddenly stopped responding. The InfluxDB version is 2.6.0.
I am relatively new to Go and still learning about concurrency and locking mechanisms. I suspect that a deadlock might occur in InfluxDB 2.6.0 related to the TagValueIterator() function when interacting with AddSeriesList().
Specifically, I wonder if the following sequence could lead to a deadlock scenario:
- TagValueIterator() acquires the first RLock() on f.mu.
- AddSeriesList() then attempts to acquire a write Lock() on f.mu.
- Meanwhile, tk.TagValueIterator() attempts to acquire another RLock(), which might cause a deadlock.
If tk.f is pointing to the same LogFile instance, is it possible that this sequence could result in a deadlock? Go’s sync.RWMutex does not grant a write lock (Lock()) while a read lock (RLock()) is held, and a waiting Lock() call blocks any new RLock() calls, so recursive read locking is not safe. I am curious whether this could be an issue under certain conditions.
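If this is indeed the cause, I assume the usual way to avoid it is to take the read lock only once at the public entry point and have inner code call a helper that expects the lock to be held already. The sketch below shows that pattern with hypothetical types (logFile, tagKey, valuesLocked); it is not how InfluxDB is actually written or patched, just my understanding of the general technique.

```go
package main

import (
	"fmt"
	"sync"
)

// logFile and tagKey are hypothetical stand-ins, not the real tsi1 types.
type logFile struct {
	mu   sync.RWMutex
	keys map[string]*tagKey
}

type tagKey struct {
	f      *logFile
	values []string
}

func newLogFile() *logFile {
	return &logFile{keys: make(map[string]*tagKey)}
}

// valuesLocked assumes the caller already holds f.mu and takes no lock itself.
func (tk *tagKey) valuesLocked() []string {
	return append([]string(nil), tk.values...) // copy so callers cannot race on the slice
}

// Values is the public entry point on tagKey and takes the read lock itself.
func (tk *tagKey) Values() []string {
	tk.f.mu.RLock()
	defer tk.f.mu.RUnlock()
	return tk.valuesLocked()
}

// TagValues takes the read lock exactly once and only calls the *Locked
// helper, so a pending writer can never sit between two nested RLock calls.
func (f *logFile) TagValues(key string) []string {
	f.mu.RLock()
	defer f.mu.RUnlock()
	tk, ok := f.keys[key]
	if !ok {
		return nil
	}
	return tk.valuesLocked()
}

// AddValue is the writer side and takes the exclusive lock.
func (f *logFile) AddValue(key, value string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	tk, ok := f.keys[key]
	if !ok {
		tk = &tagKey{f: f}
		f.keys[key] = tk
	}
	tk.values = append(tk.values, value)
}

func main() {
	f := newLogFile()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			f.AddValue("host", fmt.Sprintf("server-%d", i))
			_ = f.TagValues("host") // concurrent reads and writes, no nested RLock
		}(i)
	}
	wg.Wait()
	fmt.Println(len(f.TagValues("host")))
}
```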
Below is the pprof output from when the issue first occurred. In this state, the lock is not being released. Additionally, deletion and queries are not working.
goroutine 106814401 [semacquire, 6 minutes]:
sync.runtime_SemacquireMutex(0xc00015020c?, 0x78?, 0x3?)
/usr/local/go/src/runtime/sema.go:77 +0x25
sync.(*RWMutex).Lock(0xc023a48620?)
/usr/local/go/src/sync/rwmutex.go:152 +0x71
github.com/influxdata/influxdb/v2/tsdb/index/tsi1.(*LogFile).AddSeriesList(0xc0095a71d0, 0xc000150200, {0xc00863f800?, 0x13, 0x0?}, {0xc00863fb00?, 0x13, 0xc00e37daf8?})
influxdb-2.6.0/tsdb/index/tsi1/log_file.go:545 +0x4a5
github.com/influxdata/influxdb/v2/tsdb/index/tsi1.(*Partition).createSeriesListIfNotExists(0xc037ff10e0, {0xc00863f800, 0x13, 0x20}, {0xc00863fb00, 0x13, 0x20})
influxdb-2.6.0/tsdb/index/tsi1/partition.go:725 +0x165
github.com/influxdata/influxdb/v2/tsdb/index/tsi1.(*Index).CreateSeriesListIfNotExists.func1()
influxdb-2.6.0/tsdb/index/tsi1/index.go:680 +0x13e
created by github.com/influxdata/influxdb/v2/tsdb/index/tsi1.(*Index).CreateSeriesListIfNotExists
influxdb-2.6.0/tsdb/index/tsi1/index.go:673 +0x1dd
~~~
goroutine 106815338 [semacquire, 6 minutes]:
sync.runtime_SemacquireMutex(0x4?, 0x40?, 0x2?)
/usr/local/go/src/runtime/sema.go:77 +0x25
sync.(*RWMutex).RLock(...)
/usr/local/go/src/sync/rwmutex.go:71
github.com/influxdata/influxdb/v2/tsdb/index/tsi1.(*LogFile).MeasurementIterator(0xc0095a71d0)
influxdb-2.6.0/tsdb/index/tsi1/log_file.go:784 +0x6b
~~~
goroutine 106814631 [semacquire, 6 minutes]:
sync.runtime_SemacquireMutex(0x3318308?, 0x38?, 0xc?)
/usr/local/go/src/runtime/sema.go:77 +0x25
sync.(*RWMutex).RLock(...)
/usr/local/go/src/sync/rwmutex.go:71
github.com/influxdata/influxdb/v2/tsdb/index/tsi1.(*logTagKey).TagValueIterator(0xc02a1a6fb8)
influxdb-2.6.0/tsdb/index/tsi1/log_file.go:1385 +0x51
github.com/influxdata/influxdb/v2/tsdb/index/tsi1.(*LogFile).TagValueIterator(0xc0095a71d0?, {0xc04537e640?, 0xa?, 0x158ed72?}, {0xc03be04a20, 0x9, 0x28?})
influxdb-2.6.0/tsdb/index/tsi1/log_file.go:432 +0x185
~~~
I appreciate any insights!