influxdb/services/retention
Geoffrey Wossum b4bd607eef
fix: prevent retention service from hanging (#25055)
* fix: prevent retention service from hanging

Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.

The fix adds to new methods to `Store`, `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll if a shard has active readers,
which the retention service uses to skip over in-use shards to prevent
the service from hanging. `SetShardNewReadersBlocked` determines if
new read access may be granted to a shard. This is required to prevent
race conditions around the use of `InUse` and the deletion of shards.

If the retention service skips over a shard because it is in-use, the
shard will be checked again the next time the retention service is run.
It can be deleted on subsequent checks if it is no longer in-use. If
the shards is stuck in-use, the retention service will not be able to
delete the shards, which can be observed in the logs for manual
intervention. Other shards can still be deleted by the retention service
even if a shard is stuck with readers.

closes: #25054
2024-06-13 11:07:17 -05:00
..
helpers fix: prevent retention service creating orphaned shard files (#24530) 2024-01-04 11:59:28 -06:00
config.go Report subset of config values in SHOW DIAGNOSTICS 2017-03-14 11:34:19 -07:00
config_test.go Cleanup services package 2018-01-21 10:52:37 -08:00
service.go fix: prevent retention service from hanging (#25055) 2024-06-13 11:07:17 -05:00
service_test.go fix: prevent retention service from hanging (#25055) 2024-06-13 11:07:17 -05:00