influxdb

Commit Graph

Author	SHA1	Message	Date
Geoffrey Wossum	b4bd607eef	fix: prevent retention service from hanging (#25055) * fix: prevent retention service from hanging Fix issue that can cause the retention service to hang waiting on a `Shard.Close` call. When this occurs, no other shards will be deleted by the retention service. This is usually noticed as an increase in disk usage because old shards are not cleaned up. The fix adds two new methods to `Store`, `SetShardNewReadersBlocked` and `InUse`. `InUse` can be used to poll if a shard has active readers, which the retention service uses to skip over in-use shards to prevent the service from hanging. `SetShardNewReadersBlocked` determines if new read access may be granted to a shard. This is required to prevent race conditions around the use of `InUse` and the deletion of shards. If the retention service skips over a shard because it is in-use, the shard will be checked again the next time the retention service is run. It can be deleted on subsequent checks if it is no longer in-use. If a shard is stuck in-use, the retention service will not be able to delete it, which can be observed in the logs for manual intervention. Other shards can still be deleted by the retention service even if a shard is stuck with readers. closes: #25054	2024-06-13 11:07:17 -05:00
Geoffrey Wossum	7bd3f89d18	fix: prevent retention service creating orphaned shard files (#24530) * fix: prevent retention service creating orphaned shard files Under certain circumstances, the retention service can fail to delete shards from the store in a timely manner. When the shard groups are pruned based on age, this leaves orphaned shard files on the disk. The retention service will then not attempt to remove the obsolete shard files because the meta store does not know about them. This can cause excessive disk space usage for some users. This corrects that by requiring shard files to be deleted before they can be removed from the meta store. fixes: #24529	2024-01-04 11:59:28 -06:00
Edd Robinson	663566e3e0	Ensure go fmt passes on 1.10/11	2018-08-21 17:39:42 +01:00
Ben Johnson	1fe9abd66f	Delete deleted shards in retention service.	2018-03-28 10:44:14 -06:00
Jonathan A. Sternberg	b775ad3d5d	Expand unit test code coverage in services that were undercovered This expands code coverage for the following packages: * monitor (3.5% -> 86.9%) * services/precreator (31.6% -> 83.8%) * services/retention (83.0% -> 84.9%) * services/snapshotter (0.0% -> 82.1%) * tcp (48.7% -> 60.0%)	2017-11-28 15:44:35 -06:00
Jonathan A. Sternberg	ca5a773c34	Initial jenkinsfile	2017-11-13 14:02:23 -06:00
Jonathan A. Sternberg	0b7c56bcd8	Update the zap logger dependency The previous sha was taken from a revision on a devel branch that I thought would continue staying in the tree after it was merged. That revision was rebased away and the API was changed for the logger. This updates the usage of the logger and adds a simple package for constructing the base logger. The 1.0 version of zap changed the format of the default console logger so this change moves over to this new logger instead of attempting to retain backwards compatibility with the old format.	2017-11-10 16:27:16 -06:00
Edd Robinson	2ea2abb001	Remove possibility of race when dropping shards Fixes #8819. Previously, the process of dropping expired shards according to the retention policy duration was managed by two independent goroutines in the retention policy service. This behaviour was introduced in #2776, at a time when there were both data and meta nodes in the OSS codebase. The idea was that only the leader meta node would run the meta data deletions in the first goroutine, and all other nodes would run the local deletions in the second goroutine. InfluxDB no longer operates in that way and so we ended up with two independent goroutines that were carrying out actions that really depended on each other. If the second goroutine runs before the first then it may not see the meta data changes indicating shards should be deleted and it won't delete any shards locally. Shortly after this the first goroutine will run and remove the meta data for the shard groups. This results in a situation where it looks like the shards have gone, but in fact they remain on disk (and importantly, their series within the index) until the next time the second goroutine runs. By default that's 30 minutes. In the case where the shards to be removed would have removed the last occurrences of some series, then it's possible that if the database was already at its maximum series limit (or tag limit for that matter), no further new series can be inserted.	2017-10-26 16:15:13 +01:00
Edd Robinson	77977af685	Add repro test for #8819	2017-10-26 14:47:30 +01:00
Edd Robinson	1629ec7f5f	Add tests to Retention service	2017-10-26 14:47:30 +01:00

10 Commits (master-1.x)
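
The two 2024 commit messages above describe a deletion order for the retention service: block new readers on a shard, poll whether it is in use, skip it if so, and delete the shard files before dropping the shard from the meta store. The following is a minimal sketch of that flow, not the actual InfluxDB code. The `shardStore` and `metaClient` interfaces, the method signatures, `DeleteShard`, `DropShard`, `retentionPass`, and the in-memory fakes are all assumptions made for illustration; only the names `SetShardNewReadersBlocked` and `InUse` come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// shardStore is a hypothetical stand-in for the Store methods named in the
// commit message. The signatures are assumptions.
type shardStore interface {
	SetShardNewReadersBlocked(id uint64, blocked bool) error
	InUse(id uint64) (bool, error)
	DeleteShard(id uint64) error
}

// metaClient is a hypothetical stand-in for the meta store.
type metaClient interface {
	DropShard(id uint64) error
}

// fakeStore is an in-memory store used only to exercise the sketch.
type fakeStore struct {
	readers map[uint64]int  // active reader count per shard
	blocked map[uint64]bool // whether new readers are blocked
	deleted []uint64        // shards whose files were deleted, in order
}

func (s *fakeStore) SetShardNewReadersBlocked(id uint64, blocked bool) error {
	if _, ok := s.readers[id]; !ok {
		return errors.New("shard not found")
	}
	s.blocked[id] = blocked
	return nil
}

func (s *fakeStore) InUse(id uint64) (bool, error) {
	n, ok := s.readers[id]
	if !ok {
		return false, errors.New("shard not found")
	}
	return n > 0, nil
}

func (s *fakeStore) DeleteShard(id uint64) error {
	delete(s.readers, id)
	delete(s.blocked, id)
	s.deleted = append(s.deleted, id)
	return nil
}

// fakeMeta records which shards were dropped from the meta store.
type fakeMeta struct{ dropped []uint64 }

func (m *fakeMeta) DropShard(id uint64) error {
	m.dropped = append(m.dropped, id)
	return nil
}

// retentionPass tries to delete every expired shard once and returns the IDs
// it had to skip. Skipped shards are left for the next pass.
func retentionPass(store shardStore, meta metaClient, expired []uint64) []uint64 {
	var skipped []uint64
	for _, id := range expired {
		// Block new readers first so the shard cannot become in-use
		// between the InUse check and the delete.
		if err := store.SetShardNewReadersBlocked(id, true); err != nil {
			skipped = append(skipped, id)
			continue
		}
		inUse, err := store.InUse(id)
		if err != nil || inUse {
			// Do not wait on the shard; unblock it and move on.
			_ = store.SetShardNewReadersBlocked(id, false)
			skipped = append(skipped, id)
			continue
		}
		// Delete the files before touching the meta store, so a failure
		// here cannot leave orphaned files the meta store has forgotten.
		if err := store.DeleteShard(id); err != nil {
			_ = store.SetShardNewReadersBlocked(id, false)
			skipped = append(skipped, id)
			continue
		}
		if err := meta.DropShard(id); err != nil {
			skipped = append(skipped, id)
		}
	}
	return skipped
}

func main() {
	store := &fakeStore{
		readers: map[uint64]int{1: 0, 2: 1, 3: 0}, // shard 2 has a reader
		blocked: map[uint64]bool{},
	}
	meta := &fakeMeta{}
	skipped := retentionPass(store, meta, []uint64{1, 2, 3})
	fmt.Println("skipped:", skipped)
	fmt.Println("files deleted:", store.deleted)
	fmt.Println("meta dropped:", meta.dropped)
}
```

In this sketch a shard with an active reader is skipped without blocking the loop, so the remaining expired shards are still deleted in the same pass, which is the behavior the commit message describes.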