If NewTSMReader() fails because mmap fails, do not rename the file;
the error is most likely caused by vm.max_map_count being too low
rather than by a problem with the file itself.
Closes: #25337
(cherry picked from commit ec412f793b)
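A minimal sketch of the check this implies, assuming the mmap failure
surfaces as ENOMEM; the helper name is hypothetical, not the actual
influxdb code:

```go
package tsm1sketch

import (
	"errors"
	"syscall"
)

// isMmapResourceError reports whether an open error looks like an mmap
// resource failure rather than a corrupt file. On Linux, mmap returns
// ENOMEM when vm.max_map_count is exhausted, so in that case the TSM file
// should be left in place instead of being renamed out of the way.
func isMmapResourceError(err error) bool {
	return errors.Is(err, syscall.ENOMEM)
}
```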
* fix: prevent retention service from hanging (#25055)
Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.
The fix adds two new methods to `Store`: `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll whether a shard has active
readers; the retention service uses it to skip over in-use shards so
that it does not hang. `SetShardNewReadersBlocked` controls whether new
read access may be granted to a shard. This is required to prevent race
conditions between the `InUse` check and the deletion of shards.
If the retention service skips over a shard because it is in use, the
shard will be checked again the next time the retention service runs
and can be deleted on a subsequent check once it is no longer in use.
If a shard is stuck in use, the retention service will not be able to
delete it, which can be observed in the logs for manual intervention.
Other shards can still be deleted by the retention service even if one
shard is stuck with readers.
This is a port of ad68ec8 from master-1.x to main-2.x.
closes: #25076
(cherry picked from commit b4bd607eef)
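A sketch of the ordering this describes in the retention loop; the
interface and signatures below are assumptions for illustration, not
the exact `tsdb.Store` API:

```go
package retentionsketch

// shardStore captures only the calls discussed above; real signatures may differ.
type shardStore interface {
	SetShardNewReadersBlocked(shardID uint64, blocked bool) error
	InUse(shardID uint64) (bool, error)
	DeleteShard(shardID uint64) error
}

// tryDelete blocks new readers, then deletes the shard only if no readers are
// active. A busy shard is left alone and rechecked on the next retention pass.
func tryDelete(s shardStore, shardID uint64) (deleted bool, err error) {
	if err := s.SetShardNewReadersBlocked(shardID, true); err != nil {
		return false, err
	}
	busy, err := s.InUse(shardID)
	if err != nil || busy {
		// Unblock readers so normal queries continue; try again next run.
		s.SetShardNewReadersBlocked(shardID, false)
		return false, err
	}
	return true, s.DeleteShard(shardID)
}
```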
If there was an error after the cache was snapshotted to one or more
TSM files, but before the cache and WAL were cleaned up, then the cache
would be repeatedly snapshotted, generating duplicate level 1 TSM
files.
This commit attempts to clean those files up by removing the temporary
TSM file(s). The snapshot will be retried.
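A rough sketch of the clean-up-and-retry shape described here; the
writer interface and names are assumptions:

```go
package tsm1sketch

import "os"

// snapshotWriter stands in for the code that snapshots the cache to
// temporary TSM files; it reports the paths it wrote even on error.
type snapshotWriter interface {
	WriteSnapshot() (tmpFiles []string, err error)
}

// snapshotOnce removes any temporary TSM files left behind by a failed
// snapshot so that the retried snapshot does not accumulate duplicate
// level 1 TSM files.
func snapshotOnce(w snapshotWriter) error {
	tmpFiles, err := w.WriteSnapshot()
	if err != nil {
		for _, f := range tmpFiles {
			os.Remove(f) // best effort; the snapshot will be retried
		}
	}
	return err
}
```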
* introduced tmpl from Arrow, which allows existing templates to be
reused with additional command-line properties to control output.
* duplicated the suite of ReadFloatBlock tests for ReadFloatArrayBlock
* only the float data type is tested, as the Read APIs are generated
from a single template.
multiple users have attempted to run influxdb in a docker container
with a windows host and a volume mounted from windows. that causes
problems because it apparently uses samba/cifs, which does not support
fsync on directories. with this patchset, if an fsync on a directory
returns EINVAL, as appears to happen on samba/cifs, the error is
ignored. this should help.
fixes #9833.
fixes #9630.
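a sketch of the tolerant directory fsync this describes (illustrative,
not the actual patch):

```go
package fsutil

import (
	"errors"
	"os"
	"syscall"
)

// syncDir fsyncs a directory but ignores EINVAL, which filesystems such as
// samba/cifs (e.g. a windows volume mounted into a docker container) return
// because they do not support fsync on directories.
func syncDir(dir string) error {
	fd, err := os.Open(dir)
	if err != nil {
		return err
	}
	defer fd.Close()
	if err := fd.Sync(); err != nil && !errors.Is(err, syscall.EINVAL) {
		return err
	}
	return nil
}
```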
This commit restricts the number of TSM1 files that can be opened
concurrently across the entire `tsdb.Store`. There is currently a
limit for the number of shards that can be opened concurrently;
however, that limit does not help when the number of CPU cores is
higher than the number of shards. Because TSM1 files have a 2GB limit
and there is no limit on the number of files per shard, extremely
large shards (1TB+) can load thousands of files simultaneously.
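A minimal sketch of a store-wide limit, assuming a plain semaphore; the
actual implementation may differ:

```go
package tsm1sketch

// fileOpenLimiter bounds how many TSM files may be opened at once across the
// whole store, independently of the per-shard open limit.
type fileOpenLimiter chan struct{}

func newFileOpenLimiter(n int) fileOpenLimiter { return make(fileOpenLimiter, n) }

// openTSM holds a slot while one file is opened, so even a 1TB+ shard with
// thousands of files cannot open them all simultaneously.
func (l fileOpenLimiter) openTSM(path string, open func(string) error) error {
	l <- struct{}{}        // acquire a slot
	defer func() { <-l }() // release it when the open finishes
	return open(path)
}
```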
callers can always ensure that the observer set on the engine options
is appropriate for that shard id. this simplifies the api and reduces
the chance of bugs due to mixing up shard ids.
just adds an interface for hooks that fire when these files come and
go. the hooks run before the action is taken, so that if a hook
returns an error there are no consistency problems.
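a sketch of what such a hook interface could look like; the method
names are modelled on the description above, not copied from the code:

```go
package tsm1sketch

// fileObserver is notified before TSM/WAL files are created or removed.
// Because the hooks run before the action is taken, a hook that returns an
// error leaves the file store untouched, avoiding consistency problems.
type fileObserver interface {
	FileFinishing(path string) error // about to make a new file durable
	FileUnlinking(path string) error // about to remove an existing file
}
```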
The InUse call on TSMFiles is inherently racy in the presence of
Ref calls outside of the file store mutex. In addition, we return
some TSMFiles to callers without Ref'ing them, which might allow them
to be closed out from under the caller. This is probably impossible in
practice, because the only thing that takes a handle externally is
compaction, which enforces that only one handle exists at a time and
only deletes the file once the compaction is done with it, but that is
neither obvious nor enforced.
Instead, always return a TSMFile with a Ref call under the read
lock, and require that no one else calls Ref. That way, it cannot
transition to referenced if the InUse call returns false under the
write lock.
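A sketch of the pattern being described: the reference is taken while
the read lock is still held, so a file cannot become referenced after
InUse reports false under the write lock (the types here are
simplified stand-ins):

```go
package tsm1sketch

import (
	"sync"
	"sync/atomic"
)

type tsmFile struct{ refs int64 } // simplified stand-in for TSMFile

func (f *tsmFile) Ref()        { atomic.AddInt64(&f.refs, 1) }
func (f *tsmFile) Unref()      { atomic.AddInt64(&f.refs, -1) }
func (f *tsmFile) InUse() bool { return atomic.LoadInt64(&f.refs) > 0 }

type fileStore struct {
	mu    sync.RWMutex
	files []*tsmFile
}

// acquire returns the current files with a Ref already taken under the read
// lock. Callers must Unref each file when finished and must not call Ref
// themselves.
func (s *fileStore) acquire() []*tsmFile {
	s.mu.RLock()
	defer s.mu.RUnlock()
	files := make([]*tsmFile, len(s.files))
	copy(files, s.files)
	for _, f := range files {
		f.Ref()
	}
	return files
}
```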
The CreateSnapshot method was racy in a number of ways in the presence
of multiple calls or compactions: it did not take references to the
TSMFiles, and the temporary directory it creates could have been
shared with concurrent CreateSnapshot calls. The files slice could
also have been concurrently mutated during a compaction.
Instead, under the write lock, make a local copy of the state for
the compaction, including Ref calls (write locks are implicitly
read locks). Then, there is no need for a lock at all afterward.
Add some comments to explain these issues at the call sites of InUse,
and document that the Files method that returns the slice unprotected
is only for tests.
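A sketch of the copy-under-write-lock idea, reusing the fileStore and
tsmFile stand-ins from the previous sketch:

```go
// snapshotFiles takes the per-call copy that CreateSnapshot works from:
// built under the write lock, with each file Ref'd, so no lock (and no
// shared temporary state) is needed afterwards.
func (s *fileStore) snapshotFiles() []*tsmFile {
	s.mu.Lock() // the write lock also excludes anyone taking new Refs
	defer s.mu.Unlock()
	files := make([]*tsmFile, len(s.files))
	copy(files, s.files)
	for _, f := range files {
		f.Ref() // keep the file alive until the snapshot releases it
	}
	return files
}
```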
Re-open the last WAL segment instead of creating a new one. This fixes
an issue where the last modified time of the WAL would change on
restart. It also avoids a lot of file IO churn on restart.
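A sketch of reopening the last segment for appending rather than
creating a fresh one (path handling and flags are illustrative):

```go
package walsketch

import "os"

// openLastSegment reopens an existing WAL segment in append mode instead of
// creating a brand-new segment, so a restart does not churn out new files
// or needlessly touch the WAL's on-disk state.
func openLastSegment(path string) (*os.File, error) {
	return os.OpenFile(path, os.O_RDWR|os.O_APPEND, 0o666)
}
```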
This commit firstly ensures that a shard's size on disk is accurately
reported when using the tsi1 index, by including the on-disk size of the
tsi1 index in the calculation.
Secondly, this commit adds support for shard streaming/copying when using
the tsi1 index. Prior to this, a tsi1 index would not be correctly
restored when streaming shards.
* replaces coordinating goroutines with a single k-way heap merge iterator
* removes contention from sending keys across buffered channels
startup time drops from 46s to 28s when iterating 1MM keys across 14 shards
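A sketch of a single k-way heap merge over per-shard sorted key
iterators, replacing the coordinating goroutines and per-key channel
sends (the iterator shape is an assumption):

```go
package mergesketch

import "container/heap"

// keyIterator yields keys from one shard in ascending order.
type keyIterator interface {
	Next() (key string, ok bool)
}

type heapItem struct {
	key string
	it  keyIterator
}

type keyHeap []heapItem

func (h keyHeap) Len() int            { return len(h) }
func (h keyHeap) Less(i, j int) bool  { return h[i].key < h[j].key }
func (h keyHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *keyHeap) Push(x interface{}) { *h = append(*h, x.(heapItem)) }
func (h *keyHeap) Pop() interface{} {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// mergeKeys emits the smallest current key across all iterators from a single
// goroutine, so there is no channel send (and no contention) per key.
func mergeKeys(its []keyIterator, emit func(string)) {
	h := &keyHeap{}
	for _, it := range its {
		if k, ok := it.Next(); ok {
			*h = append(*h, heapItem{key: k, it: it})
		}
	}
	heap.Init(h)
	for h.Len() > 0 {
		top := (*h)[0]
		emit(top.key)
		if k, ok := top.it.Next(); ok {
			(*h)[0].key = k
			heap.Fix(h, 0)
		} else {
			heap.Pop(h)
		}
	}
}
```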