tsdb.Engine.IsIdle and tsdb.Engine.Digest now return a reason string for why the engine and shard are not idle.
Callers can then use this string for logging, if desired. The returned reason does not allocate memory, so the
caller may want to add the shard ID and path to the log message for more context. This is intended to be used in
calls from the anti-entropy service in Enterprise.
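A minimal sketch of the intended logging pattern, assuming IsIdle now returns the idle flag alongside the reason;
the helper and accessors below are illustrative, not the actual Enterprise call site:

    import "log"

    // logIfBusy is a hypothetical helper. The reason string is static and
    // allocation-free, so the shard ID and path are added here to make the
    // log line useful.
    func logIfBusy(id uint64, path string, e interface{ IsIdle() (bool, string) }) {
        if idle, reason := e.IsIdle(); !idle {
            log.Printf("shard %d at %s is not idle: %s", id, path, reason)
        }
    }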
(cherry picked from commit bf45841359)
fixes https://github.com/influxdata/influxdb/issues/21448
Extending the context instead of fixing the API breaks type safety.
For tracking the number of points / values written, it is much clearer
to pass an explicit tracker.
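A hedged sketch of the contrast; WriteTracker and its fields are hypothetical stand-ins for the real tracker type:

    // WriteTracker is passed explicitly, so every call site is checked by
    // the compiler.
    type WriteTracker struct {
        Points int64
        Values int64
    }

    func writeValues(t *WriteTracker, valuesPerPoint []int) {
        for _, n := range valuesPerPoint {
            t.Points++
            t.Values += int64(n)
        }
    }

    // The context-based alternative type-asserts at runtime and can fail
    // silently if the key or the type is wrong:
    //   t, _ := ctx.Value(trackerKey).(*WriteTracker) // may be nil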
When an InfluxDB database is very busy writing new points, the backup
process can fail because it cannot take a new snapshot.
The error is: operation timed out with error: create snapshot: snapshot in progress.
This happens because InfluxDB snapshots the cache almost continuously,
driven by the high number of points being ingested.
The fix for this was https://github.com/influxdata/influxdb/pull/16627,
but it was for OSS only and was not in the code path for backups
in clusters.
This fix adds a skipCacheOk flag to tsdb.Engine.CreateSnapshot().
A value of true allows the backup to proceed even if a cache snapshot
cannot be taken.
This flag is set to true in tsm1.Engine.Backup(), the OSS backup code path,
and in tsdb.Shard.CreateSnapshot(), the cluster backup code path.
It is set to false in tsm1.Engine.Export().
https://github.com/influxdata/plutonium/issues/3227
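A minimal sketch of the resulting call pattern; the engine type below is a stand-in and the exact signature
may differ:

    type snapshotEngine struct{}

    // CreateSnapshot sketches the new flag: when skipCacheOk is true and a
    // cache snapshot is already in progress, proceed with the existing TSM
    // files instead of returning an error.
    func (e *snapshotEngine) CreateSnapshot(skipCacheOk bool) (string, error) {
        // ... attempt the cache snapshot, tolerating failure if skipCacheOk ...
        return "/path/to/snapshot", nil
    }

    func backup(e *snapshotEngine) error {
        _, err := e.CreateSnapshot(true) // backup paths tolerate a missed cache snapshot
        return err
    }

    func export(e *snapshotEngine) error {
        _, err := e.CreateSnapshot(false) // export requires the cache snapshot
        return err
    }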
This commit restricts the number of TSM1 files that can be opened
concurrently across the entire `tsdb.Store`. There is currently
a limit on the number of shards that can be opened concurrently;
however, that limit does not help when the number of CPU cores
is higher than the number of shards. Because TSM1 files have a 2GB
limit and there is no limit on the number of files per shard,
extremely large shards (1TB+) can load thousands of files simultaneously.
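One way to impose such a store-wide bound is a counting semaphore shared by all shards; the channel pattern
and capacity below are illustrative, not the actual implementation:

    import "os"

    // storeOpenLimit is shared across the whole tsdb.Store, so the bound
    // holds even when a single huge shard has thousands of TSM files.
    var storeOpenLimit = make(chan struct{}, 64) // capacity chosen for illustration

    func openTSMFile(path string) (*os.File, error) {
        storeOpenLimit <- struct{}{}        // acquire a slot
        defer func() { <-storeOpenLimit }() // release it when the open returns
        return os.Open(path)
    }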
Callers can always ensure that the observer set on the engine options
is appropriate for that shard ID. This simplifies the API and reduces
the chance of bugs due to mixing up shard IDs.
This adds an interface for hooks that fire when these files come and go.
The hooks run before the action is taken, so that if a hook returns an
error there are no consistency problems.
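A hedged rendering of what such a hook interface might look like; the name and methods are hypothetical:

    // FileObserver is called before each file operation, so an error from
    // a hook leaves the on-disk state unchanged.
    type FileObserver interface {
        FileNew(path string) error    // a file is about to be created
        FileUnlink(path string) error // a file is about to be removed
    }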
No appreciable changes in benchmark results; this function appears to
account for less than 4% of CPU time in the write workloads, at least
in these benchmarks.
* Filters allow specific combinations of database, retention policy and
shard group to be opened. This was added to reduce the start-up time
of the export tool and to limit its memory usage; a sketch of the idea
follows below.
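A minimal sketch of such a filter as a predicate over database, retention policy and shard group; the type
and names are hypothetical:

    // ShardFilter reports whether a shard group should be opened at all.
    type ShardFilter func(db, rp string, shardGroupID uint64) bool

    // Example: open only shard groups in mydb's "autogen" retention policy,
    // skipping everything else to save start-up time and memory.
    var onlyAutogen ShardFilter = func(db, rp string, _ uint64) bool {
        return db == "mydb" && rp == "autogen"
    }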
This moves the time range to delete into the return value of the
predicate func in DeleteSeriesRangeWithPredicate. It allows a single
delete to remove different time ranges per series instead of a single
range of time for all series.
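A hedged sketch of the per-series predicate this implies; the tag type is a stand-in and the signature is
approximated from the description:

    type tags map[string]string // stand-in for the engine's tag-set type

    // The predicate runs once per series and returns the [min, max] time
    // range to delete for that series, so a single call can delete
    // different windows for different series.
    func predicate(name []byte, t tags) (min, max int64, ok bool) {
        if string(name) == "cpu" && t["host"] == "retired-host" {
            return 0, 1500000000000000000, true // prune this host's old data
        }
        return 0, 0, false // leave all other series untouched
    }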
This commit adds initial empty sketches back to the tsi1 index, as well
as ensuring that ephemeral sketches in the index `LogFile` are updated
accordingly.
The commit also adds a test that verifies that the merged sketches at
the store level produce the correct results under writes, deletions and
re-opening of the store.
This commit does not provide working sketches for post-compaction on the
tsi1 index.
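For context, the sketches are mergeable cardinality estimators (conceptually HyperLogLog-style); a
hypothetical illustration of the store-level merge the test exercises:

    // sketch is a stand-in for the index's cardinality estimator.
    type sketch interface {
        Add(key []byte)
        Merge(other sketch)
        Count() uint64
    }

    // Merging each index's sketch into an empty one yields the store-wide
    // estimate without materializing every series key.
    func storeCardinality(perIndex []sketch, merged sketch) uint64 {
        for _, s := range perIndex {
            merged.Merge(s)
        }
        return merged.Count()
    }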
This commit adds the ability to correctly mark a series as deleted in
the global series file. Whenever a shard engine determines that a series
should be deleted, it checks each shard's bitset to find series that
are marked for deletion and are no longer contained in any shard-local
bitset.
These series are then removed from the series file.
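A minimal sketch of the check described above; the bitset type is a stand-in:

    type bitset struct{ live map[uint64]bool } // stand-in for the shard-local series bitset

    func (b *bitset) Contains(id uint64) bool { return b.live[id] }

    // A series may be removed from the global series file only once no
    // shard-local bitset still references it.
    func removable(seriesID uint64, shards []*bitset) bool {
        for _, bs := range shards {
            if bs.Contains(seriesID) {
                return false // still live in some shard
            }
        }
        return true
    }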
This limits the disk IO for writing TSM files during compactions
and snapshots, which helps reduce spiky IO patterns on SSDs,
especially when compactions run very quickly.
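A hedged sketch of the throttling idea as a rate-limited writer; golang.org/x/time/rate is used here only for
illustration and the actual implementation may differ:

    import (
        "context"
        "io"

        "golang.org/x/time/rate"
    )

    // limitedWriter throttles TSM writes to a bytes-per-second budget,
    // smoothing the IO bursts from snapshots and fast compactions.
    type limitedWriter struct {
        w       io.Writer
        limiter *rate.Limiter // burst must be at least the largest chunk written
    }

    func (lw *limitedWriter) Write(p []byte) (int, error) {
        // Block until the byte budget allows this chunk.
        if err := lw.limiter.WaitN(context.Background(), len(p)); err != nil {
            return 0, err
        }
        return lw.w.Write(p)
    }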
Since possibly v0.9, DELETE SERIES has had the unwanted side effect of
removing series from the index when the last traces of series data were
removed from TSM. This occurred because the inmem index was rebuilt on
startup, and if there was no TSM data for a series then there was no
series to add to the index.
This commit returns to the original (documented) DROP/DELETE SERIES
behaviour. As such, when issuing DROP SERIES, all instances of matching
series will be removed from both the TSM engine and the index. When
issuing DELETE SERIES, only TSM data will be removed.
It is up to the operator to remove series from the index.
NB, this commit does not address how to remove series data from the
series file when a shard rolls over.
This adds the capability for the engine to force a full compaction
to be scheduled. When called, it snapshots any data in the cache,
aborts running compactions, and prevents the level planners from
returning level plans.
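A hedged sketch of the sequence this implies; the types and method names below are approximations, not the
real engine API:

    type planner struct{ forceFull bool }

    func (p *planner) SetForceFull(v bool) { p.forceFull = v } // level planners now return nothing

    type compactEngine struct{ plan *planner }

    func (e *compactEngine) writeSnapshot() error     { return nil } // flush cached data to TSM
    func (e *compactEngine) abortRunningCompactions() {}             // cancel in-flight work

    // forceFullCompaction: snapshot the cache, stop in-flight compactions,
    // and suppress level plans so a full compaction can be scheduled.
    func (e *compactEngine) forceFullCompaction() error {
        if err := e.writeSnapshot(); err != nil {
            return err
        }
        e.abortRunningCompactions()
        e.plan.SetForceFull(true)
        return nil
    }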