This commit ensures that any orphaned series (series that are to be
removed and are no longer referenced anywhere in the database) are
removed from the `inmem` index when a shard is dropped.
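A minimal sketch of that clean-up (all names hypothetical, not the actual `inmem` API): when a shard is dropped, any series referenced only by that shard is also removed from the shared in-memory index.

```go
package main

import "fmt"

// Index is a stand-in for the shared in-memory index, tracking how many
// shards still reference each series key.
type Index struct {
	series map[string]int // series key -> number of referencing shards
}

// DropShard decrements the reference count for every series in the
// dropped shard and removes any series that becomes orphaned.
func (idx *Index) DropShard(shardSeries []string) {
	for _, key := range shardSeries {
		idx.series[key]--
		if idx.series[key] <= 0 {
			// Orphaned: no shard references this series any more.
			delete(idx.series, key)
		}
	}
}

func main() {
	idx := &Index{series: map[string]int{"cpu,host=a": 2, "mem,host=a": 1}}
	idx.DropShard([]string{"cpu,host=a", "mem,host=a"})
	fmt.Println(idx.series) // map[cpu,host=a:1] -- mem,host=a was orphaned
}
```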
PR #9204 introduced a maximum default concurrent compaction limit of 4.
The idea was to reduce IO utilisation on large systems with many cores
and high write load. Often on these systems, disks were not scaled
appropriately to the write volume, and while the write path could
keep up, compactions would saturate disks.
In #9225 work was done to reduce IO saturation by limiting the
compaction throughput. To some extent, both #9204 and #9225 work towards
solving the same problem.
We have recently begun to notice that larger clusters suffer from
compactions not keeping up: the hardware has been scaled up, but the
limit of 4 has stayed in place. While users can manually override the
setting, it seems more user-friendly to remove the limit by default,
and to set it manually in cases where compactions cause too much IO on
large boxes.
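A sketch of the default-resolution logic this implies (field names and the zero-value semantics are assumptions, not the exact InfluxDB code): a configured value greater than zero is honoured, while the new default of 0 means no fixed cap, scaling with the machine instead of the old hard limit of 4.

```go
package main

import (
	"fmt"
	"runtime"
)

// maxConcurrentCompactions resolves the effective compaction limit.
// A positive configured value is a manual override; zero (the new
// default in this sketch) removes the fixed cap and scales with cores.
func maxConcurrentCompactions(configured int) int {
	if configured > 0 {
		return configured // manual override, e.g. for IO-bound boxes
	}
	return runtime.GOMAXPROCS(0) // no fixed limit: scale with the host
}

func main() {
	fmt.Println(maxConcurrentCompactions(0)) // scales with the machine
	fmt.Println(maxConcurrentCompactions(4)) // the old fixed default
}
```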
This commit restricts the number of TSM1 files that can be opened
concurrently across the entire `tsdb.Store`. There is currently
a limit for the number of shards that can be opened concurrently,
however, this limit does not help when the number of CPU cores
is higher than the number of shards. Because TSM1 files have a 2GB
limit and there is no limit on the number of files per shard,
extremely large shards (1TB+) can open thousands of files simultaneously.
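A minimal sketch of such a store-wide limit (names hypothetical): a buffered channel acts as a counting semaphore shared by every shard, so no more than a fixed number of TSM files are opened concurrently even when one huge shard holds thousands of files.

```go
package main

import (
	"fmt"
	"sync"
)

// fileLimiter is a counting semaphore: sends acquire a slot, receives
// release one. Its capacity is the store-wide concurrent-open limit.
type fileLimiter chan struct{}

func (l fileLimiter) acquire() { l <- struct{}{} }
func (l fileLimiter) release() { <-l }

func main() {
	limiter := make(fileLimiter, 8) // shared by all shards in the store

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ { // e.g. 100 TSM files across many shards
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			limiter.acquire()
			defer limiter.release()
			// The real code would open TSM file `id` here; at most
			// 8 opens ever run at once, regardless of shard count.
			_ = id
		}(i)
	}
	wg.Wait()
	fmt.Println("all files opened")
}
```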
* filters allow specific combinations of database, retention policy and
  shard group to be opened. This was added to reduce the start-up time
  of the export tool and to limit memory usage; a sketch follows below.
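A sketch of the kind of filter described above (all types and names hypothetical): the caller supplies a predicate, and only shards whose database, retention policy and shard group match are opened.

```go
package main

import "fmt"

// ShardInfo identifies a shard by its owning database, retention
// policy and shard group.
type ShardInfo struct {
	Database        string
	RetentionPolicy string
	ShardGroup      uint64
}

// ShardFilter reports whether a shard should be opened.
type ShardFilter func(ShardInfo) bool

// openShards opens only the shards accepted by the filter; a nil
// filter opens everything (the previous behaviour).
func openShards(shards []ShardInfo, filter ShardFilter) []ShardInfo {
	var opened []ShardInfo
	for _, s := range shards {
		if filter == nil || filter(s) {
			opened = append(opened, s) // real code would open files here
		}
	}
	return opened
}

func main() {
	shards := []ShardInfo{
		{"db0", "autogen", 1},
		{"db1", "autogen", 2},
	}
	// Export only db0/autogen; every other shard stays closed.
	only := openShards(shards, func(s ShardInfo) bool {
		return s.Database == "db0" && s.RetentionPolicy == "autogen"
	})
	fmt.Println(only)
}
```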
This commit adds initial empty sketches back to the tsi1 index, as well
as ensuring that ephemeral sketches in the index `LogFile` are updated
accordingly.
The commit also adds a test that verifies that the merged sketches at
the store level produce the correct results under writes, deletions and
re-opening of the store.
This commit does not provide working sketches for post-compaction on the
tsi1 index.
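A simplified sketch of how a store-level estimate can be derived by merging per-index sketches (the real tsi1 index uses HyperLogLog-style estimators; exact-count sets stand in for them here). A tombstone sketch of deleted series is merged alongside the additive sketch, so deletions are reflected in the final estimate.

```go
package main

import "fmt"

// sketch is an exact-count stand-in for a cardinality estimator.
type sketch map[string]struct{}

// merge unions another sketch into this one, mirroring how per-shard
// estimator sketches are combined at the store level.
func (s sketch) merge(o sketch) {
	for k := range o {
		s[k] = struct{}{}
	}
}

func main() {
	// Per-shard additive and tombstone (deletion) sketches.
	added := []sketch{{"cpu": {}, "mem": {}}, {"cpu": {}, "disk": {}}}
	deleted := []sketch{{}, {"disk": {}}}

	mergedAdd, mergedDel := sketch{}, sketch{}
	for i := range added {
		mergedAdd.merge(added[i])
		mergedDel.merge(deleted[i])
	}
	// Estimated cardinality = |added| - |deleted|.
	fmt.Println(len(mergedAdd) - len(mergedDel)) // 2 (cpu and mem)
}
```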
`Store.DeleteSeries` held an RLock while deleting from each shard.
While deleting, the Engine uses the shardSet to see if a series is fully
deleted. `shardSet.ForEach` also takes an RLock. If a Lock is
requested between these two calls, a deadlock occurs.
To fix this, the store does not need to hold an RLock for the duration
of the delete, as each Shard handles its own concurrency and we have a
snapshot of the shards we need to access.
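A sketch of that fix pattern: rather than holding the store's RLock across the whole delete (which deadlocks once a writer's Lock arrives between the outer RLock and the nested one, since Go's `sync.RWMutex` does not permit a recursive RLock while a Lock is pending), take a snapshot of the shards under the lock, release it, then do the slow work. Types here are stand-ins, not the real `tsdb.Store`.

```go
package main

import (
	"fmt"
	"sync"
)

// Store holds shards behind an RWMutex, as in the scenario above.
type Store struct {
	mu     sync.RWMutex
	shards map[uint64]string // shard ID -> name (stand-in for *Shard)
}

// DeleteSeries snapshots the shard set under a short-lived RLock, then
// releases the lock before touching any shard.
func (s *Store) DeleteSeries() {
	s.mu.RLock()
	snapshot := make([]string, 0, len(s.shards))
	for _, sh := range s.shards {
		snapshot = append(snapshot, sh)
	}
	s.mu.RUnlock() // lock released before any per-shard work begins

	for _, sh := range snapshot {
		fmt.Println("deleting from", sh) // each Shard synchronises itself
	}
}

func main() {
	s := &Store{shards: map[uint64]string{1: "shard-1", 2: "shard-2"}}
	s.DeleteSeries()
}
```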
Series should only be removed from the series file when they're no
longer present in any shard. This commit ensures that during a shard
rollover, the series local to the shard are checked against all other
series in the database.
Series that are no longer present in any other shard's bitset are then
marked as deleted in the series file.
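A sketch of that rollover check (names hypothetical): a series local to the dropped shard is marked deleted in the series file only if no other shard's bitset still contains it.

```go
package main

import "fmt"

// bitset is a stand-in for a shard-local series bitset.
type bitset map[uint64]bool // series ID -> present in this shard

// stillReferenced reports whether any remaining shard still holds the
// series, in which case it must stay in the series file.
func stillReferenced(id uint64, otherShards []bitset) bool {
	for _, bs := range otherShards {
		if bs[id] {
			return true
		}
	}
	return false
}

func main() {
	dropped := []uint64{1, 2, 3}      // series local to the dropped shard
	others := []bitset{{1: true}, {}} // bitsets of the remaining shards
	for _, id := range dropped {
		if !stillReferenced(id, others) {
			fmt.Println("mark series", id, "deleted in series file")
		}
	}
}
```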
Because the reference counting on the series file was implemented via
mutexes, it was possible to double `RLock` the series file mutex. This
allowed a `Lock` to arrive in between the two `RLock`s (such as when
deleting the database), causing a deadlock.
This commit addresses this by ensuring that any `IndexSet` methods
called from within other `IndexSet` methods are unexported, and that
those unexported methods never take a lock on the series file.
Keeping series file locking to the exported `IndexSet` methods only
makes any future races easier to spot.
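A sketch of that locking convention (the method names are illustrative, not the real `IndexSet` API): only the exported method takes the lock, and every internal call path uses the unexported, lock-free variant, so the RLock is never acquired twice on the same call chain.

```go
package main

import "sync"

// IndexSet guards shared state with an RWMutex that stands in for the
// series file lock.
type IndexSet struct {
	mu sync.RWMutex
}

// MeasurementSeriesIDs is exported: it takes the lock exactly once and
// delegates all real work to the unexported variant.
func (is *IndexSet) MeasurementSeriesIDs(name string) []uint64 {
	is.mu.RLock()
	defer is.mu.RUnlock()
	return is.measurementSeriesIDs(name)
}

// measurementSeriesIDs is unexported and assumes the caller holds the
// lock; it may freely call other unexported methods without deadlock.
func (is *IndexSet) measurementSeriesIDs(name string) []uint64 {
	_ = name // real lookup logic would go here
	return nil
}

func main() {
	var is IndexSet
	_ = is.MeasurementSeriesIDs("cpu")
}
```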
This commit adds the ability to correctly mark a series as deleted in
the global series file. Whenever a shard engine determines that a series
should be deleted, it checks each shard's bitset for series that are to
be deleted and that are no longer contained in any shard-local bitset.
These series are then removed from the series file.
When dropping series, if the series file did not exist we returned an
error. This breaks compatibility with prior versions, which would not
return an error if the series do not exist.
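A sketch of the compatibility fix (names hypothetical): a missing series file is treated as "nothing to delete" rather than an error, matching the prior behaviour.

```go
package main

import "fmt"

// Shard holds an optional series file; nil models the file being absent.
type Shard struct {
	sfile *struct{} // stand-in for the series file handle
}

// DropSeries succeeds when the series file is absent: without a series
// file the series cannot exist, so there is nothing to delete.
func (s *Shard) DropSeries(keys []string) error {
	if s.sfile == nil {
		return nil // restore old behaviour: no error on a missing file
	}
	// ... delete keys from the series file ...
	return nil
}

func main() {
	var s Shard // no series file attached
	fmt.Println(s.DropSeries([]string{"cpu,host=a"})) // <nil>, not an error
}
```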