* Live Restore + Enterprise data format compatability
* Extended ImportData to import all DB's if no db name given
* Added a new enterprise data test, and backup command now prints the backup file paths at conclusion
* Added whole-system backup test
* Update to use protobuf in all enterprise data cases
* Update to test to do cross-testing with enterprise version
* incremental enterprise backup format support
The cache defaulted to entry capacity size of 32. This default
is fine for lower cardinalities, but causes big spikes in InUse
heap with higher cardinalities that can OOM the process. Since
the hints had to be removed previously due to increased memory usage,
they are now completely removed. For lower cardinalities, we do
grow the slice, but this has a small performance penalty compared
to the large memory usage/OOMs with larger cardinalities.
* only call ParseTags when necessary
* remove dependency on inmem.Series in tsdb test package
* Measurement and Series are no longer exported. Their use is restricted
to the inmem package
* improve Measurement and Series types by exporting immutable
fields and removing unnecessary APIs and locks
Reduced startup time from 28s to 17s. Overall improvement including
#9162 reduces startup from 46s to 17s for 1MM series across 14 shards.
This commit ensures that the series file should work appropriately on
32-bit architecturs. It does this by reducing the maximum size of a
series file to 512MB on 32-bit systems, which should be fully
addressable.
It further updates tests so that the series file size can be reduced
further when running many tests in parallel on 32-bit architectures.
This limits the disk IO for writing TSM files during compactions
and snapshots. This helps reduce the spiky IO patterns on SSDs and
when compactions run very quickly.
Since possibly v0.9 DELETE SERIES has had the unwanted side effect of
removing series from the index when the last traces of series data are
removed from TSM. This occurred because the inmem index was rebuilt on
startup, and if there was no TSM data for a series then there could be
not series to add to the index.
This commit returns to the original (documented) DROP/DETETE SERIES
behaviour. As such, when issuing DROP SERIES all instances of matching
series will be removed from both the TSM engine and the index. When
issuing DELETE SERIES only TSM data will be removed.
It is up to the operator to remove series from the index.
NB, this commit does not address how to remove series data from the
series file when a shard rolls over.
This changes the approach to adjusting the amount of concurrency
used for snapshotting to be based on the snapshot latency vs
cardinality. The cardinality approach could use too much concurrency
and increase the number of level 1 TSM files too quickly which incurs
more disk IO.
The latency model seems to adjust better to different workloads.
The disk based temp index for writing a TSM file was used for
compactions other than snapshot compactions. That meant it was
used even for smaller compactiont that would not use much memory.
An unintended side-effect of this is higher disk IO when copying
the index to the final file.
This switches when to use the index based on the estimated size of
the new index that will be written. This isn't exact, but seems to
work kick in at higher cardinality and larger compactions when it
is necessary to avoid OOMs.
O_SYNC was added with writing TSM files to fix an issue where the
final fsync at the end cause the process to stall. This ends up
increase disk util to much so this change switches to use multiple
fsyncs while writing the TSM file instead of O_SYNC or one large
one at the end.
If a delete for a time that does not exist was run, we would not
remove the series key from the slice of series to remove from the
index.
This could be triggered by running somethin like "delete from cpu where
time = 0" and if there was no data at time 0, the series would still
be removed from the index.
If there were many individual deletes to a series that ended up
deleting every value in the block and the tombstone timestamps
were not contigous, it was possible for the TSMKeyIterator to
return false for Next incorrectly. This causes the compaction to
drop any remaining data in the file.
Normally, if all the data is deleted via tombstones, we remove the
whole key from the TSM index. In this case, we're not able to determine
that the key is fully deleted until the block is decode and tombstones
are applied.
This changes the TSMKeyIterator to detect this condition and continue
to the next key instead of aborting.
The loop to check if a series still exists in a TSM file was wrong
in that it 1) exited early after one iteration and 2) had an off
by one error that causes the wrong series to be marked as existing.
This fixes both of these cases which can cause the index to become
inconsistent with the data store on disk.
This fixes a regression in the Cache introduced in ca40c1ad3c where
not all the values in the cache entry would be removed. Previously,
calling Exclude did not require the values to be sorted. The change
in ca40c1ad3c relies on the values being sorted so it was possible for
it to find the wrong indexes in when calling FindRange and leave some
data that should be deleted.
Fixes#9161