This commit ensures that the series file should work appropriately on
32-bit architecturs. It does this by reducing the maximum size of a
series file to 512MB on 32-bit systems, which should be fully
addressable.
It further updates tests so that the series file size can be reduced
further when running many tests in parallel on 32-bit architectures.
This limits the disk IO for writing TSM files during compactions
and snapshots. This helps reduce the spiky IO patterns on SSDs and
when compactions run very quickly.
Since possibly v0.9 DELETE SERIES has had the unwanted side effect of
removing series from the index when the last traces of series data are
removed from TSM. This occurred because the inmem index was rebuilt on
startup, and if there was no TSM data for a series then there could be
not series to add to the index.
This commit returns to the original (documented) DROP/DETETE SERIES
behaviour. As such, when issuing DROP SERIES all instances of matching
series will be removed from both the TSM engine and the index. When
issuing DELETE SERIES only TSM data will be removed.
It is up to the operator to remove series from the index.
NB, this commit does not address how to remove series data from the
series file when a shard rolls over.
This changes the approach to adjusting the amount of concurrency
used for snapshotting to be based on the snapshot latency vs
cardinality. The cardinality approach could use too much concurrency
and increase the number of level 1 TSM files too quickly which incurs
more disk IO.
The latency model seems to adjust better to different workloads.
The disk based temp index for writing a TSM file was used for
compactions other than snapshot compactions. That meant it was
used even for smaller compactiont that would not use much memory.
An unintended side-effect of this is higher disk IO when copying
the index to the final file.
This switches when to use the index based on the estimated size of
the new index that will be written. This isn't exact, but seems to
work kick in at higher cardinality and larger compactions when it
is necessary to avoid OOMs.
The default max-concurrent-compactions settings allows up to 50%
of cores to be used for compactions. When the number of cores is
high (>8), this can lead to high disk utilization. Capping at
4 and combined with high snapshot sizes seems to keep the compaction
backlog reasonable and not tax the disks as much. Systems with lots
of IOPS, RAM and CPU cores may want to increase these.
With the recent changes to compactions and snapshotting, the current
default can create lots of small level 1 TSM files. This increases
the default in order to create larger level 1 files and less disk
utilization.
O_SYNC was added with writing TSM files to fix an issue where the
final fsync at the end cause the process to stall. This ends up
increase disk util to much so this change switches to use multiple
fsyncs while writing the TSM file instead of O_SYNC or one large
one at the end.