This changes full compactions within a shard to run sequentially
instead of running all the compaction groups in parallel. Normally,
there is only 1 full compaction group to run. At times, there could
be several which causes instability if they are all running concurrently
as they tie up a cpu for long periods of time.
Level compactions are also capped to a max of 4 concurrently running for each level
in a shard. This prevents sudden spikes in CPU and disk usage due to a large backlog
of tsm files at a given level.
Measurement name and field were converted between []byte and string
repetively causing lots of garbage. This switches the code to use
[]byte in the write path.
This pool was previously a pool.Bytes to avoid repetitive allocations.
It was recently switchted to a sync.Pool because pool.Bytes held onto
very larger buffers at times which were never released. sync.Pool is
showing up in allocation profiles quite frequently.
This switches the pool to a new pool that limits how many buffers are
in the pool as well as the max size of each buffer in the pool. This
provides better bounds on allocations.
This speeds up time encoding and decoding by skipping the divisor
scaling if scaling by 1. Since division and multiplication are expensive
cpu and scaling by 1 has no effect, this just slows encoding and decoding
down.
Tombstone files would be written to all TSM files even if the deleted
keys or timerange did not exist in the TSM file. This had the side
effect of causing shards to get recompacted back to the same state. If
any shards or large numbers of TSM files existed, disk usage and CPU
utilization would spike causing issues.
This prevents tombstones being written for TSM files that could not
possiby contain the series keys being deleted or if the delted time
range is outside the range of the file.
When monitoring shards, a slice of measurements is allocated for
each shard. With many shards and measurements, these allocations
can be large. Since inmem shards share the same index, we only
need to do this once since the resulting slices are all the same.
This reduces memory usage when monitoring shard cardinality.
Since this is called more frequently now, the cleanup func was invoked
quite a bit which makes several syscalls per shard. This should only
be called the first time compactions are disabled.
Index.ForEachMeasurementTagKey held an RLock while call the fn,
if the fn made another call into the index which acquired an RLock
and after another goroutine tried to acquire a Lock, it would deadlock.
This was causing a shard to appear idle when in fact a snapshot compaction
was running. If the time was write, the compactions would be disabled and
the snapshot compaction would be aborted.
The monitor goroutine ran for each shard and updated disk stats
as well as logged cardinality warnings. This goroutine has been
removed by making the disks stats more lightweight and callable
direclty from Statisics and move the logging to the tsdb.Store. The
latter allows one goroutine to handle all shards.
Each shard has a number of goroutines for compacting different levels
of TSM files. When a shard goes cold and is fully compacted, these
goroutines are still running.
This change will stop background shard goroutines when the shard goes
cold and start them back up if new writes arrive.
The compactor prevents the same file from being compacted by different
compaction runs, but it can result in warning errors in the logs that
are confusing.
This adds compaction plan tracking to the planner so that files are
only part of one plan at a given time.
This limit allows the number of concurrent level and full compactions
to be throttled. Snapshot compactions are not affected by this limit
as then need to run continously.
This limit can be used to control how much CPU is consumed by compactions.
The default is to limit to the number of CPU available.
Compactions are enabled as soon as the shard is opened. This can
slow down startup or cause the system to spike in CPU usage at startup
if many shards need to be compacted.
This now delays compactions until after they are loaded.
The lazy sorting of series caused a deadlock since it can not take
a Lock when a caller may have already acquired an RLock.
filters should be called w/o any locks as the function already acquires
locks as needed.