There was a race where the same series would get added to the in-memory
index for a measurement more than once. This would result in the same
series being returned more than once during queries causing duplicate
results. The issue was that we check for the series under the read
lock, but did not check again under the write lock where there was
a small window where the series could be added by another goroutine.
We now check for the series under the write lock.
Fixes#6946
A slower disk can can cause excessive allocations to occur when
writing to the WAL because the slower encoding and compression occurs
before taking the write lock. The encoding/compression grabs a large
byte slice from a pool and ultimately waits until it can acquire the
write lock.
This adds a throttle to limit how many inflight WAL writes can be queued
up to prevent OOMing the processess with slower disks and heavy writes.
If a delete is issued while a compaction is running, the a newly
deleted series could re-appear after the compaction completed. This
could occur the compaction had already written the blocks for series
that were just deleted. When the compaction completes, the newly
written tombstone files would be deleted, essentially undeleting the
series.
Reduce the lock contention on tsdb.Store by taking a short lived
read-lock instead of a long write lock. Also close shards in parallel
and drop the whole RP dir in bulk instead of each shard dir.
Reduces the lock contention on the tsdb.Store by taking a short
read lock instead of a long write lock. Also processes shards
in parallel instead of serially.
Due to a bug in compactions, it's possible some blocks may have duplicate
points stored. If those blocks are decoded and re-compacted, an assertion
panic could trigger.
We now dedup those blocks if necessary to remove the duplicate points
and avoid the panic.
For larger datasets, it's possible for shards to get into a state where
many large, dense TSM files exist. While the shard is still hot for
writes, full compactions will skip these files since they are already
fairly optimized and full compactions are expensive. If the write volume
is large enough, the shard can accumulate lots of these files. When
a file is in this state, it's index can contain every series which
causes startup times to increase since each file must parse the full
set of series keys for every file. If the number of series is high,
the index can be quite large causing large amount of disk IO at startup.
To fix this, a optmize compaction is run when a full compaction planning
step decides there is nothing to do. The optimize compaction combines
and spreads the data and series keys across all files resulting in each
file containing the full series data for that shard and a subset of the
total set of keys in the shard.
This allows a shard to only store a series key once in the shard reducing
storage size as well allows a shard to only load each key once at startup.
Large files created early in the leveled compactions could cause
a shard to get into a bad state. This reworks the level planner
to handle those cases as well as splits large compactions up into
multiple groups to leverage more CPUs when possible.
* Removes sysvinit-tools as an RPM package dependency.
* Update init script to not rely on sysvinit utils for backwards
compatibility.
* Minor overall improvements to init script (improved error messages,
comments, check for root privileges).
* Adds SLES support to post-installation script.