Commit Graph

40 Commits (3d544a9136386beeef35e09990856a7537653421)

Author SHA1 Message Date
Mark Rushakoff cdcb079769 Tag TSM stats with database, retention policy
... by extracting the db/rp from the given path.

Now that the code has "standardized" on extracting db/rp this way, the
ShardLocation struct is no longer necessary and thus has been removed.
We're back on the previous style of passing the path and walPath to
NewShard.
2016-02-29 09:17:34 -08:00
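A minimal sketch of what extracting the db/rp from a path might look like. It assumes a shard path laid out as .../<database>/<retention policy>/<shard ID>; the helper name dbRPFromPath is hypothetical and is not taken from the commit.

```go
package main

import "path/filepath"

// dbRPFromPath is a hypothetical helper. It assumes the shard path ends in
// <database>/<retention policy>/<shard ID> and returns the first two parts.
func dbRPFromPath(shardPath string) (db, rp string) {
	// Clean drops any trailing separator; Dir then strips the shard ID.
	rpDir := filepath.Dir(filepath.Clean(shardPath))
	rp = filepath.Base(rpDir)
	db = filepath.Base(filepath.Dir(rpDir))
	return db, rp
}
```

Deriving both values from the path means callers only need to pass the path and walPath, which matches the motivation given for removing ShardLocation.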
Jon Seymour 2c7cd06b99 tsm: cache: need to check that snapshot has been sorted.
Previously, the for loop at the end of the method assumed that all entries
had been deduplicated, including the entry discovered in the snapshot.

However, this wasn't actually true. With this change, we make it true.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-26 07:56:25 +11:00
Jon Seymour 4d98a1cf28 tsm: cache: remove unnecessary lock escalation.
Previously, we needed a write lock on the cache because it was the
only lock we had available to guard updates to entry.values and
entry.needSort.

However, now that we have an entry-scoped lock for this purpose, we no
longer need the cache write lock. Since merged() doesn't modify the
.store or the c.snapshot.sort, there is no need for a write lock on the
cache.

So, we don't need to escalate here - we simply rely on the entry lock
to protect the entries we are iterating over.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-26 01:31:54 +11:00
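The locking pattern described in the commit above can be sketched as follows. This is an illustration under assumptions, not the tsm1 code: the types entryLockCache and lockedEntry and their methods are hypothetical, and values are plain int64 timestamps.

```go
package main

import (
	"sort"
	"sync"
)

// lockedEntry is a hypothetical entry whose own mutex guards values and needSort.
type lockedEntry struct {
	mu       sync.Mutex
	values   []int64
	needSort bool
}

// entryLockCache is a hypothetical cache; mu guards only the store map.
type entryLockCache struct {
	mu    sync.RWMutex
	store map[string]*lockedEntry
}

func newEntryLockCache() *entryLockCache {
	return &entryLockCache{store: make(map[string]*lockedEntry)}
}

// add takes the cache write lock only to find or create the entry, then
// appends under the entry lock.
func (c *entryLockCache) add(key string, v int64) {
	c.mu.Lock()
	e, ok := c.store[key]
	if !ok {
		e = &lockedEntry{}
		c.store[key] = e
	}
	c.mu.Unlock()

	e.mu.Lock()
	if n := len(e.values); n > 0 && v < e.values[n-1] {
		e.needSort = true
	}
	e.values = append(e.values, v)
	e.mu.Unlock()
}

// sortedValues needs only a cache read lock: the map is not modified, and
// the entry lock protects the sort and the copy.
func (c *entryLockCache) sortedValues(key string) []int64 {
	c.mu.RLock()
	e := c.store[key]
	c.mu.RUnlock()
	if e == nil {
		return nil
	}

	e.mu.Lock()
	defer e.mu.Unlock()
	if e.needSort {
		sort.Slice(e.values, func(i, j int) bool { return e.values[i] < e.values[j] })
		e.needSort = false
	}
	return append([]int64(nil), e.values...)
}
```

Because the reader never mutates the map, holding the read lock briefly and then relying on the entry lock avoids escalating to the cache write lock.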
Jason Wilder 452d77cbaf tsm: cache: introduce entry locks.
Based on @jwilder's alternative to the 'dirty' slice that featured
in previous iterations of this fix.

Suggested-by: Jason Wilder <jason@influxdb.com>
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-26 00:05:38 +11:00
Jon Seymour eb7eec078d tsm: cache: introduce commit lock to Cache
Currently two compactors can execute Engine.WriteSnapshot at once.

This isn't thread safe since both threads want to make modifications to
Cache.snapshot at the same time.

This commit introduces a lock which is acquired during Snapshot() and
released during ClearSnapshot(), ensuring that at most one thread
executes within Engine.WriteSnapshot() at once.

To ensure that we always release this lock, but only release the
snapshot resources on a successful commit, we modify ClearSnapshot() to
accept a boolean which indicates whether the write was successful or not
and guarantee to call this function if Snapshot() has been called.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-25 12:10:37 +11:00
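A sketch of the Snapshot()/ClearSnapshot(success) protocol that the commit above describes. The type commitLockCache, its fields, and the float64 value type are assumptions made for illustration; only the method names and the boolean parameter come from the commit message.

```go
package main

import "sync"

// commitLockCache is a hypothetical stand-in for the cache.
type commitLockCache struct {
	mu       sync.Mutex // guards store and snapshot
	commit   sync.Mutex // held from Snapshot until ClearSnapshot
	store    map[string][]float64
	snapshot map[string][]float64
}

func newCommitLockCache() *commitLockCache {
	return &commitLockCache{store: make(map[string][]float64)}
}

func (c *commitLockCache) Write(key string, v float64) {
	c.mu.Lock()
	c.store[key] = append(c.store[key], v)
	c.mu.Unlock()
}

// Snapshot acquires the commit lock, so a second caller blocks until the
// first has called ClearSnapshot. Hot values are folded into any snapshot
// left over from a failed write, so a retry covers old and new data.
func (c *commitLockCache) Snapshot() map[string][]float64 {
	c.commit.Lock()
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.snapshot == nil {
		c.snapshot = make(map[string][]float64)
	}
	for k, vs := range c.store {
		c.snapshot[k] = append(c.snapshot[k], vs...)
	}
	c.store = make(map[string][]float64)
	return c.snapshot
}

// ClearSnapshot always releases the commit lock, but drops the snapshot
// data only when the write succeeded.
func (c *commitLockCache) ClearSnapshot(success bool) {
	c.mu.Lock()
	if success {
		c.snapshot = nil
	}
	c.mu.Unlock()
	c.commit.Unlock()
}
```

A caller would pair the two calls around the write, for example by recording the write error and passing err == nil to ClearSnapshot, so that the lock is released on every path.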
Jon Seymour 530b86ba7d tsm: cache: restore the semantics of cachedBytes and memSize stats
Fixes #5805.

This commit undoes a regression introduced by #5789.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-24 06:16:46 +11:00
Jon Seymour 3475356dc9 tsm: cache: fix semantics of snapshotCount statistic to make it useful.
Fix for #5804.

The commit for #5789 rendered the semantics of snapshotCount statistic
useless. This commit restores semantics that have diagnostic value to
this statistic.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-24 06:13:54 +11:00
Jason Wilder 017c24c98e Simplify cache snapshotting
The Cache had support for taking multiple snapshots to support writing
multiple snapshots to TSM files concurrently if that happened to be
a bottleneck.  In practice, this is never a bottleneck and we only
run one snapshotting goroutine continuously per shard which has worked
well for all workloads.

The multiple snapshot support introduces some unhandled failure scenarios
where wal segments could be removed without writing them to TSM files.  If
a snapshot compaction fails to write due to transient disk errors, subsequent
snapshots will continue, but the failed one will not be retried.  When the
subsequent ones succeeded, all closed wal segments are removed causing data
loss.

This change simplifies the snapshotting capability to ensure that there is only
ever one snapshot.  If one fails, the next snapshot will update the existing
snapshot and retry all of old and new data.

Fixes #5686
2016-02-23 09:38:51 -07:00
Mark Rushakoff fc5c8597ab Merge pull request #5758 from influxdata/mr-disk-stats
Track cache, WAL, filestore stats within tsm1 engine
2016-02-22 13:01:55 -08:00
Jason Wilder aa2e878019 Fix cache not deduplicating points in some cases
The cache had some incorrect logic for determining when a series needed
to be deduplicated.  The logic was checking for unsorted points and
not considering duplicate points.  This would manifest itself as many
duplicate points being returned from the cache and after a
snapshot compaction run, the points would disappear because snapshot
compaction always deduplicates and sorts the points.

Added a test that reproduces the issue.

Fixes #5719
2016-02-22 13:24:42 -07:00
Jon Seymour c93da21a61 tsm: cache: only use NewCache for the engine cache; snapshots use a simpler constructor
The intent of this change is to avoid writing caches created for
snapshot cache instances into the tsm1_cache measurement. We can do
this by avoiding use of the NewCache constructor. All other methods
are only intended to be called on the engine cache - never
on a snapshot.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-22 15:17:43 +11:00
Jon Seymour 510ee2c790 tsm: cache: during writes, update the memSize statistic outside the lock
Since we are not locking but relying on atomic arithmetic,
use Add rather than Set. Will also result in slightly less garbage
being created.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-22 08:26:35 +11:00
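To illustrate why Add is the safer choice once the update happens outside the lock, here is a small sketch using sync/atomic. The type memStats and the function recordWrites are hypothetical; the commit does not say which counter type the cache uses.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// memStats is a hypothetical holder for the memSize statistic.
type memStats struct {
	memSize int64
}

// MemSize reads the counter atomically.
func (s *memStats) MemSize() int64 {
	return atomic.LoadInt64(&s.memSize)
}

// recordWrites simulates concurrent writers. Each one adds its own delta,
// so no update is lost. A Set of a precomputed total could overwrite a
// concurrent writer's contribution when no lock is held.
func recordWrites(s *memStats, sizes []int64) {
	var wg sync.WaitGroup
	for _, n := range sizes {
		wg.Add(1)
		go func(n int64) {
			defer wg.Done()
			atomic.AddInt64(&s.memSize, n)
		}(n)
	}
	wg.Wait()
}
```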
Jon Seymour 9c6efe99f1 tsm: cache: ensure all statistics are initialised on cache creation.
The intent of this change is to ensure that all statistic fields of the
resulting tsm1_cache measurement are initialized on initialization of
the cache. That way, any consumer of those measurements doesn't
have to deal with the null case.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-21 15:33:50 +11:00
Jon Seymour 6697c721fb tsm: cache: add cache throughput related statistics.
Complementing and extending the changes in #5758.

Add 2 level statistics:

  * snapshotCount
  * cacheAgeMs

Add 2 counter statistics

  * cachedBytes
  * WALCompactionTimeMs

snapshotCount can be used to measure transient write errors that are causing snapshots to accumulate

cacheAgeMs can be used to gauge the level of write activity into the cache

The differences between cachedBytes stats sampled at different times can be used to calculate cache throughput rates

The ratio (cachedBytes-diskBytes)/WALCompactionTimeMs can be used to calculate WAL compaction throughput.

The ratio of difference between first and last WAL compaction time over the interval
length is an estimate of percentage of cache throughput consumed.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-20 22:18:57 +11:00
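Two of the derived figures mentioned in the commit above can be sketched as follows. The struct statSample and both function names are hypothetical; the sketch assumes a consumer has sampled the cachedBytes and WALCompactionTimeMs counters at two points in time.

```go
package main

// statSample is a hypothetical record of two counters read at one instant.
type statSample struct {
	UnixMs              int64 // sample time in milliseconds
	CachedBytes         int64 // cachedBytes counter
	WALCompactionTimeMs int64 // WALCompactionTimeMs counter
}

// cacheThroughput returns bytes per second written into the cache between
// two samples. ok is false when the interval is not positive.
func cacheThroughput(prev, cur statSample) (bytesPerSec float64, ok bool) {
	intervalMs := cur.UnixMs - prev.UnixMs
	if intervalMs <= 0 {
		return 0, false
	}
	delta := float64(cur.CachedBytes - prev.CachedBytes)
	return delta / (float64(intervalMs) / 1000.0), true
}

// compactionBusyFraction returns the share of the interval spent in WAL
// compaction, computed from the difference in WALCompactionTimeMs.
func compactionBusyFraction(prev, cur statSample) (fraction float64, ok bool) {
	intervalMs := cur.UnixMs - prev.UnixMs
	if intervalMs <= 0 {
		return 0, false
	}
	spent := float64(cur.WALCompactionTimeMs - prev.WALCompactionTimeMs)
	return spent / float64(intervalMs), true
}
```

Both functions work on differences because the underlying statistics are counters, which only have meaning relative to an earlier sample.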
Mark Rushakoff e76967efb6 Add stats to tsm1.Cache 2016-02-19 16:37:34 -08:00
Jason Wilder 1d165d38a9 Optimize Cache entry.add
This reduces some of the lock contention when writing to the cache.
When a new entry is created, it avoids an allocation.  It also skips
a check to see whether we need to sort if we already know sorting is needed.
2016-01-27 14:26:42 -07:00
Jason Wilder 372302bcbd Reduce lock contention in Cache.WriteMulti
A write-lock was taken the whole time, but we only need the write
lock at the end.
2016-01-25 16:48:34 -07:00
Jason Wilder 5bee8880db Reduce lock contention in engine.WritePoints
Writing the snapshot would deduplicate the snapshot points
while still holding the engine write-lock.  This can be expensive
under high load and cause writes to back up and OOM the server.

Instead, grab the snapshot under the lock and dedup it after releasing
the lock.

Possible fix for #5442
2016-01-25 15:37:34 -07:00
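The approach in the commit above, taking the snapshot under the lock and deduplicating after releasing it, might look like the sketch below. The types snapEngine and snapPoint are hypothetical, and the last-write-wins rule for equal timestamps is an assumption of this sketch.

```go
package main

import (
	"sort"
	"sync"
)

// snapPoint is a hypothetical timestamped value.
type snapPoint struct {
	ts int64
	v  float64
}

// snapEngine is a hypothetical engine holding pending points under a lock.
type snapEngine struct {
	mu      sync.Mutex
	pending []snapPoint
}

func (e *snapEngine) write(p snapPoint) {
	e.mu.Lock()
	e.pending = append(e.pending, p)
	e.mu.Unlock()
}

// takeSnapshot does only a cheap slice handoff while holding the lock.
func (e *snapEngine) takeSnapshot() []snapPoint {
	e.mu.Lock()
	snap := e.pending
	e.pending = nil
	e.mu.Unlock()
	return snap
}

// dedupePoints sorts by timestamp and keeps the last point written for each
// timestamp. It runs without the engine lock, so writers are not blocked.
func dedupePoints(pts []snapPoint) []snapPoint {
	if len(pts) == 0 {
		return pts
	}
	sort.SliceStable(pts, func(i, j int) bool { return pts[i].ts < pts[j].ts })
	out := pts[:1]
	for _, p := range pts[1:] {
		if p.ts == out[len(out)-1].ts {
			out[len(out)-1] = p // later write replaces the earlier one
		} else {
			out = append(out, p)
		}
	}
	return out
}
```

The expensive sort and dedup operate on a slice that no writer can reach any more, which is what allows them to move outside the critical section.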
Paul Dix 820b0d31d6 Update TSM to delete from the WAL/cache
* Update cache loader to delete entries from cache
* Add cache.Delete()
* Update delete to look at keys in the Cache in addition to the FileStore
* Update cache compaction to never happen if the cache is empty
2015-12-07 14:35:48 -05:00
Philip O'Toole fe7b3ad134 Add CacheLoader
The CacheLoader loads a given cache from a slice of segment files.
2015-12-05 22:13:57 -08:00
Philip O'Toole 7296de1fac Merge pull request #4999 from influxdb/cache_sort
Always copy the Cache values for query and merge with snapshot
2015-12-05 08:15:13 -08:00
Philip O'Toole 1b12ff9c1c Only take write-lock for Values when necessary 2015-12-05 08:06:01 -08:00
Philip O'Toole 789ab10658 Merge hot cache values with snapshots
This change starts by building the sequence of entries, which also
allows the required size of destination buffer to be calculated. Then
the buffer is allocated up-front in 1 call.

Each snapshot and hot value-set is appended to the buffer. If ordering
is violated at any time, set the 'needSort' flag. Sorting, if necessary,
is performed just before returning the data.
2015-12-04 20:58:02 -08:00
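The merge steps described above can be sketched as follows: size the buffer up front, append each value set, track whether ordering was violated, and sort only if needed. The type tsValue and the function mergeValueSets are hypothetical names used for illustration.

```go
package main

import "sort"

// tsValue is a hypothetical timestamped value.
type tsValue struct {
	ts int64
	v  float64
}

// mergeValueSets appends the snapshot and hot value sets into one buffer
// allocated in a single call. It reports whether a sort was required.
func mergeValueSets(sets ...[]tsValue) (merged []tsValue, sorted bool) {
	total := 0
	for _, s := range sets {
		total += len(s)
	}
	buf := make([]tsValue, 0, total)

	needSort := false
	for _, s := range sets {
		for _, val := range s {
			// Any value older than its predecessor breaks the ordering.
			if n := len(buf); n > 0 && val.ts < buf[n-1].ts {
				needSort = true
			}
			buf = append(buf, val)
		}
	}

	if needSort {
		sort.SliceStable(buf, func(i, j int) bool { return buf[i].ts < buf[j].ts })
	}
	return buf, needSort
}
```

When the inputs are already in order across sets, the function returns after a single pass and never pays for the sort.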
Philip O'Toole 859877fd09 Move all sort logic to entry type 2015-12-04 20:21:16 -08:00
Philip O'Toole 6e91679fab Always copy the Cache values for query 2015-12-04 15:37:45 -08:00
Paul Dix 9637446ba9 Merge pull request #4990 from influxdb/pd-loadmetadata-wal
Update TSM engine, WAL and encoding
2015-12-04 18:21:47 -05:00
Paul Dix 33506e4d3e Update TSM cache and engine LoadMetadataIndex 2015-12-04 16:40:01 -05:00
Philip O'Toole f939e49f0f Fix comment and remove snapshot stutter 2015-12-04 07:29:58 -08:00
Paul Dix 4624fb2a78 Update cache to address PR comments 2015-12-03 14:03:11 -05:00
Paul Dix b0fb8a0a27 Update TSM cache, compact, wal, encoding
* Update cache to have a single slice of values for a key (removed checkpoints)
* Changed compact.Plan to only worry about TSM files.
* Updated Plan to not return an error since there was no case in which it would.
* Update WAL to not keep stats since they're no longer needed.
* Update engine to flush the Cache/WAL to a new TSM file when the min threshold is hit.
* Split compact logic between TSM compacts and WAL/Cache writes.
* Remove unnecessary merge iterator, wal segment iterator, and other no longer necessary stuff.
* Remove the ascending bool from the Dedupe method. Values should always be in ascending order. It's up to the cursor to iterate through values based on the direction. Giving the cursor responsibility makes it so we don't need to sort, dedupe or reallocate anything for different query orders.
* Updated engine to use its locks to ensure writes and cache flushes don't cause a race.
* Update all tests with new signatures. Removed a bunch of tests around TSM rewrites and WAL segment iteration that are no longer necessary.
2015-12-03 08:11:50 -05:00
Jason Wilder 6847a6ba0c Fix rebase 2015-12-02 09:47:16 -07:00
Jason Wilder 708266da69 Cache related compaction fixes 2015-12-02 09:45:24 -07:00
Jason Wilder 7e249e0555 Use CacheKeyIterator instead of WALKeyIterator during compactions 2015-12-02 09:45:24 -07:00
Philip O'Toole fc83968e2e Cache values supports sorting order 2015-12-01 13:24:25 -08:00
Philip O'Toole 7da3fc1aeb Merge pull request #4934 from influxdb/dedupe_sort_order
Deduplicate supports requesting sort order
2015-12-01 16:23:25 -05:00
Philip O'Toole bad0f657de Deduplicate supports requesting sort order 2015-11-30 16:21:44 -08:00
Philip O'Toole 1bca38bb84 Cache supports writing multiple keys
This keeps the locking to a minimum if the data is available for
multiple keys at once.
2015-11-26 06:07:16 -08:00
Philip O'Toole 5b573b9248 Move to simpler cache
This cache simply evicts as much as possible whenever a checkpoint is
set.
2015-11-20 21:09:24 -08:00
Philip O'Toole 6aede8f562 Clone should sort values
This code may actually change soon due to internal design changes, but
this will ensure testing output is constant.
2015-11-17 11:59:50 -08:00
Philip O'Toole d8ea132c53 Add WAL cache 2015-11-16 19:52:49 -08:00