682 Commits (a0dfd8ae5efd22b1d6b72da03cbc8aa98119b769)

Author SHA1 Message Date
Edd Robinson 58c03448aa Merge pull request #5514 from influxdata/er-engine-panic
Ensure shards and engine are safely closed
2016-03-09 18:56:36 +00:00
Jason Wilder e3fef5593c Merge pull request #5855 from jonseymour/jss-5854-go-master-breaks-build
fix tests to cope with future changes to testing.quick.Check - see #5854
2016-03-01 19:03:21 -07:00
Mark Rushakoff cdcb079769 Tag TSM stats with database, retention policy
... by extracting the db/rp from the given path.

Now that the code has "standardized" on extracting db/rp this way, the
ShardLocation struct is no longer necessary and thus has been removed.
We're back on the previous style of passing the path and walPath to
NewShard.
2016-02-29 09:17:34 -08:00
Jon Seymour 73b3a2a056 Merge #5855 (issue: #5854).
RHS merges cleanly with 0.10.0

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-29 20:37:32 +11:00
Jon Seymour 716cdd7f41 tsm: modify encoding tests to deal with possible nil slices from testing.quick.Check in go master
The current go compiler at the tip of the go master (1d5001af) has a modified implementation of
testing.quick.Check that now generates nil slices as test data. (See: https://gophers.slack.com/archives/general/p14567053570110). The existing tests expect round tripping to work in this case
but it does not. So, in these cases we change the expectation to reflect actual behaviour.

This needs to be checked for reasonableness.
2016-02-29 20:36:19 +11:00
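The nil-slice issue described in 716cdd7f41 can be illustrated with a minimal sketch. It assumes a hypothetical `roundTrip` function standing in for an encode/decode pair (it is not the tsm1 encoder) and shows why a strict comparison fails when the generated input is a nil slice, while a length-aware comparison passes.

```go
package main

import (
	"fmt"
	"reflect"
)

// roundTrip is a hypothetical stand-in for an encode/decode pair. It copies
// the input into a freshly allocated slice, so a nil input comes back as an
// empty, non-nil slice.
func roundTrip(in []int64) []int64 {
	out := make([]int64, 0, len(in))
	return append(out, in...)
}

// strictEqual distinguishes a nil slice from an empty one.
func strictEqual(a, b []int64) bool { return reflect.DeepEqual(a, b) }

// lenientEqual treats nil and empty slices as equal and otherwise compares
// element by element.
func lenientEqual(a, b []int64) bool {
	if len(a) == 0 && len(b) == 0 {
		return true
	}
	return reflect.DeepEqual(a, b)
}

func main() {
	var in []int64 // nil, as a generator may produce
	out := roundTrip(in)
	fmt.Println("strict:", strictEqual(in, out))
	fmt.Println("lenient:", lenientEqual(in, out))
}
```

Whether a test should relax its expectation in this way, or the decoder should preserve nil, is the judgment call the commit message flags.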
Jason Wilder 8d70d65a82 Convert time.Time to int64 2016-02-25 15:15:01 -07:00
Jon Seymour 11123d2694 Merge #5833 (issue: #5832).
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-26 07:59:03 +11:00
Jon Seymour 2c7cd06b99 tsm: cache: need to check that snapshot has been sorted.
Previously, the for loop at the end of the method assumed that all entries
had been deduplicated, including the entry discovered in the snapshot.

However, this wasn't actually true. With this change, we make it true.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-26 07:56:25 +11:00
Jon Seymour 7eabae68de tsm: cache: add a test for the write sequence {6,1,snapshot,7,2}
Consider the write sequence: 6,1,snapshot,7,2.

The hot cache gets deduplicated, so is 2,7.

Now consider the test if 1 >= 2, this is false, so needSort is not set to true.

The problem is the implicit assumption that the snapshot is always sorted
by the time that merged() runs, but this may not be true.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-26 07:43:50 +11:00
Jason Wilder 6ebc192298 Merge pull request #5678 from jonseymour/typo
doc: typographical, spelling, grammar, word-choice and phrasing improvements.
2016-02-25 09:33:41 -07:00
Jason Wilder daf68dbbd2 Merge pull request #5701 from jonseymour/js-deduplicate-safety
tsm: cache: improve thread safety of Cache.Deduplicate (see #5699)
2016-02-25 09:18:10 -07:00
Jon Seymour 4d98a1cf28 tsm: cache: remove unnecessary lock escalation.
Previously, we needed a write lock on the cache because it was the
only lock we had available to guard updates to entry.values and
entry.needSort.

However, now that we have an entry-scoped lock for this purpose, we don't
need the cache write lock. Since merged() doesn't
modify the .store or the c.snapshot.sort, there is no need for
a write lock on the cache to protect the cache.

So, we don't need to escalate here - we simply rely on the entry lock
to protect the entries we are iterating over.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-26 01:31:54 +11:00
Jason Wilder 452d77cbaf tsm: cache: introduce entry locks.
Based on @jwilder's alternative to the 'dirty' slice that featured
in previous iterations of this fix.

Suggested-by: Jason Wilder <jason@influxdb.com>
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-26 00:05:38 +11:00
Jon Seymour eb7eec078d tsm: cache: introduce commit lock to Cache
Currently two compactors can execute Engine.WriteSnapshot at once.

This isn't thread safe since both threads want to make modifications to
Cache.snapshot at the same time.

This commit introduces a lock which is acquired during Snapshot() and
released during ClearSnapshot(), ensuring that at most one thread
executes within Engine.WriteSnapshot() at once.

To ensure that we always release this lock, but only release the
snapshot resources on a successful commit, we modify ClearSnapshot() to
accept a boolean which indicates whether the write was successful or not
and guarantee to call this function if Snapshot() has been called.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-25 12:10:37 +11:00
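The commit lock described in eb7eec078d can be sketched as follows. This is a simplified illustration, not the actual tsm1 code: the `cache` type, its fields, and the merge-on-retry behaviour are assumptions made for the example. Only the `Snapshot()` / `ClearSnapshot(success)` pairing comes from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical, simplified stand-in for the TSM cache.
type cache struct {
	mu       sync.Mutex // guards store and snapshot
	commit   sync.Mutex // held from Snapshot() until ClearSnapshot()
	store    map[string][]int64
	snapshot map[string][]int64
}

func newCache() *cache {
	return &cache{store: map[string][]int64{}}
}

// Write appends a value to the hot store.
func (c *cache) Write(key string, v int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.store[key] = append(c.store[key], v)
}

// Snapshot acquires the commit lock and moves the hot store into the
// snapshot. Data left over from a failed attempt is kept and merged.
func (c *cache) Snapshot() map[string][]int64 {
	c.commit.Lock()
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.snapshot == nil {
		c.snapshot = map[string][]int64{}
	}
	for k, vs := range c.store {
		c.snapshot[k] = append(c.snapshot[k], vs...)
	}
	c.store = map[string][]int64{}
	return c.snapshot
}

// ClearSnapshot always releases the commit lock, but discards the snapshot
// data only when the write succeeded.
func (c *cache) ClearSnapshot(success bool) {
	c.mu.Lock()
	if success {
		c.snapshot = nil
	}
	c.mu.Unlock()
	c.commit.Unlock()
}

func main() {
	c := newCache()
	c.Write("cpu", 1)
	c.Snapshot()
	c.ClearSnapshot(false) // write failed: data is retained for a retry
	c.Write("cpu", 2)
	snap := c.Snapshot()
	fmt.Println("values pending after retry:", len(snap["cpu"]))
	c.ClearSnapshot(true)
}
```

Because the commit lock is held between the two calls, a second caller of `Snapshot()` blocks until the first caller has called `ClearSnapshot()`, which is why the caller must guarantee the second call on every path.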
Jon Seymour 45d025db99 tsm: cache: add tests to demonstrate thread safety vulnerabilities
There are two tests that show two different vulnerabilities.

One test shows that Cache.Deduplicate modifies entries in a snapshot's
store without a lock while cache readers are deduplicating those same
entries while correctly locked.

A second test shows that two threads trying to execute the methods
that Engine.WriteSnapshot calls will cause concurrent, unsynchronized
mutating access to the snapshot's store and entries.

The tests fail at this commit and are fixed by subsequent commits.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-25 12:10:31 +11:00
Jon Seymour d7d81f79da tsm: cache: add a test that demonstrates concurrent reads are safe
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-25 12:06:10 +11:00
Mark Rushakoff fb83374389 Track stats for number of series, measurements
Per database: track number of series and measurements
Per measurement: track number of series
2016-02-24 08:10:16 -08:00
Jon Seymour 530b86ba7d tsm: cache: restore the semantics of cachedBytes and memSize stats
Fixes #5805.

This commit undoes a regression introduced by #5789.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-24 06:16:46 +11:00
Jon Seymour 3475356dc9 tsm: cache: fix semantics of snapshotCount statistic to make it useful.
Fix for #5804.

The commit for #5789 rendered the semantics of snapshotCount statistic
useless. This commit restores semantics that have diagnostic value to
this statistic.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-24 06:13:54 +11:00
Jason Wilder 017c24c98e Simplify cache snapshotting
The Cache had support for taking multiple snapshots to support writing
multiple snapshots to TSM files concurrently if that happened to be
a bottleneck.  In practice, this is never a bottleneck and we only
run one snapshotting goroutine continuously per shard which has worked
well for all workloads.

The multiple snapshot support introduces some unhandled failure scenarios
where wal segments could be removed without writing them to TSM files.  If
a snapshot compaction fails to write due to transient disk errors, subsequent
snapshots will continue, but the failed one will not be retried.  When the
subsequent ones succeeded, all closed wal segments are removed causing data
loss.

This change simplifies the snapshotting capability to ensure that there is only
ever one snapshot.  If one fails, the next snapshot will update the existing
snapshot and retry all of old and new data.

Fixes #5686
2016-02-23 09:38:51 -07:00
Jonathan A. Sternberg 50753de032 Merge pull request #5782 from influxdata/js-5777-audit-panics-in-influxql
Remove the non-unreachable panics in the new query engine
2016-02-22 17:18:57 -05:00
Mark Rushakoff 191de2670c Fix non-compiling test 2016-02-22 13:49:11 -08:00
Mark Rushakoff fc5c8597ab Merge pull request #5758 from influxdata/mr-disk-stats
Track cache, WAL, filestore stats within tsm1 engine
2016-02-22 13:01:55 -08:00
Jason Wilder aa2e878019 Fix cache not deduplicating points in some cases
The cache had some incorrect logic for determining when a series needed
to be deduplicated.  The logic was checking for unsorted points and
not considering duplicate points.  This would manifest itself as many
duplicate points being returned from the cache and after a
snapshot compaction run, the points would disappear because snapshot
compaction always deduplicates and sorts the points.

Added a test that reproduces the issue.

Fixes #5719
2016-02-22 13:24:42 -07:00
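The distinction drawn in aa2e878019 between "unsorted" and "duplicate" points can be sketched as a single comparison. The function below is a hypothetical illustration, not the cache's actual code: using `<=` rather than `<` flags equal neighbouring timestamps as well as out-of-order ones.

```go
package main

import "fmt"

// needsDedup reports whether a run of timestamps must be sorted and
// deduplicated before being returned. A check using only "<" would miss
// duplicates that happen to arrive in order.
func needsDedup(ts []int64) bool {
	for i := 1; i < len(ts); i++ {
		if ts[i] <= ts[i-1] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(needsDedup([]int64{1, 2, 3})) // strictly increasing
	fmt.Println(needsDedup([]int64{1, 2, 2})) // in order, but duplicated
	fmt.Println(needsDedup([]int64{3, 1}))    // out of order
}
```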
Jonathan A. Sternberg 7a03df2af1 Remove the non-unreachable panics in the new query engine
The only panics left are ones that should be unreachable unless there is
a bug.

Fixes #5777.
2016-02-22 12:52:43 -05:00
Jon Seymour c93da21a61 tsm: cache: only use NewCache for the engine cache; snapshots use a simpler constructor
The intent of this change is to avoid writing caches created for
snapshot cache instances into the tsm1_cache measurement. We can do
this by avoiding use of the NewCache constructor. All other methods
are only intended to be called on the engine cache - never
on a snapshot.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-22 15:17:43 +11:00
Jon Seymour 510ee2c790 tsm: cache: during writes, update the memSize statistic outside the lock
Since we are not locking but relying on atomic arithmetic,
use Add rather than Set. Will also result in slightly less garbage
being created.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-22 08:26:35 +11:00
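The reasoning in 510ee2c790 (add a delta atomically instead of setting a recomputed total under a lock) can be sketched as below. This example uses `sync/atomic` on a plain `int64` as an assumption. The `stats` type and its methods are hypothetical and do not reflect the statistics type the cache actually uses.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// stats is a hypothetical holder for a memSize-style counter.
type stats struct{ memSize int64 }

// addMem applies a delta atomically, so no lock is needed and concurrent
// writers cannot overwrite each other's updates.
func (s *stats) addMem(n int64) { atomic.AddInt64(&s.memSize, n) }

// mem reads the current value atomically.
func (s *stats) mem() int64 { return atomic.LoadInt64(&s.memSize) }

// concurrentWrites starts the given number of goroutines, each adding n.
func concurrentWrites(writers int, n int64) int64 {
	var s stats
	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.addMem(n)
		}()
	}
	wg.Wait()
	return s.mem()
}

func main() {
	fmt.Println(concurrentWrites(8, 100))
}
```

A `Set` outside the lock would be a read-modify-write race: two writers could both read the old total and one update would be lost.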
Jon Seymour 9c6efe99f1 tsm: cache: ensure all statistics are initialised on cache creation.
The intent of this change is to ensure that all statistic fields of the
resulting tsm1_cache measurement are initialized on initialization of
the cache. That way, any consumer of those measurements doesn't
have to deal with the null case.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-21 15:33:50 +11:00
Jon Seymour 6697c721fb tsm: cache: add cache throughput related statistics.
Complementing and extending the changes in #5758.

Add 2 level statistics:

  * snapshotCount
  * cacheAgeMs

Add 2 counter statistics

  * cachedBytes
  * WALCompactionTimeMs

snapshotCount can be used to measure transient write errors that are causing snapshots to accumulate

cacheAgeMs can be used to gauge the level of write activity into the cache

The differences between cachedBytes stats sampled at different times can be used to calculate cache throughput rates

The ratio (cachedBytes-diskBytes)/WALCompactionTimeMs can be used to calculate WAL compaction throughput.

The ratio of difference between first and last WAL compaction time over the interval
length is an estimate of percentage of cache throughput consumed.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-20 22:18:57 +11:00
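The cache throughput calculation mentioned in 6697c721fb (the difference between two cachedBytes samples divided by the time between them) can be sketched as follows. The `sample` type and its field names are assumptions for illustration. Only the cachedBytes counter itself comes from the commit message.

```go
package main

import "fmt"

// sample is a hypothetical reading of the cachedBytes counter taken at a
// wall-clock time expressed in milliseconds.
type sample struct {
	atMs        int64
	cachedBytes int64
}

// throughputBytesPerSec estimates cache throughput between two samples.
// It returns 0 when the samples are not in increasing time order.
func throughputBytesPerSec(first, last sample) float64 {
	elapsedMs := last.atMs - first.atMs
	if elapsedMs <= 0 {
		return 0
	}
	delta := last.cachedBytes - first.cachedBytes
	return float64(delta) * 1000 / float64(elapsedMs)
}

func main() {
	first := sample{atMs: 0, cachedBytes: 0}
	last := sample{atMs: 2000, cachedBytes: 4096}
	fmt.Println(throughputBytesPerSec(first, last), "bytes/s")
}
```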
Mark Rushakoff 602043e11b Add disk stats for FileStore 2016-02-19 16:37:34 -08:00
Mark Rushakoff d99c09cedd Add stats for current and old WAL segment sizes 2016-02-19 16:37:34 -08:00
Mark Rushakoff e76967efb6 Add stats to tsm1.Cache 2016-02-19 16:37:34 -08:00
Joe LeGasse dc8ed7953d Remove custom binary-conversion functions
Also cleaned up some excess allocations, and other cruft from the code
2016-02-18 13:56:35 -05:00
Ben Johnson f7e04abef7 remove NaN from query engine
This commit removes `math.NaN` returns from float iterators.
2016-02-17 14:11:31 -07:00
liang@qiniu.com 1ad0f933f4 Remove redundant wal files 2016-02-16 20:45:13 +08:00
Jon Seymour ab702eb44a doc: remove the implication that the wal directory is inside the shard directory.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-15 05:33:22 +11:00
Jon Seymour ed0a112f8e doc: Add an Errata section intended to capture clarifications prior to full revisions of the text.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-15 00:29:02 +11:00
Jon Seymour 5e563d53c1 doc: revise discussion about cache design
The description of the cache design was out of date - reflecting an older
design based on checkpoints and evictions. This revision updates the
design to describe snapshots and also clarify that if compaction performance
falls behind the inbound write rate then writes will fail.

Updates based in part on clarifications provided by Jason Wilder. See https://goo.gl/L7AzVu

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-15 00:29:02 +11:00
Jon Seymour cdc7e28338 doc: rephrasing of how sets of SeriesIterators are generated.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-15 00:29:02 +11:00
Jon Seymour 58d1b7223a doc: refine TSM file system layout description
Minor improvements to phrasing to use the English word 'directory' and slight improvements to grammar.
2016-02-15 00:29:02 +11:00
Jon Seymour 285e0ad17a doc: refine description of the conclusion of the compaction process.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-15 00:29:02 +11:00
Jon Seymour 008af05f7b doc: various grammar/word-choice improvements in TSM design document
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-15 00:29:02 +11:00
Jon Seymour 88598f78dc doc: fix up some spelling errors/typos in .MD files
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-15 00:29:02 +11:00
Ben Johnson 0b3d367e5c Merge pull request #5623 from influxdata/jw-query-panic
Fix panic: runtime error: index out of range
2016-02-10 14:59:04 -07:00
Jason Wilder 0ce6dd1304 Fix panic: runtime error: index out of range
There was a fix in 5b1791, but is not present in the current branch likely due to a rebase issue.
The current code panics with a query like:

select value from cpu group by host order by time desc limit 1

This fixes the panic as well as prevents #5193 from re-occurring.  The issue is that aggressively
closing the cursors clears out the seeks slice so re-seeking will fail.
2016-02-10 14:00:58 -07:00
Justin Nuß 82c276756a Lint tsdb and tsdb/engine package 2016-02-10 21:33:46 +01:00
Ben Johnson d9a6a7340f add canonical paths 2016-02-10 11:30:52 -07:00
Ben Johnson 5a0d1ab7c1 rename influxdb/influxdb to influxdata/influxdb
This commit changes all the import and URL references from:

    github.com/influxdb/influxdb

to:

    github.com/influxdata/influxdb
2016-02-10 10:26:18 -07:00
Jonathan A. Sternberg d1f7c445e7 Modify iterators to work across shards
Aux iterators now ask the iterator creator what series will be returned
and determine which aux fields to create based on the results.

The `tsdb.Shards` struct also creates a call iterator around the
iterators returned from each shard.
2016-02-10 09:40:29 -07:00
Jonathan A. Sternberg c2d1206177 Implement the fill iterator
Fill requires an additional function for IteratorCreator to retrieve the
series that will be returned from the iterator. When fill is required
for an aggregate, the IteratorCreator will be asked what series will be
returned by the created iterator.
2016-02-10 09:40:29 -07:00
Ben Johnson 6204350d65 fix math operations 2016-02-10 09:40:27 -07:00
Ben Johnson b4cb770a7f refactor aux iterators 2016-02-10 09:40:27 -07:00
Ben Johnson b8918a780c integer support 2016-02-10 09:40:25 -07:00
Jonathan A. Sternberg 583477064c Check for `tsdb.EOF` when looking for the lowest timestamp of aux fields 2016-02-10 09:40:25 -07:00
Jonathan A. Sternberg 34f14424dd Filter tags from the condition when building cursors on tsm1 2016-02-10 09:40:25 -07:00
Ben Johnson 00806de9b8 refactor query engine 2016-02-10 09:40:25 -07:00
Ben Johnson cde973f409 refactor query engine 2016-02-10 09:40:24 -07:00
Gabriel Levine 7d4217ab97 enabled golint for tsdb/engine/wal.go and wal_test.go and updated changelog. 2016-02-09 10:29:09 -05:00
Jason Wilder 2b3c640695 Fix reading too far in fileAccess.readBytes
Fixes #5566
2016-02-08 09:08:57 -07:00
Jason Wilder 28ae8b6fe0 Merge pull request #5434 from runner-mei/tsm_tombstone_windows
fix TSMReader.Delete() and all unit tests is pass in the windows
2016-02-04 16:27:26 -07:00
Jason Wilder b635e516e5 Merge pull request #5485 from runner-mei/patch-7
fix munmap bug in the windows
2016-02-04 13:47:51 -07:00
Jason Wilder 5a124e0e0b Merge pull request #5431 from runner-mei/patch-5
fix determine the file size
2016-02-04 10:24:05 -07:00
Edd Robinson 1bcb1d033f Allow Close to be called multiple times safely 2016-02-03 10:20:22 +00:00
INADA Naoki 80a637904d tsm1: Use unixnano instead of time.Time 2016-02-03 10:05:40 +09:00
INADA Naoki 771253256b FloatValue uses unixnano instead of time.Time 2016-02-03 09:57:00 +09:00
INADA Naoki 898babf616 add float bench 2016-02-03 03:12:16 +09:00
runner.mei 4ca47103b1 fix TSMReader.Delete() and all unit tests is pass in the windows 2016-01-31 11:32:08 +08:00
runner bc992fea5e fix munmap bug in the windows
2016-01-31 10:46:46 +08:00
runner 4b7fe70cd3 fix determine the file size
2016-01-30 14:16:53 +08:00
runner.mei 53f7e03f72 fix TSMReader.Delete() and all unit tests is pass in the windows 2016-01-30 14:15:46 +08:00
Jason Wilder 924275b337 Fix panic preventing wal file truncation
Fixes #5455
2016-01-28 21:50:51 -07:00
Jason Wilder 9528c3ea70 Merge pull request #5465 from influxdata/jw-remote-writes
Optimize remote writes
2016-01-27 15:47:02 -07:00
Jason Wilder 1d165d38a9 Optimize Cache entry.add
This reduces some of the lock contention when writing to the cache.
When a new entry is created, it avoids an allocation.  It also skips
a check to see if we need to sort if we already know it needs to be sorted.
2016-01-27 14:26:42 -07:00
Ben Johnson 98baf078d0 tsm1 query performance improvements 2016-01-27 13:42:32 -07:00
Jason Wilder 372302bcbd Reduce lock contention in Cache.WriteMulti
A write-lock was taken the whole time, but we only need the write
lock at the end.
2016-01-25 16:48:34 -07:00
Jason Wilder 5bee8880db Reduce lock contention in engine.WritePoints
Writing the snapshot would deduplicate the snapshot points
while still holding the engine write-lock.  This can be expensive
under high load and cause writes to back up and OOM the server.

Instead, grab the snapshot under the lock and dedup it after releasing
the lock.

Possible fix for #5442
2016-01-25 15:37:34 -07:00
Jason Wilder 24f1bcfd20 Remove Dev prefix from tsm engine/tx 2016-01-10 16:43:36 -07:00
Jason Wilder 5b179113fc Don't close tsm cursor prematurely
We were closing the cursor when we read the last block which caused
the internal state to be cleared.  In a group by query, we seeked multiple
times so depending on the group by interval and how the data was laid out
in the blocks, we would close the cursor and the last block would get skipped.

Fixes #5193
2016-01-10 15:26:01 -07:00
Jason Wilder 3c45015311 Remove MAP_POPULATE
This may be causing slow restart times for systems with many large TSM files.
What I believe is happening at startup in these cases is that multiple goroutines
are started to load each TSM file concurrently.  The kernel appears to serialize
mmap calls from the same process so all of the goroutines end up getting blocked
on the actual mmap system call.  MAP_POPULATE instructs the kernel to pre-fault the
page table for the files and triggers read-ahead of the pages.  For larger, 2GB files,
this makes the mmap call more expensive and slower.  When there are many of these files
and calls it is possible to fill all available memory with pagecache.  In this case,
the OS will end up pre-faulting pages from one file and have to remove pages that it just
loaded from another file, causing slowness.  MAP_POPULATE may also cause much more data
to be pre-faulted than necessary.  To load a file, we just need to scan the index at the end
of the file.  MAP_POPULATE is likely causing the whole file to be loaded when it won't actually
be accessed for a while (or at all).

Might fix issue #5311.
2016-01-08 08:45:27 -07:00
Jason Wilder 756421ec4a Look for fully compacted block in addition to max size during compaction
Some data shapes would cause files to grow larger than the max size more
quickly which resulted in them getting skipped by the full compaction planner
at times.  Some datasets that could make this happen are very large keys or
very large numbers of keys (10M).  When this happened, multiple max sized
files would accumulate but the blocks would not be full.  When the shard went
cold for writes, these files would get recompacted down to the optimal size, but
a lot of space would be wasted in the meantime.
2016-01-07 15:18:42 -07:00
Jason Wilder faf8ee17fa Fix typo 2016-01-06 12:53:04 -07:00
Jason Wilder d2b7c03175 Re-use the series key
Avoid allocating the string twice.
2016-01-06 12:52:13 -07:00
Jason Wilder 2f7a0090c1 Don't allocate a pre-sized buffer for each cursor
This is contributing to some of the high memory usage on queries and possibly
some OOMs.  This is slightly slower, but removing it allows some fairly large
count queries over 5M series to complete instead of crashing the process using
tsm1 engine.
2016-01-06 10:50:38 -07:00
Jason Wilder 6f577cfef5 Reduce allocations when compacting
Key() returned the key and the entries.  We did not always need the
entries so they would be allocated and ignored.  Added a KeyAt func
that just returns the key to avoid the unnecessary entries allocation.
2016-01-05 16:16:44 -07:00
Jason Wilder 9a9ccab560 Reduce allocation in wal encoder
Use sync.Pool for some temporary buffers used while encoding instead of
allocating new ones each time.  Also increased the default buffer size which
might be too small.  Probably need to make this a config var.
2016-01-05 16:12:25 -07:00
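The buffer reuse described in 9a9ccab560 can be sketched with `sync.Pool`. This is a minimal illustration, not the WAL encoder: the `encode` function, its output format, and the 1 KiB initial capacity are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable scratch buffers. The initial capacity here is
// arbitrary and chosen only for the example.
var bufPool = sync.Pool{
	New: func() interface{} {
		return bytes.NewBuffer(make([]byte, 0, 1024))
	},
}

// encode is a hypothetical encoder that borrows a scratch buffer from the
// pool instead of allocating a new one on every call.
func encode(key string, value int64) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	fmt.Fprintf(buf, "%s=%d", key, value)

	// Copy the result out, because the buffer's backing array will be
	// reused by a later caller.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out
}

func main() {
	fmt.Println(string(encode("cpu", 42)))
}
```

The copy at the end is the cost of safety here. A real encoder might instead write straight from the pooled buffer to its destination before returning the buffer to the pool.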
Jason Wilder ee54a1e791 Write TSM data directly to writer
We were buffering up the data to write into byte slices to reduce
IO calls but at larger sizes, this causes memory to spike.  The TSMWriter
was switched to use a bufio.Writer internally so this byte slice buffering
is unnecessary and costly now.
2016-01-05 14:46:07 -07:00
Jason Wilder d2889ecd6a Avoid creating slices of all keys during compaction 2016-01-05 09:38:00 -07:00
Jason Wilder 7794b9c5d4 Fix panic: runtime error: slice bounds out of range
The block count was an uint16 when incrementing the index location
which was an int32.  This caused the uint16 value to overflow
before the index location was incremented causing the wrong location
to be read on the next iteration of the loop.  This triggers the slice
out of range errors.

Added a test that recreates the panic seen in #5257 and possibly #5202 which
is older code.

Fixes #5257
2016-01-04 11:20:24 -07:00
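The class of bug described in 7794b9c5d4 (arithmetic done in uint16 before being added to an int32 position) can be reproduced in a few lines. The function names and the idea of multiplying a count by an entry size are hypothetical. The sketch only demonstrates how the narrow type wraps before widening.

```go
package main

import "fmt"

// advanceNarrow multiplies in uint16 and only then widens to int32, so the
// product wraps around at 65536 before it is added to the position.
func advanceNarrow(pos int32, count, entrySize uint16) int32 {
	return pos + int32(count*entrySize)
}

// advanceWide widens both operands first, so the product is computed in
// int32 and does not wrap for these inputs.
func advanceWide(pos int32, count, entrySize uint16) int32 {
	return pos + int32(count)*int32(entrySize)
}

func main() {
	var count, entrySize uint16 = 3000, 28 // true product is 84000
	fmt.Println("narrow:", advanceNarrow(0, count, entrySize))
	fmt.Println("wide:  ", advanceWide(0, count, entrySize))
}
```

With these inputs the narrow version lands on 84000 mod 65536 = 18464, which is the kind of wrong offset that leads to out-of-range slice reads on the next iteration.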
Paul Dix 49d480cb0c Fix races in backup/restore 2015-12-31 08:42:01 -05:00
Paul Dix 5974d37649 Fix backup test to mock out compaction 2015-12-31 08:15:13 -05:00
Paul Dix 26e1c6464a Update backup to address PR comments 2015-12-30 18:06:51 -05:00
Paul Dix 59fbd371fc Implement backup/restore for TSM.
This changes backup and restore to work for TSM. It breaks it for b1 and bz1, but since those are getting removed it's ok.

The backup runs against any host that is specified and can back up either the metastore, a database, a specific retention policy, or a specific shard. It can also take incremental backups with the `since` flag, which will only back up TSM files that have been created since that timestamp.

The backup is safe to run online. However, for shards that are still hot for writes, they won't be able to create new TSM files while the backup for that single shard runs. If the backup isn't too large and the write throughput isn't too high this shouldn't be a problem since the writes will just go into the WAL cache.
2015-12-30 18:06:50 -05:00
Jason Wilder b6da176a4b Fix direct index size not calculated 2015-12-23 18:01:11 -07:00
Jason Wilder f9ae8077da Allow compactions to run when files have tombstones 2015-12-23 18:01:11 -07:00
Jason Wilder a38c95ec85 Update compactions to run concurrently
This has a few changes in it (unfortunately).  The main change is to run compactions
concurrently.  While implementing this, a few query and performance bugs showed up that
are also fixed by this commit.
2015-12-23 18:01:11 -07:00
Jason Wilder 48d4156eac Fix blocks not sorted correctly when chunking 2015-12-23 18:01:11 -07:00
Jason Wilder bb2562b2ab Return CompactionGroups from planning 2015-12-23 18:01:11 -07:00
Jason Wilder d0ec0a15e2 Fix wrong test data setup 2015-12-23 18:01:11 -07:00
Ady 5c888b3673 Merge branch 'master' of https://github.com/influxdb/influxdb into mvadu-patch-4358
Trying to get to latest master from influxdb
2015-12-19 01:45:07 +05:30
Jason Wilder 7e97b0eafd Fix rename temp file on windows 2015-12-18 11:57:37 -07:00
Jason Wilder 611017f4ed Add comments 2015-12-18 10:00:07 -07:00
Jason Wilder 930174bf4d Handle calling WriteBlock with no data gracefully 2015-12-18 09:57:16 -07:00
Jason Wilder 6bc7765b88 Handle calling write with no values to TSMWriter gracefully 2015-12-18 09:52:53 -07:00
Jason Wilder 421a127f11 Add indirectIndex.UnmarshalBinary benchmark 2015-12-17 15:38:51 -07:00
Jason Wilder 8c7e11f4cf Aggressively clean up KeyCursor resources 2015-12-17 12:51:51 -07:00
Jason Wilder fd2a409ea3 Skip decoding blocks that are already full 2015-12-17 12:47:05 -07:00
Jason Wilder 825296ddd8 Add comments 2015-12-16 11:30:06 -07:00
Jason Wilder 88324bf61c Optimize indirectIndex.UnmarshalBinary further 2015-12-16 11:28:13 -07:00
Jason Wilder 70d1f45058 Load TSM files concurrently 2015-12-16 11:28:12 -07:00
Jason Wilder 737871268b Speed up indirectIndex.UnmarshalBinary
Remove a bunch of unnecessary allocations to improve startup times.
2015-12-16 11:16:17 -07:00
Jason Wilder 3893bc60e1 Speed up TSM compactor
Just keep the current block for each iterator in the buffers.
2015-12-16 11:16:17 -07:00
Jason Wilder 00f570441b Convert TSMKeyIterator to return blocks 2015-12-16 11:16:17 -07:00
Jason Wilder 59a57d8f73 Convert CacheKeyIterator to return encoded blocks 2015-12-16 11:16:17 -07:00
Jason Wilder 0623648140 Add chunking support back to TSMKeyIterator
Was removed when MergeIterator was deleted.
2015-12-16 11:16:17 -07:00
Jason Wilder 31b97c3fe0 Add max points per block back for CacheKeyIterator
Was removed when MergeIterator was removed.
2015-12-16 11:16:16 -07:00
Jason Wilder 45e87cdfe4 Strip checksum when returning block from ReadBytes 2015-12-16 11:16:16 -07:00
Jason Wilder 97435b9124 Return minTime/maxTime from BlockIterator.Read 2015-12-16 11:16:16 -07:00
Jason Wilder ce6de9728e Add test for BlockIterator with multiple blocks for a key 2015-12-16 11:16:16 -07:00
Jason Wilder 4a3037814f Add WriteBlock to TSMWriter 2015-12-16 11:16:16 -07:00
Jason Wilder d99c1f944e Add BlockIterator for reading TSM blocks without decoding 2015-12-16 11:16:16 -07:00
Jason Wilder 928aef04cd Split data_file.go into reader.go and writer.go 2015-12-16 11:16:16 -07:00
Alexandre Viau ad1044dde9 typo: unkown -> unknown 2015-12-15 18:10:47 -05:00
Philip O'Toole 01ac0b3f23 Tweak compaction log messages 2015-12-15 10:33:13 -08:00
Philip O'Toole a6cdb5229d Log tsm initialization 2015-12-14 15:50:56 -08:00
Philip O'Toole 75764517f6 Merge pull request #5082 from li-ang/fix_x
Fix wrong value of countCompacting in WAL
2015-12-11 10:07:56 -08:00
Philip O'Toole 03f8cd3956 Add comment explaining magic number 2015-12-10 11:46:40 -08:00
Jason Wilder 631ecc23de Fix growing destination buffer during WAL entry encoding
The test to see if the destination buffer for encoding and decoding a WAL
entry was broken and would cause a panic if there were large batches that
would overflow the buffer size.

Fixes #5075
2015-12-10 11:46:40 -08:00
liang@qiniu.com 34bdffdb00 Fix wrong value of countCompacting in wal 2015-12-10 17:47:20 +08:00
Ady 07c0939fe1 Added logic to let memory-mapped files be renamed by the OS. Now a copy
is created in memory with the SHARED_DELETE flag, so that the OS is free to rename
or delete the original file
2015-12-10 01:07:50 +05:30
Jason Wilder 992aea7bd3 Merge pull request #5060 from influxdb/jw-drop-db
Cancel writing TSM files when engine closes
2015-12-08 16:16:07 -07:00
Paul Dix b192136887 Merge pull request #5058 from influxdb/pd-update-compaction-logic
Update TSM compaction logic
2015-12-08 18:14:15 -05:00
Paul Dix 27cc2ea0cc Update compact.Plan 2015-12-08 18:01:31 -05:00
Jason Wilder d7cff651d1 Cancel writing TSM files when engine closes
If the engine is closed while a compaction is going on, the close call
blocks until the goroutine exits.  This could be several minutes because
the control does not return back up to the channel selector while there is
still data to write.
2015-12-08 15:41:53 -07:00
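The cancellation described in d7cff651d1 can be sketched as a write loop that checks a done channel between blocks, so control returns promptly when the engine closes. The `writeAll` function, the `errClosed` value, and the use of strings as blocks are assumptions for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// errClosed is a hypothetical error returned when the write is cancelled.
var errClosed = errors.New("engine closed")

// writeAll writes blocks one at a time and checks the done channel before
// each write. It returns the number of blocks written.
func writeAll(blocks []string, done <-chan struct{}, write func(string)) (int, error) {
	for i, b := range blocks {
		select {
		case <-done:
			return i, errClosed
		default:
		}
		write(b)
	}
	return len(blocks), nil
}

func main() {
	done := make(chan struct{})
	written := 0
	n, err := writeAll([]string{"a", "b", "c", "d"}, done, func(string) {
		written++
		if written == 2 {
			close(done) // simulate the engine closing mid-compaction
		}
	})
	fmt.Println(n, err)
}
```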
Paul Dix 96445a53a7 Update TSM compaction logic
* Update compaction to look at newest files of the smallest step first
* Update compaction to look at older files in larger steps if newer files don't have enough small steps to compact
* Changed the TestDefaultCompactionPlanner_CombineSequence test to reflect what's possible now. We'd only have multiple files in the same generation if all files but one were over the max allowable size.
* Clean up the logic on when full compactions are run and when planning can be skipped
2015-12-08 17:33:38 -05:00
Jason Wilder 62cb3a1e9b Merge pull request #5057 from influxdb/jw-5046
Fix leaking TSM files when compacting
2015-12-08 13:11:46 -07:00
Jason Wilder 3543917a74 Avoid allocating strings during search 2015-12-08 13:02:17 -07:00
Jason Wilder 99c313ddae Fix leaking TSM files when compacting
The files being read were not closed after the compaction ran causing
them to leak.

Fixes #5046
2015-12-08 12:55:30 -07:00
Jason Wilder 9d82e24ca0 Fix performance of dropping large number of keys 2015-12-08 10:47:06 -07:00
Jason Wilder f245b44afa Set full compaction duration option on planner
Was set on engine and not planner so it was always 0.
2015-12-08 09:56:36 -07:00
Jason Wilder d32aeb2535 Merge pull request #5031 from influxdb/jw-mintime
Dedupe points at query time if there are overlapping blocks
2015-12-07 21:28:29 -07:00
Jason Wilder 87892d79da Dedupe points at query time if there are overlapping blocks 2015-12-07 21:10:10 -07:00
Fazal Majid bb386219f4 ran go fmt on mmap_solaris.go #4787 2015-12-07 17:41:26 -08:00
Fazal Majid 0f889a77d1 fix tsm1 for Solaris #4787, passes unit tests now 2015-12-07 17:14:26 -08:00
Jason Wilder a2583d2be1 Reduce lock contention when planning TSM queries 2015-12-07 15:42:36 -07:00
Jason Wilder 4da20c49e9 Optimize TSM file scanning for time queries
Move the index locations planning to be lazily created after the first
seek when we know what time and direction we're searching for.  This
allows files and blocks to be skipped before having to scan the file's index.

This improves query times with time filters when there are many TSM
files on disk.
2015-12-07 15:42:36 -07:00
Paul Dix 93d6afec97 Merge pull request #5019 from influxdb/jw-mintime
Remove min time from TSM blocks
2015-12-07 15:00:12 -05:00
Paul Dix 8096c6b845 Update TSM, address PR #5011 comments
* Moved TSM file extension to a constant
* Fixed typos
* Changed group.size() back to being a uint64 since it can have multiple files up to 4GB each.
2015-12-07 14:47:17 -05:00
Paul Dix 820b0d31d6 Update TSM to delete from the WAL/cache
* Update cache loader to delete entries from cache
* Add cache.Delete()
* Update delete to look at keys in the Cache in addition to the FileStore
* Update cache compaction to never happen if the cache is empty
2015-12-07 14:35:48 -05:00
Jason Wilder cf341eaa6a Remove MinTime from blocks
MinTime is now in the index for each block so storing it in the block
header is redundant.  The encodings also store it in their header so
we are actually storing it 3 times.

Removing this is an incompatible change with the current tsm1 file format.
2015-12-07 11:26:58 -07:00
Adarsha 5482c6de03 Avoid closing the handle in mmap
Added mmap implementation for Windows. It uses MapViewOfFile similar to Bolt's implementation. MapViewOfFile  returns a pointer and not a byte array. Bolt changed their data structure to support it. 

Instead of changing the implementation of tsm data structure, I used a trick shown in https://groups.google.com/forum/#!topic/golang-nuts/g0nLwQI9www to use SliceHeader to convert the pointer into a slice.

Bolt's implementation also closes the file handle in mmap itself. It was resulting in a timeout, so implemented https://github.com/edsrzf/mmap-go/blob/master/mmap_windows.go logic to keep file handle open until munmap
2015-12-07 23:30:19 +05:30
Paul Dix 440a8a8a1f Change all TSM file sizes to uint32 2015-12-07 10:12:24 -05:00
Paul Dix 937233d988 Update TSM compaction planning logic
* Update Plan to do a full compaction if cold for writes
* Remove MaxFileSize as a config variable from Compactor. Should be a set constant
* Update Plan to keep track of if the last check was fully compacted so we can skip future planning calls
* Update compact min file count to 3 so that compactions run more frequently
2015-12-07 08:26:30 -05:00
Paul Dix 1bee7d1512 Update TSM, remove old version, add config
* remove rolloverTSMFileSize constant that is no longer used
* remove the maxGenerationFileCount since it is no longer a limitation that's necessary with the new compaction scheme. We no longer read WAL segments as part of the compaction so memory is only used as we read in each individual key
* remove minFileCount and switch to a user configurable variable
* remove the mutex from WALSegmentWriter. There's never more than one open in the WAL at one time and it's not exported through any function so the lock on the WAL should be used. This simplified keeping track of the last write time and removed a bunch of unnecessary locks.
* update WALSegmentWriter.Write to take the compressed bytes so that encoding and compression can occur before the call to write (while we don't hold the WAL lock)
* remove a bunch of unnecessary locking in WAL.writeToLog
* Add check for TSM file magic number and version
* Remove old tsm, log, and unused cursor code
* Remove references to tsm1dev everywhere except in the inspector
* Clean up config options for compaction and snapshotting
* Remove old TSM configuration options
* Update the config.sample.toml with TSM options
* Update WAL compact to force if it has been cold for writes for a configurable period of time (1h by default)
2015-12-06 18:50:39 -05:00
Philip O'Toole 6e88547a5e Support shutting down engine goroutines
This was causing races in the code, when the cache was being reloaded,
because back-to-back open-and-closing of the engine during testing left
goroutines running. With this change the engine is completely shutdown
when Close() is called on it.
2015-12-06 09:16:38 -08:00
Philip O'Toole 0d0b919144 Integrate CacheLoader with tsm2 engine 2015-12-05 22:13:57 -08:00
Philip O'Toole fe7b3ad134 Add CacheLoader
The CacheLoader loads a given cache from a slice of segment files.
2015-12-05 22:13:57 -08:00
Philip O'Toole 4b5fb8db72 WALSegmentReader counts bytes read without error 2015-12-05 22:13:57 -08:00
Philip O'Toole c67831bc79 Remove double-checking of error when reading WAL 2015-12-05 22:13:57 -08:00
Paul Dix 40e606cb14 Merge pull request #5003 from influxdb/jw-compaction
Update compaction planning
2015-12-05 16:49:54 -05:00
Jason Wilder 33a33e6a23 Fix 32bit int overflow of constant value 2015-12-05 13:09:18 -07:00
Jason Wilder 41b24995a7 Compaction fixes 2015-12-05 12:19:28 -07:00
Philip O'Toole 7296de1fac Merge pull request #4999 from influxdb/cache_sort
Always copy the Cache values for query and merge with snapshot
2015-12-05 08:15:13 -08:00
Philip O'Toole 1b12ff9c1c Only take write-lock for Values when necessary 2015-12-05 08:06:01 -08:00
Jason Wilder 6592615958 Updated compaction strategy
This changes compacting files to merge sequences of files in lower generations
up to later generations
2015-12-04 23:30:39 -07:00
Philip O'Toole 789ab10658 Merge hot cache values with snapshots
This change starts by building the sequence of entries, which also
allows the required size of destination buffer to be calculated. Then
the buffer is allocated up-front in 1 call.

Each snapshot and hot value-set is appended to the buffer. If ordering
is violated at any time, set the 'needSort' flag. Sorting, if necessary,
is performed just before returning the data.
2015-12-04 20:58:02 -08:00
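A minimal sketch of the merge described in the commit above, not the code from that commit. The `value` type and the `mergeValues` function are hypothetical names; the sketch assumes each value carries an int64 timestamp and that the sets are passed in the order they should be appended.

```go
package main

import (
	"fmt"
	"sort"
)

// value is a hypothetical stand-in for a cached point.
type value struct {
	t int64   // timestamp
	v float64 // payload
}

// mergeValues sizes the destination buffer up front, appends every set,
// and sorts only if an out-of-order timestamp was seen while appending.
func mergeValues(sets ...[]value) []value {
	total := 0
	for _, s := range sets {
		total += len(s)
	}
	buf := make([]value, 0, total) // one allocation for the whole result

	needSort := false
	for _, s := range sets {
		for _, x := range s {
			if n := len(buf); n > 0 && x.t < buf[n-1].t {
				needSort = true // ordering violated; sort before returning
			}
			buf = append(buf, x)
		}
	}
	if needSort {
		sort.SliceStable(buf, func(i, j int) bool { return buf[i].t < buf[j].t })
	}
	return buf
}

func main() {
	snapshot := []value{{1, 1.0}, {3, 3.0}}
	hot := []value{{2, 2.0}, {4, 4.0}}
	for _, x := range mergeValues(snapshot, hot) {
		fmt.Println(x.t, x.v)
	}
}
```

When the inputs are already in order across sets, the sort is skipped entirely, which is the point of tracking the flag during the append pass.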
Philip O'Toole 859877fd09 Move all sort logic to entry type 2015-12-04 20:21:16 -08:00
Philip O'Toole 6e91679fab Always copy the Cache values for query 2015-12-04 15:37:45 -08:00
Paul Dix 9637446ba9 Merge pull request #4990 from influxdb/pd-loadmetadata-wal
Update TSM engine, WAL and encoding
2015-12-04 18:21:47 -05:00
Paul Dix 33506e4d3e Update TSM cache and engine LoadMetadataIndex 2015-12-04 16:40:01 -05:00
Paul Dix b0f3dcc8cc Update TSM metadata loading and write snapshot
* Update WriteSnapshot to always call synchronously
* Update LoadMetadataIndex to load WAL metadata from the cache
2015-12-04 16:03:17 -05:00
Jason Wilder 357b88c439 Increment sequence of max generation when compacting files 2015-12-04 13:46:28 -07:00
Jason Wilder 52bec1f7f6 Change TSM file naming to generation-sequence.tsm 2015-12-04 11:51:33 -07:00
Jason Wilder 479469994a Optimize FileStats calls
FileStats, called frequently during compaction planning, was too expensive because
the stats were cleared out every time a file was replaced, causing them all to be reloaded.
Instead, we grab the stats that are already maintained by the files themselves from
the files when needed.
2015-12-04 11:16:39 -07:00
Jason Wilder 70710df910 Fix typo 2015-12-04 10:02:59 -07:00
Jason Wilder c7e37766e7 Avoid repetitive index searches when iterating over cursors
First pass at TSM cursor iteration ended up searching the file indexes
too frequently and hurt performance.  This changes that to search it once
and then have the cursor hold onto the block locations to seek
to.  Doubles the query performance from the first iteration, but still a lot
of room for improvement.
2015-12-04 10:02:59 -07:00
Jason Wilder 4b7cc6720a Merge pull request #4983 from influxdb/jw-tsm-deletes2
Implement delete series/measurement
2015-12-04 10:02:11 -07:00
Jason Wilder c54a3da0ca Implement delete series/measurement 2015-12-04 09:10:26 -07:00
Paul Dix eafb703afc Update TSM engine, WAL and encoding
* Add InfluxQLType to Values to map the TSM type to InfluxQL
* Fix bug in WAL where close wouldn't nil out the currentSegment after closing it
* Export writeSnapshot to be used in tests, add argument to run it async or not
* Update reloadCache to load temporary metadata information in the engine
* Update LoadMetadataIndex to use the temp WAL metadata information
2015-12-04 11:09:39 -05:00
Philip O'Toole 89a3490cae Merge pull request #4989 from influxdb/cache_rename
Fix comment and remove snapshot stutter
2015-12-04 07:43:26 -08:00
Philip O'Toole f939e49f0f Fix comment and remove snapshot stutter 2015-12-04 07:29:58 -08:00
Paul Dix b7bae53405 Merge pull request #4980 from influxdb/cursor_desc
Fix descending cache cursor
2015-12-04 07:02:13 -05:00
Adarsha 6a0e60c67e Added mmap implementation for Windows
Added mmap implementation for Windows. It uses MapViewOfFile similar to Bolt's implementation. MapViewOfFile returns a pointer and not a byte array. Bolt changed their data structure to support it.

Instead of changing the implementation of tsm data structure, I used a trick shown in https://groups.google.com/forum/#!topic/golang-nuts/g0nLwQI9www to use SliceHeader to convert the pointer into a slice.
2015-12-04 10:20:43 +05:30
Adarsha d39d0a5c90 Removed Syscall.Mmap to use platform specific mmap
Updates lines 1794 and 2304 to use mmap in windows or unix versions instead of Syscall.Mmap
2015-12-04 09:17:13 +05:30
Philip O'Toole 2d79d7e35f Fix descending cache cursor 2015-12-03 14:34:29 -08:00
Jason Wilder 66c9ef862e Fix regressions
Something broke with writing to the WAL now that compactions are running
concurrently.  There was also a performance problem with Next/Prev doing
twice as many searches as necessary.
2015-12-03 14:25:03 -07:00
Jason Wilder adf5c5b223 Replace Next/Prev with Scan 2015-12-03 12:39:13 -07:00
Jason Wilder 193a36eeb6 Fix code review comments 2015-12-03 12:39:13 -07:00
Jason Wilder 2019e70331 Fix reading string blocks
The block value to decode was 4 bytes too long so decoding string
block returned a snappy decode error.
2015-12-03 12:39:13 -07:00
Jason Wilder 2ad32af7ea Add desc query support 2015-12-03 12:39:13 -07:00
Jason Wilder be59ba3455 Add Prev support to FileStore
Allows reading the previous block of values given a timestamp and key.
2015-12-03 12:39:12 -07:00
Jason Wilder e9832d7414 Add multi-field cursor support to devtsm1 engine 2015-12-03 12:37:47 -07:00
Jason Wilder 6fba01df89 Implement single field TSM queries 2015-12-03 12:35:36 -07:00
Paul Dix 4624fb2a78 Update cache to address PR comments 2015-12-03 14:03:11 -05:00
Adarsha c2b8a24004 Added mmap implementation for Windows
Added mmap implementation for Windows based on MapViewOfFile. Used SliceHeader trick to change the pointer returned by MapViewOfFile to a byte slice. This will not call for any change in rest of tsm.

However, I am not sure where this mmap function is called, as go build still complains about
    
    tsdb\engine\tsm1\tsm1.go:1974: undefined: syscall.Mmap
    tsdb\engine\tsm1\tsm1.go:1974: undefined: syscall.PROT_READ
    tsdb\engine\tsm1\tsm1.go:1974: undefined: syscall.MAP_SHARED
    tsdb\engine\tsm1\tsm1.go:2033: undefined: syscall.Munmap
2015-12-03 23:43:48 +05:30
Paul Dix be4891c40b Update TSM write snapshot, Compactor
* Ensure that writing snapshots in engine are goroutine safe
* Add Clone method to Compactor
2015-12-03 11:49:47 -05:00
Paul Dix 6722e9ff14 Update TSM engine, engine_test, and wal_test
* Address jwilder's comments in #4966
2015-12-03 10:49:47 -05:00
Paul Dix bf65e967aa Add test for compacting multiple TSM files 2015-12-03 10:36:17 -05:00
Paul Dix b0fb8a0a27 Update TSM cache, compact, wal, encoding
* Update cache to have a single slice of values for a key (removed checkpoints)
* Changed compact.Plan to only worry about TSM files.
* Updated Plan to not return an error since there was no case in which it would.
* Update WAL to not keep stats since they're no longer needed.
* Update engine to flush the Cache/WAL to a new TSM file when the min threshold is hit.
* Split compact logic between TSM compacts and WAL/Cache writes.
* Remove unnecessary merge iterator, wal segment iterator, and other no longer necessary stuff.
* Remove the ascending bool from the Dedupe method. Values should always be in ascending order. It's up to the cursor to iterate through values based on the direction. Giving the cursor responsibility makes it so we don't need to sort, dedupe or reallocate anything for different query orders.
* Updated engine to use its locks to ensure writes and cache flushes don't cause a race.
* Update all tests with new signatures. Removed a bunch of tests around TSM rewrites and WAL segment iteration that are no longer necessary.
2015-12-03 08:11:50 -05:00
Jason Wilder 83ccaaa656 Reload cache at startup 2015-12-02 14:16:36 -07:00
Jason Wilder ba99dece0c Wire up tsm1dev engine cursor 2015-12-02 14:01:10 -07:00
Jason Wilder 3a8a19a99d Implement LoadMetaDataIndex for tsm1dev engine 2015-12-02 13:38:06 -07:00
Jason Wilder 3014d7e391 Return errors for func not implemented in tsm1dev engine 2015-12-02 11:06:01 -07:00
Jason Wilder a7e21c2975 Don't set a cache memory limit by default
100mb is easy to hit even with a basic stress test config.  Don't set
a limit by default so that an operator can size it appropriately based
on their hardware.
2015-12-02 11:01:13 -07:00
Jason Wilder 6847a6ba0c Fix rebase 2015-12-02 09:47:16 -07:00
Jason Wilder 751d1dd467 Don't rewrite TSM files while WAL segments exist
This approach is not working and needs to be reworked.
2015-12-02 09:45:24 -07:00
Jason Wilder 5744f5ba02 Add ability to filter values by time when writing TSM files 2015-12-02 09:45:24 -07:00
Jason Wilder 708266da69 Cache related compaction fixes 2015-12-02 09:45:24 -07:00
Jason Wilder 231c052003 Don't limit WAL segments during compaction
Since they are already loaded in the cache, this limit is not really
needed anymore.
2015-12-02 09:45:24 -07:00
Jason Wilder 7e249e0555 Use CacheKeyIterator instead of WALKeyIterator during compactions 2015-12-02 09:45:24 -07:00
Jason Wilder 4a03469662 Integrate TSM compaction into dev engine 2015-12-02 09:45:23 -07:00
Jason Wilder 78fda2b89b Implement WAL SegmentStats for compactions 2015-12-02 09:45:23 -07:00
Jason Wilder 1485ea7e41 Implement Size on TSMReader 2015-12-02 09:45:23 -07:00
Jason Wilder d4b1c25f8e Add CompactionPlanner type
CompactionPlanner is used to determine which files (WAL Segments, TSM
Files) to include in a given compaction run.
2015-12-02 09:45:23 -07:00
Jason Wilder 5291fbcf39 Add TSM support to MergeIterator
Enables the ability to combine multiple TSM files into one as well
as merge existing TSM files with newer WAL segment values.
2015-12-02 09:45:23 -07:00
Jason Wilder acdb6bcdf6 Add TSMKeyIterator
Allows iterating over multiple TSM files in sorted key and value order.
2015-12-02 09:45:23 -07:00
Jason Wilder 4b6767bf01 Add MMAP based file reader 2015-12-02 09:45:23 -07:00
Philip O'Toole fc83968e2e Cache values supports sorting order 2015-12-01 13:24:25 -08:00
Philip O'Toole 3a72e40e3f Implement descending cursor support 2015-12-01 13:24:25 -08:00
Philip O'Toole ec4daaccff Test ascending tsm1dev cursor 2015-12-01 13:24:25 -08:00
Philip O'Toole 59674fda21 Integrate cache query with tsm1dev engine 2015-12-01 13:24:25 -08:00
Philip O'Toole 7da3fc1aeb Merge pull request #4934 from influxdb/dedupe_sort_order
Deduplicate supports requesting sort order
2015-12-01 16:23:25 -05:00
Philip O'Toole bad0f657de Deduplicate supports requesting sort order 2015-11-30 16:21:44 -08:00
Philip O'Toole 6b3c6a90a1 Merge pull request #4911 from influxdb/integrate_cache
Integrate cache with tsm1dev write path
2015-11-30 14:58:42 -08:00
Philip O'Toole 8649ce4c49 Integrate cache with tsm1dev write path 2015-11-26 06:07:19 -08:00
Philip O'Toole 1bca38bb84 Cache supports writing multiple keys
This keeps the locking to a minimum if the data is available for
multiple keys at once.
2015-11-26 06:07:16 -08:00
Ben Johnson 41459cf687 fix flush deadlock
This commit fixes a deadlock that occurs during b1 flushes. It's
caused by taking locks in a different order. In the flush, b1
locks the engine and then bolt. However, in the query cursor, a
lock is obtained on bolt first (via `DB.Begin()`) and then the
engine is locked while reading from the engine's cache.
2015-11-25 15:00:06 -07:00
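A minimal sketch of the lock-ordering rule behind the deadlock described in the commit above. The commit message does not say which order the fix settled on, so the order used here (engine first, then database) is an assumption, and `store`, `flush`, `query`, and `runConcurrently` are hypothetical names standing in for the b1 engine and Bolt.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical pair of resources guarded by two locks.
type store struct {
	engineMu sync.Mutex // guards cache
	dbMu     sync.Mutex // guards onDisk
	cache    int
	onDisk   int
}

// flush moves cached data to disk. Lock order: engine, then db.
func (s *store) flush() {
	s.engineMu.Lock()
	defer s.engineMu.Unlock()
	s.dbMu.Lock()
	defer s.dbMu.Unlock()
	s.onDisk += s.cache
	s.cache = 0
}

// query reads both resources using the same lock order as flush.
// Taking dbMu first here would recreate the deadlock: one goroutine
// would hold engineMu waiting for dbMu while another holds dbMu
// waiting for engineMu.
func (s *store) query() int {
	s.engineMu.Lock()
	defer s.engineMu.Unlock()
	s.dbMu.Lock()
	defer s.dbMu.Unlock()
	return s.cache + s.onDisk
}

// runConcurrently runs n flushes and n queries at the same time.
func runConcurrently(s *store, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); s.flush() }()
		go func() { defer wg.Done(); _ = s.query() }()
	}
	wg.Wait()
}

func main() {
	s := &store{cache: 10}
	runConcurrently(s, 100)
	fmt.Println("total after concurrent flushes and queries:", s.query())
}
```

Any single, consistent order avoids the cycle; what matters is that every code path agrees on it.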
Philip O'Toole 8e7dc3bef9 WAL returns current segment ID on write and delete 2015-11-25 12:23:10 -08:00
Jason Wilder d931f5dd22 Merge pull request #4900 from influxdb/jw-compact
WAL segment compaction
2015-11-24 21:35:13 -07:00
Jason Wilder 34bffd5e18 Code review fixes 2015-11-24 21:24:13 -07:00
Jason Wilder 1ce8d6290b Remove values pool replacement
Getting an intermittent test failure with this so removing it for now since compactions
are still able to keep up without it.  Will need to look into this further because the
allocations are still very high and will affect compactions over longer periods of time.
2015-11-24 13:40:07 -07:00
Jason Wilder 0832a03333 Move pools to separate file 2015-11-24 09:44:37 -07:00
Jason Wilder a6541937f8 Add dumptsmdev to influx_inspect
Allow inspecting the updated TSM format.
2015-11-24 08:50:13 -07:00
Jason Wilder 25206c729c Add compactor type 2015-11-24 08:50:07 -07:00
Philip O'Toole f8b4950ea9 Enhance tsm1dev logging 2015-11-23 14:24:39 -08:00
Jason Wilder f70323cb89 Add MergeIterator
MergeIterator will be used to merge multiple TSM KeyIterators and the
WAL KeyIterator using a stream based iteration approach.  Each iteration
cycle returns a key and values ordered in a way to write a new TSM file
optimally.
2015-11-23 14:59:15 -07:00
Jason Wilder 5334271e26 Add KeyIterator for WAL segments
This provides an interface and type to combine multiple WAL segments
in order and then allow the values to be read in an order suitable for
writing to a TSM file.
2015-11-23 14:59:15 -07:00
Jason Wilder d2b045f89b Code cleanup 2015-11-23 14:03:50 -07:00
Jason Wilder 697cfe604b Add stubbed out dev tsm engine
Starting to integrate some of the components into an engine that is
usable for development purposes.  This allows the code to evolve while
keeping the existing TSM engine intact for reference.

Currently, just the WAL is wired up so writes can be tested.  Other engine
functions will panic the server if called.
2015-11-23 13:55:34 -07:00
Jason Wilder 7461b61bf2 Fix race in WAL and WALSegmentWriter
WAL currentSegmentWriter was not accessed under a mutex.  The WALSegmentWriter
also did not use a mutex to protect the underlying writer.
2015-11-23 13:55:34 -07:00
Jason Wilder aa00ef953a Fix typo in func names 2015-11-23 13:55:34 -07:00
Jason Wilder e2b1a09ece Implement WAL write/delete functions 2015-11-23 13:55:33 -07:00
Jason Wilder afc0d5bfb9 Add WALSegmentReader/Writer
Basic types for reading and writing WAL segment files.
2015-11-23 13:55:33 -07:00
Jason Wilder 151b33d000 Rename wal.go to log.go
This is the existing WAL + cache implementation.  Moving it to a separate file
so that it can remain intact while a refactoring to an independent WAL can occur.

The WAL was also named Log in the code so this names the file more closely after the concept
in the code.
2015-11-23 13:53:30 -07:00
Philip O'Toole 19f53d8a75 Add some simple benchmarks 2015-11-20 21:09:44 -08:00
Philip O'Toole 5b573b9248 Move to simpler cache
This cache simply evicts as much as possible whenever a checkpoint is
set.
2015-11-20 21:09:24 -08:00
Jason Wilder 0d1508a7c6 Add comments for search 2015-11-17 23:24:10 -07:00
Jason Wilder a7d7c280ed Add block type to index
This will facilitate loading a block into a type specific result without
first loading the block.  This will also allow us to populate the database
index solely from the index.
2015-11-17 23:24:09 -07:00
Jason Wilder e5022a898d Support decoding into type specific slices
There are a lot of allocations performed when decoding blocks.  These
types can be re-used to reduce allocations in many cases.  This change
allows a type specific slice to be passed in to decode funcs to be re-used
if it is large enough.

The existing decode is left for backwards compatibility but is not
very efficient right now.  It may be removed.
2015-11-17 23:24:09 -07:00
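A minimal sketch of the slice-reuse pattern described in the commit above: the caller passes in a typed slice, and the decoder reuses it when it is large enough. The function names and the fixed-width big-endian layout are assumptions made for the example; they are not the TSM block encoding or the decode funcs from that commit.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeInt64s writes each value as 8 big-endian bytes.
// This layout is a stand-in chosen for the example only.
func encodeInt64s(vals []int64) []byte {
	b := make([]byte, 8*len(vals))
	for i, v := range vals {
		binary.BigEndian.PutUint64(b[i*8:], uint64(v))
	}
	return b
}

// decodeInt64s decodes b into dst, reusing dst's backing array when its
// capacity is sufficient and allocating a new slice only when it is not.
func decodeInt64s(b []byte, dst []int64) []int64 {
	n := len(b) / 8
	if cap(dst) < n {
		dst = make([]int64, n) // too small: fall back to allocating
	} else {
		dst = dst[:n] // large enough: reuse the caller's memory
	}
	for i := 0; i < n; i++ {
		dst[i] = int64(binary.BigEndian.Uint64(b[i*8:]))
	}
	return dst
}

func main() {
	scratch := make([]int64, 0, 1024) // allocated once, reused per block
	block := encodeInt64s([]int64{10, 20, 30})
	vals := decodeInt64s(block, scratch)
	fmt.Println(vals, "reused:", cap(vals) == cap(scratch))
}
```

A caller decoding many blocks in a loop can keep passing the returned slice back in, so allocation happens only when a block is larger than any seen before.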
Jason Wilder 5a12c49475 Make type specific decoders exported 2015-11-17 23:24:09 -07:00
Jason Wilder d517bad6f2 Add BlockType func
Allows the block type to be determined without decoding all the values.
2015-11-17 23:24:09 -07:00