This change starts by building the sequence of entries, which also
allows the required size of destination buffer to be calculated. Then
the buffer is allocated up-front in 1 call.
Each snapshot and hot value-set is appended to the buffer. If ordering
is violated at anytime, set the 'needSort' flag. Sorting, if necessary,
is performed just before returning the data.
FileStats called frequently during compaction planning was too expensive because
they were cleared out every time a file replaced causing them all to be reloaded.
Insted, we grab the stats that are already maintained by the files themselves from
the files when needed.
First pass at TSM cursor iteration ended up searching the file indexes
too frequently and hurt performance. This changes that to search it once
and then have the cursor hold onto the block locations to seek
to. Doubles the query performance from the first iteration, but still a lot
of room for improvement.
* Add InfluxQLType to Values to map the TSM type to InfluxQL
* Fix bug in WAL where close wouldn't nil out the currentSegment after closing it
* Export writeSnapshot to be used in tests, add argument to run it async or not
* Update reloadCache to load temporary metadata information in the engine
* Update LoadMetadataIndex to use the temp WAL metadata information
Added mmap implementation for Windows. It uses MapViewOfFile similar to Bolt's implementation. MapViewOfFile returns a pointer and not a byte array. Bolt changed their data structure to support it.
Instead of changing the implementation of tsm data structure, I used a trick shown in https://groups.google.com/forum/#!topic/golang-nuts/g0nLwQI9www to use SliceHeader to convert the pointer into a slice.
Something broke with writing to the WAL now that compactions are running
concurrently. There was also a performance problem with Next/Prev doing
twice as many searches as necessary.
Added mmap implementation for Windows based on MapViewOfFile. Used SliceHeader trick to change the pointer returned by MapViewOfFile to a byte slice. This will not call for any change in rest of tsm.
However I am not sure where this mmap function is called, as go build is still complains about
tsdb\engine\tsm1\tsm1.go:1974: undefined: syscall.Mmap
tsdb\engine\tsm1\tsm1.go:1974: undefined: syscall.PROT_READ
tsdb\engine\tsm1\tsm1.go:1974: undefined: syscall.MAP_SHARED
tsdb\engine\tsm1\tsm1.go:2033: undefined: syscall.Munmap
* Update cache to have a single slice of values for a key (removed checkpoints)
* Changed compact.Plan to only worry about TSM files.
* Updated Plan to not return an error since there was no case in which it would.
* Update WAL to not keep stats since they're no longer needed.
* Update engine to flush the Cache/WAL to a new TSM file when the min threshold is hit.
* Split compact logic between TSM compacts and WAL/Cache writes.
* Remove unnecessary merge iterator, wal segment iterator, and other no longer necessary stuff.
* Remove the asending bool from the Dedupe method. Values should always be in ascending order. It's up to the cursor to iterate through values based on the direction. Giving the cursor responsibility makes it so we don't need to sort, dedupe or reallocate anything for different query orders.
* Updated engine to use its locks to ensure writes and cache flushes don't cause a race.
* Update all tests with new signatures. Removed a bunch of tests around TSM rewrites and WAL segment iteration that are no longer necessary.
100mb is easy it hit even with basic stress test config. Don't set
a limit by default so that an operator can size it appropriately based
on their hardware.
This commit fixes a deadlock that occurs during b1 flushes. It's
caused by taking locks in a different order. In the flush, b1
locks the engine and then bolt. However, in the query cursor, a
lock is obtained on bolt first (via `DB.Begin()`) and then the
engine is locked while reading from the engine's cache.
Getting an intermittent test failure with this so removing it for now since compactions
are still able to keep up without it. Will need to look into this further because the
allocations is still very high and will affect compactions over longer periods of time.
MergeIterator will be used to merge multiple TSM KeyIterators and the
WAL KeyIterator using a stream based iteration approach. Each iteration
cycle returns a key and values ordered in way to write a new TSM file
optimally.
This provides and interface and type to combine multiple WAL segments
in order and then allow the values to be read in an order suitable for
writing to a TSM file.
Starting to integrate some of the components into a engine that is
usable for development purposes. This allows the code to evolve while
keeping the existing TSM engine in tact for reference.
Currently, just the WAL is wired up so writes can be tested. Other engine
functions will panic the server if called.
This is the existing WAL + cache implementation. Moving it to a separate file
so that it can remain intact while a refactoring to a independent WAL can occur.
The WAL was also named Log in the code so this names file more closely to the concept
in the code.
This will faciliate loading a block into a type specific result without
first loading the block. This will also allow us to populate the database
index solely from the index.
There is a lot of allocations performed when decoding blocks. These
types can be re-used to reduce allocations in many cases. This change
allows a type specific slice to be passed in to decode funcs to be re-used
if it is large enough.
The existing decode is is left for backwards compatibility but is not
very efficient right now. It may be removed.
This writes a tombstone file containing a line per deleted key. This
file is read when a TSMReader is created and any keys listed in the file
are removed from the index.
This prevented the encoders from using other implementations of the Value
interface because it would always cast one of the types to our specific
implementations.
This change ensures that if there are any fields in the WHERE clause of
an aggregate that are different from the fields in the SELECT clause,
that the cursors also decode those fields. Otherwise WHERE clauses of
the form 'SELECT f(w) FROM x WHERE y=z' will return incorrect results
Fixes issue #4701.
This adds some basic file reader/writers for creating the updated TSM file format. It uses a simple
in-memory index without MMAP for now, but will be extended to use and indirect indexing approach as well as MMAPed file access as described in the design doc.
This code is not integrated into the TSM engine yet
Go style -- and existing runtime stats -- do not use underscores, but
instead use camel case. This change makes the internal stats adhere to
that convention.
When a dataFile is deleted, the f file pointer is set to nil. Since deleting
a file happens asynchronously, code that had a reference when it was valid may
run when it's gone.
dataFile was not protected by a mutex which causes a data race and live
code and tests. filesAndLock used reflect.DeepEqual on a copy of dataFile
slices. reflect.DeepEqual appears to access unexported dataFile fields
which can't be protected. This was changed to use a equals func that will
require a mutex to be acquired.
The other issue was that many of the dataFile funcs access the mmap without
acquiring a lock. When a dataFile is deleted (possibly during rewriting),
reads from the mmap could return invalid data because references to the dataFile
are still in use by other goroutines.
Fixes#4534