The buffer allocation in bz1 was unused, and I'm fairly certain it would
have been harmful to performance if it had been used. For queries that run
through a bz1 block, having to hold on to a 64KB block is expensive. It's
better to churn on the allocator and have blocks released as soon as they
are unused than to have 64KB hanging around for each series regardless of
its size.
Thanks to @jwilder for brainstorming this issue with me.
By using preallocated buffers for marshaling WAL entries, we can
reduce the amount of memory we allocate.
On a run of `influx_stress -series 10000 -points 1000`, this cuts
total allocations from 18684.15MB to 15200.73MB.
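
The commit doesn't show the mechanism, but one common way to reuse marshal buffers in Go is a `sync.Pool`; the sketch below assumes that approach, and `bufPool`/`marshalEntry` are hypothetical names rather than the actual tsdb API.

```go
package wal

import (
	"bytes"
	"encoding/binary"
	"sync"
)

// bufPool reuses marshal buffers across WAL writes so that encoding an
// entry does not allocate a fresh buffer every time.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// marshalEntry encodes a length-prefixed payload into a pooled buffer,
// hands it to write, then returns the buffer to the pool for reuse.
func marshalEntry(payload []byte, write func([]byte) error) error {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()

	binary.Write(buf, binary.BigEndian, uint32(len(payload))) // 4-byte length prefix
	buf.Write(payload)

	err := write(buf.Bytes())
	bufPool.Put(buf)
	return err
}
```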
* Update the store to remove the WAL directories associated with a shard or database when they are deleted.
* Fix the Store so that it creates separate WAL directories for databases and retention policies.
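
As a rough illustration of the second point, a WAL layout namespaced by database and retention policy might look like the sketch below; the path layout and function names are assumptions, not the real implementation.

```go
package tsdb

import (
	"fmt"
	"os"
	"path/filepath"
)

// walShardPath builds a WAL directory that is namespaced by database and
// retention policy, so shards with the same ID in different databases or
// policies never collide.
func walShardPath(walDir, database, retentionPolicy string, shardID uint64) string {
	return filepath.Join(walDir, database, retentionPolicy, fmt.Sprintf("%d", shardID))
}

// deleteShardWAL removes the WAL directory for a dropped shard; removing
// the database directory instead cleans up every shard beneath it.
func deleteShardWAL(walDir, database, retentionPolicy string, shardID uint64) error {
	return os.RemoveAll(walShardPath(walDir, database, retentionPolicy, shardID))
}
```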
This commit changes the bz1 append to check for a small
ending block first. If that block is below the block-size
threshold, it is rewritten with the new data points instead
of a new block being written.
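
Ignoring compression and point encoding, the shape of the check is roughly the following; `smallBlockThreshold` and the block representation are placeholders, not the real bz1 code.

```go
package bz1

// smallBlockThreshold is an illustrative cutoff below which the trailing
// block is rewritten rather than a new block appended.
const smallBlockThreshold = 4096 // bytes

func appendToSeries(blocks [][]byte, newBlock []byte) [][]byte {
	if n := len(blocks); n > 0 && len(blocks[n-1]) < smallBlockThreshold {
		// The ending block is small: fold the new points into it and
		// rewrite it instead of adding another tiny block.
		merged := append(append([]byte{}, blocks[n-1]...), newBlock...)
		blocks[n-1] = merged
		return blocks
	}
	// The ending block is already large enough, so write a new block.
	return append(blocks, newBlock)
}
```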
If a flush is happening and you bring up a cursor for a series, and that series had no data in the cache (after the flush started), the cursor would return no data. It should instead have returned the data in the flush cache, which is held in a separate area of memory until it is committed to the index.
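
A minimal sketch of the corrected lookup, assuming a partition with separate `cache` and `flushCache` maps (field and method names are hypothetical):

```go
package wal

import "sync"

// Partition is trimmed down to the fields needed to show the fix.
type Partition struct {
	mu         sync.RWMutex
	cache      map[string][][]byte // live write cache
	flushCache map[string][][]byte // data mid-flush, not yet committed to the index
}

// seriesValues returns the cached values for a series. If the live cache
// has nothing because a flush has started, it falls back to the flush
// cache instead of reporting no data.
func (p *Partition) seriesValues(key string) [][]byte {
	p.mu.RLock()
	defer p.mu.RUnlock()

	if vals := p.cache[key]; len(vals) > 0 {
		return vals
	}
	if p.flushCache != nil {
		return p.flushCache[key]
	}
	return nil
}
```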
* All metadata for each shard is now stored in a single key with a compressed value (sketched after this list)
* Creation of new metadata no longer requires a synchronous write to Bolt. It is passed to the WAL and written to Bolt periodically outside the write path
* Added DeleteSeries to WAL and updated bz1 to remove series there when DeleteSeries or DropMeasurement are called
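
A minimal sketch of the single-key, compressed metadata write from the first item, assuming JSON for the encoding and snappy for the compression; the real structure, encoding, and bucket/key names may differ.

```go
package bz1

import (
	"encoding/json"

	"github.com/boltdb/bolt"
	"github.com/golang/snappy"
)

// shardMeta stands in for the per-shard metadata (measurements, series,
// fields).
type shardMeta struct {
	Series map[string][]string `json:"series"`
}

// writeMeta stores all shard metadata under a single key with a
// compressed value. It assumes a "meta" bucket was created when the shard
// was opened; per the description above, this write happens periodically
// via the WAL rather than synchronously on the write path.
func writeMeta(tx *bolt.Tx, m *shardMeta) error {
	b, err := json.Marshal(m)
	if err != nil {
		return err
	}
	return tx.Bucket([]byte("meta")).Put([]byte("meta"), snappy.Encode(nil, b))
}
```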
This commit fixes the b1 cursor so that reads from either the cache
or the bolt buffer check against the previously read key, ensuring
the same key is not returned twice.
Fixes #3571.
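
A sketch of the dedup check, with `read()` standing in for whatever picks the smaller of the next cache key and the next bolt key; the real cursor differs.

```go
package b1

import "bytes"

// cursor is trimmed down: prev remembers the last key handed back, and
// read is assumed to return the smaller of the next cache key and the
// next bolt-buffer key.
type cursor struct {
	prev []byte
	read func() (key, value []byte)
}

// Next skips any key equal to the previously returned one, so the same
// timestamp is never emitted twice even if it appears in both the cache
// and the bolt buffer.
func (c *cursor) Next() (key, value []byte) {
	for {
		k, v := c.read()
		if k == nil {
			return nil, nil
		}
		if c.prev != nil && bytes.Equal(k, c.prev) {
			continue
		}
		c.prev = k
		return k, v
	}
}
```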
This commit fixes issues found from using a more complex `testing/quick`
implementation of the `WriteIndex()` test. The newer test inserts
multiple sets of random data that's confined to a smaller random space
so there's more chance of overlapping data.
The fixes were primarily around inserting old data or inserting the
same timestamp multiple times within a single write. Block splitting
was not working correctly before, and sorting and deduplication were
not handled correctly.
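
The flavor of the test, not the actual one: a `testing/quick` check that feeds several batches of timestamps confined to a small random space through a stand-in merge function and verifies the result comes back sorted and deduplicated.

```go
package bz1

import (
	"sort"
	"testing"
	"testing/quick"
)

// mergeBlocks is a stand-in for the append path under test: it merges new
// timestamps into an existing block, sorting and deduplicating so old or
// repeated timestamps don't corrupt it.
func mergeBlocks(existing, incoming []uint64) []uint64 {
	merged := append(append([]uint64{}, existing...), incoming...)
	sort.Slice(merged, func(i, j int) bool { return merged[i] < merged[j] })
	out := merged[:0]
	for i, ts := range merged {
		if i == 0 || ts != merged[i-1] {
			out = append(out, ts)
		}
	}
	return out
}

func TestMergeBlocks_Quick(t *testing.T) {
	f := func(batches [][]uint8) bool {
		var block []uint64
		for _, b := range batches {
			incoming := make([]uint64, len(b))
			for i, v := range b {
				incoming[i] = uint64(v % 32) // confine to a small space so batches overlap
			}
			block = mergeBlocks(block, incoming)
		}
		// The block must come back sorted with no duplicates.
		for i := 1; i < len(block); i++ {
			if block[i] <= block[i-1] {
				return false
			}
		}
		return true
	}
	if err := quick.Check(f, nil); err != nil {
		t.Error(err)
	}
}
```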