Commit Graph

22 Commits (5dfb7660f6030123ba9b9e470fc2b8f7d886fcdc)

Author SHA1 Message Date
Jeff Wendling 0a85e3b0dd tsm1: add initial index cleanup to DeletePrefix 2019-01-08 16:32:43 -07:00
Jeff Wendling f712828016 tsm1: refactor and rename some methods 2019-01-08 14:52:30 -07:00
Jeff Wendling 8744a82665 tsm1: add DeletePrefix to the reader 2019-01-07 21:11:49 -07:00
Jeff Wendling f65b0933f6 tsm1: move code around into smaller files and add tests 2019-01-07 21:11:49 -07:00
Jeff Wendling fed3154506 tsm1: DeletePrefix on the indirectIndex 2019-01-07 21:08:32 -07:00
Jeff Wendling ad5352926f tsm1: log when error reading entries for tsm key 2019-01-07 11:00:35 -07:00
Jeff Wendling 9cdefa8e4f tsm1: fix staticcheck and refactor closure out 2019-01-07 11:00:35 -07:00
Jeff Wendling 1ffcd77342 tsm1: fix remaining issues and add small benchmarks
- notice when keys are deleted during iteration and return an error
- make sure all the consumers check the error
- add some benchmarks for small indexes to compare
- allow concurrent readers to flag deletes
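
The first two items can be illustrated with a minimal sketch. It assumes a delete counter that each iterator snapshots when it is created; the names index, iterator, collectKeys and errKeysDeleted are hypothetical and not taken from tsm1, and the real mechanism may differ.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errKeysDeleted is a hypothetical sentinel error for this sketch.
var errKeysDeleted = errors.New("keys deleted during iteration")

// index is a toy stand-in for an in-memory key index.
type index struct {
	mu      sync.RWMutex
	keys    []string
	deletes uint64 // incremented on every successful delete
}

// Delete removes key if present and bumps the delete counter.
func (x *index) Delete(key string) {
	x.mu.Lock()
	defer x.mu.Unlock()
	for i, k := range x.keys {
		if k == key {
			x.keys = append(x.keys[:i], x.keys[i+1:]...)
			x.deletes++
			return
		}
	}
}

// iterator remembers the delete counter it started with.
type iterator struct {
	x       *index
	pos     int
	deletes uint64
	key     string
	err     error
}

// Iterator snapshots the current delete counter.
func (x *index) Iterator() *iterator {
	x.mu.RLock()
	defer x.mu.RUnlock()
	return &iterator{x: x, pos: -1, deletes: x.deletes}
}

// Next advances the iterator. It stops and records an error if any
// delete happened since the iterator was created.
func (it *iterator) Next() bool {
	if it.err != nil {
		return false
	}
	it.x.mu.RLock()
	defer it.x.mu.RUnlock()
	if it.x.deletes != it.deletes {
		it.err = errKeysDeleted
		return false
	}
	it.pos++
	if it.pos >= len(it.x.keys) {
		return false
	}
	it.key = it.x.keys[it.pos]
	return true
}

func (it *iterator) Key() string { return it.key }
func (it *iterator) Err() error  { return it.err }

// collectKeys is a consumer that checks the iterator error after the loop.
func collectKeys(it *iterator) ([]string, error) {
	var out []string
	for it.Next() {
		out = append(out, it.Key())
	}
	return out, it.Err()
}

func main() {
	x := &index{keys: []string{"a", "b", "c"}}
	it := x.Iterator()
	it.Next()     // positions on "a"
	x.Delete("b") // a delete lands mid-iteration
	keys, err := collectKeys(it)
	fmt.Println(keys, err)
}
```

The point of the pattern is that a consumer cannot silently see a partial result: the loop ends early and Err reports why.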

benchmarks against base:

name                                           old time/op    new time/op    delta
IndirectIndex_UnmarshalBinary-8                  70.0ms ±17%    71.0ms ±12%      ~     (p=1.000 n=8+8)
IndirectIndex_DeleteRangeLast-8                  1.48µs ± 1%    0.28µs ± 5%   -81.29%  (p=0.000 n=8+7)
IndirectIndex_DeleteRangeFull/Large-8             786ms ± 1%     363ms ± 3%   -53.89%  (p=0.000 n=7+8)
IndirectIndex_DeleteRangeFull/Small-8            2.37ms ± 0%    1.14ms ± 3%   -52.02%  (p=0.000 n=7+8)
IndirectIndex_DeleteRangeFull_Covered/Large-8     384ms ± 2%     188ms ± 3%   -51.04%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull_Covered/Small-8     470µs ± 1%     190µs ± 1%   -59.71%  (p=0.000 n=8+7)
IndirectIndex_Delete/Large-8                     74.0ms ± 1%   128.7ms ± 1%   +73.80%  (p=0.001 n=7+7)
IndirectIndex_Delete/Small-8                      142µs ± 1%     130µs ± 1%    -8.24%  (p=0.000 n=8+8)

name                                           old alloc/op   new alloc/op   delta
IndirectIndex_UnmarshalBinary-8                  11.6MB ± 0%    11.7MB ± 0%    +0.02%  (p=0.000 n=8+7)
IndirectIndex_DeleteRangeLast-8                  3.26kB ± 0%   0.00kB ±NaN%  -100.00%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull/Large-8             233MB ± 0%     161MB ± 0%   -30.75%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull/Small-8            2.13MB ± 0%    1.40MB ± 0%   -34.53%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull_Covered/Large-8    12.4MB ± 0%     0.4MB ± 0%   -96.82%  (p=0.002 n=7+8)
IndirectIndex_DeleteRangeFull_Covered/Small-8     120kB ± 0%       0kB ± 0%   -99.89%  (p=0.000 n=8+8)
IndirectIndex_Delete/Large-8                     4.54kB ± 0%    0.21kB ± 0%   -95.26%  (p=0.000 n=8+8)
IndirectIndex_Delete/Small-8                      80.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.000 n=8+8)

name                                           old allocs/op  new allocs/op  delta
IndirectIndex_UnmarshalBinary-8                    35.0 ± 0%      42.0 ± 0%   +20.00%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeLast-8                    3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull/Large-8             1.53M ± 0%     0.52M ± 0%   -65.98%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull/Small-8             15.2k ± 0%      5.2k ± 0%   -65.97%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull_Covered/Large-8       620 ± 0%       124 ± 0%   -80.00%  (p=0.002 n=7+8)
IndirectIndex_DeleteRangeFull_Covered/Small-8      10.0 ± 0%       2.0 ± 0%   -80.00%  (p=0.000 n=8+8)
IndirectIndex_Delete/Large-8                        246 ± 0%         1 ± 0%   -99.59%  (p=0.000 n=8+8)
IndirectIndex_Delete/Small-8                       4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=8+8)
2019-01-07 11:00:35 -07:00
Jeff Wendling 14cf01911e tsm1: change TSMFile to use an iterator style api 2019-01-07 11:00:35 -07:00
Jeff Wendling 917584b054 tsm1: use readerOffsetsIterator for deletes
This reduces the number of disk hits at some CPU cost on some benchmarks. Notably, the
DeleteRangeFull_Covered and Delete benchmarks both dropped to approximately zero page faults,
meaning they read from the index file linearly.

name                                     old time/op    new time/op    delta
IndirectIndex_UnmarshalBinary-8            68.8ms ±10%    63.1ms ±16%   -8.28%          (p=0.021 n=8+8)
IndirectIndex_Entries-8                    9.09µs ± 3%    9.62µs ± 1%   +5.84%          (p=0.000 n=8+7)
IndirectIndex_ReadEntries-8                5.86µs ± 1%    6.15µs ± 3%   +5.03%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeLast-8             562ns ± 6%     308ns ± 2%  -45.25%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull-8             363ms ±10%     376ms ± 5%     ~             (p=0.054 n=8+7)
IndirectIndex_DeleteRangeFull_Covered-8     574ms ± 2%     746ms ± 0%  +30.01%          (p=0.000 n=8+7)
IndirectIndex_Delete-8                     51.2ms ± 0%    88.2ms ± 0%  +72.38%          (p=0.000 n=8+7)

name                                     old alloc/op   new alloc/op   delta
IndirectIndex_UnmarshalBinary-8            11.7MB ± 0%    11.7MB ± 0%     ~     (all samples are equal)
IndirectIndex_Entries-8                    32.8kB ± 0%    32.8kB ± 0%     ~     (all samples are equal)
IndirectIndex_ReadEntries-8                0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8            0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeFull-8             162MB ± 0%     162MB ± 0%     ~             (p=0.798 n=8+8)
IndirectIndex_DeleteRangeFull_Covered-8    82.4MB ± 0%    82.4MB ± 0%     ~             (p=0.857 n=8+8)
IndirectIndex_Delete-8                     4.01kB ± 0%    4.04kB ± 0%   +0.90%          (p=0.000 n=8+8)

name                                     old allocs/op  new allocs/op  delta
IndirectIndex_UnmarshalBinary-8              42.0 ± 0%      42.0 ± 0%     ~     (all samples are equal)
IndirectIndex_Entries-8                      1.00 ± 0%      1.00 ± 0%     ~     (all samples are equal)
IndirectIndex_ReadEntries-8                 0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8             0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeFull-8              522k ± 0%      522k ± 0%     ~             (p=0.743 n=8+8)
IndirectIndex_DeleteRangeFull_Covered-8     3.31k ± 0%     3.31k ± 0%     ~             (p=0.856 n=8+8)
IndirectIndex_Delete-8                        123 ± 0%       123 ± 0%     ~     (all samples are equal)

name                                     old speed      new speed      delta
IndirectIndex_DeleteRangeFull-8          18.1MB/s ± 9%  17.5MB/s ± 7%     ~             (p=0.105 n=8+8)
IndirectIndex_Delete-8                    116MB/s ± 0%     0MB/s ± 0%  -99.96%          (p=0.000 n=8+8)
2019-01-07 11:00:35 -07:00
Jeff Wendling 6f5c94f3f7 tsm1: introduce readerOffsets to manage the offsets slice
It exposes an API that will clean up the bodies of many methods and
provide a safe abstraction around iteration that will be able to
handle reads with concurrent deletes.

Benchmarks are flat.
2019-01-07 11:00:35 -07:00
Jeff Wendling f860305124 tsm1: keep first 8 bytes of each key in memory
Since most keys will share the first 8 bytes, we collapse them into
a slice containing partial sums of the counts. We can then binary search
into that slice to find the associated prefix for a given offset index.
Compressing in this way causes the overhead to be negligible and reduces
disk misses by about 30% in these benchmarks (500k series across 100 orgs).
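
A minimal sketch of the partial-sums idea, assuming sorted keys and a run-length layout. The names prefixRun, buildPrefixRuns and lookupPrefix are hypothetical, and the actual in-memory representation in tsm1 may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// prefixRun records one distinct 8-byte prefix and the cumulative number
// of keys up to and including the last key that carries it.
type prefixRun struct {
	prefix [8]byte
	total  int // partial sum of key counts
}

// prefixOf returns the first 8 bytes of key, zero-padded if it is shorter.
func prefixOf(key string) [8]byte {
	var p [8]byte
	copy(p[:], key)
	return p
}

// buildPrefixRuns collapses consecutive keys sharing a prefix into one run.
// It assumes keys are sorted, so equal prefixes are adjacent.
func buildPrefixRuns(keys []string) []prefixRun {
	var runs []prefixRun
	for i, k := range keys {
		p := prefixOf(k)
		if n := len(runs); n > 0 && runs[n-1].prefix == p {
			runs[n-1].total = i + 1
			continue
		}
		runs = append(runs, prefixRun{prefix: p, total: i + 1})
	}
	return runs
}

// lookupPrefix binary searches the partial sums for the run that covers
// offset index i.
func lookupPrefix(runs []prefixRun, i int) ([8]byte, bool) {
	if i < 0 {
		return [8]byte{}, false
	}
	j := sort.Search(len(runs), func(j int) bool { return runs[j].total > i })
	if j == len(runs) {
		return [8]byte{}, false
	}
	return runs[j].prefix, true
}

func main() {
	keys := []string{
		"org00001/cpu", "org00001/mem", "org00001/net",
		"org00002/cpu", "org00002/mem",
	}
	runs := buildPrefixRuns(keys)
	p, ok := lookupPrefix(runs, 3)
	fmt.Println(len(runs), string(p[:]), ok)
}
```

With many keys per prefix, the slice stays short, which is why the memory overhead can be small while still letting a comparison reject most keys without touching the mapped file.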

name                                     old time/op    new time/op    delta
IndirectIndex_UnmarshalBinary-8            67.5ms ± 1%    64.6ms ± 1%   -4.33%          (p=0.000 n=8+7)
IndirectIndex_Entries-8                    9.41µs ± 2%    9.39µs ± 1%     ~             (p=0.959 n=8+8)
IndirectIndex_ReadEntries-8                5.99µs ± 1%    6.07µs ± 1%   +1.29%          (p=0.001 n=8+8)
IndirectIndex_DeleteRangeLast-8             369ns ± 2%     566ns ± 1%  +53.37%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull-8             368ms ± 9%     369ms ± 2%     ~             (p=0.232 n=8+7)
IndirectIndex_DeleteRangeFull_Covered-8     600ms ± 1%     618ms ± 0%   +3.03%          (p=0.000 n=8+7)
IndirectIndex_Delete-8                     50.0ms ± 1%    47.6ms ± 9%     ~             (p=0.463 n=7+8)

name                                     old alloc/op   new alloc/op   delta
IndirectIndex_UnmarshalBinary-8            11.6MB ± 0%    11.7MB ± 0%   +0.02%          (p=0.000 n=8+7)
IndirectIndex_Entries-8                    32.8kB ± 0%    32.8kB ± 0%     ~     (all samples are equal)
IndirectIndex_ReadEntries-8                0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8            0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeFull-8             162MB ± 0%     162MB ± 0%     ~             (p=0.382 n=8+8)
IndirectIndex_DeleteRangeFull_Covered-8    82.4MB ± 0%    82.4MB ± 0%     ~             (p=0.776 n=8+8)
IndirectIndex_Delete-8                     4.01kB ± 0%    4.01kB ± 0%     ~     (all samples are equal)

name                                     old allocs/op  new allocs/op  delta
IndirectIndex_UnmarshalBinary-8              35.0 ± 0%      42.0 ± 0%  +20.00%          (p=0.000 n=8+8)
IndirectIndex_Entries-8                      1.00 ± 0%      1.00 ± 0%     ~     (all samples are equal)
IndirectIndex_ReadEntries-8                 0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8             0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeFull-8              522k ± 0%      522k ± 0%     ~             (p=0.382 n=8+8)
IndirectIndex_DeleteRangeFull_Covered-8     3.31k ± 0%     3.31k ± 0%     ~             (p=0.457 n=8+8)
IndirectIndex_Delete-8                        123 ± 0%       123 ± 0%     ~     (all samples are equal)

name                                     old speed      new speed      delta
IndirectIndex_DeleteRangeFull-8          24.7MB/s ±10%  17.8MB/s ± 2%  -28.18%          (p=0.000 n=8+7)
IndirectIndex_DeleteRangeFull_Covered-8  14.2MB/s ± 1%   9.6MB/s ± 0%  -32.30%          (p=0.000 n=8+7)
IndirectIndex_Delete-8                    171MB/s ± 1%   126MB/s ±10%  -26.35%          (p=0.000 n=7+8)

IndirectIndex_DeleteRangeLast went from 17 page faults, or ~180GB/sec at 369ns/op,
to zero page faults. So even though it got about 50% slower, it was previously
I/O bound and no longer is.
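
The ~180GB/sec figure appears to follow from multiplying the fault count by a page size and dividing by the time per operation. A small check of that arithmetic, assuming 4 KiB pages (the page size is not stated in the message; impliedBytesPerSec is a hypothetical helper):

```go
package main

import "fmt"

// pageSize is an assumption: 4 KiB pages, as on typical x86-64 Linux systems.
const pageSize = 4096

// impliedBytesPerSec returns the read rate that would be needed if every
// recorded fault brought in one full page within a single operation.
func impliedBytesPerSec(faults int, nsPerOp float64) float64 {
	bytes := float64(faults * pageSize)
	return bytes / (nsPerOp * 1e-9)
}

func main() {
	rate := impliedBytesPerSec(17, 369)
	fmt.Printf("%.0f GB/s\n", rate/1e9)
}
```

Under that assumption the result is about 189e9 bytes per second, which is in line with the rounded ~180GB/sec quoted above.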
2019-01-07 11:00:35 -07:00
Jeff Wendling 0becfc6239 tsm1: add helper to track page faults in index
Since the methods inline and dead code is eliminated, it has no runtime
overhead in the benchmarks when disabled.

benchmark                                  recorded faults
BenchmarkIndirectIndex_Entries-8           11
BenchmarkIndirectIndex_ReadEntries-8       11
BenchmarkIndirectIndex_DeleteRangeLast-8   17
BenchmarkIndirectIndex_DeleteRangeFull-8   2218
BenchmarkIndirectIndex_Delete-8            2084
2019-01-07 11:00:35 -07:00
Jeff Wendling 91e820a9d8 tsm1: fix multiple issues with DeleteRange
1. Correctly acquires locks
2. Seeks for discontiguous key ranges (like delete ["aaa", "zzz"])
3. Is precise about deleting a key when it contains no data

name                             old time/op    new time/op    delta
IndirectIndex_UnmarshalBinary-8    67.3ms ± 1%    63.2ms ±15%      ~             (p=0.463 n=7+8)
IndirectIndex_Entries-8            9.14µs ± 1%    9.01µs ± 0%    -1.40%          (p=0.004 n=8+7)
IndirectIndex_ReadEntries-8        5.83µs ± 1%    5.68µs ± 2%    -2.62%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeLast-8     283ns ± 2%     191ns ± 1%   -32.37%          (p=0.000 n=8+7)
IndirectIndex_DeleteRangeFull-8     612ms ± 1%     361ms ± 1%   -41.02%          (p=0.000 n=8+8)
IndirectIndex_Delete-8             49.0ms ± 1%    49.8ms ± 1%    +1.80%          (p=0.001 n=7+8)

name                             old alloc/op   new alloc/op   delta
IndirectIndex_UnmarshalBinary-8    11.6MB ± 0%    11.6MB ± 0%      ~     (all samples are equal)
IndirectIndex_Entries-8            32.8kB ± 0%    32.8kB ± 0%      ~     (all samples are equal)
IndirectIndex_ReadEntries-8        0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8     64.0B ± 0%     0.0B ±NaN%  -100.00%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull-8     168MB ± 0%     162MB ± 0%    -3.71%          (p=0.000 n=8+8)
IndirectIndex_Delete-8             3.94kB ± 0%    3.94kB ± 0%      ~     (all samples are equal)

name                             old allocs/op  new allocs/op  delta
IndirectIndex_UnmarshalBinary-8      35.0 ± 0%      35.0 ± 0%      ~     (all samples are equal)
IndirectIndex_Entries-8              1.00 ± 0%      1.00 ± 0%      ~     (all samples are equal)
IndirectIndex_ReadEntries-8         0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8      2.00 ± 0%     0.00 ±NaN%  -100.00%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull-8     1.04M ± 0%     0.52M ± 0%   -49.77%          (p=0.000 n=8+8)
IndirectIndex_Delete-8                123 ± 0%       123 ± 0%      ~     (all samples are equal)
2019-01-07 11:00:35 -07:00
Jeff Wendling d40c3e662f tsm1: use uint32 key for tombstones
Rough, noisy benchmarks.

benchmark                                    old ns/op     new ns/op     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     62462250      67710057      +8.40%
BenchmarkIndirectIndex_Entries-8             9601          9239          -3.77%
BenchmarkIndirectIndex_ReadEntries-8         5984          5964          -0.33%
BenchmarkIndirectIndex_DeleteRangeLast-8     314           317           +0.96%
BenchmarkIndirectIndex_DeleteRangeFull-8     813838165     615346992     -24.39%
BenchmarkIndirectIndex_Delete-8              52079181      52906315      +1.59%

benchmark                                    old allocs     new allocs     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     35             35             +0.00%
BenchmarkIndirectIndex_Entries-8             1              1              +0.00%
BenchmarkIndirectIndex_ReadEntries-8         0              0              +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     2              2              +0.00%
BenchmarkIndirectIndex_DeleteRangeFull-8     1532670        1038932        -32.21%
BenchmarkIndirectIndex_Delete-8              123            123            +0.00%

benchmark                                    old bytes     new bytes     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     11648760      11648760      +0.00%
BenchmarkIndirectIndex_Entries-8             32768         32768         +0.00%
BenchmarkIndirectIndex_ReadEntries-8         1             1             +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     64            64            +0.00%
BenchmarkIndirectIndex_DeleteRangeFull-8     232738960     168112352     -27.77%
BenchmarkIndirectIndex_Delete-8              3936          3936          +0.00%
2019-01-07 11:00:35 -07:00
Jeff Wendling ffd35ce1aa tsm1: use a uint32 for offsets globally
Benchmarks are flat.
2019-01-07 11:00:35 -07:00
Jeff Wendling 7a7a4b6d58 tsm1: remove offsets from mmap
benchmark                                    old ns/op     new ns/op     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     74525387      66439305      -10.85%
BenchmarkIndirectIndex_Entries-8             8892          9200          +3.46%
BenchmarkIndirectIndex_ReadEntries-8         5816          5691          -2.15%
BenchmarkIndirectIndex_DeleteRangeLast-8     1550          311           -79.94%
BenchmarkIndirectIndex_DeleteRangeFull-8     773649708     767030277     -0.86%
BenchmarkIndirectIndex_Delete-8              79755991      52015903      -34.78%

benchmark                                    old allocs     new allocs     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     35             35             +0.00%
BenchmarkIndirectIndex_Entries-8             1              1              +0.00%
BenchmarkIndirectIndex_ReadEntries-8         0              0              +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     3              2              -33.33%
BenchmarkIndirectIndex_DeleteRangeFull-8     1532589        1532344        -0.02%
BenchmarkIndirectIndex_Delete-8              246            123            -50.00%

benchmark                                    old bytes     new bytes     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     11648760      11648760      +0.00%
BenchmarkIndirectIndex_Entries-8             32768         32768         +0.00%
BenchmarkIndirectIndex_ReadEntries-8         1             1             +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     3264          64            -98.04%
BenchmarkIndirectIndex_DeleteRangeFull-8     232710448     232624208     -0.04%
BenchmarkIndirectIndex_Delete-8              4432          3936          -11.19%
2019-01-07 11:00:35 -07:00
Jeff Wendling 04605eb266 tsm1: speed up deleterange for large keys
Rather than scanning from the first key in the index, do a binary search to the
first key in the range. This changes deleting the largest key from O(N) to O(log N).
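
A minimal sketch of the idea, assuming the keys are held in a sorted slice. keysInRange is a hypothetical helper, not the tsm1 implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// keysInRange returns the keys k with min <= k <= max from a sorted slice.
// The start position is found by binary search instead of a linear scan.
func keysInRange(sorted []string, min, max string) []string {
	start := sort.SearchStrings(sorted, min) // first index with key >= min
	end := start
	for end < len(sorted) && sorted[end] <= max {
		end++
	}
	return sorted[start:end]
}

func main() {
	keys := []string{"a", "b", "c", "d", "e"}
	fmt.Println(keysInRange(keys, "d", "e"))
}
```

When the range covers only the last key, the search takes O(log N) steps and the scan that follows visits a single element.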

benchmark                                    old ns/op       new ns/op     delta
BenchmarkIndirectIndex_DeleteRangeFull-8     17884166763     738717473     -95.87%
2018-12-14 10:06:24 -07:00
Jeff Wendling 0d411023f2 config: clean up
- Breaks the weird cycle that existed with the EngineOptions
- Removes a bunch of useless parameters
- Moves around a bunch of defaults
2018-11-08 11:39:36 -07:00
Jacob Marble b6a1c0e9c7 storage: MeasurementStats.ReadFrom requires ByteReader 2018-10-19 14:16:20 -07:00
Ben Johnson 68450681ef Add TSM1 measurement stats.
This commit generates an additional `.tss` stats file alongside each
TSM file when it is written that contains size stats for all measurements
within the TSM file. These files can be combined to generate stats for
all measurements across all TSM files.
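
Combining per-file stats can be sketched as a map merge. This assumes each stats file decodes to a mapping from measurement name to size in bytes; the type measurementStats and the function combine are hypothetical and do not describe the `.tss` file format.

```go
package main

import "fmt"

// measurementStats maps a measurement name to its size in bytes.
// This shape is an assumption made for illustration.
type measurementStats map[string]int64

// combine sums the per-measurement sizes across any number of files.
func combine(files ...measurementStats) measurementStats {
	total := make(measurementStats)
	for _, f := range files {
		for name, size := range f {
			total[name] += size
		}
	}
	return total
}

func main() {
	a := measurementStats{"cpu": 10, "mem": 5}
	b := measurementStats{"cpu": 7}
	fmt.Println(combine(a, b))
}
```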
2018-10-08 10:43:53 -06:00
Edd Robinson 074f263e08 Initial import of tsm1.Engine 2018-10-01 12:08:37 +01:00