Commit Graph

2533 Commits (772a44eddf8535d2b8de99a6a977dab7a8d7c7cb)

Author SHA1 Message Date
Edd Robinson 5054d6fae4 Address PR feedback 2018-10-16 13:37:49 +01:00
Stuart Carnie a792fbbdfa fix(encoding): Improve array string encoding perf a little more
Encode the compressed data at the start internal buffer. This ensures
the returned slice maintains the entire capacity and is available for
subsequent use.

When we pool / reuse string buffers, this will help considerably.

Improvements over previous commit:

```
name                        old time/op    new time/op    delta
EncodeStrings/10/batch-8       542ns ± 1%     355ns ± 2%   -34.53%  (p=0.008 n=5+5)
EncodeStrings/100/batch-8     5.29µs ± 1%    3.58µs ± 2%   -32.20%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8    48.6µs ± 0%    36.2µs ± 2%   -25.40%  (p=0.008 n=5+5)

name                        old alloc/op   new alloc/op   delta
EncodeStrings/10/batch-8        704B ± 0%        0B       -100.00%  (p=0.008 n=5+5)
EncodeStrings/100/batch-8     9.47kB ± 0%    0.00kB       -100.00%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8    90.1kB ± 0%     0.0kB       -100.00%  (p=0.008 n=5+5)

name                        old allocs/op  new allocs/op  delta
EncodeStrings/10/batch-8        0.00           0.00           ~     (all equal)
EncodeStrings/100/batch-8       1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
```
2018-10-16 12:08:12 +01:00
Stuart Carnie 964bc3c19e fix(encoding): Improve simple8b another 6%; fix inconsequential bug
simple8b encodes deltas[1:], thus deltas[0] >= simple8b.MaxValue is
invalid.

Also changed loop calculating deltas, RLE and max to be similar to
batch timestamp, for greater consistency.

Improvements over previous commit:

```
name                             old time/op    new time/op    delta
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    1.50µs ± 1%    1.48µs ± 1%  -1.40%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    6.10µs ± 0%    5.69µs ± 2%  -6.58%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    1.50µs ± 1%    1.49µs ± 0%  -1.21%  (p=0.008 n=5+5)
```

Improvements overall:

```
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    2.04µs ± 0%    1.48µs ± 1%  -27.25%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    8.80µs ± 2%    5.69µs ± 2%  -35.29%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    2.03µs ± 1%    1.49µs ± 0%  -26.93%  (p=0.008 n=5+5)
```
2018-10-16 12:08:12 +01:00
Stuart Carnie 43f96a6ddf feat(encoding): Improve timestamp encoding
Timestamp improvements prior to any improvements to simple8b

```
name                               old time/op    new time/op    delta
name                               old time/op    new time/op    delta
EncodeTimestamps/1000_seq/batch-8    2.64µs ± 1%    1.36µs ± 1%  -48.25%  (p=0.008 n=5+5)
EncodeTimestamps/1000_ran/batch-8    64.0µs ± 1%    32.2µs ± 1%  -49.64%  (p=0.008 n=5+5)
EncodeTimestamps/1000_dup/batch-8    9.32µs ± 0%    1.30µs ± 1%  -86.06%  (p=0.008 n=5+5)
```
2018-10-16 12:08:12 +01:00
Stuart Carnie e9531b7830 feat(encoding): Improve integer and simple8b encoding performance
simple8b EncodeAll improvements should

```
name                     old time/op  new time/op  delta
EncodeAll/1_bit-8        28.5µs ± 1%  28.6µs ± 1%     ~     (p=0.133 n=9+10)
EncodeAll/2_bits-8       28.9µs ± 2%  28.7µs ± 0%     ~     (p=0.068 n=10+8)
EncodeAll/3_bits-8       29.3µs ± 1%  28.8µs ± 0%   -1.70%  (p=0.000 n=10+10)
EncodeAll/4_bits-8       29.6µs ± 1%  29.1µs ± 1%   -1.85%  (p=0.000 n=10+10)
EncodeAll/5_bits-8       30.6µs ± 1%  29.8µs ± 2%   -2.70%  (p=0.000 n=10+10)
EncodeAll/6_bits-8       31.3µs ± 1%  30.0µs ± 1%   -4.08%  (p=0.000 n=9+9)
EncodeAll/7_bits-8       32.6µs ± 1%  30.8µs ± 0%   -5.49%  (p=0.000 n=9+9)
EncodeAll/8_bits-8       33.6µs ± 2%  31.0µs ± 1%   -7.77%  (p=0.000 n=10+9)
EncodeAll/10_bits-8      34.9µs ± 0%  31.9µs ± 2%   -8.55%  (p=0.000 n=9+10)
EncodeAll/12_bits-8      36.8µs ± 1%  32.6µs ± 1%  -11.35%  (p=0.000 n=9+10)
EncodeAll/15_bits-8      39.8µs ± 1%  34.1µs ± 2%  -14.40%  (p=0.000 n=10+10)
EncodeAll/20_bits-8      45.2µs ± 3%  36.2µs ± 1%  -19.97%  (p=0.000 n=10+9)
EncodeAll/30_bits-8      55.0µs ± 0%  40.9µs ± 1%  -25.62%  (p=0.000 n=9+9)
EncodeAll/60_bits-8      86.2µs ± 1%  55.2µs ± 1%  -35.92%  (p=0.000 n=10+10)
EncodeAll/combination-8   582µs ± 2%   502µs ± 1%  -13.80%  (p=0.000 n=9+9)
```

EncodeIntegers:

```
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    2.04µs ± 0%    1.50µs ± 1%  -26.22%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    8.80µs ± 2%    6.10µs ± 0%  -30.73%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    2.03µs ± 1%    1.50µs ± 1%  -26.04%  (p=0.008 n=5+5)
```

EncodeTimestamps (ran is improved due to simple8b improvements)

```
name                               old time/op    new time/op    delta
EncodeTimestamps/1000_seq/batch-8    2.64µs ± 1%    2.65µs ± 2%     ~     (p=0.310 n=5+5)
EncodeTimestamps/1000_ran/batch-8    64.0µs ± 1%    33.8µs ± 1%  -47.23%  (p=0.008 n=5+5)
EncodeTimestamps/1000_dup/batch-8    9.32µs ± 0%    9.28µs ± 1%     ~     (p=0.087 n=5+5)
```
2018-10-16 12:08:12 +01:00
Edd Robinson 91d0a8c3d2 Fix index bug in float encoder 2018-10-16 12:08:12 +01:00
Edd Robinson 09da18c08e Add TSM batch key iterator
The batch focussed TSM key iterator iterates TSM blocks, decoding and
merging blocks where appropriate using the the batch focussed
approaches.
2018-10-16 12:08:12 +01:00
Edd Robinson 51233b71a5 Add batch block encoders 2018-10-16 12:05:52 +01:00
Edd Robinson 592127e411 Batch oriented unsigned encoder 2018-10-16 12:05:52 +01:00
Edd Robinson a7a70a920e Batch oriented boolean encoders
This commit adds a tsm1 function for encoding a batch of booleans into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders using
randomly generated sets of booleans.
2018-10-16 12:05:52 +01:00
Jeff Wendling a4d4ef6999 Improvements to batch float encoder
- Inlined the closure to avoid a function call.
- Changed append(b, make([]byte, 8)...) to inline the make call.
- Check for NaN once at the end assuming NaN is infrequent.

New performance delta comparing the current iterators to the new batch
function:

name                   old time/op    new time/op    delta
EncodeFloats/10_seq      1.32µs ± 2%    0.17µs ± 2%  -87.39%  (p=0.000 n=10+10)
EncodeFloats/10_ran      2.09µs ± 1%    0.15µs ± 0%  -92.97%  (p=0.000 n=10+9)
EncodeFloats/100_seq     8.37µs ± 2%    1.28µs ± 2%  -84.74%  (p=0.000 n=10+10)
EncodeFloats/100_ran     19.1µs ± 1%     1.3µs ± 1%  -93.08%  (p=0.000 n=9+9)
EncodeFloats/1000_seq    60.4µs ± 1%    12.6µs ± 0%  -79.13%  (p=0.000 n=9+7)
EncodeFloats/1000_ran     212µs ± 1%      12µs ± 1%  -94.53%  (p=0.000 n=9+8)

name                   old alloc/op   new alloc/op   delta
EncodeFloats/10_seq       0.00B          0.00B          ~     (all equal)
EncodeFloats/10_ran       0.00B          0.00B          ~     (all equal)
EncodeFloats/100_seq      0.00B          0.00B          ~     (all equal)
EncodeFloats/100_ran      0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_ran     0.00B          0.00B          ~     (all equal)

name                   old allocs/op  new allocs/op  delta
EncodeFloats/10_seq        0.00           0.00          ~     (all equal)
EncodeFloats/10_ran        0.00           0.00          ~     (all equal)
EncodeFloats/100_seq       0.00           0.00          ~     (all equal)
EncodeFloats/100_ran       0.00           0.00          ~     (all equal)
EncodeFloats/1000_seq      0.00           0.00          ~     (all equal)
EncodeFloats/1000_ran      0.00           0.00          ~     (all equal)
2018-10-16 12:05:52 +01:00
Edd Robinson ee607f9288 Batch oriented string encoders
This commit adds a tsm1 function for encoding a batch of strings into a
provided buffer. The new function also shares the buffer between the
input data and the snappy encoded output, reducing allocations.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders using
randomly generated strings.

name                old time/op    new time/op    delta
EncodeStrings/10      2.14µs ± 4%    1.42µs ± 4%   -33.56%  (p=0.000 n=10+10)
EncodeStrings/100     12.7µs ± 3%    10.9µs ± 2%   -14.46%  (p=0.000 n=10+10)
EncodeStrings/1000     132µs ± 2%     114µs ± 2%   -13.88%  (p=0.000 n=10+9)

name                old alloc/op   new alloc/op   delta
EncodeStrings/10        657B ± 0%      704B ± 0%    +7.15%  (p=0.000 n=10+10)
EncodeStrings/100     6.14kB ± 0%    9.47kB ± 0%   +54.14%  (p=0.000 n=10+10)
EncodeStrings/1000    61.4kB ± 0%    90.1kB ± 0%   +46.66%  (p=0.000 n=10+10)

name                old allocs/op  new allocs/op  delta
EncodeStrings/10        3.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeStrings/100       3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.000 n=10+10)
EncodeStrings/1000      3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.000 n=10+10)
2018-10-16 12:05:52 +01:00
Edd Robinson d1b7e02483 Batch oriented timestamp encoders
This commit adds a tsm1 function for encoding a batch of timestamps into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice, a randomly generated input slice and a
duplicate slice. All slices are sorted.

name                       old time/op    new time/op    delta
EncodeTimestamps/10_seq       153ns ± 2%     104ns ± 2%  -31.62%  (p=0.000 n=9+10)
EncodeTimestamps/10_ran       191ns ± 2%     142ns ± 0%  -25.73%  (p=0.000 n=10+9)
EncodeTimestamps/10_dup       114ns ± 1%      68ns ± 4%  -39.77%  (p=0.000 n=8+10)
EncodeTimestamps/100_seq      704ns ± 2%     321ns ± 2%  -54.44%  (p=0.000 n=9+9)
EncodeTimestamps/100_ran     7.27µs ± 4%    7.01µs ± 2%   -3.59%  (p=0.000 n=10+10)
EncodeTimestamps/100_dup      756ns ± 3%     396ns ± 2%  -47.57%  (p=0.000 n=10+10)
EncodeTimestamps/1000_seq    6.32µs ± 1%    2.46µs ± 2%  -61.01%  (p=0.000 n=8+10)
EncodeTimestamps/1000_ran     108µs ± 0%      68µs ± 3%  -37.57%  (p=0.000 n=8+10)
EncodeTimestamps/1000_dup    7.26µs ± 1%    3.64µs ± 1%  -49.80%  (p=0.000 n=10+8)

name                       old alloc/op   new alloc/op   delta
EncodeTimestamps/10_seq       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/10_ran       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/10_dup       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_seq      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_ran      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_dup      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_ran     0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_dup     0.00B          0.00B          ~     (all equal)

name                       old allocs/op  new allocs/op  delta
EncodeTimestamps/10_seq        0.00           0.00          ~     (all equal)
EncodeTimestamps/10_ran        0.00           0.00          ~     (all equal)
EncodeTimestamps/10_dup        0.00           0.00          ~     (all equal)
EncodeTimestamps/100_seq       0.00           0.00          ~     (all equal)
EncodeTimestamps/100_ran       0.00           0.00          ~     (all equal)
EncodeTimestamps/100_dup       0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_seq      0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_ran      0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_dup      0.00           0.00          ~     (all equal)
2018-10-16 12:05:52 +01:00
Edd Robinson de5ca4a108 Batch oriented int encoders
This commit adds a tsm1 function for encoding a batch of ints into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice, a randomly generated input slice and a
duplicate slice:

name                     old time/op    new time/op    delta
EncodeIntegers/10_seq       144ns ± 2%      41ns ± 1%   -71.46%  (p=0.000 n=10+10)
EncodeIntegers/10_ran       304ns ± 7%     140ns ± 2%   -53.99%  (p=0.000 n=10+10)
EncodeIntegers/10_dup       147ns ± 4%      41ns ± 2%   -72.14%  (p=0.000 n=10+9)
EncodeIntegers/100_seq      483ns ± 7%     208ns ± 1%   -56.98%  (p=0.000 n=10+9)
EncodeIntegers/100_ran     1.64µs ± 7%    1.01µs ± 1%   -38.42%  (p=0.000 n=9+9)
EncodeIntegers/100_dup      484ns ±14%     210ns ± 2%   -56.63%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq    3.11µs ± 2%    1.81µs ± 2%   -41.68%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran    16.9µs ±10%    11.0µs ± 2%   -34.58%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup    3.05µs ± 3%    1.81µs ± 2%   -40.71%  (p=0.000 n=10+8)

name                     old alloc/op   new alloc/op   delta
EncodeIntegers/10_seq       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_ran       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_dup       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_seq      32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_ran       128B ± 0%        0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_dup      32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq     32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran    1.15kB ± 0%    0.00kB       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup     32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)

name                     old allocs/op  new allocs/op  delta
EncodeIntegers/10_seq        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_ran        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_dup        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_seq       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_ran       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_dup       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
2018-10-16 12:05:52 +01:00
Edd Robinson 6b52231a37 Batch oriented float encoders
This commit adds a tsm1 function for encoding a batch of floats into a
buffer. Further, it replaces the `bitstream` library used in the
existing encoders (and all the current decoders) with inlined bit
expressions within the encoder, significantly reducing the function call
overhead for larger batches.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice and a randomly generated input slice.

name                   old time/op    new time/op    delta
EncodeFloats/10_seq      1.14µs ± 3%    0.24µs ± 3%  -78.94%  (p=0.000 n=10+10)
EncodeFloats/10_ran      1.69µs ± 2%    0.21µs ± 3%  -87.43%  (p=0.000 n=10+10)
EncodeFloats/100_seq     7.07µs ± 1%    1.72µs ± 1%  -75.62%  (p=0.000 n=7+9)
EncodeFloats/100_ran     15.8µs ± 4%     1.8µs ± 1%  -88.60%  (p=0.000 n=10+9)
EncodeFloats/1000_seq    50.2µs ± 3%    16.2µs ± 2%  -67.66%  (p=0.000 n=10+10)
EncodeFloats/1000_ran     174µs ± 2%      16µs ± 2%  -90.77%  (p=0.000 n=10+10)

name                   old alloc/op   new alloc/op   delta
EncodeFloats/10_seq       0.00B          0.00B          ~     (all equal)
EncodeFloats/10_ran       0.00B          0.00B          ~     (all equal)
EncodeFloats/100_seq      0.00B          0.00B          ~     (all equal)
EncodeFloats/100_ran      0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_ran     0.00B          0.00B          ~     (all equal)

name                   old allocs/op  new allocs/op  delta
EncodeFloats/10_seq        0.00           0.00          ~     (all equal)
EncodeFloats/10_ran        0.00           0.00          ~     (all equal)
EncodeFloats/100_seq       0.00           0.00          ~     (all equal)
EncodeFloats/100_ran       0.00           0.00          ~     (all equal)
EncodeFloats/1000_seq      0.00           0.00          ~     (all equal)
EncodeFloats/1000_ran      0.00           0.00          ~     (all equal)
2018-10-16 12:05:52 +01:00
Edd Robinson 9ecadd1a9c Rename time batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson c1d82fccf0 Rename unsigned batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson 536e7bb62f Rename string batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson d819378ad3 Rename boolean batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson f81e75f4c1 Rename integer batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson 12bb8881be Rename float batch decoders 2018-10-16 12:05:52 +01:00
Jonathan A. Sternberg af8bf99256
Do not panic when a series id iterator is nil 2018-10-11 15:16:59 -05:00
Jeff Wendling 69dc031a75 Use platform for most of the read service code
This commit deletes most of the code to service reads from influxdb
and pulls it in from platform instead.

Of note, the models.Tag and models.Tags types are now aliases to the
platform models.Tag and models.Tags types. Additionally, many types
in the tsdb package relating to cursors are also aliases to the same
types in the platform cursors package.

This updates the platform and flux repos to the current master in the
Gopkg.lock.
2018-10-10 11:20:25 -06:00
Ben Johnson 844b7ef9bf
Merge pull request #10299 from influxdata/bj-tsm1-panic-fix
Fix TSM1 panic on reader error.
2018-10-10 08:12:17 -06:00
Edd Robinson ee61ed3dca
Merge pull request #10327 from influxdata/er-duplicate-tsm
Cleanup failed TSM snapshots
2018-10-09 17:08:04 +01:00
Ben Johnson 1580f90be4
Merge pull request #10339 from influxdata/bj-fix-series-index-tombstone
Fix series file tombstoning.
2018-10-08 08:35:05 -06:00
Ben Johnson 2cb97146f0
Fix series file tombstoning.
This commit fixes an issue with the series file compaction process
where tombstones are lost after compaction and series existence
checks are not correct. This commit also fixes some smaller flushing
issues within the series file that mainly related to testing.
2018-10-05 08:23:25 -06:00
Edd Robinson d649d5928b Cleanup failed TSM snapshot
If there was an error after the cache has been snapshotted to one or
more TSM files, but before the cache and WAL are cleaned up, then the
cache would be repeatedly snapshotted, generated duplicate level 1 TSM
files.

This commit attempts to clean those files up by removing the temporary
TSM file(s). The snapshot will be retried.
2018-10-03 16:34:54 +01:00
Ben Johnson bdcbad3fc9
Fix append of possible nil iterator.
This commit updates an iterator list to ignore `nil` iterators.
Adding a `nil` caused the `SeriesIterators.Close()` to panic.
2018-10-02 13:19:21 -06:00
Ben Johnson 0d777ad423
Fix tsi1 sketch locking. 2018-09-26 17:01:47 -06:00
Ben Johnson da2dfa495e
Fix TSM1 panic on reader error.
This commit fixes an error check so that a `nil` TSM reader does
not cause a panic.
2018-09-24 08:54:28 -06:00
Edd Robinson 812ac6da25 PR feedback 2018-09-18 15:58:38 -07:00
Edd Robinson a15bdeef92 Fix megacheck 2018-09-18 15:58:38 -07:00
Edd Robinson 76237d80f2 Address PR feedback 2018-09-18 15:58:38 -07:00
Ben Johnson e651153f1c Add TagValueSeriesIDCache.Delete(). 2018-09-18 15:58:38 -07:00
Ben Johnson fcbc03240a Inline mutex into TagValueSeriesIDCache. 2018-09-18 15:58:38 -07:00
Ben Johnson e4f8637234 Fix ParseSeriesKeyInto() buffer shrinkage. 2018-09-18 15:58:38 -07:00
Edd Robinson bdc293abdd Tidy up 2018-09-18 15:58:38 -07:00
Edd Robinson cc6f8c3502 Reduce allocations in TSI TagSets implementation
Since all tag sets are materialised to strings before this method
returns, a large number of allocations can be avoided by carefully
resuing buffers and containers.

This commit reduces allocations by about 75%, which can be very
significant for high cardinality workloads.

The benchmark results shown below are for a benchmark that asks for all
series keys matching `tag5=value0'. There are 100K matching series keys.

benchmark                                       old ns/op     new ns/op     delta
BenchmarkIndexSet_TagSets/1M_series/inmem-8     10959963      11144345      +1.68%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      23632757      18768888      -20.58%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     10496303      10380551      -1.10%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      24344359      19020234      -21.87%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     10359864      10818296      +4.43%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      23453357      19027445      -18.87%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     10479519      10400619      -0.75%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      26364965      19023749      -27.84%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     10437794      10557066      +1.14%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      23126946      19196955      -16.99%

benchmark                                       old allocs     new allocs     delta
BenchmarkIndexSet_TagSets/1M_series/inmem-8     51             51             +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      80067          20071          -74.93%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     51             51             +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      80067          20071          -74.93%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     51             51             +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      80067          20071          -74.93%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     51             51             +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      80067          20071          -74.93%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     51             51             +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      80067          20071          -74.93%

benchmark                                       old bytes     new bytes     delta
BenchmarkIndexSet_TagSets/1M_series/inmem-8     3556728       3556728       +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      12677328      5157992       -59.31%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     3556728       3556728       +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      12677328      5157992       -59.31%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     3556728       3556728       +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      12677328      5157992       -59.31%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     3556728       3556728       +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      12677328      5157992       -59.31%
BenchmarkIndexSet_TagSets/1M_series/inmem-8     3556728       3556728       +0.00%
BenchmarkIndexSet_TagSets/1M_series/tsi1-8      12677328      5157992       -59.31%
2018-09-18 15:58:38 -07:00
Edd Robinson d8af622333 Add benchmark for TagSets across indexes 2018-09-18 15:58:38 -07:00
Edd Robinson 5c88a1dd0e Fix locking on cache 2018-09-18 15:58:38 -07:00
Edd Robinson 6d12f5d323 Debug 2018-09-18 15:58:38 -07:00
Edd Robinson 8af7c133db Refactor cache 2018-09-18 15:58:38 -07:00
Edd Robinson 1ae716b64e Use copy-on-write when cloning bitmaps
This commit sets the copy-on-write feature of the SeriesIDSets, such
that we can make immutable clones of underlying bitmaps efficiently. If
the original bitmap is modified then a copy will be made, which won't
affect the clone.
2018-09-18 15:58:38 -07:00
Edd Robinson baf35f2138 Add benchmarks for cache and option to disable 2018-09-18 15:58:38 -07:00
Edd Robinson 3f6ef0ba22 Update cached bitset results with new series ids
This commit ensures that cached bitset results at the Index level are
updated whenever new series ids are created that would belong in those
bitsets.

For example, if we have a cached bitset for the tuple {mem, region,
west}, and we add the series mem,host=prod,region=west then we would
update the cached bitset for {mem, region, west} with the series id of
the newly written series.
2018-09-18 15:58:38 -07:00
Edd Robinson 065d47e4f2 Return created series ids from LogFile insertion 2018-09-18 15:58:38 -07:00
Edd Robinson 52b5640a4a Add test for TagValueSeriesIDIterator 2018-09-18 15:58:38 -07:00
Edd Robinson 81f640e9ae Add more methods and benchmarks to SeriesIDSet 2018-09-18 15:58:38 -07:00
Edd Robinson 9fb301cf10 Add CreateSeriesListIfNotExists benchmark 2018-09-18 15:58:38 -07:00