Commit Graph

1464 Commits (db/wait-timeout-utility)

Author SHA1 Message Date
Sam Arnold 98a76a11a0 feat(tsi): optimize series iteration
When using queries like `select count(_seriesKey) from bigmeasurement`, we
should iterate over the tsi structures to serve the query instead of loading
all the series into memory up front.

Closes #20543
2021-01-25 14:27:31 -05:00
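A hedged sketch of the idea behind this change, using an invented interface and function name rather than the real tsi1 API: count series by walking an iterator instead of materializing every key first.

```go
package seriescount

// SeriesKeyIterator is an illustrative stand-in for a tsi-style iterator
// that yields one series key at a time and nil when exhausted.
type SeriesKeyIterator interface {
	Next() ([]byte, error)
}

// Count walks the iterator and counts keys without ever holding the full
// set of series keys in memory at once.
func Count(it SeriesKeyIterator) (int64, error) {
	var n int64
	for {
		key, err := it.Next()
		if err != nil {
			return 0, err
		}
		if key == nil {
			return n, nil
		}
		n++
	}
}
```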
Sam Arnold 32612313df fix: minor test fixes for go1.15 and also flaky timeouts
Also run gofmt
2021-01-08 14:59:33 -05:00
Sam Arnold d1a1e4b667 chore: restore ImportShard
This reverts commit d14acea44d.
2020-12-07 11:01:00 -04:00
davidby-influx 0faac1a478 chore(tsm1): fix formatting
Failed to format code before commit.
2020-11-16 21:25:26 -08:00
davidby-influx b3724581bc fix(tsm1): "snapshot in progress" error during backup
Loop with backoff in (*Engine).CreateSnapshot() to retry
(*Engine).WriteSnapshot() up to 3 times if
ErrSnapshotInProgress is returned.  Then continue
on no error or on SnapshotInProgress if skipCacheOk is
true.

https://github.com/influxdata/plutonium/issues/3227
(cherry picked from commit dfa6aa8cea)
2020-11-16 21:23:00 -08:00
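A minimal sketch of the retry-with-backoff behaviour described above; the function, sentinel error, and backoff values are illustrative, not the actual engine code.

```go
package backupretry

import (
	"errors"
	"time"
)

// errSnapshotInProgress stands in for the engine's snapshot-in-progress error.
var errSnapshotInProgress = errors.New("snapshot in progress")

// createSnapshot retries writeSnapshot up to 3 times with backoff. If the
// cache snapshotter stays busy and skipCacheOk is true, the backup proceeds
// anyway (it simply won't include the cache contents).
func createSnapshot(writeSnapshot func() error, skipCacheOk bool) error {
	backoff := 100 * time.Millisecond
	var err error
	for i := 0; i < 3; i++ {
		if err = writeSnapshot(); err == nil {
			return nil
		}
		if !errors.Is(err, errSnapshotInProgress) {
			return err
		}
		time.Sleep(backoff)
		backoff *= 2
	}
	if skipCacheOk {
		return nil // continue the backup without the cache snapshot
	}
	return err
}
```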
davidby-influx 0dcff81f56 fix(tsm1): "snapshot in progress" error during backup
Test the skipCacheOk flag to tsdb.Shard.CreateSnapshot() and
tsdb.Engine.CreateSnapshot()
A value of true allows the backup to proceed even if a cache
snapshot cannot be taken.

https://github.com/influxdata/plutonium/issues/3227
2020-11-05 16:50:51 -08:00
davidby-influx 6ec446f422 fix(tsm1): "snapshot in progress" error during backup
This fix adds a skipCacheOk flag to
tsdb.Store.CreateShardSnapshot() and tsdb.Shard.CreateSnapshot()
to pass to tsdb.Engine.CreateSnapshot()
A value of true allows the backup to proceed even if a cache snapshot
cannot be taken.
This flag is set to true in tsm1.Engine.Backup(), the OSS backup code path.
This flag is set to false in tsm1.Engine.Export().

https://github.com/influxdata/plutonium/issues/3227
2020-11-05 11:08:08 -08:00
davidby-influx 23be20bf1b fix(tsm1): "snapshot in progress" error during backup
When an InfluxDB database is very busy writing new points, the backup
process can fail because it cannot write a new snapshot.
The error is: operation timed out with error: create snapshot: snapshot in progress.
This happens because InfluxDB snapshots the cache almost continuously
due to the high number of points being ingested.
The fix for this was https://github.com/influxdata/influxdb/pull/16627
but it was for OSS only, and was not in the code path for backups
in clusters.
This fix adds a skipCacheOk flag to tsdb.Engine.CreateSnapshot().
A value of true allows the backup to proceed even if a cache snapshot
cannot be taken.
This flag is set to true in tsm1.Engine.Backup(), the OSS backup code path,
and in tsdb.Shard.CreateSnapshot(), the cluster backup code path.
This flag is set to false in tsm1.Engine.Export().

https://github.com/influxdata/plutonium/issues/3227
2020-10-30 10:37:36 -07:00
David Norton 3d92eef720 feat: allow disable compaction per shard
This feature allows compaction to be disabled on a per-shard basis by
creating a file named do_not_compact in a shard's directory. When
disabled, a message is logged every 15 minutes with the reason for
compaction being disabled (existence of the file). This makes it easy to
know if compaction has been disabled for any shards by searching the log
for "compaction disabled" or running "find path/to/data -type f -name
do_not_compact".
2020-10-06 10:58:07 -04:00
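A small sketch of the per-shard switch described above, assuming the do_not_compact file name from the commit message; the surrounding loop and logging helper are illustrative.

```go
package compactguard

import (
	"log"
	"os"
	"path/filepath"
	"time"
)

// compactionDisabled reports whether the shard has opted out of compaction
// by placing a do_not_compact file in its directory.
func compactionDisabled(shardPath string) bool {
	_, err := os.Stat(filepath.Join(shardPath, "do_not_compact"))
	return err == nil
}

// compactLoop runs compactions, logging (every 15 minutes) the reason
// compaction is being skipped for a disabled shard.
func compactLoop(shardPath string, compact func() error) {
	ticker := time.NewTicker(15 * time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		if compactionDisabled(shardPath) {
			log.Printf("compaction disabled for shard %s: do_not_compact file present", shardPath)
			continue
		}
		if err := compact(); err != nil {
			log.Printf("compaction failed for shard %s: %v", shardPath, err)
		}
	}
}
```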
Ayan George 3436db4ebb
refactor: Use binary.Read() instead of io.ReadFull() (#19323)
The original version of verifyVersion() reads into a byte slice,
manually ensures its byte order, then converts it to a type comparable
with Version and MagicNumber.

This patch hides those details by calling binary.Read() and reading
values into properly typed variables.

This adds a bit of overhead, but this code isn't in the hot path and this
patch greatly simplifies the code.

verifyVersion() originally accepted an io.ReadSeeker. It is only called
in one place, and that function immediately calls Seek() after
verifyVersion(), so it is probably safe to call Seek() BEFORE
verifyVersion().

The benefit is that verifyVersion() is easier to test since we can pass
it a bytes.Buffer.

This patch adds a test for verifyVersion() as well as a benchmark.

benchmark                    old ns/op     new ns/op     delta
BenchmarkVerifyVersion-8     73.5          123           +67.35%

Finally, this commit moves verifyVersion() from writer.go to reader.go
which is where it is actually used.
2020-08-13 14:54:18 -04:00
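A hedged sketch of the approach described above: read the magic number and version as typed values via binary.Read so the byte-order handling stays inside the standard library. The constant values and error text here are illustrative, not copied from the codebase.

```go
package tsmverify

import (
	"encoding/binary"
	"fmt"
	"io"
)

// Illustrative constants; the real MagicNumber and Version live in the tsm1 package.
const (
	magicNumber uint32 = 0x16D116D1
	version     byte   = 1
)

// verifyVersion reads the file header from any io.Reader (e.g. a
// bytes.Buffer in tests) and checks the magic number and version.
func verifyVersion(r io.Reader) error {
	var m uint32
	if err := binary.Read(r, binary.BigEndian, &m); err != nil {
		return fmt.Errorf("reading magic number: %v", err)
	}
	if m != magicNumber {
		return fmt.Errorf("unexpected magic number %#x", m)
	}
	var v byte
	if err := binary.Read(r, binary.BigEndian, &v); err != nil {
		return fmt.Errorf("reading version: %v", err)
	}
	if v != version {
		return fmt.Errorf("unsupported version %d", v)
	}
	return nil
}
```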
Ayan George 6ce0e11738
feat: Collect values written stats (#19187)
* feat(engine/tsm1): Add WritePointsWithContext()

Add WritePointsWithContext() and make WritePoints() a thin wrapper for
it.

The purpose is to add statistics context values that we'll use to
propagate the number of fields and points written to calls up the call
chain.

* feat(tsdb): Add WriteToShardWithContext()

When applied, this patch adds WriteToShardWithContext() and wraps it
with WriteToShard() to preserve the API.

The purpose of this addition is to propagate a context.Context value
to Shard.WritePointsWithContext().

* feat(tsdb/shard): Add WritePointsWithContext()

The purpose of adding WritePointsWithContext() is to propagate context
values down to engine code and propagate statistics via context.Value
up to callers.

This patch also adds values written statistics to the shard.

* feat(http): Gather values written stats

WritePointsWithContext() was added to propagate context values down to
the engine and communicate stats to the caller.

* refactor: Change MetricKey to ContextKey

This patch gives the type we're using for context keys a better name.
2020-08-12 11:26:12 -04:00
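A rough sketch, with an invented context key and helper, of how per-request write statistics can travel up the call chain via context values as the commits above describe.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

type contextKey string

const pointsWrittenKey contextKey = "points-written" // illustrative key name

// withPointsCounter attaches a counter that lower layers can add to.
func withPointsCounter(ctx context.Context) (context.Context, *int64) {
	n := new(int64)
	return context.WithValue(ctx, pointsWrittenKey, n), n
}

// writePointsWithContext stands in for the engine-level write path: if the
// caller supplied a counter, record how many points were written.
func writePointsWithContext(ctx context.Context, points []string) error {
	if n, ok := ctx.Value(pointsWrittenKey).(*int64); ok {
		atomic.AddInt64(n, int64(len(points)))
	}
	// ... actual write omitted ...
	return nil
}

func main() {
	ctx, written := withPointsCounter(context.Background())
	_ = writePointsWithContext(ctx, []string{"cpu v=1", "cpu v=2"})
	fmt.Println("points written:", atomic.LoadInt64(written)) // 2
}
```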
Ben Johnson 4a1a8c0041
Merge pull request #18689 from influxdata/batch-write-tombstones-when-deleting
perf(tsi1): batch write tombstone entries when dropping/deleting
2020-06-25 08:15:12 -06:00
Ayan George a9d02e7ab7
fix: Handle snapshot related errors (#18710)
When applied, this patch will:

* log snapshot directory removal errors

  Prior to this patch, errors when removing temporary snapshot
  directories happened silently.

  This patch ensures that errors are logged when os.RemoveAll() fails.

* refactor tsm1: Declare error value in condition

  Saves a line of code and limits the scope of an error value.

* refactor tsm1: Add MakeSnapshotLinks()

  This commit adds (*FileStore).MakeSnapshotLinks().  The code in this
  function was originally part of CreateSnapshot().

  That code was hoisted out and into MakeSnapshotLinks() because there
  are two points of failure that require cleanup -- we have to delete a
  temporary directory on failure.

  Placing the code in one function allows us to check its returned error
  value and perform cleanup in only one place.

  In short, we hoisted code out of CreateSnapshot() to simplify error
  handling.

  On error, we remove any directories we created.
2020-06-25 10:05:04 -04:00
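A minimal sketch (with invented helper names) of the pattern described above: hoist the link creation into one function so the caller has exactly one place to check for failure, remove the temporary directory, and log any removal error.

```go
package snapshotlinks

import (
	"log"
	"os"
	"path/filepath"
)

// makeSnapshotLinks hard-links the given files into tmpDir, returning the
// first error encountered.
func makeSnapshotLinks(tmpDir string, files []string) error {
	for _, f := range files {
		if err := os.Link(f, filepath.Join(tmpDir, filepath.Base(f))); err != nil {
			return err
		}
	}
	return nil
}

// createSnapshot creates the temporary directory, links the files, and
// performs cleanup in a single place on failure instead of scattering
// removal logic through the function body.
func createSnapshot(files []string) (string, error) {
	tmpDir, err := os.MkdirTemp("", "snapshot")
	if err != nil {
		return "", err
	}
	if err := makeSnapshotLinks(tmpDir, files); err != nil {
		if rmErr := os.RemoveAll(tmpDir); rmErr != nil {
			log.Printf("error removing temporary snapshot directory %s: %v", tmpDir, rmErr)
		}
		return "", err
	}
	return tmpDir, nil
}
```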
dengzhi.ldz 331569bc11 perf(tsi1): batch write tombstone entries when dropping/deleting 2020-06-24 09:26:09 -06:00
J. Emrys Landivar 6be2b37e0e
Merge pull request #18014 from foobar/cleanup-unused-functions
chore: clean up unused functions
2020-05-18 18:47:07 -05:00
Ayan George c0e391935d
fix(tsdb): address staticcheck warning SA4006 (#18127)
This commit addresses staticcheck warning SA4006 "this value of VAR is
never used" by removing unused variables
2020-05-18 13:11:44 -04:00
Ayan George 1a22af7172
fix(tsdb): address staticcheck warning st1006 (#18126)
* fix(tsdb): address staticcheck ST1006

This patch addresses staticcheck warning "receiver name should not be an
underscore, omit the name if it is unused (ST1006)" for 6 methods.
2020-05-17 22:36:06 -04:00
Tristan Su d14acea44d chore: clean up unused functions 2020-05-08 13:45:34 +08:00
Ayan George a0f2e0c21a
fix(tsm1): Fix temp directory search bug (#17685)
* fix: verify precision parameter in write requests

This change updates the HTTP endpoints that service v1 and v2 writes to
verify the values passed in the precision parameter.

* fix(tsm1): Fix temp directory search bug

The original code's intention is to scan a directory for the directory
with the highest value when converted to an integer.

So directories may be in the form:

  0.tmp
  1.tmp
  2.tmp
  30.tmp
  ...
  100.tmp

The loop should scan the directory, strip the basename and extension
from the file name to leave just a number, then store the highest number
it finds.

Before this patch, there was a bug where the code only stored the
highest value if there was an error converting the numeric value into an
integer.

This patch primarily fixes that logic.

In addition, this patch will save an indent level by inverting logic in
two places:

Instead of checking if a file is a directory and has a suffix of ".tmp",
it is probably better to test if a file is NOT a directory OR does NOT
have an extension of ".tmp" and then continue.

Also, instead of testing if len(ss) == 2, we can test if len(ss) != 2 and
continue if so.

Both of these save an indent level and keep our "happy path" to the
left.

Finally, this patch uses string concatenation instead of calling
fmt.Sprintf() to add periods to the "tmp" and "tsm" extensions.

Co-authored-by: David Norton <dgnorton@gmail.com>
2020-04-15 10:29:46 -04:00
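A sketch of the corrected scan: skip anything that is not a directory or lacks the ".tmp" suffix, parse the remaining name as a number, and keep the highest value seen. Function and variable names are invented for illustration.

```go
package tmpscan

import (
	"os"
	"strconv"
	"strings"
)

// highestTmpGeneration returns the largest N found among "N.tmp"
// subdirectories of dir, or -1 if none are present.
func highestTmpGeneration(dir string) (int, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return -1, err
	}
	max := -1
	for _, e := range entries {
		// Invert the checks to keep the happy path unindented.
		if !e.IsDir() || !strings.HasSuffix(e.Name(), ".tmp") {
			continue
		}
		n, err := strconv.Atoi(strings.TrimSuffix(e.Name(), ".tmp"))
		if err != nil {
			continue // the pre-fix bug only updated the maximum in this branch
		}
		if n > max {
			max = n
		}
	}
	return max, nil
}
```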
Ben Johnson 3f59d4be8e fix(tsdb): Fix error lists 2020-04-01 14:54:39 -06:00
elbehery 042128b948 fix(tsdb): Replace panic with error while de/encoding corrupt data
fixes #17440

While encoding or decoding corrupt data, the current behaviour is to `panic`.
This commit replaces the `panic` with an `error` that is propagated up to the calling `iterator`.
To avoid overwriting other errors, iterators now wrap a `TSMErrors` value which contains ALL the encountered errors.
TSMErrors itself implements `Error()`; the returned string contains all the error messages, separated by a "," delimiter.
2020-04-01 20:51:11 +02:00
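A hedged sketch of a multi-error container in the spirit of the TSMErrors type described above: collect every error encountered during iteration and join the messages in Error().

```go
package tsmerrors

import "strings"

// TSMErrors collects every error hit while iterating corrupt blocks instead
// of keeping only the last one.
type TSMErrors []error

// Error joins the individual messages with a comma delimiter.
func (e TSMErrors) Error() string {
	msgs := make([]string, 0, len(e))
	for _, err := range e {
		msgs = append(msgs, err.Error())
	}
	return strings.Join(msgs, ", ")
}
```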
David Norton 25381f97c8
Merge pull request #15952 from influxdata/er-verify-tombstone
feat(inspect): add influx_inspect verify-tombstone tool
2020-03-11 15:56:37 -04:00
Edd Robinson c3f4382ed8
Merge pull request #16606 from influxdata/BP-1.8-er-tsm-block-fix
fix(storage): ensure all block data returned
2020-03-03 14:32:15 +00:00
Gianluca Arbezzano 30621ca9ec chore(tsm1): skip WriteSnapshot during backup if snapshotter is busy
When an InfluxDB database is very busy writing new points, the backup
process can fail because it cannot write a new snapshot.

The error is: `operation timed out with error: create snapshot: snapshot in progress`.

This happens because InfluxDB snapshots the cache almost continuously
due to the high number of points being ingested.

This PR skips the snapshot if the `snapshotter` does not become available
after three attempts when a backup is requested.

The backup won't contain the data in the cache or WAL.

Signed-off-by: Gianluca Arbezzano <gianarb92@gmail.com>
2020-02-04 20:09:50 +01:00
Edd Robinson 22798fa290 fix(storage): ensure all block data returned
This commit prevents multiple blocks for the same series key from having
values truncated when they are being read into an empty buffer.

The current cursor reader code has an optimisation that incorrectly
assumes the incoming array will be limited to 1,000 values (the maximum
block size), but arrays can contain values from multiple matching
blocks.
2020-01-21 18:00:11 +00:00
Sean Brickley fe55d728f0 fix(tsm1): Compaction log error 2020-01-13 20:05:05 -05:00
tmgordeeva f1d26652e9
fix(storage): skip TSM files with block read errors (#15885)
* fix(storage): skip TSM files with block read errors

When we find a bad TSM file during compaction, propagate the error up and move
the bad file aside. The engine will disregard the file so the next compaction
will not hit the same error.
2019-12-13 15:05:39 -08:00
Edd Robinson f7f19c5904 refactor(storage): add tombstone extension 2019-11-17 14:43:38 +00:00
David Norton 102fcd671b fix(tsm1): make Digest() safe for concurrent use
This change adds a lock around digest creation so that it is safe for
concurrent calls. Prior to this change, calls from multiple goroutines
resulted in "Digest aborted, problem renaming tmp digest" errors.
2019-11-12 18:02:41 -05:00
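A tiny sketch of the concurrency fix described above: serialize digest creation behind a mutex so concurrent callers cannot race on the temporary digest file. The struct and field names are illustrative.

```go
package digestlock

import "sync"

// shard is an illustrative stand-in for the type that owns digest creation.
type shard struct {
	digestMu sync.Mutex
}

// Digest lets only one goroutine at a time write and rename the temporary
// digest file, avoiding "problem renaming tmp digest" races.
func (s *shard) Digest() error {
	s.digestMu.Lock()
	defer s.digestMu.Unlock()
	// ... write digest to a tmp file and rename it into place ...
	return nil
}
```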
elbehery a4bb1083f2 fix(storage): Renaming corrupt data files fails
fixes #14107
2019-10-28 17:32:58 +01:00
maxunt 021dd3114d
Merge pull request #14245 from influxdata/mu-tsm-clean-1.8-14058
Clean tmp tsm files when replace fails
2019-08-08 17:02:01 -07:00
Max U c6c0a5d3b1 change log level from info to error 2019-07-08 14:52:53 -04:00
Mark Rushakoff 2dfafe822c Remove stray fmt.Println in tsm1.StringArrayEncodeAll
It was introduced in #13699.

Updates #14265.
2019-07-05 15:58:53 -07:00
Max U 9091d72ba7 initial commit for 1.8 2019-07-02 13:18:20 -04:00
Edd Robinson 43e144a923 fix(storage): ensure WAL size correctly set on startup 2019-06-28 16:20:45 +01:00
Edd Robinson ecff62b9e4 test(storage): add test for reproducing #14229 2019-06-28 16:18:32 +01:00
Stuart Carnie a0f7c15b08
chore: Fix constant for 32-bit architecture 2019-06-07 11:00:13 -07:00
Stuart Carnie 86734e7fcd
fix(storage): Don't panic when length of source slice is too large
StringArrayEncodeAll will panic if the total length of strings
contained in the src slice is > 0xffffffff. This change adds a unit
test to replicate the issue and an associated fix to return an error.

This also raises an issue that compactions will be unable to make
progress under the following condition:

* multiple string blocks are to be merged to a single block and
* the total length of all strings exceeds the maximum block size that
  snappy will encode (0xffffffff)

The observable effect of this is errors in the logs indicating a
compaction failure.

Fixes #13687
2019-06-04 17:05:01 -07:00
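A minimal sketch of the kind of guard described above: return an error rather than panicking when the combined length of the source strings exceeds what a 32-bit length can address. The function and error names are illustrative.

```go
package stringblock

import (
	"errors"
	"math"
)

var errBlockTooLarge = errors.New("block too large: total string length exceeds max uint32")

// checkStringBlockSize verifies the strings in src can be encoded into a
// single block before any encoding work begins.
func checkStringBlockSize(src []string) error {
	var total uint64
	for _, s := range src {
		total += uint64(len(s))
		if total > math.MaxUint32 {
			return errBlockTooLarge
		}
	}
	return nil
}
```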
Stuart Carnie f1e1164e96
fix(storage): Don't panic when length of source slice is too large
StringArrayEncodeAll will panic if the total length of strings
contained in the src slice is > 0xffffffff. This change adds a unit
test to replicate the issue and an associated fix to return an error.

This also raises an issue that compactions will be unable to make
progress under the following condition:

* multiple string blocks are to be merged to a single block and
* the total length of all strings exceeds the maximum block size that
  snappy will encode (0xffffffff)

The observable effect of this is errors in the logs indicating a
compaction failure.

Fixes #13687
2019-05-30 08:23:58 -07:00
Ben Johnson aa3dfc0662
Merge pull request #11791 from influxdata/bj-revert-limit-full-compaction-1.8
Revert "Limit force-full and cold compaction size."
2019-02-11 12:50:55 -07:00
Ben Johnson 198f6fde38
Fix deleteSeriesRange() race condition. 2019-02-11 11:29:09 -07:00
Ben Johnson 2dd913d71b
Revert "Limit force-full and cold compaction size."
This reverts commit 40db64d0b9.
2019-02-11 11:07:44 -07:00
Edd Robinson 05e7def600
Merge pull request #10332 from ludweeg/ludweeg/unslice
Simplify s[:] to s where s is a slice
2019-02-11 10:24:43 +00:00
Jonathan A. Sternberg 2811cde76d
Merge pull request #10414 from seebs/seebs/valuerReuse
reuse ValuerEval objects
2019-02-06 09:14:15 -06:00
Seebs 5525240de3 reuse ValuerEval objects
Scanner objects and iterators often need a ValuerEval. This
object is created, often with a function call, and has at
least one interface in it, so it allocates storage. Then it's
dropped again right away. The only part of it that might be
subject to change is usually a map. While the map's contents
change over time, the actual map doesn't change for the
lifetime of the object.

So, in both iterators and scanners, stash the ValuerEval
and continue reusing it. On a query returning a fair number
of data points, this produces a small (<5% in practice)
improvement in observed performance, visible as a significant
reduction in time spent in runtime (mallocgc, newobject,
etcetera).

The performance improvement isn't big, but it's reasonably
easy to evaluate it and establish that it's a safe change
to make.

Signed-off-by: seebs <seebs@seebs.net>
2019-02-05 15:10:23 -06:00
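A rough sketch of the reuse pattern, with simplified stand-in types: build the evaluator once when the scanner is constructed and keep it, mutating only the map contents between points.

```go
package valuerreuse

// valuerEval is a toy stand-in for the evaluator object: it carries an
// interface-typed map whose contents change per point but whose storage
// does not need to be reallocated.
type valuerEval struct {
	vars map[string]interface{}
}

func (v *valuerEval) eval(field string) interface{} { return v.vars[field] }

// scanner stashes one valuerEval for its whole lifetime instead of
// constructing (and allocating) a fresh one for every data point.
type scanner struct {
	valuer valuerEval
}

func newScanner() *scanner {
	return &scanner{valuer: valuerEval{vars: make(map[string]interface{})}}
}

func (s *scanner) scan(field string, value interface{}) interface{} {
	s.valuer.vars[field] = value // contents change; the map itself is reused
	return s.valuer.eval(field)
}
```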
Ben Johnson def9589584
Merge pull request #10522 from hahnjo/fix-compaction-cache-snapshots
Fix compaction logic on infrequent cache snapshots
2019-02-04 08:34:03 -07:00
Ben Johnson 4083ae01e3
Merge branch '1.8' into hpb-no-series-rebuild-on-delete-when-series-still-in-cache 2019-02-04 08:32:04 -07:00
Edd Robinson 3a81921bb0
Merge pull request #10505 from hpbieker/hpb-no-series-rebuild-on-delete-without-overlap-timerange
Do not rebuild series index on delete for series not overlapping in time
2019-02-04 03:22:00 -08:00
Ben Wells e9bada090f Fix misspelling identified by misspell 2019-02-03 20:27:43 +00:00
Ben Johnson 0c6d77d952
Merge pull request #9944 from michaelyou/hotfix-hashring-mod
Hash ring's hash mod
2019-02-01 12:54:15 -08:00
Edd Robinson 348dac1672 Add repro test case for UTF-8 issue 2018-12-19 14:38:31 +00:00
Ben Johnson 40db64d0b9
Limit force-full and cold compaction size.
This commit limits the number of files that can be compacted in
a single group when forcing a full compaction or when a shard
becomes cold. This is to prevent too many files being compacted
at the same time.
2018-12-05 10:18:56 -07:00
Stuart Carnie 39a3d2335e chore(flux): Update to Flux 0.7.1
Resolve breaking API changes
2018-11-30 10:38:56 -07:00
Jeff Wendling 0a2f6191a6 tsdb: clean up fields index for every kind of delete
Before this, if you deleted everything with `delete where true`
for example, then you would be left with all of your measurements
in the fields index. That would cause ghost fields to reappear
if someone reinserted into the measurement.

This fixes that by making it so the deepest most delete code
checks if the measurement was removed from the index, and if so
cleaning it up out of the fields index.

Additionally, it fixes bugs in that cleanup code: with measurements like
"m1" and "m10", iterating over the cache or file store would match "m1"
against "m10" because only the prefix was checked. The cleanup now also
checks that the character right after the measurement is either a comma
(because tags start there) or the first character of the field separator.
2018-11-27 16:12:06 -07:00
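A hedged sketch of that boundary check: a series key belongs to a measurement only if the measurement prefix is followed by a tag separator (',') or the first character of the field-key separator, so "m1" no longer matches "m10". The separator bytes shown are an assumption for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// fieldKeySep stands in for the key/field separator; the exact bytes here
// are illustrative.
var fieldKeySep = []byte("#!~#")

// keyBelongsToMeasurement checks the character immediately after the
// measurement prefix instead of matching on the prefix alone.
func keyBelongsToMeasurement(key, measurement []byte) bool {
	if !bytes.HasPrefix(key, measurement) {
		return false
	}
	rest := key[len(measurement):]
	return len(rest) > 0 && (rest[0] == ',' || rest[0] == fieldKeySep[0])
}

func main() {
	fmt.Println(keyBelongsToMeasurement([]byte("m1,host=a"), []byte("m1")))  // true
	fmt.Println(keyBelongsToMeasurement([]byte("m10,host=a"), []byte("m1"))) // false
}
```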
Jonas Hahnfeld 217772752d Fix compaction logic on infrequent cache snapshots
This change fixes #10511, which manifests when a shard is considered cold
faster than its cache is snapshotted. This can happen if WAL is enabled
because previously the code only considered the last modification of
compacted tsm1 files. Instead, Engine.LastModified() also takes the WAL
into account if necessary.
2018-11-26 22:05:37 +01:00
Hans Petter Bieker 4670b8d65e Removed file that should not have been added. 2018-11-20 16:39:27 +01:00
Hans Petter Bieker 1d5463e0a0 Do not rebuild series index on delete for series not overlapping in time. 2018-11-20 16:24:13 +01:00
Hans Petter Bieker 926f78d832 Do not rebuild series index on delete when the series still exists in the cache. 2018-11-20 10:34:59 +01:00
Stuart Carnie c3d7f3de2b fix: Allow compactor to make progress if v.MaxTime() != entry.MaxTime 2018-11-14 09:13:13 -07:00
Stuart Carnie 5d083887a5 chore: Compactor test which replicates issue #10465
Due to an encoding bug with simple8b, it is possible that the
MaxTime for a TSM index entry does not match the last encoded timestamp.
2018-11-14 09:13:13 -07:00
Jonathan A. Sternberg a16096cbc4
Merge pull request #9943 from michaelyou/hotfix-typo
Some typo and Wrong position of comment
2018-11-05 12:36:05 -06:00
Edd Robinson be662a5853 Fix TSM index maxtime modification 2018-10-29 15:44:31 +00:00
David Norton 3ad44c0ff4 error if manifest is read/written more than once
This change makes the shard digest writer and reader return an error if
the manifest is written or read more than once.
2018-10-22 14:42:05 -04:00
David Norton 3d01051dfc make digest reader skip manifest if needed
This change makes the digest reader read and discard the manifest if
needed. Not all readers of a digest are interested in the manifest.

This change also makes it a requirement for the writer to write a
manifest because it is a non-optional part of a digest file.
2018-10-22 13:14:35 -04:00
Stuart Carnie 0734f6fe21 feat(tsm1): Improve performance of Gorilla float block decoding
```
name                        old time/op   new time/op    delta
FloatArrayDecodeAll/1-8      45.9ns ± 1%    13.8ns ± 1%   -70.00%  (p=0.000 n=9+9)
FloatArrayDecodeAll/55-8      686ns ± 0%     232ns ± 1%   -66.10%  (p=0.000 n=9+8)
FloatArrayDecodeAll/550-8    5.78µs ± 0%    2.22µs ± 1%   -61.61%  (p=0.000 n=9+9)
FloatArrayDecodeAll/1000-8   10.2µs ± 2%     4.0µs ± 5%   -60.47%  (p=0.000 n=10+10)

name                        old speed     new speed      delta
FloatArrayDecodeAll/1-8     414MB/s ± 1%  1383MB/s ± 1%  +233.76%  (p=0.000 n=9+9)
FloatArrayDecodeAll/55-8    144MB/s ± 0%   424MB/s ± 1%  +194.19%  (p=0.000 n=9+9)
FloatArrayDecodeAll/550-8   133MB/s ± 0%   346MB/s ± 1%  +160.09%  (p=0.000 n=9+10)
FloatArrayDecodeAll/1000-8  135MB/s ± 2%   340MB/s ± 5%  +153.03%  (p=0.000 n=10+10)
```
2018-10-16 17:28:36 -07:00
Stuart Carnie 4dccba29c3 chore(tsm1): go fmt file 2018-10-16 17:07:19 -07:00
Ben Johnson a989b01356
Merge pull request #10249 from hpbieker/hpb-delete-from-prevent-rebuild-series
Prevent DELETE FROM from rebuilding series files for shards where nothing is deleted
2018-10-16 14:53:09 -06:00
Edd Robinson 5054d6fae4 Address PR feedback 2018-10-16 13:37:49 +01:00
Stuart Carnie a792fbbdfa fix(encoding): Improve array string encoding perf a little more
Encode the compressed data at the start internal buffer. This ensures
the returned slice maintains the entire capacity and is available for
subsequent use.

When we pool / reuse string buffers, this will help considerably.

Improvements over previous commit:

```
name                        old time/op    new time/op    delta
EncodeStrings/10/batch-8       542ns ± 1%     355ns ± 2%   -34.53%  (p=0.008 n=5+5)
EncodeStrings/100/batch-8     5.29µs ± 1%    3.58µs ± 2%   -32.20%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8    48.6µs ± 0%    36.2µs ± 2%   -25.40%  (p=0.008 n=5+5)

name                        old alloc/op   new alloc/op   delta
EncodeStrings/10/batch-8        704B ± 0%        0B       -100.00%  (p=0.008 n=5+5)
EncodeStrings/100/batch-8     9.47kB ± 0%    0.00kB       -100.00%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8    90.1kB ± 0%     0.0kB       -100.00%  (p=0.008 n=5+5)

name                        old allocs/op  new allocs/op  delta
EncodeStrings/10/batch-8        0.00           0.00           ~     (all equal)
EncodeStrings/100/batch-8       1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
```
2018-10-16 12:08:12 +01:00
Stuart Carnie 964bc3c19e fix(encoding): Improve simple8b another 6%; fix inconsequential bug
simple8b encodes deltas[1:], thus deltas[0] >= simple8b.MaxValue is
invalid.

Also changed loop calculating deltas, RLE and max to be similar to
batch timestamp, for greater consistency.

Improvements over previous commit:

```
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    1.50µs ± 1%    1.48µs ± 1%  -1.40%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    6.10µs ± 0%    5.69µs ± 2%  -6.58%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    1.50µs ± 1%    1.49µs ± 0%  -1.21%  (p=0.008 n=5+5)
```

Improvements overall:

```
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    2.04µs ± 0%    1.48µs ± 1%  -27.25%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    8.80µs ± 2%    5.69µs ± 2%  -35.29%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    2.03µs ± 1%    1.49µs ± 0%  -26.93%  (p=0.008 n=5+5)
```
2018-10-16 12:08:12 +01:00
Stuart Carnie 43f96a6ddf feat(encoding): Improve timestamp encoding
Timestamp improvements prior to any improvements to simple8b

```
name                               old time/op    new time/op    delta
EncodeTimestamps/1000_seq/batch-8    2.64µs ± 1%    1.36µs ± 1%  -48.25%  (p=0.008 n=5+5)
EncodeTimestamps/1000_ran/batch-8    64.0µs ± 1%    32.2µs ± 1%  -49.64%  (p=0.008 n=5+5)
EncodeTimestamps/1000_dup/batch-8    9.32µs ± 0%    1.30µs ± 1%  -86.06%  (p=0.008 n=5+5)
```
2018-10-16 12:08:12 +01:00
Stuart Carnie e9531b7830 feat(encoding): Improve integer and simple8b encoding performance
simple8b EncodeAll improvements:

```
name                     old time/op  new time/op  delta
EncodeAll/1_bit-8        28.5µs ± 1%  28.6µs ± 1%     ~     (p=0.133 n=9+10)
EncodeAll/2_bits-8       28.9µs ± 2%  28.7µs ± 0%     ~     (p=0.068 n=10+8)
EncodeAll/3_bits-8       29.3µs ± 1%  28.8µs ± 0%   -1.70%  (p=0.000 n=10+10)
EncodeAll/4_bits-8       29.6µs ± 1%  29.1µs ± 1%   -1.85%  (p=0.000 n=10+10)
EncodeAll/5_bits-8       30.6µs ± 1%  29.8µs ± 2%   -2.70%  (p=0.000 n=10+10)
EncodeAll/6_bits-8       31.3µs ± 1%  30.0µs ± 1%   -4.08%  (p=0.000 n=9+9)
EncodeAll/7_bits-8       32.6µs ± 1%  30.8µs ± 0%   -5.49%  (p=0.000 n=9+9)
EncodeAll/8_bits-8       33.6µs ± 2%  31.0µs ± 1%   -7.77%  (p=0.000 n=10+9)
EncodeAll/10_bits-8      34.9µs ± 0%  31.9µs ± 2%   -8.55%  (p=0.000 n=9+10)
EncodeAll/12_bits-8      36.8µs ± 1%  32.6µs ± 1%  -11.35%  (p=0.000 n=9+10)
EncodeAll/15_bits-8      39.8µs ± 1%  34.1µs ± 2%  -14.40%  (p=0.000 n=10+10)
EncodeAll/20_bits-8      45.2µs ± 3%  36.2µs ± 1%  -19.97%  (p=0.000 n=10+9)
EncodeAll/30_bits-8      55.0µs ± 0%  40.9µs ± 1%  -25.62%  (p=0.000 n=9+9)
EncodeAll/60_bits-8      86.2µs ± 1%  55.2µs ± 1%  -35.92%  (p=0.000 n=10+10)
EncodeAll/combination-8   582µs ± 2%   502µs ± 1%  -13.80%  (p=0.000 n=9+9)
```

EncodeIntegers:

```
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    2.04µs ± 0%    1.50µs ± 1%  -26.22%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    8.80µs ± 2%    6.10µs ± 0%  -30.73%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    2.03µs ± 1%    1.50µs ± 1%  -26.04%  (p=0.008 n=5+5)
```

EncodeTimestamps (ran is improved due to simple8b improvements)

```
name                               old time/op    new time/op    delta
EncodeTimestamps/1000_seq/batch-8    2.64µs ± 1%    2.65µs ± 2%     ~     (p=0.310 n=5+5)
EncodeTimestamps/1000_ran/batch-8    64.0µs ± 1%    33.8µs ± 1%  -47.23%  (p=0.008 n=5+5)
EncodeTimestamps/1000_dup/batch-8    9.32µs ± 0%    9.28µs ± 1%     ~     (p=0.087 n=5+5)
```
2018-10-16 12:08:12 +01:00
Edd Robinson 91d0a8c3d2 Fix index bug in float encoder 2018-10-16 12:08:12 +01:00
Edd Robinson 09da18c08e Add TSM batch key iterator
The batch focussed TSM key iterator iterates TSM blocks, decoding and
merging blocks where appropriate using the batch focussed
approaches.
2018-10-16 12:08:12 +01:00
Edd Robinson 51233b71a5 Add batch block encoders 2018-10-16 12:05:52 +01:00
Edd Robinson 592127e411 Batch oriented unsigned encoder 2018-10-16 12:05:52 +01:00
Edd Robinson a7a70a920e Batch oriented boolean encoders
This commit adds a tsm1 function for encoding a batch of booleans into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders using
randomly generated sets of booleans.
2018-10-16 12:05:52 +01:00
Jeff Wendling a4d4ef6999 Improvements to batch float encoder
- Inlined the closure to avoid a function call.
- Changed append(b, make([]byte, 8)...) to inline the make call.
- Check for NaN once at the end assuming NaN is infrequent.

New performance delta comparing the current iterators to the new batch
function:

name                   old time/op    new time/op    delta
EncodeFloats/10_seq      1.32µs ± 2%    0.17µs ± 2%  -87.39%  (p=0.000 n=10+10)
EncodeFloats/10_ran      2.09µs ± 1%    0.15µs ± 0%  -92.97%  (p=0.000 n=10+9)
EncodeFloats/100_seq     8.37µs ± 2%    1.28µs ± 2%  -84.74%  (p=0.000 n=10+10)
EncodeFloats/100_ran     19.1µs ± 1%     1.3µs ± 1%  -93.08%  (p=0.000 n=9+9)
EncodeFloats/1000_seq    60.4µs ± 1%    12.6µs ± 0%  -79.13%  (p=0.000 n=9+7)
EncodeFloats/1000_ran     212µs ± 1%      12µs ± 1%  -94.53%  (p=0.000 n=9+8)

name                   old alloc/op   new alloc/op   delta
EncodeFloats/10_seq       0.00B          0.00B          ~     (all equal)
EncodeFloats/10_ran       0.00B          0.00B          ~     (all equal)
EncodeFloats/100_seq      0.00B          0.00B          ~     (all equal)
EncodeFloats/100_ran      0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_ran     0.00B          0.00B          ~     (all equal)

name                   old allocs/op  new allocs/op  delta
EncodeFloats/10_seq        0.00           0.00          ~     (all equal)
EncodeFloats/10_ran        0.00           0.00          ~     (all equal)
EncodeFloats/100_seq       0.00           0.00          ~     (all equal)
EncodeFloats/100_ran       0.00           0.00          ~     (all equal)
EncodeFloats/1000_seq      0.00           0.00          ~     (all equal)
EncodeFloats/1000_ran      0.00           0.00          ~     (all equal)
2018-10-16 12:05:52 +01:00
Edd Robinson ee607f9288 Batch oriented string encoders
This commit adds a tsm1 function for encoding a batch of strings into a
provided buffer. The new function also shares the buffer between the
input data and the snappy encoded output, reducing allocations.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders using
randomly generated strings.

name                old time/op    new time/op    delta
EncodeStrings/10      2.14µs ± 4%    1.42µs ± 4%   -33.56%  (p=0.000 n=10+10)
EncodeStrings/100     12.7µs ± 3%    10.9µs ± 2%   -14.46%  (p=0.000 n=10+10)
EncodeStrings/1000     132µs ± 2%     114µs ± 2%   -13.88%  (p=0.000 n=10+9)

name                old alloc/op   new alloc/op   delta
EncodeStrings/10        657B ± 0%      704B ± 0%    +7.15%  (p=0.000 n=10+10)
EncodeStrings/100     6.14kB ± 0%    9.47kB ± 0%   +54.14%  (p=0.000 n=10+10)
EncodeStrings/1000    61.4kB ± 0%    90.1kB ± 0%   +46.66%  (p=0.000 n=10+10)

name                old allocs/op  new allocs/op  delta
EncodeStrings/10        3.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeStrings/100       3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.000 n=10+10)
EncodeStrings/1000      3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.000 n=10+10)
2018-10-16 12:05:52 +01:00
Edd Robinson d1b7e02483 Batch oriented timestamp encoders
This commit adds a tsm1 function for encoding a batch of timestamps into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice, a randomly generated input slice and a
duplicate slice. All slices are sorted.

name                       old time/op    new time/op    delta
EncodeTimestamps/10_seq       153ns ± 2%     104ns ± 2%  -31.62%  (p=0.000 n=9+10)
EncodeTimestamps/10_ran       191ns ± 2%     142ns ± 0%  -25.73%  (p=0.000 n=10+9)
EncodeTimestamps/10_dup       114ns ± 1%      68ns ± 4%  -39.77%  (p=0.000 n=8+10)
EncodeTimestamps/100_seq      704ns ± 2%     321ns ± 2%  -54.44%  (p=0.000 n=9+9)
EncodeTimestamps/100_ran     7.27µs ± 4%    7.01µs ± 2%   -3.59%  (p=0.000 n=10+10)
EncodeTimestamps/100_dup      756ns ± 3%     396ns ± 2%  -47.57%  (p=0.000 n=10+10)
EncodeTimestamps/1000_seq    6.32µs ± 1%    2.46µs ± 2%  -61.01%  (p=0.000 n=8+10)
EncodeTimestamps/1000_ran     108µs ± 0%      68µs ± 3%  -37.57%  (p=0.000 n=8+10)
EncodeTimestamps/1000_dup    7.26µs ± 1%    3.64µs ± 1%  -49.80%  (p=0.000 n=10+8)

name                       old alloc/op   new alloc/op   delta
EncodeTimestamps/10_seq       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/10_ran       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/10_dup       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_seq      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_ran      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_dup      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_ran     0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_dup     0.00B          0.00B          ~     (all equal)

name                       old allocs/op  new allocs/op  delta
EncodeTimestamps/10_seq        0.00           0.00          ~     (all equal)
EncodeTimestamps/10_ran        0.00           0.00          ~     (all equal)
EncodeTimestamps/10_dup        0.00           0.00          ~     (all equal)
EncodeTimestamps/100_seq       0.00           0.00          ~     (all equal)
EncodeTimestamps/100_ran       0.00           0.00          ~     (all equal)
EncodeTimestamps/100_dup       0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_seq      0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_ran      0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_dup      0.00           0.00          ~     (all equal)
2018-10-16 12:05:52 +01:00
Edd Robinson de5ca4a108 Batch oriented int encoders
This commit adds a tsm1 function for encoding a batch of ints into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice, a randomly generated input slice and a
duplicate slice:

name                     old time/op    new time/op    delta
EncodeIntegers/10_seq       144ns ± 2%      41ns ± 1%   -71.46%  (p=0.000 n=10+10)
EncodeIntegers/10_ran       304ns ± 7%     140ns ± 2%   -53.99%  (p=0.000 n=10+10)
EncodeIntegers/10_dup       147ns ± 4%      41ns ± 2%   -72.14%  (p=0.000 n=10+9)
EncodeIntegers/100_seq      483ns ± 7%     208ns ± 1%   -56.98%  (p=0.000 n=10+9)
EncodeIntegers/100_ran     1.64µs ± 7%    1.01µs ± 1%   -38.42%  (p=0.000 n=9+9)
EncodeIntegers/100_dup      484ns ±14%     210ns ± 2%   -56.63%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq    3.11µs ± 2%    1.81µs ± 2%   -41.68%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran    16.9µs ±10%    11.0µs ± 2%   -34.58%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup    3.05µs ± 3%    1.81µs ± 2%   -40.71%  (p=0.000 n=10+8)

name                     old alloc/op   new alloc/op   delta
EncodeIntegers/10_seq       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_ran       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_dup       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_seq      32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_ran       128B ± 0%        0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_dup      32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq     32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran    1.15kB ± 0%    0.00kB       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup     32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)

name                     old allocs/op  new allocs/op  delta
EncodeIntegers/10_seq        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_ran        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_dup        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_seq       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_ran       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_dup       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
2018-10-16 12:05:52 +01:00
Edd Robinson 6b52231a37 Batch oriented float encoders
This commit adds a tsm1 function for encoding a batch of floats into a
buffer. Further, it replaces the `bitstream` library used in the
existing encoders (and all the current decoders) with inlined bit
expressions within the encoder, significantly reducing the function call
overhead for larger batches.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice and a randomly generated input slice.

name                   old time/op    new time/op    delta
EncodeFloats/10_seq      1.14µs ± 3%    0.24µs ± 3%  -78.94%  (p=0.000 n=10+10)
EncodeFloats/10_ran      1.69µs ± 2%    0.21µs ± 3%  -87.43%  (p=0.000 n=10+10)
EncodeFloats/100_seq     7.07µs ± 1%    1.72µs ± 1%  -75.62%  (p=0.000 n=7+9)
EncodeFloats/100_ran     15.8µs ± 4%     1.8µs ± 1%  -88.60%  (p=0.000 n=10+9)
EncodeFloats/1000_seq    50.2µs ± 3%    16.2µs ± 2%  -67.66%  (p=0.000 n=10+10)
EncodeFloats/1000_ran     174µs ± 2%      16µs ± 2%  -90.77%  (p=0.000 n=10+10)

name                   old alloc/op   new alloc/op   delta
EncodeFloats/10_seq       0.00B          0.00B          ~     (all equal)
EncodeFloats/10_ran       0.00B          0.00B          ~     (all equal)
EncodeFloats/100_seq      0.00B          0.00B          ~     (all equal)
EncodeFloats/100_ran      0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_ran     0.00B          0.00B          ~     (all equal)

name                   old allocs/op  new allocs/op  delta
EncodeFloats/10_seq        0.00           0.00          ~     (all equal)
EncodeFloats/10_ran        0.00           0.00          ~     (all equal)
EncodeFloats/100_seq       0.00           0.00          ~     (all equal)
EncodeFloats/100_ran       0.00           0.00          ~     (all equal)
EncodeFloats/1000_seq      0.00           0.00          ~     (all equal)
EncodeFloats/1000_ran      0.00           0.00          ~     (all equal)
2018-10-16 12:05:52 +01:00
Edd Robinson 9ecadd1a9c Rename time batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson c1d82fccf0 Rename unsigned batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson 536e7bb62f Rename string batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson d819378ad3 Rename boolean batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson f81e75f4c1 Rename integer batch decoders 2018-10-16 12:05:52 +01:00
Edd Robinson 12bb8881be Rename float batch decoders 2018-10-16 12:05:52 +01:00
Ben Johnson 844b7ef9bf
Merge pull request #10299 from influxdata/bj-tsm1-panic-fix
Fix TSM1 panic on reader error.
2018-10-10 08:12:17 -06:00
ludweeg 5622355526 Simplify s[:] to s where s is a slice 2018-10-04 17:10:21 +03:00
Edd Robinson d649d5928b Cleanup failed TSM snapshot
If there was an error after the cache had been snapshotted to one or
more TSM files, but before the cache and WAL were cleaned up, then the
cache would be repeatedly snapshotted, generating duplicate level 1 TSM
files.

This commit attempts to clean those files up by removing the temporary
TSM file(s). The snapshot will be retried.
2018-10-03 16:34:54 +01:00
Ben Johnson da2dfa495e
Fix TSM1 panic on reader error.
This commit fixes an error check so that a `nil` TSM reader does
not cause a panic.
2018-09-24 08:54:28 -06:00
linxGnu 1dde9a1e12 Update test case 2018-09-14 14:09:24 -07:00
Stuart Carnie a940ebd45a fix(tsm1): Fix FloatBatchDecodeAll to return empty slice and no error
FloatBatchDecodeAll behaves the same as the iterator-based float
decoder, returning an empty slice and no error when passed a buffer
with no encoded float values.

Fixes #10270
2018-09-10 13:59:47 -07:00
Hans Petter Bieker de3a2d657d Fixed indentation. 2018-08-31 11:01:45 +02:00
Hans Petter Bieker 28f5fb4ea5 Prevent rebuilding of series files for shards where nothing is deleted. 2018-08-31 10:51:38 +02:00
Jeff Wendling 4e62c3f795 fix(tsm1): return boolean array iterator for booleans
booleans are still not strings.
2018-08-27 11:32:15 -06:00
Stuart Carnie 2f4fcd8255 chore: Remove BatchCursor references 2018-08-24 11:56:04 -07:00
David Norton 05d979d6b1
Merge pull request #10215 from influxdata/dn-snappy-digests
Switch digests to use snappy compression
2018-08-23 13:17:28 -04:00
David Norton 2f6a1fc03b switch digests to use snappy compression 2018-08-23 13:02:12 -04:00