Commit Graph

410 Commits (9d3294b31cec53310bf8accf5ce4ab45ffc68345)

Author SHA1 Message Date
Ben Johnson 7d72b4e511 feat(tsdb): Bulk delete series performance improvement 2020-03-18 15:47:35 -06:00
Edd Robinson d96cbd4f74
Merge pull request #17016 from influxdata/er-bulk-import
feat(storage): prototype 1.x–2.x migration tooling
2020-03-18 17:57:26 +00:00
Jacob Marble 679215de97
chore: Revert "refactor(tsdb): remove read from unexported field (#17279)" (#17305)
This reverts commit 0ec2b453b9.

Fixes panic.
2020-03-16 17:48:01 -07:00
Jacob Marble 0ec2b453b9
refactor(tsdb): remove read from unexported field (#17279)
* refactor(tsdb): remove read from unexported field

* fix(tsdb): add regression test to check for panic

* fix(tsdb): detect nil without panic
2020-03-16 14:26:14 -07:00
Jacob Marble 386098da36
refactor(storage): move and remove to help cleanup tsdb package (#17275)
* refactor(tsdb): move series file config to seriesfile package

* refactor(tsdb): removed unchecked const EOF

* refactor(tsdb): unexport errors

* refactor(tsdb): remove unused TagValueIterators

* refactor(tsdb): remove SeriesIDIterator usage in tsdb/seriesfile

* refactor(tsdb): remove one-use MeasurementIterators

* refactor(tsdb): remove unused type measurementSliceIterator

* refactor(tsdb): remove unused types TagKeyIterators and tagKeySliceIterator

* refactor(storage): remove unused method Engine.ApplyFnToSeriesIDSet

* refactor(tsdb): rename AllSeriesIDs() -> SeriesIDs()
2020-03-16 12:23:15 -07:00
Jacob Marble 7dbc07beda
chore: Revert "refactor(storage): move and remove to help cleanup tsdb package (#17241)" (#17272)
This reverts commit 4b8a71b97f.

Fixes incident #inc-aws-error-rate-spi-5e6c1423
2020-03-13 17:14:51 -07:00
Jacob Marble 4b8a71b97f
refactor(storage): move and remove to help cleanup tsdb package (#17241)
* refactor(tsdb): move series file config to seriesfile package

* refactor(tsdb): removed unchecked const EOF

* refactor(tsdb): unexport errors

* refactor(tsdb): remove unused TagValueIterators

* refactor(tsdb): remove SeriesIDIterator usage in tsdb/seriesfile

* refactor(tsdb): remove one-use MeasurementIterators

* refactor(tsdb): remove unused type measurementSliceIterator

* refactor(tsdb): remove unused types TagKeyIterators and tagKeySliceIterator

* refactor(storage): remove unused method Engine.ApplyFnToSeriesIDSet

* refactor(tsdb): remove read from unexported field
2020-03-13 13:04:58 -07:00
Edd Robinson 5b437a2966 refactor: fix build 2020-03-13 15:24:53 +00:00
Edd Robinson 08add490e0 fix: ensure buckets are created properly 2020-03-13 11:00:28 +00:00
Edd Robinson bbe40aeb82 feat: prototype 1.x - 2.x migration tool 2020-03-13 11:00:28 +00:00
Jacob Marble 26ca766459
refactor(tsdb): move series file to its own package (#17224)
* refactor(storage): move type ByTagKey to the only package that uses it

* refactor(tsdb): use types in tsdb/cursors

* refactor(tsdb): remove unused type SeriesIDElems

* refactor(tsdb): inline only use of tsdb.ReadAllSeriesIDIterator

* refactor(tsdb): move series file to its own package

* refactor(storage): remove platform->influxdb aliases
2020-03-12 11:32:52 -07:00
Jacob Marble cdbf532f57
refactor(storage): remove dead code and rename a few things (#17217)
* refactor(storage): remove CursorIterators type

* refactor(storage): remove unused tsdb.MarshalTags()

* refactor(storage): remove unused package tsdb/internal

* refactor(storage): rename tsdb/metrics.go to tsdb/series_file_metrics.go

* refactor(storage): remove unused type tagValueSliceIterator

* refactor(storage): rename field row to seriesRow

* refactor(storage): rename tsdb/index.go to tsdb/series_iterators.go
2020-03-12 10:45:48 -07:00
Jacob Marble b91e3f36ab
refactor(hll): remove unused Sketch interface (#17218) 2020-03-12 08:59:05 -07:00
Ben Johnson 627b6f86bb feat(storage): Series file compaction 2020-03-11 19:31:58 -06:00
Ben Johnson ce47e57089 fix(tsdb): Fix predicate clone 2020-02-04 10:12:26 -07:00
Jacob Marble b836ab9c17
feat(storage): implement backup and restore (#16504)
* feat(backup): `influx backup` creates data backup

* feat(backup): initial restore work

* feat(restore): initial restore impl

Adds a restore tool which does offline restore of data and metadata.

* fix(restore): pr cleanup

* fix(restore): fix data dir creation

* fix(restore): pr cleanup

* chore: amend CHANGELOG

* fix: restore to empty dir fails differently

* feat(backup): backup and restore credentials

Saves the credentials file to backups and restores it from backups.

Additionally adds some logging for errors when fetching backup files.

* fix(restore): add missed commit

* fix(restore): pr cleanup

* fix(restore): fix default credentials restore path

* fix(backup): actually copy the credentials file for the backup

* fix: dirs get 0777, files get 0666

* fix: small review feedback

Co-authored-by: tmgordeeva <tanya@influxdata.com>
2020-01-21 14:22:45 -08:00
Stuart Carnie 13a248a4fb
fix(tsm1): Add multiple unit tests to verify correctness
This commit adds numerous tests for ascending and descending cursors
that generate merged blocks across multiple files, which exceed the
default fixed buffer size used by the array cursors (MaxPointsPerBlock).

Tests cover two scenarios:

1. Each file has one block and the block from the second file is
   entirely contained within the first block of the first file.
   When merging, the new block is 1200 values, which exceeds the
   MaxPointsPerBlock.

2. Each file has multiple blocks, and the blocks have a mixture of
   values which interleave and overwrite.
2020-01-19 22:53:58 -07:00
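The first scenario in commit 13a248a4fb (a merge whose result is larger than one block) can be sketched as follows. This is only an illustration: the `point` type, the `mergeBlocks` function and the lower-case `maxPointsPerBlock` constant are hypothetical names, not the tsm1 cursor code.

```go
package main

// maxPointsPerBlock mirrors the 1,000-value block size mentioned in this log;
// the name is illustrative and is not the tsm1 constant itself.
const maxPointsPerBlock = 1000

// point is a hypothetical timestamp/value pair.
type point struct {
	ts int64
	v  float64
}

// mergeBlocks merges two blocks that are each sorted by timestamp. When both
// blocks contain the same timestamp, the value from newer overwrites the one
// from older.
func mergeBlocks(older, newer []point) []point {
	out := make([]point, 0, len(older)+len(newer))
	i, j := 0, 0
	for i < len(older) && j < len(newer) {
		switch {
		case older[i].ts < newer[j].ts:
			out = append(out, older[i])
			i++
		case older[i].ts > newer[j].ts:
			out = append(out, newer[j])
			j++
		default: // same timestamp: the newer value wins
			out = append(out, newer[j])
			i++
			j++
		}
	}
	out = append(out, older[i:]...)
	return append(out, newer[j:]...)
}
```

Merging a 1,000-point block with a 200-point block whose timestamps fall between those of the first yields 1,200 points, so a reader of the merged result cannot assume a buffer capped at `maxPointsPerBlock`.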
Edd Robinson 91551302f9 fix(storage): ensure all block data returned
This commit prevents multiple blocks for the same series key having
values truncated when they are being read into an empty buffer.

The current cursor reader code has an optimisation that incorrectly
assumes the incoming array will be limited to 1,000 values (the maximum
block size), but arrays can contain values from multiple matching
blocks.
2020-01-19 22:03:20 +00:00
Edd Robinson f11504b987 fix(storage): prevent infinite loop in matcher
Fixes #15817

This commit addresses a potential infinite loop, caused
by series keys that contain a certain pattern of escaped
characters.
2020-01-14 15:05:07 +00:00
Edd Robinson a06dc0fd7f fix(storage): prevent data-races on predicate
Fixes #15817

This commit addresses several data-races on the `tsm1.Predicate` type
that were causing a live-lock or similar in rare cases during a delete.

Because `tsm1/FileStore.Apply` executes concurrently across TSM files
the state of the delete's predicate was being unsafely mutated.

This commit adds a `Clone` method to the `influxdb.Predicate` type,
which should be used whenever an `influxdb.Predicate` implementation
needs to be used concurrently.
2020-01-09 10:00:25 +00:00
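The clone-per-goroutine pattern described in commit a06dc0fd7f might look roughly like this. The `prefixPredicate` type and `countMatches` helper are hypothetical stand-ins that only assume a predicate keeps mutable scratch state between calls; they are not the `tsm1.Predicate` implementation.

```go
package main

import (
	"bytes"
	"sync"
)

// prefixPredicate is a hypothetical predicate that reuses a scratch buffer
// between calls, which makes a single instance unsafe to share.
type prefixPredicate struct {
	prefix  []byte
	scratch []byte
}

// Matches mutates p.scratch, so concurrent calls on one instance would race.
func (p *prefixPredicate) Matches(key []byte) bool {
	p.scratch = append(p.scratch[:0], key...)
	return bytes.HasPrefix(p.scratch, p.prefix)
}

// Clone returns an independent copy with its own scratch state.
func (p *prefixPredicate) Clone() *prefixPredicate {
	return &prefixPredicate{prefix: append([]byte(nil), p.prefix...)}
}

// countMatches checks each file's keys in its own goroutine. Every goroutine
// receives a clone, so no predicate state is shared between them.
func countMatches(p *prefixPredicate, files [][][]byte) int {
	var wg sync.WaitGroup
	counts := make([]int, len(files))
	for i, keys := range files {
		wg.Add(1)
		go func(i int, keys [][]byte, p *prefixPredicate) {
			defer wg.Done()
			for _, k := range keys {
				if p.Matches(k) {
					counts[i]++
				}
			}
		}(i, keys, p.Clone())
	}
	wg.Wait()
	total := 0
	for _, c := range counts {
		total += c
	}
	return total
}
```

Passing `p` itself to each goroutine instead of `p.Clone()` would reintroduce the kind of unsynchronized mutation the commit message describes.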
Jacob Marble 5f19c6cace
chore: Remove several instances of WithLogger (#15996)
* chore: Remove several instances of WithLogger

* chore: unexport Logger fields

* chore: unexport some more Logger fields

* chore: go fmt

chore: fix test

chore: s/logger/log

chore: fix test

chore: revert http.Handler.Handler constructor initialization

* refactor: integrate review feedback, fix all test nop loggers

* refactor: capitalize all log messages

* refactor: rename two logger to log
2019-12-04 15:10:23 -08:00
Edd Robinson 2f86815f83 fix(storage): ensure field is 64-bit aligned 2019-11-22 13:44:58 +00:00
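For context on commit 2f86815f83: on 32-bit platforms Go only guarantees 64-bit alignment for the first word of an allocated struct, so a 64-bit field used with `sync/atomic` is usually placed first. A minimal sketch, assuming the field in question is accessed atomically (the commit message does not say so) and using a hypothetical `stats` type:

```go
package main

import "sync/atomic"

// stats is a hypothetical struct; it is not taken from the storage package.
type stats struct {
	// writes is declared first so that it is 64-bit aligned even on 32-bit
	// architectures, as the sync/atomic documentation requires.
	writes int64
	open   bool
}

// incr atomically bumps the counter and returns the new value.
func (s *stats) incr() int64 {
	return atomic.AddInt64(&s.writes, 1)
}
```

If `open` were declared before `writes`, the counter could land on a 4-byte boundary on such platforms and the atomic call could panic at run time.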
Edd Robinson 7146af61b0 fix(storage): enable package to build on 32-bit arch 2019-11-22 12:55:20 +00:00
Edd Robinson 2471c2468c fix(storage): fixes panic when building predicates
Fixes #15916.

If a predicate was passed in with multiple key/value matches for the
same tag key, then the value index would be incorrect. This ensures that
each tag key can only be added to the location map once.
2019-11-15 15:07:36 +00:00
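The "add each tag key to the location map only once" rule from commit 2471c2468c can be illustrated as below. The `buildLocations` function and its return values are assumptions made for this sketch; the actual predicate builder is not shown in this log.

```go
package main

// buildLocations assigns each distinct tag key a stable index. A key that
// appears more than once (for example host=a OR host=b) keeps the index it
// was given the first time, so indexes stay in step with the unique list.
func buildLocations(tagKeys []string) (map[string]int, []string) {
	locs := make(map[string]int)
	var unique []string
	for _, k := range tagKeys {
		if _, seen := locs[k]; seen {
			continue // already located; do not add it again
		}
		locs[k] = len(unique)
		unique = append(unique, k)
	}
	return locs, unique
}
```

Without the `seen` check, a repeated key would be given a new index while the list of keys grew out of step with the map, which is one way an index could end up pointing at the wrong value.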
Edd Robinson 0dd2d38eac fix(tsi1): index defect with negated equality filters
Fixes #15859

This commit fixes a defect in the TSI index where a filter using the
negated equality operator would result in no matching series being
returned for series stored within the `IndexFile` portions of the index.

The root cause was missing legacy-handling code in the index for this
particular iterator.
2019-11-12 13:26:23 +00:00
George 3804d50fbd
fix(storage): array cursor iterator should return stats of all observed cursors (#15731)
* fix(storage): add failing test for array cursor iterator stats

* fix(storage): make arrayCursorIterator.Stats() return stats of in-focus cursor

* fix(storage): add failing test to assert arrayCursorIterator.Stats() returns accumulated result

* fix(storage): accumulate stats in arrayCursorIterator.Stats() call across all observed cursors
2019-11-05 10:41:06 +01:00
Christopher Wolff 04bc7bf76b test(tsdb): skip flaky test
https://github.com/influxdata/influxdb/issues/15220
2019-10-30 10:40:03 -07:00
Edd Robinson dc78d7c0eb
Merge pull request #14373 from zhulongcheng/add-missing-err
fix(tsdb): add missing err in SeriesPartition.Open
2019-10-24 13:13:32 +01:00
Edd Robinson 2727ae3c25 refactor: simplify Semaphore interface 2019-10-23 19:49:48 +01:00
Edd Robinson b6e911d72c refactor: move goroutine out to function 2019-10-23 19:49:46 +01:00
Edd Robinson 8f6701d4b1 feat(storage): add full compaction semaphore
By default this feature is disabled; the full compaction behaviour does
not change. When this feature is enabled compactions can be limited
across multiple storage engines running in multiple processes.

The mechanism by which this happens is not part of the abstraction added
here.
2019-10-23 19:45:01 +01:00
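A rough sketch of the idea behind commit 8f6701d4b1: the engine depends on a small semaphore abstraction, with a no-op default that leaves behaviour unchanged. The interface shape, the type names and `runFullCompaction` are all assumptions for illustration; the channel-based limiter below only works within one process, whereas the cross-process mechanism is explicitly outside the abstraction.

```go
package main

// Semaphore is a hypothetical interface; the real one may differ.
type Semaphore interface {
	TryAcquire() bool
	Release()
}

// nopSemaphore always grants access, modelling "disabled by default".
type nopSemaphore struct{}

func (nopSemaphore) TryAcquire() bool { return true }
func (nopSemaphore) Release()         {}

// chanSemaphore allows at most cap(s) concurrent holders.
type chanSemaphore chan struct{}

func newChanSemaphore(n int) chanSemaphore { return make(chanSemaphore, n) }

// TryAcquire takes a slot if one is free and never blocks.
func (s chanSemaphore) TryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release frees a slot previously taken by TryAcquire.
func (s chanSemaphore) Release() { <-s }

// runFullCompaction runs compact only if the semaphore grants a slot and
// reports whether the compaction ran.
func runFullCompaction(sem Semaphore, compact func()) bool {
	if !sem.TryAcquire() {
		return false
	}
	defer sem.Release()
	compact()
	return true
}
```

Because the engine only sees the interface, a different implementation can be injected without changing the compaction code.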
Edd Robinson ef1e15a0ad
Merge pull request #15318 from influxdata/er-mv-comp-limiter
feat(storage): allow compaction limiter to be injected into engine
2019-10-09 13:11:44 +01:00
Ilya Sevostyanov 596414a3ff
fix(storage): added missing string values for CacheStatus type.
Closes: #15284.
2019-10-04 23:50:21 +03:00
Edd Robinson 179c57ab2e feat(storage): allow compaction limiter to be injected 2019-10-04 12:35:21 -07:00
elbehery 663d4bb901 test(tasks): skip flaky test 2019-09-25 18:17:59 +02:00
elbehery c0b87c657c fix(storage): remove level=0 from TSM disk bytes metrics. 2019-09-25 15:57:25 +02:00
Lorenzo Affetti 053836e5a5
Merge pull request #15203 from influxdata/flux-staging-v0.48.x
build(flux): update to Flux v0.48.0
2019-09-20 18:24:02 +02:00
Edd Robinson d714be45a4
Merge pull request #15200 from influxdata/er-retention-service
refactor(storage): add more context to traces and logs
2019-09-20 09:00:00 +01:00
Lorenzo Affetti ab835c8e0e
refactor(dependencies): use new dependency injection framework (#15174)
refactor(dependencies): use new dependency injection framework
2019-09-19 17:01:17 +02:00
Edd Robinson e2f5b2bd9d refactor(storage): add more context to traces and logs 2019-09-19 13:48:06 +01:00
Stuart Carnie 9a89900785
fix(tsm1): Fix duplicate points
All seeks must be added to the c.current slice so the
min and max read values can be updated on each read pass.
2019-09-18 17:44:27 -07:00
Ben Johnson ee3cf79ae7
fix(tsdb): Fix pull request feedback. 2019-09-13 10:00:54 -06:00
Ben Johnson d08403b658
feat(tsdb): Add SQL export for TSI indexes 2019-09-13 10:00:54 -06:00
Mark Rushakoff c2f847299c ci: use latest staticcheck
We were still referring to megacheck in tools.go; this confused
dependent projects also using staticcheck.
2019-09-04 16:34:45 -07:00
Ben Johnson 9237ee6a40
fix(tsi1): Remove TSI cardinality stats cache 2019-09-04 14:48:22 -06:00
Edd Robinson 030083e1a3 perf(storage): optimistically check compactions 2019-09-04 17:38:13 +01:00
Ben Johnson 729558d64b
fix(tsdb): Replace TSI compaction wait group with counter.
Previously the TSI partition would panic if a compaction was
started while `Wait()` was waiting. This commit removes the previous
wait group and replaces it with a simple counter. The `Wait()`
function now polls the counter until it reaches zero.
2019-09-02 09:37:35 -06:00
Edd Robinson 7efb73930b refactor: address PR feedback 2019-08-30 21:07:32 +01:00
Edd Robinson 2e5ebbe251 perf(storage): reduce allocations when deleting from cache
When deleting from the cache, each cache key must be checked to
determine if it matches the prefix we're deleting. Since the keys are
stored as strings in the cache (map keys) there were a lot of allocations
happening because `applySerial` expects `[]byte` keys.

It's beneficial to reduce allocations by refactoring `applySerial` to work
on strings. Whilst some allocations now have to happen the other way
(string -> []byte), they only happen if we actually need to delete the
key from the cache. Most of the keys don't get deleted, so this approach
allocates far less overall.

Performance on the benchmark from the previous commit improved by ~40-50%.

name                                          old time/op    new time/op    delta
Engine_DeletePrefixRange_Cache/exists-24         102ms ±11%      59ms ± 3%  -41.95%  (p=0.000 n=10+8)
Engine_DeletePrefixRange_Cache/not_exists-24    97.1ms ± 4%    45.0ms ± 1%  -53.66%  (p=0.000 n=10+10)

name                                          old alloc/op   new alloc/op   delta
Engine_DeletePrefixRange_Cache/exists-24        25.5MB ± 1%     3.1MB ± 2%  -87.83%  (p=0.000 n=10+10)
Engine_DeletePrefixRange_Cache/not_exists-24    23.9MB ± 1%     0.1MB ±86%  -99.65%  (p=0.000 n=10+10)

name                                          old allocs/op  new allocs/op  delta
Engine_DeletePrefixRange_Cache/exists-24          305k ± 1%       28k ± 1%  -90.77%  (p=0.000 n=10+10)
Engine_DeletePrefixRange_Cache/not_exists-24      299k ± 1%        1k ±63%  -99.74%  (p=0.000 n=9+10)

Raw benchmarks on a 24T/32GB/NVME machine are as follows:

goos: linux
goarch: amd64
pkg: github.com/influxdata/influxdb/tsdb/tsm1
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     300	  50379720 ns/op	 3054106 B/op	   27859 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     300	  57326032 ns/op	 3124764 B/op	   28217 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     300	  58943855 ns/op	 3162146 B/op	   28527 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     300	  60565115 ns/op	 3138811 B/op	   28176 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     200	  59775969 ns/op	 3087910 B/op	   27921 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     300	  59530451 ns/op	 3120986 B/op	   28207 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     300	  59185532 ns/op	 3113066 B/op	   28302 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     300	  59295867 ns/op	 3100832 B/op	   28108 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     300	  59599776 ns/op	 3100686 B/op	   28113 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     200	  62065907 ns/op	 3048527 B/op	   27879 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  44979062 ns/op	  123026 B/op	    1244 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  44733344 ns/op	   52650 B/op	     479 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  44534180 ns/op	   35119 B/op	     398 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  45179881 ns/op	  105256 B/op	     706 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  44918964 ns/op	   47426 B/op	     621 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  45000465 ns/op	   63164 B/op	     564 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  45332999 ns/op	  117008 B/op	    1146 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  45652342 ns/op	   66221 B/op	     616 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  45083957 ns/op	  154354 B/op	    1143 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     300	  44560228 ns/op	   65024 B/op	     724 allocs/op
PASS
ok  	github.com/influxdata/influxdb/tsdb/tsm1	1690.583s
2019-08-30 20:35:05 +01:00
Edd Robinson eba4dec7e6 perf(storage): reduce lock contention on Cache entries
The cache is essentially a set of maps, where a key in each map is a
series key, and the value is a slice of values associated with that key.
The cache is sharded and series keys are hashed to determine which shard
(map) they live in.

When deleting from the cache we have to check each key to see if it
matches the delete command (predicate and timestamp). If it does then
the entries for that range are removed. As part of this work we check if
the entries are already empty (already removed) and if so we don't check
if the key is valid.

This involved a lot of mutex grabbing, which has now been replaced with
atomic operations.

Benchmarking this commit against the previous commit in this branch
shows a 9% improvement:

name                                          old time/op    new time/op    delta
Engine_DeletePrefixRange_Cache/exists-24         113ms ± 8%     102ms ±11%   -9.40%  (p=0.000 n=10+10)
Engine_DeletePrefixRange_Cache/not_exists-24    95.6ms ± 2%    97.1ms ± 4%     ~     (p=0.089 n=10+10)

name                                          old alloc/op   new alloc/op   delta
Engine_DeletePrefixRange_Cache/exists-24        29.6MB ± 1%    25.5MB ± 1%  -13.71%  (p=0.000 n=10+10)
Engine_DeletePrefixRange_Cache/not_exists-24    24.3MB ± 2%    23.9MB ± 1%   -1.48%  (p=0.000 n=10+10)

name                                          old allocs/op  new allocs/op  delta
Engine_DeletePrefixRange_Cache/exists-24          334k ± 0%      305k ± 1%   -8.67%  (p=0.000 n=8+10)
Engine_DeletePrefixRange_Cache/not_exists-24      302k ± 1%      299k ± 1%   -1.25%  (p=0.000 n=10+9)

Raw benchmarks on a 24T / 32GB / NVME machine:

goos: linux
goarch: amd64
pkg: github.com/influxdata/influxdb/tsdb/tsm1
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     200	  91035525 ns/op	25557809 B/op	  305258 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     200	  99416796 ns/op	25385052 B/op	  303584 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     100	 100149484 ns/op	25570062 B/op	  305761 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     100	 100222516 ns/op	25474372 B/op	  303089 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     200	 101868258 ns/op	25531572 B/op	  304736 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     100	 106268683 ns/op	25648213 B/op	  306768 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     100	 102905477 ns/op	25572314 B/op	  305798 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     100	 108742857 ns/op	25483068 B/op	  304788 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     100	 103292149 ns/op	25401388 B/op	  303401 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24         	     100	 107178026 ns/op	25573602 B/op	  305821 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  95082692 ns/op	23942491 B/op	  299116 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  96088487 ns/op	23957028 B/op	  298545 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  94279165 ns/op	23620981 B/op	  294536 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  94509000 ns/op	23989593 B/op	  299453 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  98530062 ns/op	23935846 B/op	  299237 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  98008093 ns/op	23821683 B/op	  297875 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  97603172 ns/op	23878336 B/op	  298350 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  96867920 ns/op	23782588 B/op	  296236 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     200	  99148908 ns/op	23997702 B/op	  299277 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24     	     100	 100866840 ns/op	24019916 B/op	  300339 allocs/op
PASS
ok  	github.com/influxdata/influxdb/tsdb/tsm1	1144.213s
2019-08-30 20:35:05 +01:00
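The sharded layout and the atomic emptiness check described in commit eba4dec7e6 might be sketched as follows. The shard count, the FNV hash and all type names are assumptions for illustration; the real cache may hash and store entries differently.

```go
package main

import (
	"hash/fnv"
	"sync"
	"sync/atomic"
)

const numShards = 16 // assumed shard count

// entry holds the values for one series key.
type entry struct {
	count  int64 // number of values; kept first for 64-bit alignment
	mu     sync.Mutex
	values []float64
}

// add appends a value under the mutex and then publishes the new count.
func (e *entry) add(v float64) {
	e.mu.Lock()
	e.values = append(e.values, v)
	e.mu.Unlock()
	atomic.AddInt64(&e.count, 1)
}

// empty reports whether the entry has no values without taking the mutex.
func (e *entry) empty() bool { return atomic.LoadInt64(&e.count) == 0 }

// shard is one map of series key to entry.
type shard struct {
	mu      sync.RWMutex
	entries map[string]*entry
}

type cache struct {
	shards [numShards]shard
}

func newCache() *cache {
	c := &cache{}
	for i := range c.shards {
		c.shards[i].entries = make(map[string]*entry)
	}
	return c
}

// shardFor hashes the series key to pick the shard it lives in.
func (c *cache) shardFor(key string) *shard {
	h := fnv.New64a()
	h.Write([]byte(key))
	return &c.shards[h.Sum64()%numShards]
}

// write stores a value for key, creating the entry if needed.
func (c *cache) write(key string, v float64) {
	s := c.shardFor(key)
	s.mu.Lock()
	e := s.entries[key]
	if e == nil {
		e = &entry{}
		s.entries[key] = e
	}
	s.mu.Unlock()
	e.add(v)
}
```

A delete pass can call `empty()` on each entry and skip the ones that are already cleared without acquiring the entry's mutex, which is the contention the commit message targets.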