* feat(backup): `influx backup` creates data backup
* feat(backup): initial restore work
* feat(restore): initial restore impl
Adds a restore tool which does offline restore of data and metadata.
* fix(restore): pr cleanup
* fix(restore): fix data dir creation
* fix(restore): pr cleanup
* chore: amend CHANGELOG
* fix: restore to empty dir fails differently
* feat(backup): backup and restore credentials
Saves the credentials file to backups and restores it from backups.
Additionally adds some logging for errors when fetching backup files.
* fix(restore): add missed commit
* fix(restore): pr cleanup
* fix(restore): fix default credentials restore path
* fix(backup): actually copy the credentials file for the backup
* fix: dirs get 0777, files get 0666
* fix: small review feedback
Co-authored-by: tmgordeeva <tanya@influxdata.com>
This commit adds numerous tests for ascending and descending cursors
that generate merged blocks across multiple files, which exceed the
default fixed buffer size used by the array cursors (MaxPointsPerBlock).
Tests cover two scenarios
1. Each file has one block and the block from the second file is
entirely contained within the first block of the first file.
When merging, the new block is 1200 values, which exceeds the
MaxPointsPerBlock.
2. Each file has multiple blocks, and the blocks have a mixture of
values which interleave and overwrite.
This commit prevents multiple blocks for the same series key having
values truncated when they are being read into an empty buffer.
The current cursor reader code has an optimisation that incorrectly
assumes the incoming array will be limited to 1,000 values (the maximum
block size), but arrays can contain values from multiple matching
blocks.
Fixes#15817
This commit addresses several data-races on the `tsm1.Predicate` type
that were causing a live-lock or similar in rare cases during a delete.
Because `tsm1/FileStore.Apply` executes concurrently across TSM files
the state of the delete's predicate was being unsafely mutated.
This commit adds a `Clone` method to the `influxdb.Predicate` type,
which should be used whenever an `influxdb.Predicate` implementation
needs to be used concurrently.
* chore: Remove several instances of WithLogger
* chore: unexport Logger fields
* chore: unexport some more Logger fields
* chore: go fmt
chore: fix test
chore: s/logger/log
chore: fix test
chore: revert http.Handler.Handler constructor initialization
* refactor: integrate review feedback, fix all test nop loggers
* refactor: capitalize all log messages
* refactor: rename two logger to log
Fixes#15916.
If a predicate was passed in with multiple key/value matches for the
same tag key, then the value index would be incorrect. This ensures that
each tag key can only be added to the location map once.
Fixes#15859
This commit fixes a defect in the TSI index where a filter using the
negated equality operator would result in no matching series being
returned for series stored within the `IndexFile` portions of the index.
The root cause of this was due to missing legacy-handling code in the
index for this particular iterator.
* fix(storage): add failing test for array cursor iterator stats
* fix(storage): make arrayCursorIterator.Stats() return stats of in-focus cursor
* fix(storage): add failing test to assert arrayCursorIterator.Stats() returns accumulated result
* fix(storage): assumulate stats in arrayCursorIterator.Stats() call across all observed cursors
By default this feature is disabled; the full compaction behaviour does
not change. When this feature is enabled compactions can be limited
across multiple storage engines running in multiple processes.
The mechanism by which this happens is not part of the abstraction added
here.
Previously the TSI partition would panic if a compaction was
started while `Wait()` was waiting. This commit removes the previous
wait group and replaces it with a simple counter. The `Wait()`
function now polls the counter until it reaches zero.
The cache is essentially a set of maps, where a key in each map is a
series key, and the value is a slice of values associated with that key.
The cache is sharded and series keys are hashed to determine which shard
(map) they live in.
When deleting from the cache we have to check each key to see if it
matches the delete command (predicate and timestamp). If it does then
the entries for that range are removed. As part of this work we check if
the entries are already empty (already removed) and if so we don't check
if the key is valid.
This involved a lot of mutex grabbing, which has now been replaced with
atomic operations.
Benchmarking this commit against the previous commit in this branch
shows a 9% improvement:
name old time/op new time/op delta
Engine_DeletePrefixRange_Cache/exists-24 113ms ± 8% 102ms ±11% -9.40% (p=0.000 n=10+10)
Engine_DeletePrefixRange_Cache/not_exists-24 95.6ms ± 2% 97.1ms ± 4% ~ (p=0.089 n=10+10)
name old alloc/op new alloc/op delta
Engine_DeletePrefixRange_Cache/exists-24 29.6MB ± 1% 25.5MB ± 1% -13.71% (p=0.000 n=10+10)
Engine_DeletePrefixRange_Cache/not_exists-24 24.3MB ± 2% 23.9MB ± 1% -1.48% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Engine_DeletePrefixRange_Cache/exists-24 334k ± 0% 305k ± 1% -8.67% (p=0.000 n=8+10)
Engine_DeletePrefixRange_Cache/not_exists-24 302k ± 1% 299k ± 1% -1.25% (p=0.000 n=10+9)
Raw benchmarks on a 24T / 32GB / NVME machine:
goos: linux
goarch: amd64
pkg: github.com/influxdata/influxdb/tsdb/tsm1
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 200 91035525 ns/op 25557809 B/op 305258 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 200 99416796 ns/op 25385052 B/op 303584 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 100 100149484 ns/op 25570062 B/op 305761 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 100 100222516 ns/op 25474372 B/op 303089 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 200 101868258 ns/op 25531572 B/op 304736 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 100 106268683 ns/op 25648213 B/op 306768 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 100 102905477 ns/op 25572314 B/op 305798 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 100 108742857 ns/op 25483068 B/op 304788 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 100 103292149 ns/op 25401388 B/op 303401 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/exists-24 100 107178026 ns/op 25573602 B/op 305821 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 95082692 ns/op 23942491 B/op 299116 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 96088487 ns/op 23957028 B/op 298545 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 94279165 ns/op 23620981 B/op 294536 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 94509000 ns/op 23989593 B/op 299453 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 98530062 ns/op 23935846 B/op 299237 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 98008093 ns/op 23821683 B/op 297875 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 97603172 ns/op 23878336 B/op 298350 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 96867920 ns/op 23782588 B/op 296236 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 200 99148908 ns/op 23997702 B/op 299277 allocs/op
BenchmarkEngine_DeletePrefixRange_Cache/not_exists-24 100 100866840 ns/op 24019916 B/op 300339 allocs/op
PASS
ok github.com/influxdata/influxdb/tsdb/tsm1 1144.213s