This commit adds ref counting for the files that we pull tag keys from.
Previously, files were only ref counted while tag keys were being
extracted; this commit extends the ref count to cover the full lifetime
of the `Engine.tagKeysNoPredicate()` function.
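A minimal sketch of the pattern, assuming hypothetical `Ref`/`Unref` methods (the real engine ref counts its TSM readers through its own types):

```go
package main

import "fmt"

// file is a stand-in for a TSM file that supports ref counting; a real
// implementation would use atomic counters.
type file struct {
	name string
	refs int
}

func (f *file) Ref()   { f.refs++ }
func (f *file) Unref() { f.refs-- }

// tagKeys holds a reference to every file for the full lifetime of the
// call, not just while the keys are being extracted.
func tagKeys(files []*file) []string {
	for _, f := range files {
		f.Ref()
		defer f.Unref() // released when the function returns, not earlier
	}

	var keys []string
	for _, f := range files {
		// ... extract tag keys from f ...
		keys = append(keys, f.name+":key")
	}
	return keys
}

func main() {
	fmt.Println(tagKeys([]*file{{name: "a.tsm"}, {name: "b.tsm"}}))
}
```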
This commit:
* adds new request and response data types for schema gRPC calls
* adds a `fmt.Stringer` implementation to `cursors.FieldType`
* adds APIs to sort a slice of `MeasurementField` values (see the
  sketch after this list)
* upgrades the gogo protobuf package to v1.3.1, which
  includes improvements to serialization.
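A sketch of the two API additions; the names below mirror the shape of `cursors.FieldType` and `MeasurementField` but are illustrative, not the exact definitions:

```go
package main

import (
	"fmt"
	"sort"
)

// FieldType mirrors the shape of cursors.FieldType; the names and
// ordering here are illustrative.
type FieldType int

const (
	Float FieldType = iota
	Integer
	String
)

// String implements fmt.Stringer, so a FieldType prints as a name
// rather than a bare integer.
func (t FieldType) String() string {
	switch t {
	case Float:
		return "float"
	case Integer:
		return "integer"
	case String:
		return "string"
	default:
		return "undefined"
	}
}

// MeasurementField pairs a field key with its type.
type MeasurementField struct {
	Key  string
	Type FieldType
}

func main() {
	fields := []MeasurementField{{"b", Integer}, {"a", Float}}
	// Sort by key; the real package exposes dedicated sort APIs for this.
	sort.Slice(fields, func(i, j int) bool { return fields[i].Key < fields[j].Key })
	for _, f := range fields {
		fmt.Printf("%s: %s\n", f.Key, f.Type)
	}
}
```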
This commit adds a new API to `Cache` to address data races
with the `TagKeys` and `TagValues` APIs.
`Cache` and `entry` provide `AppendTimestamps`, which
appends the current timestamps to the provided slice
to reduce allocations. As noted in the documentation,
it is the responsibility of the caller to sort and deduplicate
the values, if required.
The `cursors.TimestampArray` type was extended to permit
use of the `sort.Sort` API.
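A sketch of the caller's side, assuming a stand-in `TimestampArray` with the same `sort.Interface` shape; sorting and in-place deduplication happen after the timestamps have been appended:

```go
package main

import (
	"fmt"
	"sort"
)

// TimestampArray stands in for cursors.TimestampArray; it implements
// sort.Interface so callers can use sort.Sort directly.
type TimestampArray struct {
	Timestamps []int64
}

func (a *TimestampArray) Len() int           { return len(a.Timestamps) }
func (a *TimestampArray) Less(i, j int) bool { return a.Timestamps[i] < a.Timestamps[j] }
func (a *TimestampArray) Swap(i, j int) {
	a.Timestamps[i], a.Timestamps[j] = a.Timestamps[j], a.Timestamps[i]
}

func main() {
	a := &TimestampArray{}

	// Stand-in for AppendTimestamps: the provided slice is reused to
	// avoid allocations; sorting and deduplication are the caller's job.
	a.Timestamps = append(a.Timestamps, 30, 10, 20, 10)

	sort.Sort(a)

	// Deduplicate in place after sorting, as the documentation requires.
	ts := a.Timestamps[:0]
	for i, v := range a.Timestamps {
		if i == 0 || v != ts[len(ts)-1] {
			ts = append(ts, v)
		}
	}
	fmt.Println(ts) // [10 20 30]
}
```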
This commit introduces a new API for finding the maximum
timestamp of a series when iterating over the keys in a
set of TSM files.
This API will be used to determine the field type of a single
field key by selecting the series with the maximum timestamp.
It also refactors the common functionality for iterating over TSM
keys into `timeRangeBlockReader`, which is shared
between `TimeRangeIterator` and `TimeRangeMaxTimeIterator`.
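A toy illustration of the selection rule (types and names here are illustrative, not the iterator's API): the series with the greatest timestamp decides the field type.

```go
package main

import "fmt"

// entry is a stand-in for a series key plus the maximum timestamp of
// its values and the series' field type.
type entry struct {
	key     string
	maxTime int64
	typ     string
}

// fieldTypeByMaxTime picks the type of the series with the greatest
// timestamp, mirroring how the new iterator is intended to be used to
// resolve the type of a single field key.
func fieldTypeByMaxTime(entries []entry) string {
	if len(entries) == 0 {
		return ""
	}
	best := entries[0]
	for _, e := range entries[1:] {
		if e.maxTime > best.maxTime {
			best = e
		}
	}
	return best.typ
}

func main() {
	fmt.Println(fieldTypeByMaxTime([]entry{
		{"cpu,host=a#!~#usage", 100, "float"},
		{"cpu,host=b#!~#usage", 250, "integer"}, // newest wins
	}))
}
```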
These APIs require a measurement, permitting an additional optimization
to reduce the search space against the TSM index. Specifically, the
search key prefix is extended from `org+bucket` to
`org+bucket,\x00=<measurement>`
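A minimal sketch of the prefix extension, using printable strings where the real code uses binary org/bucket IDs:

```go
package main

import "fmt"

// searchPrefix sketches the optimization: when the measurement is known,
// the prefix is extended with the reserved \x00 measurement tag so less
// of the TSM index is scanned.
func searchPrefix(orgBucket, measurement string) string {
	if measurement == "" {
		return orgBucket // fall back to the wider org+bucket prefix
	}
	return orgBucket + ",\x00=" + measurement
}

func main() {
	fmt.Printf("%q\n", searchPrefix("ORGBKT", "cpu")) // "ORGBKT,\x00=cpu"
}
```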
* MeasurementNames
* MeasurementTagKeys
* MeasurementTagValues
* Adds an API to the models package for efficiently parsing the
measurement tag (`\x00`) from a normalized series key
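A sketch of such a parser, assuming a normalized key laid out as `<org+bucket>,\x00=<measurement>,<tag>=<value>,...` (the helper name is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// measurementFromSeriesKey pulls the value of the reserved \x00 tag
// from a normalized series key.
func measurementFromSeriesKey(key string) (string, bool) {
	const marker = ",\x00="
	i := strings.Index(key, marker)
	if i < 0 {
		return "", false
	}
	rest := key[i+len(marker):]
	if j := strings.IndexByte(rest, ','); j >= 0 {
		rest = rest[:j] // trim the remaining tag pairs
	}
	return rest, true
}

func main() {
	m, ok := measurementFromSeriesKey("ORGBKT,\x00=cpu,host=a")
	fmt.Println(m, ok) // cpu true
}
```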
The root cause is that the Unsigned data type has no representation
in the valueType function in the cache and falls back to the default
case of 0.
0 is also a sentinel value in the entry#add function that will
result in skipping the value type check.
It is therefore possible for unsigned values followed by values of some
other data type to be stored in the cache.
It is suspected that the write may be rejected before reaching the
cache, and therefore may not occur in practice. Specifically, the
series file stores the data types on a per-series basis and would
reject the write.
This commit turns the value types into explicit constants and
ensures all existing block types are represented. In addition,
it adds a mapping function to convert these to a known Block type,
which will be used by the `MeasurementFields` schema request to
determine the type of a series in the cache.
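A sketch of the shape of the fix; the constant names and block types below are placeholders, not the cache's actual identifiers:

```go
package main

import "fmt"

// Explicit value-type constants; 0 is reserved as "undefined" so a real
// type can no longer collide with the sentinel used by entry#add.
const (
	valueTypeUndefined byte = iota
	valueTypeFloat64
	valueTypeInteger
	valueTypeString
	valueTypeBoolean
	valueTypeUnsigned // previously unrepresented, fell through to 0
)

// Placeholder TSM block-type constants.
const (
	blockFloat64 byte = iota
	blockInteger
	blockBoolean
	blockString
	blockUnsigned
)

// blockTypeForValueType maps a cache value type to a known block type so
// a MeasurementFields schema request can report the type of a cached
// series; unknown types now fail loudly instead of silently becoming 0.
func blockTypeForValueType(v byte) (byte, error) {
	switch v {
	case valueTypeFloat64:
		return blockFloat64, nil
	case valueTypeInteger:
		return blockInteger, nil
	case valueTypeString:
		return blockString, nil
	case valueTypeBoolean:
		return blockBoolean, nil
	case valueTypeUnsigned:
		return blockUnsigned, nil
	default:
		return 0, fmt.Errorf("unknown value type %d", v)
	}
}

func main() {
	bt, err := blockTypeForValueType(valueTypeUnsigned)
	fmt.Println(bt, err)
}
```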
* refactor(storage): move type ByTagKey to the only package that uses it
* refactor(tsdb): use types in tsdb/cursors
* refactor(tsdb): remove unused type SeriesIDElems
* refactor(tsdb): inline only use of tsdb.ReadAllSeriesIDIterator
* refactor(tsdb): move series file to its own package
* refactor(storage): remove platform->influxdb aliases
* feat(backup): `influx backup` creates data backup
* feat(backup): initial restore work
* feat(restore): initial restore impl
Adds a restore tool which does offline restore of data and metadata.
* fix(restore): pr cleanup
* fix(restore): fix data dir creation
* fix(restore): pr cleanup
* chore: amend CHANGELOG
* fix: restore to empty dir fails differently
* feat(backup): backup and restore credentials
Saves the credentials file to backups and restores it from backups.
Additionally adds some logging for errors when fetching backup files.
* fix(restore): add missed commit
* fix(restore): pr cleanup
* fix(restore): fix default credentials restore path
* fix(backup): actually copy the credentials file for the backup
* fix: dirs get 0777, files get 0666
* fix: small review feedback
Co-authored-by: tmgordeeva <tanya@influxdata.com>
This commit adds numerous tests for ascending and descending cursors
that generate merged blocks across multiple files, which exceed the
default fixed buffer size used by the array cursors (MaxPointsPerBlock).
Tests cover two scenarios:
1. Each file has one block, and the block from the second file is
entirely contained within the first block of the first file. The
merged block contains 1,200 values, which exceeds
MaxPointsPerBlock.
2. Each file has multiple blocks, and the blocks have a mixture of
values which interleave and overwrite.
This commit prevents multiple blocks for the same series key having
values truncated when they are being read into an empty buffer.
The current cursor reader code has an optimisation that incorrectly
assumes the incoming array will be limited to 1,000 values (the maximum
block size), but arrays can contain values from multiple matching
blocks.
Fixes #15817
This commit addresses several data-races on the `tsm1.Predicate` type
that were causing a live-lock or similar in rare cases during a delete.
Because `tsm1/FileStore.Apply` executes concurrently across TSM files,
the state of the delete's predicate was being mutated unsafely.
This commit adds a `Clone` method to the `influxdb.Predicate` type,
which should be used whenever an `influxdb.Predicate` implementation
needs to be used concurrently.
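A toy demonstration of the clone-per-goroutine pattern; the `predicate` type here is a stand-in with mutable state, not the real `influxdb.Predicate`:

```go
package main

import (
	"fmt"
	"sync"
)

// predicate carries mutable matching state, so sharing one instance
// across goroutines races.
type predicate struct {
	state int
}

func (p *predicate) Matches(key string) bool {
	p.state++ // unsafe if p were shared between goroutines
	return len(key) > 0
}

// Clone returns an independent copy, so each goroutine can mutate its
// own state.
func (p *predicate) Clone() *predicate {
	cp := *p
	return &cp
}

func main() {
	base := &predicate{}
	files := []string{"a.tsm", "b.tsm", "c.tsm"}

	var wg sync.WaitGroup
	for _, f := range files {
		wg.Add(1)
		go func(f string, p *predicate) { // one clone per concurrent Apply
			defer wg.Done()
			fmt.Println(f, p.Matches(f))
		}(f, base.Clone())
	}
	wg.Wait()
}
```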
* chore: Remove several instances of WithLogger
* chore: unexport Logger fields
* chore: unexport some more Logger fields
* chore: go fmt
chore: fix test
chore: s/logger/log
chore: fix test
chore: revert http.Handler.Handler constructor initialization
* refactor: integrate review feedback, fix all test nop loggers
* refactor: capitalize all log messages
* refactor: rename two logger to log
Fixes #15916.
If a predicate was passed in with multiple key/value matches for the
same tag key, then the value index would be incorrect. This ensures that
each tag key can only be added to the location map once.
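A minimal sketch of the guard (names are illustrative): the first occurrence of a tag key claims its index, and later duplicates are skipped.

```go
package main

import "fmt"

// buildLocationMap assigns each tag key a value index exactly once;
// a predicate with multiple key/value matches for the same tag key no
// longer shifts the indices of later keys.
func buildLocationMap(tagKeys []string) map[string]int {
	locs := make(map[string]int)
	for _, k := range tagKeys {
		if _, ok := locs[k]; ok {
			continue // each tag key may be added only once
		}
		locs[k] = len(locs)
	}
	return locs
}

func main() {
	// "host" appears twice (e.g. host = 'a' OR host = 'b') but still
	// maps to a single index.
	fmt.Println(buildLocationMap([]string{"host", "region", "host"}))
}
```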
Fixes #15859
This commit fixes a defect in the TSI index where a filter using the
negated equality operator would result in no matching series being
returned for series stored within the `IndexFile` portions of the index.
The root cause of this was due to missing legacy-handling code in the
index for this particular iterator.
* fix(storage): add failing test for array cursor iterator stats
* fix(storage): make arrayCursorIterator.Stats() return stats of in-focus cursor
* fix(storage): add failing test to assert arrayCursorIterator.Stats() returns accumulated result
* fix(storage): accumulate stats in arrayCursorIterator.Stats() call across all observed cursors
By default this feature is disabled and the full compaction behaviour
does not change. When the feature is enabled, compactions can be limited
across multiple storage engines running in multiple processes.
The mechanism by which this happens is not part of the abstraction added
here.
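One way such an abstraction might look (entirely hypothetical, since the commit deliberately leaves the cross-process mechanism unspecified):

```go
package main

import "fmt"

// Limiter sketches the abstraction: the engine asks for permission
// before compacting, and how slots are shared across processes (a local
// semaphore, a lock service, etc.) is left to the implementation.
type Limiter interface {
	// TryTake reports whether a compaction slot is available.
	TryTake() bool
	// Release returns a previously taken slot.
	Release()
}

// localLimiter is a trivial single-process implementation.
type localLimiter struct{ slots chan struct{} }

func newLocalLimiter(n int) *localLimiter {
	return &localLimiter{slots: make(chan struct{}, n)}
}

func (l *localLimiter) TryTake() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

func (l *localLimiter) Release() { <-l.slots }

func main() {
	var lim Limiter = newLocalLimiter(1)
	fmt.Println(lim.TryTake()) // true: slot acquired, compaction may run
	fmt.Println(lim.TryTake()) // false: limit reached, skip this cycle
	lim.Release()
}
```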