When splitting windows into their own individual tables, do not wait for
empty tables to be read. Empty tables may never be read, so waiting for
them can result in a deadlock.
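A minimal sketch of the corrected producer behavior, using hypothetical
table and channel types rather than the actual storage/flux code:

```go
// Hypothetical producer: hand each table to the consumer, but only wait
// for non-empty tables, since consumers may never read empty ones.
type table struct {
	empty bool
	done  chan struct{} // closed by the consumer once the table is read
}

func produce(out chan<- table, tables []table) {
	for _, t := range tables {
		out <- t
		if t.empty {
			continue // blocking on t.done here is what deadlocked before
		}
		<-t.done // non-empty tables are guaranteed to be read
	}
	close(out)
}
```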
This was found while writing a test case for in-progress work, so a
test case for this condition will follow in a future commit.
This commit adds `mincore.Limiter`, which throttles page faults caused
by accessing mmap()'d data. It works by periodically calling `mincore()`
to determine which pages are not resident in memory, and using
`rate.Limiter` to throttle access with a token bucket algorithm.
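A rough sketch of how such a limiter could be assembled, assuming
`golang.org/x/sys/unix` for `Mincore` and `golang.org/x/time/rate` for
the token bucket; the real `mincore.Limiter` internals may differ:

```go
import (
	"context"

	"golang.org/x/sys/unix"
	"golang.org/x/time/rate"
)

const pageSize = 4096

// Limiter throttles access to the non-resident pages of an mmap'd region.
type Limiter struct {
	tokens *rate.Limiter // one token per page fault allowed
	data   []byte        // the mmap'd region
	vec    []byte        // residency vector, one byte per page
}

func NewLimiter(faultsPerSec rate.Limit, burst int, data []byte) *Limiter {
	return &Limiter{
		tokens: rate.NewLimiter(faultsPerSec, burst),
		data:   data,
		vec:    make([]byte, (len(data)+pageSize-1)/pageSize),
	}
}

// Update refreshes the residency vector; intended to be called periodically.
func (l *Limiter) Update() error {
	return unix.Mincore(l.data, l.vec)
}

// Wait blocks until the token bucket permits faulting in every
// non-resident page covering data[off : off+n].
func (l *Limiter) Wait(ctx context.Context, off, n int) error {
	if n <= 0 {
		return nil
	}
	faults := 0
	for p := off / pageSize; p <= (off+n-1)/pageSize; p++ {
		if l.vec[p]&1 == 0 { // low bit clear => page not in memory
			faults++
		}
	}
	if faults == 0 {
		return nil
	}
	return l.tokens.WaitN(ctx, faults)
}
```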
* feat(tsdb): SHOW TAG KEYS (no time) query using only TSI data.
* fix(tsdb): Allow for earlier return when scanning during show tag keys.
* fix(tsdb): Speed things up by using the key merger to reduce allocs.
* chore(tsm1): Fix golint.
* fix(tsdb): Remove sorting, because these keys should already be sorted.
* fix(tsdb): Remove dead code to placate the linter.
Previously windowed aggregate tables would construct the _start and
_stop arrays before checking if the underlying array cursor was empty.
The table would therefore report itself as non-empty even though
there were no points within the table's time range. Now we check if
the underlying array cursor is empty before we construct any output
arrays. If the cursor is empty, meaning the series is empty, the
table will be dropped.
Fixes https://github.com/influxdata/influxdb/issues/18704.
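A simplified sketch of the new ordering, with hypothetical types standing
in for the real cursor and table:

```go
// tbl is a hypothetical stand-in for a windowed aggregate table.
type tbl struct {
	start, stop []int64
	values      []float64
}

// buildWindowTable returns nil when the series is empty, so the caller
// drops the table instead of emitting one with _start/_stop but no points.
func buildWindowTable(values []float64, boundsStart, boundsStop int64) *tbl {
	// Check the underlying cursor output *before* constructing any arrays.
	if len(values) == 0 {
		return nil // empty series: drop the table entirely
	}
	start := make([]int64, len(values))
	stop := make([]int64, len(values))
	for i := range values {
		start[i], stop[i] = boundsStart, boundsStop
	}
	return &tbl{start: start, stop: stop, values: values}
}
```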
* refactor: migrator and introduce Store.(Create|Delete)Bucket
* feat: kvmigration internal utility to create / manage kv store migrations
* fix: ensure migrations applied in all test cases
* chore: update kv and migration documentation
* fix(storage): Push-down a predicate to match tags for SHOW MEASUREMENTS calls.
* chore: Address feedback.
* fix(tsm1): Split behavior based on existence of predicate for show measurements.
* fix(tsm1): Allow a parenthesized expression on the LHS of a predicate.
* fix(tsm1): Create a separate tag predicate verifier that rejects negative comparisons (see the sketch after this list).
* fix(tsm1): Additional test cases for show measurements with predicate.
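A hedged sketch of what such a verifier might look like, using
hypothetical expression types rather than the actual AST:

```go
// Hypothetical expression types for illustration only.
type Expr interface{}
type ParenExpr struct{ Expr Expr }
type BinaryExpr struct {
	Op       string
	LHS, RHS Expr
}

// verifyTagPredicate reports whether a predicate is safe to push down:
// it walks the expression and rejects any negative comparison.
func verifyTagPredicate(e Expr) bool {
	switch e := e.(type) {
	case *ParenExpr:
		return verifyTagPredicate(e.Expr) // parentheses allowed on the LHS
	case *BinaryExpr:
		switch e.Op {
		case "!=", "!~":
			return false // negative comparisons are rejected
		case "AND", "OR":
			return verifyTagPredicate(e.LHS) && verifyTagPredicate(e.RHS)
		}
		return true
	}
	return true
}
```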
Storage should not have a dependency on libflux. This commit removes
some constants that were used from the flux universe package and
replaces them with local copies so that storage no longer has a
dependency on the universe package.
This commit adds the `WithWritePointsValidationEnabled()` option
to disable validation in `Engine.WritePoints()` as the same
validation is performed earlier in the call stack by cloud.
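A sketch of the option's likely shape, assuming a conventional
functional-option pattern; the `Engine` field shown is illustrative:

```go
// Engine with an illustrative validation flag; validation is on by default.
type Engine struct {
	writePointsValidationEnabled bool
}

type Option func(*Engine)

// WithWritePointsValidationEnabled toggles validation in WritePoints,
// e.g. so cloud can disable it when validation already ran upstream.
func WithWritePointsValidationEnabled(enabled bool) Option {
	return func(e *Engine) { e.writePointsValidationEnabled = enabled }
}

func NewEngine(opts ...Option) *Engine {
	e := &Engine{writePointsValidationEnabled: true}
	for _, opt := range opts {
		opt(e)
	}
	return e
}
```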
* feat(storage): first array cursor
* feat: add first and last to rpc messages
* test(launcher): push down group first and group last
* feat(storage): window first array cursor
* test(launcher): push down bare first and bare last
* feat(storage): add capabilities for group first and group last
* refactor: rename first to limit
* refactor: make zero value for every period meaningful
* refactor: standardize launcher pushdown tests
The tables produced by `storage/flux` didn't previously pass our table
tests. The `Empty()` call is supposed to return false if the table was
ever not empty, but reading the table or calling `Done()` would cause
the table implementations here to return that they were always empty.
This messes up the csv encoder, which then believes that it just emitted
an empty table.
The table tests for valid table implementations state that this is an
error in the table implementation. This change introduces a simple test
for `ReadFilter` and also runs the table tests on the filter iterator.
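A minimal sketch of the sticky-emptiness contract the table tests expect,
with a hypothetical table type:

```go
// table remembers whether it ever produced data so that Empty() keeps
// returning false after the table has been read or Done() was called.
type table struct {
	used bool // set the first time a non-empty buffer is produced
}

func (t *table) produce(rows int) {
	if rows > 0 {
		t.used = true // sticky: never reset, even by Done()
	}
}

func (t *table) Done() {
	// Release resources, but do NOT touch t.used: the table must still
	// report that it was non-empty after it has been consumed.
}

func (t *table) Empty() bool { return !t.used }
```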
This enables a new rule that will push down the full `aggregateWindow`
query including the `duplicate` and `window(every: inf)` that recombines
the tables. When the full rule is used, the table is not split into
tables for each window and instead retains itself as a single table. The
start or stop column is renamed to `_time`, and `_start` and `_stop`
become the boundaries of the query.
* feat: flags for pushing down new aggregates
* refactor: grouped aggregate rewrite rules
The storage operation ReadGroup aggregates per series on the storage
side. The planner will rewrite grouped aggregate queries to call
ReadGroup, which will perform a partial aggregation, followed by
another operation that will perform the rest of the aggregation on
the compute side (see the sketch after this list).
* feat: storage capabilities for grouped aggregates
* fix: changes from review
* feat: group read operation name should include aggregate
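An illustrative sketch of the storage/compute split for a mean aggregate;
the real ReadGroup types differ:

```go
// partial is what storage returns per series: enough state to finish
// the aggregation later, not the final answer.
type partial struct {
	sum   float64
	count int64
}

// Storage side: aggregate each series independently.
func partialMean(values []float64) partial {
	p := partial{count: int64(len(values))}
	for _, v := range values {
		p.sum += v
	}
	return p
}

// Compute side: combine the partials from every series in the group.
func finalMean(parts []partial) float64 {
	var sum float64
	var n int64
	for _, p := range parts {
		sum += p.sum
		n += p.count
	}
	if n == 0 {
		return 0 // no points in the group
	}
	return sum / float64(n)
}
```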
This implements create empty for the window table reader and allows this
table read function to be used when it is specified. It will pass down
the create empty flag from the original window call into the storage
read function.
This also fixes the window table reader so it properly creates
individual tables for each window. Previously, it was constructing one
table for an entire series instead of one table per window.
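A simplified sketch of per-window table creation honoring the create
empty flag; `readWindow` and `emitTable` are hypothetical helpers:

```go
// emitWindows produces one table per window. When createEmpty is set,
// windows with no data still yield a table of null values.
func emitWindows(start, stop, every int64, createEmpty bool) {
	for ws := start; ws < stop; ws += every {
		we := ws + every
		values := readWindow(ws, we)
		if len(values) == 0 && !createEmpty {
			continue // no data and create empty not set: skip the window
		}
		emitTable(ws, we, values) // each window becomes its own table
	}
}

func readWindow(start, stop int64) []float64   { return nil } // hypothetical
func emitTable(start, stop int64, v []float64) {}             // hypothetical
```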
Tests have been added to verify three edge case behaviors. The first is
the normal read operation where all values are present. The second is
when create empty is specified so null values may be created. The third
is with truncated boundaries to ensure that storage is read from and the
start and stop timestamps get correctly truncated.
The ResultSetToLineProtocol test class was not generating correct
line protocol for string output (it was appending the integer suffix `i`).
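For reference, a small sketch of line-protocol field rendering: integer
fields carry the `i` suffix, while string fields must be quoted and never
suffixed:

```go
import (
	"fmt"
	"strconv"
)

// formatFieldValue renders a single field value in line protocol.
func formatFieldValue(v interface{}) string {
	switch v := v.(type) {
	case int64:
		return strconv.FormatInt(v, 10) + "i" // integers take the i suffix
	case string:
		return fmt.Sprintf("%q", v) // strings are quoted, never suffixed
	case float64:
		return strconv.FormatFloat(v, 'f', -1, 64)
	case bool:
		return strconv.FormatBool(v)
	default:
		return fmt.Sprint(v)
	}
}
```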
In addition, the PR improves the mock.NewResultSetFromSeriesGenerator
type with options. The one option added is `WithGeneratorMaxValues`,
to limit the total number of values produced by the SeriesGenerator.
The tags cache was not thread safe when called from multiple goroutines
at the same time. It was intended to be, but the locking was done
incorrectly and in too complicated a way. There was an assumption that
the LRU would only be updated from a single thread, which wasn't true
at all.
The tags cache has now been updated to include some test cases that test
for race conditions and data validity. The tags cache itself has been
changed to follow a simpler algorithm.
1. Obtain a read lock.
2. Check if the cached array can be used.
3. Release the read lock.
4. If the above was unusable or did not exist, create an array for the
tag.
5. Obtain a write lock.
6. Check if the cached array should be replaced and replace if needed.
7. Move the entry to the front of the LRU.
8. Release the write lock.
This simpler algorithm should ensure that the code is correct and that
creating the array still happens outside of the lock, since creating the
array is the most expensive of the steps above.
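A condensed sketch of those eight steps, using simplified types in place
of the real cache:

```go
import (
	"container/list"
	"sync"
)

type entry struct {
	key string
	arr []string
}

type tagsCache struct {
	mu  sync.RWMutex
	m   map[string]*list.Element
	lru *list.List // front is the most recently used entry
}

func newTagsCache() *tagsCache {
	return &tagsCache{m: make(map[string]*list.Element), lru: list.New()}
}

func (c *tagsCache) Get(key string, build func() []string) []string {
	// Steps 1-3: look up under the read lock only.
	c.mu.RLock()
	if e, ok := c.m[key]; ok {
		arr := e.Value.(*entry).arr
		c.mu.RUnlock()
		return arr
	}
	c.mu.RUnlock()

	// Step 4: build the array outside of any lock (the expensive part).
	arr := build()

	// Steps 5-8: install under the write lock.
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.m[key]; ok {
		// Another goroutine won the race; keep its array, refresh the LRU.
		c.lru.MoveToFront(e)
		return e.Value.(*entry).arr
	}
	c.m[key] = c.lru.PushFront(&entry{key: key, arr: arr})
	return arr
}
```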