This commit allows users to filter on the `value` field in the
`SHOW TAG VALUES` command:
SHOW TAG VALUES WITH KEY = "mytag" WHERE "value" = 'myvalue'
Previously this command would return all values.
This change makes it so that we simplify the math engine so it doesn't
use a complicated set of nested iterators. That way, we have to change
math in one fewer place.
It also greatly simplifies the query engine as now we can create the
necessary iterators, join them by time, name, and tags, and then use the
cursor interface to read them and use eval to compute the result. It
makes it so the auxiliary iterators and all of their complexity can be
removed.
This also makes use of the new eval functionality that was recently
added to the influxql package.
No math functions have been added, but the scaffolding has been included
so things like trigonometry functions are just a single commit away.
This also introduces a small breaking change. Because of the call
optimization, it is now possible to use the same selector multiple times
as a selector. So if you do this:
SELECT max(value) * 2, max(value) / 2 FROM cpu
This will now return the timestamp of the max value rather than zero
since this query is considered to have only a single selector rather
than multiple separate selectors. If any aspect of the selector is
different, such as different selector functions or different arguments,
it will consider the selectors to be aggregates like the old behavior.
This commits adds a randomised deletion test, which aims to test the
correctness of deletions and drops in the index and series file.
The test works by maintaining its own index of points, series and
measurements, tracking the presence or absence or series.
There are four commands that can be randomly executed:
- insert a series at a random time
- drop an entire measurement
- drop an entire series
- delete series across a time range
After every invocation of one of these commands, `SHOW SERIES` is
executed on the server, and the results are compared to what the
test's series tracker believes should be present. If the results
differ then the test fails.
On failure, the last 100 queries executed, as well as the series
we expected, and the series returned by `SHOW SERIES`, are returned.
* Live Restore + Enterprise data format compatability
* Extended ImportData to import all DB's if no db name given
* Added a new enterprise data test, and backup command now prints the backup file paths at conclusion
* Added whole-system backup test
* Update to use protobuf in all enterprise data cases
* Update to test to do cross-testing with enterprise version
* incremental enterprise backup format support
This commit adds time support to SHOW TAG VALUES. Time can be used as
both a lower and upper boundary. However, there are some caveats.
For the `inmem` index, filtering by time will still return all results
because the index data is shared across shards.
For the `tsi1` index, filtering by time will only work down to the shard
lever. Specifically, when querying by time all shards within that time
range will be used to generate the results.
When a meta query does not include a time component then it can be
answered exclusively by the index. This should result in a much faster
query execution that if the TSM engine was engaged.
This commit rewrites the following queries such that they make use
of the index where no time component is present:
- SHOW MEASUREMENTS
- SHOW SERIES
- SHOW TAG KEYS
- SHOW FIELD KEYS
This allows the query:
SELECT mean(value) FROM cpu GROUP BY time(1d)
To function in some way that makes sense. The upper limit is implicitly
the `now()` starting time and the lower limit will be whichever interval
the lowest point falls into.
When no lower bound is specified and `max-select-buckets` is specified,
the query will only consider points that would satisfy
`max-select-buckets`. So if you have one point written in 1970, have
another point within the last minute, and then do the above query with
`max-select-buckets` being equal to 10, the older point from 1970 will
not be considered.
This refactors the validation code so it is more flexible and performs a
small bit of work to make preparing and executing the query easier.
The general idea is that compilation will eventually do more heavy
lifting in creating the initial plan and prepare will construct an
actual plan rather than just doing some basic field rewriting.
This change at least sets us up for that change in the future and moves
the validation code to the query execution instead of in the parser.
This also frees up the parser to parse the complete AST without worrying
if the query itself is valid. That could be useful for client code that
wants to compile a partial query to an AST and then perform
modifications on the AST for some reason.
When merging streams of system iterators we don't use tags or time.
Instead we add series keys (in the case of, for example, `SHOW SERIES`)
to the `Aux` field of the iterators' elements. This is because we only
emit merged and sorted sets of series key to the client.
We currently use `SortedMergeHeap`s to merge together multiple
iterators, and the comparitor function did not consider `Aux` fields
when determining which heap to pop the next item off during a merge. As
such, `SHOW SERIES` and `SHOW TAG KEYS` (any meta query that gets
converted into a special type of `SELECT`) were returning results in
arbitrary order.
This issue was never noticed on the `inmem` index because the streams
are always duplicates of each other, and of course it doesn't matter if
you arbitrarily merge together two idential, sorted streams...
The issue first manifested itself on the `tsi1` index, but this fix will
apply to both indexes.