When a GROUP BY or multiple sources are used, the top level limit
iterator requires reading the entire iterator stream so it can find all
of the tag groups it needs to return. For large data series, this ends
up with the limit iterator discarding a lot of output.
This change adds a new lower level limit iterator on each series itself
so that there are fewer data points that have to be thrown away by the
top level iterator.
Fixes #5553.
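A minimal sketch of the idea, not the iterator code from this change: trimming each series to at most `n` points before the streams are merged leaves the top-level limit with far less to discard. The `Point` type and the `limitPerSeries` helper are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Point is a hypothetical, simplified data point used for illustration.
type Point struct {
	Series string
	Time   int64
	Value  float64
}

// limitPerSeries keeps at most n points from each series and preserves
// the input order. It stands in for a limit applied per series before
// the merged stream reaches the top-level limit.
func limitPerSeries(points []Point, n int) []Point {
	seen := make(map[string]int)
	out := make([]Point, 0, len(points))
	for _, p := range points {
		if seen[p.Series] >= n {
			continue // this series already produced n points
		}
		seen[p.Series]++
		out = append(out, p)
	}
	return out
}

func main() {
	points := []Point{
		{"cpu,host=a", 0, 1}, {"cpu,host=a", 10, 2}, {"cpu,host=a", 20, 3},
		{"cpu,host=b", 0, 4}, {"cpu,host=b", 10, 5},
	}
	for _, p := range limitPerSeries(points, 2) {
		fmt.Println(p.Series, p.Time, p.Value)
	}
}
```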
For aggregate queries, derivatives will now alter the start time to one
interval behind and will use that interval to find the derivative of the
first point instead of giving no value for that interval. Null values
are still discarded, so if the interval before the one you are
querying is null, it is discarded just as it would be in the
middle of the query. You can use `fill(0)` to fill in these values.
This does not apply to raw queries yet.
Also modified the derivative and difference aggregates to use the stream
iterator instead of the reduce slice iterator for space efficiency.
Fixes #3247. Contributes to #5943.
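The behavior for aggregate queries can be sketched roughly as follows. This is an illustration under simplified assumptions (integer timestamps in seconds, strictly increasing times), and `Point`, `extendedStart`, and `derivative` are hypothetical names rather than the engine's types.

```go
package main

import "fmt"

// Point is a simplified aggregate value for one GROUP BY interval.
type Point struct {
	Time  int64 // start of the interval, in seconds
	Value float64
	Null  bool // true when the interval had no data
}

// extendedStart moves the query start back by one interval so the first
// requested interval has a predecessor to compare against.
func extendedStart(start, interval int64) int64 {
	return start - interval
}

// derivative returns the change per unit (in seconds) between consecutive
// non-null points. Null points are skipped entirely.
func derivative(points []Point, unit int64) []Point {
	var out []Point
	var prev *Point
	for i := range points {
		p := &points[i]
		if p.Null {
			continue
		}
		if prev != nil {
			elapsed := float64(p.Time - prev.Time)
			out = append(out, Point{
				Time:  p.Time,
				Value: (p.Value - prev.Value) * float64(unit) / elapsed,
			})
		}
		prev = p
	}
	return out
}

func main() {
	const interval = 10
	start := int64(10)
	fmt.Println("read from:", extendedStart(start, interval))
	points := []Point{
		{Time: 0, Value: 1},  // interval before the requested range
		{Time: 10, Value: 3}, // first requested interval
		{Time: 20, Value: 6},
	}
	for _, p := range derivative(points, interval) {
		fmt.Println(p.Time, p.Value)
	}
}
```

If the point at time 0 were null, the sketch would emit no value at time 10, which mirrors the null handling described above.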
The deprecation message is now attached to a new attribute returned with
the results. This message can then be read by clients to warn a user
about upcoming changes to the query engine.
The `influx` client has already been modified to read this message and
print it out for every format except CSV.
The first warning message is a deprecation message about removing `IF NOT
EXISTS` from `CREATE DATABASE`.
The message will also be printed to the server log.
Fixes #5707.
It is now possible to compare tags with fields and tags with other
tags. Previously, it was only possible to compare fields with fields,
and tags with a string or a regex.
Fixes #3371.
This commit makes a number of performance improvements to
reduce allocations during query execution. Several objects
and buffers are now reused across the components to avoid
allocations.
Previously a simple `count(value)` query across 1M points
would require 26,000+ allocations. After the changes in
this commit that number has been reduced to 88.
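One common way to get this kind of reduction in Go is to reuse scratch buffers through a `sync.Pool`. The sketch below shows that general technique only; it is not the set of changes made in this commit, and `bufPool` and `countValues` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool holds reusable scratch buffers (a hypothetical stand-in for the
// objects and buffers that get reused during query execution).
var bufPool = sync.Pool{
	New: func() interface{} {
		buf := make([]float64, 0, 1024)
		return &buf
	},
}

// countValues counts the values in all blocks, copying each block into a
// pooled scratch buffer instead of allocating a new slice per block.
func countValues(blocks [][]float64) int {
	bufp := bufPool.Get().(*[]float64)
	defer bufPool.Put(bufp)

	n := 0
	for _, block := range blocks {
		// The copy stands in for decoding a block into the scratch buffer.
		*bufp = append((*bufp)[:0], block...)
		n += len(*bufp)
	}
	return n
}

func main() {
	blocks := [][]float64{{1, 2, 3}, {4, 5}, {6}}
	fmt.Println(countValues(blocks)) // prints 6
}
```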
The tsdb package can't have a dependency on the meta package so it takes
a slice of uint64 types. The clustering implementation needs the full
ShardInfo to know the shard owners though, so a different implementation
needs to be used by clustering.
The `*tsdb.Store` type gets wrapped in the cluster package so it can
implement the `IteratorCreator` function without having a dependency on
the meta package.
The QueryExecutor had a lot of dead code made obsolete by the query
engine refactor that has now been removed. The TSDBStore interface has
also been cleaned up so we can have multiple implementations of this
(such as a local and remote version).
A StatementExecutor interface has been created for adding custom
functionality to the QueryExecutor that may not be available in the open
source version. The QueryExecutor delegates all statement execution to
the StatementExecutor and the QueryExecutor will only keep track of
housekeeping. Implementing additional queries is as simple as wrapping
the cluster.StatementExecutor struct or replacing it with something
completely different.
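As a rough illustration of the wrapping pattern, the sketch below embeds one executor inside another to add a statement. The interface signature, the `Statement` type, and the `SHOW CUSTOM` statement are assumptions made for the example and do not match the real types.

```go
package main

import (
	"errors"
	"fmt"
)

// Statement is a hypothetical stand-in for a parsed query statement.
type Statement struct{ Text string }

// StatementExecutor executes a single statement. The signature is assumed.
type StatementExecutor interface {
	ExecuteStatement(stmt Statement) (string, error)
}

var errUnsupported = errors.New("unsupported statement")

// baseExecutor plays the role of the default executor.
type baseExecutor struct{}

func (baseExecutor) ExecuteStatement(stmt Statement) (string, error) {
	if stmt.Text == "SHOW DATABASES" {
		return "databases", nil
	}
	return "", errUnsupported
}

// customExecutor wraps another executor and adds one extra statement.
type customExecutor struct{ StatementExecutor }

func (e customExecutor) ExecuteStatement(stmt Statement) (string, error) {
	if stmt.Text == "SHOW CUSTOM" {
		return "custom", nil
	}
	return e.StatementExecutor.ExecuteStatement(stmt)
}

// QueryExecutor only does housekeeping and delegates execution.
type QueryExecutor struct {
	StatementExecutor StatementExecutor
	executed          int // number of statements run so far
}

// Execute runs each statement through the configured StatementExecutor.
func (q *QueryExecutor) Execute(stmts []Statement) ([]string, error) {
	results := make([]string, 0, len(stmts))
	for _, stmt := range stmts {
		res, err := q.StatementExecutor.ExecuteStatement(stmt)
		if err != nil {
			return results, err
		}
		q.executed++
		results = append(results, res)
	}
	return results, nil
}

func main() {
	q := &QueryExecutor{StatementExecutor: customExecutor{baseExecutor{}}}
	res, err := q.Execute([]Statement{{"SHOW CUSTOM"}, {"SHOW DATABASES"}})
	fmt.Println(res, err)
}
```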
The PointsWriter in the QueryExecutor has been changed to a simple
interface that implements the one method needed by the query executor.
This is to allow different PointsWriter implementations to be used by
the QueryExecutor. It has also been moved into the StatementExecutor
instead.
The TSDBStore interface has now been modified to contain the code for
creating an IteratorCreator. This is so the underlying TSDBStore can
implement different ways of accessing the underlying shards rather than
always having to access each shard individually (such as batch
requests).
Remove the show servers handling. This isn't a valid command in the open
source version of InfluxDB anymore.
The QueryManager interface is now built into QueryExecutor and is no
longer necessary. The StatementExecutor and QueryExecutor split allows
task management to much more easily be built into QueryExecutor rather
than as a separate struct.
The simple moving average will gradually emit points instead of waiting
until the end. This should apply to derivative and difference in the
future too.
Fixes #6112.
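A minimal sketch of a moving average that emits a value as soon as its window is full, rather than buffering the whole input. The `MovingAverage` type is a hypothetical illustration and not the iterator used by the query engine.

```go
package main

import "fmt"

// MovingAverage computes a simple moving average over the last n values.
type MovingAverage struct {
	n      int
	window []float64 // ring buffer holding the last n values
	pos    int       // index of the oldest value once the window is full
	sum    float64
}

// NewMovingAverage returns a moving average over a window of n values.
func NewMovingAverage(n int) *MovingAverage {
	return &MovingAverage{n: n, window: make([]float64, 0, n)}
}

// Push adds a value and reports the current average. The boolean is false
// until n values have been seen, so callers can emit points gradually.
func (m *MovingAverage) Push(v float64) (float64, bool) {
	if len(m.window) < m.n {
		m.window = append(m.window, v)
	} else {
		m.sum -= m.window[m.pos] // drop the oldest value
		m.window[m.pos] = v
		m.pos = (m.pos + 1) % m.n
	}
	m.sum += v
	if len(m.window) < m.n {
		return 0, false
	}
	return m.sum / float64(m.n), true
}

func main() {
	ma := NewMovingAverage(2)
	for _, v := range []float64{1, 3, 5, 7} {
		if avg, ok := ma.Push(v); ok {
			fmt.Println(avg)
		}
	}
}
```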
Related to #6140, but won't actually fix that problem. It will correctly
stop new queries from being started during shutdown and will send the
interrupt signal to queries during shutdown.
Since the interrupt signal is asynchronous, there isn't currently a way
to wait for the queries to complete themselves before shutting down the
engine.
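The shutdown behavior described here could look roughly like the sketch below: closing a shared channel signals every running query, and new queries are refused afterwards. `Executor`, `Start`, and `Close` are hypothetical names, and `Close` deliberately does not wait for queries to finish, which matches the limitation noted above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrShutdown is returned when a query is started during shutdown.
var ErrShutdown = errors.New("query executor is shutting down")

// Executor is a hypothetical, stripped-down query executor.
type Executor struct {
	mu        sync.Mutex
	shutdown  bool
	interrupt chan struct{} // closed to signal all running queries
}

// NewExecutor returns an executor that accepts queries.
func NewExecutor() *Executor {
	return &Executor{interrupt: make(chan struct{})}
}

// Start registers a new query and returns the channel it should watch for
// interruption. It fails once shutdown has begun.
func (e *Executor) Start() (<-chan struct{}, error) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.shutdown {
		return nil, ErrShutdown
	}
	return e.interrupt, nil
}

// Close refuses new queries and sends the interrupt signal. It returns
// immediately and does not wait for running queries to finish.
func (e *Executor) Close() {
	e.mu.Lock()
	defer e.mu.Unlock()
	if !e.shutdown {
		e.shutdown = true
		close(e.interrupt)
	}
}

func main() {
	e := NewExecutor()
	interrupt, _ := e.Start()
	e.Close()
	<-interrupt // the running query observes the interrupt
	_, err := e.Start()
	fmt.Println(err)
}
```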
The difference function is implemented very similarly to how derivative is
implemented. It is an aggregate function that acts over the entire
aggregate. This function will also have the same problems that
derivative has with getting values from the previous interval or point.
This will be fixed separately as part of #5943.
Fixes #1825.
Allows configuration of shard group duration at database creation, and retention
policy create/alter time.
Query examples:
```
CREATE DATABASE testdb WITH DURATION 90d SHARD DURATION 30m NAME rp_testdb
CREATE RETENTION POLICY rp_testdb2 ON testdb DURATION INF REPLICATION 1 SHARD DURATION 30m
ALTER RETENTION POLICY rp_testdb2 ON testdb SHARD DURATION 1h
```
This can be useful with long duration retention policies with lots of data, where
you can split into smaller shards to relieve memory pressure.
This commit adds a configurable limit to the number of series that
can be returned from a `SELECT` statement. The limit is checked
immediately after planning and is determined by the use of iterator
stats.
Fixes #6076
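A sketch of what such a check might look like once planning has produced iterator statistics. `IteratorStats` is reduced to a single field here, and both the helper name and the convention that a limit of 0 disables the check are assumptions.

```go
package main

import "fmt"

// IteratorStats is a simplified stand-in for the statistics gathered
// during planning; only the series count matters here.
type IteratorStats struct {
	SeriesN int
}

// checkSeriesLimit returns an error when the planned query would touch
// more than maxSeries series. A limit of 0 is assumed to mean "no limit".
func checkSeriesLimit(stats IteratorStats, maxSeries int) error {
	if maxSeries > 0 && stats.SeriesN > maxSeries {
		return fmt.Errorf("max select series count exceeded: %d series, limit %d",
			stats.SeriesN, maxSeries)
	}
	return nil
}

func main() {
	fmt.Println(checkSeriesLimit(IteratorStats{SeriesN: 5}, 10))
	fmt.Println(checkSeriesLimit(IteratorStats{SeriesN: 50}, 10))
}
```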
If an OR was used, merging filters between different expressions would
not work correctly. If one of the sides had a set of series ids with a
condition and the other side had no series ids associated with the
expression, all of the series from the side with a condition would have
the condition ignored. Instead of defaulting a nonexistent series
filter to true, it should just be false and the evaluation of the one
side that does exist should take care of determining if the series id
should be included or not. The AND condition used false correctly so did
not have to be changed.
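The OR case can be illustrated with a small sketch in which each side maps series ids to a filter. A missing filter is treated as false, so the side that does carry a condition decides the outcome. The `Filter` type and `unionFilters` helper are hypothetical and far simpler than the real series filters.

```go
package main

import "fmt"

// Filter is a hypothetical per-series condition evaluated against a value.
type Filter func(v float64) bool

func alwaysFalse(float64) bool { return false }

// unionFilters merges the filters of two OR operands. A series id that is
// missing from one side contributes false, not true, for that side.
func unionFilters(lhs, rhs map[uint64]Filter) map[uint64]Filter {
	out := make(map[uint64]Filter, len(lhs)+len(rhs))
	add := func(id uint64) {
		l, ok := lhs[id]
		if !ok {
			l = alwaysFalse
		}
		r, ok := rhs[id]
		if !ok {
			r = alwaysFalse
		}
		out[id] = func(v float64) bool { return l(v) || r(v) }
	}
	for id := range lhs {
		add(id)
	}
	for id := range rhs {
		add(id)
	}
	return out
}

func main() {
	lhs := map[uint64]Filter{1: func(v float64) bool { return v > 10 }}
	rhs := map[uint64]Filter{} // no series ids on this side
	merged := unionFilters(lhs, rhs)
	fmt.Println(merged[1](5), merged[1](20))
}
```

With a default of true, `merged[1](5)` would pass even though the only condition on series 1 rejects it.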
If a tag did not exist and `!=` or `!~` were used, it would return false
even though neither a field nor a tag equaled those values. This has
now been modified to return the correct series ids and the correct
condition.
Also fixed a panic that would occur when a tag caused a field access to
become unnecessary. The filter using the field access still got created
and used even though it was unnecessary, resulting in an attempted
access to a non-initialized map.
Fixes #5152 and a bunch of other miscellaneous issues.
The currently running queries can be listed with the command
`SHOW QUERIES`. It displays each running query, the database it was
run against, and how long it has been running.
These were all b1/bz1 settings that no longer have any effect:
- {Default,}MaxWALSize
- {Default,}WALFlushInterval
- {Default,}WALPartitionFlushDelay
- {Default,WAL}ReadySeriesSize
- {Default,WAL}CompactionThreshold
- {Default,WAL}MaxSeriesSize
- {Default,WAL}FlushColdInterval
- {Default,WAL}PartitionSizeThreshold
Numbers in the query without a decimal point will now be parsed as an
IntegerLiteral and emitted as integers. This ensures we keep the
original context that a query was issued with and allows us to behave
more like typical programming languages when it comes to floats and
ints.
This adds functionality for dealing with integers promoting to floats in
the various places where math is used.
Fixes #5744 and #5629.
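The parsing rule and the promotion rule can be sketched as below. The literal types are simplified stand-ins that borrow the `IntegerLiteral` name from the text, and `parseNumber` and `add` are hypothetical helpers, not the parser's actual functions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// IntegerLiteral and NumberLiteral are simplified stand-ins for the AST
// nodes that hold integer and float literals.
type IntegerLiteral struct{ Val int64 }
type NumberLiteral struct{ Val float64 }

// parseNumber returns an IntegerLiteral when the token has no decimal
// point and a NumberLiteral otherwise.
func parseNumber(tok string) (interface{}, error) {
	if strings.Contains(tok, ".") {
		v, err := strconv.ParseFloat(tok, 64)
		if err != nil {
			return nil, err
		}
		return NumberLiteral{Val: v}, nil
	}
	v, err := strconv.ParseInt(tok, 10, 64)
	if err != nil {
		return nil, err
	}
	return IntegerLiteral{Val: v}, nil
}

// toFloat converts either literal type to a float64.
func toFloat(x interface{}) float64 {
	if i, ok := x.(IntegerLiteral); ok {
		return float64(i.Val)
	}
	return x.(NumberLiteral).Val
}

// add keeps integer math for two integers and promotes to float as soon
// as either operand is a float.
func add(a, b interface{}) interface{} {
	ai, aok := a.(IntegerLiteral)
	bi, bok := b.(IntegerLiteral)
	if aok && bok {
		return IntegerLiteral{Val: ai.Val + bi.Val}
	}
	return NumberLiteral{Val: toFloat(a) + toFloat(b)}
}

func main() {
	two, _ := parseNumber("2")
	half, _ := parseNumber("0.5")
	fmt.Printf("%#v\n", add(two, two))
	fmt.Printf("%#v\n", add(two, half))
}
```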
Normalize the time for the distinct() call to either be at the beginning
of the group by interval or the start time similar to every other call.
The timestamp previously just showed the first time found and didn't
make a lot of sense in the context of what the function was supposed to
do.
Fixes #6040.
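A sketch of the normalization rule, assuming integer timestamps and intervals aligned to the epoch. `normalizeTime` is a hypothetical helper: with a GROUP BY interval it returns the start of the bucket containing the point, and without one it returns the query start time.

```go
package main

import "fmt"

// normalizeTime returns the timestamp a result should be reported at.
// With a positive interval it is the start of the epoch-aligned bucket
// that contains t; otherwise it is the query start time.
func normalizeTime(t, start, interval int64) int64 {
	if interval <= 0 {
		return start
	}
	r := t % interval
	if r < 0 {
		r += interval // keep floor semantics for negative timestamps
	}
	return t - r
}

func main() {
	fmt.Println(normalizeTime(125, 100, 60)) // prints 120
	fmt.Println(normalizeTime(125, 100, 0))  // prints 100
}
```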
0.11 no longer uses some files from 0.10. The code was a little
too aggressive and removed these files, which would break rolling back
to 0.10 if necessary. Since shards must be migrated to tsm before
upgrading to 0.11 and a user might not know they still have old shard
formats, they would not be able to revert back to 0.10 and migrate
them.
Also adds uptime stats to usage data.