The previous parseFill would try to parse an expression and only unscan
one token when it failed. This caused it to not put back the correct
number of tokens with some expression.
Now it has been modified to check for the fill ident ahead of time and
then use ParseExpr() to parse the call. If the expression fails to parse
into a call, it will send an error instead of trying to continue with an
invalid parser state.
Fixes#6543.
The `SHOW MEASUREMENTS` and `SHOW TAG VALUES` cannot go through the
query engine to get the speed they need. They also only need access to
the database index and do not need access to specific shards. This
removes the query rewriting that was done to turn these two queries into
a select statement and reimplements them inside of the coordinator as an
interface on the TSDBStore.
Truncate the time interval output of the monitor service to be on even
time intervals rather than on every minute based on the start time. This
normalizes the output from the monitor service.
Previously, it encoded the text representation of the regex literal
which included the surrounding slashes used in the query language. The
binary encoding should only include the exact string used to create the
regular expression.
The tsdb package had a substantial amount of dead code related to the
old query engine still in there. It is no longer used, so it was removed
since it was left unmaintained. There is likely still more code that is
the same, but wasn't found as part of this code cleanup.
influxql has dead code show up because of the code generation so it is
not included in this pruning.
The highest time represented by a nanosecond needs to be used for an
exclusive range, so the maximum time needs to be one less than the
possible maximum number of nanoseconds representable by an int64 so that
we don't lose a point at that one time.
Previously worked in the open source version because the timestamp used
for finding a shard would be truncated by the retention policy so the
lookup time didn't run into this edge case because it didn't rest on the
truncation boundary. Since that point didn't really belong in that shard
group and was placed there by mistake, it's best to fix this bug since
the timestamp used to create the shard group should be capable of
retrieving it.
This adds support for using regex expressions in SHOW TAG VALUES when
selecting the key. Also supporting the `!=` operation for the
comparison. Now you can do any of the following:
SHOW TAG VALUES WITH KEY != "region"
SHOW TAG VALUES WITH KEY =~ /region/
SHOW TAG VALUES WITH KEY !~ /region/
It also adds a new SetLiteral AST node that will potentially be used in
the future to allow set operations for other comparisons in the future.
Fixes#4532.
The task manager now acts as its own statement executor so that a custom
statement executor can perform custom actions for KillQueryStatement and
ShowQueriesStatement.
This allows us to add additional options to ExecuteQuery without
creating parameter bloat.
Removing the unused Series structs. Their necessity was removed by a
previous commit, but the structs were not removed yet.
Add another type of interrupt iterator that monitors the interrupt
channel and calls `Close()` on the iterator when the interrupt happens.
It will primarily be used for asynchronously closing the ReaderIterator,
but it will only close the read side of the connection properly. More
work needs to be done to allow closing the write side efficiently.
The current code would compare every string literal it crossed and tried
to coerce them to time literals if the _looked_ like date/time strings.
The only time the TimeLiteral was used is when comparing to the the
'time' value in a where clause. This change moves the string parsing
code until we attempt to compare 'time' to a string, at which point we
know we need/want a TimeLiteral, and not just an ordinary string.
Fixes#6727
The default retention policy name is changed to "autogen" instead of
"default" since it ends up being ambiguous when we tell a user to check
the default retention policy, it is uncertain if we are referring to the
default retention policy (which can be changed) or the retention policy
with the name "default".
Now the automatically generated retention policy name is "autogen".
The default retention policy is now also configurable through the
configuration file so an administrator can customize what they think
should be the default.
Fixes#3733.
A query's String method is called multiple times per query. This commit
ensures all calls to query.String share use of a strings.NewReplacer.
This approximately halves the number of allocations for the benchmarked
query.
If you use a statement like this:
SELECT value FROM one..cpu, two..cpu
It will access both the `one` and `two` databases as if you had selected
the `cpu` measurement twice for both of them. Updated the `tsdb.Shard`
create iterator function to filter out any sources that do not apply to
that shard so this duplication doesn't happen.
Fixes#6701.
The parser can be passed a map of keys to literal values to be replaced
into the query. Parameters are preceded by a dollar sign (`$`). If a
parameter key is missing, an error is thrown by the parser.
Fixes#2926.
Casting syntax is done with the PostgreSQL syntax `field1::float` to
specify which type should be used when selecting a field. You can also
do `field1::field` or `tag1::tag` to specify that a field or tag should
be selected.
This makes it possible to select a tag when a field key and a tag key
conflict with each other in a measurement. It also means it's possible
to choose a field with a specific type if multiple shards disagree. If
no types are given, the same ordering for how a type is chosen is used
to determine which type to return.
The FieldDimensions method has been updated to return the data type for
the fields that get returned. The SeriesKeys function has also been
removed since it is no longer needed. SeriesKeys was originally used for
the fill iterator, but then expanded to be used by auxiliary iterators
for determining the channel iterator types. The fill iterator doesn't
need it anymore and the auxiliary types are better served by
FieldDimensions implementing that functionality, so SeriesKeys is no
longer needed.
Fixes#6519.
encodeTags would encode the tags by outputting every key followed by
every value in alphabetical order. decodeTags would try to read this in
an old format that printed tags in key/value order.
This fix matches decodeTags to match the same format encodeTags outputs.
This commit moves the `CallIterator` to wrap the individual series
instead of wrapping a shard. This allows individual points to be
aggregated before being merged.
This will cause a small increase in memory usuage per series but
it shows a 20% decrease in query time when there are a moderate
number of points per series.
This commit changes the `SeriesIterator` to process one measurement
at a time and uses a `floatFastDedupeIterator` to avoid point
encoding during deduplication.
If a shard is empty for a specific field and the field type is something
other than a float, a nil iterator would get returned from one of the
empty shards and cause the combined iterators to be cast to the float
type and all other iterator types to be discarded (or for integers, to
be cast).
This is rare since most aggregates don't accept strings or booleans, but
for queries like:
SELECT distinct(string) FROM mydata
It would result in nothing getting returned if one of the shards didn't
have a value for `string`.
This change modifies the query engine to return nil for the shards
instead of a fake iterator and then to only use the fake iterator if the
final aggregate iterator is nil (meaning that no iterators could be
constructed for the field from any shard).
Fixes#6495.
An offset of `time(1m, now())` will anchor the offset to the current
time of the query. The default offset is `0s` which is the current
default anyway.
This fixes#2074 by making time zone offset support unnecessary. Time
comparisons can use timezones inside of the time clause and the offset
needed for non-hour timezone differences can be used as part of the
offset argument.
Also removes the functions `HasSimpleCount()` and `HasCountDistinct()`
as they are no longer useful. They had a small role in validation that
has now been moved into `validateAggregates()`.
Fixes#6472.
The time of the point will be returned with a selector when there is no
group by interval and when there is only one selector. Any other
conditions will return the start time of the interval.
Fixes#5890.
In order to follow REST a bit more carefully, all write operations
should go through a POST in the future. We still allow read operations
through either GET or POST (similar to the Graphite /render endpoint),
but write operations will trigger a returned warning as part of the JSON
response and will eventually return an error.
Also updates the Golang client libraries to always use POST instead of
GET.
Fixes#6290.
The interrupt iterator currently introduces a non-trivial amount of
overhead to queries by checking for interrupts every 256 points.
This commit adjusts that check to every 5000 points.
There are also several places where nested field access has been
adjusted to minimize field lookups.
This commit removes support for `SHOW SERVERS` and `DROP SERVER`
from the `influxql` package. It also removes extraneous cluster
testing code from `cmd/influxd/run`.
Fixes#6465
The derivative function had an arbitrary limitation that would cause it
to set the value to zero if the previous value was after the next value.
This caused all `ORDER BY desc` queries with `derivative()` to always
return zero values.
Fixes#4675.
Sanitizing is now done through pattern matching rather than parsing the
query and replacing the password in the query. This prevents
accidentally redacting the wrong part of a query and revealing what the
password is through association.
Fixes#3883.
This removes the previous restrictions that kept derivative as only
capable of being used in a single field and only at the top level.
This lets users determine how they want to use derivative more freely
and opens up the possibility of also using math between derivatives.
This may open up some problems when it comes to math between derivatives
as timestamps may not match correctly. That is likely a problem related
to any binary math to begin with though and can probably be ignored by
the derivatives. I'm also not sure it makes sense to perform any math
between a derivative and a difference or perform math between a
derivative and a mean.
Fixes#6118.
This has various benefits:
- Users embedding InfluxDB within other Go programs can specify a different logger / prefix easily.
- More consistent with code used elsewhere in InfluxDB (e.g. services, other `run.Server.*` fields, etc).
- This is also more efficient, because it means `executeQuery` no longer allocates a single `*log.Logger` each time it is called.
The series keys within a tag set were previously not sorted which would
cause the output to be non-deterministic. This sorts the output series
by their keys so it has a consistent output especially when using
limits.
Fixes#3166.
This also switches the remaining iterators to be lazy so they can return
errors properly. They needed to be converted to lazy initialization
anyway, which has the side effect of making it much easier for us to
propagate the underlying error during initialization.
Updated the Emitter to return errors when it cannot read properly from
the iterators.
The optional sections of the command consumed the semicolon token and
didn't put it back for the outer loop. The code shouldn't explicitly
check for a semicolon or EOF anyway, so these checks were removed and
the token gets unscanned if it doesn't match the optional token that the
parser is looking for.
Fixes#6398.
For aggregate queries, derivatives will now alter the start time to one
interval behind and will use that interval to find the derivative of the
first point instead of giving no value for that interval. Null values
will still be discarded so if the interval before the one you are
querying is null, then it will be discarded like if it were in the
middle of the query. You can use `fill(0)` to fill in these values.
This does not apply to raw queries yet.
Also modified the derivative and difference aggregates to use the stream
iterator instead of the reduce slice iterator for space efficiency.
Fixes#3247. Contributes to #5943.
The deprecated message is now attached to a new attribute returned with
the results. This message can then be read by clients to warn a user
about upcoming changes to the query engine.
The `influx` client has already been modified to read this message and
print it out for every format except CSV.
The first warning message is a deprecated message about removing `IF NOT
EXISTS` from `CREATE DATABASE`.
The message will also be printed to the server log.
Fixes#5707.
This commit changes the channel iterators to use a double buffer
to reduce allocations. The caller of `Iterator.Next()` must copy
out the point before calling `Next()` again.
The `percentile()` call previously did not validate that the first
argument was a variable reference and that would let an invalid query
slip by that would panic the query engine.
Added checking for this case and also included test cases for the other
calls that require a variable reference as the first argument.
Fixes#6379.