When using `non_negative_derivative()` and `last()` together in a math
expression, the points from each side would not be matched with each
other because one of those functions emits one fewer point than the
other. The math iterators have been modified so they now track the name
and tags of each point and match based on those.
This isn't necessarily ideal and may come back to bite us in the future.
We don't have a defined ordering for all iterators, so it can be
difficult to know which of two points is supposed to come first. This
uses the common ordering that usually makes sense, but the query engine
is getting complicated enough that I am not 100% certain this is correct
in all circumstances.
Fixes #7906
In an attempt to reduce the overhead of using a regex for exact matches,
the query parser will replace `=~ /^thing$/` with `== 'thing'`, but the
conditions being checked ignored whether any flags were set on the
expression. As a result, `=~ /(?i)^THING$/` was replaced with
`== 'THING'`, which fails unless the case already matches exactly. This
change ensures that no flags have been changed from those defaulted by
the parser before rewriting.
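As an illustration of the flag check, here is a sketch using the
standard `regexp/syntax` package (the helper name and structure are
hypothetical, not the parser's actual code):

    package main

    import (
        "fmt"
        "regexp/syntax"
    )

    // literalExact reports the literal and true only when pattern is exactly
    // ^literal$ with the parser's default flags, so the regex may safely be
    // rewritten to an equality comparison.
    func literalExact(pattern string) (string, bool) {
        re, err := syntax.Parse(pattern, syntax.Perl)
        if err != nil {
            return "", false
        }
        // Expect a concatenation of ^, a single literal, and $.
        if re.Op != syntax.OpConcat || len(re.Sub) != 3 {
            return "", false
        }
        if re.Sub[0].Op != syntax.OpBeginText && re.Sub[0].Op != syntax.OpBeginLine {
            return "", false
        }
        if re.Sub[2].Op != syntax.OpEndText && re.Sub[2].Op != syntax.OpEndLine {
            return "", false
        }
        // A (?i) flag survives parsing as FoldCase on the literal, so the
        // rewrite must be refused there to preserve the regex semantics.
        lit := re.Sub[1]
        if lit.Op != syntax.OpLiteral || lit.Flags&syntax.FoldCase != 0 {
            return "", false
        }
        return string(lit.Rune), true
    }

    func main() {
        fmt.Println(literalExact(`^thing$`))     // thing true
        fmt.Println(literalExact(`(?i)^THING$`)) // "" false: rewrite refused
    }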
During development, I at some point decided that the dimensions should
be expanded based on what was available rather than what was present in
the subquery. I no longer remember the rationale, and it doesn't make
sense or seem to be particularly useful. Expanding dimensions now just
uses the values specified in the subquery rather than expanding to all
available dimensions of the measurement in the subquery.
Previously, only time expressions got propagated inwards. The reason for
this was simple: if the outer query was going to filter to a specific
time range, then it would be unnecessary for the inner query to output
points outside that time range. It started as an optimization, but
became a feature because there was no reason to have the user repeat the
same time clause for the inner query as the outer query. So we allowed
an aggregate query with an interval to pass validation in the subquery
if the outer query had a time range. But `GROUP BY` clauses were not
propagated because that same logic didn't apply to them; it's not an
optimization there. So while grouping by a tag in the outer query
without grouping by it in the inner query was useless, there wasn't any
particular reason to care.
Then a bug was found where wildcards would propagate the dimensions
correctly, but an outer query containing a `GROUP BY` that the inner
query omitted wouldn't correctly filter out that grouping. We could fix
the filtering, but on further review, I had been seeing people make the
same mistake a lot: people seem to just believe that the grouping should
be propagated inwards. Instead of trying to fight what the user wanted
and explicitly erase groupings that weren't propagated manually, we
might as well just propagate them for the user to make their lives
easier. There is no useful situation where you would want to group into
buckets that can't physically exist, so we might as well do _something_
useful.
This will also now propagate time intervals to inner queries since the
same applies there. But, while the interval propagates, the following
query will not pass validation since it is still not possible to use a
grouping interval with a raw query (even if the inner query is an
aggregate):
SELECT * FROM (SELECT mean(value) FROM cpu) WHERE time > now() - 5m GROUP BY time(1m)
This also means wildcards will behave a bit differently. They will
retrieve dimensions from the sources in the inner query rather than just
using the dimensions in the group by.
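For example, with illustrative measurement, field, and tag names, the
outer grouping in a query like this now propagates into the subquery, so
it behaves as if the inner query had also specified `GROUP BY host`:
SELECT mean(value) FROM (SELECT value FROM cpu) GROUP BY host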
Fixing top() and bottom() to return the correct auxiliary fields.
Unfortunately, we were not copying the buffer with the auxiliary fields
so those values would be overwritten by a later point.
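The shape of the bug, as a minimal Go sketch (the type and field names
are illustrative, not the engine's actual ones):

    package engine // illustrative

    // point carries auxiliary fields decoded alongside the value. The decoder
    // reuses its scratch buffer between points, so a point that is retained
    // must copy the slice or a later point will overwrite the values.
    type point struct {
        Time  int64
        Value float64
        Aux   []interface{}
    }

    func retain(p point) point {
        aux := make([]interface{}, len(p.Aux))
        copy(aux, p.Aux) // the missing copy: without it, p.Aux aliases shared memory
        p.Aux = aux
        return p
    }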
Also, fix the `Iterators.Merge(IteratorOptions)` function so it consults
the `Ordered` attribute to determine which iterator it should use to
merge the input iterators.
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.
Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).
The syntax is like this:
SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)
This will execute derivative and then sum the result of those derivatives.
Another example:
SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)
This would let you find the maximum minimum value of each host.
There is complete freedom to mix subqueries with auxiliary fields. The
only caveat is that the following two queries have different performance
characteristics:
SELECT mean(value) FROM cpu
SELECT mean(value) FROM (SELECT value FROM cpu)
The first will calculate `mean(value)` at the shard level and will be
faster, especially when it comes to clustered setups. The second will
process the mean at the top level and will not include that
optimization.
The reducers already had a local RNG but mistakenly did not use it when
sampling points.
Because the local RNG is not protected by a mutex, there is a slight
speedup as a result of this change:
benchmark old ns/op new ns/op delta
BenchmarkSampleIterator_1k-4 418 418 +0.00%
BenchmarkSampleIterator_100k-4 434 422 -2.76%
BenchmarkSampleIterator_1M-4 449 439 -2.23%
benchmark old allocs new allocs delta
BenchmarkSampleIterator_1k-4 3 3 +0.00%
BenchmarkSampleIterator_100k-4 3 3 +0.00%
BenchmarkSampleIterator_1M-4 3 3 +0.00%
benchmark old bytes new bytes delta
BenchmarkSampleIterator_1k-4 304 304 +0.00%
BenchmarkSampleIterator_100k-4 304 304 +0.00%
BenchmarkSampleIterator_1M-4 304 304 +0.00%
The speedup would presumably increase when multiple sample iterators are
used concurrently.
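A sketch of the pattern (the names are illustrative, not the engine's
actual types):

    package query // illustrative

    import (
        "math/rand"
        "time"
    )

    // sampleReducer keeps its own RNG. math/rand's global functions share a
    // mutex-protected source; a per-reducer *rand.Rand avoids that contention.
    type sampleReducer struct {
        points []float64
        seen   int
        rng    *rand.Rand
    }

    func newSampleReducer(size int) *sampleReducer {
        return &sampleReducer{
            points: make([]float64, 0, size),
            rng:    rand.New(rand.NewSource(time.Now().UnixNano())),
        }
    }

    // Aggregate performs reservoir sampling: keep the first `size` values,
    // then replace a random slot with decreasing probability so every value
    // seen so far is equally likely to remain in the sample.
    func (r *sampleReducer) Aggregate(v float64) {
        r.seen++
        if len(r.points) < cap(r.points) {
            r.points = append(r.points, v)
            return
        }
        if i := r.rng.Intn(r.seen); i < len(r.points) {
            r.points[i] = v
        }
    }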
It looks like the real import path for the project is go.uber.org/zap
instead of github.com/uber-go/zap, since the examples in the project
reference that path.
The logging library has been switched to use uber-go/zap. While the
logging has been changed to use structured logging, this commit does not
change any of the logging statements to take advantage of the new
structured log or new log levels. Those changes will come in future
commits.
`percentile()` is supposed to be a selector and return the time of the
point, but that only got changed when the input was a float. Updating
the integer processor to also return the time of the point rather than
the beginning of the interval.
The `partial` tag has been added to the JSON response of a series and
the result so that a client knows when more of the series or result will
be sent in a future JSON chunk.
This helps interactive clients who don't want to wait for all of the
data to know if it is done processing the current series or the current
result. Previously, the client had to guess if the next chunk would
refer to the same result or a new result and it had to match the name
and tags of the two series to know if they were the same series. Now,
the client just needs to check the `partial` field included with the
response to know if it should expect more.
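For example, a chunk whose current series and result both have more data
coming might look like this (abbreviated):
{"results":[{"series":[{"name":"cpu","columns":["time","value"],"values":[["2016-01-01T00:00:00Z",2]],"partial":true}],"partial":true}]}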
Fixed `max-row-limit` so it counts rows instead of results and it
truncates the response when the `max-row-limit` is reached.
When the `max-row-limit` was hit, the goroutine reading from the results
channel would stop reading from the channel, but it didn't signal to the
sender that it was no longer reading from the results. This caused the
sender to continue trying to send results even though nobody would ever
read it and this created a deadlock.
Include an `AbortCh` on the `ExecutionContext` that will signal when
results are no longer desired so the sender can abort instead of
deadlocking.
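A minimal sketch of the abort-aware send (the names follow the message
above; the details are illustrative):

    package executor // illustrative

    type Result struct{ /* rows elided */ }

    // send delivers a result unless the consumer has aborted. Selecting on
    // the abort channel is what removes the deadlock: once the reader hits
    // max-row-limit and closes abortCh, this send no longer blocks forever.
    func send(results chan<- *Result, abortCh <-chan struct{}, r *Result) bool {
        select {
        case results <- r:
            return true
        case <-abortCh:
            return false
        }
    }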
When a query would use a grouping with two different aggregates, it was
possible for one of the aggregates to return a value from a different
series key than the second aggregate. When these series keys didn't
match, the returned grouping would be screwed up because it sorted by
time before checking for name and tags.
This did not happen when the aggregates returned values for the same
series keys because then the iterators were aligned with each other.
The parser was updated previously in #7295 and the functionality was
supposed to be there, but the wiring in the query engine for that to
happen was never written.
The previous version showed the microseconds unit when it was actually
outputting nanoseconds. Now we correctly identify which sub-second unit
to use (milliseconds, microseconds, or nanoseconds) and divide the
duration by that unit to produce the correct output.
Also updated to use the default duration string instead of our own
custom formatters. It turns out that the string method for
`time.Duration` does the correct thing as long as we truncate the value
first.
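A quick illustration that `time.Duration`'s String method picks the
right sub-second unit on its own once the value is truncated:

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        d := 1234567 * time.Nanosecond
        fmt.Println(d)                            // 1.234567ms: String picks the unit
        fmt.Println(d.Truncate(time.Microsecond)) // 1.234ms: truncate, then format
    }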
When a limit is exceeded, we return errors and sometimes log (if
appropriate) that a limit was exceeded. The messages don't always
indicate where or how the limit is configured.
Instead, return the config option (which is easily searchable), the
limit currently set, and the value that exceeded it when possible.
The delete and drop statements apply to the measurement within a db.
The parser allowed a db or rp to be specified and these values were
silently ignored. This could cause data loss as someone would think
they are only deleting the series within a rp, but they are actually
deleting all their data.
Instead, we return a parse error if a db or rp is specified in the
delete or drop statements. Ideally, we'd be able to respect the
db and rp, but that requires significant work in the query engine
and tsdb store to make that work.
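For example, a statement like the following (names illustrative) is now
rejected with a parse error instead of silently ignoring the database
and retention policy and deleting the measurement everywhere:
DELETE FROM "mydb"."myrp"."cpu"
The supported form remains `DELETE FROM "cpu"` with the database
selected separately.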
Fixes #7053
This commit adds support for replacing regexes with non-regex conditions
when possible. Currently the following regexes are supported:
- host =~ /^foo$/ will be converted into host = 'foo'
- host !~ /^foo$/ will be converted into host != 'foo'
Note: if the regex expression contains character classes, grouping,
repetition or similar, it may not be rewritten.
For example, the condition: name =~ /^foo|bar$/ will not be rewritten.
Support for this may arrive in the future.
Regexes that can be converted into simpler expressions will be able to
take advantage of the tsdb index, making them significantly faster.
Previously, calling derivative with a non-duration second argument was
allowed during parsing but would panic during execution due to a failed
type conversion. This change ensures the second argument is a duration
literal.
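For example (field name illustrative), the first query remains valid
while the second is now rejected at parse time instead of panicking
during execution:
SELECT derivative(value, 10s) FROM cpu
SELECT derivative(value, 10) FROM cpu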
The functionality works the same as wildcards, but this time, you can
specify a regular expression.
One limitation is that you can't specify whether you only want to select
fields or tags. Since the regex can be changed to suit the person's
needs, I don't currently think this is an issue.
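For example, to select every field or tag key starting with `usage`
(names illustrative):
SELECT /^usage/ FROM cpu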
`stddev()` on strings would always return an empty string, and stddev is
meaningless when it comes to strings. This removes that functionality so
string fields no longer get automatically picked up when using a
wildcard.
The `cumulative_sum()` function can be used to sum each new point and
output the current total. For the following points:
cpu value=2 0
cpu value=4 10
cpu value=6 20
This would output the following points:
> SELECT cumulative_sum(value) FROM cpu
time value
---- -----
0 2
10 6
20 12
As can be seen, each new point adds to the sum of the previous point and
outputs the value with the same timestamp.
The function can also be used with an aggregate, in the same way
`derivative()` can.
> SELECT cumulative_sum(mean(value)) FROM cpu WHERE time >= now() - 10m GROUP BY time(1m)
First pass at implementing sample:
* Add sample iterators for all types
* Remove size from sample struct
* Fix off-by-one error when generating random numbers
* Add benchmarks for sample iterator
* Add test and associated fixes for the off-by-one error
* Add test for sample function
* Remove NumericLiteral from sample function call
* Make clear that the counter is incremented with each call
* Rename IsRandom to AllSamplesSeen
* Add an RNG for each reducer that is created: the default RNG that
  comes with math/rand has a global lock. To avoid having to worry about
  any contention on the lock, each reducer now has its own time-seeded
  RNG.
* Add sample function to changelog
* Clean up template for fill(average)
* Change fill(average) to fill(linear)
* Update average to linear in the influxql spec
* Add integer tests and associated fixes
* Update CHANGELOG for fill(linear)
Manual use of system queries could result in a user using the query
incorrectly. Rather than check to make sure the query was used
correctly, we're just going to prevent users from using those sources so
they can't use them incorrectly.
The vet checks for some files did not pass for go 1.7. As part of a
preliminary start to making go 1.7 work with this software, go vet
should pass.
Also updated the gogo/protobuf dependency which fixed the code generator
to work with go 1.7 too. Ran `go generate` on the entire repository to
ensure every file was up to date.
The derivative() call would panic if it received two points at the same
time because it tried to divide by zero. The derivative call now skips
past these points. To avoid points being skipped, use `GROUP BY *` so
that each series is kept separate in its own group.
The difference() call has also been modified to skip past these points.
Even though difference doesn't divide by the time, difference is
supposed to perform the same as derivative, just without dividing by the
time.
Return an error when we encounter the same option twice in ALTER
RETENTION POLICY and remove the `maxNumOptions` number from the parsing
loop. The `maxNumOptions` number would need to be modified whenever
another option was added to the parsing loop, and it didn't correctly
report duplicate options as an error anyway.
Normalize all of the SHOW commands so they allow both using ON to
specify the database and using the default database. Some commands would
require one and some would require the other and it was confusing when
using the query language.
Affected commands:
* SHOW RETENTION POLICIES
* SHOW MEASUREMENTS
* SHOW SERIES
* SHOW TAG KEYS
* SHOW TAG VALUES
* SHOW FIELD KEYS
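Each of these commands now accepts both forms, for example:
SHOW MEASUREMENTS ON mydb
SHOW MEASUREMENTS
where the second form relies on the default database.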
When attempting to reduce the WHERE clause, the time literals had not
been converted from string literals yet. This adds the functionality to
have it handle the same time math when the time literal is still a
string literal.
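For example, a condition like this (timestamp illustrative) can now be
reduced even while the timestamp is still a string literal:
SELECT * FROM cpu WHERE time > '2000-01-01T00:00:00Z' + 2h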
The dollar sign would sometimes be treated as whitespace if it was
immediately followed by a reserved keyword or an invalid character. The
scanner now reads these properly as a bound parameter rather than
ignoring the dollar sign.
Instead of having the parser set the defaults, the command will set the
defaults so that the constants for that are actually used. This way we
can also identify which things the user provided and which ones we are
filling with default values.
This allows the meta client to make smarter decisions when determining
whether what the user requested conflicts with what already exists or
whether the requested capabilities match what is currently available. If
you just say `CREATE DATABASE WITH NAME myrp`, the user doesn't really
care what the duration of the retention policy is and just wants to use
the default. Now, we can use that information to determine whether an
existing retention policy would conflict with what the user requested,
rather than returning an error if a default value ever gets changed,
since the meta client command can communicate intent more easily.
We added `SHARD DURATION` as an extra option, but forgot to increase the
maximum number of allowable options from 3 to 4. So if 4 options were
used, the last one was ignored. This was commonly `DEFAULT`, but it
could have been any of the options.
Negative timestamps are now supported. We also now refuse the two
nanosecond values at the edge of the minimum time window. The first
value we do not accept is because we need MinInt64 to be used for some
internal comparisons in the TSM engine, and subtracting one from the
minimum time was causing an underflow. The second is so we can have one
minimum time that signifies the default minimum that nobody can write to
(so we can implicitly rewrite the timestamp on aggregate queries) but
still use the explicit timestamp if it is given to us by the user. We
aren't able to tell the difference between the user providing it and it
being implicit without those values being different.
If the default minimum time is used with an aggregate query, we rewrite
the time to be the epoch for backwards compatibility since we believe
that's more important than supporting that extra nanosecond.
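As a sketch of the resulting layout at the bottom of the time range (the
constant name is illustrative; the values follow the description above):

    package models // illustrative

    import "math"

    // math.MinInt64 stays reserved for internal TSM engine comparisons, and
    // math.MinInt64 + 1 is the implicit default minimum that nobody can write
    // to, which aggregate queries rewrite to the epoch.
    const MinNanoTime = math.MinInt64 + 2 // earliest timestamp a user may write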
This allows a long-running query that executes a series of otherwise
uninterruptible statements, such as creating or dropping many databases,
to still be interrupted.
It is now possible to use a mixed duration unit like `1h30m`. The
duration units can be in whatever order as long as they are connected to
each other.
There is a change to the scanner. A token such as `10x` will be scanned
as a duration literal, but will then fail to parse as an invalid
duration. This should not be a breaking change as there is no situation
where `10m10` was a valid order of tokens for the parser.
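For example (measurement and field names illustrative):
SELECT mean(value) FROM cpu WHERE time > now() - 1h30m GROUP BY time(1m30s)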
Fixes #3634.
The query executor would only store the number of active queries and the
query duration so it was impossible to determine how many queries were
actually executed during that timeframe because quick queries would be
gone before the call to gather statistics was made.
This adds two new statistics to track when queries start and when
queries finish, and doesn't decrement the counter, so the number of
executed queries can be obtained using `derivative()` and
`difference()`.
This commit fixes the `MaxSelectSeriesN` limit which was broken by
the implementation of lazy iterators. The setting previously limited
the total number of series but the new implementation limits the
concurrent number of series being processed.
This commit limits queries to only process one shard at a time.
However, within a shard, multiple series can still be processed in
parallel. Shard iterators are lazily instantiated during query
execution to limit the amount of memory a given query uses.
The previous parseFill would try to parse an expression and only unscan
one token when it failed. This caused it to not put back the correct
number of tokens with some expressions.
Now it has been modified to check for the fill ident ahead of time and
then use ParseExpr() to parse the call. If the expression fails to parse
into a call, it will return an error instead of trying to continue with
an invalid parser state.
Fixes #6543.