Commit Graph

263 Commits (815f740f4c4f4253dffd0a18471eb9ba02f39c0a)

Author SHA1 Message Date
Jonathan A. Sternberg 7d043dbc61 Add nanosecond duration literal support 2017-05-19 10:44:11 -05:00
Jonathan A. Sternberg 57a2abbc87 Restrict top() and bottom() selectors to be used with no other functions 2017-04-14 10:23:07 -05:00
Jonathan A. Sternberg a550d323c4 Restrict fill(none) and fill(linear) to be usable only with aggregate queries 2017-04-10 15:58:05 -05:00
zhexuany 232fdae6dd introduce a new function non_negative_difference 2017-03-31 23:08:36 +08:00
Jonathan A. Sternberg 3e52ec7ca2 Merge pull request #7762 from influxdata/js-6541-timezone-support
Support timezone offsets for queries
2017-03-28 10:39:07 -05:00
Jonathan A. Sternberg ccf0cb8371 Fix query parser when using addition and subtraction without spaces
Additionally, support unary addition and subtraction for variables,
calls, and parenthesis expressions. Doing `-value` will be the
equivalent of doing `-1 * value` now.
2017-03-24 12:52:19 -05:00
Jonathan A. Sternberg 347b01814e Support timezone offsets for queries
The timezone for a query can now be added to the end with something like
`TZ("America/Los_Angeles")` and it will localize the results of the
query to be in that timezone. The offset will automatically be set to
the offset for that timezone and offsets will automatically adjust for
daylight savings time so grouping by a day will result in a 25 hour day
once a year and a 23 hour day another day of the year.

The automatic adjustment of intervals for timezone offsets changing will
only happen if the group by period is greater than the timezone offset
would be. That means grouping by an hour or less will not be affected by
daylight savings time, but a 2 hour or 1 day interval will be.

The default timezone is UTC and existing queries are unaffected by this
change.

When times are returned as strings (when `epoch=1` is not used), the
results will be returned using the requested timezone format in RFC3339
format.
2017-03-22 15:09:41 -05:00
Jonathan A. Sternberg a6c09e58a0 Return an error when an invalid duration literal is parsed 2017-03-21 12:10:41 -05:00
Jonathan A. Sternberg 208d8507f1 Implement both single and multiline comments in influxql
A single line comment will read until the end of a line and is started
with `--` (just like SQL). A multiline comment is with `/* */`. You
cannot nest multiline comments.
2017-03-15 14:24:09 -05:00
Jonathan A. Sternberg 55e64e1edd Fixed String() output for MOD operator and added MOD op precedence 2017-02-10 16:48:05 -06:00
Jonathan A. Sternberg f628b4a198 Update subqueries so groupings are propagated to inner queries
Previously, only time expressions got propagated inwards. The reason for
this was simple. If the outer query was going to filter to a specific
time range, then it would be unnecessary for the inner query to output
points within that time frame. It started as an optimization, but became
a feature because there was no reason to have the user repeat the same
time clause for the inner query as the outer query. So we allowed an
aggregate query with an interval to pass validation in the subquery if
the outer query had a time range. But `GROUP BY` clauses were not
propagated because that same logic didn't apply to them. It's not an
optimization there. So while grouping by a tag in the outer query
without grouping by it in the inner query was useless, there wasn't any
particular reason to care.

Then a bug was found where wildcards would propagate the dimensions
correctly, but the outer query containing a group by with the inner
query omitting it wouldn't correctly filter out the outer group by. We
could fix that filtering, but on further review, I had been seeing
people make that same mistake a lot. People seem to just believe that
the grouping should be propagated inwards. Instead of trying to fight
what the user wanted and explicitly erase groupings that weren't
propagated manually, we might as well just propagate them for the user
to make their lives easier. There is no useful situation where you would
want to group into buckets that can't physically exist so we might as
well do _something_ useful.

This will also now propagate time intervals to inner queries since the
same applies there. But, while the interval propagates, the following
query will not pass validation since it is still not possible to use a
grouping interval with a raw query (even if the inner query is an
aggregate):

    SELECT * FROM (SELECT mean(value) FROM cpu) WHERE time > now() - 5m GROUP BY time(1m)

This also means wildcards will behave a bit differently. They will
retrieve dimensions from the sources in the inner query rather than just
using the dimensions in the group by.

Fixing top() and bottom() to return the correct auxiliary fields.
Unfortunately, we were not copying the buffer with the auxiliary fields
so those values would be overwritten by a later point.
2017-01-23 12:38:10 -06:00
Jonathan A. Sternberg d7c8c7ca4f Support subquery execution in the query language
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.

Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).

The syntax is like this:

    SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)

This will execute derivative and then sum the result of those derivatives.
Another example:

    SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)

This would let you find the maximum minimum value of each host.

There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:

    SELECT mean(value) FROM cpu
    SELECT mean(value) FROM (SELECT value FROM cpu)

Have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially when it comes to
clustered setups. The second will process the mean at the top level and will not
include that optimization.
2017-01-07 13:00:48 -06:00
Cory LaNou 572da8985c enforce minimum shard duration when creating retention policies 2016-12-20 09:11:43 -06:00
Jonathan A. Sternberg c957bf7f99 Quote the empty string as an ident
Without this quoting, the function `max("")` turns into `max()` and will
not be reparsed correctly.
2016-11-18 16:25:39 -06:00
Jason Wilder af72d9b0e4 Merge pull request #7515 from influxdata/jw-7053
Return parse error from delete/drop when db or rp is specified
2016-10-25 12:05:56 -06:00
Jason Wilder c68b7a192f Return parse error from delete/drop when db or rp is specified
The delete and drop statements apply to the measurement within a db.
The parser allowed a db or rp to be specified and these values were
silently ignored.  This could cause data loss as someone would think
they are only deleting the series within a rp, but they are actually
deleting all their data.

Instead, we return a parse error if a db or rp is specified in the
delete or drop statements.  Ideally, we'd be able to respect the
db and rp, but that requires significant work in the query engine
and tsdb store to make that work.

Fixes #7053
2016-10-25 11:43:15 -06:00
Edd Robinson 06d1226b9a Rewrite exact match regexes to use tsdb index
This commit adds support for replacing regexes with non-regex conditions
when possible. Currently the following regexes are supported:

 - host =~ /^foo$/ will be converted into host = 'foo'
 - host !~ /^foo$/ will be converted into host != 'foo'

Note: if the regex expression contains character classes, grouping,
repetition or similar, it may not be rewritten.

For example, the condition: name =~ /^foo|bar$/ will not be rewritten.
Support for this may arrive in the future.

Regexes that can be converted into simpler expression will be able to
take advantage of the tsdb index, making them significantly faster.
2016-10-25 11:10:03 +01:00
Jonathan A. Sternberg 6afc2a77a5 Implement cumulative_sum() function
The `cumulative_sum()` function can be used to sum each new point and
output the current total. For the following points:

    cpu value=2 0
    cpu value=4 10
    cpu value=6 20

This would output the following points:

    > SELECT cumulative_sum(value) FROM cpu
    time    value
    ----    -----
    0       2
    10      6
    20      12

As can be seen, each new point adds to the sum of the previous point and
outputs the value with the same timestamp.

The function can also be used with an aggregate like `derivative()`.

    > SELECT cumulative_sum(mean(value) FROM cpu WHERE time >= now() - 10m GROUP BY time(1m)
2016-10-07 10:11:53 -05:00
Michael Desa f9b8129770 Add sample function to query language
First Pass at implementing sample

Add sample iterators for all types

Remove size from sample struct

Fix off by one error when generating random number

Add benchmarks for sample iterator

Add test and associated fixes for off by one error

Add test for sample function

Remove NumericLiteral from sample function call

Make clear that the counter is incr w/ each call

Rename IsRandom to AllSamplesSeen

Add a rng for each reducer that is created

The default rng that comes with math/rand has a global lock. To avoid
having to worry about any contention on the lock, each reducer now has
its own time seeded rng.

Add sample function to changelog
2016-10-06 09:41:42 -07:00
Michael Desa 966e5503bf Add fill(linear) to query language
Clean up template for fill average

Change fill(average) to fill(linear)

Update average to linear in infuxql spec

Add Integer Tests and associated fixes

Update CHANGELOG for fill(linear)
2016-10-04 14:27:04 -07:00
Cory LaNou acbf193640
add test to prevent future parsing regressions for time durations 2016-09-16 11:44:05 -05:00
Jason Wilder a6d3e46893 Fix panic when parsing ms durations 2016-09-16 08:47:18 -06:00
Cory LaNou 71f0c7e1e9 return appropriate error if overflowing duration when parsing 2016-09-14 09:27:38 -05:00
Jonathan A. Sternberg dbb8c5570c Duplicate parsing bug in ALTER RETENTION POLICY
Return an error when we encounter the same option twice in ALTER
RETENTION POLICY and remove the `maxNumOptions` number from the parsing
loop. The `maxNumOptions` number would need to be modified if another
option was added to the parsing loop and it didn't correctly prevent
duplicate options from being reported as an error anyway.
2016-09-13 15:56:13 -05:00
Jonathan A. Sternberg aae88fc3c3 Support ON and use default database for SHOW commands
Normalize all of the SHOW commands so they allow both using ON to
specify the database and using the default database. Some commands would
require one and some would require the other and it was confusing when
using the query language.

Affected commands:

* SHOW RETENTION POLICIES
* SHOW MEASUREMENTS
* SHOW SERIES
* SHOW TAG KEYS
* SHOW TAG VALUES
* SHOW FIELD KEYS
2016-09-13 15:36:59 -05:00
Jonathan A. Sternberg 04c59b8941 Fix the dollar sign so it properly handles reserved keywords
The dollar sign would sometimes be accepted as whitespace if it was
immediately followed by a reserved keyword or an invalid character. It
now reads these properly as a bound parameter rather than ignoring the
dollar sign.
2016-09-02 15:32:46 -05:00
Jonathan A. Sternberg 4ff0b10210 Merge pull request #7139 from influxdata/js-7137-show-tag-values-string-method
Properly output the SHOW TAG VALUES command so it can be reparsed
2016-09-01 10:19:19 -05:00
Jonathan A. Sternberg dc2527ce86 Merge branch '1.0' 2016-08-31 14:45:57 -05:00
Jonathan A. Sternberg 23f2d50ecb Use defaults from `meta` package for `CREATE DATABASE`
Instead of having the parser set the defaults, the command will set the
defaults so that the constants for that are actually used. This way we
can also identify which things the user provided and which ones we are
filling with default values.

This allows the meta client to be able to make smarter decisions when
determining if the user requested a conflict or if the requested
capabilities match with what is currently available. If you just say
`CREATE DATABASE WITH NAME myrp`, the user doesn't really care what the
duration of the retention policy is and just wants to use the default.
Now, we can use that information to determine if an existing retention
policy would conflict with what the user requested rather than returning
an error if a default value ever gets changed since the meta client
command can communicate intent more easily.
2016-08-30 13:23:49 -05:00
Jonathan A. Sternberg b8a70105aa Fix alter retention policy when all options are used
We added `SHARD DURATION` as an extra option, but forgot to increase the
maximum number of allowable options from 3 to 4. So if 4 options were
used, the last one was ignored. This was commonly `DEFAULT`, but it
could have been any of the options.
2016-08-26 11:25:18 -05:00
Jonathan A. Sternberg 8b234546a8 Merge pull request #7204 from influxdata/1.0
Merge 1.0 branch to master
2016-08-25 15:20:30 -05:00
Ashish Gaurav 4e17f9bb13 add mode() function & tests 2016-08-23 19:31:41 -05:00
Jonathan A. Sternberg f0f7d91d6c Properly output all commands so they can be reparsed
The commands fixed:
* SHOW TAG VALUES
* SHOW STATS
* SHOW DIAGNOSTICS
2016-08-15 15:04:51 -05:00
Jonathan A. Sternberg 87f7c66b8a Merge pull request #7119 from influxdata/js-create-database-use-defaults
Use defaults from `meta` package for `CREATE DATABASE`
2016-08-11 10:34:22 -05:00
Jonathan A. Sternberg ab049d7f0a Support mixed duration units
It is now possible to use a mixed duration unit like `1h30m`. The
duration units can be in whatever order as long as they are connected to
each other.

There is a change to the scanner. A token such as `10x` will be scanned
as a duration literal, but will then fail to parse as an invalid
duration. This should not be a breaking change as there is no situation
where `10m10` was a valid order of tokens for the parser.

Fixes #3634.
2016-08-10 13:34:19 -05:00
Jonathan A. Sternberg 530b00bd76 Use defaults from `meta` package for `CREATE DATABASE`
Instead of having the parser set the defaults, the command will set the
defaults so that the constants for that are actually used. This way we
can also identify which things the user provided and which ones we are
filling with default values.

This allows the meta client to be able to make smarter decisions when
determining if the user requested a conflict or if the requested
capabilities match with what is currently available. If you just say
`CREATE DATABASE WITH NAME myrp`, the user doesn't really care what the
duration of the retention policy is and just wants to use the default.
Now, we can use that information to determine if an existing retention
policy would conflict with what the user requested rather than returning
an error if a default value ever gets changed since the meta client
command can communicate intent more easily.
2016-08-09 12:00:06 -05:00
Jonathan A. Sternberg 2c739c0532 Fix parseFill to check for fill ident before attempting to parse an expression
The previous parseFill would try to parse an expression and only unscan
one token when it failed. This caused it to not put back the correct
number of tokens with some expression.

Now it has been modified to check for the fill ident ahead of time and
then use ParseExpr() to parse the call. If the expression fails to parse
into a call, it will send an error instead of trying to continue with an
invalid parser state.

Fixes #6543.
2016-08-01 11:38:44 -05:00
Cory LaNou 1117526873 remove IF EXISTS/IF NOT EXISTS from influxql language 2016-07-29 12:58:05 -05:00
Jonathan A. Sternberg 9837de793c Support regex and other operations for selecting the key in SHOW TAG VALUES
This adds support for using regex expressions in SHOW TAG VALUES when
selecting the key. Also supporting the `!=` operation for the
comparison. Now you can do any of the following:

    SHOW TAG VALUES WITH KEY != "region"
    SHOW TAG VALUES WITH KEY =~ /region/
    SHOW TAG VALUES WITH KEY !~ /region/

It also adds a new SetLiteral AST node that will potentially be used in
the future to allow set operations for other comparisons in the future.

Fixes #4532.
2016-06-13 10:03:14 -05:00
Jonathan A. Sternberg 2fa6d306c2 Add option to KILL QUERY to kill on a specific host
Option only applies to clustering.
2016-06-07 16:48:07 -05:00
Jonathan A. Sternberg b8e22d9d79 Merge pull request #6586 from influxdata/js-3733-rename-default-retention-policy
Modify the default retention policy name and make it configurable
2016-06-06 15:05:29 -05:00
Joe LeGasse f2fd988ab9 Delay parsing of date/time strings until needed
The current code would compare every string literal it crossed and tried
to coerce them to time literals if the _looked_ like date/time strings.

The only time the TimeLiteral was used is when comparing to the the
'time' value in a where clause. This change moves the string parsing
code until we attempt to compare 'time' to a string, at which point we
know we need/want a TimeLiteral, and not just an ordinary string.

Fixes #6727
2016-05-27 09:43:45 -04:00
Mark Rushakoff fed67ffdf0 Fix typo in parse error 2016-05-24 10:47:51 -07:00
Jonathan A. Sternberg baaa782c95 Modify the default retention policy name and make it configurable
The default retention policy name is changed to "autogen" instead of
"default" since it ends up being ambiguous when we tell a user to check
the default retention policy, it is uncertain if we are referring to the
default retention policy (which can be changed) or the retention policy
with the name "default".

Now the automatically generated retention policy name is "autogen".

The default retention policy is now also configurable through the
configuration file so an administrator can customize what they think
should be the default.

Fixes #3733.
2016-05-24 09:51:23 -04:00
Nathaniel Cook 6ed0d94343 Add Holt-Winters forecasting method. 2016-05-19 09:24:56 -06:00
Jonathan A. Sternberg 451a5205ef Support bound parameters in the parser
The parser can be passed a map of keys to literal values to be replaced
into the query. Parameters are preceded by a dollar sign (`$`). If a
parameter key is missing, an error is thrown by the parser.

Fixes #2926.
2016-05-18 20:10:15 -04:00
Jonathan A. Sternberg 23f6a706bb Support cast syntax for selecting a specific type
Casting syntax is done with the PostgreSQL syntax `field1::float` to
specify which type should be used when selecting a field. You can also
do `field1::field` or `tag1::tag` to specify that a field or tag should
be selected.

This makes it possible to select a tag when a field key and a tag key
conflict with each other in a measurement. It also means it's possible
to choose a field with a specific type if multiple shards disagree. If
no types are given, the same ordering for how a type is chosen is used
to determine which type to return.

The FieldDimensions method has been updated to return the data type for
the fields that get returned. The SeriesKeys function has also been
removed since it is no longer needed. SeriesKeys was originally used for
the fill iterator, but then expanded to be used by auxiliary iterators
for determining the channel iterator types. The fill iterator doesn't
need it anymore and the auxiliary types are better served by
FieldDimensions implementing that functionality, so SeriesKeys is no
longer needed.

Fixes #6519.
2016-05-16 12:08:29 -04:00
Jonathan A. Sternberg 64556e4f8e Support offset argument in the GROUP BY time(...) call
An offset of `time(1m, now())` will anchor the offset to the current
time of the query. The default offset is `0s` which is the current
default anyway.

This fixes #2074 by making time zone offset support unnecessary. Time
comparisons can use timezones inside of the time clause and the offset
needed for non-hour timezone differences can be used as part of the
offset argument.
2016-05-02 14:02:35 -04:00
Jonathan A. Sternberg ff3ee909de Fix validation to catch a string used in `count(distinct())`
Also removes the functions `HasSimpleCount()` and `HasCountDistinct()`
as they are no longer useful. They had a small role in validation that
has now been moved into `validateAggregates()`.

Fixes #6472.
2016-04-29 13:46:18 -04:00
Ben Johnson f7af787aef
add DELETE query support
This commit adds query language support for deleting series with a
`DELETE` query.
2016-04-27 15:16:23 -06:00