This refactors the validation code so it is more flexible and performs a
small bit of work to make preparing and executing the query easier.
The general idea is that compilation will eventually do more of the
heavy lifting in creating the initial plan and prepare will construct an
actual plan rather than just doing some basic field rewriting.
This change at least sets us up for that work in the future and moves
the validation code into query execution instead of the parser.
This also frees the parser to parse the complete AST without worrying
about whether the query itself is valid. That could be useful for client
code that wants to compile a partial query to an AST and then perform
modifications on the AST.
Previously, pseudo-iterators could be created for metadata such as
series, measurement, and tag data. These iterators were created at a
higher level and lacked much of the power of the query engine.
This commit moves system iterators down to the series level and
supports the following:
- _name
- _seriesKey
- _tagKey
- _tagValue
- _fieldKey
These can be used as normal fields such as:
SELECT _seriesKey FROM cpu
This will return all the series keys for `cpu`.
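The other system fields follow the same pattern; for example, a sketch
(following the syntax above) that lists the field keys of `cpu`:
SELECT _fieldKey FROM cpu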
The language is now defined in a way similar to many HTTP routers with
the left prefix being placed into a parse tree and then eventually
invoking a function to parse the arguments.
This allows dynamically adding additional components to the parse tree
for either query language extensions or enterprise.
Additionally, support unary addition and subtraction for variables,
calls, and parenthesized expressions. Doing `-value` is now the
equivalent of doing `-1 * value`.
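As a minimal illustration of the rule above, these two queries should
now be equivalent:
SELECT -value FROM cpu
SELECT -1 * value FROM cpu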
The timezone for a query can now be added to the end with something like
`TZ("America/Los_Angeles")` and it will localize the results of the
query to that timezone. The offset will automatically be set to the
offset for that timezone, and offsets will automatically adjust for
daylight saving time, so grouping by a day will result in a 25-hour day
once a year and a 23-hour day on another day of the year.
The automatic adjustment of intervals for a changing timezone offset
only happens if the GROUP BY interval is greater than the timezone
offset. That means grouping by an hour or less will not be affected by
daylight saving time, but a 2 hour or 1 day interval will be.
The default timezone is UTC and existing queries are unaffected by this
change.
When times are returned as strings (when `epoch=1` is not used), the
results will be returned in RFC3339 format using the requested
timezone's offset.
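A sketch of the syntax, using the quoting shown above (the time range
is illustrative):
SELECT mean(value) FROM cpu WHERE time >= '2016-11-06T00:00:00Z' AND time < '2016-11-08T00:00:00Z' GROUP BY time(1d) TZ("America/Los_Angeles")
A 1 day bucket spanning the fall daylight saving change in that
timezone would cover 25 hours, as described above.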
A single line comment reads until the end of the line and is started
with `--` (just like SQL). A multiline comment is delimited with
`/* */`. Multiline comments cannot be nested.
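For example:
-- this is a single line comment
SELECT mean(value) FROM cpu /* this is a multiline comment */ WHERE time > now() - 1h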
Previously, only time expressions got propagated inwards. The reason for
this was simple. If the outer query was going to filter to a specific
time range, then it would be unnecessary for the inner query to output
points within that time frame. It started as an optimization, but became
a feature because there was no reason to have the user repeat the same
time clause for the inner query as the outer query. So we allowed an
aggregate query with an interval to pass validation in the subquery if
the outer query had a time range. But `GROUP BY` clauses were not
propagated because that same logic didn't apply to them. It's not an
optimization there. So while grouping by a tag in the outer query
without grouping by it in the inner query was useless, there wasn't any
particular reason to care.
Then a bug was found where wildcards would propagate the dimensions
correctly, but an outer query containing a group by, with the inner
query omitting it, wouldn't correctly filter out the outer group by. We
could fix that filtering, but on further review, I had been seeing
people make that same mistake a lot. People seem to believe that the
grouping should be propagated inwards. Instead of fighting what the
user wanted and explicitly erasing groupings that weren't propagated
manually, we might as well propagate them for the user to make their
lives easier. There is no useful situation where you would want to
group into buckets that can't physically exist, so we might as well do
_something_ useful.
This will also now propagate time intervals to inner queries since the
same applies there. But, while the interval propagates, the following
query will not pass validation since it is still not possible to use a
grouping interval with a raw query (even if the inner query is an
aggregate):
SELECT * FROM (SELECT mean(value) FROM cpu) WHERE time > now() - 5m GROUP BY time(1m)
This also means wildcards will behave a bit differently. They will
retrieve dimensions from the sources in the inner query rather than just
using the dimensions in the group by.
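As a sketch of the propagation (assuming a `host` tag on `cpu`), the
first query should now behave like the second, with the grouping
pushed into the subquery:
SELECT max(min) FROM (SELECT min(value) FROM cpu) GROUP BY host
SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host) GROUP BY host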
Fix top() and bottom() to return the correct auxiliary fields.
Unfortunately, we were not copying the buffer with the auxiliary
fields, so those values would be overwritten by a later point.
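For illustration, a query selecting an auxiliary field alongside the
aggregate (assuming a `host` tag) should now return the `host` of the
selected points rather than that of a later point:
SELECT top(value, 2), host FROM cpu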
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.
Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).
The syntax is like this:
SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)
This will execute derivative and then sum the result of those derivatives.
Another example:
SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)
This would let you find the maximum minimum value of each host.
There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:
SELECT mean(value) FROM cpu
SELECT mean(value) FROM (SELECT value FROM cpu)
have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially when it comes to
clustered setups. The second will process the mean at the top level and will not
include that optimization.
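As a sketch of mixing a subquery with auxiliary fields (assuming a
`host` tag on `cpu`):
SELECT value, host FROM (SELECT mean(value) AS value FROM cpu GROUP BY host)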
The delete and drop statements apply to a measurement within a db.
The parser allowed a db or rp to be specified, and these values were
silently ignored. This could cause data loss, as someone would think
they were only deleting the series within an rp when they were actually
deleting all of their data.
Instead, we now return a parse error if a db or rp is specified in the
delete or drop statements. Ideally, we'd be able to respect the db and
rp, but that requires significant work in the query engine and tsdb
store.
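For example, a fully qualified form like the following (hypothetical
db and rp names) is now rejected with a parse error instead of
silently deleting everything:
DELETE FROM mydb.myrp.cpu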
Fixes #7053
This commit adds support for replacing regexes with non-regex conditions
when possible. Currently the following regexes are supported:
- host =~ /^foo$/ will be converted into host = 'foo'
- host !~ /^foo$/ will be converted into host != 'foo'
Note: if the regex expression contains character classes, grouping,
repetition, or similar constructs, it may not be rewritten.
For example, the condition: name =~ /^foo|bar$/ will not be rewritten.
Support for this may arrive in the future.
Regexes that can be converted into simpler expressions will be able to
take advantage of the tsdb index, making them significantly faster.
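As an end-to-end sketch (the tag value is illustrative), a filter like
this should now be evaluated as the equality form from the list above
and hit the index:
SELECT * FROM cpu WHERE host =~ /^server01$/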
The `cumulative_sum()` function can be used to sum each new point and
output the current total. For the following points:
cpu value=2 0
cpu value=4 10
cpu value=6 20
This would output the following points:
> SELECT cumulative_sum(value) FROM cpu
time value
---- -----
0 2
10 6
20 12
As can be seen, each new point adds to the sum of the previous point and
outputs the value with the same timestamp.
The function can also wrap an aggregate, similar to `derivative()`.
> SELECT cumulative_sum(mean(value)) FROM cpu WHERE time >= now() - 10m GROUP BY time(1m)
First Pass at implementing sample
Add sample iterators for all types
Remove size from sample struct
Fix off by one error when generating random number
Add benchmarks for sample iterator
Add test and associated fixes for off by one error
Add test for sample function
Remove NumericLiteral from sample function call
Make clear that the counter is incremented with each call
Rename IsRandom to AllSamplesSeen
Add a rng for each reducer that is created
The default rng that comes with math/rand has a global lock. To avoid
having to worry about any contention on the lock, each reducer now has
its own time-seeded rng.
Add sample function to changelog
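A sketch of the resulting syntax, assuming the second argument is the
number of points to sample:
SELECT sample(value, 3) FROM cpu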
Clean up template for fill average
Change fill(average) to fill(linear)
Update average to linear in influxql spec
Add Integer Tests and associated fixes
Update CHANGELOG for fill(linear)
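A sketch of the renamed fill option applied to a grouped query:
SELECT mean(value) FROM cpu WHERE time >= now() - 1h GROUP BY time(10m) fill(linear)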
Return an error when we encounter the same option twice in ALTER
RETENTION POLICY and remove `maxNumOptions` from the parsing loop. The
`maxNumOptions` number would need to be modified whenever another
option was added to the parsing loop, and it didn't correctly report
duplicate options as an error anyway.
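For example, a statement that repeats an option (hypothetical names)
should now return an error for the duplicated DURATION:
ALTER RETENTION POLICY myrp ON mydb DURATION 1h DURATION 2h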
Normalize all of the SHOW commands so they allow both using ON to
specify the database and using the default database. Some commands
required one form and some required the other, which was confusing when
using the query language (examples of both forms follow the list
below).
Affected commands:
* SHOW RETENTION POLICIES
* SHOW MEASUREMENTS
* SHOW SERIES
* SHOW TAG KEYS
* SHOW TAG VALUES
* SHOW FIELD KEYS
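Both forms should now work for each of these; for example (assuming a
database named `mydb`):
SHOW MEASUREMENTS ON mydb
SHOW MEASUREMENTS
where the second form uses the session's default database.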
The dollar sign would sometimes be treated as whitespace if it was
immediately followed by a reserved keyword or an invalid character. It
is now read properly as the start of a bound parameter rather than
being ignored.
Instead of having the parser set the defaults, the command will set the
defaults so that the constants for them are actually used. This way we
can also identify which values the user provided and which ones we are
filling in with default values.
This allows the meta client to make smarter decisions when determining
whether the user's request conflicts with, or matches, what is
currently available. If you just say `CREATE DATABASE WITH NAME myrp`,
the user doesn't really care what the duration of the retention policy
is and just wants to use the default. Now we can use that information
to determine whether an existing retention policy would conflict with
what the user requested, rather than returning an error if a default
value ever gets changed, since the meta client command can communicate
intent more easily.
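As a sketch of the difference in intent (hypothetical names), the
first statement expresses no opinion about the retention policy's
duration while the second does:
CREATE DATABASE mydb WITH NAME myrp
CREATE DATABASE mydb WITH DURATION 7d NAME myrp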
We added `SHARD DURATION` as an extra option, but forgot to increase the
maximum number of allowable options from 3 to 4. So if 4 options were
used, the last one was ignored. This was commonly `DEFAULT`, but it
could have been any of the options.
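For example, a statement using all four options (hypothetical names)
would previously have its last option silently ignored:
CREATE RETENTION POLICY myrp ON mydb DURATION 1d REPLICATION 1 SHARD DURATION 30m DEFAULT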
It is now possible to use a mixed duration unit like `1h30m`. The
duration units can appear in any order as long as they are connected
to each other.
There is a change to the scanner. A token such as `10x` will be scanned
as a duration literal, but will then fail to parse as an invalid
duration. This should not be a breaking change as there is no situation
where `10m10` was a valid order of tokens for the parser.
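For example, the following mixed unit should now parse:
SELECT mean(value) FROM cpu WHERE time > now() - 1h30m GROUP BY time(15m)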
Fixes #3634.
The previous parseFill would try to parse an expression and only
unscan one token when it failed. This caused it to not put back the
correct number of tokens with some expressions.
It has now been modified to check for the fill ident ahead of time and
then use ParseExpr() to parse the call. If the expression fails to
parse into a call, it returns an error instead of trying to continue
with an invalid parser state.
Fixes #6543.
This adds support for using regex expressions in SHOW TAG VALUES when
selecting the key. It also supports the `!=` operation for the
comparison. Now you can do any of the following:
SHOW TAG VALUES WITH KEY != "region"
SHOW TAG VALUES WITH KEY =~ /region/
SHOW TAG VALUES WITH KEY !~ /region/
It also adds a new SetLiteral AST node that could be used in the
future to allow set operations for other comparisons.
Fixes #4532.
The previous code would compare every string literal it came across
and try to coerce it to a time literal if it _looked_ like a date/time
string. The only time the TimeLiteral was used is when comparing to
the 'time' value in a where clause. This change defers the string
parsing until we attempt to compare 'time' to a string, at which point
we know we need/want a TimeLiteral and not just an ordinary string.
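For example, in a query like the following, the string is only
converted to a time literal because it is compared against `time`:
SELECT * FROM cpu WHERE time >= '2000-01-01T00:00:00Z'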
Fixes #6727
The default retention policy name is changed to "autogen" instead of
"default", since "default" ended up being ambiguous: when we told a
user to check the default retention policy, it was uncertain whether we
were referring to the default retention policy (which can be changed)
or the retention policy with the name "default".
Now the automatically generated retention policy name is "autogen".
The default retention policy is also configurable through the
configuration file, so an administrator can customize what they think
the default should be.
Fixes #3733.
The parser can be passed a map of keys to literal values to be
replaced into the query. Parameters are preceded by a dollar sign
(`$`). If a parameter key is missing, the parser returns an error.
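A sketch of a parameterized query; the `$value` key (hypothetical) is
looked up in the provided map:
SELECT * FROM cpu WHERE value > $value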
Fixes #2926.