influxdb

Commit Graph

Author	SHA1	Message	Date
Jonathan A. Sternberg	78a32cba0e	Queries for `bottom()` with no tags got messed up while changing the implementation It didn't properly pass the variable reference when creating the variable iterator so a null iterator got passed back instead. Duplicate the `top()` tests in TopInt to also test `bottom()` with the same queries so `bottom()` stops getting neglected so often.	2017-05-30 11:28:41 -05:00
Stuart Carnie	47f97ea134	use parsed measurement and models.Tags	2017-05-26 13:21:59 -07:00
Joe LeGasse	815f740f4c	initial fga work wip wip fix tests / build	2017-05-26 13:16:27 -07:00
Jonathan A. Sternberg	9edf236cc8	Maintain the tags of points selected by top() or bottom() when writing the results When a `SELECT ... INTO ...` is used with `top()` or `bottom()` used with tags, the points will be written with the tags still intact instead of converted to fields.	2017-05-23 15:00:21 -05:00
Jonathan A. Sternberg	7b9b55bfc0	Optimize top() and bottom() using an incremental aggregator The previous version of `top()` and `bottom()` would gather all of the points to use in a slice, filter them (if necessary), then use a slightly modified heap sort to retrieve the top or bottom values. This performed horrendously from the standpoint of memory. Since it consumed so much memory and spent so much time in allocations (along with sorting a potentially very large slice), this affected speed too. These calls have now been modified so they keep the top or bottom points in a min or max heap. For `top()`, a new point will read the minimum value from the heap. If the new point is greater than the minimum point, it will replace the minimum point and fix the heap with the new value. If the new point is smaller, it discards that point. For `bottom()`, the process is the opposite. It will then sort the final result to ensure the correct ordering of the selected points. When `top()` or `bottom()` contain a tag to select, they have now been modified so this query: SELECT top(value, host, 2) FROM cpu Essentially becomes this query: SELECT top(value, 2), host FROM ( SELECT max(value) FROM cpu GROUP BY host ) This should drastically increase the performance of all `top()` and `bottom()` queries.	2017-05-19 11:56:46 -05:00
Jonathan A. Sternberg	7d043dbc61	Add nanosecond duration literal support	2017-05-19 10:44:11 -05:00
rw-influxdata	67279ccc64	Fix AST rewriting panic due to a nil Condition.	2017-05-04 14:51:53 -07:00
Jonathan A. Sternberg	df30a4d9c9	Refactor the subquery code and fix outer condition queries This change refactors the subquery code into a separate builder class to help allow for more reuse and make the functions smaller and easier to read. The previous function that handled most of the code was too big and impossible to reason through. This also goes and replaces the complicated logic of aggregates that had a subquery source with the simpler IteratorMapper. I think the overhead from the IteratorMapper will be more, but I also believe that the actual code is simpler and more robust to produce more accurate answers. It might be a future project to optimize that section of code, but I don't have any actual numbers for the efficiency of one method and I believe accuracy and code clarity may be more important at the moment since I am otherwise incapable of reading my own code.	2017-04-28 17:12:32 -05:00
Jonathan A. Sternberg	addc12561f	Fix LIMIT and OFFSET for certain aggregate queries When LIMIT and OFFSET were used with any functions that were not handled directly by the query engine (anything other than count, max, min, mean, first, or last), the input to the function would be limited instead of receiving the full stream of values it was supposed to receive. This also fixes a bug that caused the server to hang when LIMIT and OFFSET were used with a selector. When using a selector, the limit and offset should be handled before the points go to the auxiliary iterator to be split into different iterators. Limiting happened afterwards which caused the auxiliary iterator to hang forever.	2017-04-28 15:55:06 -05:00
Ben Johnson	3a46e5dd9e	Remove default upper time bound for DELETE queries.	2017-04-28 12:26:26 -06:00
Jonathan A. Sternberg	be3bce5212	top() and bottom() now return the time for every point `top()` and `bottom()` will now organize the points by time and also keep each point's original time even when a time grouping is used. At the same time, `top()` and `bottom()` will no longer honor any fill options that are present since they don't really make sense for these specific functions. This also fixes the aggregates and selectors to honor the ordered iterator option so iterators remain ordered and to also respect the buckets that are created by the final dimensions of the query so that two buckets don't overlap each other within the same reducer. A test has been added for this situation. This should clarify and encourage the use of the ordered attribute within the query engine.	2017-04-26 15:07:10 -05:00
Jonathan A. Sternberg	57a2abbc87	Restrict top() and bottom() selectors to be used with no other functions	2017-04-14 10:23:07 -05:00
Cory LaNou	9060a2a5ff	Add HasDefaultDatabase interface to several statements	2017-04-12 13:41:28 -05:00
Jonathan A. Sternberg	d653b76b5b	Merge pull request #8282 from influxdata/js-8281-influxql-select-tests Fix influxql select tests	2017-04-11 15:05:08 -05:00
Jonathan A. Sternberg	c6e1b83906	Fix influxql select tests The inputs are now sent to the tested iterator in the correct order so we can more accurately test each individual select statement.	2017-04-10 20:51:14 -05:00
Jonathan A. Sternberg	a550d323c4	Restrict fill(none) and fill(linear) to be usable only with aggregate queries	2017-04-10 15:58:05 -05:00
Jonathan A. Sternberg	0a5e4bd92b	Implicitly cast null to false in binary expressions with a boolean Also more consistently treat a binary expression with strings so it produces the same value no matter the direction of the expression.	2017-04-06 12:26:04 -05:00
Jason Wilder	8da84e6144	Merge branch 'master' into tsi	2017-04-03 11:21:02 -06:00
Tom Young	d2fd3f50aa	Add bitwise AND, OR and XOR operators to InfluxQL.	2017-03-31 21:02:02 +01:00
Jonathan A. Sternberg	211e7ea65d	Merge pull request #8234 from influxdata/js-8230-fix-window-computation-overflow Prevent overflowing or underflowing during window computation	2017-03-31 11:09:30 -05:00
zhexuany	232fdae6dd	introduce a new function non_negative_difference	2017-03-31 23:08:36 +08:00
Jonathan A. Sternberg	64fb1db5f5	Prevent overflowing or underflowing during window computation The Window function will now check before it adjusts the offset whether it is going to overflow or underflow. If it is going to do either, it sets the start or end time to MinTime or MaxTime.	2017-03-30 16:35:22 -05:00
Jonathan A. Sternberg	2ea805c928	Interpolate between different intervals to find the whole area under the curve	2017-03-30 12:51:52 -05:00
Tom Young	cac94a1fc7	Add "integral" function to InfluxQL	2017-03-30 12:07:26 -05:00
Edd Robinson	fddaff2cc8	Merge master in	2017-03-29 18:00:28 +01:00
Jonathan A. Sternberg	7e0ed1f5e5	Ensure the input for certain functions in the query engine are ordered The following functions require ordered input but were not guaranteed to receive ordered input: * `distinct()` * `sample()` * `holt_winters()` * `holt_winters_with_fit()` * `derivative()` * `non_negative_derivative()` * `difference()` * `moving_average()` * `elapsed()` * `cumulative_sum()` * `top()` * `bottom()` These function calls have now been modified to request that their input be ordered by the query engine. This will prevent the improper output that could have been caused by multiple series being merged together or multiple shards being merged together potentially incorrectly when no time grouping was specified. Two additional functions were already correct to begin with (so there are no bugs with these two, but I'm including their names for completeness). * `median()` * `percentile()`	2017-03-28 13:55:37 -05:00
Jonathan A. Sternberg	24109468c3	Merge pull request #8168 from influxdata/js-8167-math-with-multiple-selectors Fix a regression when math was used with selectors	2017-03-28 13:31:57 -05:00
Jonathan A. Sternberg	3e52ec7ca2	Merge pull request #7762 from influxdata/js-6541-timezone-support Support timezone offsets for queries	2017-03-28 10:39:07 -05:00
Jonathan A. Sternberg	b14c292cba	Fix a regression when math was used with selectors If there were multiple selectors and math, the query engine would mistakenly think it was the only selector in the query and would not match their timestamps. Fixed the query engine to pass whether the selector should be treated as a selector so queries like `max(value) * 1, min(value) * 1` will match the timestamps of the result.	2017-03-27 14:12:15 -05:00
Jonathan A. Sternberg	ccf0cb8371	Fix query parser when using addition and subtraction without spaces Additionally, support unary addition and subtraction for variables, calls, and parenthesis expressions. Doing `-value` will be the equivalent of doing `-1 * value` now.	2017-03-24 12:52:19 -05:00
Jonathan A. Sternberg	347b01814e	Support timezone offsets for queries The timezone for a query can now be added to the end with something like `TZ("America/Los_Angeles")` and it will localize the results of the query to be in that timezone. The offset will automatically be set to the offset for that timezone and offsets will automatically adjust for daylight savings time so grouping by a day will result in a 25 hour day once a year and a 23 hour day another day of the year. The automatic adjustment of intervals for timezone offsets changing will only happen if the group by period is greater than the timezone offset would be. That means grouping by an hour or less will not be affected by daylight savings time, but a 2 hour or 1 day interval will be. The default timezone is UTC and existing queries are unaffected by this change. When times are returned as strings (when `epoch=1` is not used), the results will be returned using the requested timezone format in RFC3339 format.	2017-03-22 15:09:41 -05:00
Jonathan A. Sternberg	33981277bc	Fix the time range when an exact timestamp is selected There is a lot of confusion in the code if the range is [start, end) or [start, end]. This is not made easier because it acts one way in some areas and another way in other areas, but it is usually [start, end]. The `time = ?` syntax assumed that it was [start, end) and added an extra nanosecond to the end time to accommodate that, but the range was actually [start, end] and that caused it to include one extra nanosecond when it shouldn't have. This change fixes it so exactly one timestamp is selected when `time = ?` is used.	2017-03-21 14:55:31 -05:00
Jonathan A. Sternberg	a6c09e58a0	Return an error when an invalid duration literal is parsed	2017-03-21 12:10:41 -05:00
Jason Wilder	8f7b251afd	Merge branch 'master' into jw-tsi	2017-03-20 17:17:26 -06:00
Jason Wilder	86ad0a45b6	Ensure iterators are closed when query is killed The underlying iterators were not closed when a query was killed, so although the client would receive an error, the query would continue on until completion.	2017-03-17 16:00:39 -06:00
Jonathan A. Sternberg	41c8370bbc	Fix fill(linear) when multiple series exist and there are null values When there were multiple series and anything other than the last series had any null values, the series would start using the first point from the next series to interpolate points. Interpolation should not cross between series. Now, the linear fill checks to make sure the next point is within the same series before using it to perform interpolation.	2017-03-16 15:54:20 -05:00
Jonathan A. Sternberg	5072db40c2	Forbid wildcards in binary expressions When rewriting fields, wildcards within binary expressions were skipped. This now throws an error whenever it finds a wildcard within a binary expression in order to prevent the panic that occurs.	2017-03-16 14:26:10 -05:00
Jonathan A. Sternberg	208d8507f1	Implement both single and multiline comments in influxql A single line comment will read until the end of a line and is started with `--` (just like SQL). A multiline comment is with `/* */`. You cannot nest multiline comments.	2017-03-15 14:24:09 -05:00
Ben Johnson	358b1e0b05	Merge remote-tracking branch 'upstream/master' into tsi	2017-03-15 10:13:32 -06:00
Jason Wilder	b9e5375043	Merge branch '1.2' into jw-merge-12	2017-03-08 13:16:50 -07:00
Jonathan A. Sternberg	83cf8893e1	Include IsRawQuery in the rewritten statement for meta queries	2017-03-06 14:46:33 -06:00
Jason Wilder	675d7c9d65	Merge branch '1.2' into jw-merge12	2017-03-06 11:09:05 -07:00
Jonathan A. Sternberg	c5970b59b4	Map types correctly when selecting a field with multiple measurements where one of the measurements is empty	2017-03-01 11:47:26 -06:00
Jonathan A. Sternberg	1fb34e3eef	Dividing aggregate functions with different outputs doesn't panic	2017-02-23 18:38:29 -06:00
Jonathan A. Sternberg	72e4dd01b9	Properly select a tag within a subquery Previously, subqueries would only be able to select tag fields within a subquery if the subquery used a selector. But, it didn't filter out aggregates like `mean()` so it would panic instead. Now it is possible to select the tag directly instead of rewriting the query in an invalid way. Some queries in this form will still not work though. For example, the following still does not function (but also doesn't panic): SELECT count(host) FROM (SELECT mean, host FROM (SELECT mean(value) FROM cpu GROUP BY host))	2017-02-23 11:16:22 -06:00
Jonathan A. Sternberg	5a2b458180	Reduce the expression in a subquery to avoid a panic The builder used for subqueries does not handle parentheses, so a set of parentheses wrapping a field would cause it to panic. This code now reduces the expression so the parentheses are removed before being processed.	2017-02-23 10:14:05 -06:00
Mark Rushakoff	601cbcd084	Merge branch '1.2' into mr-merge-12	2017-02-17 16:14:22 -08:00
Jonathan A. Sternberg	2fe48d6781	Rename zap import back to github.com/uber-go/zap They rebased a revision we were previously relying upon that allowed us to use the vanity name so we are reverting back to an older version with the old import path.	2017-02-17 17:17:22 -06:00
Jonathan A. Sternberg	71f62d33e6	Map types correctly when using a regex and one of the measurements is empty	2017-02-13 18:14:29 -06:00
Mark Rushakoff	c762ab49ee	Merge pull request #7974 from influxdata/mr-4785-show-databases Allow non-admin users to execute SHOW DATABASES	2017-02-13 15:04:00 -08:00
Jason Wilder	f45a58937c	Merge pull request #7998 from influxdata/jw-merge-12 Merge 1.2.1-rc3 to master	2017-02-13 14:24:17 -07:00
Jason Wilder	c3de210ded	Merge branch '1.2' into jw-merge-12	2017-02-13 11:45:27 -07:00
Mark Rushakoff	53699aa24f	Allow non-admin users to execute SHOW DATABASES This commit introduces a new interface type, influxql.Authorizer, that is passed as part of a statement's execution context and determines whether the context is permitted to access a given database. In the future, the Authorizer interface may be expanded to other resources besides databases. In this commit, the Authorizer interface is specifically used to determine which databases are returned when executing SHOW DATABASES. When HTTP authentication is enabled, the existing meta.UserInfo struct implements Authorizer, meaning admin users can SHOW every database, and non-admin users can SHOW only databases for which they have read and/or write permission. When HTTP authentication is disabled, all databases are visible through SHOW DATABASES. This addresses a long-standing issue where Chronograf or Grafana would be unable to list databases if the logged-in user did not have admin privileges. Fixes #4785.	2017-02-13 08:59:16 -08:00
Jonathan A. Sternberg	55e64e1edd	Fixed String() output for MOD operator and added MOD op precedence	2017-02-10 16:48:05 -06:00
Jason Wilder	8d0f2c3ca9	Merge pull request #7983 from influxdata/jw-parallel-buffers Increase parallel iterator buffers to improve group by query speed	2017-02-10 11:22:59 -07:00
Jason Wilder	5e42ac411a	Increase buffer to improve group by query speed	2017-02-10 11:07:49 -07:00
Jonathan A. Sternberg	a0d8c1ca9f	Add modulo operator to the query language	2017-02-10 10:16:37 -06:00
Jonathan A. Sternberg	2ad1668c2a	Prevent a panic when aggregates are used in an inner query with a raw query The following types of queries will panic: SELECT mean, host FROM (SELECT mean(value) FROM cpu GROUP BY host) SELECT top(sum, host, 3) FROM (SELECT sum(value) FROM cpu GROUP BY host) These queries _should_ work, but due to a current limitation with aggregate functions, the aggregate functions won't return any auxiliary fields. So even if a tag is not an auxiliary field, it is treated that way by the query engine and this query will fail. Fixing this properly will take a longer period of time. This fix just prevents the panic from killing the server while we fix this for real.	2017-02-08 11:44:56 -06:00
Jason Wilder	1bc0f68490	Merge branch '1.2' into jw-merge-12	2017-02-07 12:48:36 -07:00
Jonathan A. Sternberg	e1fa48d0dd	Fix ORDER BY time DESC with ordering series keys The order of series keys is in ascending alphabetical order, not descending alphabetical order, when it is ordered by descending time. This fixes the ordering so points are returned in descending order. The emitter also had the conditions for choosing which iterator to use in the wrong direction (which only affects aggregates with `FILL(none)`).	2017-02-06 15:49:12 -06:00
Jonathan A. Sternberg	caaad60dcf	Fix authentication when subqueries are present The code that checked if a query was authorized did not account for sources that were subqueries. Now, the check for the required privileges will descend into the subquery and add the subquery's required privileges to the list of required privileges for the entire query.	2017-02-06 09:43:14 -06:00
Jason Wilder	2e95b4043c	Merge branch '1.2' into jw-merge-12	2017-02-02 16:40:36 -07:00
Jonathan A. Sternberg	e49ba016fa	Fix incorrect math when aggregates that emit different times are used When using `non_negative_derivative()` and `last()` in a math aggregate with each other, the math would not be matched with each other because one of those aggregates would emit one fewer point than the others. The math iterators have been modified so they now track the name and tags of a point and match based on those. This isn't necessarily ideal and may come to bite us in the future. We don't necessarily have a defined structure for all iterators so it can be difficult to know which of two points is supposed to come first in the ordering. This uses the common ordering that usually makes sense, but the query engine is getting complicated enough where I am not 100% certain that this is correct in all circumstances.	2017-02-02 14:40:41 -06:00
Joe LeGasse	dd9278a098	regex: don't use exact match for case insensitive expression Fixes #7906 In an attempt to reduce the overhead of using regex for exact matches, the query parser will replace `=~ /^thing$/` with `== 'thing'`, but the conditions being checked would ignore if any flags were set on the expression, so `=~ /(?i)^THING$/` was replaced with `== 'THING'`, which will fail unless the case was already exact. This change ensures that no flags have been changed from those defaulted by the parser.	2017-02-02 10:49:12 -05:00
Joe LeGasse	93d18d42a6	regex: don't use exact match for case insensitive expression Fixes #7906 In an attempt to reduce the overhead of using regex for exact matches, the query parser will replace `=~ /^thing$/` with `== 'thing'`, but the conditions being checked would ignore if any flags were set on the expression, so `=~ /(?i)^THING$/` was replaced with `== 'THING'`, which will fail unless the case was already exact. This change ensures that no flags have been changed from those defaulted by the parser.	2017-02-02 10:25:08 -05:00
Cory LaNou	8e14f0173b	Merge pull request #7924 from influxdata/cjl-7880-elapsed-panic fix panic in query execution	2017-02-01 12:03:39 -06:00
Cory LaNou	e3e319a176	fix panic in query execution	2017-02-01 10:51:26 -06:00
Jonathan A. Sternberg	e060fd0aa3	Fix EvalType when a parenthesis expression is used It did not descend into the expression within the parenthesis correctly and would just recurse infinitely on itself instead.	2017-01-31 10:35:21 -06:00
Jonathan A. Sternberg	e8719c90ab	Fix EvalType when a parenthesis expression is used It did not descend into the expression within the parenthesis correctly and would just recurse infinitely on itself instead.	2017-01-31 10:19:43 -06:00
Paul Dix	a801c9dea6	Merge pull request #7889 from influxdata/js-subquery-fixes Cherry-pick 1.2 fixes for subqueries into master	2017-01-26 10:49:37 -05:00
Edd Robinson	91ee34b111	Merge pull request #7837 from influxdata/er-tidy General tidy up and subtle bug fixes	2017-01-26 13:43:07 +00:00
Jonathan A. Sternberg	ce54856e3d	Expand query dimensions from the subquery During development, I, at some point, decided that the dimensions should be expanded based on what was available rather than what was present in the subquery. I don't really know the rationale for this because I forgot, but it doesn't make sense or seem to be particularly useful. Expanding dimensions now just uses the values specified in the subquery rather than expanding to all available dimensions of the measurement in the subquery.	2017-01-25 16:33:03 -06:00
Jonathan A. Sternberg	92c5d336b4	Expand query dimensions from the subquery During development, I, at some point, decided that the dimensions should be expanded based on what was available rather than what was present in the subquery. I don't really know the rationale for this because I forgot, but it doesn't make sense or seem to be particularly useful. Expanding dimensions now just uses the values specified in the subquery rather than expanding to all available dimensions of the measurement in the subquery.	2017-01-25 16:02:37 -06:00
Ben Johnson	047c21f4d9	Merge remote-tracking branch 'upstream/master' into tsi	2017-01-24 09:28:58 -07:00
Jonathan A. Sternberg	83c6d53294	Support the WHERE clause in outer queries with subqueries	2017-01-23 15:01:32 -06:00
Jonathan A. Sternberg	3d4d9062a0	Update subqueries so groupings are propagated to inner queries Previously, only time expressions got propagated inwards. The reason for this was simple. If the outer query was going to filter to a specific time range, then it would be unnecessary for the inner query to output points within that time frame. It started as an optimization, but became a feature because there was no reason to have the user repeat the same time clause for the inner query as the outer query. So we allowed an aggregate query with an interval to pass validation in the subquery if the outer query had a time range. But `GROUP BY` clauses were not propagated because that same logic didn't apply to them. It's not an optimization there. So while grouping by a tag in the outer query without grouping by it in the inner query was useless, there wasn't any particular reason to care. Then a bug was found where wildcards would propagate the dimensions correctly, but the outer query containing a group by with the inner query omitting it wouldn't correctly filter out the outer group by. We could fix that filtering, but on further review, I had been seeing people make that same mistake a lot. People seem to just believe that the grouping should be propagated inwards. Instead of trying to fight what the user wanted and explicitly erase groupings that weren't propagated manually, we might as well just propagate them for the user to make their lives easier. There is no useful situation where you would want to group into buckets that can't physically exist so we might as well do _something_ useful. This will also now propagate time intervals to inner queries since the same applies there. But, while the interval propagates, the following query will not pass validation since it is still not possible to use a grouping interval with a raw query (even if the inner query is an aggregate): SELECT * FROM (SELECT mean(value) FROM cpu) WHERE time > now() - 5m GROUP BY time(1m) This also means wildcards will behave a bit differently. They will retrieve dimensions from the sources in the inner query rather than just using the dimensions in the group by. Fixing top() and bottom() to return the correct auxiliary fields. Unfortunately, we were not copying the buffer with the auxiliary fields so those values would be overwritten by a later point.	2017-01-23 15:01:19 -06:00
Jonathan A. Sternberg	6cd5b690d1	Support the WHERE clause in outer queries with subqueries	2017-01-23 14:49:04 -06:00
Jonathan A. Sternberg	f628b4a198	Update subqueries so groupings are propagated to inner queries Previously, only time expressions got propagated inwards. The reason for this was simple. If the outer query was going to filter to a specific time range, then it would be unnecessary for the inner query to output points within that time frame. It started as an optimization, but became a feature because there was no reason to have the user repeat the same time clause for the inner query as the outer query. So we allowed an aggregate query with an interval to pass validation in the subquery if the outer query had a time range. But `GROUP BY` clauses were not propagated because that same logic didn't apply to them. It's not an optimization there. So while grouping by a tag in the outer query without grouping by it in the inner query was useless, there wasn't any particular reason to care. Then a bug was found where wildcards would propagate the dimensions correctly, but the outer query containing a group by with the inner query omitting it wouldn't correctly filter out the outer group by. We could fix that filtering, but on further review, I had been seeing people make that same mistake a lot. People seem to just believe that the grouping should be propagated inwards. Instead of trying to fight what the user wanted and explicitly erase groupings that weren't propagated manually, we might as well just propagate them for the user to make their lives easier. There is no useful situation where you would want to group into buckets that can't physically exist so we might as well do _something_ useful. This will also now propagate time intervals to inner queries since the same applies there. But, while the interval propagates, the following query will not pass validation since it is still not possible to use a grouping interval with a raw query (even if the inner query is an aggregate): SELECT * FROM (SELECT mean(value) FROM cpu) WHERE time > now() - 5m GROUP BY time(1m) This also means wildcards will behave a bit differently. They will retrieve dimensions from the sources in the inner query rather than just using the dimensions in the group by. Fixing top() and bottom() to return the correct auxiliary fields. Unfortunately, we were not copying the buffer with the auxiliary fields so those values would be overwritten by a later point.	2017-01-23 12:38:10 -06:00
Edd Robinson	7374e48999	Remove dead code from influxql	2017-01-17 09:47:34 -08:00
Jonathan A. Sternberg	3ba950b029	Fix for subqueries to use the parallel iterator correctly Also, fix the `Iterators.Merge(IteratorOptions)` function so it consults the `Ordered` attribute to determine which iterator it should use to merge the input iterators.	2017-01-11 10:47:18 -06:00
Jonathan A. Sternberg	d7c8c7ca4f	Support subquery execution in the query language This adds query syntax support for subqueries and adds support to the query engine to execute queries on subqueries. Subqueries act as a source for another query. It is the equivalent of writing the results of a query to a temporary database, executing a query on that temporary database, and then deleting the database (except this is all performed in-memory). The syntax is like this: SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *) This will execute derivative and then sum the result of those derivatives. Another example: SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host) This would let you find the maximum minimum value of each host. There is complete freedom to mix subqueries with auxiliary fields. The only caveat is that the following two queries: SELECT mean(value) FROM cpu SELECT mean(value) FROM (SELECT value FROM cpu) Have different performance characteristics. The first will calculate `mean(value)` at the shard level and will be faster, especially when it comes to clustered setups. The second will process the mean at the top level and will not include that optimization.	2017-01-07 13:00:48 -06:00
Ben Johnson	409b0165f5	shared in-memory index	2017-01-05 10:09:57 -07:00
Mark Rushakoff	6a94d200c8	Merge remote-tracking branch 'influx/master' into mr-godoc	2017-01-04 13:27:36 -08:00
Mark Rushakoff	61b5a15227	Prefer built-in string -> []rune conversion	2017-01-03 14:45:33 -08:00
Mark Rushakoff	88b8bd2465	Update godoc for package influxql I did not look at any of the .gen.go files.	2016-12-30 18:02:52 -08:00
Gustav Westling	26b33307ae	Resolved PR comments on test files	2016-12-30 11:42:38 +01:00
Gustav Westling	56d98325da	Removed ineffective assignments, and added checks for errors that previously were not checked	2016-12-29 20:26:15 +01:00
Cory LaNou	572da8985c	enforce minimum shard duration when creating retention policies	2016-12-20 09:11:43 -06:00
Mark Rushakoff	a29781286b	Use local RNG in SampleReducer The reducers already had a local RNG but mistakenly did not use it when sampling points. Because the local RNG is not protected by a mutex, there is a slight speedup as a result of this change: benchmark old ns/op new ns/op delta BenchmarkSampleIterator_1k-4 418 418 +0.00% BenchmarkSampleIterator_100k-4 434 422 -2.76% BenchmarkSampleIterator_1M-4 449 439 -2.23% benchmark old allocs new allocs delta BenchmarkSampleIterator_1k-4 3 3 +0.00% BenchmarkSampleIterator_100k-4 3 3 +0.00% BenchmarkSampleIterator_1M-4 3 3 +0.00% benchmark old bytes new bytes delta BenchmarkSampleIterator_1k-4 304 304 +0.00% BenchmarkSampleIterator_100k-4 304 304 +0.00% BenchmarkSampleIterator_1M-4 304 304 +0.00% The speedup would presumably increase when multiple sample iterators are used concurrently.	2016-12-15 12:33:19 -08:00
Jonathan A. Sternberg	ec57108520	Use proper uber-go/zap import path It looks like the real import path to the project is go.uber.org/zap instead of github.com/uber-go/zap since the example in the project references that path.	2016-12-15 08:54:14 -06:00
Jonathan A. Sternberg	21502a39e8	Switch logging to use structured logging everywhere The logging library has been switched to use uber-go/zap. While the logging has been changed to use structured logging, this commit does not change any of the logging statements to take advantage of the new structured log or new log levels. Those changes will come in future commits.	2016-12-14 10:45:15 -06:00
Jonathan A. Sternberg	bffc759cf9	Return the time from a percentile call on an integer `percentile()` is supposed to be a selector and return the time of the point, but that only got changed when the input was a float. Updating the integer processor to also return the time of the point rather than the beginning of the interval.	2016-12-01 12:34:48 -06:00
Jonathan A. Sternberg	e0c1908683	Merge pull request #7644 from influxdata/js-fix-empty-variable-serialization Quote the empty string as an ident	2016-11-29 12:16:35 -06:00
Jonathan A. Sternberg	b4db76cee2	Introduce syntax for marking a partial response with chunking The `partial` tag has been added to the JSON response of a series and the result so that a client knows when more of the series or result will be sent in a future JSON chunk. This helps interactive clients who don't want to wait for all of the data to know if it is done processing the current series or the current result. Previously, the client had to guess if the next chunk would refer to the same result or a new result and it had to match the name and tags of the two series to know if they were the same series. Now, the client just needs to check the `partial` field included with the response to know if it should expect more. Fixed `max-row-limit` so it counts rows instead of results and it truncates the response when the `max-row-limit` is reached.	2016-11-22 11:16:22 -06:00
Jonathan A. Sternberg	c957bf7f99	Quote the empty string as an ident Without this quoting, the function `max("")` turns into `max()` and will not be reparsed correctly.	2016-11-18 16:25:39 -06:00
Jonathan A. Sternberg	e885fe5117	Expand string and boolean fields when using a wildcard with sample()	2016-11-15 15:56:47 -06:00
Jonathan A. Sternberg	64c2d704da	Avoid deadlock when max-row-limit is hit When the `max-row-limit` was hit, the goroutine reading from the results channel would stop reading from the channel, but it didn't signal to the sender that it was no longer reading from the results. This caused the sender to continue trying to send results even though nobody would ever read it and this created a deadlock. Include an `AbortCh` on the `ExecutionContext` that will signal when results are no longer desired so the sender can abort instead of deadlocking.	2016-11-08 13:12:28 -06:00
Tom Young	24fa1ac1c0	Remove old function which is no longer used.	2016-11-06 13:38:59 +00:00
Jonathan A. Sternberg	1b2fa645ee	Fix incorrect grouping when multiple aggregates are used with sparse data When a query would use a grouping with two different aggregates, it was possible for one of the aggregates to return a value from a different series key than the second aggregate. When these series keys didn't match, the returned grouping would be screwed up because it sorted by time before checking for name and tags. This did not happen when the aggregates returned values for the same series keys because then the iterators were aligned with each other.	2016-11-02 13:35:22 -05:00
Jonathan A. Sternberg	83e998fbed	Support the ON syntax in SHOW TAG VALUES The parser was updated previously in #7295 and the functionality was supposed to be there, but the wiring in the query engine for that to happen was never written.	2016-11-01 15:54:45 -05:00
Jonathan A. Sternberg	ce1831160d	Fix output duration units for SHOW QUERIES The previous version was showing the microseconds unit when it was outputting nanoseconds. Now we correctly identify which sub-second unit to use (milliseconds, microseconds, or nanoseconds) and use the correct unit while dividing the duration unit correctly to produce the correct output. Also updated to use the default duration string instead of our own custom formatters. It turns out that the string method for `time.Duration` does the correct thing as long as we truncate the value first.	2016-10-31 12:48:01 -05:00
Jason Wilder	0b6f5441b9	Add config option to messages when limits exceeded When a limit is exceeded, we return errors and sometimes log (if appropriate) that a limit was exceeded. The messages don't always provide an indication as to where or how they are configured. Instead, return the config option (easily searchable for) as well as the limit currently set and the value that exceeded it when possible.	2016-10-28 14:54:45 -06:00
Jason Wilder	af72d9b0e4	Merge pull request #7515 from influxdata/jw-7053 Return parse error from delete/drop when db or rp is specified	2016-10-25 12:05:56 -06:00
Jason Wilder	c68b7a192f	Return parse error from delete/drop when db or rp is specified The delete and drop statements apply to the measurement within a db. The parser allowed a db or rp to be specified and these values were silently ignored. This could cause data loss as someone would think they are only deleting the series within a rp, but they are actually deleting all their data. Instead, we return a parse error if a db or rp is specified in the delete or drop statements. Ideally, we'd be able to respect the db and rp, but that requires significant work in the query engine and tsdb store to make that work. Fixes #7053	2016-10-25 11:43:15 -06:00
Edd Robinson	b12b0d12fb	Add regex benchmarks and fix existing approach	2016-10-25 11:10:03 +01:00
Edd Robinson	06d1226b9a	Rewrite exact match regexes to use tsdb index This commit adds support for replacing regexes with non-regex conditions when possible. Currently the following regexes are supported: - host =~ /^foo$/ will be converted into host = 'foo' - host !~ /^foo$/ will be converted into host != 'foo' Note: if the regex expression contains character classes, grouping, repetition or similar, it may not be rewritten. For example, the condition: name =~ /^foo\|bar$/ will not be rewritten. Support for this may arrive in the future. Regexes that can be converted into simpler expression will be able to take advantage of the tsdb index, making them significantly faster.	2016-10-25 11:10:03 +01:00
Jonathan A. Sternberg	19a61dbb44	Align binary math expression streams by time Also fills in missing values using the fill expression for any binary aggregation.	2016-10-18 13:31:13 -05:00
Mark Rushakoff	0ddb7ad842	Disallow derivative call with non-duration 2nd arg Previously, calling derivative with a non-duration second argument was allowed during parsing but would panic during execution due to a failed type conversion. This change ensures the second argument is a duration literal.	2016-10-17 16:20:53 -07:00
Jonathan A. Sternberg	3496c5b85f	Merge pull request #7442 from influxdata/js-5955-make-regex-work-on-field-keys-in-select Support using regexes to select fields and dimensions	2016-10-17 11:37:47 -05:00
Jonathan A. Sternberg	b60b4b371e	Support using regexes to select fields and dimensions The functionality works the same as wildcards, but this time, you can specify a regular expression. One limitation is that you can't specify whether you only want to select fields or tags. Since the regex can be changed to suit the person's needs, I don't currently think this is an issue.	2016-10-13 22:17:14 -05:00
Jonathan A. Sternberg	95859b8ab4	Remove accidentally added string support for the stddev call Strings would always return an empty string and stddev is meaningless when it comes to strings. This removes that functionality so strings don't automatically get picked up when using a wildcard.	2016-10-10 14:58:28 -05:00
Jonathan A. Sternberg	6afc2a77a5	Implement cumulative_sum() function The `cumulative_sum()` function can be used to sum each new point and output the current total. For the following points: cpu value=2 0 cpu value=4 10 cpu value=6 20 This would output the following points: > SELECT cumulative_sum(value) FROM cpu time value ---- ----- 0 2 10 6 20 12 As can be seen, each new point adds to the sum of the previous point and outputs the value with the same timestamp. The function can also be used with an aggregate like `derivative()`. > SELECT cumulative_sum(mean(value)) FROM cpu WHERE time >= now() - 10m GROUP BY time(1m)	2016-10-07 10:11:53 -05:00
Michael Desa	f9b8129770	Add sample function to query language First Pass at implementing sample Add sample iterators for all types Remove size from sample struct Fix off by one error when generating random number Add benchmarks for sample iterator Add test and associated fixes for off by one error Add test for sample function Remove NumericLiteral from sample function call Make clear that the counter is incr w/ each call Rename IsRandom to AllSamplesSeen Add a rng for each reducer that is created The default rng that comes with math/rand has a global lock. To avoid having to worry about any contention on the lock, each reducer now has its own time seeded rng. Add sample function to changelog	2016-10-06 09:41:42 -07:00
Michael Desa	966e5503bf	Add fill(linear) to query language Clean up template for fill average Change fill(average) to fill(linear) Update average to linear in influxql spec Add Integer Tests and associated fixes Update CHANGELOG for fill(linear)	2016-10-04 14:27:04 -07:00
Jason Wilder	a3fd12198e	Avoid extra allocations when evaluating binary expressions	2016-09-29 13:18:38 -06:00
Jonathan A. Sternberg	3afdf3cd94	Merge tag 'v1.0.1'	2016-09-27 17:53:33 -05:00
Jonathan A. Sternberg	dbc4a9150f	Prevent manual use of system queries Manual use of system queries could result in a user using the query incorrectly. Rather than check to make sure the query was used correctly, we're just going to prevent users from using those sources so they can't use them incorrectly.	2016-09-23 10:00:18 -05:00
Cory LaNou	acbf193640	add test to prevent future parsing regressions for time durations	2016-09-16 11:44:05 -05:00
Jason Wilder	a6d3e46893	Fix panic when parsing ms durations	2016-09-16 08:47:18 -06:00
Jonathan A. Sternberg	635ce337f0	Merge pull request #7304 from influxdata/js-remove-substatement-method Remove defunct `Substatement()` call	2016-09-15 08:32:40 -05:00
Jonathan A. Sternberg	c11cbc5f05	Merge pull request #7309 from influxdata/js-go-vet-for-1.7 Update source files to pass vet checks for go 1.7	2016-09-15 08:32:30 -05:00
Jonathan A. Sternberg	477d6231db	Update source files to pass vet checks for go 1.7 The vet checks for some files did not pass for go 1.7. As part of a preliminary start to making go 1.7 work with this software, go vet should pass. Also updated the gogo/protobuf dependency which fixed the code generator to work with go 1.7 too. Ran `go generate` on the entire repository to ensure every file was up to date.	2016-09-14 15:01:22 -05:00
Cory LaNou	71f0c7e1e9	return appropriate error if overflowing duration when parsing	2016-09-14 09:27:38 -05:00
Jonathan A. Sternberg	0b94f5dc1a	Skip past points at the same time in derivative call within a merged series The derivative() call would panic if it received two points at the same time because it tried to divide by zero. The derivative call now skips past these points. To avoid skipping past these points, use `GROUP BY *` so that each series is kept separated into their own series. The difference() call has also been modified to skip past these points. Even though difference doesn't divide by the time, difference is supposed to perform the same as derivative, but without dividing by the time.	2016-09-13 16:57:36 -05:00
Jonathan A. Sternberg	dbb8c5570c	Duplicate parsing bug in ALTER RETENTION POLICY Return an error when we encounter the same option twice in ALTER RETENTION POLICY and remove the `maxNumOptions` number from the parsing loop. The `maxNumOptions` number would need to be modified if another option was added to the parsing loop and it didn't correctly prevent duplicate options from being reported as an error anyway.	2016-09-13 15:56:13 -05:00
Jonathan A. Sternberg	aae88fc3c3	Support ON and use default database for SHOW commands Normalize all of the SHOW commands so they allow both using ON to specify the database and using the default database. Some commands would require one and some would require the other and it was confusing when using the query language. Affected commands: * SHOW RETENTION POLICIES * SHOW MEASUREMENTS * SHOW SERIES * SHOW TAG KEYS * SHOW TAG VALUES * SHOW FIELD KEYS	2016-09-13 15:36:59 -05:00
Jonathan A. Sternberg	394c13870b	Remove defunct `Substatement()` call	2016-09-13 14:17:31 -05:00
Jonathan A. Sternberg	4326da0820	Implement time math for lazy time literals When attempting to reduce the WHERE clause, the time literals had not been converted from string literals yet. This adds the functionality to have it handle the same time math when the time literal is still a string literal.	2016-09-09 13:34:56 -05:00
Jonathan A. Sternberg	04c59b8941	Fix the dollar sign so it properly handles reserved keywords The dollar sign would sometimes be accepted as whitespace if it was immediately followed by a reserved keyword or an invalid character. It now reads these properly as a bound parameter rather than ignoring the dollar sign.	2016-09-02 15:32:46 -05:00
Jonathan A. Sternberg	4ff0b10210	Merge pull request #7139 from influxdata/js-7137-show-tag-values-string-method Properly output the SHOW TAG VALUES command so it can be reparsed	2016-09-01 10:19:19 -05:00
Jonathan A. Sternberg	dc2527ce86	Merge branch '1.0'	2016-08-31 14:45:57 -05:00
Jonathan A. Sternberg	23f2d50ecb	Use defaults from `meta` package for `CREATE DATABASE` Instead of having the parser set the defaults, the command will set the defaults so that the constants for that are actually used. This way we can also identify which things the user provided and which ones we are filling with default values. This allows the meta client to be able to make smarter decisions when determining if the user requested a conflict or if the requested capabilities match with what is currently available. If you just say `CREATE DATABASE WITH NAME myrp`, the user doesn't really care what the duration of the retention policy is and just wants to use the default. Now, we can use that information to determine if an existing retention policy would conflict with what the user requested rather than returning an error if a default value ever gets changed since the meta client command can communicate intent more easily.	2016-08-30 13:23:49 -05:00
Nathaniel Cook	888dc8cbd2	Merge pull request #7234 from influxdata/nc-influxql-readme Update Influxql Readme	2016-08-29 13:09:34 -06:00
Jonathan A. Sternberg	f67558c2a7	Merge pull request #7236 from influxdata/js-7220-revert-limit-shard-concurrency Revert "limit shard concurrency"	2016-08-29 13:41:46 -05:00
Nathaniel Cook	3ab4e9fa1d	update InfluxQL readme to reflect current code	2016-08-29 12:33:55 -06:00
Jonathan A. Sternberg	c05c7f6360	Revert "limit shard concurrency" This reverts commit `6c7d56d4bc`.	2016-08-29 12:39:52 -05:00
Jonathan A. Sternberg	b8a70105aa	Fix alter retention policy when all options are used We added `SHARD DURATION` as an extra option, but forgot to increase the maximum number of allowable options from 3 to 4. So if 4 options were used, the last one was ignored. This was commonly `DEFAULT`, but it could have been any of the options.	2016-08-26 11:25:18 -05:00
Jonathan A. Sternberg	8b234546a8	Merge pull request #7204 from influxdata/1.0 Merge 1.0 branch to master	2016-08-25 15:20:30 -05:00
Jonathan A. Sternberg	10029caf2f	Support negative timestamps in the query engine Negative timestamps are now supported. We also now refuse two nanoseconds that are at the edge of the minimum time window. One of the nanoseconds we do not accept is because we need MinInt64 to be used for some internal comparisons in the TSM engine and it was causing an underflow when we subtracted one from the minimum time. The second is so we can have one minimum time that signifies the default minimum that nobody can write to (so we can implicitly rewrite the timestamp on aggregate queries) but still use the explicit timestamp if it is given to us by the user. We aren't able to tell the difference between if the user provided it or if it was implicit without those values being different. If the default minimum time is used with an aggregate query, we rewrite the time to be the epoch for backwards compatibility since we believe that's more important than supporting that extra nanosecond.	2016-08-25 12:52:41 -05:00
Jonathan A. Sternberg	993ac1ca2e	Remove confusing comment and unnecessary continue	2016-08-23 19:43:18 -05:00
Ashish Gaurav	4e17f9bb13	add mode() function & tests	2016-08-23 19:31:41 -05:00
Edd Robinson	90ff713f21	Fix base64 encoding issue in stats Fixes #7177.	2016-08-22 15:21:31 +01:00
Ben Johnson	8aa224b22d	reduce memory allocations in index This commit changes the index to point to index data in the shards instead of keeping it in-memory on the heap.	2016-08-16 14:09:00 -06:00
Jonathan A. Sternberg	f0f7d91d6c	Properly output all commands so they can be reparsed The commands fixed: * SHOW TAG VALUES * SHOW STATS * SHOW DIAGNOSTICS	2016-08-15 15:04:51 -05:00
Jonathan A. Sternberg	87f7c66b8a	Merge pull request #7119 from influxdata/js-create-database-use-defaults Use defaults from `meta` package for `CREATE DATABASE`	2016-08-11 10:34:22 -05:00
Jonathan A. Sternberg	32d10de94f	Check in between query statements to see if the query was interrupted This allows a long series of uninterruptible statements to still be interrupted for a long running query that might do something like create or drop many databases.	2016-08-10 15:36:02 -05:00
Jonathan A. Sternberg	ab049d7f0a	Support mixed duration units It is now possible to use a mixed duration unit like `1h30m`. The duration units can be in whatever order as long as they are connected to each other. There is a change to the scanner. A token such as `10x` will be scanned as a duration literal, but will then fail to parse as an invalid duration. This should not be a breaking change as there is no situation where `10m10` was a valid order of tokens for the parser. Fixes #3634.	2016-08-10 13:34:19 -05:00
Jonathan A. Sternberg	3959656968	Add additional statistics to query executor The query executor would only store the number of active queries and the query duration so it was impossible to determine how many queries were actually executed during that timeframe because quick queries would be gone before the call to gather statistics was made. This adds two new statistics to track when queries start and when queries finish and doesn't decrement the counter so the number of executed queries can be obtained using `derivative()` and `difference()`.	2016-08-10 11:35:06 -05:00
Jonathan A. Sternberg	530b00bd76	Use defaults from `meta` package for `CREATE DATABASE` Instead of having the parser set the defaults, the command will set the defaults so that the constants for that are actually used. This way we can also identify which things the user provided and which ones we are filling with default values. This allows the meta client to be able to make smarter decisions when determining if the user requested a conflict or if the requested capabilities match with what is currently available. If you just say `CREATE DATABASE WITH NAME myrp`, the user doesn't really care what the duration of the retention policy is and just wants to use the default. Now, we can use that information to determine if an existing retention policy would conflict with what the user requested rather than returning an error if a default value ever gets changed since the meta client command can communicate intent more easily.	2016-08-09 12:00:06 -05:00
Ben Johnson	55b3e63ced	concurrent series limit This commit fixes the `MaxSelectSeriesN` limit which was broken by the implementation of lazy iterators. The setting previously limited the total number of series but the new implementation limits the concurrent number of series being processed.	2016-08-09 08:58:01 -06:00
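
The running-total behaviour described in commit 6afc2a77a5 (`cumulative_sum()`) can be illustrated with a minimal Go sketch. This is not InfluxDB's implementation: the `Point` type and the `CumulativeSum` function are hypothetical names introduced only for this example, and the input is the three sample points from that commit message.

```go
package main

import "fmt"

// Point is a hypothetical (timestamp, value) pair used only for this sketch;
// it is not the point type used by the InfluxDB query engine.
type Point struct {
	Time  int64
	Value float64
}

// CumulativeSum emits one output point per input point. Each output keeps
// the input's timestamp and carries the sum of all values seen so far.
func CumulativeSum(points []Point) []Point {
	out := make([]Point, 0, len(points))
	var sum float64
	for _, p := range points {
		sum += p.Value
		out = append(out, Point{Time: p.Time, Value: sum})
	}
	return out
}

func main() {
	// Sample points from the commit message: cpu value=2 0, value=4 10, value=6 20.
	in := []Point{{Time: 0, Value: 2}, {Time: 10, Value: 4}, {Time: 20, Value: 6}}
	for _, p := range CumulativeSum(in) {
		fmt.Println(p.Time, p.Value)
	}
}
```

Run against the sample points, the sketch yields the values 2, 6 and 12 at timestamps 0, 10 and 20, matching the table in the commit message.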

1100 Commits (master)