influxdb

Commit Graph

Author	SHA1	Message	Date
Sam Arnold	98361e2073	fix: error instead of panic for statement rewrite failure (#21792)	2021-07-06 11:05:21 -04:00
Sam Arnold	903b8cd0ea	feat(query): Hyper log log operators in influxql (#20603) * feat(query): hyper log log counting in query engine In addition to helping with normal queries, this can improve the 'SHOW CARDINALITY' meta-queries: time influx -database mydb -execute 'select count_hll(sum_hll(_seriesKey)) from big' name: big time count_hll ---- --------- 0 200767781 influx -database mydb -execute 0.06s user 0.12s system 0% cpu 8:49.99 total	2021-02-08 08:38:14 -05:00
Ayan George	2529799abb	refactor: Change ToLower comparisons to EqualFold (#18147) When comparing strings in a case-insensitive way, strings.EqualFold() is (almost?) always faster than comparing the results of strings.ToLower(). In addition, strings.EqualFold() never causes an allocation. This patch replaces case-insensitive string comparisons that use strings.ToLower() with a strings.EqualFold() call.	2020-05-18 19:46:59 -04:00
Adam	ec32b3a4bb	fix(query/compile.go): time range was exceeding min/max bounds under certain conditions added test for missing lower bound	2019-08-14 09:59:17 -04:00
Jonathan A. Sternberg	728bdd6562	Fix the derivative and others time ranges for aggregate data The derivative function and others similar to it would preload themselves with data so that the first interval would be the start of the time range. That meant reading data outside of the time range. One change to the shard mapper back in v1.4.0 caused the shard mapper to constrict queries to the intervals given to the shard mapper. This was correct because the shard mapper can only deal with times it has mapped, but this broke the functionality of looking back into the past for the derivative and other functions that used that functionality. The compiler has been updated with an additional attribute that records how many intervals in the past will need to be read so that the shard mapper can include extra times that it may not necessarily read from, but may be queried because of the above described functionality.	2018-09-10 09:59:33 -05:00
Jonathan A. Sternberg	c6811f4725	Fix the inherited interval for derivative and others The compiler was too strict. The inherited interval from an outer query should not have caused the inner query to fail because inherited intervals are only implicitly passed to inner queries that support group by time functionality. Since the inner query with a derivative doesn't support grouping by time and the inner query itself doesn't specify a time, the outer query shouldn't have invalidated the inner query.	2018-08-30 13:01:18 -05:00
Patrick Hemmer	7dc7efd501	rename "triple_exponential_average" -> "triple_exponential_derivative"	2018-05-16 19:40:12 -04:00
Jonathan A. Sternberg	d42062def2	Add technical analysis algorithms This adds numerous technical analysis algorithms: * exponential_moving_average * double_exponential_moving_average * triple_exponential_moving_average * relative_strength_index * triple_exponential_average * kaufmans_efficiency_ratio (commonly referred to as just "Efficiency Ratio") * kaufmans_adaptive_moving_average * chande_momentum_oscillator (both the common 'smoothed' version, and the ta-lib version)	2018-04-23 22:27:21 -04:00
Jonathan A. Sternberg	58bcc6fdc9	Fix the validation for multiple nested distinct calls	2018-04-23 14:38:02 -05:00
Tom Young	42581c7432	Add new math functions: - abs - asin, acos, atan, atan2 - exp, ln, log, log2, log10 - pow, sqrt	2018-04-17 12:56:36 -05:00
Jonathan A. Sternberg	92dd6ea978	Fix regression to allow now() to be used as the group by offset again	2018-03-26 10:52:32 -05:00
Jonathan A. Sternberg	6e627cfdbf	Implement basic trigonometry functions This adds support for math functions into the query language. Math functions are special because they are transformations and do not access the filesystem in the same way aggregate functions do. A transformation takes one point and always outputs one point making it more similar to binary expressions so these math functions follow the same rules as binary expressions. This also supports using math literals (so you can do `sin(1)`) and the math functions can be used anywhere such as in a field or an expression. Both of the following are supported: SELECT sin(value) FROM cpu SELECT value FROM cpu WHERE sin(value) > 0.5 Arguments are in radians. Degrees is not supported.	2018-03-20 14:13:52 -05:00
Jonathan A. Sternberg	f8d60a881d	Refactor the math engine to compile the query and use eval This change makes it so that we simplify the math engine so it doesn't use a complicated set of nested iterators. That way, we have to change math in one fewer place. It also greatly simplifies the query engine as now we can create the necessary iterators, join them by time, name, and tags, and then use the cursor interface to read them and use eval to compute the result. It makes it so the auxiliary iterators and all of their complexity can be removed. This also makes use of the new eval functionality that was recently added to the influxql package. No math functions have been added, but the scaffolding has been included so things like trigonometry functions are just a single commit away. This also introduces a small breaking change. Because of the call optimization, it is now possible to use the same selector multiple times as a selector. So if you do this: SELECT max(value) * 2, max(value) / 2 FROM cpu This will now return the timestamp of the max value rather than zero since this query is considered to have only a single selector rather than multiple separate selectors. If any aspect of the selector is different, such as different selector functions or different arguments, it will consider the selectors to be aggregates like the old behavior.	2018-03-19 15:01:15 -05:00
Jonathan A. Sternberg	c8b0c6e166	Update influxql to include the function type evaluators in the query package	2018-03-14 15:42:28 -05:00
Jonathan A. Sternberg	733d842812	Turn the ExecutionContext into a context.Context Along with modifying ExecutionContext to be a context and have the TaskManager return the context itself, this also creates a Monitor interface and exposes the Monitor through the Context. This way, we can access the monitor from within the query.Select method and keep all of the limits inside of the query package instead of leaking them into the statement executor. An eventual goal is to remove the InterruptCh from the IteratorOptions and use the Context instead, but for now, we'll just assign the done channel from the Context to the IteratorOptions so at least they refer to the same channel.	2018-03-08 14:03:20 -06:00
Jonathan A. Sternberg	9e122eb1a4	Fix the implicit time range in a subquery The implicit time range for an interval is supposed to be now when no end is specified. In a subquery though, the interval doesn't exist and so it doesn't set the end time to now, but to the max time. Since the subquery qualifies as something that should have the implicit end time apply, this results in a query that runs slowly because it is filling in a bunch of unasked for intervals if a fill is specified. This hack adds the implicit end time if it sees the parent query's end time is set to the maximum available time. This is a temporary fix for this problem. The query compilation should perform these time range calculations in the compilation stage and the subqueries should use the compilation stage during execution instead of ignoring it. That work takes a lot more effort though and is more prone to running into unforeseen bugs. This fix introduces a subtle, but likely rare to run into bug. If the top level query specifies the maximum time as the end time and the subquery has an interval, the subquery should use the end time rather than now as the time range. With this hack, it will interpret it as an implicit time rather than an explicit one. This is unlikely to matter though.	2018-02-27 17:10:10 -06:00
Jonathan A. Sternberg	ca471f7d0f	Fix regression when math between literals is used in a field	2018-02-14 14:34:34 -05:00
Jonathan A. Sternberg	db60a83d5a	Fix query compilation so multiple nested distinct calls is allowable When refactoring the query engine, I thought calling `count(distinct(value))` multiple times was disallowed and so the refactor made it so that wasn't possible. It turns out that this pattern is allowed: since the distinct is nested, it is aggregated anyway and can be combined with other aggregates. This removes the erroneously placed restriction.	2017-11-28 11:09:32 -06:00
Edd Robinson	fbcb299b8a	Support WHERE time clause in SHOW TAG VALUES This commit adds time support to SHOW TAG VALUES. Time can be used as both a lower and upper boundary. However, there are some caveats. For the `inmem` index, filtering by time will still return all results because the index data is shared across shards. For the `tsi1` index, filtering by time will only work down to the shard level. Specifically, when querying by time all shards within that time range will be used to generate the results.	2017-11-06 19:15:01 +00:00
Edd Robinson	98d584b63f	Use index for SHOW X meta queries When a meta query does not include a time component then it can be answered exclusively by the index. This should result in a much faster query execution than if the TSM engine was engaged. This commit rewrites the following queries such that they make use of the index where no time component is present: - SHOW MEASUREMENTS - SHOW SERIES - SHOW TAG KEYS - SHOW FIELD KEYS	2017-11-06 19:15:00 +00:00
Stuart Carnie	f3d45ba301	influxdata/influxdb/influxql -> influxdata/influxql	2017-10-30 14:40:26 -07:00
Jonathan A. Sternberg	f20cab6e99	Implicitly decide on the lower limit for fill queries when none is present This allows the query: SELECT mean(value) FROM cpu GROUP BY time(1d) To function in some way that makes sense. The upper limit is implicitly the `now()` starting time and the lower limit will be whichever interval the lowest point falls into. When no lower bound is specified and `max-select-buckets` is specified, the query will only consider points that would satisfy `max-select-buckets`. So if you have one point written in 1970, have another point within the last minute, and then do the above query with `max-select-buckets` being equal to 10, the older point from 1970 will not be considered.	2017-10-05 15:56:44 -05:00
Jonathan A. Sternberg	9cbd604603	Fix time constraints in subqueries from the refactor	2017-09-08 11:55:53 -05:00
Jonathan A. Sternberg	1c7bafcd3e	Force subqueries to match the parent queries ordering Previously, subqueries would honor their own ordering. We never really supported that and I have no idea if it would work since most parts in the query engine assume that points are being delivered in only one ordering. Subqueries have now been modified so if a person tries to do different ordering, they get an error when running the query. If they specify an ordering in the top most query, that ordering gets propagated to all subqueries. Fixes #8699.	2017-08-28 15:57:40 -05:00
Jonathan A. Sternberg	5dbcd1b06f	Merge pull request #8745 from influxdata/js-refactor-validation Refactor validation code and move it to the compiler	2017-08-28 11:35:24 -05:00
Jonathan A. Sternberg	d2fcb893e1	Close the query shard group after the iterators are created Now, the prepared statement keeps the open resource and closing the open resource created from `Prepare` is the responsibility of the prepared statement. This also nils out the local shard mapping after it is closed to prevent it from being used after it is closed.	2017-08-28 09:46:11 -05:00
Jonathan A. Sternberg	905e7fe05e	Refactor validation code and move it to the compiler This refactors the validation code so it is more flexible and performs a small bit of work to make preparing and executing the query easier. The general idea is that compilation will eventually do more heavy lifting in creating the initial plan and prepare will construct an actual plan rather than just doing some basic field rewriting. This change at least sets us up for that change in the future and moves the validation code to the query execution instead of in the parser. This also frees up the parser to parse the complete AST without worrying if the query itself is valid. That could be useful for client code that wants to compile a partial query to an AST and then perform modifications on the AST for some reason.	2017-08-26 17:36:32 -05:00
Jonathan A. Sternberg	8738e72cf1	Refactor the select call into three separate phases The first call is to compile the query. This performs some initial processing that can be done before having any access to the shards. At the moment, it does very little, but it's intended to be changed to eventually perform initial validations of the query and create an internal graph structure for the execution of the query. The second call is to prepare the query. This step has access to the shard mapper. Right now, it just maps the shards and rewrites the fields of the query for any wildcards. In the future, it is intended to do the above, but also to prepare the final directed acyclic graph that will execute the query. The third call is to select the query. This step is intended to create all of the iterators for processing the query. At the moment, much of the work intended for the second step is performed in the third step.	2017-08-25 07:50:13 -05:00

28 Commits (933a14e16f0232c5ec883644b7363a59da7bcd90)