influxdb

Commit Graph

Author	SHA1	Message	Date
davidby-influx	df39b1e71c	fix(query): Group By queries with offset that crosses a DST boundary can fail (#20230) Customer reported that a GROUP BY query with an offset that caused an interval to cross a daylight savings change inserted an extra output row off by one hour. This fix ensured that the start time for the interval of a GROUP BY operator is correctly set before calculating the time zone offset for that date and time. Add TestGroupByIterator_DST() in query/iterator_test.go for regression testing of this bug. Fixes https://github.com/influxdata/influxdb/issues/20238	2020-12-04 09:40:43 -08:00
Jonathan A. Sternberg	65b6b1dbb0	Use the timezone when evaluating time literals in subqueries	2019-04-22 10:45:40 -05:00
Ben Wells	e9bada090f	Fix misspelling identified by misspell	2019-02-03 20:27:43 +00:00
Edd Robinson	e7a40402ea	Merge pull request #10276 from brookst/abs-fix Fix bug with incorrect ABS results for negative integer values	2019-02-01 07:54:46 -08:00
Jonathan A. Sternberg	9f1ac41b85	Pass the query authorizer to subqueries The query authorizer was not being properly passed to subqueries so rejections did not happen when a subquery was the one reading the value. Similarly, the max series limit was not being propagated downwards either.	2018-12-07 15:09:35 -06:00
Jonathan A. Sternberg	22fc9f6a19	Strip tags from a subquery when the outer query does not group by that tag The following would, erroneously, not strip the tag from the inner query: SELECT value FROM (SELECT value FROM cpu GROUP BY host) The inner query was supposed to group by the host tag, but the outer query should strip it away since it is not being grouped by anymore. This fixes things so that the result will have the tags stripped away when they are not requested in the grouping.	2018-10-04 10:05:46 -05:00
Tim Brooks	343ce42238	Fix #10261 ABS(int64) Added inline bit-shift absolute value for int64 type as per: http://cavaliercoder.com/blog/optimized-abs-for-int64-in-go.html Updated implementations in Iterator and the continuous query service	2018-09-11 21:53:44 +01:00
Jonathan A. Sternberg	0f304690c5	Enable casting values from a subquery This also fixes the cursor system to abandon iterators that will not produce meaningful results since the variables are all unknown types. This creates a weird behavior that existed in previous releases and we are keeping here for backwards compatibility. If a subquery referenced a field that didn't exist in the subquery, it will return nothing. But, if there are two subqueries and one of them has the field exist and the other doesn't, the second will return all null values.	2018-03-30 16:58:37 -05:00
Stuart Carnie	aa61359cc7	Storage RPC API improvements. See PR for details * reduce # allocations (115M -> 22M) * reduce size allocations (53GB -> 1.3GB) * reduce RPC query time (45s -> 12.9s)	2018-03-21 13:46:09 -07:00
Jonathan A. Sternberg	f8d60a881d	Refactor the math engine to compile the query and use eval This change makes it so that we simplify the math engine so it doesn't use a complicated set of nested iterators. That way, we have to change math in one fewer place. It also greatly simplifies the query engine as now we can create the necessary iterators, join them by time, name, and tags, and then use the cursor interface to read them and use eval to compute the result. It makes it so the auxiliary iterators and all of their complexity can be removed. This also makes use of the new eval functionality that was recently added to the influxql package. No math functions have been added, but the scaffolding has been included so things like trigonometry functions are just a single commit away. This also introduces a small breaking change. Because of the call optimization, it is now possible to use the same selector multiple times as a selector. So if you do this: SELECT max(value) * 2, max(value) / 2 FROM cpu This will now return the timestamp of the max value rather than zero since this query is considered to have only a single selector rather than multiple separate selectors. If any aspect of the selector is different, such as different selector functions or different arguments, it will consider the selectors to be aggregates like the old behavior.	2018-03-19 15:01:15 -05:00
Jonathan A. Sternberg	733d842812	Turn the ExecutionContext into a context.Context Along with modifying ExecutionContext to be a context and have the TaskManager return the context itself, this also creates a Monitor interface and exposes the Monitor through the Context. This way, we can access the monitor from within the query.Select method and keep all of the limits inside of the query package instead of leaking them into the statement executor. An eventual goal is to remove the InterruptCh from the IteratorOptions and use the Context instead, but for now, we'll just assign the done channel from the Context to the IteratorOptions so at least they refer to the same channel.	2018-03-08 14:03:20 -06:00
Jonathan A. Sternberg	9e122eb1a4	Fix the implicit time range in a subquery The implicit time range for an interval is supposed to be now when no end is specified. In a subquery though, the interval doesn't exist and so it doesn't set the end time to now, but to the max time. Since the subquery qualifies as something that should have the implicit end time apply, this results in a query that runs slowly because it is filling in a bunch of unasked for intervals if a fill is specified. This hack adds the implicit end time if it sees the parent query's end time is set to the maximum available time. This is a temporary fix for this problem. The query compilation should perform these time range calculations in the compilation stage and the subqueries should use the compilation stage during execution instead of ignoring it. That work takes a lot more effort though and is more prone to running into unforeseen bugs. This fix introduces a subtle, but likely rare to run into bug. If the top level query specifies the maximum time as the end time and the subquery has an interval, the subquery should use the end time rather than now as the time range. With this hack, it will interpret it as an implicit time rather than an explicit one. This is unlikely to matter though.	2018-02-27 17:10:10 -06:00
Edd Robinson	21f0c6415b	Cleanup query package	2018-01-21 12:08:23 -08:00
Edd Robinson	6443355467	Pass through SystemIterator in PB	2017-11-08 19:57:16 +00:00
Edd Robinson	fbcb299b8a	Support WHERE time clause in SHOW TAG VALUES This commit adds time support to SHOW TAG VALUES. Time can be used as both a lower and upper boundary. However, there are some caveats. For the `inmem` index, filtering by time will still return all results because the index data is shared across shards. For the `tsi1` index, filtering by time will only work down to the shard level. Specifically, when querying by time all shards within that time range will be used to generate the results.	2017-11-06 19:15:01 +00:00
Stuart Carnie	f3d45ba301	influxdata/influxdb/influxql -> influxdata/influxql	2017-10-30 14:40:26 -07:00
Stuart Carnie	e9313876ab	EXPLAIN ANALYZE * Introduces EXPLAIN ANALYZE command, which produces a detailed tree of operations used to execute the query. introduce context.Context to APIs metrics package * create groups of named measurements * safe for concurrent access tracing package EXPLAIN ANALYZE implementation for OSS Serialize EXPLAIN ANALYZE traces from remote nodes use context.Background for tests group with other stdlib packages additional documentation and remove unused API use influxdb/pkg/testing/assert remove testify reference	2017-10-20 08:01:37 -07:00
Jonathan A. Sternberg	79092610c8	Support unsigned binary math in fields Field math works similar to condition evaluation, but not the exact same because we have more information to work with in field expressions than we do in conditional math because fields retain the information about their source while conditions do not. The main difference is that you cannot add an unsigned literal to the output of an integer iterator while you can inside of a condition. You can perform math on a positive integer literal to an unsigned iterator. Inside of the condition, we aren't sure if an integer is because of a literal or because of an iterator so we can't make that distinction.	2017-10-02 17:06:49 -05:00
Jonathan A. Sternberg	bcf2e8fca5	Prevent deadlock when doing math on the result of a subquery The `fill(none)` attribute got set on subqueries, but that can cause an issue with certain subqueries just like it caused a deadlock on outer queries.	2017-09-22 14:45:53 -05:00
Jonathan A. Sternberg	0ef94e0cf0	Add unsigned iterators for all types This allows unsigned data to be queried from the storage engine. Binary math is not yet implemented for unsigned types.	2017-09-18 15:09:10 -05:00
Jonathan A. Sternberg	5a9553b2c4	Remove unused casting code from the query engine Originally, casting was performed inside of the query engine especially for call iterators. Currently, the engine takes care of all casting so we just need to normalize the iterator types for type safety reasons rather than actual functional reasons. Removing this code. Code coverage showed that it was not hit when run against the actual server. I ran the tests package and got code coverage of the query package while running the tests in that package.	2017-09-18 12:33:34 -05:00
Jonathan A. Sternberg	aa7ae36880	Merge pull request #8826 from influxdata/js-iterator-options-marshaling Fix group by marshaling in the IteratorOptions	2017-09-14 16:46:11 -05:00
Jonathan A. Sternberg	5b3b7a5102	Fix group by marshaling in the IteratorOptions Ensure that marshaling and unmarshaling an IteratorOptions returns the same thing.	2017-09-14 15:40:54 -05:00
Jonathan A. Sternberg	9cbd604603	Fix time constraints in subqueries from the refactor	2017-09-08 11:55:53 -05:00
Jonathan A. Sternberg	590be193e5	Include the number of scanned cached values in the iterator cost	2017-09-06 15:41:07 -05:00
Jonathan A. Sternberg	50d404e690	Initial implementation of explain plan It prints the statistics of each iterator that will access the storage engine. For each access of the storage engine, it will print the number of shards that will potentially be accessed, the number of files that may be accessed, the number of series that will be created, the number of blocks, and the size of those blocks.	2017-09-01 09:01:10 -05:00
Ben Johnson	1dbe0662d8	Use system cursors for measurement, series, and tag key meta queries.	2017-08-30 08:35:20 -06:00
Jonathan A. Sternberg	1c7bafcd3e	Force subqueries to match the parent queries ordering Previously, subqueries would honor their own ordering. We never really supported that and I have no idea if it would work since most parts in the query engine assume that points are being delivered in only one ordering. Subqueries have now been modified so if a person tries to do different ordering, they get an error when running the query. If they specify an ordering in the top most query, that ordering gets propagated to all subqueries. Fixes #8699.	2017-08-28 15:57:40 -05:00
Jonathan A. Sternberg	5cdd1b1489	Fix the panic message for the new interval iterator	2017-08-26 17:36:32 -05:00
Jonathan A. Sternberg	8738e72cf1	Refactor the select call into three separate phases The first call is to compile the query. This performs some initial processing that can be done before having any access to the shards. At the moment, it does very little, but it's intended to be changed to eventually perform initial validations of the query and create an internal graph structure for the execution of the query. The second call is to prepare the query. This step has access to the shard mapper. Right now, it just maps the shards and rewrites the fields of the query for any wildcards. In the future, it is intended to do the above, but also to prepare the final directed acyclic graph that will execute the query. The third call is to select the query. This step is intended to create all of the iterators for processing the query. At the moment, much of the work intended for the second step is performed in the third step.	2017-08-25 07:50:13 -05:00
Jonathan A. Sternberg	96689e661e	Move query engine code from the statement executor to the query engine The statement rewriting logic should be in the query engine as part of preparing a query. This creates a shard mapper interface that the query engine expects and then passes it to the query engine instead of requiring the query to be preprocessed before being input into the query engine. This interface is (mostly) the same as the old interface, just moved to a different package.	2017-08-23 10:07:30 -05:00
Jonathan A. Sternberg	8bd04ebe39	Remove TimeRange function and replace with a more accurate ConditionExpr function The ConditionExpr function is more accurate because it parses the condition and ensures that time conditions are actually used correctly. That means that attempting to combine conditions with OR will not result in the query silently pretending it's an AND and nested conditions work correctly so there is only one way to read the query. It also extracts the non-time conditions into a separate condition so we can stop attempting to parse around the time conditions in lower layers of the storage engine. This change does not remove those hacks, but a following commit should be able to sanitize the condition and remove them.	2017-08-16 16:45:35 -05:00
Jonathan A. Sternberg	9a2357c2c0	Separate the query engine into a separate package This change provides a clear separation between the query engine mechanics and the query language so that the language can be parsed and dealt with separate from the query engine itself.	2017-08-16 13:38:43 -05:00

33 Commits (415361e1ebc0d2d3000f7f98ae67c4f5ce17735e)
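
Commit 343ce42238 above describes an inline bit-shift absolute value for int64. The following is a minimal sketch of that general technique, not the code from the commit itself; the function name absInt64 is hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// absInt64 (hypothetical name) returns the absolute value of x without
// branching. The arithmetic shift x >> 63 yields 0 when x >= 0 and -1
// (all bits set) when x < 0, so (x ^ y) - y leaves non-negative values
// unchanged and negates negative ones.
// Caveat: the minimum int64 value has no positive counterpart, so it
// wraps around and is returned unchanged.
func absInt64(x int64) int64 {
	y := x >> 63
	return (x ^ y) - y
}

func main() {
	for _, v := range []int64{-5, 0, 7} {
		fmt.Println(v, absInt64(v))
	}
}
```

The branch-free form is small enough to be inlined at each call site, which is presumably why the commit message calls it an inline implementation.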