This adds query syntax support for subqueries and teaches the query
engine to execute queries that contain them.
A subquery acts as the source for another query. It is the equivalent of
writing the results of one query to a temporary database, executing
a second query against that temporary database, and then deleting the
database (except that all of this happens in memory).
The syntax is like this:
SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)
This will execute derivative and then sum the result of those derivatives.
Another example:
SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)
This would let you find the maximum of the minimum values from each host.
There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:
SELECT mean(value) FROM cpu
SELECT mean(value) FROM (SELECT value FROM cpu)
have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially in
clustered setups. The second will process the mean at the top level and
will not benefit from that optimization.
Previously, it encoded the text representation of the regex literal
which included the surrounding slashes used in the query language. The
binary encoding should only include the exact string used to create the
regular expression.
This allows us to add additional options to ExecuteQuery without
creating parameter bloat.
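As a rough illustration of the pattern (the struct and field names here are hypothetical, not the actual API), the per-query settings can be grouped into a single options struct so that new settings can be added without changing the function signature:
    // ExecutionOptions bundles per-query settings. Adding a new option only
    // requires a new field here rather than another parameter on ExecuteQuery.
    type ExecutionOptions struct {
        Database  string // default database for the query
        ChunkSize int    // number of points per result chunk
        ReadOnly  bool   // reject statements that would modify data
    }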
Remove the unused Series structs. They were made obsolete by a previous
commit but were never deleted.
Add another type of interrupt iterator that monitors the interrupt
channel and calls `Close()` on the iterator when the interrupt happens.
It will primarily be used for asynchronously closing the ReaderIterator,
but it will only close the read side of the connection properly. More
work needs to be done to allow closing the write side efficiently.
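A minimal sketch of the idea, using a stand-in Iterator interface rather than the real influxql types:
    // Iterator is a stand-in for the real iterator interface.
    type Iterator interface {
        Close() error
    }

    // newCloseInterruptIterator closes the wrapped iterator as soon as the
    // interrupt channel fires, unblocking a read that is in progress. The
    // real version also needs a way to stop the monitoring goroutine when
    // the iterator is closed normally.
    func newCloseInterruptIterator(input Iterator, interrupt <-chan struct{}) Iterator {
        go func() {
            <-interrupt
            input.Close()
        }()
        return input
    }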
Casting is done with the PostgreSQL-style syntax `field1::float` to
specify which type should be used when selecting a field. You can also
write `field1::field` or `tag1::tag` to specify that a field or a tag
should be selected.
This makes it possible to select a tag when a field key and a tag key
conflict with each other in a measurement. It also makes it possible to
choose a specific type for a field when shards disagree about its type.
If no type is given, the existing type-precedence ordering determines
which type to return.
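For example, if a measurement had both a field and a tag named `host` (a hypothetical schema), the tag could be selected explicitly:
SELECT host::tag, value::float FROM cpu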
The FieldDimensions method has been updated to also return the data type
of each field it reports. The SeriesKeys function has been removed since
it is no longer needed. SeriesKeys was originally used for the fill
iterator and was later expanded to let auxiliary iterators determine
their channel iterator types. The fill iterator no longer needs it, and
the auxiliary types are better served by FieldDimensions providing that
information.
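A rough sketch of the resulting interface shape (the names and exact signature here are illustrative, not the actual influxql API):
    // DataType identifies the value type of a field.
    type DataType int

    const (
        Unknown DataType = iota
        Float
        Integer
        String
        Boolean
    )

    // IteratorCreator reports each field's data type along with the
    // dimensions, which is enough for auxiliary iterators to choose their
    // channel iterator types without a separate SeriesKeys call.
    type IteratorCreator interface {
        FieldDimensions(sources []string) (fields map[string]DataType, dimensions map[string]struct{}, err error)
    }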
Fixes #6519.
If a shard is empty for a specific field and the field type is something
other than a float, one of the empty shards would return a nil iterator.
That caused the combined iterators to be cast to the float type and all
other iterator types to be discarded (or, for integers, to be cast).
This is rare since most aggregates don't accept strings or booleans, but
for queries like:
SELECT distinct(string) FROM mydata
nothing would be returned if one of the shards didn't have a value for
`string`.
This change modifies the query engine to return nil for the shards
instead of a fake iterator and then to only use the fake iterator if the
final aggregate iterator is nil (meaning that no iterators could be
constructed for the field from any shard).
Fixes #6495.
This also switches the remaining iterators to lazy initialization so they
can return errors properly. They needed to be converted anyway, and a
side effect of the change is that it is now much easier to propagate the
underlying error during initialization.
Updated the Emitter to return errors when it cannot read properly from
the iterators.
This commit changes the channel iterators to use a double buffer
to reduce allocations. The caller of `Iterator.Next()` must copy
out the point before calling `Next()` again.
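A rough sketch of the double-buffer idea, with a stand-in FloatPoint type in place of the real point types:
    // FloatPoint is a stand-in for the real point type.
    type FloatPoint struct {
        Time  int64
        Value float64
    }

    type floatChanIterator struct {
        ch  chan FloatPoint
        buf [2]FloatPoint // incoming points alternate between these two slots
        i   int
    }

    // Next returns a pointer into one of the two reusable slots instead of
    // allocating a new point on every call. The returned point is only
    // guaranteed valid until the next call to Next, so callers must copy it
    // out if they need to keep it.
    func (itr *floatChanIterator) Next() *FloatPoint {
        p, ok := <-itr.ch
        if !ok {
            return nil
        }
        itr.i = (itr.i + 1) % 2
        itr.buf[itr.i] = p
        return &itr.buf[itr.i]
    }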
This commit makes a number of performance improvements to
reduce allocations during query execution. Several objects
and buffers are now reused across the components to avoid
allocations.
Previously a simple `count(value)` query across 1M points
would require 26,000+ allocations. After the changes in
this commit that number has been reduced to 88.
This commit adds an `IteratorStats` that holds aggregate
iterator processing information. A method is also added to
`Iterator` to return the stats:
Stats() influxql.IteratorStats
The remote iterators will also emit their stats in the point
stream upon first connection, on a given interval, and then
finally once the last point has been sent.
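A sketch of what such a stats value might carry (the exact fields are illustrative):
    // IteratorStats aggregates processing counters for an iterator.
    type IteratorStats struct {
        SeriesN int // number of series processed
        PointN  int // number of points processed
    }

    // Add combines counters from another stats value, e.g. when an iterator
    // merges the stats of its inputs.
    func (s *IteratorStats) Add(other IteratorStats) {
        s.SeriesN += other.SeriesN
        s.PointN += other.PointN
    }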
This commit moves the `tsdb.Store.ExpandSources()` function onto
the `influxql.IteratorCreator` and provides support for issuing
source expansion across a cluster.
Previously, the call iterator normalized the time to the interval for all
calls. This meant that when `first()` or `last()` was called with no
GROUP BY interval, the value was found for each shard and its time was
normalized, and then the query tried to pick the value between the shards
without any time data, since that had already been discarded.
This removes part of the time logic from the call iterators and makes a
new iterator `IntervalIterator` to normalize the times as they come out
of the underlying iterator.
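A simplified illustration of the normalization step (the real IntervalIterator wraps another iterator and rewrites each point's time as it passes through):
    // normalizeTime aligns a point's timestamp to the start of the interval
    // window it falls into, offset from the query start time. All values are
    // Unix nanoseconds, and t is assumed to be at or after startTime.
    func normalizeTime(t, startTime, interval int64) int64 {
        return startTime + ((t-startTime)/interval)*interval
    }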
Fixes #5890.
The limit iterator would short circuit when there were no dimensions and
all points had been read. It also needs to account for multiple sources,
which require reading the entire iterator, so the short circuit now only
applies when there is a single source.
Fixes #5871.
The additional locks shouldn't be necessary due to how the code is used,
but should prevent any potential data races in case we accidentally do
something bad.
The AuxIterator streams points to its underlying iterators. When it
started automatically, race conditions occurred between the stream
closing those iterators and new iterators being created from the
AuxIterator.
Go 1.5 was being used to develop the query engine branch, but we aren't
using 1.5 for master at the moment. This fixes issues that `go vet`
reports under Go 1.4 that don't exist in Go 1.5.
Aux iterators now ask the iterator creator what series will be returned
and determine which aux fields to create based on the results.
The `tsdb.Shards` struct also creates a call iterator around the
iterators returned from each shard.
Fill requires an additional function for IteratorCreator to retrieve the
series that will be returned from the iterator. When fill is required
for an aggregate, the IteratorCreator will be asked what series will be
returned by the created iterator.
When multiple sources are used, emit all points for a certain source
(like cpu) before another source (like mem) regardless of which window
they are in. If the sources are the same, then sort by window.
Continue to ignore tags since we don't need to sort nicely by tags with
a MergeIterator, only SortedMergeIterator.
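A sketch of that ordering with a stand-in point type; the source name is compared first and the window start only breaks ties within the same source:
    // mergePoint is a stand-in for the merge iterator's internal bookkeeping.
    type mergePoint struct {
        Name        string // source measurement, e.g. "cpu" or "mem"
        WindowStart int64  // start time of the window the point falls into
    }

    // less orders points by source first and by window second; tags are
    // intentionally ignored here.
    func less(a, b mergePoint) bool {
        if a.Name != b.Name {
            return a.Name < b.Name
        }
        return a.WindowStart < b.WindowStart
    }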
Out of a list of iterators, an overarching iterator type is chosen and
only iterators of that type are returned for the merge iterator. If a
type can be cast to another type, an extra cast iterator is created to
handle that casting.
The only supported cast is from integers to floats.
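A minimal sketch of that cast, using stand-in point and iterator types rather than the real influxql ones:
    // IntegerPoint and FloatPoint are stand-ins for the real point types.
    type IntegerPoint struct {
        Time  int64
        Value int64
    }

    type FloatPoint struct {
        Time  int64
        Value float64
    }

    type IntegerIterator interface {
        Next() *IntegerPoint
    }

    // integerFloatCastIterator converts each integer point to a float point
    // so that an integer input can participate in a float merge.
    type integerFloatCastIterator struct {
        input IntegerIterator
    }

    func (itr *integerFloatCastIterator) Next() *FloatPoint {
        p := itr.input.Next()
        if p == nil {
            return nil
        }
        return &FloatPoint{Time: p.Time, Value: float64(p.Value)}
    }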