Commit Graph

40 Commits (3d4d9062a0925314e4e9445de49ae1b132940d05)

Author SHA1 Message Date
Jonathan A. Sternberg d7c8c7ca4f Support subquery execution in the query language
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.

Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).

The syntax is like this:

    SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)

This will execute derivative and then sum the result of those derivatives.
Another example:

    SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)

This would let you find the maximum minimum value of each host.

There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:

    SELECT mean(value) FROM cpu
    SELECT mean(value) FROM (SELECT value FROM cpu)

Have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially when it comes to
clustered setups. The second will process the mean at the top level and will not
include that optimization.
2017-01-07 13:00:48 -06:00
Jonathan A. Sternberg bb060a60c6 Fix regex binary encoding for a measurement
Previously, it encoded the text representation of the regex literal
which included the surrounding slashes used in the query language. The
binary encoding should only include the exact string used to create the
regular expression.
2016-07-05 11:39:41 -05:00
Jonathan A. Sternberg 71c8e9e567 Refactor ExecuteQuery to take options as a struct
This allows us to add additional options to ExecuteQuery without
creating parameter bloat.

Removing the unused Series structs. Their necessity was removed by a
previous commit, but the structs were not removed yet.

Add another type of interrupt iterator that monitors the interrupt
channel and calls `Close()` on the iterator when the interrupt happens.
It will primarily be used for asynchronously closing the ReaderIterator,
but it will only close the read side of the connection properly. More
work needs to be done to allow closing the write side efficiently.
2016-06-01 12:30:52 -05:00
Jonathan A. Sternberg 23f6a706bb Support cast syntax for selecting a specific type
Casting syntax is done with the PostgreSQL syntax `field1::float` to
specify which type should be used when selecting a field. You can also
do `field1::field` or `tag1::tag` to specify that a field or tag should
be selected.

This makes it possible to select a tag when a field key and a tag key
conflict with each other in a measurement. It also means it's possible
to choose a field with a specific type if multiple shards disagree. If
no types are given, the same ordering for how a type is chosen is used
to determine which type to return.

The FieldDimensions method has been updated to return the data type for
the fields that get returned. The SeriesKeys function has also been
removed since it is no longer needed. SeriesKeys was originally used for
the fill iterator, but then expanded to be used by auxiliary iterators
for determining the channel iterator types. The fill iterator doesn't
need it anymore and the auxiliary types are better served by
FieldDimensions implementing that functionality, so SeriesKeys is no
longer needed.

Fixes #6519.
2016-05-16 12:08:29 -04:00
Jonathan A. Sternberg d6d0addcec Fix aggregate returns when data is missing from some shards
If a shard is empty for a specific field and the field type is something
other than a float, a nil iterator would get returned from one of the
empty shards and cause the combined iterators to be cast to the float
type and all other iterator types to be discarded (or for integers, to
be cast).

This is rare since most aggregates don't accept strings or booleans, but
for queries like:

    SELECT distinct(string) FROM mydata

It would result in nothing getting returned if one of the shards didn't
have a value for `string`.

This change modifies the query engine to return nil for the shards
instead of a fake iterator and then to only use the fake iterator if the
final aggregate iterator is nil (meaning that no iterators could be
constructed for the field from any shard).

Fixes #6495.
2016-05-03 10:41:22 -04:00
Jonathan A. Sternberg c8c38e15cd Merge pull request #6386 from influxdata/js-iterator-next-error
Modify all of the iterators to allow returning an error on Next()
2016-04-20 10:39:53 -04:00
Nathaniel Cook 465f5a375f add elapsed function 2016-04-19 12:54:54 -06:00
Jonathan A. Sternberg 7ec2a991d5 Modify all of the iterators to allow returning an error on Next()
This also switches the remaining iterators to be lazy so they can return
errors properly. They needed to be converted to lazy initialization
anyway, which has the side effect of making it much easier for us to
propagate the underlying error during initialization.

Updated the Emitter to return errors when it cannot read properly from
the iterators.
2016-04-18 11:17:55 -04:00
Ben Johnson 4f381d03d7
add double buffer on chan iterator
This commit changes the channel iterators to use a double buffer
to reduce allocations. The caller of `Iterator.Next()` must copy
out the point before calling `Next()` again.
2016-04-14 13:52:13 -06:00
Ben Johnson 525e22c92b
tsm1 query engine alloc reduction
This commit makes a number of performance improvements to
reduce allocations during query execution. Several objects
and buffers are now reused across the components to avoid
allocations.

Previously a simple `count(value)` query across 1M points
would require 26,000+ allocations. After the changes in
this commit that number has been reduced to 88.
2016-04-11 14:50:59 -06:00
Jonathan A. Sternberg fa5a38dcd4 Fixing aggregate queries with no GROUP BY to include the end time
Queries with a time constraint but no group by would not include the
final point from the underlying iterator.

Fixes #6229.
2016-04-07 14:11:28 -04:00
Jonathan A. Sternberg 43e3330480 Fix the reader iterator so it doesn't read the first point when creating the iterator 2016-04-01 17:31:28 -04:00
Ben Johnson 7156c1f9bd add IteratorStats
This commit adds an `IteratorStats` that holds aggregate
iterator processing information. A method is also added to
`Iterator` to return the stats:

	Stats() influxql.IteratorStats

The remote iterators will also emit their stats in the point
stream upon first connection, on a given interval, and then
finally once the last point has been sent.
2016-03-21 16:25:19 -06:00
Cory LaNou ba6a95e9bc Merge pull request #5994 from influxdata/single-server-lite
Single Server
2016-03-14 16:11:37 -05:00
Jonathan A. Sternberg 74d51e3842 Support nil values in binary math expressions with two iterators
Related to #5959 and #5973.
2016-03-11 15:57:35 -05:00
Ben Johnson beda072426 add support for remote expansion of regex
This commit moves the `tsdb.Store.ExpandSources()` function onto
the `influxql.IteratorCreator` and provides support for issuing
source expansion across a cluster.
2016-03-11 12:40:07 -07:00
Jonathan A. Sternberg 9113839e4c Fix sorting of `first()` and `last()` calls across shards
Previously the call iterator would normalize the time to the interval
for all calls. This meant that when `first()` or `last()` was called
with no group by interval the value would be found for each shard, the
time was normalized, then it tried to find the value between the shards
(but no longer with any time data as that had already been eliminated).

This removes part of the time logic from the call iterators and makes a
new iterator `IntervalIterator` to normalize the times as they come out
of the underlying iterator.

Fixes #5890.
2016-03-03 21:15:43 -05:00
Jonathan A. Sternberg 87fc143732 Fix limit iterator with multiple sources
The limit iterator would short circuit if there were no dimensions and
all points had been read. It also needs to consider that multiple
sources will require reading the entire iterator too, so the short
circuit requires only a single source.

Fixes #5871.
2016-03-01 21:44:45 -05:00
Ben Johnson 16eea8eecc add SeriesList marshaling 2016-02-25 15:38:16 -07:00
Ben Johnson 0dda9f6608 add remote execution
This commit adds remote execution to the query engine.
2016-02-25 08:41:20 -07:00
Ben Johnson 5a0d1ab7c1 rename influxdb/influxdb to influxdata/influxdb
This commit changes all the import and URL references from:

    github.com/influxdb/influxdb

to:

    github.com/influxdata/influxdb
2016-02-10 10:26:18 -07:00
Jonathan A. Sternberg 98810a363a Correct the AuxIterator test and adding some additional locks
The additional locks shouldn't be necessary due to how the code is used,
but should prevent any potential data races in case we accidentally do
something bad.
2016-02-10 09:40:31 -07:00
Jonathan A. Sternberg ed151598ff Modify the AuxIterator to include a Start method
The AuxIterator streams points to the underlying iterators. When it
started automatically, race conditions occurred between the stream
closing the iterators and creating iterators from the AuxIterator.
2016-02-10 09:40:30 -07:00
Ben Johnson b3d5aa82b7 improve performance of LIMIT with no dimensions 2016-02-10 09:40:30 -07:00
Jonathan A. Sternberg 2e7cf5328c Fix go vet issues on 1.4
go 1.5 was being used to develop the query engine branch, but we aren't
using 1.5 for master at the moment. This fixes issues that go vet brings
up in 1.4 that don't exist in 1.5.
2016-02-10 09:40:30 -07:00
Jonathan A. Sternberg 5b756e0fbe Improve test coverage in influxql package 2016-02-10 09:40:29 -07:00
Jonathan A. Sternberg d1f7c445e7 Modify iterators to work across shards
Aux iterators now ask the iterator creator what series will be returned
and determine which aux fields to create based on the results.

The `tsdb.Shards` struct also creates a call iterator around the
iterators returned from each shard.
2016-02-10 09:40:29 -07:00
Jonathan A. Sternberg c2d1206177 Implement the fill iterator
Fill requires an additional function for IteratorCreator to retrieve the
series that will be returned from the iterator. When fill is required
for an aggregate, the IteratorCreator will be asked what series will be
returned by the created iterator.
2016-02-10 09:40:29 -07:00
Jonathan A. Sternberg 21d2a4c3de Sort MergeIterator by tags after name and before the window 2016-02-10 09:40:28 -07:00
Jonathan A. Sternberg d8337acf90 Emit all points of a certain name for MergeIterators
When multiple sources are used, emit all points for a certain source
(like cpu) before another source (like mem) regardless of which window
they are in. If the sources are the same, then sort by window.

Continue to ignore tags since we don't need to sort nicely by tags with
a MergeIterator, only SortedMergeIterator.
2016-02-10 09:40:28 -07:00
Jonathan A. Sternberg 5605bbb22e Implement casting support for different iterator types
Out of a list of iterators, an overarching iterator type is chosen and
only iterators of that type are returned for the merge iterator. If a
type can be cast to another type, an extra cast iterator is created to
handle that casting.

The only supported cast is from integers to floats.
2016-02-10 09:40:28 -07:00
Jonathan A. Sternberg f7a3918e40 Basic testing for binary expressions
Use Iterators().ReadAll() in select unit tests.
2016-02-10 09:40:27 -07:00
Jonathan A. Sternberg 67c1042435 More work 2016-02-10 09:40:26 -07:00
Jonathan A. Sternberg 0e1910cb92 More work on improving the iterator unit tests 2016-02-10 09:40:26 -07:00
Jonathan A. Sternberg 3dd6aa17f3 Test the merge iterator for every type instead of just floats 2016-02-10 09:40:26 -07:00
Jonathan A. Sternberg fa79aae584 Expanding test coverage for the influxql/iterator.go
SortedMergeIterator is now tested and all of the IteratorOptions methods
are now tested explicitly for functionality.
2016-02-10 09:40:26 -07:00
Ben Johnson b8918a780c integer support 2016-02-10 09:40:25 -07:00
Ben Johnson 00806de9b8 refactor query engine 2016-02-10 09:40:25 -07:00
Ben Johnson 60b051ee88 LIMIT/OFFSET 2016-02-10 09:40:24 -07:00
Ben Johnson cde973f409 refactor query engine 2016-02-10 09:40:24 -07:00