influxdb

Commit Graph

Author	SHA1	Message	Date
Jonathan A. Sternberg	19a61dbb44	Align binary math expression streams by time Also fills in missing values using the fill expression for any binary aggregation.	2016-10-18 13:31:13 -05:00
Michael Desa	966e5503bf	Add fill(linear) to query language Clean up template for fill average Change fill(average) to fill(linear) Update average to linear in influxql spec Add Integer Tests and associated fixes Update CHANGELOG for fill(linear)	2016-10-04 14:27:04 -07:00
Jonathan A. Sternberg	c05c7f6360	Revert "limit shard concurrency" This reverts commit `6c7d56d4bc`.	2016-08-29 12:39:52 -05:00
Jonathan A. Sternberg	10029caf2f	Support negative timestamps in the query engine Negative timestamps are now supported. We also now refuse two nanoseconds that are at the edge of the minimum time window. One of the nanoseconds we do not accept is because we need MinInt64 to be used for some internal comparisons in the TSM engine and it was causing an underflow when we subtracted one from the minimum time. The second is so we can have one minimum time that signifies the default minimum that nobody can write to (so we can implicitly rewrite the timestamp on aggregate queries) but still use the explicit timestamp if it is given to us by the user. We aren't able to tell the difference between if the user provided it or if it was implicit without those values being different. If the default minimum time is used with an aggregate query, we rewrite the time to be the epoch for backwards compatibility since we believe that's more important than supporting that extra nanosecond.	2016-08-25 12:52:41 -05:00
Ben Johnson	6c7d56d4bc	limit shard concurrency This commit limits queries to only process one shard at a time. However, within a shard, multiple series can still be processed in parallel. Shard iterators are lazily instantiated during query execution to limit the amount of memory a given query uses.	2016-08-05 09:45:57 -06:00
Jason Wilder	19546faab3	Release cursor/iterator resources aggressively	2016-08-03 00:21:39 -06:00
Jonathan A. Sternberg	3bd51d3537	Fix fill(previous) when used with math operators	2016-06-29 09:54:12 -05:00
Jonathan A. Sternberg	497db2a6d3	Removing dead code from every package except influxql The tsdb package had a substantial amount of dead code related to the old query engine still in there. It is no longer used, so it was removed since it was left unmaintained. There is likely still more code that is the same, but wasn't found as part of this code cleanup. influxql has dead code show up because of the code generation so it is not included in this pruning.	2016-06-20 22:41:07 -05:00
Jonathan A. Sternberg	252cde1e81	Fix golint errors for the influxql package	2016-06-20 08:51:02 -05:00
Jonathan A. Sternberg	71c8e9e567	Refactor ExecuteQuery to take options as a struct This allows us to add additional options to ExecuteQuery without creating parameter bloat. Removing the unused Series structs. Their necessity was removed by a previous commit, but the structs were not removed yet. Add another type of interrupt iterator that monitors the interrupt channel and calls `Close()` on the iterator when the interrupt happens. It will primarily be used for asynchronously closing the ReaderIterator, but it will only close the read side of the connection properly. More work needs to be done to allow closing the write side efficiently.	2016-06-01 12:30:52 -05:00
Jonathan A. Sternberg	23f6a706bb	Support cast syntax for selecting a specific type Casting syntax is done with the PostgreSQL syntax `field1::float` to specify which type should be used when selecting a field. You can also do `field1::field` or `tag1::tag` to specify that a field or tag should be selected. This makes it possible to select a tag when a field key and a tag key conflict with each other in a measurement. It also means it's possible to choose a field with a specific type if multiple shards disagree. If no types are given, the same ordering for how a type is chosen is used to determine which type to return. The FieldDimensions method has been updated to return the data type for the fields that get returned. The SeriesKeys function has also been removed since it is no longer needed. SeriesKeys was originally used for the fill iterator, but then expanded to be used by auxiliary iterators for determining the channel iterator types. The fill iterator doesn't need it anymore and the auxiliary types are better served by FieldDimensions implementing that functionality, so SeriesKeys is no longer needed. Fixes #6519.	2016-05-16 12:08:29 -04:00
Ben Johnson	078e561820	parallelize iterators	2016-05-09 10:25:30 -06:00
Ben Johnson	49eb3b8d04	optimize show series iterator This commit changes the `SeriesIterator` to process one measurement at a time and uses a `floatFastDedupeIterator` to avoid point encoding during deduplication.	2016-05-03 08:52:44 -06:00
Ben Johnson	1b6524a7bf	reduce interrupt iterator checks The interrupt iterator currently introduces a non-trivial amount of overhead to queries by checking for interrupts every 256 points. This commit adjusts that check to every 5000 points. There are also several places where nested field access has been adjusted to minimize field lookups.	2016-04-26 12:16:07 -06:00
Jonathan A. Sternberg	7ec2a991d5	Modify all of the iterators to allow returning an error on Next() This also switches the remaining iterators to be lazy so they can return errors properly. They needed to be converted to lazy initialization anyway, which has the side effect of making it much easier for us to propagate the underlying error during initialization. Updated the Emitter to return errors when it cannot read properly from the iterators.	2016-04-18 11:17:55 -04:00
Jonathan A. Sternberg	86046bb2d0	Implement derivatives across intervals for aggregate queries For aggregate queries, derivatives will now alter the start time to one interval behind and will use that interval to find the derivative of the first point instead of giving no value for that interval. Null values will still be discarded so if the interval before the one you are querying is null, then it will be discarded like if it were in the middle of the query. You can use `fill(0)` to fill in these values. This does not apply to raw queries yet. Also modified the derivative and difference aggregates to use the stream iterator instead of the reduce slice iterator for space efficiency. Fixes #3247. Contributes to #5943.	2016-04-15 18:16:08 -04:00
Ben Johnson	4f381d03d7	add double buffer on chan iterator This commit changes the channel iterators to use a double buffer to reduce allocations. The caller of `Iterator.Next()` must copy out the point before calling `Next()` again.	2016-04-14 13:52:13 -06:00
Ben Johnson	525e22c92b	tsm1 query engine alloc reduction This commit makes a number of performance improvements to reduce allocations during query execution. Several objects and buffers are now reused across the components to avoid allocations. Previously a simple `count(value)` query across 1M points would require 26,000+ allocations. After the changes in this commit that number has been reduced to 88.	2016-04-11 14:50:59 -06:00
Jonathan A. Sternberg	fa5a38dcd4	Fixing aggregate queries with no GROUP BY to include the end time Queries with a time constraint but no group by would not include the final point from the underlying iterator. Fixes #6229.	2016-04-07 14:11:28 -04:00
Edd Robinson	dfee15bd19	Scopes influxql Protobuf package to prevent clashes Fixes #6211. In Go-land packages with the same name, e.g., internal, do not clash with each other when they're in different parts of the project. However with protobufs definitions will clash if they share the same package name. This commit renames the influxql protobuf package to `influxql` to avoid a clash with a message definition in another protobuf package called internal. Go package aliases allow us to continue to refer to the internal package as `internal` rather than `influxql`.	2016-04-05 13:36:47 +01:00
Jonathan A. Sternberg	43e3330480	Fix the reader iterator so it doesn't read the first point when creating the iterator	2016-04-01 17:31:28 -04:00
Ben Johnson	b28c4db3d0	mark merge iterator as initialized This commit sets the `MergeIterator.init` flag after initialization. Previously this would generate a new heap on every call to `Next()` which caused some aggregate queries to slow by ~10,000%.	2016-03-31 09:56:23 -06:00
Jonathan A. Sternberg	178a6e2f0a	Merge pull request #6113 from influxdata/js-6112-simple-moving-average Implement simple moving average	2016-03-30 20:57:55 -04:00
Jonathan A. Sternberg	278b0950a7	Perform lazy initialization of the heap for the MergeIterator The MergeIterator creation function would call `peek()` on the iterator to initialize the heap. Since this function can sometimes take a long time (such as a huge aggregate query on a shard), the `influxql.Select()` wouldn't return until the query had already been completed. The `influxql.Select()` call should be just the creation of the iterators and shouldn't calculate anything. This is important for future features like the point limiter that have to be initialized after the `influxql.Select()` call.	2016-03-30 16:08:55 -04:00
Jonathan A. Sternberg	6453dbc249	Implement simple moving average The simple moving average will gradually emit points instead of waiting until the end. This should apply to derivative and difference in the future too. Fixes #6112.	2016-03-29 14:36:43 -04:00
Jonathan A. Sternberg	114e734ee5	Fix a bad merge that removed ExpandSources from AuxIterators Regenerated the protobuf file for influxql to use a newer protobuf.	2016-03-22 16:36:22 -04:00
Ben Johnson	6e1c1da25b	reduce allocations in query execution This commit removes some heap objects by converting them from pointer references to non-pointers or by reusing buffers.	2016-03-22 09:51:39 -06:00
Ben Johnson	d58c6608fe	add InterruptIterator.Stats()	2016-03-21 16:38:18 -06:00
Ben Johnson	7156c1f9bd	add IteratorStats This commit adds an `IteratorStats` that holds aggregate iterator processing information. A method is also added to `Iterator` to return the stats: Stats() influxql.IteratorStats The remote iterators will also emit their stats in the point stream upon first connection, on a given interval, and then finally once the last point has been sent.	2016-03-21 16:25:19 -06:00
Jonathan A. Sternberg	6655ca7769	Create a new interrupt iterator that will stop emitting points after an interrupt Use of the iterator is spread out into both `IteratorCreators` and inside of the iterators themselves. Part of the interrupt must be handled inside of the engine so it stops trying to emit points when an interrupt is found and another part of the interrupt has to happen when combining the iterators so it doesn't just start reading the next shard.	2016-03-21 12:07:07 -04:00
Cory LaNou	ba6a95e9bc	Merge pull request #5994 from influxdata/single-server-lite Single Server	2016-03-14 16:11:37 -05:00
Jonathan A. Sternberg	94916082c9	Make binary expressions with either point being nil return a nil point This also fixes integer to float and float/integer to boolean binary expressions to correctly work with nil points at all. Related to #5973.	2016-03-14 13:27:59 -04:00
Ben Johnson	e96185f993	add support for remote expansion of regex This commit moves the `tsdb.Store.ExpandSources()` function onto the `influxql.IteratorCreator` and provides support for issuing source expansion across a cluster.	2016-03-14 16:55:53 +00:00
Jonathan A. Sternberg	3f68bd12ee	Merge pull request #5979 from influxdata/js-5974-aux-iterator-close-panic Fix aux iterators to respect early closing	2016-03-14 12:03:50 -04:00
Jonathan A. Sternberg	0042866002	Teach the AuxIterator how to background Now the AuxIterator will know when it is backgrounded so that it can stop reading from the primary iterator when all of the child iterators have been closed.	2016-03-14 11:12:02 -04:00
Jonathan A. Sternberg	74d51e3842	Support nil values in binary math expressions with two iterators Related to #5959 and #5973.	2016-03-11 15:57:35 -05:00
Ben Johnson	beda072426	add support for remote expansion of regex This commit moves the `tsdb.Store.ExpandSources()` function onto the `influxql.IteratorCreator` and provides support for issuing source expansion across a cluster.	2016-03-11 12:40:07 -07:00
Jonathan A. Sternberg	09a9b3c53e	Fix aux iterators to respect early closing The primary input iterator for an aux iterator would continue trying to send points to a closed channel even after an aux iterator had already been closed. This changes the aux iterators to use sync.Cond instead of channels and lower level syncing primitives for handling buffered input/output. Fixes #5974.	2016-03-11 12:07:32 -05:00
Jonathan A. Sternberg	9c5bc8ab2b	Refactor reduce slice func to use the aggregator and emitter	2016-03-07 13:25:45 -05:00
Jonathan A. Sternberg	9113839e4c	Fix sorting of `first()` and `last()` calls across shards Previously the call iterator would normalize the time to the interval for all calls. This meant that when `first()` or `last()` was called with no group by interval the value would be found for each shard, the time was normalized, then it tried to find the value between the shards (but no longer with any time data as that had already been eliminated). This removes part of the time logic from the call iterators and makes a new iterator `IntervalIterator` to normalize the times as they come out of the underlying iterator. Fixes #5890.	2016-03-03 21:15:43 -05:00
Jonathan A. Sternberg	e3660fae93	Support all iterator types for count(), first(), and last() All three of these iterators are supposed to support all four types of iterators, but the implementation was never done for string or boolean. Fixes #5886.	2016-03-02 23:49:55 -05:00
Jonathan A. Sternberg	2440568b27	Merge pull request #5875 from influxdata/js-5852-mean-function-accuracy Improve mean accuracy while retaining the speedup with a custom iterator	2016-03-02 17:09:58 -05:00
Jonathan A. Sternberg	1c543b28a9	Refactored call iterators to make them public and more usable as a library This refactor is primarily to support Kapacitor. Kapacitor doesn't care about the iterators and mostly keeps the points it handles in memory. The iterator interface is more than Kapacitor cares about. This commit refactors and opens up the internals of aggregating and reducing incoming points so it can be used by an outside library with the same code. It also makes the iterators used by the call iterators publicly usable with new functionality. Reducers are split into two methods which are separate interfaces that can be combined for dealing with casting between different types. The Aggregator interfaces accept points into the aggregator and retain any internal state they need. The Emitter interface will then create a point from that aggregated state which can be fed to the iterator. The Emitters do not fill in the name or tag of the point as that is expected to be done by the person aggregating the point. While the Emitters do sometimes fill in the time, that value will also be overwritten by the iterator. Filling in the time is to allow a future version that will allow returning the point time instead of just the interval time.	2016-03-02 16:10:49 -05:00
Jonathan A. Sternberg	d11bc6182c	Improve mean accuracy while retaining the speedup with a custom iterator Fixes #5852.	2016-03-02 14:48:11 -05:00
Jonathan A. Sternberg	87fc143732	Fix limit iterator with multiple sources The limit iterator would short circuit if there were no dimensions and all points had been read. It also needs to consider that multiple sources will require reading the entire iterator too, so the short circuit requires only a single source. Fixes #5871.	2016-03-01 21:44:45 -05:00
Ben Johnson	0dda9f6608	add remote execution This commit adds remote execution to the query engine.	2016-02-25 08:41:20 -07:00
Jonathan A. Sternberg	18c7c554ba	Optimize the mean() call by moving the calculation into the shard iterator A new attribute has been added to points to track how many points were used to calculate that point. This is particularly useful for finding the mean as we can then split mean calculation into two phases: one at the shard level and a second at the shards level. This optimization is now used so we don't have to hold so many points in memory while calculating the mean.	2016-02-16 10:32:34 -05:00
Jonathan A. Sternberg	73ee204fbc	Cast number fill values for the fill iterator Querying an integer field with a fill value will cause a cast error because the underlying type is a float64 rather than an int64. Add a function that will coerce the value to the correct type. It may be more appropriate in the future to have the fill iterator read the underlying iterator and cast to the appropriate type rather than coerce the fill value to the correct type, but this solution works for our current scenario well.	2016-02-11 15:21:50 -05:00
Jonathan A. Sternberg	98810a363a	Correct the AuxIterator test and adding some additional locks The additional locks shouldn't be necessary due to how the code is used, but should prevent any potential data races in case we accidentally do something bad.	2016-02-10 09:40:31 -07:00
Jonathan A. Sternberg	ed151598ff	Modify the AuxIterator to include a Start method The AuxIterator streams points to the underlying iterators. When it started automatically, race conditions occurred between the stream closing the iterators and creating iterators from the AuxIterator.	2016-02-10 09:40:30 -07:00
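
Commit 18c7c554ba describes splitting the `mean()` calculation into two phases, with each point tracking how many raw points produced it. The following Go sketch illustrates that idea only. The names `partialMean`, `shardMean`, and `combineMeans` are hypothetical and are not taken from the influxdb source; the real iterators are structured differently.

```go
package main

import "fmt"

// partialMean is a hypothetical stand-in for a point that carries an
// aggregated value plus the number of raw points used to calculate it.
type partialMean struct {
	Value float64 // mean of the raw points seen on one shard
	Count int     // how many raw points produced Value
}

// shardMean is the first phase: reduce one shard's values to a single
// partial result so the raw points do not need to be kept in memory.
func shardMean(values []float64) partialMean {
	if len(values) == 0 {
		return partialMean{}
	}
	var sum float64
	for _, v := range values {
		sum += v
	}
	return partialMean{Value: sum / float64(len(values)), Count: len(values)}
}

// combineMeans is the second phase: a count-weighted mean of the partial
// results. It reports false when no raw points were seen at all.
func combineMeans(parts []partialMean) (float64, bool) {
	var sum float64
	var n int
	for _, p := range parts {
		sum += p.Value * float64(p.Count)
		n += p.Count
	}
	if n == 0 {
		return 0, false
	}
	return sum / float64(n), true
}

func main() {
	a := shardMean([]float64{1, 2, 3}) // three points on one shard
	b := shardMean([]float64{10})      // one point on another shard
	m, ok := combineMeans([]partialMean{a, b})
	fmt.Println(m, ok)
}
```

Weighting by the count matters here: averaging the two shard means directly would give 6, while the mean of all four raw points is 4.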


69 Commits (74c6a0c1c590a102705442fcb49f372c85f1beb1)
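
Commit 1b6524a7bf reduces the overhead of the interrupt iterator by checking for an interrupt every 5000 points instead of every 256. A minimal Go sketch of that pattern follows. The types `sliceIterator` and `interruptIterator` and the constant `checkEvery` are hypothetical illustrations, not the influxdb implementation; only the 5000-point interval comes from the commit message.

```go
package main

import "fmt"

// checkEvery is the number of points emitted between interrupt checks.
// The value 5000 is the interval named in the commit message.
const checkEvery = 5000

// sliceIterator is a hypothetical input that yields values from a slice.
type sliceIterator struct {
	values []float64
	pos    int
}

// Next returns the next value, or false once the slice is exhausted.
func (it *sliceIterator) Next() (float64, bool) {
	if it.pos >= len(it.values) {
		return 0, false
	}
	v := it.values[it.pos]
	it.pos++
	return v, true
}

// interruptIterator wraps an input and stops emitting points once the
// closing channel has been closed. It polls the channel only once per
// checkEvery points to keep the per-point cost low.
type interruptIterator struct {
	input   *sliceIterator
	closing <-chan struct{}
	count   int
	done    bool
}

// Next forwards points from the input until an interrupt is observed.
func (it *interruptIterator) Next() (float64, bool) {
	if it.done {
		return 0, false
	}
	if it.count == checkEvery {
		it.count = 0
		select {
		case <-it.closing:
			it.done = true
			return 0, false
		default:
			// No interrupt yet; keep reading.
		}
	}
	it.count++
	return it.input.Next()
}

func main() {
	closing := make(chan struct{})
	close(closing) // simulate an interrupt raised before reading starts

	it := &interruptIterator{
		input:   &sliceIterator{values: make([]float64, 12000)},
		closing: closing,
	}
	n := 0
	for {
		if _, ok := it.Next(); !ok {
			break
		}
		n++
	}
	fmt.Println("points read before stopping:", n)
}
```

The trade-off in this sketch is latency against throughput: a larger interval means fewer channel polls per point, but an interrupt can go unnoticed for up to `checkEvery` points.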