influxdb

Commit Graph

Author	SHA1	Message	Date
Jonathan A. Sternberg	9edf236cc8	Maintain the tags of points selected by top() or bottom() when writing the results When a `SELECT ... INTO ...` is used with `top()` or `bottom()` used with tags, the points will be written with the tags still intact instead of converted to fields.	2017-05-23 15:00:21 -05:00
Jonathan A. Sternberg	7b9b55bfc0	Optimize top() and bottom() using an incremental aggregator The previous version of `top()` and `bottom()` would gather all of the points to use in a slice, filter them (if necessary), then use a slightly modified heap sort to retrieve the top or bottom values. This performed horrendously from the standpoint of memory. Since it consumed so much memory and spent so much time in allocations (along with sorting a potentially very large slice), this affected speed too. These calls have now been modified so they keep the top or bottom points in a min or max heap. For `top()`, a new point will read the minimum value from the heap. If the new point is greater than the minimum point, it will replace the minimum point and fix the heap with the new value. If the new point is smaller, it discards that point. For `bottom()`, the process is the opposite. It will then sort the final result to ensure the correct ordering of the selected points. When `top()` or `bottom()` contain a tag to select, they have now been modified so this query: SELECT top(value, host, 2) FROM cpu Essentially becomes this query: SELECT top(value, 2), host FROM ( SELECT max(value) FROM cpu GROUP BY host ) This should drastically increase the performance of all `top()` and `bottom()` queries.	2017-05-19 11:56:46 -05:00
Jonathan A. Sternberg	be3bce5212	top() and bottom() now return the time for every point `top()` and `bottom()` will now organize the points by time and also keep each point's original time even when a time grouping is used. At the same time, `top()` and `bottom()` will no longer honor any fill options that are present since they don't really make sense for these specific functions. This also fixes the aggregates and selectors to honor the ordered iterator option so iterators remain ordered and to also respect the buckets that are created by the final dimensions of the query so that two buckets don't overlap each other within the same reducer. A test has been added for this situation. This should clarify and encourage the use of the ordered attribute within the query engine.	2017-04-26 15:07:10 -05:00
zhexuany	232fdae6dd	introduce a new function non_negative_difference	2017-03-31 23:08:36 +08:00
Jonathan A. Sternberg	2ea805c928	Interpolate between different intervals to find the whole area under the curve	2017-03-30 12:51:52 -05:00
Tom Young	cac94a1fc7	Add "integral" function to InfluxQL	2017-03-30 12:07:26 -05:00
Jonathan A. Sternberg	d7c8c7ca4f	Support subquery execution in the query language This adds query syntax support for subqueries and adds support to the query engine to execute queries on subqueries. Subqueries act as a source for another query. It is the equivalent of writing the results of a query to a temporary database, executing a query on that temporary database, and then deleting the database (except this is all performed in-memory). The syntax is like this: SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *) This will execute derivative and then sum the result of those derivatives. Another example: SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host) This would let you find the maximum minimum value of each host. There is complete freedom to mix subqueries with auxiliary fields. The only caveat is that the following two queries: SELECT mean(value) FROM cpu SELECT mean(value) FROM (SELECT value FROM cpu) Have different performance characteristics. The first will calculate `mean(value)` at the shard level and will be faster, especially when it comes to clustered setups. The second will process the mean at the top level and will not include that optimization.	2017-01-07 13:00:48 -06:00
Mark Rushakoff	88b8bd2465	Update godoc for package influxql I did not look at any of the .gen.go files.	2016-12-30 18:02:52 -08:00
Jonathan A. Sternberg	bffc759cf9	Return the time from a percentile call on an integer `percentile()` is supposed to be a selector and return the time of the point, but that only got changed when the input was a float. Updating the integer processor to also return the time of the point rather than the beginning of the interval.	2016-12-01 12:34:48 -06:00
Jonathan A. Sternberg	e885fe5117	Expand string and boolean fields when using a wildcard with sample()	2016-11-15 15:56:47 -06:00
Jonathan A. Sternberg	95859b8ab4	Remove accidentally added string support for the stddev call Strings would always return an empty string and stddev is meaningless when it comes to strings. This removes that functionality so strings don't automatically get picked up when using a wildcard.	2016-10-10 14:58:28 -05:00
Jonathan A. Sternberg	6afc2a77a5	Implement cumulative_sum() function The `cumulative_sum()` function can be used to sum each new point and output the current total. For the following points: cpu value=2 0 cpu value=4 10 cpu value=6 20 This would output the following points: > SELECT cumulative_sum(value) FROM cpu time value ---- ----- 0 2 10 6 20 12 As can be seen, each new point adds to the sum of the previous point and outputs the value with the same timestamp. The function can also be used with an aggregate like `derivative()`. > SELECT cumulative_sum(mean(value)) FROM cpu WHERE time >= now() - 10m GROUP BY time(1m)	2016-10-07 10:11:53 -05:00
Michael Desa	f9b8129770	Add sample function to query language First Pass at implementing sample Add sample iterators for all types Remove size from sample struct Fix off by one error when generating random number Add benchmarks for sample iterator Add test and associated fixes for off by one error Add test for sample function Remove NumericLiteral from sample function call Make clear that the counter is incr w/ each call Rename IsRandom to AllSamplesSeen Add a rng for each reducer that is created The default rng that comes with math/rand has a global lock. To avoid having to worry about any contention on the lock, each reducer now has its own time seeded rng. Add sample function to changelog	2016-10-06 09:41:42 -07:00
Jonathan A. Sternberg	993ac1ca2e	Remove confusing comment and unnecessary continue	2016-08-23 19:43:18 -05:00
Ashish Gaurav	4e17f9bb13	add mode() function & tests	2016-08-23 19:31:41 -05:00
Ashish Gaurav	70c8c021ac	added benchmark tests for median aggregator (Package: influxql,influxql_test)	2016-08-04 08:02:19 +05:30
Nathaniel Cook	ce74fe0b06	count and sum return 0 for empty intervals	2016-06-01 15:53:23 -06:00
Nathaniel Cook	6ed0d94343	Add Holt-Winters forecasting method.	2016-05-19 09:24:56 -06:00
Jonathan A. Sternberg	a05e2b164e	Support booleans for min() and max() Fixes #6494.	2016-04-29 14:56:22 -04:00
Nathaniel Cook	465f5a375f	add elapsed function	2016-04-19 12:54:54 -06:00
Jonathan A. Sternberg	86046bb2d0	Implement derivatives across intervals for aggregate queries For aggregate queries, derivatives will now alter the start time to one interval behind and will use that interval to find the derivative of the first point instead of giving no value for that interval. Null values will still be discarded so if the interval before the one you are querying is null, then it will be discarded like if it were in the middle of the query. You can use `fill(0)` to fill in these values. This does not apply to raw queries yet. Also modified the derivative and difference aggregates to use the stream iterator instead of the reduce slice iterator for space efficiency. Fixes #3247. Contributes to #5943.	2016-04-15 18:16:08 -04:00
Jonathan A. Sternberg	66a599825b	Allow percentile to be used as a selector Fixes #6292.	2016-04-13 13:29:14 -04:00
Jonathan A. Sternberg	50bd78433c	Merge pull request #6291 from influxdata/js-6261-optimize-distinct Optimize the distinct call	2016-04-12 17:09:10 -04:00
Nathaniel Cook	6ae62e9644	update Percentile to preserve Aux fields since it's a selector	2016-04-12 13:34:50 -06:00
Jonathan A. Sternberg	6708d0c439	Optimize the distinct call Change distinct so it uses a custom reducer that keeps internal state instead of requiring all of the points to be kept as a slice in memory. Fixes #6261.	2016-04-11 18:29:50 -04:00
Jonathan A. Sternberg	6453dbc249	Implement simple moving average The simple moving average will gradually emit points instead of waiting until the end. This should apply to derivative and difference in the future too. Fixes #6112.	2016-03-29 14:36:43 -04:00
Jonathan A. Sternberg	a9720f926e	Implement the difference function The difference function is implemented very similar to how derivative is implemented. It is an aggregate function that acts over the entire aggregate. This function will also have the same problems that derivative has with getting values from the previous interval or point. This will be fixed separately as part of #5943. Fixes #1825.	2016-03-29 09:27:12 -04:00
Jonathan A. Sternberg	43a5e84aaf	Merge pull request #6047 from influxdata/js-6040-boolean-distinct Support the distinct() call for booleans	2016-03-17 17:17:21 -04:00
Jonathan A. Sternberg	e47426ff6e	Support integer literals in the query language Numbers in the query without any decimal will now be emitted as integers instead and be parsed as an IntegerLiteral. This ensures we keep the original context that a query was issued with and allows us to act more similar to how programming languages are typically structured when it comes to floats and ints. This adds functionality for dealing with integers promoting to floats in the various places where math is used. Fixes #5744 and #5629.	2016-03-17 10:37:34 -04:00
Jonathan A. Sternberg	2e7816ebd9	Support the distinct() call for booleans Normalize the time for the distinct() call to either be at the beginning of the group by interval or the start time similar to every other call. The timestamp previously just showed the first time found and didn't make a lot of sense in the context of what the function was supposed to do. Fixes #6040.	2016-03-17 09:32:54 -04:00
Nathaniel Cook	4961a4435b	Fix nil comparison for top/bottom	2016-03-07 15:21:22 -07:00
Nathaniel Cook	46fc6e5516	Expose Reduce Functions for Kapacitor	2016-03-07 14:03:14 -07:00
Jonathan A. Sternberg	9c5bc8ab2b	Refactor reduce slice func to use the aggregator and emitter	2016-03-07 13:25:45 -05:00
Jonathan A. Sternberg	8d89a203a2	Fix sorting for distinct by sorting by value when the point time is the same	2016-03-03 19:09:38 -05:00
Jonathan A. Sternberg	e3660fae93	Support all iterator types for count(), first(), and last() All three of these iterators are supposed to support all four types of iterators, but the implementation was never done for string or boolean. Fixes #5886.	2016-03-02 23:49:55 -05:00
Jonathan A. Sternberg	1c543b28a9	Refactored call iterators to make them public and more usable as a library This refactor is primarily to support Kapacitor. Kapacitor doesn't care about the iterators and mostly keeps the points it handles in memory. The iterator interface is more than Kapacitor cares about. This commit refactors and opens up the internals of aggregating and reducing incoming points so it can be used by an outside library with the same code. It also makes the iterators used by the call iterators publicly usable with new functionality. Reducers are split into two methods which are separate interfaces that can be combined for dealing with casting between different types. The Aggregator interfaces accept points into the aggregator and retain any internal state they need. The Emitter interface will then create a point from that aggregated state which can be fed to the iterator. The Emitters do not fill in the name or tag of the point as that is expected to be done by the person aggregating the point. While the Emitters do sometimes fill in the time, that value will also be overwritten by the iterator. Filling in the time is to allow a future version that will allow returning the point time instead of just the interval time.	2016-03-02 16:10:49 -05:00
Jonathan A. Sternberg	d11bc6182c	Improve mean accuracy while retaining the speedup with a custom iterator Fixes #5852.	2016-03-02 14:48:11 -05:00
Jonathan A. Sternberg	7a03df2af1	Remove the non-unreachable panics in the new query engine The only panics left are ones that should be unreachable unless there is a bug. Fixes #5777.	2016-02-22 12:52:43 -05:00
Jonathan A. Sternberg	18c7c554ba	Optimize the mean() call by moving the calculation into the shard iterator A new attribute has been added to points to track how many points were used to calculate that point. This is particularly useful for finding the mean as we can then split mean calculation into two phases: one at the shard level and a second at the shards level. This optimization is now used so we don't have to hold so many points in memory while calculating the mean.	2016-02-16 10:32:34 -05:00
Jonathan A. Sternberg	42b9166000	Support derivative() call for integer fields in the new query engine Fixes #5640.	2016-02-12 11:36:59 -05:00
Sergei Egorov	eef0e41a7e	Optimize ReducePercentile method: do not call len() twice + move sorting after index check	2016-02-11 20:05:34 +02:00
Ben Johnson	607750ab1b	add SHOW MEASUREMENTS iterator	2016-02-10 09:40:28 -07:00
Jonathan A. Sternberg	dbb9b36d84	Support integers with top() and bottom() and fix point ordering top() and bottom() point ordering was incorrect and using an inefficient method of sorting. It has now been updated to use a heap and ordering is being done by value first and time second (with earlier times always taking priority). Removed unit tests that test using `time` inside of the query to get the real time instead of the interval time and only allowing the default behavior. We will have another mechanism to get the real time during an interval, but the current method is deprecated. The top() and bottom() methods now have integer support.	2016-02-10 09:40:27 -07:00
Ben Johnson	a0fe0ca437	fix new query engine test regressions	2016-02-10 09:40:27 -07:00
Jonathan A. Sternberg	76b49b3ab3	Fixed a bug in first() and last() where the time was lost last() would always return the last output of the iterator (which isn't necessarily the last time value due to how the merge iterator works) and first() would always return the first output of the iterator (wrong for the same reason). Now the time is kept by the reduce function and the times are wiped as part of the reduce iterator after the value has been found.	2016-02-10 09:40:26 -07:00
Jonathan A. Sternberg	03ad7a4e40	Move the integerReduceSliceFloatIterator to call_iterator.go It matches more in functionality to the functions in call_iterator.go than iterator.go. iterator.go mostly has base iterators and call_iterator.go has iterators related to functional calls, which is the only time integerReduceSliceFloatIterator is used.	2016-02-10 09:40:26 -07:00
Ben Johnson	b8918a780c	integer support	2016-02-10 09:40:25 -07:00
Jonathan A. Sternberg	97752df03d	Repair regressions in derivatives with group by tests Also fixes the `first()` and `last()` calls to do the same thing as `min()` and `max()` by returning the time corresponding to the start of the interval rather than the point's real time.	2016-02-10 09:40:25 -07:00
Jonathan A. Sternberg	e0ac29dd2d	Implement most of top() and bottom() This does not implement the time selector, but everything else is implemented. Unfortunately, there are no tests for bottom() in the old query engine, so only top() is properly tested.	2016-02-10 09:40:25 -07:00
Ben Johnson	00806de9b8	refactor query engine	2016-02-10 09:40:25 -07:00
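
The incremental approach described in commit 7b9b55bfc0 (keep only the current top points in a min heap, replace the minimum when a larger point arrives, then sort the survivors) can be sketched roughly as follows. This is an illustrative sketch, not the code from that commit: the `point` type, the `minHeap` type, and the `topN` function are hypothetical names, and sorting the final result by time is an assumption based on the later commit be3bce5212.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// point is a hypothetical stand-in for a query point (not the influxql type).
type point struct {
	Time  int64
	Value float64
}

// minHeap keeps the smallest retained value at index 0.
type minHeap []point

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].Value < h[j].Value }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(point)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	p := old[n-1]
	*h = old[:n-1]
	return p
}

// topN returns the n points with the largest values while holding at most
// n points in memory at any time.
func topN(points []point, n int) []point {
	if n <= 0 {
		return nil
	}
	h := make(minHeap, 0, n)
	for _, p := range points {
		if h.Len() < n {
			heap.Push(&h, p)
			continue
		}
		// Compare against the current minimum; replace it only if the
		// new point is larger, otherwise discard the new point.
		if p.Value > h[0].Value {
			h[0] = p
			heap.Fix(&h, 0)
		}
	}
	// Assumption: the final result is ordered by time.
	out := []point(h)
	sort.Slice(out, func(i, j int) bool { return out[i].Time < out[j].Time })
	return out
}

func main() {
	pts := []point{{0, 2}, {10, 9}, {20, 4}, {30, 7}, {40, 1}}
	fmt.Println(topN(pts, 2)) // prints [{10 9} {30 7}]
}
```

A `bottom()` equivalent would flip the comparison in `Less` and in the replacement check, turning the structure into a max heap.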

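
Commit 18c7c554ba describes splitting `mean()` into two phases by having each point track how many raw points it was calculated from. A minimal sketch of that idea is below. The `partial` type and the `shardMean` and `combine` functions are hypothetical names, and carrying a per-shard mean (rather than, say, a sum) is an assumption made for illustration; the commit message does not specify the representation.

```go
package main

import "fmt"

// partial is a hypothetical shard-level result: a mean plus the number of
// raw points that were used to calculate it.
type partial struct {
	Mean  float64
	Count int
}

// shardMean is phase one: reduce one shard's values to a single partial.
func shardMean(values []float64) partial {
	if len(values) == 0 {
		return partial{}
	}
	var sum float64
	for _, v := range values {
		sum += v
	}
	return partial{Mean: sum / float64(len(values)), Count: len(values)}
}

// combine is phase two: merge the per-shard partials, weighting each mean
// by its point count. It reports false when there were no points at all.
func combine(parts []partial) (float64, bool) {
	var sum float64
	var n int
	for _, p := range parts {
		sum += p.Mean * float64(p.Count)
		n += p.Count
	}
	if n == 0 {
		return 0, false
	}
	return sum / float64(n), true
}

func main() {
	a := shardMean([]float64{1, 2, 3}) // mean 2 over 3 points
	b := shardMean([]float64{10})      // mean 10 over 1 point
	m, ok := combine([]partial{a, b})
	fmt.Println(m, ok) // prints 4 true
}
```

Only one small value per shard has to be held at the top level, which is the memory saving the commit message refers to.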

52 Commits (2259ada8c39fa594c7f26a92e4a4a430f0687c61)