104 Commits (7708e12bd75456e93a2d2985b7cb7a3b09700472)

Author SHA1 Message Date
Jonathan A. Sternberg 0a7f379768
Fixing the stream iterator to not ignore the error
The stream iterator would ignore an error that happened when reading
points. The ignored error could then lead it to invoke `Next()` on an
iterator in an invalid state, and that iterator would return a point
that it wasn't supposed to.

Also added some defensive coding to that same call to prevent a nil map
from being assigned to in the event of an invalid iterator returning
junk data.
2018-10-10 16:26:55 -05:00
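A minimal sketch of the pattern described in this commit, not the actual change: the read error is returned instead of ignored, and the map is created before it is assigned to. The `source`, `point`, and `collect` names are hypothetical and exist only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// point is a hypothetical stand-in for a point read from storage.
type point struct {
	Name  string
	Value float64
}

// source is a hypothetical iterator that fails at position failAt
// (use -1 for a source that never fails).
type source struct {
	points []point
	failAt int
	pos    int
}

// Next returns the next point, nil at the end, or an error.
func (s *source) Next() (*point, error) {
	if s.pos == s.failAt {
		return nil, errors.New("read failed")
	}
	if s.pos >= len(s.points) {
		return nil, nil
	}
	p := &s.points[s.pos]
	s.pos++
	return p, nil
}

// collect drains the source. It stops at the first error instead of
// calling Next() again on an iterator that is now in an invalid state.
func collect(s *source) (map[string]float64, error) {
	var out map[string]float64
	for {
		p, err := s.Next()
		if err != nil {
			return out, err // propagate the error rather than ignore it
		}
		if p == nil {
			return out, nil
		}
		if out == nil {
			// Defensive: never assign into a nil map.
			out = make(map[string]float64)
		}
		out[p.Name] = p.Value
	}
}

func main() {
	s := &source{points: []point{{"a", 1}, {"b", 2}}, failAt: 1}
	out, err := collect(s)
	fmt.Println(len(out), err)
}
```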
Jonathan A. Sternberg 22fc9f6a19
Strip tags from a subquery when the outer query does not group by that tag
The following would, erroneously, not strip the tag from the inner
query:

    SELECT value FROM (SELECT value FROM cpu GROUP BY host)

The inner query was supposed to group by the host tag, but the outer
query should strip it away since it is not being grouped by anymore.
This fixes things so that the result will have the tags stripped away
when they are not requested in the grouping.
2018-10-04 10:05:46 -05:00
Jonathan A. Sternberg 634471f12e
Fix subquery functionality when a function references a tag from the subquery
It has previously been allowed for a subquery to use a tag within a
function (such as `count()`) when the tag is from a subquery and the
subquery itself references a field at some point to perform the join.

This functionality regressed in 1.6 because of a change in how
subqueries were executed that forgot to treat a tag the same as a string
field.

This fixes that regression and adds a test case to avoid hitting that
regression again.
2018-10-04 10:05:20 -05:00
Jonathan A. Sternberg 728bdd6562 Fix the derivative and others time ranges for aggregate data
The derivative function and others similar to it would preload
themselves with data so that the first interval would be the start of
the time range. That meant reading data outside of the time range.

One change to the shard mapper back in v1.4.0 caused the shard mapper to
constrict queries to the intervals given to the shard mapper. This was
correct because the shard mapper can only deal with times it has mapped,
but this broke the functionality of looking back into the past for
the derivative and other functions that used that functionality.

The compiler has been updated with an additional attribute that records
how many intervals in the past will need to be read so that the shard
mapper can include extra times that it may not necessarily read from,
but may be queried because of the above described functionality.
2018-09-10 09:59:33 -05:00
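As a rough illustration of the lookback idea above, the mapped time range can be widened by the number of intervals a function needs to read from the past. This is a sketch under assumed names (`extendedStart`, nanosecond timestamps), not the compiler attribute itself.

```go
package main

import "fmt"

// extendedStart returns the earliest timestamp (in nanoseconds) that has
// to be mapped so that a function needing `lookback` earlier intervals
// can preload its data. The name and signature are hypothetical.
func extendedStart(start, interval int64, lookback int) int64 {
	return start - int64(lookback)*interval
}

func main() {
	const second = int64(1e9)
	start := 100 * second
	// A derivative over 10s intervals needs one interval before the start.
	fmt.Println(extendedStart(start, 10*second, 1) / second) // prints 90
}
```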
Jonathan A. Sternberg c6811f4725 Fix the inherited interval for derivative and others
The compiler was too strict. The inherited interval from an outer query
should not have caused the inner query to fail because inherited
intervals are only implicitly passed to inner queries that support group
by time functionality. Since the inner query with a derivative doesn't
support grouping by time and the inner query itself doesn't specify a
time, the outer query shouldn't have invalidated the inner query.
2018-08-30 13:01:18 -05:00
Jonathan A. Sternberg 531dc49717
Merge pull request #9853 from influxdata/js-fix-log2-function
Fix the log2 function to return values
2018-05-19 11:35:31 -05:00
Jonathan A. Sternberg de87298e38
Merge pull request #9870 from influxdata/js-fix-top-unit-test
Fix the new top/bottom unit tests
2018-05-17 18:00:30 -05:00
Jonathan A. Sternberg a1670613a1 Fix the log2 function to return values
The passed in argument wasn't correct so it always returned null instead
of the appropriate value.

Also includes unit tests for all of the math functions and restricts the
`asin()` and `acos()` functions to floats only since those functions
don't give any meaningful results when using integers or unsigned.
2018-05-17 15:44:58 -05:00
Jonathan A. Sternberg 1cf1a23361
Merge pull request #9860 from phemmer/ta-rename
rename "triple_exponential_average" -> "triple_exponential_derivative"
2018-05-17 14:42:26 -05:00
Jonathan A. Sternberg 04d5c83dcd Fix the new top/bottom unit tests
The new tests accidentally labeled the `p3` variable as the data type
rather than a string.
2018-05-17 13:46:24 -05:00
Jonathan A. Sternberg 8a2bc63d3c Return the correct auxiliary values for top/bottom
When `top()` or `bottom()` were used and selected auxiliary values, they
would return the wrong values that would be equal to the last point
selected. This is because the aggregators saved the memory address of
the auxiliary fields instead of copying them over. Since the same
auxiliary fields memory location is used for every value returned by the
storage engine, this resulted in the values being incorrect because they
were overwritten with incorrect values.

This fixes that so the auxiliary fields are copied out when they are
saved rather than only the memory location.
2018-05-17 10:25:40 -05:00
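The aliasing problem described above can be reproduced in a few lines. This is a simplified sketch with hypothetical names (`point`, `saveAliased`, `saveCopied`), assuming the storage layer reuses one slice for the auxiliary fields of every point.

```go
package main

import "fmt"

// point is a hypothetical stand-in for a point with auxiliary fields.
type point struct {
	Value float64
	Aux   []interface{}
}

// saveAliased keeps the caller's Aux slice, so later writes to the shared
// buffer also change the saved point.
func saveAliased(p point) point {
	return p
}

// saveCopied clones the Aux slice so the saved point owns its own data.
func saveCopied(p point) point {
	aux := make([]interface{}, len(p.Aux))
	copy(aux, p.Aux)
	p.Aux = aux
	return p
}

func main() {
	buf := []interface{}{"server01"} // buffer reused for every point
	p := point{Value: 10, Aux: buf}

	aliased := saveAliased(p)
	copied := saveCopied(p)

	buf[0] = "server02" // the next point overwrites the shared buffer

	fmt.Println(aliased.Aux[0], copied.Aux[0]) // prints "server02 server01"
}
```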
Patrick Hemmer 7dc7efd501 rename "triple_exponential_average" -> "triple_exponential_derivative" 2018-05-16 19:40:12 -04:00
Jonathan A. Sternberg 9d049c4b62 Optimize the spread function to process points iteratively instead of in batch 2018-04-30 11:25:29 -05:00
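Spread is the difference between the maximum and minimum value, so it can be computed one point at a time by tracking only the running extremes. The reducer below is a sketch of that idea with hypothetical names, not the code from this commit.

```go
package main

import "fmt"

// spreadReducer tracks the running minimum and maximum so that no batch
// of points has to be held in memory.
type spreadReducer struct {
	min, max float64
	seen     bool
}

// Aggregate folds one value into the running state.
func (r *spreadReducer) Aggregate(v float64) {
	if !r.seen {
		r.min, r.max, r.seen = v, v, true
		return
	}
	if v < r.min {
		r.min = v
	}
	if v > r.max {
		r.max = v
	}
}

// Emit returns max - min, or false if no values were aggregated.
func (r *spreadReducer) Emit() (float64, bool) {
	if !r.seen {
		return 0, false
	}
	return r.max - r.min, true
}

func main() {
	var r spreadReducer
	for _, v := range []float64{3, 9, 1} {
		r.Aggregate(v)
	}
	spread, ok := r.Emit()
	fmt.Println(spread, ok) // prints "8 true"
}
```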
Jonathan A. Sternberg d9a528ecd4 Include the query task status in the QueryInfo struct
Previously, the task manager was modified to keep the query status so it
could track which queries were running and which ones were killed.
Before that change, we removed a task from the process table as soon as
it was killed rather than after it had finished executing. This meant
there could be zombie goroutines running in the background that were
impossible to see.

When the task manager was updated to track the task status, we forgot to
expose the status in the public interface so consumers could see the
task status.
2018-04-26 08:17:41 -05:00
Jonathan A. Sternberg b326db531c
Merge pull request #9646 from influxdata/js-math-type-mapper-tests
Introduce unit tests for the math type mapper
2018-04-25 12:48:33 -05:00
Jonathan A. Sternberg d42062def2 Add technical analysis algorithms
This adds numerous technical analysis algorithms:

* exponential_moving_average
* double_exponential_moving_average
* triple_exponential_moving_average
* relative_strength_index
* triple_exponential_average
* kaufmans_efficiency_ratio (commonly referred to as just "Efficiency Ratio")
* kaufmans_adaptive_moving_average
* chande_momentum_oscillator (both the common 'smoothed' version, and the ta-lib version)
2018-04-23 22:27:21 -04:00
Jonathan A. Sternberg 58bcc6fdc9 Fix the validation for multiple nested distinct calls 2018-04-23 14:38:02 -05:00
Jacob Marble 232be14aef respect rp parameter in /query 2018-04-19 08:31:43 -07:00
Tom Young 42581c7432 Add new math functions:
- abs
- asin, acos, atan, atan2
- exp, ln, log, log2, log10
- pow, sqrt
2018-04-17 12:56:36 -05:00
ahmah2009 7968e21881 Implement floor, ceil, and round functions 2018-04-04 23:53:55 +03:00
Jonathan A. Sternberg 8aeb0fa0c6 Update explain analyze to output data related to the iterator scanners 2018-04-02 14:49:22 -05:00
Jonathan A. Sternberg dc71a8d82b
Merge pull request #9666 from influxdata/js-simplify-call-valuer
Update the interface for the simplified call valuer
2018-04-02 11:47:41 -05:00
Jonathan A. Sternberg f7bfae4044 Update the interface for the simplified call valuer 2018-03-31 00:21:36 -05:00
Jonathan A. Sternberg 0f304690c5 Enable casting values from a subquery
This also fixes the cursor system to abandon iterators that will not
produce meaningful results since the variables are all unknown types.

This preserves a weird behavior that existed in previous releases and
that we are keeping for backwards compatibility. If a subquery
references a field that doesn't exist in the subquery, it returns
nothing. But if there are two subqueries and the field exists in only
one of them, the other returns all null values.
2018-03-30 16:58:37 -05:00
Jonathan A. Sternberg dd79f06efa
Merge pull request #9641 from influxdata/js-subquery-tests
Add some unit tests to subqueries
2018-03-29 14:15:26 -05:00
Jonathan A. Sternberg a49a8dce6b Remove unused query code
This code was previously used to implement binary expressions and other
transformation iterators. It is no longer needed.
2018-03-28 13:24:45 -05:00
Jonathan A. Sternberg 2fb67dd4be Fix subquery conditions with the cursor refactor 2018-03-28 13:13:46 -05:00
Jonathan A. Sternberg 2292b44ed7 Introduce unit tests for the math type mapper 2018-03-28 09:40:06 -05:00
Jonathan A. Sternberg d4db76508f Add some unit tests to subqueries
This is not complete, but it is a starting point for more thorough tests
of subqueries.

This also reorders the use of `cmp.Diff` so the `want` is first and
`got` is second. This way, the `want` shows up as a minus sign in the
diff rather than, confusingly, as a plus sign.
2018-03-27 14:56:27 -05:00
Jonathan A. Sternberg 4044d41e10
Merge pull request #9637 from influxdata/js-use-null-for-float-nan
Use a null placeholder for NaN results
2018-03-27 12:23:49 -05:00
Jonathan A. Sternberg 5d9f6519ad
Merge pull request #9633 from influxdata/js-9142-group-by-offset-now
Fix regression to allow now() to be used as the group by offset again
2018-03-27 11:10:27 -05:00
Jonathan A. Sternberg 41bc1ab241 Use a null placeholder for NaN results
This ensures that NaN gets serialized as a null value and that it does
not get replaced with the fill value.
2018-03-27 08:44:44 -05:00
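For context on the serialization half of this change: Go's `encoding/json` returns an error when asked to marshal a NaN float, so NaN has to be mapped to something JSON can represent. The sketch below uses nil pointers to produce `null`; it is an assumption about one way to do this and does not show the fill-value handling. The `nullableFloats` and `encode` helpers are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// nullableFloats converts values to pointers, mapping NaN to nil so that
// it is encoded as a JSON null.
func nullableFloats(values []float64) []*float64 {
	out := make([]*float64, len(values))
	for i := range values {
		if !math.IsNaN(values[i]) {
			v := values[i]
			out[i] = &v
		}
	}
	return out
}

// encode marshals the values with NaN replaced by null.
func encode(values []float64) (string, error) {
	b, err := json.Marshal(nullableFloats(values))
	return string(b), err
}

func main() {
	s, err := encode([]float64{1.5, math.NaN(), 3})
	fmt.Println(s, err) // prints "[1.5,null,3] <nil>"
}
```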
Jonathan A. Sternberg 92dd6ea978 Fix regression to allow now() to be used as the group by offset again 2018-03-26 10:52:32 -05:00
Stuart Carnie aa61359cc7 Storage RPC API improvements. See PR for details
* reduce # allocations (115M -> 22M)
* reduce size allocations (53GB -> 1.3GB)
* reduce RPC query time (45s -> 12.9s)
2018-03-21 13:46:09 -07:00
Jonathan A. Sternberg 6e627cfdbf Implement basic trigonometry functions
This adds support for math functions into the query language. Math
functions are special because they are transformations and do not access
the filesystem in the same way aggregate functions do. A transformation
takes one point and always outputs one point making it more similar to
binary expressions so these math functions follow the same rules as
binary expressions.

This also supports using math literals (so you can do `sin(1)`) and the
math functions can be used anywhere such as in a field or an expression.
Both of the following are supported:

    SELECT sin(value) FROM cpu
    SELECT value FROM cpu WHERE sin(value) > 0.5

Arguments are in radians. Degrees are not supported.
2018-03-20 14:13:52 -05:00
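Since the arguments are in radians, a caller holding degrees has to convert before passing a value in. The same convention applies to Go's `math` package, which makes for a small illustration; `degreesToRadians` is a hypothetical helper, not part of the query language.

```go
package main

import (
	"fmt"
	"math"
)

// degreesToRadians converts an angle in degrees to radians.
func degreesToRadians(deg float64) float64 {
	return deg * math.Pi / 180
}

func main() {
	// sin(30 degrees) is 0.5; math.Sin expects radians.
	fmt.Printf("%.3f\n", math.Sin(degreesToRadians(30))) // prints 0.500
}
```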
Jonathan A. Sternberg f8d60a881d Refactor the math engine to compile the query and use eval
This change simplifies the math engine so it doesn't use a complicated
set of nested iterators. That way, we have to change math in one fewer
place.

It also greatly simplifies the query engine as now we can create the
necessary iterators, join them by time, name, and tags, and then use the
cursor interface to read them and use eval to compute the result. It
makes it so the auxiliary iterators and all of their complexity can be
removed.

This also makes use of the new eval functionality that was recently
added to the influxql package.

No math functions have been added, but the scaffolding has been included
so things like trigonometry functions are just a single commit away.

This also introduces a small breaking change. Because of the call
optimization, it is now possible to use the same selector multiple times
as a selector. So if you do this:

    SELECT max(value) * 2, max(value) / 2 FROM cpu

This will now return the timestamp of the max value rather than zero
since this query is considered to have only a single selector rather
than multiple separate selectors. If any aspect of the selector is
different, such as different selector functions or different arguments,
it will consider the selectors to be aggregates like the old behavior.
2018-03-19 15:01:15 -05:00
Jonathan A. Sternberg c8b0c6e166 Update influxql to include the function type evaluators in the query package 2018-03-14 15:42:28 -05:00
Jonathan A. Sternberg df7a660fb3 Modify the Select call to return a Cursor
The Cursor returned will be capable of scanning rows into a structure.
It replaces part of the reason the Emitter existed. The Emitter would
both join the resulting rows and transform the values into a models.Row
so it could be returned to the results.

In the future, we will be able to use the Cursor directly to write out
values which should be more memory efficient.
2018-03-09 12:47:41 -06:00
Jonathan A. Sternberg 21164e1d8c Use the error in the point limit monitor test 2018-03-09 11:09:13 -06:00
Jonathan A. Sternberg 733d842812 Turn the ExecutionContext into a context.Context
Along with modifying ExecutionContext to be a context and have the
TaskManager return the context itself, this also creates a Monitor
interface and exposes the Monitor through the Context. This way, we can
access the monitor from within the query.Select method and keep all of
the limits inside of the query package instead of leaking them into the
statement executor.

An eventual goal is to remove the InterruptCh from the IteratorOptions
and use the Context instead, but for now, we'll just assign the done
channel from the Context to the IteratorOptions so at least they refer
to the same channel.
2018-03-08 14:03:20 -06:00
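A loose sketch of the two ideas in this commit: a monitor reachable through the context, and iterator options that share the context's done channel. Here the monitor is attached with `context.WithValue`, which is an assumption made for brevity and differs from the commit, where ExecutionContext itself becomes the context. All type and function names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// pointLimitMonitor is a hypothetical monitor that enforces a point limit.
type pointLimitMonitor struct{ limit int }

// Exceeded reports whether n points is over the limit.
func (m *pointLimitMonitor) Exceeded(n int) bool { return n > m.limit }

// monitorKey is the private context key for the monitor.
type monitorKey struct{}

// newQueryContext returns a cancellable context carrying a monitor.
func newQueryContext(limit int) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.Background())
	m := &pointLimitMonitor{limit: limit}
	return context.WithValue(ctx, monitorKey{}, m), cancel
}

// monitorFromContext retrieves the monitor, or nil if none is attached.
func monitorFromContext(ctx context.Context) *pointLimitMonitor {
	m, _ := ctx.Value(monitorKey{}).(*pointLimitMonitor)
	return m
}

// iteratorOptions is a hypothetical options struct with an interrupt channel.
type iteratorOptions struct {
	InterruptCh <-chan struct{}
}

// optionsFor makes the options refer to the same channel as the context.
func optionsFor(ctx context.Context) iteratorOptions {
	return iteratorOptions{InterruptCh: ctx.Done()}
}

func main() {
	ctx, cancel := newQueryContext(2)
	opts := optionsFor(ctx)
	fmt.Println(monitorFromContext(ctx).Exceeded(3))

	cancel()
	<-opts.InterruptCh // closed because the context was cancelled
	fmt.Println("interrupted")
}
```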
Jonathan A. Sternberg de4390ae83 Rename some of the structs and interfaces in the query package
Remove the `Query` prefix from some structs and interfaces. They were
there so when the query engine was in the same package as influxql,
these would be differentiated. Now that the package name is query, the
extra prefix seems redundant.
2018-03-02 09:44:12 -06:00
Jonathan A. Sternberg 9e122eb1a4 Fix the implicit time range in a subquery
The implicit time range for an interval is supposed to be now when no
end is specified. In a subquery though, the interval doesn't exist and
so it doesn't set the end time to now, but to the max time. Since the
subquery qualifies as something that should have the implicit end time
apply, this results in a query that runs slowly because it is filling in
a bunch of unasked for intervals if a fill is specified.

This hack adds the implicit end time if it sees the parent query's end
time is set to the maximum available time.

This is a temporary fix for this problem. The query compilation should
perform these time range calculations in the compilation stage and the
subqueries should use the compilation stage during execution instead of
ignoring it. That work takes a lot more effort though and is more prone
to running into unforeseen bugs.

This fix introduces a subtle bug that is likely rare to run into. If the
top level query explicitly specifies the maximum time as the end time
and the subquery has an interval, the subquery should use that end time
rather than now as the end of the time range. With this hack, it will
interpret it as an implicit time rather than an explicit one. This is
unlikely to matter though.
2018-02-27 17:10:10 -06:00
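The hack and its known limitation can be summarized in a few lines. This is a sketch with hypothetical names; `maxTime` is assumed to be the largest int64 purely for illustration and is not necessarily the maximum time the database uses.

```go
package main

import (
	"fmt"
	"math"
)

// maxTime is an assumed sentinel for "no end time specified".
const maxTime = int64(math.MaxInt64)

// subqueryEnd applies the implicit end time: if the parent's end time is
// the maximum time, treat it as unspecified and use now instead. An
// explicit maximum end time is indistinguishable from an implicit one,
// which is the rare bug the commit message describes.
func subqueryEnd(parentEnd, now int64) int64 {
	if parentEnd == maxTime {
		return now
	}
	return parentEnd
}

func main() {
	now := int64(1000)
	fmt.Println(subqueryEnd(maxTime, now)) // prints 1000
	fmt.Println(subqueryEnd(500, now))     // prints 500
}
```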
Jonathan A. Sternberg 426b6ee151 Update sample config with information on logging levels
Update the query executor to log a message using the style guide. The
actual message is static and the query is now in a context field.
2018-02-21 11:25:38 -06:00
Jonathan A. Sternberg ca471f7d0f Fix regression when math between literals is used in a field 2018-02-14 14:34:34 -05:00
Edd Robinson 21f0c6415b Cleanup query package 2018-01-21 12:08:23 -08:00
Edd Robinson de0e9b1a4b Unify approach to short-circuit auth 2018-01-17 14:00:24 +00:00
Jonathan A. Sternberg ecba19eb27 Prevent a panic when a query finishes and is killed at the same time
There is a strange race condition where a query can be killed and finish
at approximately the same time. If this happens, the query gets
retrieved by the killing task, the query gets closed by the normal
processing thread, and then the killing task attempts to kill the query
afterwards. Since the close doesn't mark the query as already killed
(since it's not killed, just merely unused), the killing thread attempts
to close the channel again.

Mark the query as killed whenever it is closed to prevent a double close
from happening. This should never cause the status to be erroneously
reported since the query status is removed from the query table within
the same lock scope.
2018-01-02 11:04:01 -06:00
Edd Robinson f6835632e7 Merge master into branch 2017-12-08 17:11:07 +00:00
Ben Johnson e0df47d54f
Fixing up tests. 2017-12-02 16:52:34 -07:00
Jonathan A. Sternberg db60a83d5a Fix query compilation so multiple nested distinct calls are allowed
When refactoring the query engine, I thought calling
`count(distinct(value))` multiple times was disallowed and so the
refactor made it so that wasn't possible.

It turns out that this pattern is allowed. Since the distinct is nested,
it is aggregated anyway and can be combined with other aggregates.

This removes the erroneously placed restriction.
2017-11-28 11:09:32 -06:00