When multiple sources are used, emit all points for a certain source
(like cpu) before another source (like mem) regardless of which window
they are in. If the sources are the same, then sort by window.
Continue to ignore tags since we don't need to sort nicely by tags with
a MergeIterator, only SortedMergeIterator.
Previously reduce iterators just separated points by tags. If you had
identical tags but different names, it would group those together so you
could have these two points:
cpu value=1
mem value=2
When you performed a `mean(value)` call and included both cpu and mem as
sources, it would return one mem series with a value of 1.5 instead of
two serieses.
Out of a list of iterators, an overarching iterator type is chosen and
only iterators of that type are returned for the merge iterator. If a
type can be cast to another type, an extra cast iterator is created to
handle that casting.
The only supported cast is from integers to floats.
top() and bottom() point ordering was incorrect and using an inefficient
method of sorting. It has now been updated to use a heap and ordering is
being done by value first and time second (with earlier times always
taking priority).
Removed unit tests that test using `time` inside of the query to get the
real time instead of the interval time and only allowing the default
behavior. We will have another mechanism to get the real time during an
interval, but the current method is deprecated.
The top() and bottom() methods now have integer support.
No boolean operator tests because the parser doesn't currently allow it.
I am currently keeping the code that performs this for now until we
decide whether to allow boolean operators inside of a select clause. It
would be easy to have the new query engine do something like this:
SELECT mean(value) < 10 FROM cpu GROUP BY time(1m)
And then you would have it return true or false for each interval, but
the same can be done through a where clause too.
last() would always return the last output of the iterator (which isn't
necessarily the last time value due to how the merge iterator works) and
first() would always return the first output of the iterator (wrong for
the same reason).
Now the time is kept by the reduce function and the times are wiped as
part of the reduce iterator after the value has been found.
It matches more in functionality to the functions in call_iterator.go
than iterator.go. iterator.go mostly has base iterators and
call_iterator.go has iterators related to functional calls, which is
the only time integerReduceSliceFloatIterator is used.
Also fixes the `first()` and `last()` calls to do the same thing as
`min()` and `max()` by returning the time corresponding to the start of
the interval rather than the point's real time.
This does not implement the time selector, but everything else is
implemented. Unfortunately, there are no tests for bottom() in the old
query engine, so only top() is properly tested.
Previously if you issued a CQ with a resample interval higher than the
query interval, such as the following:
CREATE CONTINUOUS QUERY cq ON db
RESAMPLE EVERY 4m
BEGIN
SELECT mean(value) INTO cpu_mean FROM cpu GROUP BY time(2m)
END
This would result in strange behavior because the FOR value defaulted to
the GROUP BY interval and the minimum time passing before a CQ ran was
also the resample interval, so it wouldn't run the appropriate intervals
even if you set the resample duration to a higher value.
This tweaks the CQ runner to set the minimum interval before a bucket
becomes capable of running to the lower of the query interval or the
resample interval instead of always using the resample interval.
It also sets the default resample duration to be the higher value of the
query interval or the resample interval so the above query gets a
default of 4m instead of 2m and will execute 2 queries every 4 minutes.
If you manually set the resample duration to a lower value than the
resample interval, the old behavior will still happen and should be
considered an error.
This also makes trying to create a continuous query with a resample
duration of below the resample interval or query interval (whichever is
higher) as an error returned by the parser.
Fixes#5286.
* Add dir, hostname, and bind address to top level config since it applies to services other than meta
* Add enabled flags to example toml for data and meta services
* Wire up add/remove raft peers and meta servers to meta service
* Update DROP SERVER to be either DROP META SERVER or DROP DATA SERVER
* Bring over statement executor from old meta package
* Start meta service client implementation
* Update meta service test to use the client
* Wire up node ID/meta server storage information
This makes the following syntax possible:
CREATE CONTINUOUS QUERY mycq ON mydb
RESAMPLE EVERY 1m FOR 1h
BEGIN
SELECT mean(value) INTO cpu_mean FROM cpu GROUP BY time(5m)
END
The RESAMPLE option customizes how often an interval will be sampled and
the duration. The interval is customized with EVERY. Any intervals
within the resampling duration on a multiple of the resample interval
will be updated with the new results from the query.
The duration is customized with FOR. This determines how long an
interval will participate in resampling.
Both options are optional. If RESAMPLE is in the syntax, at least one of
the two needs to be given. The default for both is the interval of the
continuous query.
The service also improves tracking of the last run time and the logic of
when a query for an interval should be run. When determining the oldest
interval to run for a query, the continuous query service determines
what would have been the optimal time to perform the next query based on
the last run time. It then uses this time to determine the oldest
interval that should be run using the resample duration and will
resample all intervals between this time and the current time as opposed
to potentially forgetting about the last run in an interval if the
continuous query service gets delayed for some reason.
This removes the previous config options for customizing continuous
queries since they are no longer relevant and adds a new option of
customizing the run interval. The run interval determines how often the
continuous query service polls for when it should execute a query. This
option defaults to 1s, but can be set to 1m if the least common factor
of all continuous queries' intervals is a higher value (like 1m).