Scanner objects and iterators often need a ValuerEval. This
object is created, often with a function call, and has at
least one interface in it, so it allocates storage. Then it's
dropped again right away. The only part of it that might be
subject to change is usually a map. While the map's contents
change over time, the actual map doesn't change for the
lifetime of the object.
So, in both iterators and scanners, stash the ValuerEval
and continue reusing it. On a query returning a fair number
of data points, this produces a small (<5% in practice)
improvement in observed performance, visible as a significant
reduction in time spent in runtime (mallocgc, newobject,
etcetera).
The performance improvement isn't big, but it's reasonably
easy to evaluate it and establish that it's a safe change
to make.
Signed-off-by: seebs <seebs@seebs.net>
This also fixes the cursor system to abandon iterators that will not
produce meaningful results since the variables are all unknown types.
This creates a weird behavior that existed in previous releases and we
are keeping here for backwards compatibility. If a subquery referenced a
field that didn't exist in the subquery, it will return nothing. But, if
there are two subqueries and one of them has the field exist and the
other doesn't, the second will return all null values.
This change makes it so that we simplify the math engine so it doesn't
use a complicated set of nested iterators. That way, we have to change
math in one fewer place.
It also greatly simplifies the query engine as now we can create the
necessary iterators, join them by time, name, and tags, and then use the
cursor interface to read them and use eval to compute the result. It
makes it so the auxiliary iterators and all of their complexity can be
removed.
This also makes use of the new eval functionality that was recently
added to the influxql package.
No math functions have been added, but the scaffolding has been included
so things like trigonometry functions are just a single commit away.
This also introduces a small breaking change. Because of the call
optimization, it is now possible to use the same selector multiple times
as a selector. So if you do this:
SELECT max(value) * 2, max(value) / 2 FROM cpu
This will now return the timestamp of the max value rather than zero
since this query is considered to have only a single selector rather
than multiple separate selectors. If any aspect of the selector is
different, such as different selector functions or different arguments,
it will consider the selectors to be aggregates like the old behavior.
The Cursor returned will be capable of scanning rows into a structure.
It replaces part of the function for why the Emitter existed. The
Emitter would both join the resulting rows and then transform the values
into a models.Row so it could be returned to the results.
In the future, we will be able to use the Cursor directly to write out
values which should be more memory efficient.