A static initialization is not desirable in the main binaries, as it forces all
paths of code to init, but it is still useful in tests. It allows static
initialization to be performed once for all tests and eliminates the need to
always add the FluxInit call. Added a fluxinit/static package that calls
fluxinit.FluxInit() to replace the builtin package. This hides the nature of
the initialization and makes it clear that it is mandatory initialization code
getting called.
It appears that the double write caused by using to() inside a separate
execution environment (experimental.chain) causes flux e2e tests to behave
unpredictably, when coupled with the 1.x storage engine. Removing the second
write by using two passes, one to write to the db, then another to run the
test, eliminates the flakiness. Verified by running 8 parallel instances of the
e2e tests for 12 hours without observing any flakiness. Before the fix, the
flakiness took approximately 30 minutes on average to appear.
This commit also removes universe/to_time from the skipped tests because it was
added when this flakiness was discovered.
This is a backport of #14262 to the 1.x storage engine. The 1.x storage
engine is now the primary engine for open source, so when we switched we
regressed to the old behavior.
This also fixes `go generate` for the tsm1 package by running `tmpl`
with `go run` instead of assuming the correct one is installed in the
path.
This is required to keep the system resources low when running
the Flux end-to-end tests, which create a bucket for each test. A
bucket creates at least 17 files after the first write:
* 8 for the `_series` segment files
* 8 for the `index` log files
* 1 for the `wal`
Allows specifying a key that must be present in the query response metadata
before LoggingProxyQueryService logs the query. The gateway will use this to
log the query only when the connection to queryd fails.
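A minimal sketch of the gating idea, not the actual LoggingProxyQueryService code. The `Metadata` type, the `shouldLog` helper, and the `connection-failed` key are hypothetical names chosen for illustration:

```go
package main

import "fmt"

// Metadata is a hypothetical stand-in for query response metadata.
type Metadata map[string][]interface{}

// shouldLog reports whether a query should be logged. An empty requiredKey
// means no condition was configured, so every query is logged.
func shouldLog(md Metadata, requiredKey string) bool {
	if requiredKey == "" {
		return true
	}
	_, ok := md[requiredKey]
	return ok
}

func main() {
	// "connection-failed" is an assumed key name, not one taken from the source.
	failed := Metadata{"connection-failed": {true}}
	fmt.Println(shouldLog(failed, "connection-failed"))     // key present
	fmt.Println(shouldLog(Metadata{}, "connection-failed")) // key absent
}
```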
Enables the min and max aggregates for the ReadGroupAggregate pushdown behind a feature flag.
Co-authored-by: Jonathan A. Sternberg <jonathan@influxdata.com>
The `buckets()` command would use a bucket lookup that wrapped the
`FindBuckets` API. It did not use the pagination aspect of this API
correctly. When the underlying implementation was changed to a version
that correctly implemented pagination, this broke the query `buckets()`
command. Since it was the query code that used the API incorrectly rather than a
regression in the `FindBuckets` implementation, this fixes the usage to
correctly use pagination.
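The corrected usage can be sketched as a loop that keeps requesting pages until a short page comes back. This is an illustration only: `findBuckets` below is a hypothetical in-memory stand-in, and its offset/limit signature is an assumption rather than the real `FindBuckets` API.

```go
package main

import "fmt"

// findBuckets is a hypothetical paginated lookup. It returns at most limit
// names starting at offset, and nil once offset is past the end.
func findBuckets(all []string, offset, limit int) []string {
	if offset >= len(all) {
		return nil
	}
	end := offset + limit
	if end > len(all) {
		end = len(all)
	}
	return all[offset:end]
}

// allBuckets pages through findBuckets until a page shorter than pageSize
// signals that there is nothing left. pageSize must be positive.
func allBuckets(all []string, pageSize int) []string {
	var out []string
	for offset := 0; ; offset += pageSize {
		page := findBuckets(all, offset, pageSize)
		out = append(out, page...)
		if len(page) < pageSize {
			return out
		}
	}
}

func main() {
	store := []string{"a", "b", "c", "d", "e"}
	// A single call with limit 2 would only see "a" and "b".
	fmt.Println(len(findBuckets(store, 0, 2)))
	// Paging collects every bucket.
	fmt.Println(len(allBuckets(store, 2)))
}
```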
Force the writing of data and running of the test to happen sequentially. As
the results come out, collect them and report an error only if the diff results
are not empty.
Additional changes:
* fix(query/stdlib): update rewrite rules for schema mutation
The schema mutator was wrapped in a dual implementation spec so the
rewrite rules were type asserting on the wrong type.
This rule reorders group and window so it will switch from using
`ReadGroup` to using `ReadWindowAggregate` when the intent is to
aggregate a grouped window. It will then add a group node that groups by
the given columns and the start and stop columns and then reperform the
aggregate. This is more performant than performing the group first.
Annotate the context with feature flags when handling flux queries in influxdb.
The flux end-to-end tests take advantage of this by using a custom flagger that
can set overrides based on the test case that is about to be run, allowing us
to enable features in the end-to-end tests.
This enables a new rule that will push down the full `aggregateWindow`
query including the `duplicate` and `window(every: inf)` that recombines
the tables. When the full rule is used, the table is not split into
tables for each window and instead retains itself as a single table. The
start or stop column is renamed to `_time` and `_start` and `_stop` will
be the boundaries of the query.
* feat: flags for pushing down new aggregates
* refactor: grouped aggregate rewrite rules
The storage operation ReadGroup aggregates per series on the storage
side. The planner will rewrite grouped aggregate queries to call
ReadGroup, which will perform a partial aggregation, followed by
another operation that will perform the rest of the aggregation on
the compute side.
* feat: storage capabilities for grouped aggregates
* fix: changes from review
* feat: group read operation name should include aggregate
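The two-stage aggregation described for ReadGroup above can be illustrated with count: storage produces one partial count per series, and the compute side sums the partials for the group. This is a simplified sketch with hypothetical function names, not the planner or storage code:

```go
package main

import "fmt"

// partialCounts mimics the storage side: one count per series.
func partialCounts(series [][]float64) []int64 {
	counts := make([]int64, len(series))
	for i, s := range series {
		counts[i] = int64(len(s))
	}
	return counts
}

// finalCount mimics the compute side: the count for the whole group is the
// sum of the per-series partial counts.
func finalCount(partials []int64) int64 {
	var total int64
	for _, c := range partials {
		total += c
	}
	return total
}

func main() {
	// Two series that belong to the same group.
	group := [][]float64{{1, 2, 3}, {4, 5}}
	fmt.Println(finalCount(partialCounts(group)))
}
```

Note that the finishing step depends on the aggregate: partial counts are summed, whereas partial minimums would be combined with another minimum.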
This implements create empty for the window table reader and allows this
table read function to be used when it is specified. It will pass down
the create empty flag from the original window call into the storage
read function.
This also fixes the window table reader so it properly creates
individual tables for each window. Previously, it was constructing one
table for an entire series instead of one table per window.
Tests have been added to verify three edge case behaviors. The first is
the normal read operation where all values are present. The second is
when create empty is specified so null values may be created. The third
is with truncated boundaries to ensure that storage is read from and the
start and stop timestamps get correctly truncated.
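The per-window behavior can be sketched as follows. This is a simplification under stated assumptions: windows are aligned to the query start rather than to calendar boundaries, and `point`, `window`, and `splitWindows` are hypothetical names, not the storage reader's types.

```go
package main

import "fmt"

// point is a single timestamped value in a series.
type point struct {
	Time  int64
	Value float64
}

// window holds the values that fall in [Start, Stop).
type window struct {
	Start, Stop int64
	Values      []float64
}

// splitWindows produces one window per interval of width every within
// [start, stop). Empty windows are kept only when createEmpty is set.
// every must be positive.
func splitWindows(pts []point, start, stop, every int64, createEmpty bool) []window {
	var out []window
	for ws := start; ws < stop; ws += every {
		we := ws + every
		if we > stop {
			we = stop // truncate the last window to the query bounds
		}
		w := window{Start: ws, Stop: we}
		for _, p := range pts {
			if p.Time >= ws && p.Time < we {
				w.Values = append(w.Values, p.Value)
			}
		}
		if len(w.Values) > 0 || createEmpty {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	pts := []point{{Time: 1, Value: 1.0}, {Time: 25, Value: 2.0}}
	fmt.Println(len(splitWindows(pts, 0, 30, 10, false))) // empty window skipped
	fmt.Println(len(splitWindows(pts, 0, 30, 10, true)))  // empty window kept
}
```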
Added a (disabled) planner rule that matches:
ReadGroupPhys -> { count }
It uses the same physical spec node for group to implement the aggregate. The
rule requires:
* the pushDownGroupAggregateCount feature flag enabled
* no existing aggregate present in the ReadGroup
* use of the "_value" column only
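The three preconditions amount to a simple guard. The sketch below uses a hypothetical `readGroupSpec` type and `canPushDownCount` helper; it does not reproduce the planner's actual rule interface:

```go
package main

import "fmt"

// readGroupSpec is a hypothetical stand-in for the physical ReadGroup spec.
// An empty AggregateMethod means no aggregate has been pushed down yet.
type readGroupSpec struct {
	AggregateMethod string
}

// canPushDownCount checks the three preconditions listed above: the feature
// flag is on, no aggregate is already present, and count targets only "_value".
func canPushDownCount(flagEnabled bool, spec readGroupSpec, columns []string) bool {
	if !flagEnabled || spec.AggregateMethod != "" {
		return false
	}
	return len(columns) == 1 && columns[0] == "_value"
}

func main() {
	fmt.Println(canPushDownCount(true, readGroupSpec{}, []string{"_value"}))
	fmt.Println(canPushDownCount(false, readGroupSpec{}, []string{"_value"}))
}
```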
The column reader passed to `flux.Table.Do` is automatically released.
The function passed to `Do` should never release the column reader
manually. Doing so causes a double release, which causes the table to be
erroneously freed when it might be referenced by another transformation.
In particular, this affected the following:
    tables
        |> yield()
        |> to()
This would produce a buffered table with two references
and pass it to both `yield()` and `to()` because `yield()` is a
pseudo-node that doesn't really exist. The real graph looks more like:
    tables |> yield()
    tables |> to()
The `yield()` would double release which would release the `to()`
transformation's copy of the column readers. The `to()` method would
then be invoked with an invalid column reader.
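The failure mode can be reproduced with a toy reference-counted buffer. This is only an illustration of the double release: `buffer` and `consume` are hypothetical and are not the flux table or column reader types.

```go
package main

import "fmt"

// buffer is a hypothetical reference-counted buffer standing in for the
// column data of a buffered table.
type buffer struct {
	refs int
	data []float64
}

// Release drops one reference and frees the data when none remain.
func (b *buffer) Release() {
	b.refs--
	if b.refs == 0 {
		b.data = nil
	}
}

// consume reads the buffer and then performs the automatic release that the
// framework does on the caller's behalf. When manualRelease is true it also
// releases inside the "callback", which is the bug described above.
func consume(b *buffer, manualRelease bool) int {
	n := len(b.data)
	if manualRelease {
		b.Release() // erroneous manual release
	}
	b.Release() // automatic release
	return n
}

func main() {
	// One reference each for the two consumers (yield and to in the text).
	b := &buffer{refs: 2, data: []float64{1, 2, 3}}
	consume(b, true)               // first consumer double releases
	fmt.Println(consume(b, false)) // second consumer finds the data already freed
}
```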
The e2e test driver in influxdb runs the tests twice to get past the fact that there
is no way to force order between the write to storage and the read back. When
the json.Marshal call became mandatory it was added to the first run, but not
the second.