Commit Graph

23 Commits (933a14e16f0232c5ec883644b7363a59da7bcd90)

Author SHA1 Message Date
Jeffrey Smith II fce0d1c863
chore: update to go 1.19 (#24119)
* chore: update to go 1.19.6

* chore: gofmt

* test: fix tests for sort order change

* chore: generate pb

* feat: upgrade flux to v0.188.0 (#23911)

* feat: upgrade flux to 0.171.0

Tests failing, safety commit

First step in https://github.com/influxdata/influxdb/issues/23815

* fix: remove "org" parameter" from writeOptSource

I attempted to implement the "orgOpt" argument in a similar fashion
to f6669f7512. However, it looks like Flux doesn't accept "org" as
a parameter to "load". It responds with:

Error calling function \"load\" @113:16-113:30: error calling function \"to\" @6:19-6:47: unused arguments [org]

This brings us from 194 passing to 570 passing.

* fix: temporarily disable broken flux tests

These tests expect rows to be stored in a certain order. However,
nothing is specifying the sort order. This has been fixed in a
later update to flux: (see 3d6f47ded).

Temporarily disable these tests until we include a fixed
version of the flux tests.

* chore: add tests from a492993012

This fixes "test-flux.sh" so it runs tests within the "flux/"
directory. This uncovered some other issues with the tests
located within "flux/". These also needed to be updated
to match the newer flux API.

* feat: upgrade flux to 0.172.0

This includes changes made in "cbbf4b27da". Since "test.go" in 2.x
diverged from 1.x, some modifications were required to make this
compatible.

* feat: upgrade flux to 0.173.0

* feat: upgrade flux to v0.174.0

* fix: Update the condition when reseting cursor (#23522)

Filters that contain `or` may change between cursor resets so we must remember to update the condition in the read cursor.

```flux
|> filter(fn: (r) => ((r["_field"] == "field1" and r["_value"]==true) or (r["_field"] == "field2" and r["_value"] == false)))
```

Closes https://github.com/influxdata/flux/issues/4804

* feat: upgrade flux to 0.174.1

* feat: upgrade flux to 0.175.0

* chore: remove end-to-end tests

These were removed in a492993 for 2.x. These tests prevent "go test ./..."
from completing. As stated in the original commit, these tests should now be
handled by the "fluxtest" harness.

* feat: upgrade flux to 0.176.0

Some tests needed to be disabled within the flux harness. This is a
result of enabling "Optimize Aggregate Window" in flux@05a1065f.
These tests are not present in 2.x. Therefore, I am unsure if
the breakage is resolved in a later commit.

* feat: upgrade flux to 0.177.0

* feat: upgrade flux to 0.178.0

* feat: upgrade flux to v0.179.0

This removes all invocations of "flux.RegisterOpSpec". According
to flux@e39096d5, "flux.RegisterOpSpec" does nothing in the
current version of flux and was removed.

* chore: update fluxtest skip list (#23633)

* chore: manually backport 785a465e9a

This removes the reference to "flux.Spec".

* build(flux): update flux to v0.181.0 (#23682)

* build(flux): update flux to v0.184.2

* chore: skip more Flux acceptance tests

There are issues for each skip detailed in test-flux.sh.

* feat: upgrade flux to v0.185.0

This adds "FluxTesting" to the "HTTPD" configuration. This option is
hidden and disabled by default. When "FluxTesting" is set, it
enables the default testing flags for "Flux".

These flags allow the "vectorized float tests" and tests requiring
the "removeRedundantSortNodes" and "labelPolymorphism" flag
enabled to work. These changes are based off of d8553c002e.

flux@3d6f47ded is included within this version of Flux. Therefore
we can now include the "group_*" tests.

* feat: upgrade flux to 0.186.0

* feat: upgrade flux to 0.187.0

* feat: upgrade flux to 0.188.0

* fix: re-run ./generate.sh with updated protoc

* fix: restrict cores to match CircleCI documentation

Co-authored-by: davidby-influx <dbyrne@influxdata.com>
Co-authored-by: Markus Westerlind <marwes91@gmail.com>
Co-authored-by: Sean Brickley <sean@wabr.io>
Co-authored-by: Jonathan A. Sternberg <jonathan@influxdata.com>
Co-authored-by: Christopher M. Wolff <chris.wolff@influxdata.com>

---------

Co-authored-by: Brandon Pfeifer <bpfeifer@influxdata.com>
Co-authored-by: davidby-influx <dbyrne@influxdata.com>
Co-authored-by: Markus Westerlind <marwes91@gmail.com>
Co-authored-by: Sean Brickley <sean@wabr.io>
Co-authored-by: Jonathan A. Sternberg <jonathan@influxdata.com>
Co-authored-by: Christopher M. Wolff <chris.wolff@influxdata.com>
2023-03-03 10:05:05 -05:00
Dane Strandboge 8b38d0e2bf
build: upgrade protobuf library (#22606) 2021-10-15 11:42:47 -05:00
Sam Arnold 894f54e6ac
fix: group by returns multiple results per group in some circumstances (#21631)
* fix: Revert performance improvement for sorted merge iterator

This reverts commit af8e66cd25.

* test: add end to end regression test for broken group-by

* chore: update changelog
2021-06-08 10:41:58 -04:00
Tristan Su af8e66cd25 improvement(query): performance improvement for sorted merge iterator
Sorted merge iterator has cpu-intensive operations to sort the points
from multiple inputs. Typical queries like `SELECT * FROM m GROUP BY *`
do not behave well due to the comparison of points though in many cases
it doesn't necessarily have to use the slow path.

This patch adds a shortcut.  If each input has a single and unique
series we can just return the points input by input.
The detection of the shortcut introduces slight overhead but the gains
are significant in many slow queries.
2020-04-20 21:06:45 +08:00
Jonathan A. Sternberg c381389f35
Subquery ordering with aggregates in descending mode was wrong
The sort order of points when performing aggregates never took into
account if they were ascending or descending so when multiple series
were aggregated, it would ensure they were sorted in the correct order.
But it wouldn't reverse this order when descending was used.

Additionally, it seems that the iterator template and the iterator file
itself became out of sync. It seems the template was not reverted
correctly from a previously incorrect change and only the float type was
changed to the correct version and the tests used the float version.
2019-07-17 09:55:38 -05:00
Jonathan A. Sternberg e153f3fa10
Fix the sort order for aggregates so that they are sorted by tag and then time
The reduce iterators would read in the points for a window, which
matched the grouping of the outermost query, and then it would sort them
by the time before emitting the points.

When there were multiple series, this would sometimes cause a conflict
because it would change the sorting of the inner query output when
selectors were used within a subquery. Then, these emitted points would
be output in the wrong order and they wouldn't join correctly when
multiple cursors were used.

This fixes it so the sorting happens per series grouping rather than on
all of the points together so they retain their tag order which is the
correct sorting method.
2019-04-16 12:25:32 -05:00
Jonathan A. Sternberg 0a7f379768
Fixing the stream iterator to not ignore the error
The stream iterator would ignore an error that happened when reading
points. This may have caused it to potentially return an error that got
ignored and then to try invoking `Next()` on an iterator in an invalid
state and that iterator would then actually return a point which it
wasn't supposed to.

Also added some defensive coding to that same call to prevent a nil map
from being assigned to in the event of an invalid iterator returning
junk data.
2018-10-10 16:26:55 -05:00
Jonathan A. Sternberg 22fc9f6a19
Strip tags from a subquery when the outer query does not group by that tag
The following would, erroneously, not strip the tag from the inner
query:

    SELECT value FROM (SELECT value FROM cpu GROUP BY host)

The inner query was supposed to group by the host tag, but the outer
query should strip it away since it is not being grouped by anymore.
This fixes things so that the result will have the tags stripped away
when they are not requested in the grouping.
2018-10-04 10:05:46 -05:00
Jonathan A. Sternberg 0f304690c5 Enable casting values from a subquery
This also fixes the cursor system to abandon iterators that will not
produce meaningful results since the variables are all unknown types.

This creates a weird behavior that existed in previous releases and we
are keeping here for backwards compatibility. If a subquery referenced a
field that didn't exist in the subquery, it will return nothing. But, if
there are two subqueries and one of them has the field exist and the
other doesn't, the second will return all null values.
2018-03-30 16:58:37 -05:00
Jonathan A. Sternberg a49a8dce6b Remove unused query code
This code was previously used to implement binary expressions and other
transfomation iterators. It is no longer needed.
2018-03-28 13:24:45 -05:00
Jonathan A. Sternberg f8d60a881d Refactor the math engine to compile the query and use eval
This change makes it so that we simplify the math engine so it doesn't
use a complicated set of nested iterators. That way, we have to change
math in one fewer place.

It also greatly simplifies the query engine as now we can create the
necessary iterators, join them by time, name, and tags, and then use the
cursor interface to read them and use eval to compute the result. It
makes it so the auxiliary iterators and all of their complexity can be
removed.

This also makes use of the new eval functionality that was recently
added to the influxql package.

No math functions have been added, but the scaffolding has been included
so things like trigonometry functions are just a single commit away.

This also introduces a small breaking change. Because of the call
optimization, it is now possible to use the same selector multiple times
as a selector. So if you do this:

    SELECT max(value) * 2, max(value) / 2 FROM cpu

This will now return the timestamp of the max value rather than zero
since this query is considered to have only a single selector rather
than multiple separate selectors. If any aspect of the selector is
different, such as different selector functions or different arguments,
it will consider the selectors to be aggregates like the old behavior.
2018-03-19 15:01:15 -05:00
Jonathan A. Sternberg df7a660fb3 Modify the Select call to return a Cursor
The Cursor returned will be capable of scanning rows into a structure.
It replaces part of the function for why the Emitter existed. The
Emitter would both join the resulting rows and then transform the values
into a models.Row so it could be returned to the results.

In the future, we will be able to use the Cursor directly to write out
values which should be more memory efficient.
2018-03-09 12:47:41 -06:00
Edd Robinson 21f0c6415b Cleanup query package 2018-01-21 12:08:23 -08:00
Jonathan A. Sternberg a73c3a1965 Fix race condition in the merge iterator close method
If the close happens when next is being called, it can result in a race
condition where the current iterator gets set to nil after the initial
check.

This also fixes the finalizer so it runs the close method in a goroutine
instead of running it by itself. This is because all finalizers run on
the same goroutine so a close that takes a long time can cause a backup
for all finalizers. This also removes the redundant call to
`runtime.SetFinalizer` from the finalizer itself because a finalizer,
when called, has already cleared itself.
2017-11-27 16:55:41 -06:00
Edd Robinson 98d584b63f Use index for SHOW X meta queries
When a meta query does not include a time component then it can be
answered exclusively by the index. This should result in a much faster
query execution that if the TSM engine was engaged.

This commit rewrites the following queries such that they make use
of the index where no time component is present:

  - SHOW MEASUREMENTS
  - SHOW SERIES
  - SHOW TAG KEYS
  - SHOW FIELD KEYS
2017-11-06 19:15:00 +00:00
Stuart Carnie f3d45ba301 influxdata/influxdb/influxql -> influxdata/influxql 2017-10-30 14:40:26 -07:00
Stuart Carnie c51ba16287 fixes #9007 2017-10-25 13:08:55 -07:00
Stuart Carnie e9313876ab EXPLAIN ANALYZE
* Introduces EXPLAIN ANALYZE command, which
  produces a detailed tree of operations used to
  execute the query.

introduce context.Context to APIs

metrics package

* create groups of named measurements
* safe for concurrent access

tracing package

EXPLAIN ANALYZE implementation for OSS

Serialize EXPLAIN ANALYZE traces from remote nodes

use context.Background for tests

group with other stdlib packages

additional documentation and remove unused API

use influxdb/pkg/testing/assert

remove testify reference
2017-10-20 08:01:37 -07:00
Jonathan A. Sternberg f20cab6e99 Implicitly decide on the lower limit for fill queries when none is present
This allows the query:

    SELECT mean(value) FROM cpu GROUP BY time(1d)

To function in some way that makes sense. The upper limit is implicitly
the `now()` starting time and the lower limit will be whichever interval
the lowest point falls into.

When no lower bound is specified and `max-select-buckets` is specified,
the query will only consider points that would satisfy
`max-select-buckets`. So if you have one point written in 1970, have
another point within the last minute, and then do the above query with
`max-select-buckets` being equal to 10, the older point from 1970 will
not be considered.
2017-10-05 15:56:44 -05:00
Jonathan A. Sternberg 0ef94e0cf0 Add unsigned iterators for all types
This allows unsigned data to be queried from the storage engine.

Binary math is not yet implemented for unsigned types.
2017-09-18 15:09:10 -05:00
Jonathan A. Sternberg 5a9553b2c4 Remove unused casting code from the query engine
Originally, casting was performed inside of the query engine especially
for call iterators. Currently, the engine takes care of all casting so
we just need to normalize the iterators types for type safety reasons
rather than actual functional reasons.

Removing this code. Cover coverage showed that it was not hit when run
against the actual server. I ran the tests package and got code coverage
of the query package while running the tests in that package.
2017-09-18 12:33:34 -05:00
Edd Robinson 8c4686fb1b Ensure that sorted heaps are merged correctly
When merging streams of system iterators we don't use tags or time.
Instead we add series keys (in the case of, for example, `SHOW SERIES`)
to the `Aux` field of the iterators' elements. This is because we only
emit merged and sorted sets of series key to the client.

We currently use `SortedMergeHeap`s to merge together multiple
iterators, and the comparitor function did not consider `Aux` fields
when determining which heap to pop the next item off during a merge. As
such, `SHOW SERIES` and `SHOW TAG KEYS` (any meta query that gets
converted into a special type of `SELECT`) were returning results in
arbitrary order.

This issue was never noticed on the `inmem` index because the streams
are always duplicates of each other, and of course it doesn't matter if
you arbitrarily merge together two idential, sorted streams...

The issue first manifested itself on the `tsi1` index, but this fix will
apply to both indexes.
2017-08-23 17:21:24 +01:00
Jonathan A. Sternberg 9a2357c2c0 Separate the query engine into a separate package
This change provides a clear separation between the query engine
mechanics and the query language so that the language can be parsed and
dealt with separate from the query engine itself.
2017-08-16 13:38:43 -05:00