635 Commits (2a658ec0ab2b0cdb5bbb93d1c54d4dccbf9f2e6a)

Author SHA1 Message Date
Andrew Lamb 3eb48ef210
chore: Update datafusion again (#8247)
* chore: Update datafusion to get new grouping

* chore: Update for new API

* chore: update tests

* fix: new API

* fix: state type

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 11:20:36 +00:00
Andrew Lamb ac9d1946e9
fix: add retry loop to avoid CI flake in build-catalog test (#8271)
* fix: add retry loop to avoid CI flake in build-catalog test

* fix: Update influxdb_iox/tests/end_to_end_cases/debug.rs

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 10:14:06 +00:00
Joe-Blount 1bed99567c
chore: add DF metrics to compaction spans (#8270)
* chore: add DF metrics to compaction spans

* chore: update string for test verification

* chore: update comment

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-20 15:00:22 +00:00
Christopher M. Wolff 668a1c3d8e
fix: aggregate fns called on tags should return null (#8274)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-20 14:55:16 +00:00
Marco Neumann 6ae9143742
fix: `end_to_end_cases::cli::query_ingester` flakiness (#8281)
While I cannot reproduce the CI flakiness locally (probably because the
local system is fast enough), looking at the test convinced me that the
ingester should not persist.

Closes #8245.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-20 11:48:59 +00:00
Martin Hilton d1640bb926
feat(influxql): CUMULATIVE_SUM window function (#8248)
* feat(influxql): CUMULATIVE_SUM window function

Implement the InfluxQL CUMULATIVE_SUM window function. This is
implemented as described in
https://docs.influxdata.com/influxdb/v1.8/query_language/functions/#cumulative_sum.

* chore: Add a test demonstrating NULL handling of CUMULATIVE_SUM
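
The running-total behaviour of CUMULATIVE_SUM can be sketched roughly as below. This is a minimal illustration and not the IOx implementation: the function name `cumulative_sum` is hypothetical, and skipping NULL inputs (instead of emitting a row for them) is an assumption made only for this sketch.

```rust
/// Hypothetical sketch: running total over a column that may contain NULLs.
/// Assumption: a NULL input produces no output row and does not reset the total.
fn cumulative_sum(values: &[Option<f64>]) -> Vec<f64> {
    let mut total = 0.0_f64;
    values
        .iter()
        .flatten() // skip NULLs (assumed behaviour)
        .map(|v| {
            total += *v;
            total
        })
        .collect()
}
```

With the input `[1, NULL, 2, 3]` this sketch yields one running total per non-NULL value.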

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-18 06:13:58 +00:00
Christopher M. Wolff 33e41fc5cb
fix: improve error for malformed gap fill query (#8252)
* fix: improve error for malformed gap fill query

* fix: code review feedback
2023-07-17 21:20:34 +00:00
Christopher M. Wolff b916a89159
fix: recurse through SubqueryAlias when finding gap fill time range (#8249) 2023-07-17 19:39:30 +00:00
Joe-Blount 85a9e13262
Merge branch 'main' into jrb_63_compactor_spans 2023-07-17 09:52:27 -05:00
Christopher M. Wolff 85f03acbdf
fix: correctly catch field/tag discrepancy (#8234)
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-14 18:21:56 +00:00
Joe-Blount 803122e3b4 Merge remote-tracking branch 'origin/main' into jrb_63_compactor_spans
# Conflicts:
#	compactor/src/driver.rs
2023-07-13 08:54:22 -05:00
Andrew Lamb 9bfec2f77c
fix: ignore flaky test while it is debugged (#8227)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-13 10:27:13 +00:00
Andrew Lamb 48a5c3e966
chore: Add longer sleep in `end_to_end_cases::debug::build_catalog` and extra logging (#8224)
* fix: Add longer sleep in end_to_end_cases::debug::build_catalog

* chore: add debug logging when test fails

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-13 10:14:59 +00:00
Joe-Blount c5a4912399 chore: add compactor tracing test case 2023-07-11 10:43:09 -05:00
Martin Hilton 9111cd517f
feat(influxql): PERCENTILE function (#8187)
* feat(influxql): support TOP and BOTTOM functions

Add support for the TOP and BOTTOM functions which return the first
n rows in some ordered data set.

* fix: clippy

* refactor(influxql): use window aggregates for selectors

Change the implementation of ProjectionType::Selector to use a window
aggregate, rather than an aggregate with a custom selector function.
This is in preparation for implementing PERCENTILE.

* feat(influxql): PERCENTILE selector

Add a selector for the row containing the nth percentile of a
partition. This is the behaviour used when a single selector function
is used in an influxql query.

* feat(influxql): PERCENTILE aggregator

Add the PERCENTILE aggregation function for when the PERCENTILE
function is used in an aggregating projection. This implementation
buffers all non-null field values in memory in order to perform the
operation and therefore could be an expensive operation. This is
necessary for compatibility with earlier influxdb versions.

* refactor(influxql): move PERCENTILE implementation out of plan

The plan module is getting rather full of user-defined function
implementations. This breaks the new functions used to implement
percentile into some new top-level modules for aggregate and window
UDFs.

* fix: doc-lint

* chore: refactor `find_enumerated`

* chore: use `s` in format string

* chore: include the unexpected selector function in the error

* chore(influxql): review suggestions

Added some additional comments to help understanding.

Changed the handling of selector functions such that FIRST, LAST,
MAX & MIN behave the same as they did before PERCENTILE was added.

* chore(influxql): make percent_row_number a window UDF

Now that user-defined window functions are available, make the
percent_row_number function one of those. This allows the values
to be calculated for the entire window partition in one go.

For some reason the user-defined window function cannot return NULL
values. This function uses 0 where it would otherwise use NULL, as
row numbering starts at 1.
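
The buffering approach described for the PERCENTILE aggregator can be sketched as follows. This is only an illustration under stated assumptions: the function name `percentile` is hypothetical, and the nearest-rank index rule used here is an assumption about the intended semantics, not something taken from the commit.

```rust
/// Hypothetical sketch of a buffering percentile aggregate.
/// All non-NULL values are collected and sorted in memory, which is why the
/// operation can be expensive for large inputs.
fn percentile(values: &[Option<f64>], p: f64) -> Option<f64> {
    // Buffer every non-NULL value.
    let mut buf: Vec<f64> = values.iter().flatten().copied().collect();
    buf.sort_by(|a, b| a.total_cmp(b));

    // Assumed nearest-rank rule: floor(len * p / 100 + 0.5) - 1.
    let idx = (buf.len() as f64 * p / 100.0 + 0.5).floor() as i64 - 1;
    if idx < 0 || idx as usize >= buf.len() {
        return None; // no row corresponds to this percentile
    }
    Some(buf[idx as usize])
}
```

Returning `None` when the computed index is out of range mirrors the idea that a percentile may select no row at all, for example on an empty partition.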

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-11 05:33:16 +00:00
Fraser Savage dec0244bff
refactor(e2e): Wait 100ms between queries in debug::build_catalog test 2023-07-10 15:27:30 +01:00
Fraser Savage 0978aa0551
fix(e2e): Add small busy-loop to debug::build_catalog test to assert only on non-empty results 2023-07-10 15:13:37 +01:00
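
The polling pattern described in the two `debug::build_catalog` commits above (retry the query, pause between attempts, assert only once results are non-empty) might look roughly like the sketch below. The helper name `wait_for_non_empty` and its parameters are hypothetical and are not taken from the test code.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: polls `query` until it returns a non-empty result
/// or `timeout` elapses, sleeping `interval` between attempts.
fn wait_for_non_empty<T>(
    mut query: impl FnMut() -> Vec<T>,
    interval: Duration,
    timeout: Duration,
) -> Option<Vec<T>> {
    let deadline = Instant::now() + timeout;
    loop {
        let rows = query();
        if !rows.is_empty() {
            return Some(rows); // safe to assert on these rows now
        }
        if Instant::now() >= deadline {
            return None; // give up so the test fails instead of hanging
        }
        sleep(interval);
    }
}
```

A test would call this with an interval such as `Duration::from_millis(100)` and fail if `None` comes back.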
Andrew Lamb 3ce11d8d66
chore: Update DataFusion (#8190)
* chore: Update DataFusion

* chore: Run cargo hakari tasks

* fix: Update for API changes

* fix: use display format

* chore: Update explain plan output

* fix: update plans

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-10 09:54:50 +00:00
Andrew Lamb 048fc32bd5
feat: add `influxdb_iox debug build-catalog` command (#8067)
* feat: add `influxdb_iox debug build-catalog` command

* fix: tests

* fix: Use info! logs instead of println for status

* fix: Set partition_hash_id as well

* fix: remove leftover code

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-07 18:32:27 +00:00
Stuart Carnie 1ca547b313
fix: Teach planner to rewrite binary expressions for div operator
Specifically when the operands are integers, to match InfluxQL OG
2023-07-07 11:22:03 +10:00
Martin Hilton dfffdc1d90
feat(influxql): support TOP and BOTTOM functions (#8143)
* refactor(iox_query_influxql): expand select projection

Change the SELECT projection in the planner to make it clearer how
each projection type works.

* feat(influxql): support TOP and BOTTOM functions

Add support for the TOP and BOTTOM functions which return the first
n rows in some ordered data set.

* fix: clippy

* chore: Use array / slice destructuring

* chore: review suggestion in iox_query_influxql/src/plan/planner.rs

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
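
The idea of TOP and BOTTOM returning "the first n rows in some ordered data set" can be illustrated with a small sketch. The function names and the use of plain `f64` values are assumptions for illustration; the actual planner works on query plans rather than in-memory vectors.

```rust
/// Hypothetical sketch: the `n` largest values, largest first.
fn top(values: &[f64], n: usize) -> Vec<f64> {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| b.total_cmp(a)); // descending order
    sorted.truncate(n); // keep only the first n rows
    sorted
}

/// Hypothetical sketch: the `n` smallest values, smallest first.
fn bottom(values: &[f64], n: usize) -> Vec<f64> {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b)); // ascending order
    sorted.truncate(n);
    sorted
}
```

If `n` exceeds the number of rows, both functions simply return every row in sorted order.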

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-06 07:08:45 +00:00
Marco Neumann 70b44f78ee
test: correctly decode ingester responses in end2end tests 2023-07-03 17:25:01 +02:00
Marco Neumann b1a4e3955e
test: `ingester_partition_pruning` must perform type coercion 2023-07-03 17:25:00 +02:00
Carol (Nichols || Goulding) cd28bf0337
test: Query an ingester with a predicate that should prune partitions 2023-07-03 17:24:58 +02:00
Dom Dwyer e5a9e1534a
test: assert 1 file persisted
There should be a single file persisted during graceful shutdown.
2023-07-03 15:51:02 +02:00
Dom Dwyer 5d0c172e61
test(e2e): query shutdown-persisted files
Ensure buffered ingester data is persisted and remains queryable after a
graceful ingester shutdown.
2023-07-03 15:51:02 +02:00
Marco Neumann 4638b89d93
refactor: migrate retention to proper predicates (#8092)
Do not (ab)use per-chunk delete predicates for the retention policy.
Instead use a per-table predicate.

This makes the code way cleaner, since the scoping is correct (i.e.
delete predicates are a table-wide attribute, not a chunk-based one) and
it is consistent with the time predicates that the user provides (e.g. via
`WHERE time > x`).

It also allows us to remove delete predicates (in their current,
non-scalable form) from the query path. A potential future version would
likely not use per chunk predicates (and "is processed" markers) but use
the timestamp / chunk order to determine to which data the predicate
should be applied.

Note that the lowering of the retention policy changed slightly from

```text
(time > (now() - retention)) AND (time < MAX)
```

to

```text
time > (now() - retention)
```

Since the `MAX` cut is just an artifact of the lowering and was unnecessary.
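
As a rough illustration of the simplified lowering, the predicate amounts to a single lower-bound check. The helper below is hypothetical (it is not part of the IOx code base) and assumes nanosecond timestamps stored as `i64`.

```rust
/// Hypothetical helper: true when a row is still inside the retention window,
/// i.e. `time > (now() - retention)`. No upper `MAX` bound is applied.
fn is_retained(time_ns: i64, now_ns: i64, retention_ns: i64) -> bool {
    // saturating_sub avoids overflow for extreme inputs (a choice made here)
    time_ns > now_ns.saturating_sub(retention_ns)
}
```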

Closes #7409.
Closes #7410.
2023-06-29 08:36:37 +00:00
Martin Hilton 511a0bae78
feat(influxql): add derivative and non_negative_derivative (#8103)
Add the DERIVATIVE and NON_NEGATIVE_DERIVATIVE functions to influxql.
These are used to calculate derivatives over arbitrary time units.
The implementation is modeled after the DIFFERENCE and
NON_NEGATIVE_DIFFERENCE functions, with the difference that the unit
parameter is a configuration of the user-defined aggregator function
and therefore there cannot be a single shared definition of the
function.

The NON_NEGATIVE_DIFFERENCE function implementation has been
refactored to be an arbitrary NON_NEGATIVE wrapper for any Accumulator
function.
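
A minimal sketch of the two ideas above, assuming nanosecond timestamps: a derivative scaled to an arbitrary unit, and a generic non-negative wrapper. The names `Point`, `derivative` and `non_negative` are hypothetical, and the wrapper is shown as a simple post-filter instead of a wrapper around an Accumulator.

```rust
/// One timestamped sample; `time_ns` is in nanoseconds (an assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    time_ns: i64,
    value: f64,
}

/// Rate of change between consecutive points, expressed per `unit_ns`.
fn derivative(points: &[Point], unit_ns: i64) -> Vec<Point> {
    points
        .windows(2)
        .filter_map(|w| {
            let (prev, cur) = (w[0], w[1]);
            let elapsed = cur.time_ns - prev.time_ns;
            if elapsed == 0 {
                return None; // avoid dividing by zero on duplicate timestamps
            }
            let rate = (cur.value - prev.value) / (elapsed as f64 / unit_ns as f64);
            Some(Point { time_ns: cur.time_ns, value: rate })
        })
        .collect()
}

/// Generic "non-negative" step: drops negative results of any inner computation.
fn non_negative(points: Vec<Point>) -> Vec<Point> {
    points.into_iter().filter(|p| p.value >= 0.0).collect()
}
```

Because `unit_ns` is an argument, each distinct unit effectively defines a different function, which is in line with the remark that there cannot be one shared definition.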

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-29 05:53:18 +00:00
Marco Neumann 178483c1a0
feat: basic non-aggregates w/ InfluxQL selector functions (#8016)
* test: ensure that selectors check arg count

* feat: basic non-aggregates w/ InfluxQL selector functions

See #7533.

* refactor: clean up code

* feat: get more advanced cases to work

* docs: remove stale comments

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-23 08:05:50 +00:00
Stuart Carnie 7b4a1a0660
chore: PR feedback
Add tests for fewer rows than N for `moving_average`

See: https://github.com/influxdata/influxdb_iox/pull/8023#discussion_r1237298376
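
The "fewer rows than N" case can be pictured with a small sketch. The function below is hypothetical, and returning no rows when fewer than `n` inputs are available is an assumption about the intended behaviour, not something stated in the commit.

```rust
/// Hypothetical sketch: simple moving average over a window of `n` rows.
/// Assumption: with fewer than `n` rows there is nothing to emit.
fn moving_average(values: &[f64], n: usize) -> Vec<f64> {
    if n == 0 || values.len() < n {
        return Vec::new();
    }
    values
        .windows(n)
        .map(|w| w.iter().sum::<f64>() / n as f64)
        .collect()
}
```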
2023-06-22 12:15:47 +10:00
Stuart Carnie 13726c2a76
Merge branch 'main' into sgc/issue/7600_moving_average 2023-06-22 10:10:22 +10:00
Marco Neumann 83a5037e61
feat: query support for custom partitioning (#8025)
* feat: querier-specific stat creation routine

* feat: prune querier chunks using partition col ranges

* feat: add table client

* test: custom partitioning

* fix: correctly set up stats for chunks with col subsets

* fix: flaky test

* refactor: remove obsolete dead_code markers

* feat: add partition template to `create_namespace`

* test: extend custom partitioning end2end tests

* fix: explain shuffling, make it actually deterministic
2023-06-21 09:03:19 +00:00
Stuart Carnie 2cbaf9cffa
chore: more tests, renamed avg_n → moving_average 2023-06-21 15:05:08 +10:00
Stuart Carnie edaac28498
Merge branch 'main' into sgc/issue/7600_moving_average 2023-06-21 11:39:06 +10:00
wiedld 34b5fadde0
refactor: move scheduler related configs to compactor_scheduler (#8013) 2023-06-20 09:55:35 -07:00
Stuart Carnie a2521bbf35
feat: moving_average, difference and non_negative_difference
There is a `todo` regarding `update_batch` to be discussed with @alamb
2023-06-20 16:37:28 +10:00
Stuart Carnie 8670b28445
Merge branch 'main' into sgc/issue/7600_moving_average 2023-06-18 09:41:19 +10:00
Andrew Lamb 5889c96501
chore: Update `datafusion` and other dependencies (#7981)
* chore: Update DataFusion pin

* chore: Update other dependencies

* chore: Update hakari

* fix: Update for API changes

* fix: Update explain plan

* fix: Update influxql plans

* fix: rustdoc links

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-16 09:48:55 +00:00
Stuart Carnie 2407be8062
feat: trialed retractable UDAF
Unfortunately, this is not suitable when the source data has nulls,
as InfluxQL OG ignores these values.
2023-06-16 13:10:47 +10:00
Fraser Savage 73c0c28bd0
feat(cli): Add `influxdb_iox debug wal inspect` command
This commit adds an `inspect` command to read through the sequenced
operations in a WAL file and debug pretty print their contents to
stdout, optionally filtering by a sequence number range.
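
The optional sequence-number filter could be sketched along these lines. `SequencedOp` and `filter_ops` are hypothetical stand-ins and do not reflect the actual WAL types or the command's flags.

```rust
use std::ops::RangeInclusive;

/// Hypothetical stand-in for a sequenced WAL operation.
#[derive(Debug, PartialEq)]
struct SequencedOp {
    sequence_number: u64,
    payload: String,
}

/// Keeps only operations inside `range`; `None` means "no filter".
fn filter_ops<'a>(
    ops: &'a [SequencedOp],
    range: Option<&RangeInclusive<u64>>,
) -> Vec<&'a SequencedOp> {
    ops.iter()
        .filter(|op| range.map_or(true, |r| r.contains(&op.sequence_number)))
        .collect()
}
```

Each retained operation could then be written out with the `{:#?}` debug pretty-print formatter.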
2023-06-09 18:16:57 +01:00
Marko Mikulicic d26ad8e079
feat: Allow passing service protection limits in create db gRPC call (#7941)
* feat: Allow passing service protection limits in create db gRPC call

* fix: Move the impl into the catalog namespace trait

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-08 14:28:32 +00:00
Andrew Lamb 17c0d837b3
chore: Update DataFusion, arrow, object_store pins (#7942)
* chore: Update DataFusion, arrow, object_store pins

* chore: Update for hakari

* chore: Update for new APIs

* fix: update test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-07 17:08:31 +00:00
Stuart Carnie c18902b05e
Merge branch 'main' into sgc/issue/7829_time_bounds_3 2023-06-07 08:51:38 +10:00
Nga Tran a2f5f37b2e
test: turn interval 0 test on after upgrading DF with the fix (#7938)
* test: turn interval 0 test on after upgrading DF with the fix

* chore: remove obsolete comments
2023-06-06 15:50:54 +00:00
Stuart Carnie f114842711
feat: Push outer query time-range to subqueries
Added additional end-to-end tests to validate time-range behaviour
2023-06-06 16:33:01 +10:00
Stuart Carnie 9e2550c933
Merge branch 'main' into sgc/issue/7829_time_bounds_3
# Conflicts:
#	iox_query_influxql/src/plan/planner.rs
2023-06-06 12:55:43 +10:00
Andrew Lamb f571aeb445
chore: Update DataFusion pin (#7916)
* chore: Update DataFusion pin

* chore: Update cargo

* fix: update for API changes

* fix: Update plans

* chore: Update for new api

* fix: Update plans

* chore: Update for API changes more

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-05 18:38:59 +00:00
Stuart Carnie d8c2f2c679
refactor: Simplify `TimeRange` to match InfluxQL OG behaviour explicitly 2023-06-05 15:14:13 +10:00
Stuart Carnie 28166006a8
chore: clippy 2023-06-04 06:56:19 +10:00
kodiakhq[bot] 1d6fd83a9a
Merge branch 'main' into savage/wal-regenerate-lp-catalog-support 2023-06-02 14:23:55 +00:00