influxdb

Commit Graph

Author	SHA1	Message	Date
Martin Hilton	9111cd517f	feat(influxql): PERCENTILE function (#8187 ) * feat(influxql): support TOP and BOTTOM functions Add support for the TOP and BOTTOM functions which return the first n rows in some ordered data set. * fix: clippy * refactor(influxql): use window aggregates for selectors Change the implementation of ProjectionType::Selector to use a window aggregate, rather than an aggregate with a custom selector function. This is in preparation for implementing PERCENTILE. * feat(influxql): PERCENTILE selector Add a selector for the row containing the nth percentile of a partition. This is the behaviour used when a single selector function is used in an influxql query. * feat(influxql): PERCENTILE aggregator Add the PERCENTILE aggregation function for when the PERCENTILE function is used in an aggregating projection. This implementation buffers all non-null field values in memory in order to perform the operation and therefore could be an expensive operation. This is necessary for compatibility with earlier influxdb versions. * refactor(influxql): move PERCENTILE implementation out of plan The plan module is getting rather full of user-defined function implementations. This breaks the new functions used to implement percentile into some new top-level modules for aggregate and window UDFs. * fix: doc-lint * chore: refactor `find_enumerated` * chore: use `s` in format string * chore: include the unexpected selector function in the error * chore(influxql): review suggestions Added some additional comments to help understanding. Changed the handling of selector functions such that FIRST, LAST, MAX & MIN behave the same as they did before PERCENTILE was added. * chore(influxql): make percent_row_number a window UDF Now that user-defined window functions are available make the percent_row_number function be one of those. This allows the values to be calculated for the entire window partition in one go.
For some reason the user-defined window function cannot return NULL values. This function uses 0 where it would otherwise use NULL, as row numbering starts at 1. --------- Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-07-11 05:33:16 +00:00
Fraser Savage	dec0244bff	refactor(e2e): Wait 100ms between queries in debug::build_catalog test	2023-07-10 15:27:30 +01:00
Fraser Savage	0978aa0551	fix(e2e): Add small busy-loop to debug::build_catalog test to assert only on non-empty results	2023-07-10 15:13:37 +01:00
Andrew Lamb	3ce11d8d66	chore: Update DataFusion (#8190 ) * chore: Update DataFusion * chore: Run cargo hakari tasks * fix: Update for API changes * fix: use display format * chore: Update explain plan output * fix: update plans --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-07-10 09:54:50 +00:00
Andrew Lamb	048fc32bd5	feat: add `influxdb_iox debug build-catalog` command (#8067 ) * feat: add `influxdb_iox debug build-catalog` command * fix: tests * fix: Use info! logs instead of println for status * fix: Set partition_hash_id as well * fix: remove leftover code --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-07-07 18:32:27 +00:00
Stuart Carnie	1ca547b313	fix: Teach planner to rewrite binary expressions for div operator Specifically when the operands are integers, to match InfluxQL OG	2023-07-07 11:22:03 +10:00
Martin Hilton	dfffdc1d90	feat(influxql): support TOP and BOTTOM functions (#8143 ) * refactor(iox_query_influxql): expand select projection Change the SELECT projection in the planner to make it clearer how each projection type works. * feat(influxql): support TOP and BOTTOM functions Add support for the TOP and BOTTOM functions which return the first n rows in some ordered data set. * fix: clippy * chore: Use array / slice destructuring * chore: review suggestion in iox_query_influxql/src/plan/planner.rs Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com> --------- Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-07-06 07:08:45 +00:00
Marco Neumann	70b44f78ee	test: correctly decode ingester responses in end2end tests	2023-07-03 17:25:01 +02:00
Marco Neumann	b1a4e3955e	test: `ingester_partition_pruning` must perform type coercion	2023-07-03 17:25:00 +02:00
Carol (Nichols || Goulding)	cd28bf0337	test: Query an ingester with a predicate that should prune partitions	2023-07-03 17:24:58 +02:00
Dom Dwyer	e5a9e1534a	test: assert 1 file persisted There should be a single file persisted during graceful shutdown.	2023-07-03 15:51:02 +02:00
Dom Dwyer	5d0c172e61	test(e2e): query shutdown-persisted files Ensure buffered ingester data is persisted and remains queryable after a graceful ingester shutdown.	2023-07-03 15:51:02 +02:00
Marco Neumann	4638b89d93	refactor: migrate retention to proper predicates (#8092 ) Do not (ab)use per-chunk delete predicates for the retention policy. Instead use a per-table predicate. This makes the code way cleaner, since the scoping is correct (i.e. delete predicates are a table-wide attribute, not a chunk-based one) and it is consistent with time predicates that the user provides (e.g. via `WHERE time > x`). It also allows us to remove delete predicates (in their current, non-scalable form) from the query path. A potential future version would likely not use per chunk predicates (and "is processed" markers) but use the timestamp / chunk order to determine to which data the predicate should be applied. Note that the lowering of the retention policy changed slightly from ```text (time > (now() - retention)) AND (time < MAX) ``` to ```text time > (now() - retention) ``` Since the `MAX` cut is just an artifact of the lowering and was unnecessary. Closes #7409. Closes #7410.	2023-06-29 08:36:37 +00:00
Martin Hilton	511a0bae78	feat(influxql): add derivative and non_negative_derivative (#8103 ) Add the DERIVATIVE and NON_NEGATIVE_DERIVATIVE functions to influxql. These are used to calculate derivatives over arbitrary time units. The implementation is modeled after the DIFFERENCE and NON_NEGATIVE_DIFFERENCE functions, with the difference that the unit parameter is a configuration of the user-defined aggregator function and therefore there cannot be a single shared definition of the function. The NON_NEGATIVE_DIFFERENCE function implementation has been refactored to be an arbitrary NON_NEGATIVE wrapper for any Accumulator function. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-29 05:53:18 +00:00
Marco Neumann	178483c1a0	feat: basic non-aggregates w/ InfluxQL selector functions (#8016 ) * test: ensure that selectors check arg count * feat: basic non-aggregates w/ InfluxQL selector functions See #7533. * refactor: clean up code * feat: get more advanced cases to work * docs: remove stale comments --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-23 08:05:50 +00:00
Stuart Carnie	7b4a1a0660	chore: PR feedback Add tests for fewer rows than N for `moving_average` See: https://github.com/influxdata/influxdb_iox/pull/8023#discussion_r1237298376	2023-06-22 12:15:47 +10:00
Stuart Carnie	13726c2a76	Merge branch 'main' into sgc/issue/7600_moving_average	2023-06-22 10:10:22 +10:00
Marco Neumann	83a5037e61	feat: query support for custom partitioning (#8025 ) * feat: querier-specific stat creation routine * feat: prune querier chunks using partition col ranges * feat: add table client * test: custom partitioning * fix: correctly set up stats for chunks with col subsets * fix: flaky test * refactor: remove obsolete dead_code markers * feat: add partition template to `create_namespace` * test: extend custom partitioning end2end tests * fix: explain shuffling, make it actually deterministic	2023-06-21 09:03:19 +00:00
Stuart Carnie	2cbaf9cffa	chore: more tests, renamed avg_n → moving_average	2023-06-21 15:05:08 +10:00
Stuart Carnie	edaac28498	Merge branch 'main' into sgc/issue/7600_moving_average	2023-06-21 11:39:06 +10:00
wiedld	34b5fadde0	refactor: move scheduler related configs to compactor_scheduler (#8013 )	2023-06-20 09:55:35 -07:00
Stuart Carnie	a2521bbf35	feat: moving_average, difference and non_negative_difference There is a `todo` regarding `update_batch` to be discussed with @alamb	2023-06-20 16:37:28 +10:00
Stuart Carnie	8670b28445	Merge branch 'main' into sgc/issue/7600_moving_average	2023-06-18 09:41:19 +10:00
Andrew Lamb	5889c96501	chore: Update `datafusion` and other dependencies (#7981 ) * chore: Update DataFusion pin * chore: Update other dependencies * chore: Update hakari * fix: Update for API changes * fix: Update explain plan * fix: Update influxql plans * fix: rustdoc links --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-16 09:48:55 +00:00
Stuart Carnie	2407be8062	feat: trialed retractable UDAF Unfortunately, this is not suitable when the source data has nulls, as InfluxQL OG ignores these values.	2023-06-16 13:10:47 +10:00
Fraser Savage	73c0c28bd0	feat(cli): Add `influxdb_iox debug wal inspect` command This commit adds an `inspect` command to read through the sequenced operations in a WAL file and debug pretty print their contents to stdout, optionally filtering by a sequence number range.	2023-06-09 18:16:57 +01:00
Marko Mikulicic	d26ad8e079	feat: Allow passing service protection limits in create db gRPC call (#7941 ) * feat: Allow passing service protection limits in create db gRPC call * fix: Move the impl into the catalog namespace trait --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-08 14:28:32 +00:00
Andrew Lamb	17c0d837b3	chore: Update DataFusion, arrow, object_store pins (#7942 ) * chore: Update DataFusion, arrow, object_store pins * chore: Update for hakari * chore: Update for new APIs * fix: update test --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-07 17:08:31 +00:00
Stuart Carnie	c18902b05e	Merge branch 'main' into sgc/issue/7829_time_bounds_3	2023-06-07 08:51:38 +10:00
Nga Tran	a2f5f37b2e	test: turn interval 0 test on after upgrading DF with the fix (#7938 ) * test: turn interval 0 test on after upgrading DF with the fix * chore: remove obsolete comments	2023-06-06 15:50:54 +00:00
Stuart Carnie	f114842711	feat: Push outer query time-range to subqueries Added additional end-to-end tests to validate time-range behaviour	2023-06-06 16:33:01 +10:00
Stuart Carnie	9e2550c933	Merge branch 'main' into sgc/issue/7829_time_bounds_3 # Conflicts: # iox_query_influxql/src/plan/planner.rs	2023-06-06 12:55:43 +10:00
Andrew Lamb	f571aeb445	chore: Update DataFusion pin (#7916 ) * chore: Update DataFusion pin * chore: Update cargo * fix: update for API changes * fix: Update plans * chore: Update for new api * fix: Update plans * chore: Update for API changes more --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-05 18:38:59 +00:00
Stuart Carnie	d8c2f2c679	refactor: Simplify `TimeRange` to match InfluxQL OG behaviour explicitly	2023-06-05 15:14:13 +10:00
Stuart Carnie	28166006a8	chore: clippy	2023-06-04 06:56:19 +10:00
kodiakhq[bot]	1d6fd83a9a	Merge branch 'main' into savage/wal-regenerate-lp-catalog-support	2023-06-02 14:23:55 +00:00
Fraser Savage	e9b5708c70	refactor(cli): Perform `regenerate-lp` using a sorted output comparison Query the ingester directly through the test cluster to allow for less brittle assertion of results.	2023-06-02 13:43:44 +01:00
Fraser Savage	50797b6967	test(cli): Assert writing `regenerate-lp` output produces same query results This changes the e2e test to delete the WAL segment file, restart the ingester and ensure the results returned by an ingester query after feeding the regenerated line proto in are the same as those before.	2023-06-02 12:45:52 +01:00
Marco Neumann	efbaf455a0	feat: `selector_first` with additional args (#7898 ) * feat: `selector_first` with additional args Foundation for #7533. * test: `selector_first` malformed args * docs: explain type handling	2023-06-02 10:08:21 +00:00
Nga Tran	21752cfb69	test: reproducer for panic bug attempt to calculate the remainder with a divisor of zero (#7903 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-01 15:43:24 +00:00
Stuart Carnie	600ed6652c	refactor: rewrite time-range expressions to a single range Fixes gap filling, which was confused by multiple lower or upper time bounds.	2023-05-30 15:46:45 +10:00
Christopher M. Wolff	2a07b53879	feat: add more tag predicate rewrite logic for InfluxQL (#7869 ) * feat: add more tag predicate rewrite logic for InfluxQL * chore: cargo fmt * chore: fmt * test: add more tests --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-26 21:53:52 +00:00
Fraser Savage	bf031641c5	feat(cli): Add measurement name lookup to `wal regenerate-lp` command This commit adds support for the CLI to query the namespace and schema APIs to retrieve database and table names from the IDs found in WAL entries being regenerated.	2023-05-26 17:31:19 +01:00
wiedld	7bcde3c544	chore(7618): trace ingester response encoding v2 (#7820 ) * test: integration test for tracing of queries to the ingester * chore: add FlightFrameEncodeRecorder to record spans per each polling result * refactor(trace): impl TraceCollector for Arc Allow any Arc-wrapped TraceCollector implementation to be used as a TraceCollector. This avoids needing to as_any() and downcast later. * test: assert FlightFrameEncodeRecorder trace spans This test exercises the FlightDataEncoder wrapped with the trace decorator (FlightFrameEncodeRecorder) when executing against a data source that yields data after varying numbers of Stream polls. This test passing will validate the FlightFrameEncodeRecorder correctly instruments the amount of time a client spends waiting on the FlightDataEncoder to acquire or encode a protocol frame, but also ensures the decorator correctly accounts for varying behaviours allowed through the Stream abstraction. It does this by simulating a data source that is not always immediately ready to provide data, such as a buffer wrapped in a contended async mutex. * refactor: move tracing decorator into separate mod * fix: record spans * refactor(test): update test The frame encoder is not one-to-one - it emits two frames for the first data payload, a schema and a payload. This commit updates the test to account for it! * refactor: remove unneeded mut ref, and use enum state method which panics when in a (should be unreachable) state * chore: add more docs to FlightFrameEncodeRecorder and related --------- Co-authored-by: Dom Dwyer <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-26 09:40:16 +00:00
Carol (Nichols || Goulding)	c3117e7eb8	fix: Return 'already exists' errors from namespace and table gRPC APIs When appropriate, rather than internal errors.	2023-05-25 13:19:33 -04:00
Marco Neumann	bc18c6dc5f	refactor: re-land #7815 . (#7852 ) * refactor: consolidate pruning code Let's have a single chunk pruning implementation in our code, not two. Also removes a bit of crust from `QueryChunk` since it is technically no longer responsible for pruning (this part has been pushed into the querier for early pruning and bits for the `iox_query_influxrpc` for some RPC shenanigans). * test: regression test for incident * fix: chunk pruning * docs: add some test notes	2023-05-24 09:46:49 +00:00
Dom Dwyer	e61fb3a78c	test: remove line numbers from asserts I don't think the tests are that specific that they need to assert the line.	2023-05-23 14:55:43 +02:00
Stuart Carnie	d9feed3374	Merge branch 'main' into sgc/issue/7794_subquery_inconsistency	2023-05-23 09:52:28 +10:00
kodiakhq[bot]	b9bcaf1aa0	Merge branch 'main' into savage/wal-regenerate-lp-cli-command	2023-05-22 16:18:44 +00:00
Marco Neumann	31b8813760	feat: hide `system.queries` table from prod by default (#7810 ) Introduce a new header called `iox-debug` which when set enables certain debug features. The first one will be the `system.queries` table which is a process-local, namespace-scoped query log. In most prod setups this is only useful for debugging and will confuse the user a lot because when multiple queriers are deployed then the K8s routing decides which pod/process the user hits. This leads to an inconsistent view. However the log is still useful for debugging. This also wires the "debug header set" flag through the Flight ticket, because JDBC proved (integration tests FTW!) that headers are only passed to `GetFlightInfo` but not to `DoGet` and the ticket must encode all the relevant information. Closes #7119. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-22 12:29:24 +00:00

621 Commits (729851be580e8a7813c0f3fe3e574f7cb20a6593)