influxdb

Commit Graph

Author	SHA1	Message	Date
Christopher M. Wolff	c9d40d6b80	feat: find time range in WHERE clause for gap-filling (#6805 ) * feat: add analysis to find time predicates * refactor: propagate time range to gap fill logical node * refactor: propagate time range to GapFillExec * refactor: code review feedback --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-02 23:48:36 +00:00
Stuart Carnie	8e931514fe	feat: Regular expression operator support (#6792 ) * feat: Regular expression operator support * chore: Added additional comments	2023-02-01 23:17:40 +00:00
Stuart Carnie	57f55e14c8	feat: IOx InfluxQL planner learns how to process time range expressions (#6772 ) * feat: IOx learns InfluxQL time-range expression → DF logical Expr IOx now understands how to evaluate an InfluxQL time-range filter expression and transform that to a DataFusion logical expression. * chore: move time range expression to independent functions There is no need for these to be part of the `InfluxQLToLogicalPlan` struct and makes them easier to test. * chore: support scalar now on either side of binary expression * chore: improve error messages * chore: address clippy concerns * chore: add tests for time ranges * chore: add a test where time appears on the right-hand side Ensure time is correctly identified on the right-hand side of a conditional expression. * chore: add tests that specify a timezone * chore: Run cargo hakari tasks * chore: fix linting issues * chore: Remove unnecessary line * chore: Feedback: Add API to parse a conditional expression Based on feedback from @alamb, we don't want to hide the error from parsing a `ConditionalExpression`. To do this, we use the public API, `parse_statements` as a model and provide a new API, `parse_conditional_expression`, which returns a `Result` with the error being a `ParseError`. Additionally, `ConditionalExpression` implements the `FromStr` API using the `parse_conditional_expression` API. * chore: PR feedback reverting this change I believe my intention was to update all instances in the match, but never completed the change. Will leave for another day. * chore: PR feedback add additional comments * chore: rustfmt --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-02-01 00:27:17 +00:00
Andrew Lamb	80f0125940	feat: Add number of rows to explain of RecordBatchesExec (#6781 ) * feat: Add number of rows to explain of RecordBatchesExec * fix: Update test output	2023-01-31 14:26:20 +00:00
Andrew Lamb	5b14caa780	chore: Update DataFusion (#6753 ) * chore: Update datafusion * fix: Update for changes * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-30 14:48:52 +00:00
dependabot[bot]	ed7d02a225	chore(deps): Bump tokio from 1.24.2 to 1.25.0 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.2 to 1.25.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/commits/tokio-1.25.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2023-01-30 01:57:27 +00:00
Andrew Lamb	0d32662eea	chore: Update datafusion again (#6722 ) * chore: Update datafusion * fix: Update for API * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-27 18:59:27 +00:00
Christopher M. Wolff	9a942ceff5	refactor: propagate gapfill stride to exec (#6690 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-24 20:49:29 +00:00
Andrew Lamb	c3bc61f10e	refactor: Move `flightsql` code into its own module, add docs and tests (#6640 ) * refactor: Move `flightsql` code into its own module * fix: get schema from LogicalPlan * refactor: use arrow_flight::sql::Any instead of prost_types::any * fix: cleanup docs and avoid as_ref * fix: Use Bytes * fix: use Any::pack * fix: doclink	2023-01-24 18:24:32 +00:00
Marco Neumann	cb02262b9d	refactor: extract "exec DF plan" and "store stream to file" components (#6663 ) * refactor: extract `PartitionInfo` * refactor: extract DF exec component * feat: add some error conversions * refactor: make fn public * refactor: extract file sink component * fix: clippy	2023-01-23 14:40:35 +00:00
Christopher M. Wolff	6f39ae342e	feat: create a GapFillExec type (#6641 ) * refactor: make gap fill rule avoid aliasing * feat: create a GapFillExec type * refactor: remove unneeded sort node from GapFill rule * chore: code review feedback Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-20 17:44:00 +00:00
Christopher M. Wolff	413e4e4088	feat: create a logical plan node and rule for gap-filling (#6602 ) * feat: create a GapFill logical plan node * feat: create a GapFill optimizer rule * chore: code review feedback * chore: fix issue found after merging main Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-18 17:01:55 +00:00
Andrew Lamb	8410998408	chore: Update datafusion to Jan 17, 2023 (2 / 2) and arrow/parquet `30.0.1` (#6604 ) * chore: Update datafusion to Jan 9, 2023 (2 / 2) and arrow/parquet `30.0.1` * chore: Update for changes in arrow ipc * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-01-18 15:51:24 +00:00
Andrew Lamb	57f08dbccd	chore: Update datafusion to Jan 9, 2023 (1 / 2) (#6603 ) * refactor: Update DataFusion pin to early Jan 2023 * fix: Update tests now that planning is async * fix: Updates for API changes * chore: Run cargo hakari tasks * fix: Update comment * refactor: nicer config setup * fix: gapfill async Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-01-18 12:19:32 +00:00
Stuart Carnie	15a9b4f1e5	refactor: Drop Expr::UnaryOp to simplify tree traversal (#6600 ) * refactor: Drop Expr::UnaryOp to simplify tree traversal The UnaryOp doesn't provide any additional value and complicates walking the AST, as literal values wrapped in a UnaryOp(Minus, ...) require extra handling when reducing time range expressions, etc. This change also is true to the InfluxQL Go implementation, which represents whole number literals as signed integers unless they exceed i64::MAX. * chore: Refactor all usages of format!("{}", ?) to ?.to_string() Per https://github.com/influxdata/influxdb_iox/pull/6600#discussion_r1072028895	2023-01-18 02:27:38 +00:00
Marco Neumann	56c38ba8e1	feat: safely stream data from one tokio runtime to another (#6586 ) * refactor: remove unused code * refactor: make fn private * feat: safely stream data from one tokio runtime to another Closes #6577. * refactor: review comments Co-authored-by: Andrew Lamb <alamb@influxdata.com> * docs: improve * test: explain * test: make tests more tricky * refactor: improve error message Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-17 10:32:46 +00:00
Stuart Carnie	3f6bb3e330	feat: Parse IANA timezones in an InfluxQL TZ clause (#6585 ) * feat: Parse IANA timezone strings to chrono_tz::Tz * feat: Visitors can customise the return error type This avoids having to remap errors from `&'static str` to the caller's error type, and will be used in a future PR for time range expressions. * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-01-15 22:00:41 +00:00
Marco Neumann	bc030150f5	refactor: improve executor panic/error handling (#6582 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-13 07:10:53 +00:00
Stuart Carnie	81ffb3edb5	chore: move walk and the mutable variant to the parser crate (#6575 ) It is generally a useful API to be core to the InfluxQL parser crate.	2023-01-12 21:06:06 +00:00
Marco Neumann	e2da573dcf	refactor: improve thread naming (#6579 ) - name exec driver thread (instead of using the default that `thread::spawn` gives us) - provide number to every worker thread (both for the dedicated executor and for the main runtime) - shorten thread names (current naming too long for most debug tools)	2023-01-12 14:22:49 +00:00
Stuart Carnie	66047f4372	feat: InfluxQL learns how to plan some InfluxQL queries (#6520 ) * feat: InfluxQL learns how to plan some queries Also added a means to test the planner and execution * chore: Update module docs * chore: Document the planner functions * chore: Update end_to_end_cases crate * chore: Clarify why `SLIMIT` and `SOFFSET` return `NotImplemented` * chore: Address lint issues * chore: Fix rustdoc link issue * chore: Remove InfluxQL tests from query_tests crate Will follow conventions established by @carols10cents when new query_tests crate is merged. * chore: `now` field `now` is a DataFusion built-in scalar function * chore: remove unused code * chore: Add additional arithmetic expression tests * chore: Establish pattern for identifying and tracking InfluxQL issues * chore: Add tests for case sensitivity issues * chore: group tests into modules and functions This avoids mass rewriting of insta snapshots as new tests are added to each function. When tests are added in the middle, existing snapshots are renamed (-N+1, -N+2, etc) resulting in having to review numerous additional snapshots.	2023-01-11 02:50:49 +00:00
dependabot[bot]	b49cc2e35e	chore(deps): Bump tokio from 1.24.0 to 1.24.1 (#6545 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.0 to 1.24.1. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 09:48:44 +00:00
Andrew Lamb	29df5d7fcb	fix: create statistics for nulled columns in RecordBatchExec (#6527 )	2023-01-09 07:37:10 +00:00
Raphael Taylor-Davies	e1036a0c63	refactor: cleanup schema boxing (#6511 ) * refactor: cleanup Schema boxing * chore: clippy	2023-01-06 10:57:39 +00:00
Raphael Taylor-Davies	2037db7f7b	refactor: decouple influxql from SchemaProvider (#6507 ) * refactor: decouple influxql from SchemaProvider * refactor: reorder arguments * refactor: use QueryNamespaceMeta Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-06 07:36:29 +00:00
Marco Neumann	25f275f1b0	refactor: improve influxRPC warning logging (#6493 ) The current version is barely readable because the logged schema w/ all its metadata is soooo long. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-04 16:52:32 +00:00
Stuart Carnie	aacd91db94	feat: Teach InfluxQL rewriter how to name columns (#6481 ) * feat: Add timestamp data type * feat: Add with_quiet API to suppress output to STDOUT * fix: Field name resolution to match InfluxQL * refactor: Allow TestChunks to be directly accessed This will be useful when testing the InfluxQL planner. * fix: Add Timestamp case to var_ref module * feat: Add InfluxQL compatible column naming * chore: Add doc comment. * fix: keywords may be followed by a `!` such as `!=` * fix: field_name improvements * No longer clones expressions * Explicitly handle all Expr enumerated items * more tests * fix: collision with explicitly aliased column Fixes case where column is explicitly aliased to an auto-named variant. Test case added to validate.	2023-01-04 00:55:18 +00:00
Andrew Lamb	dbe52f1ca1	chore: Upgrade datafusion (#6467 ) * chore: Update datafusion * fix: Update for new apis * chore: Update expected plan * fix: Update for new config construction * chore: update clippy * fix: Fix error codes * fix: update another test * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-03 15:29:11 +00:00
Stuart Carnie	4add55d39e	feat: InfluxQL planner learns how to normalise InfluxQL AST (#6236 ) * chore: Move logic to context, in line with DataFusion SQL * chore: Add ordering for InfluxQL data types Ordering is used to determine automatic casting operations. If two field columns are present in an expression, one float and one integer, the integer should be cast to a float, such that the final expression will be a float. * chore: Add DerefMut trait to collection types Will allow these collections to be mutated when traversing the InfluxQL AST. * chore: Add influxql module with initial AST normalisation implementation * chore: Add more unit tests and docs * chore: Run cargo hakari tasks * chore: Fix link * chore: Support regular expression expansion and Call expressions * chore: Add tests for walk_expr functions * chore: Add insta snapshot files * chore: Add docs and make API accessible to the crate * chore: Move to Arc<dyn SchemaProvider> for use in influxql planner * chore: Move code back; it is better encapsulated here * chore: Remove redundant attribute * chore: Improve regex compatibility with InfluxQL / Go * chore: Style improvement. Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-01-02 23:48:21 +00:00
Carol (Nichols || Goulding)	46ff8854ec	fix: Use code backticks around invalid HTML tags in doc strings	2022-12-21 16:36:17 -05:00
Carol (Nichols || Goulding)	bfc74db94c	fix: Use into_values function. Thanks clippy!	2022-12-21 14:32:35 -05:00
Andrew Lamb	d0d5906476	chore: Update datafusion pin (#6442 ) * chore: Update datafusion pin * refactor: Update iox_query for new apis * chore: Update some more apis * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-12-19 20:02:42 +00:00
Dom Dwyer	933ab1f8c7	feat(ingester2): optimal persist parallelism This commit changes the behaviour of the persist system to enable optimal parallelism of persist operations, and improve the accuracy of the outstanding job bound / back-pressure. Previously all persist operations for a given partition were consistently hashed to a single worker task. This serialised persistence per partition, ensuring all updates to the partition sort key were serialised. However, this also unnecessarily serialises persist operations that do not need to update the sort key, reducing the potential throughput of the system; in the worst case of a single partition receiving all the writes, only one worker would be persisting, and the other N-1 workers would be idle. After this change, the sort key is inspected when enqueuing the persist operation and if it can be determined that no sort key update is necessary (the typical case), then the persist task is placed into a global work queue from which all workers consume. This allows for maximal parallelisation of these jobs, and removes the per-worker head-of-line blocking. In the case that the sort key does need updating, these jobs continue to be consistently hashed to a single worker, ensuring serialised sort key updates only where necessary. To support these changes, the back-pressure system has been changed to account for all outstanding persist jobs in the system, regardless of type or assigned worker - a logical, bounded queue is composed together of a semaphore limiting the number of persist tasks overall, and a series of physical, unbounded queues - one to each worker & the global queue. The overall system remains bounded by the INFLUXDB_IOX_PERSIST_QUEUE_DEPTH value, and is now simpler to reason about (it is independent of the number of workers, etc).	2022-12-15 18:30:51 +01:00
Marco Neumann	a5d693eba2	feat: lower Influx regex expressions to DF regex expressions (#6394 ) * feat: lower Influx regex expressions to DF regex expressions For #6388. * refactor: address review comments	2022-12-15 09:33:28 +00:00
Andrew Lamb	8729977851	chore: Upgrade datafusion / arrow to 29.0.0 to get flightsql client (#6396 ) * chore: Update datafusion pin * chore: Update for API change * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-12-13 20:16:09 +00:00
Marco Neumann	65687bf0fa	test: regex baseline test (#6389 ) For #6388. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-13 17:42:31 +00:00
Andrew Lamb	9175f4a0b5	chore: Upgrade datafusion to get correct support for multi-part identifiers (#6349 ) * test: add tests for periods in measurement names * chore: Update Datafusion * chore: Update for changed APIs * chore: Update expected plan output * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-08 11:27:13 +00:00
Marco Neumann	c25afda6cc	fix: `GroupGenerator`/`Converter` panic (#6351 ) Do not poll a ready future.	2022-12-08 11:08:21 +00:00
Marco Neumann	080aff8f71	fix: account for memory allocations in InfluxRPC group outputs (#6345 ) * fix: account for memory allocations in InfluxRPC group outputs This should prevent the querier from OOMing. See https://github.com/influxdata/idpe/issues/16614 . * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * refactor: pull out constant Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-12-08 09:55:31 +00:00
dependabot[bot]	1d38d400f0	chore(deps): Bump object_store from 0.5.1 to 0.5.2 (#6339 ) * chore(deps): Bump object_store from 0.5.1 to 0.5.2 Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.1 to 0.5.2. - [Release notes](https://github.com/apache/arrow-rs/releases) - [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md) - [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2) --- updated-dependencies: - dependency-name: object_store dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-06 07:53:54 +00:00
Marco Neumann	f62b270852	fix: gRPC errors regarding group cols (#6314 ) * fix: gRPC errors regarding group cols - missing group col prev. produced an "internal error" but should be "invalid argument" - duplicate group cols produced a panic but should also be "invalid argument" * docs: clarify	2022-12-06 07:36:32 +00:00
Marco Neumann	cd6a8a1a82	refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics (#6313 ) * refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics Closes #6310. * refactor: rename and tune default exec mem limits * fix: ingester2 bits after rebase	2022-12-05 12:38:28 +00:00
Marco Neumann	942a6100b5	fix: check schemas in `pretty_print_batches` (#6309 ) * fix: check schemas in `pretty_print_batches` I think most users of this function (and `assert_batches_eq`) assume that all batches have the same schema. If not, `pretty_print_batches` may either fail producing an actual table (some rows may have more or less columns) or silently produce a table that looks "alright". * fix: equalize schemas where it is required/desired Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-02 12:14:16 +00:00
Marco Neumann	ec2e72d223	test: simplify test executors (#6312 ) Have a single global test executor w/ reasonable defaults. Also don't require tests to join/await executor shutdowns (most tests forget this anyways and will get a runtime warning). Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-02 11:38:18 +00:00
Marco Neumann	ab4f910111	refactor: improve DF error handling (#6311 ) This is required to extract "resource exhausted" errors in more cases.	2022-12-02 11:25:30 +00:00
Marco Neumann	e2168ae859	refactor: stream-based series-set conversion (#6285 ) * refactor: stream-based series-set conversion Closes #6216. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * refactor: improve algo docs and tests * test: fix after rebase * fix: broken `Series` conversion when slices are present Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-01 17:24:36 +00:00
Andrew Lamb	d0f1f6a4fd	chore: Upgrade datafusion to get memory limits (#6297 ) * chore: Update datafusion * fix: use correctly qualified column names * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-01 16:40:26 +00:00
Marco Neumann	01315bc063	refactor: bring back "stream-based `SeriesSetConvert::convert` interface (#6282 )" (#6301 ) This reverts commit `4a8bb871dc`. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-01 14:27:43 +00:00
Marco Neumann	6cecc439d4	refactor: revert "simplify `SeriesSet` (#6277 )" (#6298 ) This reverts commit `c41200536e`.	2022-12-01 13:30:19 +00:00
Marco Neumann	4a8bb871dc	refactor: revert stream-based `SeriesSetConvert::convert` interface (#6282 ) This reverts commit `dad6dee924`.	2022-12-01 12:51:56 +01:00


187 Commits (a9433302dd639b2dc3250a9794f8364cda3f3ef6)