influxdb

Commit Graph

Author	SHA1	Message	Date
Andrew Lamb	4a1f8db254	chore: Update datafusion + arrow/arrow-flight/parquet to patched version `42.0.0` (#8113 ) * Revert "Revert "chore: Update datafusion + arrow/arrow-flight/parquet to version `42.0.0` (#8036)" (#8049)" This reverts commit `fb0674fc01`. * chore: Update Cargo and hakari * chore: Update to patched version --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-30 12:59:31 +00:00
Marco Neumann	b982ee180e	refactor: remove `QueryChunk::column_names` (#8109 ) This interface was once specially implemented by the RUB. The only actual implementation of it is within the querier that just forwards it to a simple schema scan. Lift this semantic to `iox_query_influxrpc` instead so all the chunks can use it. If we ever want to optimize this again, we should use `QueryChunk::data` instead (i.e. instead of implementing it within the chunk it should use the data method and do something smart based on that). First half of #8096.	2023-06-29 13:43:10 +00:00
Marco Neumann	ca31c1eade	feat: hook up tokio metrics (#8050 ) * feat: metrics for main tokio runtime * feat: instrument executor tokio runtime --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-29 11:11:44 +00:00
Marco Neumann	dcb4a9bb5c	refactor: fuse `QueryChunk` and `QueryChunkMeta` (#8107 ) Closes #8095.	2023-06-29 11:02:48 +00:00
Marco Neumann	4638b89d93	refactor: migrate retention to proper predicates (#8092 ) Do not (ab)use per-chunk delete predicates for the retention policy. Instead use a per-table predicate. This makes the code way cleaner, since the scoping is correct (i.e. delete predicates are a table-wide attribute, not a chunk-based one) and it is consistent with the time predicates that the user provides (e.g. via `WHERE time > x`). It also allows us to remove delete predicates (in their current, non-scalable form) from the query path. A potential future version would likely not use per chunk predicates (and "is processed" markers) but use the timestamp / chunk order to determine to which data the predicate should be applied. Note that the lowering of the retention policy changed slightly from ```text (time > (now() - retention)) AND (time < MAX) ``` to ```text time > (now() - retention) ``` Since the `MAX` cut is just an artifact of the lowering and was unnecessary. Closes #7409. Closes #7410.	2023-06-29 08:36:37 +00:00
dependabot[bot]	b15c6062a9	chore(deps): Bump tokio from 1.28.2 to 1.29.0 (#8100 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.28.2 to 1.29.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.28.2...tokio-1.29.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-06-28 13:18:08 +00:00
dependabot[bot]	990044dcb2	chore(deps): Bump indexmap from 1.9.3 to 2.0.0 (#8073 ) * chore(deps): Bump indexmap from 1.9.3 to 2.0.0 Bumps [indexmap](https://github.com/bluss/indexmap) from 1.9.3 to 2.0.0. - [Changelog](https://github.com/bluss/indexmap/blob/master/RELEASES.md) - [Commits](https://github.com/bluss/indexmap/compare/1.9.3...2.0.0) --- updated-dependencies: - dependency-name: indexmap dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-26 08:52:51 +00:00
Marco Neumann	7322f238fb	docs: query processing (#8033 ) * docs: query processing Closes https://github.com/influxdata/idpe/issues/17770 . * docs: apply recommendations Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com> * docs: improve description of the flight protocol * docs: link `LogicalPlan` * docs: link `ExecutionPlan` * docs: improve wording * docs: improve query planning docs --------- Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2023-06-23 09:13:14 +00:00
dependabot[bot]	74a48a8f63	chore(deps): Bump itertools from 0.10.5 to 0.11.0 (#8060 ) * chore(deps): Bump itertools from 0.10.5 to 0.11.0 Bumps [itertools](https://github.com/rust-itertools/itertools) from 0.10.5 to 0.11.0. - [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-itertools/itertools/compare/v0.10.5...v0.11.0) --- updated-dependencies: - dependency-name: itertools dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-23 08:11:56 +00:00
Andrew Lamb	fb0674fc01	Revert "chore: Update datafusion + arrow/arrow-flight/parquet to version `42.0.0` (#8036 )" (#8049 ) This reverts commit `70ffedadc7`. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-22 11:03:25 +00:00
Andrew Lamb	70ffedadc7	chore: Update datafusion + arrow/arrow-flight/parquet to version `42.0.0` (#8036 ) * chore: Update datafusion + arrow/arrow-flight/parquet to version `42.0.0` * chore: Update for new APIs --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-21 16:11:36 +00:00
Stuart Carnie	e10b8c93c8	chore: Update DataFusion and other dependencies (#8014 ) * chore: Update DataFusion pin * chore: Update API changes * chore: Don't use deprecated API * chore: Run cargo hakari tasks * chore: Update tests due to changes in logical plan nodes from DF update * chore: Fix broken links in docs * chore: Adjust changes to expected output --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2023-06-16 10:39:36 +00:00
Andrew Lamb	5889c96501	chore: Update `datafusion` and other dependencies (#7981 ) * chore: Update DataFusion pin * chore: Update other dependencies * chore: Update hakari * fix: Update for API changes * fix: Update explain plan * fix: Update influxql plans * fix: rustdoc links --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-16 09:48:55 +00:00
Marco Neumann	8ef1b64f6a	fix: remove de-dup even if we have many partitions (#8004 ) See optimizer pipeline here: `5d0bb68c5b/iox_query/src/physical_optimizer/mod.rs (L33-L35)` After generating the naive initial plan w/ many partitions, we must consider more than 100 partitions to split the key space. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-15 15:25:25 +00:00
Andrew Lamb	17c0d837b3	chore: Update DataFusion, arrow, object_store pins (#7942 ) * chore: Update DataFusion, arrow, object_store pins * chore: Update for hakari * chore: Update for new APIs * fix: update test --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-07 17:08:31 +00:00
Andrew Lamb	f571aeb445	chore: Update DataFusion pin (#7916 ) * chore: Update DataFusion pin * chore: Update cargo * fix: update for API changes * fix: Update plans * chore: Update for new api * fix: Update plans * chore: Update for API changes more --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-05 18:38:59 +00:00
Stuart Carnie	da682d8c53	chore: clippy 🧠	2023-06-04 07:23:11 +10:00
Marco Neumann	fa5011197c	refactor: migrate `iox_query` to use DataFusion statistics (#7908 ) This is the major part of #7470. Additional clean ups (e.g. to remove the actual types from `data_types`) will follow. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-02 09:18:59 +00:00
Marco Neumann	72ff001d33	feat: aggregator for DataFusion statistics (#7904 ) * feat: aggregator for DataFusion statistics Required to implement #7470, esp. to implement the statistics folding done within `RecordBatchesExec`. * docs: improve	2023-06-01 16:11:30 +00:00
Andrew Lamb	a48f681e56	feat(parquet): reduce and limit buffering when writing parquet files (#7880 ) * feat: limit buffering when writing parquet files ("combined solution") * chore: Run cargo hakari tasks --------- Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-31 13:27:32 +00:00
Andrew Lamb	1ff76b7bf2	chore: use workspace dependencies for `object_store`	2023-05-26 07:03:42 -04:00
Marco Neumann	bc18c6dc5f	refactor: re-land #7815 . (#7852 ) * refactor: consolidate pruning code Let's have a single chunk pruning implementation in our code, not two. Also removes a bit of crust from `QueryChunk` since it is technically no longer responsible for pruning (this part has been pushed into the querier for early pruning and bits for the `iox_query_influxrpc` for some RPC shenanigans). * test: regression test for incident * fix: chunk pruning * docs: add some test notes	2023-05-24 09:46:49 +00:00
Dom Dwyer	928a4d163e	build: remove unused dependencies from crates This commit fixes loads of crates (47!) that had unused dependencies, or mis-configured dependencies (test deps as normal deps). I added the "unused_crate_dependencies" lint to all crates to help prevent this mess from growing again! https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html This has the minor downside of false-positives when specifying dev-dependencies for test/bench binaries - these are files in /test or /benches (not normal tests). This commit includes a workaround, importing them in lib.rs (gated by a feature flag). I think the trade-off of better dependency management is worth it!	2023-05-23 14:55:43 +02:00
Marco Neumann	6c0f50a473	revert: refactor: consolidate pruning code (#7815 ) (#7847 ) This reverts commit `db9fe92981`. Likely causing an incident, see https://app.incident.io/incidents/267 .	2023-05-23 08:01:53 +00:00
Marco Neumann	db9fe92981	refactor: consolidate pruning code (#7815 ) Let's have a single chunk pruning implementation in our code, not two. Also removes a bit of crust from `QueryChunk` since it is technically no longer responsible for pruning (this part has been pushed into the querier for early pruning and bits for the `iox_query_influxrpc` for some RPC shenanigans).	2023-05-22 08:42:20 +00:00
Andrew Lamb	6344fe8c3f	chore: Add rationale for `clippy::future_not_send` (#7822 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-18 16:58:56 +00:00
Marco Neumann	7e64264eef	refactor: remove `RedudantSort` optimizer pass (#7809 ) * test: add dedup test for multiple partitions and ranges * refactor: remove `RedudantSort` optimizer pass Similar to #7807 this is now covered by DataFusion, as demonstrated by the fact that all query tests (incl. explain tests) still pass. The good thing is: passes that are no longer required don't require any upstreaming, so this also closes #7411.	2023-05-17 09:30:04 +00:00
Marco Neumann	931b4488bd	refactor: remove `SortPushdown` optimizer pass (#7807 ) DataFusion is now smart enough to do that using the builtin passes. No `EXPLAIN` tests regressed. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-17 06:26:31 +00:00
Nga Tran	ca12f1c03d	fix: correctly recurse in `ParquetSortness` (#7778 ) * test: reproducer for idpe_17556 * fix: `ParquetSortness` and partial opt 1. correctly handle cases where `ParquetSortness` would optimize one child branch but not the other 2. handle cases where `ParquetSortness` recursion should stop a bit more clearly (using `TreeNodeRewriter`) 3. rename query tests to be a bit clearer 4. add test case with many (but not too many) duplicate files and an ingester (basically a prod use case where the compactor is slightly behind) --------- Co-authored-by: Marco Neumann <marco@crepererum.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-17 06:09:23 +00:00
Marco Neumann	d3ff945117	refactor: remove output sorting from scan provider (#7798 ) This is somewhat a left-over from the old phys. plan construction where we tried to fold in the sorts at the right place. Now the optimizer takes care of that, so we can just express this as a standard logical node (the same as SQL and InfluxQL). This makes the plan construction a bit cleaner since the actual scan provider only performs the minimal work that is required by DataFusion and the users (SQL, InfluxQL, reorg) request what they actually need. The tests in `iox_query::frontend::reorg::tests` still pass and prove that the actual physical plans are identical w/ this approach. Closes #7785. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-17 05:59:04 +00:00
Chunchun Ye	2bb6445668	chore: update DataFusion and arrow / arrow-flight / parquet to `39.0.0` (#7793 ) * chore: update DataFusion and arrow/parquet/arrow-flight to 39.0.0 * chore: update DataFusion and arrow/parquet/arrow-flight to 39.0.0 in workspace-hack/Cargo.toml * chore: Run cargo hakari tasks * chore: fix CI test and lint * chore: update csv schema * refactor: remove type-annotate for `Arc` --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-16 13:42:26 +00:00
Andrew Lamb	7735e7c95b	chore: Update DataFusion again (#7777 ) * chore: Update datafusion again * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-15 12:38:45 +00:00
Andrew Lamb	2860d87fe1	chore: Update DataFusion (#7756 ) * chore: Update DataFusion pin * chore: Update explain plans * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-05-05 18:58:18 +00:00
Christopher M. Wolff	55b35367ac	test: add test for gap fill query missing time bounds (#7747 ) * test: add test for gap fill query missing time bounds * chore: update unit test --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-04 21:01:45 +00:00
Christopher M. Wolff	05688799c4	fix: handle aliases in gapfill aggregate columns (#7725 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-03 15:20:14 +00:00
Andrew Lamb	2b1f8b56e2	chore: Update DataFusion (#7719 ) * chore: Update DataFusion * chore: update for API change * chore: update some tests * fix: Update plans in optimizer * chore: Update plans * chore: Update error messages * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-05-02 17:55:04 +00:00
Andrew Lamb	530ee94558	fix: use correct sort key in projection_pushdown (#7718 ) * fix: use correct sort key in projection_pushdown * fix: tabs in docs * refactor: Use Serde to format test results	2023-05-02 16:50:04 +00:00
Christopher M. Wolff	493b26831d	fix: make influx RPC interface break up series into multiple frames (#7691 ) * fix: make influx RPC interface break up series into multiple frames * refactor: code review feedback --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-01 20:18:05 +00:00
Marco Neumann	0556fdae53	refactor: remove `QueryChunk::partition_sort_key` (#7680 ) As of #7250 / #7449 the partition sort key is no longer required for query planning. Instead we use a combination of `QueryChunk::partition_id` and `QueryChunk::sort_key` which is more robust and easier to reason about. Removing it simplifies the querier code a lot since we no longer need to have a sort key for the ingester chunks and also don't need to "sync" the sort key between chunks for consistency.	2023-04-27 10:54:41 +00:00
dependabot[bot]	bdf7f316d7	chore(deps): Bump tokio from 1.27.0 to 1.28.0 (#7667 ) * chore(deps): Bump tokio from 1.27.0 to 1.28.0 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.27.0 to 1.28.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.27.0...tokio-1.28.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Dom <dom@itsallbroken.com>	2023-04-26 12:53:26 +00:00
Christopher M. Wolff	7a6862ee3a	refactor: let date_bin_gapfill allow omitted origin (#7595 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-19 14:56:52 +00:00
Marco Neumann	d7dc305972	feat: allow overwriting DataFusion's default config (#7586 ) This is helpful to test changes in our defaults but also for testing. Required for https://github.com/influxdata/idpe/issues/17474 . Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-18 11:28:45 +00:00
Andrew Lamb	f46d06d56f	chore: Update DataFusion + arrow ecosystem to 37 (#7544 ) * chore: Update datafusion and arrow/parquet to 37, tonic to 0.9.1 * refactor: Update for FieldRef and other API changes * fix: Update field size calculation * fix: Use `NullBuffer` directly * fix: remove outdated comment * chore: Update test for tonic * chore: Run cargo hakari tasks * chore: cargo update --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-14 12:43:01 +00:00
Andrew Lamb	134ff2ef83	chore: update DataFusion pin (right before arrow 37 update) (#7540 ) * chore: update DataFusion pin * refactor: Update for deprecated API * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-13 17:25:24 +00:00
Andrew Lamb	3ebd07358b	chore: Update DataFusion pin, upgrade `date_bin` and `InfluxQL` to use `Interval(MonthDayNano)` (#7516 ) * chore: Update datafusion * chore: Update for change in PhysicalSortExpr * refactor: Update date_bin_gapfill to take IntervalMonthDayNano, fix FlightSQL * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-13 10:43:32 +00:00
Christopher M. Wolff	cbd747db44	feat: update gap fill planner rule to use `interpolate` (#7494 ) * feat: add INTERPOLATE fn and update planner gap-fill planner rule * test: add an end-to-end test for interpolate()	2023-04-12 21:51:44 +00:00
Christopher M. Wolff	0937615dba	fix: make interpolate() fill null values in input (#7490 ) * fix: make interpolate() fill null values in input * chore: cargo doc	2023-04-12 21:41:11 +00:00
Christopher M. Wolff	3e60369eff	refactor: input buffering for gap filling interpolate null-as-missing (#7478 ) * refactor: move logic for knowing how much to buffer into GapFiller * chore: clippy * chore: add some clarifying comments * refactor: clean up relationships between gap filling types * refactor: remove use of RefCell from BufferedInput --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-12 21:08:51 +00:00
Andrew Lamb	8c42fedf33	chore: Remove dead code (#7475 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-11 10:44:49 +00:00
Andrew Lamb	1a80b8073c	fix: Improve span names for query access (#7476 ) * fix: Improve span names for query access * fix: update test --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-11 10:34:09 +00:00
Marco Neumann	5f43f2a719	refactor: remove old query planning code (#7449 ) Closes #7406. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-06 16:05:08 +00:00
Marco Neumann	30b1878171	test: `ChunkTableProvider::scan` + fix "not dedup" (#7448 ) 1. Add loads of tests for `ChunkTableProvider::scan` (= the naive phys. plan before running any phys. optimizers) 2. Fix interaction of "no de-dup" and predicate pushdown. This might be used by the ingester at some point and I would like to have this correct before someone silently introduces a bug by pushing field predicates into the ingester. This is mostly prep-work for #7406 so I know that test coverage is sufficient.	2023-04-06 08:39:53 +00:00
Andrew Lamb	e8b7d69b0f	chore: Update datafusion again (#7442 ) * chore: Update datafusion * chore: Fix up plans for datafusion API change * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-05 18:21:53 +00:00
Andrew Lamb	94d390f31e	test: Add additional tests for reorg plans (#7444 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-05 11:15:23 +00:00
Christopher M. Wolff	d57a4f8947	refactor: make null-as-missing default behavior for LOCF (#7443 ) * refactor: make null-as-missing default behavior for LOCF * test: update InfluxQL test --------- Co-authored-by: Christopher Wolff <cwolff@athena.tail244ec.ts.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-04 18:03:09 +00:00
Andrew Lamb	badc8865ef	chore: Update datafusion again (#7440 ) * chore: Update DataFusion * chore: Update for new API * chore: Run cargo hakari tasks * fix: cargo doc --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-04-04 15:45:46 +00:00
dependabot[bot]	66982f988b	chore(deps): Bump object_store from 0.5.5 to 0.5.6 (#7433 ) Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.5 to 0.5.6. - [Release notes](https://github.com/apache/arrow-rs/releases) - [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md) - [Commits](https://github.com/apache/arrow-rs/commits) --- updated-dependencies: - dependency-name: object_store dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Dom <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-04 08:43:34 +00:00
Marco Neumann	e9bdf96457	refactor: remove DF-clean-DF phys. optimizer pass hack (#7428 ) As discussed in https://github.com/influxdata/influxdb_iox/pull/7250#discussion_r1155684471 Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-04-04 08:09:35 +00:00
Marco Neumann	f04962d630	feat: new query planning (#7250 ) Closes #6098.	2023-04-03 10:31:03 +00:00
Marco Neumann	e3b802cd25	feat: "parquet sortness" optimizer pass (#7383 ) * feat: "parquet sortness" optimizer pass Trade wider fan-out for not having to fully sort parquet files. For #6098. * test: rename Co-authored-by: Andrew Lamb <alamb@influxdata.com> --------- Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-31 08:01:33 +00:00
Marco Neumann	2d7bff91b5	feat: allow gap-fill logical opt. to handle inline filters (#7384 ) With #6098 our `TableProvider` will declare `supports_filter_pushdown` as "exact" since we handle the predicate pushdown ourselves. This has two effects: 1. The phys. plan no longer contains an additional `FilterExec` node even if we already do all the correct filtering. This will improve performance. 2. The logical plan no longer contains a `Filter` node but instead the predicate is part of the `TableScan`. This simplifies the logical plan. For (2) we need to adjust the gap fill logical optimizer to find the time range again. Otherwise the optimizer pass will fail (which is currently somewhat swallowed by DataFusion even though it is logged) and the physical plan will contain our placeholder UDFs that are not executable. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-31 06:09:51 +00:00
Marco Neumann	d2f3f279f3	fix: projection pushdown w/ resorting (#7381 ) We should resort properly when performing projection pushdown. Extended test utils to actually catch this by checking the plan schemas. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-30 10:24:23 +00:00
dependabot[bot]	9cbcdc7672	chore(deps): Bump tokio from 1.26.0 to 1.27.0 (#7373 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.26.0 to 1.27.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.26.0...tokio-1.27.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-30 09:36:04 +00:00
Marco Neumann	066c3280eb	fix: phys. optimizers must respect sort partitioning (#7362 ) * fix: sort pushdown must preserve partitioning * fix: projection pushdown must preserve sort partitioning --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-30 08:10:21 +00:00
Stuart Carnie	19a0c7fe9c	feat: Teach InfluxQL how to process `FILL(null\|previous\|<value>)` (#7359 ) * chore: Publicise gap-filling APIs Helps #6916 * feat: IOx learns `FILL(null\|previous\|<value>)` Helps #6916 * chore: More test cases * chore: Revert change to TreeNodeVisitor * chore: Update snapshot with expected gap-filling changes	2023-03-29 23:11:20 +00:00
Christopher M. Wolff	f41c1a7945	feat: update gap fill planner rule to use LOCF (#7358 ) * feat: update gap fill planner rule to use LOCF * chore: cargo fmt --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-29 15:45:34 +00:00
Marco Neumann	39856ad432	fix: projection pushdown should project `ParquetExec` ordering (#7356 ) * fix: projection pushdown should project `ParquetExec` ordering Bug found while working on the final steps for #6098. * fix: Update expected output * test: make test even harder --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2023-03-29 09:05:19 +00:00
Marco Neumann	52e54e0f8d	feat: more aggressive `CombineChunks` (#7355 ) Try to combine chunks even when not all Union-arms/inputs are combinable. This will later help to transform ```yaml --- union: - parquet: files: [f1] - parquet: files: [f2] - dedup: parquet: files: [f3] ``` into ```yaml --- union: - parquet: files: [f1, f2] - dedup: parquet: files: [f3] ``` Helps #6098. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-29 06:24:17 +00:00
Andrew Lamb	43e236e040	chore: Update datafusion again (#7353 ) * chore: Update DataFusion * refactor: Update predicate crate for new transform API * refactor: Update iox_query crate for new APIs * refactor: Update influxql for new API * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-28 16:21:49 +00:00
Christopher M. Wolff	dbf6493312	feat: add scalar function LOCF (#7347 ) * feat: add scalar function LOCF * chore: cargo update spin@0.9.6 Apparently this version was yanked	2023-03-28 14:35:27 +00:00
Marco Neumann	71b88b22b9	fix: ensure we don't lose predicates in chunk roundtrips (#7340 ) `extract_chunks` never runs after predicate pushdown. However IF this should ever happen, we would potentially forget the predicates attached to `ParquetExec`. So let's make sure we refuse chunk extraction in this case. This is similar to the existing behavior, i.e. we don't support chunk extraction after filter pushdown (i.e. if there is a filter around a `RecordBatchesExec`). For #6098. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-27 11:18:56 +00:00
Christopher M. Wolff	f73187ff7e	feat: add interpolation fill strategy to GapFillExec (#7317 ) * feat: add interpolation fill strategy to GapFillExec * chore: clippy * chore: code review feedback * chore: fix doc comments --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-24 18:53:14 +00:00
Andrew Lamb	5dd71998a1	chore: Update datafusion (#7318 ) * chore: Update datafusion * chore: Update for API change * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-24 15:07:23 +00:00
Andrew Lamb	184565b552	feat(flightsql): Implement FlightSQL `GetSqlInfo` endpoint (#7198 ) * feat(flightsql): Implement GetSqlInfo endpoint * chore: Add some comments to clarify the tests intent	2023-03-20 19:34:18 +00:00
Christopher M. Wolff	866f9cefa1	feat: add null-as-missing gap filling (#7245 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-17 20:34:45 +00:00
Andrew Lamb	96c2094302	refactor(iox_query): extract influxrpc planner to its own crate (#7241 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-17 18:48:55 +00:00
Marco Neumann	20ec47b00b	feat: virtual chunk order col (#7240 ) * feat: introduce `CHUNK_ORDER_COLUMN_NAME` * feat: impl `ChunkOrder` everywhere * feat: `ChunkOrder::get` * feat: emit chunk order column for `RecordBatchesExec` * feat: `chunk_order_field` * feat: chunk order col for parquet chunks * feat: optional chunk order col handling for dedup --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-17 09:39:21 +00:00
Marco Neumann	3e40de3cd4	feat: recover desired output sort in `extract_chunks` (#7233 ) This is helpful so that optimizer passes can forget the sort key, esp. when they run after `DedupNullColumns` and `DedupSortOrder`. For #6098. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-17 09:19:10 +00:00
Andrew Lamb	3fb4fad784	refactor(iox_query): Rename `prepare_sql` to `sql_to_physical_plan` (#7226 ) * refactor(iox_query): Rename `prepare_sql` to `sql_to_physical_plan` * fix: logical conflict	2023-03-16 19:12:15 +00:00
Andrew Lamb	7dfaa05e8a	chore: Update datafusion again (#7208 ) * chore: update datafusion again * fix: update test * fix: use table_reference * fix: clean up import * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-16 14:34:40 +00:00
Marco Neumann	45d23f7652	refactor: `extract_chunks` return arrow schema (#7231 ) Similar to #7217 there is no need to convert the arrow schema to an IOx schema. This also makes it easier to handle the chunk order column in #6098. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-16 14:19:52 +00:00
Marco Neumann	f128539f98	feat: more projection pushdown (#7218 ) * feat: proj->proj pushdown For #6098. * feat: proj->SortPreservingMergeExec pushdown For #6098. --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-16 08:59:48 +00:00
Marco Neumann	3a31f41c2c	refactor: use arrow schema in `chunks_to_physical_nodes` (#7217 ) We don't need a validated IOx schema in this method. This will simplify some work on #6098. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-16 08:45:14 +00:00
Andrew Lamb	6d6fd8f663	feat(flightsql): implement basic `CommandGetCatalogs` support (#7212 ) * refactor: reduce redundancy in test * chore: implement basic get_catalog support * fix: clippy	2023-03-15 21:52:59 +00:00
Marco Neumann	393de6980e	feat: debug-log errors during chunk extraction (#7223 ) Helps debugging while working on #6098 .	2023-03-15 18:55:33 +00:00
Christopher M. Wolff	afb571a502	feat: implement gap fill with previous value (#7182 ) * feat: implement gap fill with previous value * test: update fill prev test to include null value --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-15 15:54:59 +00:00
Christopher M. Wolff	570c61f9a7	refactor: formalize abstraction for building gap filled columns (#7179 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-14 15:14:02 +00:00
Andrew Lamb	0eb858c70d	chore: Update datafusion (#7167 ) * chore: Update datafusion * chore: Update datafusion * refactor: use UserDefinedLogicalNodeCore * fix: remove stray comment * fix: clippy * chore: Run cargo hakari tasks --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-13 16:41:32 +00:00
Christopher M. Wolff	ffab683ead	refactor: move trailing_gaps bit into cursor (#7178 ) * refactor: push trailing_gap bit into cursor * chore: clippy --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-13 15:40:55 +00:00
Marco Neumann	737ea15d07	feat: projection pushdown phys. optimizer (#7161 ) * feat: projection pushdown phys. optimizer This is by far the largest pass (at least test-wise), because projections are added last in the naive plan and you have to push them through everything else. The actual code however isn't that complicated, mostly because we can reuse some DataFusion functionality and the different variants for the different "child nodes" are very similar. For #6098. * feat: projection pushdown for `RecordBatchesExec` * test: `test_ignore_when_partial_impure_projection_rename` * test: more dedup projection tests * test: integration	2023-03-13 12:59:45 +00:00
Marco Neumann	41802b7b5b	feat: `SchemaAdapterStream` may create virtual columns (#7173 ) * feat: `SchemaAdapterStream` may create virtual columns For chunk order handling in #6098. * fix: improve `SchemaAdapterStream` docs and error handling	2023-03-13 10:02:13 +00:00
Carol (Nichols || Goulding)	cc7c44f76a	chore: Upgrade to Rust 1.68 (#7175 ) * chore: Upgrade to Rust 1.68 * fix: Remove unnecessary into_iter, thanks Clippy! * fix: Use the size of the type, not a reference to the type... oops. Thanks clippy! * fix: Return block directly instead of creating a variable Thanks clippy! --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-12 13:22:20 +00:00
Stuart Carnie	fe48a685ec	refactor: Move InfluxQL behaviour from iox_query to new crate (#7156 ) * refactor: Break unnecessary dependencies from `iox_query` crate In the process, the test code has been simplified. * refactor: Move InfluxQL plan module to iox_query_influxql crate * refactor: Move remaining behaviour from iox_query to iox_query_influxql * chore: rustfmt 🙄 I was under the impression `clippy` would catch formatting	2023-03-08 22:29:20 +00:00
Marco Neumann	309177b750	feat: phys. pred. pushdown to parquet (#7159 ) For #6098.	2023-03-08 16:36:27 +00:00
Marco Neumann	3828d2a50e	chore: update DataFusion to `deeaa5632ed99a58b91767261570756db736d158` (#7158 ) * chore: update DataFusion to `deeaa5632ed99a58b91767261570756db736d158` I want to get pull: - https://github.com/apache/arrow-datafusion/pull/5495 Changes in the IOx code base due to: - https://github.com/apache/arrow-datafusion/pull/5423 - https://github.com/apache/arrow-datafusion/pull/5421 - https://github.com/apache/arrow-datafusion/pull/5450 * refactor: simplify expression simplification Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * refactor: remove upstreamed code * test: update snapshots --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2023-03-08 13:05:31 +00:00
Marco Neumann	58dad4cb01	feat: remove all-NULL columns from dedup (#7146 ) For #6098. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-08 10:04:48 +00:00
Marco Neumann	81388e7ff2	feat: determine cheap de-dup sort order (#7147 ) * feat: determine cheap de-dup sort order For #6098. * test: `test_three_chunks_different_subsets` * fix: ensure that columns can be drawn early * docs: improve algo explanation * refactor: make code clearer	2023-03-08 09:50:07 +00:00
Christopher M. Wolff	ff11fe465d	refactor: convert gap fill exec tests to use insta snapshots (#7154 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-08 01:53:15 +00:00
Stuart Carnie	2b74f07fe5	feat: Support `GROUP BY` with tags in raw `SELECT` queries (#7109 ) * chore: Normalise name of Call expression to lowercase Simplifies matching functions in planner, as they are guaranteed to be lowercase. This also ensures compatibility with InfluxQL when generating column alias names, which are reflected in updated tests. * chore: Ensure aggregate functions fail gracefully. * feat: GROUP BY tag support * feat: Ensure schema-level metadata is propagated Requires: https://github.com/apache/arrow-rs/issues/3779 * chore: Add some tests to validate GROUP BY output * chore: Add clarifying comment * chore: Declare message in flight.proto The metadata is public API, so best practice is to encode this in a way that is most compatible for clients in other languages, and will also document the history of schema changes. Added tests to validate the metadata is encoded correctly. * chore: Placate linters * chore: Use correct column in test cases * chore: Add `is_projected` to the TagKeyColumn message `is_projected` is necessary to inform a client whether the tag key is used exclusively for the group key (false) or also projected in the `SELECT` column list. * refactor: Move constants to `schema` crate per PR feedback * chore: rustfmt 🙄 * chore: Update docs for InfluxQlMetadata Co-authored-by: Andrew Lamb <alamb@influxdata.com> --------- Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2023-03-07 22:40:23 +00:00
Christopher M. Wolff	3f3a47eae9	feat: add a type to characterize fill strategy (#7150 ) * feat: add a type to characterize fill strategy * chore: clippy and fix comment	2023-03-07 17:11:31 +00:00

384 Commits (b555ddf18b19cb57ddf2f71596cd3409354caf2e)