Commit Graph

47 Commits (f78195f7c7327dde3ad07a40a2926044572d38a3)

Author SHA1 Message Date
Andrew Lamb 58838e214e
feat: enable parquet predicate pushdown in IOx (#5930) 2022-11-02 18:00:47 +00:00
Andrew Lamb 9c1f0a3644
refactor: move SessionConfig creation into datafusion_utils (#6011)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 20:04:49 +00:00
Carol (Nichols || Goulding) 44936f661a
feat: Use workspace dep inheritance for datafusion instead of shim crate 2022-10-26 10:33:56 -04:00
Carol (Nichols || Goulding) 2e83e04eab
feat: Use workspace package metadata to reduce differences and repetition 2022-10-24 13:04:09 -04:00
Andrew Lamb 82d6fc3bda
feat: support queries via influxrpc with periods in field names (#5919)
* feat: support queries via influxrpc with periods in field names

* fix: update comments

* fix: more tests

* fix: more tests
2022-10-19 20:09:55 +00:00
Andrew Lamb ece0ce440d
chore: update datafusion (#5869)
* chore: update datafusion

* chore: Update split conjunction api

* chore: remove code that was upstreamed

* chore: simplification is better upstream

* fix: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-17 13:22:16 +00:00
Andrew Lamb 9134ccd6c3
chore: Update datafusion again (#5855)
* chore: Update datafusion

* chore: Updates for changes in datafusion

* chore: more updates

* fix: update doc example

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-13 19:18:57 +00:00
Andrew Lamb 1a77c509c0
docs: Add some doc comments to the datafusion_util module (#5842)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-13 09:05:01 +00:00
Andrew Lamb d57c99638c
chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 (#5792)
* chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0

* fix: Update for coercion, fix explain plans for change in column name display

* chore: Update datafusion lock

* fix: Update for other API changes

* chore: Update to latest datafusion pin

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-12 16:19:14 +00:00
Nga Tran 95ed41f140
feat: Projection pushdown for querier -> ingester for rpc queries (#5782)
* feat: initial step to identify where the projection should be provided

* feat: start getting columns of all expressions

* chore: format

* test: test for the table_chunk_stream

* fix: fix a compile error. Thanks @alamb

* test: full tests for table_chunk_stream

* chore: cleanup

* fix: do not cut any columns in case all fields are needed

* test: add one more test case of reading all columns

* refactor: move code that identify columbs ot push down to a function. Add the use of  field_columns

* chore: cleanup

* refactor: make sream_from_batch support empty batches

* chore: cleanup

* chore: fix clippy after auto merge

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-06 17:21:23 +00:00
Marco Neumann adeacf416c
ci: fix (#5569)
* ci: use same feature set in `build_dev` and `build_release`

* ci: also enable unstable tokio for `build_dev`

* chore: update tokio to 1.21 (to fix console-subscriber 0.1.8

* fix: "must use"
2022-09-06 14:13:28 +00:00
Andrew Lamb e0ea335b70
fix: Support RegExMatch and RegExNotMatch predicates on `_field` (#5301)
* test: add tests for regex_match_on_field

* feat: more general `_field` predicate handling

* fix: remove old comment

* fix: update tests

* fix: improve test a little more

* fix: fmt

* fix: Update predicate/src/rpc_predicate/field_rewrite.rs

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* fix: Handle predicates that can not be evaluated

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-04 19:42:16 +00:00
Andrew Lamb 76171c5cea
refactor: simplify creating AND expressions (#5298)
* refactor: simplify creating AND expressions

* fix: fmt + clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-04 11:25:43 +00:00
Andrew Lamb e2d871b00b
chore: Update datafusion and arrow/parquet/arrow-flight to `18.0.0` (#5079)
* chore: Update datafusion to 10.0.0, arrow/parquet/arrow-flight to 18

* chore: Run cargo hakari tasks

* fix: update cargo pin

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-18 15:01:03 +00:00
dependabot[bot] 9b67de2f43
chore(deps): Bump tokio from 1.19.2 to 1.20.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.19.2 to 1.20.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.19.2...tokio-1.20.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-07-14 01:21:43 +00:00
Andrew Lamb c46e1c6347
chore: Update datafusion + arrow/parquet/arrow-flight to `17.0.0` (#5021)
* fix: correct nullability declaration of system tables

* chore: Update datafusion and arrow/parquet/arrow-flight

* chore: Run cargo hakari tasks

* fix: Update tests

* fix: Update tests

* fix: predicate pruning

* fix: add some tests

* fix: query_functions

* fix: fix read_buffer test

* fix: fix clippy

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-07 19:22:15 +00:00
Marco Neumann 05189fdb00
fix: ensure panic is propagated throw `AdapterStream` (#4980)
Prior to this change background tasks that we feed into `AdapterStream`
can panic but that would just end the stream without any user-visible
error (except for the panic message on stdout/stderr).

This was found while developing #4964. I have proposed another fix in #4966
but found that I actually developed an existing solution a 2nd time:
`watch_task`. But I also see a major issue with the existing API: one
can create `AdapterStream` with ordinary tokio tasks that are not
watched at all, leaving the burden to the implementor to check for that
(and actually we forgot that in `parquet_file`).

So this change takes a slightly different approach:
The `AdapterStream` does NOT accept ordinary join handles any longer but
requires that you pass a "watched task". The newly introduced
`WatchedTask` does the same as we did manually before: wrapping a future
into a tokio task, watch it and wrap the watcher into a task.

It is now way more difficult to do anything stupid (sure you can still
mix up the tasks and the channels, but we need at least some flexibility
here to allow for "split" and potential future fan-in/out constructs).

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-30 07:57:58 +00:00
Andrew Lamb 8e96a2721d
chore: Update datafusion (again) (#4788)
* chore: Update datafusion

* chore: Update imports

* refactor: update API usage

* refactor: clean up some uses of binary_expr

* fix: remove unused export

* fix: update explain output

* chore: update more explain tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-07 08:17:56 +00:00
dependabot[bot] e03bf94420
chore(deps): Bump tokio from 1.18.2 to 1.19.1 (#4783)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.18.2 to 1.19.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.18.2...tokio-1.19.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-06 14:15:12 +00:00
Marco Neumann addc45327e
fix: ensure that query tokio background tasks are canceled (#4643)
* fix: ensure that query tokio background tasks are canceled

While I am not entirely sure if this explains some of the memory leaks I
am seeing in prod, not canceling the tasks correctly certainly makes
debugging way harder and also renders certain form of throttling (e.g.
max. concurrent queries) somewhat ineffective.

Note that parquet file downloads are currently NOT canceled because
tokios `spawn_blocking` cannot be canceled.

* refactor: `Vec` -> `Option`

* refactor: `spawn_blocking` creates a join handle, even though it is useless

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-20 07:18:52 +00:00
Andrew Lamb 37c7ce793c
chore: Update datafusion (again) (#4518)
* chore: Update datafusion (again)

* refactor: Update ExecutionPlan:execute to not be async
2022-05-05 15:43:41 +00:00
Andrew Lamb bc5725b1fc
fix: Move schema pivot work out of `execute` (#4520)
* refactor: Extract watch_task into datafusion_util

* refactor: do work in a separate task

* fix: cleanup

* fix: update test
2022-05-05 15:11:14 +00:00
dependabot[bot] 420c306caa
chore(deps): Bump tokio from 1.17.0 to 1.18.0 (#4453)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.17.0...tokio-1.18.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-28 08:21:17 +00:00
Andrew Lamb 6d2a8256ba
feat: Add window_bounds to IOx Function Registry (2.5/3) (#4432)
* feat: Add window_bounds to IOx Function Registry

* refactor: Prepare for lib tests

* test: Add explicit tests for plumbing

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-27 15:08:38 +00:00
Marco Neumann f444e63960
test: include materialized delete predicates in NG query tests (#4371)
* refactor: move `batch_filter` to `datafusion_util`

* fix: outdated docstring

* feat: allow passing record batches to `iox_tests` parquet files

* test: include materialized delete predicates in NG query tests

* docs: improve wording

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-21 13:00:13 +00:00
Dom Dwyer 5ac4785e19 feat: unbounded channel support for AdaptorStream
Allows the AdaptorStream to work with either a bounded, or unbounded
channel internally.
2022-04-14 15:30:57 +01:00
Andrew Lamb b83b000590
chore: Update datafusion (#4071)
* chore: update to datafusion 5936edc2a94d5fb20702a41eab2b80695961b9dc

* chore: Update apis to match datafusion changes
2022-03-22 13:17:41 +00:00
dependabot[bot] ad3868ed7c
chore(deps): Bump tokio from 1.16.1 to 1.17.0 (#3814)
* chore(deps): Bump tokio from 1.16.1 to 1.17.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.16.1 to 1.17.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.16.1...tokio-1.17.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build: update workspace-hack

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom Dwyer <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-22 16:27:43 +00:00
Andrew Lamb 7b96a37165
chore: Update datafusion (#3586)
* chore: update DataFusion to f849968057ddddccc9aa19915ef3ea56bf14d80d

* fix: reduce overhead of creating physical expressions

* chore: use MemTrackingMetrics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-31 18:15:28 +00:00
Andrew Lamb 5488c257d1
chore: Update datafusion, upgrade to arrow/parqet/arrow-flight 8.0.0 (#3517)
* chore: Update datafusion

* chore: update to arrow 8

* fix: update to use new DataFusion APIs

* fix: update case for sortedness

* fix: cargo hakari
2022-01-27 13:33:27 +00:00
Andrew Lamb 1843476651
chore: Update datafusion deps (#3471)
* chore: Update datafusion

* refactor: Update to use new Exec plan APIs

* fix: error message

* fix: fixup last bit

* fix: clippy

* fix: doclink

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-17 15:26:19 +00:00
Andrew Lamb 527885f7f8
chore: Update datafusion (#3413)
* chore: Update datafusion and update code to handle timezone aware timestamps

* fix: cargo hakari

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-12-23 14:52:12 +00:00
Carol (Nichols || Goulding) 0b7c7764c6
fix: Use a consistent version range for tokio-stream across all crates 2021-12-06 09:37:16 -05:00
Carol (Nichols || Goulding) 16d8ae5e04
fix: Match tokio features to what's actually in use in each crate
Some crates listed features they don't use; other crates ware relying on
feature flags enabled by something else. I tested these changes by
disabling the workspace hack crate and testing each crate.
2021-12-06 09:37:16 -05:00
Carol (Nichols || Goulding) 02c297e850
fix: Always specify the parking_lot feature of tokio to get potential perf boost 2021-12-06 09:37:15 -05:00
Carol (Nichols || Goulding) 9fd4a560f5
feat: Results of running cargo hakari manage-deps 2021-11-19 09:21:57 -05:00
dependabot[bot] 1b377f0377
chore(deps): bump tokio-stream from 0.1.7 to 0.1.8
Bumps [tokio-stream](https://github.com/tokio-rs/tokio) from 0.1.7 to 0.1.8.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-stream-0.1.7...tokio-stream-0.1.8)

---
updated-dependencies:
- dependency-name: tokio-stream
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-01 11:38:59 +00:00
dependabot[bot] c540b40f05
chore(deps): bump tokio from 1.12.0 to 1.13.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.12.0 to 1.13.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.12.0...tokio-1.13.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-01 11:21:59 +00:00
Marco Neumann bc7244c48e chore: use Rust edition 2021 2021-10-25 10:58:20 +02:00
Andrew Lamb a55a21c644
chore: Update datafusion (#2635)
* chore: Update datafusion and sqlparser

* fix: remove STACK_SIZE workaround

* chore: update datafusion_util

* chore: update predicate

* chore: update query_tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 14:13:19 +00:00
Marco Neumann 509c07330d refactor: decouple `parquet_file` from `query` 2021-09-14 18:26:16 +02:00
Dom 3de6b44e23
build: use new rustdoc lint name (#2261)
* fix: nocache feature code rot

The MBChunk::snapshot code when using the "nocache" option no longer
compiles - this commit updates it to match the not(nocache) code.

* build: use updated broken_intra_doc_links name

The broken_intra_doc_links lint was renamed
rustdoc::broken_intra_doc_links

https://doc.rust-lang.org/rustdoc/lints.html
2021-08-11 19:48:51 +00:00
Andrew Lamb d35b74c226
fix: Fix doc build warnings (#1945)
* fix: Fix doc build warnings

* refactor: add deny bare_urls to crates

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 08:03:42 +00:00
Andrew Lamb 7602bde850
chore: Update datafusion deps (#1799)
* chore: Update datafusion deps + rework code

* refactor: remove workaround as it has been contributed upstream

* fix: Update query/src/exec/split.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 10:58:32 +00:00
Andrew Lamb bed6ec8c31
feat: Handle merging chunks that have different schemas (#1761)
* feat: Handle merging chunks that have different schemas

* test: print out original (non deduplicated) data in tests
2021-06-21 15:52:13 +00:00
Raphael Taylor-Davies 44de42906f
refactor: use Arc<str> instead of Arc<String> (#1442) 2021-05-06 17:05:08 +00:00
Raphael Taylor-Davies 411cf134e9
refactor: explode arrow_deps (#1425)
* refactor: explode arrow_deps

* chore: workaround doctest bug
2021-05-05 16:59:12 +00:00