dependabot[bot]
8771dcb645
chore(deps): Bump thiserror from 1.0.38 to 1.0.39 ( #7131 )
...
Bumps [thiserror](https://github.com/dtolnay/thiserror ) from 1.0.38 to 1.0.39.
- [Release notes](https://github.com/dtolnay/thiserror/releases )
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.38...1.0.39 )
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-06 10:55:26 +00:00
dependabot[bot]
3256fcc72e
chore(deps): Bump object_store from 0.5.4 to 0.5.5
...
Bumps [object_store](https://github.com/apache/arrow-rs ) from 0.5.4 to 0.5.5.
- [Release notes](https://github.com/apache/arrow-rs/releases )
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md )
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.4...object_store_0.5.5 )
---
updated-dependencies:
- dependency-name: object_store
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-03-03 02:00:51 +00:00
dependabot[bot]
c538cac4ef
chore(deps): Bump tokio from 1.25.0 to 1.26.0 ( #7107 )
...
* chore(deps): Bump tokio from 1.25.0 to 1.26.0
Bumps [tokio](https://github.com/tokio-rs/tokio ) from 1.25.0 to 1.26.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.25.0...tokio-1.26.0 )
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-02 09:50:39 +00:00
Carol (Nichols || Goulding)
faae5eb438
chore: Rerun cargo hakari manage-deps
2023-02-27 11:56:15 +01:00
Andrew Lamb
7e31b2638d
fix: Understandable compactor2 config report ( #7028 )
...
* fix: Understandable compactor2 config report
* fix: do not log postgres dsn
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-22 23:43:31 +00:00
Marco Neumann
a8feed120c
test: `chunks_to_physical_nodes` ( #7013 )
...
No new actual code but sets up some test infra that I need for #6098 .
2023-02-17 09:37:43 +00:00
dependabot[bot]
0ecde75af5
chore(deps): Bump object_store from 0.5.3 to 0.5.4 ( #6900 )
...
Bumps [object_store](https://github.com/apache/arrow-rs ) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/apache/arrow-rs/releases )
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md )
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.3...object_store_0.5.4 )
---
updated-dependencies:
- dependency-name: object_store
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-08 09:40:11 +00:00
Raphael Taylor-Davies
d3601a59f8
chore: update DataFusion, upgrade `arrow` `arrow-flight` and `parquet` to `32.0.0` ( #6756 )
...
* chore: update DataFusion
* fix: test
* chore: format
* chore: clippy
* chore: update arrow
* chore: arrow upgrade fallout
* chore: Run cargo hakari tasks
* chore: remove failing warm compaction test
* fix: flight error propagation
* chore: update parquet size
* fix: Update error message
* chore: Update parquet metadata test
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-06 11:35:39 +00:00
Carol (Nichols || Goulding)
30fea67701
fix: Move variables within format strings. Thanks clippy!
...
Changes made automatically using `cargo clippy --fix`.
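For illustration, the rewrite applied here is most likely the one suggested by clippy's `uninlined_format_args` lint. A minimal sketch of the before and after shapes follows; the function names and the message text are hypothetical and not taken from the repository.

```rust
/// Before: arguments are passed positionally after the format string.
fn describe_before(table: &str, rows: usize) -> String {
    format!("table {} has {} rows", table, rows)
}

/// After: the variables are captured inline in the format string.
/// Inline captures only work for plain identifiers, not for expressions
/// such as `rows.len()`.
fn describe_after(table: &str, rows: usize) -> String {
    format!("table {table} has {rows} rows")
}
```

Both functions produce the same string; the change is purely stylistic.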
2023-02-03 13:06:17 -05:00
Stuart Carnie
63d0a77daf
feat: Updating to new services for all-in-one ( #6811 )
...
* feat: Updating to new services for all-in-one
* fix: Use correct shard id for ingester2
* fix: clippy
* fix: use wal directory
* fix: end to end tests
* fix: Update tracing cases for new ingest reality
* fix: update metrics test
* fix: Use rpc mode
---------
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-02 20:42:29 +00:00
dependabot[bot]
d0e6b16450
chore(deps): Bump bytes from 1.3.0 to 1.4.0
...
Bumps [bytes](https://github.com/tokio-rs/bytes ) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases )
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md )
- [Commits](https://github.com/tokio-rs/bytes/compare/v1.3.0...v1.4.0 )
---
updated-dependencies:
- dependency-name: bytes
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-02-01 00:30:56 +00:00
dependabot[bot]
ed7d02a225
chore(deps): Bump tokio from 1.24.2 to 1.25.0
...
Bumps [tokio](https://github.com/tokio-rs/tokio ) from 1.24.2 to 1.25.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/commits/tokio-1.25.0 )
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-01-30 01:57:27 +00:00
Marko Mikulicic
db7e6335ca
feat(ingester2): New object store paths will have no shard id ( #6735 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-27 17:16:54 +00:00
Nga Tran
b8a80869d4
feat: introduce a new way of max_sequence_number for ingester, compactor and querier ( #6692 )
...
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier
* chore: cleanup
* feat: new column max_l0_created_at to order files for deduplication
* chore: cleanup
* chore: debug info for changing cpu.parquet
* fix: update test parquet file
Co-authored-by: Marco Neumann <marco@crepererum.net>
2023-01-26 10:52:47 +00:00
Marco Neumann
ed694d3be4
feat: introduce scratchpad store for compactor ( #6706 )
...
* feat: introduce scratchpad store for compactor
Use an intermediate in-memory store (can be a disk later if we want) to
stage all inputs and outputs of the compaction. The reasons are:
- **fewer IO ops:** DataFusion's streaming IO requires slightly more
IO requests (at least 2 per file) due to the way it is optimized to
read as little as possible. It first reads the metadata and then
decides which content to fetch. In the compaction case this is (esp.
w/o delete predicates) EVERYTHING. So in contrast to the querier,
there is no advantage to this approach. On the contrary, it easily adds
100ms latency to every single input file.
- **less traffic:** For divide&conquer partitions (i.e. when we need to
run multiple compaction steps to deal with them) it is kinda pointless
to upload an intermediate result just to download it again. The
scratchpad avoids that.
- **higher throughput:** We want to limit the number of concurrent
DataFusion jobs because we don't wanna blow up the whole process by
having too much in-flight arrow data at the same time. However, while
we performed the actual computation, we were also waiting for object
store IO. This was limiting our throughput substantially.
- **shadow mode:** De-coupling the stores in this way makes it easier to
implement #6645 .
Note that we assume here that the input parquet files are WAY SMALLER
than the uncompressed Arrow data during compaction itself.
Closes #6650 .
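The staging idea described above can be sketched roughly as follows. This is a simplified illustration and not the compactor's actual code: `Store`, `Scratchpad`, and all method names are hypothetical, and a plain `HashMap` stands in for both the remote object store and the in-memory scratchpad.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a remote object store that counts IO requests.
#[derive(Default)]
struct Store {
    objects: HashMap<String, Vec<u8>>,
    requests: usize,
}

impl Store {
    fn get(&mut self, path: &str) -> Option<Vec<u8>> {
        self.requests += 1;
        self.objects.get(path).cloned()
    }

    fn put(&mut self, path: &str, data: Vec<u8>) {
        self.requests += 1;
        self.objects.insert(path.to_string(), data);
    }
}

/// Hypothetical scratchpad: stages all inputs and outputs in memory.
struct Scratchpad {
    remote: Store,
    local: HashMap<String, Vec<u8>>,
}

impl Scratchpad {
    fn new(remote: Store) -> Self {
        Self { remote, local: HashMap::new() }
    }

    /// Fetch each input once, in full, with a single request per file.
    fn load(&mut self, paths: &[&str]) {
        for p in paths {
            if !self.local.contains_key(*p) {
                if let Some(data) = self.remote.get(p) {
                    self.local.insert((*p).to_string(), data);
                }
            }
        }
    }

    /// Read staged data without touching the remote store.
    fn read(&self, path: &str) -> Option<&[u8]> {
        self.local.get(path).map(Vec::as_slice)
    }

    /// Stage an intermediate or final result locally.
    fn write(&mut self, path: &str, data: Vec<u8>) {
        self.local.insert(path.to_string(), data);
    }

    /// Upload only the outputs that should outlive the compaction.
    fn publish(&mut self, paths: &[&str]) {
        for p in paths {
            if let Some(data) = self.local.get(*p) {
                self.remote.put(p, data.clone());
            }
        }
    }
}
```

In this sketch, intermediate results of a multi-step compaction are written and read through the scratchpad only, so they never cause remote traffic; just the final files passed to `publish` are uploaded.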
* fix: panic on shutdown
* refactor: remove shadow scratchpad (for now)
* refactor: make scratchpad safe to use
2023-01-26 10:03:08 +00:00
Marco Neumann
cb02262b9d
refactor: extract "exec DF plan" and "store stream to file" components ( #6663 )
...
* refactor: extract `PartitionInfo`
* refactor: extract DF exec component
* feat: add some error conversions
* refactor: make fn public
* refactor: extract file sink component
* fix: clippy
2023-01-23 14:40:35 +00:00
Andrew Lamb
57f08dbccd
chore: Update datafusion to Jan 9, 2023 (1 / 2) ( #6603 )
...
* refactor: Update DataFusion pin to early Jan 2023
* fix: Update tests now that planning is async
* fix: Updates for API changes
* chore: Run cargo hakari tasks
* fix: Update comment
* refactor: nicer config setup
* fix: gapfill async
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-01-18 12:19:32 +00:00
dependabot[bot]
b49cc2e35e
chore(deps): Bump tokio from 1.24.0 to 1.24.1 ( #6545 )
...
Bumps [tokio](https://github.com/tokio-rs/tokio ) from 1.24.0 to 1.24.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1 )
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:48:44 +00:00
dependabot[bot]
3fac475b63
chore(deps): Bump base64 from 0.20.0 to 0.21.0 ( #6530 )
...
* chore(deps): Bump base64 from 0.20.0 to 0.21.0
Bumps [base64](https://github.com/marshallpierce/rust-base64 ) from 0.20.0 to 0.21.0.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases )
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md )
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.20.0...v0.21.0 )
---
updated-dependencies:
- dependency-name: base64
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
* fix: deprecations
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-09 09:39:40 +00:00
Raphael Taylor-Davies
e1036a0c63
refactor: cleanup schema boxing ( #6511 )
...
* refactor: cleanup Schema boxing
* chore: clippy
2023-01-06 10:57:39 +00:00
Andrew Lamb
dbe52f1ca1
chore: Upgrade datafusion ( #6467 )
...
* chore: Update datafusion
* fix: Update for new apis
* chore: Update expected plan
* fix: Update for new config construction
* chore: update clippy
* fix: Fix error codes
* fix: update another test
* chore: Run cargo hakari tasks
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-03 15:29:11 +00:00
Carol (Nichols || Goulding)
72aab99951
fix: Remove needless borrow. Thanks clippy!
2022-12-21 14:32:34 -05:00
dependabot[bot]
299f0e99f9
chore(deps): Bump thiserror from 1.0.37 to 1.0.38
...
Bumps [thiserror](https://github.com/dtolnay/thiserror ) from 1.0.37 to 1.0.38.
- [Release notes](https://github.com/dtolnay/thiserror/releases )
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.37...1.0.38 )
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-12-19 10:33:32 +00:00
dependabot[bot]
95969ad24f
chore(deps): Bump base64 from 0.13.1 to 0.20.0 ( #6371 )
...
* chore(deps): Bump base64 from 0.13.1 to 0.20.0
Bumps [base64](https://github.com/marshallpierce/rust-base64 ) from 0.13.1 to 0.20.0.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases )
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md )
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.13.1...v0.20.0 )
---
updated-dependencies:
- dependency-name: base64
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-12 07:07:19 +00:00
dependabot[bot]
1d38d400f0
chore(deps): Bump object_store from 0.5.1 to 0.5.2 ( #6339 )
...
* chore(deps): Bump object_store from 0.5.1 to 0.5.2
Bumps [object_store](https://github.com/apache/arrow-rs ) from 0.5.1 to 0.5.2.
- [Release notes](https://github.com/apache/arrow-rs/releases )
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md )
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2 )
---
updated-dependencies:
- dependency-name: object_store
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-06 07:53:54 +00:00
Andrew Lamb
14a9bc92e9
Revert "Revert "chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0` ( #6279 )" ( #6294 )" ( #6296 )
...
This reverts commit b7e52c0d8d.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-01 14:20:43 +00:00
Andrew Lamb
b7e52c0d8d
Revert "chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0` ( #6279 )" ( #6294 )
...
This reverts commit 039a45ddd1.
2022-12-01 11:38:42 +00:00
Andrew Lamb
039a45ddd1
chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0` ( #6279 )
...
* chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0`
* chore: Update thrift to 0.17
* fix: use workspace arrow-flight in ingester2
* chore: Update for API changes
* fix: test
* chore: Update hakari
* chore: Update hakari again
* chore: Update trace_exporters to latest thrift
* fix: update test
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 14:12:30 +00:00
dependabot[bot]
74cf99eac7
chore(deps): Bump zstd from 0.11.2+zstd.1.5.2 to 0.12.0+zstd.1.5.2 ( #6225 )
...
* chore(deps): Bump zstd from 0.11.2+zstd.1.5.2 to 0.12.0+zstd.1.5.2
Bumps [zstd](https://github.com/gyscos/zstd-rs ) from 0.11.2+zstd.1.5.2 to 0.12.0+zstd.1.5.2.
- [Release notes](https://github.com/gyscos/zstd-rs/releases )
- [Commits](https://github.com/gyscos/zstd-rs/commits )
---
updated-dependencies:
- dependency-name: zstd
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-25 10:00:28 +00:00
Andrew Lamb
1a1ea74cb7
chore: Upgrade datafusion again ( #6160 )
...
* Revert "Revert "chore: Update datafusion again (#6108 )""
This reverts commit 766b3bbeb440618cfe332f6ee7d4f8a8217acc48.
* fix: Respect the partition sort key
* chore: update plans
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 19:28:26 +00:00
dependabot[bot]
04c00bbb62
chore(deps): Bump bytes from 1.2.1 to 1.3.0 ( #6199 )
...
Bumps [bytes](https://github.com/tokio-rs/bytes ) from 1.2.1 to 1.3.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases )
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md )
- [Commits](https://github.com/tokio-rs/bytes/commits )
---
updated-dependencies:
- dependency-name: bytes
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 08:23:24 +00:00
dependabot[bot]
a9db7581cd
chore(deps): Bump tokio from 1.21.2 to 1.22.0 ( #6183 )
...
Bumps [tokio](https://github.com/tokio-rs/tokio ) from 1.21.2 to 1.22.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.21.2...tokio-1.22.0 )
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-21 10:21:24 +00:00
Andrew Lamb
67712b595c
Revert "chore: Update datafusion again ( #6108 )" ( #6159 )
...
This reverts commit fbe9f27f10.
2022-11-16 21:14:55 +00:00
Andrew Lamb
fbe9f27f10
chore: Update datafusion again ( #6108 )
...
* chore: Update datafusion pin + api code
* chore: Run cargo hakari tasks
* refactor: combine_sort_key is more idiomatic and add rationale comments
* refactor: satisfy borrow checker and updated comments
* fix: Add test case for combine_sort_key
* fix: Apply suggestions from code review
Co-authored-by: Marco Neumann <marco@crepererum.net>
* fix: Add back test for deeply nested expression
* fix: Update output ordering
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-16 14:41:52 +00:00
dependabot[bot]
a969754819
chore(deps): Bump chrono from 0.4.22 to 0.4.23 ( #6129 )
...
* chore(deps): Bump chrono from 0.4.22 to 0.4.23
Bumps [chrono](https://github.com/chronotope/chrono ) from 0.4.22 to 0.4.23.
- [Release notes](https://github.com/chronotope/chrono/releases )
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md )
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.22...v0.4.23 )
---
updated-dependencies:
- dependency-name: chrono
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* refactor: chrono future compat
Integer->timestamp conversions should not silently panic.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-14 13:34:09 +00:00
Carol (Nichols || Goulding)
bdff4e8848
fix: Consistently use 'namespace' instead of 'database' in comments and other internal text
2022-11-11 15:46:04 -05:00
Andrew Lamb
6c17ee29a5
feat: make logging clearer when parquet files upload is retried ( #6056 )
...
* feat: log success when parquet files are retried
* fix: Update parquet_file/src/storage.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: fmt
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-10 11:59:54 +00:00
Carol (Nichols || Goulding)
43687a86d2
fix: Remove lots of needless borrows that Clippy can now identify
...
Except for in generated code that we don't control.
2022-11-09 10:54:18 -05:00
Andrew Lamb
4fb2843d05
refactor: Rename `schema::selection::Selection` to `schema::projection::Projection` ( #6037 )
...
* chore: Rename `schema::selection::Selection` to `schema::projection::Projection`
* fix: docs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 18:15:04 +00:00
Marco Neumann
45b3984aa3
refactor: simplify `QueryChunk` data access ( #6015 )
...
* refactor: simplify `QueryChunk` data access
We have only two types for chunks (now that the RUB is gone):
1. In-memory RecordBatches
2. Parquet files
Loads of logic is duplicated in the different `read_filter`
implementations. Also `read_filter` hides a solid amount of logic from
DataFusion, which will prevent certain (future) optimizations. To enable #5897
and to simplify the interface, let the chunks return the data (batches
or metadata for parquet files) directly and let `iox_query` perform the
actual heavy-lifting.
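A rough sketch of the shape this interface could take is shown below. All names here (`ChunkData`, `Chunk`, `plan_scan`, and the stand-in `RecordBatch` and `ParquetFileMeta` types) are hypothetical simplifications for illustration, not the actual `QueryChunk` API.

```rust
/// Hypothetical stand-in for an Arrow record batch.
#[derive(Debug, Clone, PartialEq)]
struct RecordBatch {
    rows: usize,
}

/// Hypothetical stand-in for the metadata of a parquet file.
#[derive(Debug, Clone, PartialEq)]
struct ParquetFileMeta {
    path: String,
    row_count: usize,
}

/// The two kinds of data a chunk can hand back.
#[derive(Debug, Clone, PartialEq)]
enum ChunkData {
    InMemory(Vec<RecordBatch>),
    Parquet(ParquetFileMeta),
}

/// A chunk only returns its data; it does not filter or scan it itself.
trait Chunk {
    fn data(&self) -> ChunkData;
}

struct BufferChunk(Vec<RecordBatch>);
struct FileChunk(ParquetFileMeta);

impl Chunk for BufferChunk {
    fn data(&self) -> ChunkData {
        ChunkData::InMemory(self.0.clone())
    }
}

impl Chunk for FileChunk {
    fn data(&self) -> ChunkData {
        ChunkData::Parquet(self.0.clone())
    }
}

/// The query layer decides in one place how each kind of data is scanned.
fn plan_scan(chunks: &[&dyn Chunk]) -> Vec<String> {
    chunks
        .iter()
        .map(|c| match c.data() {
            ChunkData::InMemory(batches) => {
                let rows: usize = batches.iter().map(|b| b.rows).sum();
                format!("memory scan: {rows} rows")
            }
            ChunkData::Parquet(meta) => {
                format!("parquet scan: {} ({} rows)", meta.path, meta.row_count)
            }
        })
        .collect()
}
```

Because the scan logic lives in a single function rather than in each chunk implementation, the duplicated filtering code disappears and the planner can see what kind of data it is dealing with.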
* docs: improve
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* docs: improve
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 08:18:33 +00:00
Andrew Lamb
9c1f0a3644
refactor: move SessionConfig creation into datafusion_utils ( #6011 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 20:04:49 +00:00
Carol (Nichols || Goulding)
b27e2bd7d1
feat: Use workspace dep inheritance for the parquet crate
2022-10-26 10:37:51 -04:00
Carol (Nichols || Goulding)
3145e2c05b
feat: Use workspace dep inheritance for the arrow crate
2022-10-26 10:34:29 -04:00
Carol (Nichols || Goulding)
44936f661a
feat: Use workspace dep inheritance for datafusion instead of shim crate
2022-10-26 10:33:56 -04:00
Marco Neumann
9b48437711
refactor: make influx column type mandatory ( #5978 )
...
We basically assume everywhere that a column falls into one of the three
known categories (time, tag, field), so let's encode this in our type
system instead of defining "unknown" as "undefined behavior, may or may
not crash".
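As a simplified illustration of the difference, consider the sketch below. The enum mirrors the three categories named above, but its exact shape and the helper functions are hypothetical and not copied from the `schema` crate.

```rust
/// The three known column categories (simplified, hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq)]
enum InfluxColumnType {
    Tag,
    Field,
    Timestamp,
}

/// Before: the type is optional, so every caller must decide what an
/// unknown column means, and each caller may decide differently.
fn in_primary_key_before(t: Option<InfluxColumnType>) -> Option<bool> {
    t.map(in_primary_key)
}

/// After: the type is mandatory, so the match is exhaustive and the
/// compiler rejects any code path that forgets a category.
fn in_primary_key(t: InfluxColumnType) -> bool {
    match t {
        InfluxColumnType::Tag | InfluxColumnType::Timestamp => true,
        InfluxColumnType::Field => false,
    }
}
```

With the mandatory variant there is no `None` case left to interpret, which is the "undefined behavior" the message refers to.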
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-26 11:20:29 +00:00
Carol (Nichols || Goulding)
2e83e04eab
feat: Use workspace package metadata to reduce differences and repetition
2022-10-24 13:04:09 -04:00
Marco Neumann
284f253846
refactor: remove unused constant ( #5956 )
...
Now that we read through `ParquetExec`, `ROW_GROUP_READ_SIZE` is no longer
used.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-24 11:08:44 +00:00
Marco Neumann
e0062f2d40
refactor: do NOT use fake DF context for parquet reading ( #5942 )
...
Use the proper top-level DataFusion context and register the object
store there.
Note that we still hide the `ParquetExec` behind an opaque record batch
stream. Fixing that is next on my list.
Helps with #5897 .
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-24 08:20:26 +00:00
Andrew Lamb
7781ed0455
chore: Update datafusion ( #5928 )
...
* chore: Upgrade datafusion
* chore: Update for new API
* chore: Update expected output
* fix: Update comment
* fix: compilation
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-20 14:37:49 +00:00
Marco Neumann
42b89ade03
refactor: use `SendableRecordBatchStream` to write parquets ( #5911 )
...
Use a proper typed stream instead of peeking the first element. This is
more in line with our remaining stack and shall also improve error
handling.
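The difference between the two approaches can be sketched with plain iterators. This is an analogy only: `Schema`, `Batch`, `TypedStream`, and both functions are hypothetical stand-ins, whereas DataFusion's `SendableRecordBatchStream` is an async stream that exposes its schema through a `schema()` method.

```rust
use std::iter::Peekable;

#[derive(Debug, Clone, PartialEq)]
struct Schema(Vec<String>);

#[derive(Debug, Clone, PartialEq)]
struct Batch {
    schema: Schema,
    rows: usize,
}

/// Before: the schema is only known by peeking at the first batch, so an
/// empty stream or a leading error yields no schema at all.
fn schema_by_peeking<I>(stream: &mut Peekable<I>) -> Option<Schema>
where
    I: Iterator<Item = Result<Batch, String>>,
{
    match stream.peek() {
        Some(Ok(batch)) => Some(batch.schema.clone()),
        _ => None,
    }
}

/// After: the stream type itself carries the schema.
struct TypedStream<I> {
    schema: Schema,
    inner: I,
}

impl<I> TypedStream<I> {
    fn schema(&self) -> &Schema {
        &self.schema
    }
}

impl<I: Iterator<Item = Result<Batch, String>>> Iterator for TypedStream<I> {
    type Item = Result<Batch, String>;

    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next()
    }
}

/// Consume the stream, returning the schema and the total row count.
/// Errors from the stream are propagated to the caller.
fn write_all<I>(stream: TypedStream<I>) -> Result<(Schema, usize), String>
where
    I: Iterator<Item = Result<Batch, String>>,
{
    let schema = stream.schema().clone();
    let mut rows = 0;
    for batch in stream {
        rows += batch?.rows;
    }
    Ok((schema, rows))
}
```

With the typed stream, an empty input still has a well-defined schema, and an error in any batch surfaces through the normal `Result` path instead of being special-cased at the first element.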
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-19 12:59:53 +00:00