Commit Graph

48 Commits (5d66cd0a81d2533ee174e1bfb56bab86f81c1fed)

Author SHA1 Message Date
Marco Neumann 2b76c31157
refactor: make statistics null counts optional (#4160)
Min/max values and distinct counts are already optional, so let's make
the null counts optional as well. This will be helpful for NG to deal w/
partial statistics (e.g. we only populate stats for the time column).

Note that the total count is still mandatory, but we normally have the
chunk/file-level row count at hand.
2022-03-29 17:47:57 +00:00
Andrew Lamb 5c69a3f43b
chore: Update deps: datafusion, arrow/arrow-flight/parquet to 11, zstd to 0.11 (#4119)
* chore: update datafusion

* chore(deps): Bump arrow from 10.0.0 to 11.0.0

Bumps [arrow](https://github.com/apache/arrow-rs) from 10.0.0 to 11.0.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/10.0.0...11.0.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore(deps): Bump arrow-flight from 10.0.0 to 11.0.0

Bumps [arrow-flight](https://github.com/apache/arrow-rs) from 10.0.0 to 11.0.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/10.0.0...11.0.0)

---
updated-dependencies:
- dependency-name: arrow-flight
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: update parquet to 11.0.0

* fix: error on create schema, test for same

* fix: upgrade zstd

* chore: Run cargo hakari tasks

* fix: fix logical merge conflict

* fix: hakari

* fix: hakari

* fix: update newly introduced dep

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-24 15:27:36 +00:00
Dom Dwyer 5585dd3c21 refactor: switch to using DynObjectStore
Changes all consumers of the object store to use the dynamically
dispatched DynObjectStore type, instead of using a hardcoded concrete
implementation type.
2022-03-15 16:32:52 +00:00
Dom Dwyer 1d5066c421 refactor: rename ObjectStore -> ObjectStoreImpl
Frees up the name for so we can use `dyn ObjectStore` throughout the
code instead of `ObjectStoreApi`.
2022-03-15 16:29:43 +00:00
Andrew Lamb 2c3d30ca32
chore: Update datafusion, arrow, flight and parquet (#4000)
* chore: Update datafusion, arrow, flight and parquet

* fix: api change

* fix: fmt

* fix: update test metadata size

* fix: Update sizes in parquet test

* fix: more metadata size update
2022-03-10 12:24:47 +00:00
Andrew Lamb 677a272095
refactor: Clean up some future clippy warnings from nightly (#3892)
* refactor: clean up new clippy lints

* refactor: complete other cleanups

* fix: ignore overzealous clippy

* fix: re-remove old code
2022-03-03 19:14:27 +00:00
Marco Neumann 33851be3a5
chore: upgrade Rust to 1.59 (#3875)
Mostly a few new clippy crates around `flat_map`, `and_then`, and
"underscore locks" (!!!):
https://rust-lang.github.io/rust-clippy/master/index.html#let_underscore_lock

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 15:14:19 +00:00
Raphael Taylor-Davies 2a842fbb1a
feat: correctly sort data and store in catalog metadata (#3864)
* feat: respect sort order in ChunkTableProvider (#3214)

feat: persist sort order in catalog (#3845)

refactor: owned SortKey (#3845)

* fix: size tests

* refactor: immutable SortKey

* test: test sort order restart (#3845)

* chore: explicit None for sort key

* chore: test cleanup

* fix: handling of sort keys containing fields

* chore: remove unused selected_sort_key

* chore: more docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-25 17:56:27 +00:00
Marco Neumann f966f4c7a4
feat: create `ParquetChunk` in querier (#3857)
Adds a small adapter that is able to produce `ParquetChunk`s for NG.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-25 08:54:16 +00:00
Marco Neumann 49d1be30e7
feat: wire up `ParquetFilePath` for NG (#3853)
It's a bit of a duck-type hack, but if we wanna just `ParquetFileChunk`
in the new architecture, we somehow need it to accept new-gen paths.
Also path handling should be somewhat centralized since
ingester/compactor/querier all need to construct them. So having a
`ParquetFilePath` that supports both path styles seems to be a
not-to-bad solution. This should obviously be cleaned up in some
not-to-distant future.
2022-02-24 16:05:38 +00:00
Marco Neumann d62a052394
feat: extend catalog so we can recover `ParquetChunk`s from it (#3852)
* refactor: less parquet data copying

* feat: `PartitionRepo::get_by_id`

* feat: `TableRepo::get_by_id`

* feat: `ParquetFile::file_size_bytes`

* feat: `ParquetFile::parquet_metadata`

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-24 13:16:15 +00:00
dependabot[bot] b63f920d4c
chore(deps): Bump parquet from 9.0.2 to 9.1.0 (#3828)
* chore(deps): Bump parquet from 9.0.2 to 9.1.0

Bumps [parquet](https://github.com/apache/arrow-rs) from 9.0.2 to 9.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/9.0.2...9.1.0)

---
updated-dependencies:
- dependency-name: parquet
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: update chunk size test

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 11:25:15 +00:00
dependabot[bot] 3b7d31c88a
chore(deps): Bump arrow from 9.0.2 to 9.1.0 (#3826)
Bumps [arrow](https://github.com/apache/arrow-rs) from 9.0.2 to 9.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/9.0.2...9.1.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-23 09:25:46 +00:00
dependabot[bot] ad3868ed7c
chore(deps): Bump tokio from 1.16.1 to 1.17.0 (#3814)
* chore(deps): Bump tokio from 1.16.1 to 1.17.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.16.1 to 1.17.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.16.1...tokio-1.17.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build: update workspace-hack

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom Dwyer <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-22 16:27:43 +00:00
Andrew Lamb a30803e692
chore: Update datafusion, update `arrow`/`parquet`/`arrow-flight` to 9.0 (#3733)
* chore: Update datafusion

* chore: Update arrow

* fix: missing updates

* chore: Update cargo.lock

* fix: update for smaller parquet size

* fix: update test for smaller parquet files

* test: ensure parquet_file tests write multiple row groups

* fix: update callsite

* fix: Update for tests

* fix: harkari

* fix: use IoxObjectStore::existing

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-15 12:10:24 +00:00
Marco Neumann 22778a3a80
chore: upgrade rskafka and parking_lot (#3592) 2022-02-01 11:50:42 +00:00
Carol (Nichols || Goulding) 0f72a881ef
refactor: Rename Rust struct parquet_file::IoxMetadata to be IoxMetadataOld 2022-01-31 10:36:33 -05:00
Raphael Taylor-Davies 4101d16f71
chore: feature flag consistency (#3574)
* chore: feature flag consistency

* chore: add aarch64-apple-darwin to hakari

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-28 16:38:59 +00:00
Andrew Lamb 5488c257d1
chore: Update datafusion, upgrade to arrow/parqet/arrow-flight 8.0.0 (#3517)
* chore: Update datafusion

* chore: update to arrow 8

* fix: update to use new DataFusion APIs

* fix: update case for sortedness

* fix: cargo hakari
2022-01-27 13:33:27 +00:00
Andrew Lamb 9c19cd6cc4
fix: clamp start/end of TimestampRange to min/max valid timestamp values (#3487)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-20 16:08:00 +00:00
Andrew Lamb dd23056efd
chore: update datafusion, arrow, prost, tonic, pbjson, etc (#3455)
* chore: update datafusion, arrow, prost, tonic, etc

* fix: update pprof as well

* chore: update hakari

* fix: update pbjson

* chore: update heappy

* fix: hakari

* fix: workaround https://github.com/influxdata/influxdb_iox/issues/3458

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-13 17:07:15 +00:00
Marco Neumann f3f6f335a9
chore: upgrade to snafu 0.7 (#3440) 2022-01-11 19:22:36 +00:00
Marco Neumann 37bb7f2120 chore: `cargo update`
dependabot currently doesn't work due to
https://github.com/dependabot/dependabot-core/issues/4574

Excluded `quote` due to
https://github.com/dtolnay/quote/issues/204
2022-01-11 14:57:51 +01:00
Andrew Lamb ccba68fe3e
chore: Update datafusion and sqlparser, and arrow (#3385)
* chore: Update datafusion and sqlparser, and arrow

* fix: cargo hakari

* test: fix metadata size

* fix: update some metadata sizes

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-12-16 12:13:40 +00:00
Nga Tran c992c82582 chore: Merge branch 'main' into ntran/compact_os_tests 2021-12-07 11:08:12 -05:00
Carol (Nichols || Goulding) 7499eac067
fix: Disable uuid serde feature; we're not actually serializing any UUIDs
Connects to #3117.
2021-12-06 09:37:31 -05:00
Carol (Nichols || Goulding) 02c297e850
fix: Always specify the parking_lot feature of tokio to get potential perf boost 2021-12-06 09:37:15 -05:00
Carol (Nichols || Goulding) 0b24b3c227
fix: Use a consistent version specifier when depending on the futures crate 2021-12-06 09:37:12 -05:00
Raphael Taylor-Davies bca561366b
feat: don't copy parquet files out of disk object store (#3282) (#3293)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-12-05 16:31:40 +00:00
Nga Tran 86f9fe0bcb refactor: no longer need to create and test no-row-groups parquet files 2021-12-03 15:14:04 -05:00
Nga Tran 152281e428 fix: Capture the right 'no data' while parquet has no data 2021-12-03 12:19:48 -05:00
Edd Robinson b4ea9887ba refactor: error name 2021-12-02 20:14:02 +00:00
Edd Robinson 88aedc556e feat: add FromStr implementation 2021-12-02 12:59:52 +00:00
Nga Tran f085af034e
refactor: not persist empty chunk resulting from deleting & deduplicating (#3274)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-12-01 20:57:30 +00:00
Andrew Lamb 20da62eb87
chore: Update `arrow`, `arrow-flight`, and `parquet` to 6.3.0 (#3254)
* chore: Update `arrow`,  `arrow-flight`, and `parquet` to 6.3.0

* fix: update metadata size
2021-11-30 12:30:17 +00:00
Raphael Taylor-Davies 88acf3788e
feat: rebuild catalog (#3225) (#3253)
* feat: rebuild catalog (#3225)

* chore: fix doc

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-30 12:17:27 +00:00
kodiakhq[bot] d16a7759ca
Merge branch 'main' into cn/workspace-hack 2021-11-22 17:05:31 +00:00
Raphael Taylor-Davies 73d60539ad
refactor: use ChunkGenerator in parquet_catalog (#2209) (#3167)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-22 10:29:33 +00:00
Carol (Nichols || Goulding) 9fd4a560f5
feat: Results of running cargo hakari manage-deps 2021-11-19 09:21:57 -05:00
Carol (Nichols || Goulding) a2454b542d
fix: Small cleanups in Cargo.tomls (#3160)
* fix: Add tokio rt-multi-thread feature so cargo test -p client_util compiles

* fix: Alphabetize dependencies

* fix: Add the data_types_conversions feature to get tests passing

* fix: Remove dev dependencies already listed under normal dependencies

* fix: Make sure the workspace is using the new resolver
2021-11-18 22:26:33 +00:00
Raphael Taylor-Davies 58f3e2e559
refactor: move delete predicate proto serialization to generated_types (#3108)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-15 12:02:14 +00:00
Raphael Taylor-Davies 3d091208af
refactor: move delete predicate into data_types (#2731) (#3094)
* refactor: move delete predicate into dml (#2731)

* refactor: move DeletePredicate to data_types

* chore: fix doc

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-15 10:28:58 +00:00
Marco Neumann 4c9570b519 refactor: move `catalog` protobuf to `preserved_catalog`
This makes it clearer what's going since the contained messages are
only for the preserved part, not the in-mem catalog and its management.
2021-11-01 18:07:25 +01:00
dependabot[bot] c540b40f05
chore(deps): bump tokio from 1.12.0 to 1.13.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.12.0 to 1.13.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.12.0...tokio-1.13.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-01 11:21:59 +00:00
Marco Neumann bc7244c48e chore: use Rust edition 2021 2021-10-25 10:58:20 +02:00
Andrew Lamb a82dc6f5f0
chore: Update datafusion + arrow (#2903)
* chore: Update datafusion to latest, arrow to 6.0.0

* fix: Update tests

* fix: bubble internal error

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-19 17:14:08 +00:00
Marco Neumann 586d6b6f28 chore: remove unused `parquet_catalog` => `chrono` dep 2021-10-19 14:45:56 +02:00
Marco Neumann 28195b9c0c chore: new `parquet_catalog` crate 2021-10-14 14:34:59 +02:00