56 Commits (1ca9d28feeb9fb6bb816a298115feab33eec7108)

Author SHA1 Message Date
Andrew Lamb 2c3d30ca32
chore: Update datafusion, arrow, flight and parquet (#4000)
* chore: Update datafusion, arrow, flight and parquet

* fix: api change

* fix: fmt

* fix: update test metadata size

* fix: Update sizes in parquet test

* fix: more metadata size update
2022-03-10 12:24:47 +00:00
Carol (Nichols || Goulding) 8f3e44bf76
refactor: Extract a crate for shared data types in the new design 2022-03-02 12:16:15 -05:00
dependabot[bot] b63f920d4c
chore(deps): Bump parquet from 9.0.2 to 9.1.0 (#3828)
* chore(deps): Bump parquet from 9.0.2 to 9.1.0

Bumps [parquet](https://github.com/apache/arrow-rs) from 9.0.2 to 9.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/9.0.2...9.1.0)

---
updated-dependencies:
- dependency-name: parquet
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: update chunk size test

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 11:25:15 +00:00
dependabot[bot] 3b7d31c88a
chore(deps): Bump arrow from 9.0.2 to 9.1.0 (#3826)
Bumps [arrow](https://github.com/apache/arrow-rs) from 9.0.2 to 9.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/9.0.2...9.1.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-23 09:25:46 +00:00
dependabot[bot] ad3868ed7c
chore(deps): Bump tokio from 1.16.1 to 1.17.0 (#3814)
* chore(deps): Bump tokio from 1.16.1 to 1.17.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.16.1 to 1.17.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.16.1...tokio-1.17.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build: update workspace-hack

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom Dwyer <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-22 16:27:43 +00:00
Andrew Lamb a30803e692
chore: Update datafusion, update `arrow`/`parquet`/`arrow-flight` to 9.0 (#3733)
* chore: Update datafusion

* chore: Update arrow

* fix: missing updates

* chore: Update cargo.lock

* fix: update for smaller parquet size

* fix: update test for smaller parquet files

* test: ensure parquet_file tests write multiple row groups

* fix: update callsite

* fix: Update for tests

* fix: hakari

* fix: use IoxObjectStore::existing

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-15 12:10:24 +00:00
Marco Neumann 22778a3a80
chore: upgrade rskafka and parking_lot (#3592) 2022-02-01 11:50:42 +00:00
Carol (Nichols || Goulding) bf89162fa5
refactor: Move IoxMetadata to parquet_file 2022-01-31 10:36:33 -05:00
Andrew Lamb 5488c257d1
chore: Update datafusion, upgrade to arrow/parquet/arrow-flight 8.0.0 (#3517)
* chore: Update datafusion

* chore: update to arrow 8

* fix: update to use new DataFusion APIs

* fix: update case for sortedness

* fix: cargo hakari
2022-01-27 13:33:27 +00:00
Andrew Lamb dd23056efd
chore: update datafusion, arrow, prost, tonic, pbjson, etc (#3455)
* chore: update datafusion, arrow, prost, tonic, etc

* fix: update pprof as well

* chore: update hakari

* fix: update pbjson

* chore: update heappy

* fix: hakari

* fix: workaround https://github.com/influxdata/influxdb_iox/issues/3458

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-13 17:07:15 +00:00
Marco Neumann f3f6f335a9
chore: upgrade to snafu 0.7 (#3440) 2022-01-11 19:22:36 +00:00
Carol (Nichols || Goulding) 7499eac067
fix: Disable uuid serde feature; we're not actually serializing any UUIDs
Connects to #3117.
2021-12-06 09:37:31 -05:00
Carol (Nichols || Goulding) 02c297e850
fix: Always specify the parking_lot feature of tokio to get potential perf boost 2021-12-06 09:37:15 -05:00
Carol (Nichols || Goulding) 0b24b3c227
fix: Use a consistent version specifier when depending on the futures crate 2021-12-06 09:37:12 -05:00
Carol (Nichols || Goulding) 9fd4a560f5
feat: Results of running cargo hakari manage-deps 2021-11-19 09:21:57 -05:00
dependabot[bot] c540b40f05
chore(deps): bump tokio from 1.12.0 to 1.13.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.12.0 to 1.13.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.12.0...tokio-1.13.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-01 11:21:59 +00:00
Marco Neumann bc7244c48e chore: use Rust edition 2021 2021-10-25 10:58:20 +02:00
Andrew Lamb a82dc6f5f0
chore: Update datafusion + arrow (#2903)
* chore: Update datafusion to latest, arrow to 6.0.0

* fix: Update tests

* fix: bubble internal error

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-19 17:14:08 +00:00
Marco Neumann d8f35d8ee9 chore: remove unused `parquet_file` => `chrono` dep 2021-10-19 14:45:56 +02:00
Raphael Taylor-Davies b39e01f7ba
feat: migrate PersistenceWindows to TimeProvider (#2722) (#2798) 2021-10-11 20:40:00 +00:00
Raphael Taylor-Davies afe34751e7
refactor: split out schema crate (#2781)
* refactor: split out schema crate

* chore: fix doc
2021-10-11 09:45:08 +00:00
Raphael Taylor-Davies 86cee568d5
feat: use upstream pbjson (#2650)
* feat: use upstream pbjson

* chore: fmt
2021-09-28 16:29:26 +00:00
Marco Neumann d7b697dfe9 chore: remove unused `object_store` => `tracker` dep 2021-09-22 11:13:40 +02:00
Marco Neumann 9c80d32af5 refactor: use normal google timestamps in parquet metadata again
We changed from Google timestamps (which use variable-sized integers) to
our own fixed-sized integer timestamps so that the size of the parquet
metadata does not depend on the timestamp. However, with the introduction
of compression the size depends on the timestamp anyway (since slightly
different timestamps lead to different compression results), and we now
need deterministic timestamps for tests. So there is no point in using our
own timestamp type. Switching back to the variable-sized type also shrinks
the post-compression results a bit.
2021-09-20 09:34:03 +02:00
Marco Neumann afc507ae14 feat: compress encoded parquet metadata
Depending on the number of columns, this should save between 60% and
75%.
2021-09-20 09:33:18 +02:00
Marco Neumann 509c07330d refactor: decouple `parquet_file` from `query` 2021-09-14 18:26:16 +02:00
Marco Neumann bfaba78dc3 refactor: move `predicate` into its own crate
Two reasons:

1. I wanna decouple `parquet_file` from `query` (nearly done, needs a
   small follow-up PR).
2. `predicate` will have more and more features (like serialization)
   which justifies a new home
2021-09-14 17:13:02 +02:00
Raphael Taylor-Davies 44918e4afc
feat: migrate chunk metrics (#2491)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-09 16:02:16 +00:00
Carol (Nichols || Goulding) ae6b0e669b refactor: Extract a database persister type that wraps object store
Connects to #2193.
2021-08-12 15:05:32 -04:00
Andrew Lamb d41b44d312
feat: use zstd compression when writing parquet files (#2218)
* feat: use ZSTD when writing parquet files

* fix: test
2021-08-06 18:45:55 +00:00
Andrew Lamb e92e94caad
chore: Update deps (including arrow 5.1.0, tonic -> 0.5, and prost 0.5) (#2172)
* chore: Update deps (including arrow 5.0.0 --> arrow 5.1.0)

* chore: update all the things

* refactor: Update serving readiness check due to change in Tonic API

* chore: update more deps

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-05 15:57:38 +00:00
Andrew Lamb 4da8a16c18
chore: update to arrow 5.0 and master datafusion (#2049)
* chore: update to arrow 5.0 and master datafusion

* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Marco Neumann 4ca2d3e148 chore: move persistence windows related code into own crate
The entire persistence windows data structures (including the
checkpoints) have nothing to do with the mutable buffer per se. So let's
move them into their own crate. This also means `parquet_file` no
longer depends on `mutable_buffer`.
2021-07-05 10:23:58 +02:00
Marco Neumann cdab1bed05 feat: persist part+db checkpoint in parquets and catalog
This will be required for replay on server startup.
2021-07-05 09:42:46 +02:00
Marco Neumann 4204127b05 refactor: use protobuf for in-parquet metadata 2021-06-30 16:51:37 +02:00
Marco Neumann 0a625b50e6 feat: store transaction timestamp in preserved catalog 2021-06-02 09:41:19 +02:00
Andrew Lamb d3711a5591
refactor: Use ParquetExec from DataFusion to read parquet files (#1580)
* refactor: use ParquetExec to read parquet files

* fix: test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-01 14:44:07 +00:00
Andrew Lamb 00e735ef0d
chore: remove unused dependencies (#1583) 2021-05-29 10:31:57 +00:00
Raphael Taylor-Davies 4fcc04e6c9
chore: enable arrow prettyprint feature (#1566) 2021-05-27 10:28:14 +00:00
Marco Neumann 19a2733d30 feat: preserve transaction metadata in parquets 2021-05-25 09:56:12 +02:00
Andrew Lamb 14ba25f86d
chore: Update datafusion and use released version of arrow crates (#1546)
* chore: Update datafusion and use released version of arrow crate

* fix: Update for change in API
2021-05-24 15:37:22 +00:00
Raphael Taylor-Davies f9178dbb5f
feat: push metrics into catalog (#1488)
* feat: push metrics into catalog

* chore: minor cleanup

* fix: include db labels in chunk metric domains

* chore: fmt

* fix: don't allow dropping moving chunks

* chore: further tweaks

* chore: review feedback

* feat: use new_unregistered() for metric instruments instead of default

* chore: use &[KeyValue] instead of &Vec<KeyValue>

* refactor: make GauageValue non default constructible
2021-05-14 17:37:39 +00:00
Andrew Lamb 86771ea629
chore: update arrow/datafusion deps (#1433)
* chore: update datafusion deps

* chore: update arrow deps

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-05 22:37:31 +00:00
Raphael Taylor-Davies 411cf134e9
refactor: explode arrow_deps (#1425)
* refactor: explode arrow_deps

* chore: workaround doctest bug
2021-05-05 16:59:12 +00:00
Marco Neumann 1f42eb89cd feat: implement parquet metadata handling
Closes #1379 and contributes to #1380.
2021-05-05 13:29:16 +02:00
Marco Neumann 136c35cb88 feat: implement transaction handling for catalog
Closes #1253.
2021-05-03 10:04:35 +02:00
Nga Tran 4c23ca8888 feat: full implementation of parquet's read_filter for review 2021-04-16 16:03:24 -04:00
Andrew Lamb e226b5a820
feat: Use TimestampNanosecondArray for timestamps in IOx (#1230)
* refactor: Create Arrow arrays using iterators

* feat: use Timestamp64(TimeUnit::Nanosecond) for timestamps

* feat: add support for timestamp array

* fix: update more tests

* fix: remove unnecessary code

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-16 15:55:33 +00:00
Nga Tran 4a6d6bd7ad feat: initial work for querying data from parquet file in object store 2021-04-13 13:57:46 -04:00
Raphael Taylor-Davies 1997324344
feat: mutable buffer snapshotting (#1179)
* feat: mutable buffer snapshotting

* chore: review feedback
2021-04-13 12:14:54 +00:00