Commit Graph

464 Commits (e92e94caad6843866c7b4046fa502f820b2bd102)

Author SHA1 Message Date
Andrew Lamb e92e94caad
chore: Update deps (including arrow 5.1.0, tonic -> 0.5, and prost 0.5) (#2172)
* chore: Update deps (including arrow 5.0.0 --> arrow 5.1.0)

* chore: update all the things

* refactor: Update serving readiness check due to change in Tonic API

* chore: update more deps

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-05 15:57:38 +00:00
Andrew Lamb bb8021d9fd
fix: "Can not convert index to usize in dictionary of type creating group by value Int32" (#2151)
* test: add reproducer for index error

* chore: update datafusion
2021-08-02 12:20:41 +00:00
Andrew Lamb 7ca31703a4
feat: Attempt to dump a stacktrace to stderr prior to process abort on SIGSEGV/SIGILL/SIGBUS (#2155)
* feat: Attempt to dump a stacktrace to stdout prior to process abort

* refactor: rewrite signal handling in terms of libc, remove sig dep

* refactor: print to stderr

* fix: Update src/main.rs

* docs: note provenance
2021-07-30 11:58:28 +00:00
Andrew Lamb 21270b7072
chore: update dependencies (#2152)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-29 18:17:39 +00:00
kodiakhq[bot] cd0910c825
Merge branch 'main' into cn/update-croaring 2021-07-28 13:58:24 +00:00
Carol (Nichols || Goulding) 537e031987 fix: Upgrade croaring to get a performance fix
This release includes https://github.com/saulius/croaring-rs/pull/73 which
includes 6403d44cbe
which should hopefully fix some of the dynamic CPUID issues we're
seeing.
2021-07-28 09:25:01 -04:00
Marco Neumann e736bc6953 feat: add ingest timestamp to `Sequence`
This allows us to track wall-clock ingest time for entries that we
receive via write buffer (e.g. Kafka).
2021-07-28 14:41:18 +02:00
Dom 7d3b6bf80d feat: DatabaseRules (de)serialisation support
Derives serde::Serialise & Deserialise for the DatabaseRules struct and
children.
2021-07-27 14:29:26 +01:00
Andrew Lamb 9d408296d8
chore: Update datafusion deps (#2108)
* chore: Update datafusion deps

* chore: update other deps

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-23 21:24:01 +00:00
Raphael Taylor-Davies 446af5eb15
fix: consistent write timestamps (#2104)
* fix: consistent write timestamps

* chore: fix benchmarks
2021-07-23 18:04:15 +00:00
Raphael Taylor-Davies 844a025c7c
feat: drop based on LRU (#2075) (#2092)
* feat: drop based on LRU (#2075)

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-23 08:31:28 +00:00
kodiakhq[bot] 71f3f1aba2
Merge branch 'main' into cn/refactorings 2021-07-22 19:44:18 +00:00
Andrew Lamb 01c79f1a1a
fix: Print all timestamps using RFC3339 format (#2098)
* fix: Use IOx pretty printer rather than arrow pretty printer

* chore: update tests in the query crate

* chore: update influxdb_iox tests

* chore: Update end to end tests

* chore: update query_tests

* chore: update mutable_buffer tests

* refactor: update parquet_file tests

* refactor: update db tests

* chore: update kafka integration test output

* fix: merge conflict
2021-07-22 19:04:52 +00:00
Andrew Lamb b3d6e3ed7b
fix: Implement pretty printer wity RFC3339 formatting for timestamps (#2096)
* fix: Implement pretty printer wity RFC3339 formatting for timestamps

* fix: doc strings + fmt
2021-07-22 17:08:27 +00:00
Carol (Nichols || Goulding) 8d1d877196 feat: Record first/last write times for RUB chunks 2021-07-22 11:35:22 -04:00
Raphael Taylor-Davies 8c974beba0
feat: add access timestamps to CatalogChunk (#2075) (#2081)
* feat: add access timestamps to CatalogChunk (#2075)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 12:19:30 +00:00
Raphael Taylor-Davies ffe6e62aee
feat: add instant to datetime conversion (#2078)
* feat: add instant to datetime conversion

* chore: review feedback
2021-07-21 11:43:27 +00:00
Andrew Lamb 387667330a
chore: Update datafusion deps (#2073)
* chore: Update datafusion deps

* fix: update tests
2021-07-21 08:27:03 +00:00
Andrew Lamb 1c16988a51
chore: Update datafusion references (#2056) 2021-07-19 18:09:06 +00:00
Carol (Nichols || Goulding) ff6b0ac030
fix: Cargo git deps use rev rather than ref (#2050)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-19 13:31:24 +00:00
Edd Robinson a852dce450 refactor: default to num_cpu 2021-07-19 14:00:10 +01:00
Andrew Lamb 4da8a16c18
chore: update to arrow 5.0 and master datafusion (#2049)
* chore: update to arrow 5.0 and master datafusion

* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Marko Mikulicic 06399e88e0
chore: Add some debug logs to write buffer 2021-07-15 22:18:03 +02:00
Andrew Lamb 74b8bb76e6
chore: Update to correct pre-release version of DF (#2023)
Co-authored-by: Edd Robinson <me@edd.io>
2021-07-15 18:13:42 +00:00
Marco Neumann b5428e53a5 refactor: write buffer testing + better mocking
This refactors the write buffer a bit for:

- **Testing:** Add generic tests for the Kafka and the mocking
  implementation. The same interface can be used easily add new
  implementations (e.g. via Redis, filesystem, ...).
- **Partition on Write:** The caller of the writer operation must now
  specify the partition/sequencer ID. The implicit partitioning of the
  Kafka writer would have lead to broken data since we must never spill
  entries w/ the same primary key over multiple partitions. At the
  moment we will only use partition 0 but we can easily implement
  better logic in the future.
- **Improved Mocking:** The mocked implementation now simulates a system
  that feels more real. Especially the handling around multiple streams
  and "write while read" has been improved. This will be helpful for
  testing and for new features like seeking (during replay). A solid
  realistic mock also helps us to ensure that the tests using the mock
  do not rely on unrealistic behavior too much.
2021-07-15 17:20:45 +02:00
Raphael Taylor-Davies d71f38f27c
feat: compute PartitionCheckpoint from PersistenceWindows (#2011)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 12:17:23 +00:00
Raphael Taylor-Davies cbeeb97cff
feat: flush open window on persist (#1985)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-14 16:58:20 +00:00
Edd Robinson 0e5276ed20
Merge branch 'main' into alamb/go_go_go_go 2021-07-14 13:56:35 +01:00
Marco Neumann 9cb9ae0874 chore: move write buffer into its own crate 2021-07-14 14:09:18 +02:00
Andrew Lamb 4800b36949 chore: Update IOx to a pre-release version of arrow and datafusion to test out performance improvement 2021-07-13 15:44:57 -04:00
Andrew Lamb ef3269ee1d
chore: Update datafusion deps (#1984)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 17:29:29 +00:00
Marco Neumann 157a0cc98c chore: update flatbuffers to 2.0 2021-07-13 15:44:45 +02:00
Marco Neumann f5c63b2ae6 chore: update nom to version 6
Triggered some changes from `Fn` to `FnMut`.
2021-07-13 15:28:42 +02:00
Marco Neumann f11be17523 chore: update mockito to 0.30 2021-07-13 15:17:28 +02:00
Marco Neumann 2e391deb34 chore: update croaring to 0.5.0
Upstreame changelog:

- CRoaring updated to 0.3.1
- `-march=native` is not a default for croaring-sys anymore
- Impl Default for `Bitmap` and `Treemap`
2021-07-13 15:15:41 +02:00
Marco Neumann 5c16bd2085 chore: cargo update 2021-07-13 15:13:37 +02:00
Andrew Lamb 8b9b369189
chore: Update DataFusion again (#1930)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 15:14:45 +00:00
Andrew Lamb 7602bde850
chore: Update datafusion deps (#1799)
* chore: Update datafusion deps + rework code

* refactor: remove workaround as it has been contributed upstream

* fix: Update query/src/exec/split.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 10:58:32 +00:00
Marco Neumann d6cff911b6 test: ensure that query tests don't rebuild all the time
Beforehand:

```text
❯ env CARGO_LOG=cargo::core::compiler::fingerprint=info cargo test -p query_tests
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint] stale: changed "/home/mneumann/src/influxdb_iox/query_tests/cases"
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]           (vs) "/home/mneumann/src/influxdb_iox/target/debug/build/query_tests-0e8f741dfb84437f/output"
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]                FileTime { seconds: 1625474716, nanos: 436081357 } != FileTime { seconds: 1625474752, nanos: 52625167 }
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/Test/TargetInner { ..: lib_target("query_tests", ["lib"], "/home/mneumann/src/influxdb_iox/query_tests/src/lib.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]     err: current filesystem status shows we're outdated
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/RunCustomBuild/TargetInner { ..: custom_build_target("build-script-build", "/home/mneumann/src/influxdb_iox/query_tests/build.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]     err: current filesystem status shows we're outdated
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/Build/TargetInner { ..: lib_target("query_tests", ["lib"], "/home/mneumann/src/influxdb_iox/query_tests/src/lib.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]     err: current filesystem status shows we're outdated
   Compiling query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)
```

The issue is that both the input and the test output files are located
under `cases/`. `build.rs` used `cargo:rerun-if-changed=cases` which per
Cargo doc will scan ALL files in that directory. Note that the normal
`exclude` directive in `Cargo.toml` does NOT work, see
https://github.com/rust-lang/cargo/issues/4587 .

So we need to split input and output files into separate directories
(`cases/{in,out}`).
2021-07-05 15:30:10 +02:00
Marco Neumann 4ca2d3e148 chore: move persistence windows related code into own crate
The entire persistence windows data structures (including the
checkpoints) have nothing to do with the mutable buffer per se. So lets
move them into their own crate. This also makes `parquet_file` not
longer depend on `mutable_buffer`.
2021-07-05 10:23:58 +02:00
Marco Neumann cdab1bed05 feat: persist part+db checkpoint in parquets and catalog
This will be required for replay on server startup.
2021-07-05 09:42:46 +02:00
Jacob Marble 0779b0d9bd
feat: add gRPC listener for new write protocol (#1842)
* feat: add gRPC listener for new write protocol

* chore: clippy happy

* chore: lint

* chore: cargo fmt --all

* chore: cargo clippy

* chore: protobuf-lint

* chore: more formatting

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:15:12 +00:00
Marco Neumann 4204127b05 refactor: use protobuf for in-parquet metadata 2021-06-30 16:51:37 +02:00
Andrew Lamb fef160e24f
feat: Implement data driven query_tests and port explain tests (#1814)
* feat: Implment data driven query testing and port explain tests

* fix: do not fmt the auto generated cases

* refactor: split setup and parser into separate modules

* refactor: Add log to runner, add end to end tests

* docs: fixu cpmments
2021-06-29 16:09:51 +00:00
Andrew Lamb 5cc773ad80
chore: update deps to get new arrow (#1831) 2021-06-28 20:31:31 +00:00
kodiakhq[bot] 1bde983d66
Merge branch 'main' into cn/kafka-write 2021-06-24 12:42:44 +00:00
Marko Mikulicic 69b0bb1510
feat: Implement grpc-router crate
This PR implements the main building block for implementing the gRPC StorageService router.
2021-06-23 17:21:46 +02:00
Carol (Nichols || Goulding) c66f9e5aeb feat: Write entries to Kafka when configured as the write buffer 2021-06-23 10:48:18 -04:00
kodiakhq[bot] 3fd45d987e
Merge branch 'main' into crepererum/fix_server_startup 2021-06-23 07:17:19 +00:00
Andrew Lamb 4c5007f961
fix: Select the correct timestamp for min/max selectors (#1771)
* test: Reproducer showing that the min/max selectors are order dependent

* fix: pick correct timestamp for first/last selectors

* refactor: remove println

* docs: Fixup comments and add to link to arrow-datafusion/issues/600

* fix: Add debug if timestamp is null

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-22 17:53:54 +00:00