Commit Graph

1076 Commits (a55a21c644c2865b16c87cdb538523b1da762410)

Author SHA1 Message Date
Raphael Taylor-Davies 1534ae9edf
refactor: store chunks in iteration order (#2634)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 11:59:27 +00:00
Carol (Nichols || Goulding) cf83a325f2
fix: Await on freeze handles instead of try_freeze/returning Transition errors (#2570)
* fix: Await on a freeze handle instead of returning TransitionInProgress

* fix: Await on freeze handle in skip_replay

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-26 10:44:59 +00:00
Edd Robinson 621b26166c feat: validate predicates on read_filter 2021-09-24 14:52:16 +01:00
Marco Neumann 6d85700e3e
docs: mention return type.
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-09-24 07:23:09 +00:00
Marco Neumann 4a0cda188e refactor: make `Partition::force_drop_chunk` similar to `Partition::drop_chunk`
- Bubbles up "not found" error, the caller should reason about it
- Returns deleted chunk
2021-09-23 18:37:54 +02:00
Marco Neumann e842733c5b refactor: `CatalogChunk::add_delete_predicate` cannot fail 2021-09-23 09:55:31 +02:00
kodiakhq[bot] b16e7ea91a
Merge branch 'main' into crepererum/issue2518c 2021-09-22 16:09:04 +00:00
Andrew Lamb d38648952c
chore: Update datafusion (#2602)
* chore: Update datafusion + other deps

* refactor: update query crate for new async interfaces

* refactor: update server crate for new async interface

* refactor: update query_tests crate for new async interfaces

* refactor: update ioxd and server to use new async interface

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-22 10:33:25 +00:00
Marco Neumann 0d7bb69dd3 feat: teach `Db` to preserve delete predicates 2021-09-22 09:43:37 +02:00
Marco Neumann 981ee0c6df refactor: accept unknown chunks in persisted delete predicates
Due to the timing of the "persist" lifecycle action and that delete
predicates might arrive at any time + the fact that we don't wanna hold
transaction locks for too long, we should accept delete predicates for
chunks that are currently "persisting" even though that lifecycle action
might fail.
2021-09-22 09:29:50 +02:00
Nga Tran b4b33c378e test: turn all delete tests on 2021-09-21 15:23:41 -04:00
Marco Neumann fb7299a169
fix: bubble up write errors (#2598)
Fixes #2538.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-21 11:08:37 -04:00
Marco Neumann 015dfb3b16 test: do not (ab)use the panic hook for replay tests
The old construct uses a single assert-statement for both:

- "bubble-up" scenario, were a panic should fire
- a check, were a panic should not fire

That makes it easy to add new tests. However we need two rather
questionable things to make that work:

- catch panic: to convert an assertion to a check
- a custom panic hook: to make tests not overly verbose (aka caught
  panics should not show up on stdout)

Esp. the custom panic hook doesn't work too well w/ multi-threaded tests
since it might swallow error messages from unrelated tests and makes
debugging of CI failures hard.

So instead of using assertions for checks, we now implement a proper
assertion and a check for each test. That's a bit more code per check
but easier probably more stable.
2021-09-21 12:00:37 +02:00
kodiakhq[bot] 77d84ca5ab
Merge branch 'main' into crepererum/chunk_id 2021-09-20 13:39:05 +00:00
kodiakhq[bot] c7e6fffaaa
Merge branch 'main' into ntran/delete_scan 2021-09-20 13:29:47 +00:00
Marco Neumann cef5aeee52 refactor: introduce `ChunkId` type 2021-09-20 13:10:41 +02:00
kodiakhq[bot] 140c71eaf0
Merge branch 'main' into crepererum/issue2518a 2021-09-20 09:16:39 +00:00
Raphael Taylor-Davies f62d0eab3c
feat: disable bytes serde (#2580)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-20 09:07:12 +00:00
Marco Neumann acf698c366 fix: delete predicate sorting 2021-09-20 10:48:32 +02:00
Marco Neumann 492d991f49 feat: delete catalog pres. catalog <=> in-mem catalog API
First step towards #2518. Creates the Rust API to communicate delete
predicates between the preserved catalog and the in-memory catalog and
adds tests ensuring that the in-mem catalog produces the wanted errors
as well as correct checkpoints (similar to how this is done for the
parquet file tracking already).

**This does NOT contain the actual preservation!**
2021-09-20 10:48:32 +02:00
dependabot[bot] 876bb10cf8
chore(deps): bump rand_distr from 0.4.1 to 0.4.2
Bumps [rand_distr](https://github.com/rust-random/rand) from 0.4.1 to 0.4.2.
- [Release notes](https://github.com/rust-random/rand/releases)
- [Changelog](https://github.com/rust-random/rand/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-random/rand/compare/rand_distr-0.4.1...rand_distr-0.4.2)

---
updated-dependencies:
- dependency-name: rand_distr
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-20 08:39:39 +00:00
Marco Neumann 0f5198c88d test: fix tests dealing w/ parquet metadata sizes
Sizes now depend on the actual content and therefore we need
deterministic timestamps.
2021-09-20 09:42:53 +02:00
Marco Neumann e15631002e test: allow test code to specify exact parquet creation timestamp
This is required for deterministic sizes since different timestamp lead
to different compression ratios.
2021-09-20 09:42:52 +02:00
Nga Tran 364d245eae feat: apply negated delete predicates during scan 2021-09-17 16:20:42 -04:00
Carol (Nichols || Goulding) 51a40b31bf feat: Add a --detailed option to the database list CLI
That will list both active and deleted databases with their generations.

Closes #2462.
2021-09-17 15:27:23 -04:00
Carol (Nichols || Goulding) 44a89cdf75 refactor: Change DeletedDatabase to DetailedDatabase
So this info can be reused for active databases in detailed database
lists.
2021-09-17 15:27:22 -04:00
kodiakhq[bot] 23cc980d9e
Merge branch 'main' into cn/restore 2021-09-17 17:52:56 +00:00
Nga Tran 60a866ddcb refactor: merge delete predicates into select predicate 2021-09-17 07:52:33 -04:00
Nga Tran 0444d1b4fd chore: merge main to branch 2021-09-16 17:28:37 -04:00
Nga Tran 6cfeeb352b refactor: address review comments 2021-09-16 17:21:06 -04:00
Nga Tran cf4fd500b9 refactor: remove tests moved to query_tests 2021-09-16 15:05:48 -04:00
Marco Neumann ec943081c7 refactor: `Arc<Vec<...>>` => `Vec<Arc<...>>` for del predicates
The motivations are:

1. The API uses a SINGLE predicate and adds that to many chunks. With
   `Arc<Vec<...>>` you gain nothing, with `Vec<Arc<...>>` the predicate
   is only stored once (in many vectors)
2. While we currently add predicates blindly to all chunks, we can be way
   smarter in the future and prune out tables, partitions or even single
   chunks (based on statistics). With that, it will be rare that many
   chunks share the exact same set of predicates.
3. It would be nice if we could de-duplicate predicates when writing them
   to the preserved catalog without needing to repeat the pruning
   discussed in point 2. This is way easier to implement whan chunks
   exists in `Arc`s.
4. As a side-note: the `Arc<Vec<...>>` wasn't really cloned around but
   instead was created many time. So the new version should be more
   memory efficient out of the box.
2021-09-16 17:16:09 +02:00
Andrew Lamb ce224bd37f
fix: Capture query execution traces for storage gRPC queries as well (#2553)
* fix: Capture query execution traces for storage gRPC queries as well

* refactor: remove debugging droppings

* refactor: do not Box::pin within TracedStream

* refactor: Use Futures::TryStreamExt rather than custom collect function

* fix: remove wild println

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-16 14:45:20 +00:00
kodiakhq[bot] 33cd1cffad
Merge branch 'main' into ntran/delete_read 2021-09-16 13:22:50 +00:00
Marco Neumann bb17b4e2c2 test: even more time-related lifting
Lift a few `Utc::now()` calls further and narrow down checks in tests.
Also avoid a few `<` comparisons which might not always hold.
2021-09-16 11:05:10 +02:00
Marco Neumann 2820db5583 refactor: split preserved catalog `api` into `core` and `interface`
This makes it clearer which traits and functions users of the preserved
catalog must implement. This also splits the error types into smaller
enums that are easier to understand.

This change should make it easier to implement new functionality (like
capturing delete predicates).
2021-09-16 10:30:11 +02:00
Carol (Nichols || Goulding) 91fd32d506 fix: Reset restored db's state instead of restarting background worker 2021-09-15 19:04:05 -04:00
Nga Tran 61e1eac135 fix: fix the cases of multi[le expressions in delete predicate 2021-09-15 17:00:21 -04:00
Carol (Nichols || Goulding) d70d94100e refactor: Extract a type alias 2021-09-15 17:00:09 -04:00
Carol (Nichols || Goulding) 81feced9d6 fix: Restart database background worker when it's restored 2021-09-15 16:59:49 -04:00
Carol (Nichols || Goulding) 7c81c280cf fix: Shut down a database when it's deleted 2021-09-15 16:59:49 -04:00
Carol (Nichols || Goulding) 7b6d8f9327 feat: Add an API for restoring a database that was marked deleted 2021-09-15 16:59:37 -04:00
Raphael Taylor-Davies c66095cad1
feat: remove metrics crate (#2552) 2021-09-15 19:43:33 +00:00
Raphael Taylor-Davies 6e7fa3e574
feat: migrate http ingest metrics (#2542)
* feat: migrate http ingest metrics

* chore: review feedback

* refactor: RAII entry ingest recorder
2021-09-15 19:01:10 +00:00
Nga Tran 7175488133 chore: add some comments 2021-09-15 14:45:04 -04:00
Nga Tran 3486cc8b38 fix: should not send an empty delete predicate predicate which means delete everything (no time range) 2021-09-15 14:14:26 -04:00
Raphael Taylor-Davies 1ea4335ff3
fix: report correct DatabaseStateCode (#2543)
* fix: report correct DatabaseStateCode

* chore: fix lint

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-15 18:02:54 +00:00
Raphael Taylor-Davies 6f2301e16c
feat: migrate write buffer metrics (#2536)
* feat: migrate write buffer metrics

* feat: update server/src/write_buffer.rs

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-15 17:41:55 +00:00
Raphael Taylor-Davies 27b0c46f79
feat: shutdown WriteBufferConsumer on drop (#2533)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-15 11:48:56 +00:00
Marco Neumann 86031b03dc refactor: make time_of_write a parameter
Instead of depending on the `chrono` clock implicitly via `Utc::now()`
we should make it an explicit parameter. This is essential for testing:

1. We have many tests guessing around this value by taking `Utc::now()`
   directly before or after the write (this commit doesn't fix that, but
   allows us to fix it in a follow-up).
2. We have some tests that ignore the `time_of_write` values in some
   comparisons because they cannot control that value (fix not included
   here but left as a follow-up).
3. The upcoming compression (#2528) needs to control timestamps within
   the compressed payload (and `time_of_write` is embedded in the
   parquet metadata) because the compressed size depends on it (even if
   the uncompressed size is stable).

In general I argue that a "clock" is always data and should be passed
(either as a value or as a "now"-function) from the API layer. Hidden
clock checks just make mocking and tests a nightmare (we've seen this w/
replay tests as well).
2021-09-15 11:39:15 +02:00