Commit Graph

25 Commits (71910538993d9d5d0ceeeaff3e2a3879af9fc11b)

Author SHA1 Message Date
Marco Neumann ef9d8aa736
fix: improve warning wording
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-08-06 10:25:26 +02:00
Marco Neumann 882f89cecf fix: only warn when partition ckpt and DB ckpt mins are out-of-sync
There are currently a few bugs and semi-understood edge cases that can
lead to this case. So instead of bailing out, just issue a warning.
2021-08-06 09:48:26 +02:00
Marco Neumann a5e9bbc507 feat: impl `Display` for `OptionalMinMaxSequence` 2021-08-06 09:48:26 +02:00
Marco Neumann 015d858f88 test: add failing regression test for #2185
We need a partition that is partially persisted for this.
This requires some rework for the time handling in `Db` to make it
mockable.

The remaining bits are test framework extensions.
2021-08-05 11:44:44 +02:00
Marco Neumann 748bbf6b3c fix: check empty replay ranges as well during replay plan construction 2021-08-05 11:39:19 +02:00
Andrew Lamb e2df0eeb00
docs: Add replay diagram and notes to checkpoints and ReplayPlanner (#2184)
* docs: Add replay diagram and notes to persistence_windows

* docs: Apply suggestions from code review

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* docs: review feedback

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-04 18:27:50 +00:00
Raphael Taylor-Davies ac67caf7a1
refactor: rename ReadLock to Freezable (#2145)
* refactor: rename ReadLock to Freezable and make exclusive

* chore: review feedback
2021-08-02 16:21:22 +00:00
Marco Neumann 04e797c706 refactor: pass sequencer numbers directly to DB checkpoint
First of all using a partition checkpoint as some kind of intermediate
representation was kinda a hack because partition checkpoints should
only created for to-be-persisted partitions, not for the others.
API-wise it should only be possible to construct a partition checkpoint
from a flush handle.

Also we were only able to construct partition checkpoints for partitions
that had unpersisted data, otherwise there was no sane way to fill the
`min_unpersisted_timestamp`. We must however scan all partitions no
matter if there is unpersisted data so that we can determine the maximum
seen sequence numbers. This was caught by a replay test resulting in a
catalog state where the last database checkpoint had lower maximum seen
sequence numbers than some partition checkpoint, bailing out with an
error.

So overall it turns out that passing the sequencer numbers directly
instead of wrapping them into a partition checkpoint is the better
implementation.
2021-07-28 17:28:34 +02:00
Marco Neumann cf21c9ec40 feat: impl `Clone` for `ReplayPlan` 2021-07-23 09:23:06 +02:00
Marco Neumann 47ad397918 fix: address review comments 2021-07-22 12:37:07 +02:00
Marco Neumann 57a9d5ade0 refactor: correctly track "seen" ranges in persistence checkpoints
Now we can handle all these cases:

There are two partitions w/ a single write each:

1. A reads sequence number 1
2. B reads sequence number 2
3. we persist A which only knows the sequences up until 1
=> the DB checkpoint needs the global max, otherwise we forget sequences
   during replay (2 in this case, so B would be gone)

1. B reads sequence number 1
2. A reads sequence number 2
3. we persist A which (w/o this commit) would not track the sequencer at
   all in this checkpoint (since there is nothing to replay)
=> we MUST also remember that we already read up until 2, otherwise we'll
   re-read 2 after replay
=> the partition checkpoint needs the local seen max (no matter if there's
   something to to persist)
2021-07-21 19:19:49 +02:00
Marco Neumann a5fc1c7d38 fix: collect min AND max in database checkpoints
This is required to correctly handle the following case:

1. There are two partitions A and B w/ a single write each (from the same
   sequencer).
2. We persist A:
   - The partition checkpoint for A will be empty because after persistence
     there will be nothing to replay (the single write is persisted and
     we're ready).
   - The database checkpoint that contains the global minimum of all ranges
     recognizes that for the sequencer there is indeed something left (the
     minimum sequence number from B).
3. DB restart happens, replay starts
4. We scan all persisted files, figure out that we have a DB checkpoint
   with a sequence minimum but (w/o the change in this commit) there is no
   maximum. Only partition checkpoints contain maxima, and the only partition
   checkpoint that was persisted was the one for partition A and that one was
   empty (see above).
5. So now how do we recover partition B?
2021-07-21 14:48:29 +02:00
Raphael Taylor-Davies ffe6e62aee
feat: add instant to datetime conversion (#2078)
* feat: add instant to datetime conversion

* chore: review feedback
2021-07-21 11:43:27 +00:00
Raphael Taylor-Davies 61da0fe4df
fix: update last_instant when rotating into persistable window (#2067) 2021-07-20 16:38:28 +00:00
Raphael Taylor-Davies e4d2c51e8b
fix: update PersistenceWindows on rules update (#2018) (#2060)
* fix: update PersistenceWindows on rules update (#2018)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 12:44:47 +00:00
Raphael Taylor-Davies 8e5d5928cf
feat: compute WriteSummary from PersistenceWindows (#2030) (#2054)
* feat: compute WriteSummary from PersistenceWindows (#2030)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 08:46:52 +00:00
Marco Neumann 81c90868be docs: note that `MinMaxSequence` are inclusive start/end 2021-07-16 11:45:34 +02:00
Raphael Taylor-Davies d71f38f27c
feat: compute PartitionCheckpoint from PersistenceWindows (#2011)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 12:17:23 +00:00
Raphael Taylor-Davies cbeeb97cff
feat: flush open window on persist (#1985)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-14 16:58:20 +00:00
Raphael Taylor-Davies f1c1620c84
feat: make persistence windows interface harder to use incorrectly (#1977)
* feat: make persistence windows interface harder to use incorrectly

* chore: review feedback

* chore: update comment

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-14 13:03:18 +00:00
Marco Neumann 886a87f011 chore: enable linting for `persistence_windows` crate 2021-07-14 14:11:51 +02:00
Raphael Taylor-Davies 04e77f6b3c
feat: assert PersistenceWindows pre-conditions (#1976)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 14:04:04 +00:00
Marco Neumann 18893e76e0 refactor: convert some table name and part. key String to Arcs
This has the (somewhat nice) side effect that it shrinks the in-mem
catalog a bit as well because nw `ParquetChunk` is a bit smaller making
the chunk stage enum smaller as well.
2021-07-08 14:34:28 +02:00
Andrew Lamb 33bc85ad18
feat: Infrastructure for persistence (#1925)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 11:14:38 +00:00
Marco Neumann 4ca2d3e148 chore: move persistence windows related code into own crate
The entire persistence windows data structures (including the
checkpoints) have nothing to do with the mutable buffer per se. So lets
move them into their own crate. This also makes `parquet_file` not
longer depend on `mutable_buffer`.
2021-07-05 10:23:58 +02:00