This implements a way to add checkpoints to the preserved catalog and
speed up replay.
Note: Hooking this up to the actual DB is left for a future PR.
Issue: #1381.
More cases will be needed here once the Kafka write buffer is
introduced; they will affect how the SequencedEntry is created and
whether a database being immutable is an error or not.
* refactor: Separate query_tests into its own crate
* fix: references
* refactor: break out server benchmarks
* fix: Update query_tests/src/lib.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Before this change we loaded databases eagerly when a serverID was
passed on startup, BEFORE starting up the gRPC server. Since loading
(esp. in its current state without checkpoints and with too many small
parquet files) can take a very long time, K8s thinks IOx is unhealthy.
With this change we now load databases in the server background worker
once a serverID is available. Until then we block all DB-related
interactions, including adding new databases (since without inspecting
the object store there is no way we can check if the DB already
exists).
Furthermore we now load databases regardless of whether the serverID
was passed on startup (via CLI or environment variable) or was set later
via a gRPC call. Before this change the latter case was somewhat
forgotten.
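
The gating described above can be pictured with a minimal sketch. This is
not the actual IOx code: `InitState`, `Server`, `background_tick` and the
`list_databases` callback (standing in for inspecting the object store) are
hypothetical names chosen for illustration, and the real worker is
asynchronous.

```rust
use std::collections::BTreeSet;

/// Hypothetical initialization state of the server (illustration only).
#[derive(Debug, PartialEq)]
enum InitState {
    /// No serverID yet, so nothing can be loaded.
    NoServerId,
    /// serverID is known, databases have not been loaded yet.
    IdSet(u32),
    /// Databases are loaded; DB-related calls are allowed.
    Ready { databases: BTreeSet<String> },
}

#[derive(Debug, PartialEq)]
enum Error {
    NotInitialized,
    IdAlreadySet,
    DatabaseAlreadyExists(String),
}

struct Server {
    state: InitState,
}

impl Server {
    fn new() -> Self {
        Self { state: InitState::NoServerId }
    }

    /// Same code path whether the ID comes from CLI/env at startup or
    /// from a later gRPC call.
    fn set_id(&mut self, id: u32) -> Result<(), Error> {
        match self.state {
            InitState::NoServerId => {
                self.state = InitState::IdSet(id);
                Ok(())
            }
            _ => Err(Error::IdAlreadySet),
        }
    }

    /// One iteration of the background worker: load databases as soon as
    /// a serverID is available. `list_databases` is an assumed stand-in
    /// for listing existing databases in the object store.
    fn background_tick(&mut self, list_databases: impl FnOnce(u32) -> Vec<String>) {
        if let InitState::IdSet(id) = self.state {
            let databases = list_databases(id).into_iter().collect();
            self.state = InitState::Ready { databases };
        }
    }

    /// DB-related interactions are rejected until loading has finished,
    /// because only then can we tell whether the DB already exists.
    fn create_database(&mut self, name: &str) -> Result<(), Error> {
        match &mut self.state {
            InitState::Ready { databases } => {
                if databases.insert(name.to_string()) {
                    Ok(())
                } else {
                    Err(Error::DatabaseAlreadyExists(name.to_string()))
                }
            }
            _ => Err(Error::NotInitialized),
        }
    }
}
```

In this sketch the gRPC server could start serving immediately, while
`create_database` keeps returning `NotInitialized` until the worker has
run once with a serverID.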
Since the number of parquet files can potentially be unbounded (i.e.
very large) and we do not want to hold the transaction lock for too
long, and also want to limit the memory consumption of the cleanup
routine, let's limit the number of files that we collect for cleanup.
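
A minimal sketch of such a cap, assuming the cleanup walks an object store
listing and compares it against the set of files referenced by the
catalog. The function name `collect_files_to_delete` and the `max_files`
parameter are hypothetical and only illustrate the idea; the real routine
may look different.

```rust
use std::collections::BTreeSet;

/// Collects at most `max_files` paths that appear in the object store
/// listing but are not referenced by the catalog (illustration only).
///
/// The iterator chain is lazy, so the listing is not consumed further
/// once the cap is reached. This bounds both the memory used for the
/// result and the time spent while a lock is held.
fn collect_files_to_delete<I>(
    store_listing: I,
    referenced: &BTreeSet<String>,
    max_files: usize,
) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    store_listing
        .into_iter()
        .filter(|path| !referenced.contains(path))
        .take(max_files)
        .collect()
}
```

Files beyond the cap are simply left for the next cleanup run.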
* refactor: Make it clear only partition_key and table name pruning is happening in catalog
* fix: clippy
* fix: Update server/src/db/catalog.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: use TableNameFilter enum rather than Option
* docs: Add docstring to the `From` implementation
* fix: Update server/src/db/catalog/partition.rs
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
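
The switch from `Option` to a `TableNameFilter` enum mentioned in the list
above could look roughly like the following sketch. The variant names, the
`matches` helper and the exact `From` signature are assumptions made for
illustration and may not match the code in `server/src/db/catalog.rs`.

```rust
use std::collections::BTreeSet;

/// Which tables a catalog lookup should consider (illustration only).
/// Compared to a bare `Option`, the variant names say what is meant.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TableNameFilter<'a> {
    /// Do not restrict by table name.
    AllTables,
    /// Only consider tables whose name is in the set.
    NamedTables(&'a BTreeSet<String>),
}

impl<'a> From<Option<&'a BTreeSet<String>>> for TableNameFilter<'a> {
    /// Converts the old representation: `None` meant "all tables",
    /// `Some(names)` meant "only these tables".
    fn from(names: Option<&'a BTreeSet<String>>) -> Self {
        match names {
            Some(names) => Self::NamedTables(names),
            None => Self::AllTables,
        }
    }
}

impl<'a> TableNameFilter<'a> {
    /// Returns true if the table with this name passes the filter.
    fn matches(&self, table_name: &str) -> bool {
        match self {
            Self::AllTables => true,
            Self::NamedTables(names) => names.contains(table_name),
        }
    }
}
```

The `From` implementation lets callers that still hold an `Option` convert
at the boundary instead of being rewritten all at once.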
There are no functional changes here (except that errors look slightly
different), but this should make it easier to move the DB loading into
a delayed task.