Commit Graph

152 Commits (466416016b54d603ba2aec0a6dda873088363968)

Author SHA1 Message Date
Marco Neumann 3d008f4d27 feat: add API+CLI to unload chunks
Closes #1919.
2021-07-12 14:06:01 +02:00
Paul Dix 6f2d20cb19 chore: reword comment on late_arrive_window_seconds for clarity 2021-07-11 08:25:31 -04:00
Paul Dix 2854b54420 refactor: leave deprecated mutable_linger_seconds in proto 2021-07-10 12:48:50 -04:00
Paul Dix 0c8c81a321 refactor: remove mutable_linger_seconds from lifecycle
The interplay between mutable_linger_seconds, late_arrive_window and persist_age_threshold_seconds can be tricky to reason about. I realized that the lifecycle rules can be simplified by removing mutable_linger_seconds and instead using late_arrive_window_seconds for the same purpose. Semantically, they basically mean the same thing. We want to give data around this amount of time to arrive before the system persists it, which gives it more of an opportunity to persist non-overlapping data.

When a partition goes cold for writes, after we've waiting past this window, we should compact and persist that partition. This removes one unnecessary knob from the lifecycle configuration and also removes the potential for conflicting configuration options.
2021-07-10 08:04:33 -04:00
Andrew Lamb 9534220035
feat: Add any lifecycle_action to system.chunks and API (#1947) 2021-07-09 17:38:29 +00:00
Paul Dix e41fd2a821 refactor: remove unused mutable_minimum_age_seconds lifecycle setting
Closes #1878
2021-07-08 18:34:01 -04:00
Carol (Nichols || Goulding) e5de73133c feat: Change write buffer connection rule to take either Writing or Reading connection info
A database on one IOx server can, exclusively:

- Not interact with Kafka at all
- Send writes to Kafka
- Read writes from Kafka

Notably, a database on a particular server will never write *and* read from Kafka at the same time.
2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) 83e50cfba4 refactor: Rename field to not contain the type 2021-07-08 09:28:34 -04:00
Andrew Lamb 33bc85ad18
feat: Infrastructure for persistence (#1925)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 11:14:38 +00:00
Marko Mikulicic 7059f16b9e
refactor: Turn mutable_linger_seconds into a non-optional (#1917) 2021-07-08 11:25:57 +02:00
Andrew Lamb e6d995cbd8
chore: Update to Rust 1.53.0 (#1922)
* chore: Update to Rust 1.53.0

* fix: Update to latest clippy standards

* fix: bad refactor

* fix: Update escaping

* test: update test output

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 18:02:03 +00:00
Andrew Lamb 090b0aba11
refactor: remove unused `mutable_size_threshold` lifecycle setting (#1909)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 17:03:15 +00:00
Marco Neumann 3d644b63a1 feat: add `Replay` state to DB init 2021-07-06 14:24:39 +02:00
Marco Neumann cdab1bed05 feat: persist part+db checkpoint in parquets and catalog
This will be required for replay on server startup.
2021-07-05 09:42:46 +02:00
Marco Neumann 54fbb60740 feat: expose DB state in gRPC interface 2021-07-02 11:24:36 +02:00
Raphael Taylor-Davies f1a100c6ae
refactor: remove now unused chunk sort order (#1854)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:39:45 +00:00
Jacob Marble 0779b0d9bd
feat: add gRPC listener for new write protocol (#1842)
* feat: add gRPC listener for new write protocol

* chore: clippy happy

* chore: lint

* chore: cargo fmt --all

* chore: cargo clippy

* chore: protobuf-lint

* chore: more formatting

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:15:12 +00:00
kodiakhq[bot] b817ea88dd
Merge branch 'main' into crepererum/parquet_metadata_protobuf 2021-07-01 07:52:39 +00:00
Raphael Taylor-Davies cc038010cd
feat: add persist_age_threshold to LifecycleRules (#1853)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-30 21:27:06 +00:00
Marco Neumann 4204127b05 refactor: use protobuf for in-parquet metadata 2021-06-30 16:51:37 +02:00
Raphael Taylor-Davies 297fc12db8
feat: compact chunks (#1776)
* feat: compact chunks

* chore: review feedback

* chore: clippy lints

* chore: document sort key algorithm

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-24 16:49:10 +00:00
Marco Neumann 8e69202270 feat: catalog wiping gRPC 2021-06-21 09:31:23 +02:00
Marco Neumann d17b5710a8 feat: add server functionality to wipe preserved catalogs 2021-06-21 09:31:23 +02:00
Marco Neumann f4693e36c0 refactor: `catalog_checkpoint_interval` => `catalog_transactions_until_checkpoint` 2021-06-14 10:34:32 +02:00
Marco Neumann 2eb2aca091 fix: fix discrepancy of ckpting config over CLI and protobuf 2021-06-14 10:27:47 +02:00
Marco Neumann eae73591f3 feat: add `catalog_checkpoint_interval` lifecycle config 2021-06-14 10:04:50 +02:00
Marco Neumann edbaaedfc3 docs: clarify behavior of unspecified transaction encoding 2021-06-10 15:42:21 +02:00
Marco Neumann be9b3a4853 fix: protobuf lint fixes 2021-06-10 15:42:21 +02:00
Marco Neumann 33e364ed78 feat: add encoding info to transaction protobuf
This should help with #1381.
2021-06-10 15:42:21 +02:00
kodiakhq[bot] b49abf9b02
Merge branch 'main' into crepererum/lazy_db_loading 2021-06-09 07:23:35 +00:00
Carol (Nichols || Goulding) 4c12fd9b81 fix: Switch tabs for spaces
Not sure why my editor did that...
2021-06-07 13:06:50 -04:00
Carol (Nichols || Goulding) 50a69a7f18 fix: Don't mention Kafka unless it's absolutely necessary 2021-06-07 13:01:04 -04:00
Carol (Nichols || Goulding) 2418e91001 feat: Add a DatabaseRule field for an optional Kafka write buffer connection string 2021-06-07 09:56:23 -04:00
Carol (Nichols || Goulding) f4a9a5ae56 fix: Remove write buffer 2021-06-04 14:40:17 -04:00
Marco Neumann f38a6378a6 fix: make `getServerStatus` gRPC-compliant 2021-06-04 12:00:49 +02:00
Marco Neumann e06d65bb2a refactor: migrate "DBs initialized" RPC to "server status" 2021-06-04 11:33:41 +02:00
Marco Neumann b30d7e2821 feat: move DB loading into background worker
Before this change we loaded databases eagerly when a serverID was
passed on startup BEFORE starting up the gRPC server. Since loading
(esp. at its current state without checkpoints and with too many small
parquet files) can take very long, K8s thinks IOx is unhealthy. With
this change we are now loading databases in the server background worker
once a serverID is available. Until then we block all DB-related
interactions including adding new databases (since without inspecting
the object store there is now way we can check if the DB already
exists).

Furthermore we now load database no matter if the serverID was passed on
startup (via CLI or environment variable) or was set later via gRPC
call. Before this change the latter case was somewhat forgotten.
2021-06-04 11:33:41 +02:00
Marco Neumann 2afa8fa89a docs: fix typo and mention default 2021-06-03 11:23:29 +02:00
Marco Neumann bbd73e59be feat: jitter background clean-up job + wait on first job 2021-06-03 11:23:29 +02:00
Marco Neumann 77aeb5ca5d refactor: use protobuf-native Timestamp instead of string 2021-06-02 09:41:19 +02:00
Marco Neumann 0a625b50e6 feat: store transaction timestamp in preserved catalog 2021-06-02 09:41:19 +02:00
Andrew Lamb 00e735ef0d
chore: remove unused dependencies (#1583) 2021-05-29 10:31:57 +00:00
Marko Mikulicic bae5e5aee3
feat: Add simpler RoutingConfig 2021-05-25 21:51:54 +02:00
Marko Mikulicic 9765c53bb4
feat: Implement service readiness API 2021-05-24 17:57:45 +02:00
Andrew Lamb 07db4932ee
refactor: rename data_types/src/chunk.rs -> data_types/src/chunk_metadata.rs (#1500) 2021-05-15 10:18:01 +00:00
Raphael Taylor-Davies 9320f59de0
feat: add shard sink indirection (#1447)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-07 11:04:51 +00:00
Marco Neumann 1a998d4116 feat: preserve parquet metadata in catalog
Closes #1380.
2021-05-07 09:51:44 +02:00
Marco Neumann 9c9196e8f6 docs: add more info to `catalog.proto` 2021-05-07 09:51:44 +02:00
Marco Neumann 5db504300d refactor: use parsed paths instead of raw strings for catalog paths 2021-05-07 09:51:44 +02:00
Raphael Taylor-Davies 44de42906f
refactor: use Arc<str> instead of Arc<String> (#1442) 2021-05-06 17:05:08 +00:00