270 Commits (c681da1031241730e2fd286af5ca5b6ad2ee3d2f)

Author SHA1 Message Date
Paul Dix 0c8c81a321 refactor: remove mutable_linger_seconds from lifecycle
The interplay between mutable_linger_seconds, late_arrive_window and persist_age_threshold_seconds can be tricky to reason about. I realized that the lifecycle rules can be simplified by removing mutable_linger_seconds and instead using late_arrive_window_seconds for the same purpose. Semantically, they basically mean the same thing. We want to give data around this amount of time to arrive before the system persists it, which gives it more of an opportunity to persist non-overlapping data.

When a partition goes cold for writes, after we've waited past this window, we should compact and persist that partition. This removes one unnecessary knob from the lifecycle configuration and also removes the potential for conflicting configuration options.
2021-07-10 08:04:33 -04:00
Andrew Lamb 9534220035
feat: Add any lifecycle_action to system.chunks and API (#1947) 2021-07-09 17:38:29 +00:00
Raphael Taylor-Davies 7af560aa99
feat: Persist lifecycle action (#1888)
* feat: add split and persist operation

* docs: Improve doc strings

* refactor: use for loop rather than map

* refactor: Make it clear that the lifecycle policy picks the split timestamp

* fix: race condition

* docs: improve comments

* fix: logical merge conflict

* fix: clippy

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2021-07-09 13:21:46 +00:00
Carol (Nichols || Goulding) dd6303e85d test: Make test data conform to Kafka partitioning assumptions 2021-07-08 09:31:52 -04:00
Carol (Nichols || Goulding) 80e1dcafe0 feat: Support reading from all Kafka partitions
When reading from the Kafka write buffer, subscribe to all partitions in
a topic and start from the smallest offset available, instead of
assuming there will only be 1 partition per topic.
2021-07-08 09:30:59 -04:00
Carol (Nichols || Goulding) e5168936f5 feat: Better error messages through to gRPC API + e2e Kafka Read tests 2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) e5de73133c feat: Change write buffer connection rule to take either Writing or Reading connection info
A database on one IOx server can, exclusively:

- Not interact with Kafka at all
- Send writes to Kafka
- Read writes from Kafka

Notably, a database on a particular server will never write *and* read from Kafka at the same time.
2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) 83e50cfba4 refactor: Rename field to not contain the type 2021-07-08 09:28:34 -04:00
Marko Mikulicic 7059f16b9e
refactor: Turn mutable_linger_seconds into a non-optional (#1917) 2021-07-08 11:25:57 +02:00
Andrew Lamb e6d995cbd8
chore: Update to Rust 1.53.0 (#1922)
* chore: Update to Rust 1.53.0

* fix: Update to latest clippy standards

* fix: bad refactor

* fix: Update escaping

* test: update test output

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 18:02:03 +00:00
Andrew Lamb 090b0aba11
refactor: remove unused `mutable_size_threshold` lifecycle setting (#1909)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 17:03:15 +00:00
Marco Neumann 54fbb60740 feat: expose DB state in gRPC interface 2021-07-02 11:24:36 +02:00
Raphael Taylor-Davies f1a100c6ae
refactor: remove now unused chunk sort order (#1854)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:39:45 +00:00
Raphael Taylor-Davies cc038010cd
feat: add persist_age_threshold to LifecycleRules (#1853)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-30 21:27:06 +00:00
Nga Tran f6731c60d7 fix: change timeout so that all tests pass on a slow laptop 2021-06-30 16:04:02 -04:00
Andrew Lamb 89757d7232
fix: do not print test output to logs except on failure (#1840)
* fix: do not print test output to logs except on failure

* docs: update CONTRIBUTING.md
2021-06-30 13:20:11 +00:00
Raphael Taylor-Davies eac9261507
chore: print end-to-end output (#1838)
* chore: print end-to-end output

* chore: clippy

* chore: update CONTRIBUTING.md

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-29 15:09:43 +00:00
Raphael Taylor-Davies 3ae8ac2467
chore: improve wait_for_chunk failure output (#1835) 2021-06-29 11:54:32 +00:00
Raphael Taylor-Davies 5287f6a577
feat: print operations on wait_for_chunk failure (#1809) (#1833)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-29 11:09:11 +00:00
Raphael Taylor-Davies 297fc12db8
feat: compact chunks (#1776)
* feat: compact chunks

* chore: review feedback

* chore: clippy lints

* chore: document sort key algorithm

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-24 16:49:10 +00:00
Carol (Nichols || Goulding) c66f9e5aeb feat: Write entries to Kafka when configured as the write buffer 2021-06-23 10:48:18 -04:00
Raphael Taylor-Davies 5cd911c74a
fix: correct row count for object store chunks (#1789) 2021-06-23 12:06:49 +00:00
Marco Neumann 55c546baff feat: eagerly check object store during CLI `run`
Instead of waiting for the server ID to be set and then marking the server
as errored, directly check the object store on startup. This is
important so that we fail fast when Istio isn't up and running yet.
2021-06-22 18:21:30 +02:00
Andrew Lamb 5362c7c924
feat: enable query deduplication (#1762) 2021-06-21 18:49:04 +00:00
Carol (Nichols || Goulding) 31ad5c85f9 fix: Consistently refer to docker-compose 2021-06-21 09:41:37 -04:00
Carol (Nichols || Goulding) b4644e6108 test: Start of Kafka Write Buffer integration tests 2021-06-21 09:41:35 -04:00
Marco Neumann a153f841d8 feat: add `--force` flag to CLI wipe command 2021-06-21 09:31:23 +02:00
Marco Neumann c0766f1c26 feat: catalog wiping CLI 2021-06-21 09:31:23 +02:00
Marco Neumann 8e69202270 feat: catalog wiping gRPC 2021-06-21 09:31:23 +02:00
Marco Neumann 51f27de2ee docs: fix typo
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-06-14 17:34:57 +02:00
Marco Neumann 14ba02ec87 feat: expose server and DB init errors over gRPC
Closes #1624.
2021-06-14 17:34:57 +02:00
Marco Neumann a449d5ef74 test: make some `server_fixture` functionality public
This is useful when you want to test a server boot-up with custom
configs.
2021-06-14 17:34:57 +02:00
Andrew Lamb 856751deec
feat: Lifecycle manager unloads, rather than drops, chunks when soft limit is hit (#1701)
* feat: unload chunks from memory rather than dropping them

* docs: Update server/src/db/lifecycle.rs

Co-authored-by: Marco Neumann <marco@crepererum.net>

* docs: Update comment wording

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-14 13:14:39 +00:00
Marco Neumann f4693e36c0 refactor: `catalog_checkpoint_interval` => `catalog_transactions_until_checkpoint` 2021-06-14 10:34:32 +02:00
Marco Neumann 2eb2aca091 fix: fix discrepancy of ckpting config over CLI and protobuf 2021-06-14 10:27:47 +02:00
Andrew Lamb 4224b693d9
refactor: combine preservation.rs and persistence.rs (#1692)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-11 11:33:14 +00:00
Andrew Lamb 0cbe74dbde
fix: persistence to parquet by swapping order of arguments (#1687)
* fix: fix order of arguments

* test: for persistence
2021-06-11 10:55:40 +00:00
Raphael Taylor-Davies 11b25b3aaf
refactor: swap order of partition and table in in-memory catalog (#1678)
* refactor: swap order of partition and table in in-memory catalog

* chore: review feedback

* chore: validate panic message

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-10 16:40:30 +00:00
kodiakhq[bot] b49abf9b02
Merge branch 'main' into crepererum/lazy_db_loading 2021-06-09 07:23:35 +00:00
Carol (Nichols || Goulding) 50a69a7f18 fix: Don't mention Kafka unless it's absolutely necessary 2021-06-07 13:01:04 -04:00
Carol (Nichols || Goulding) 2418e91001 feat: Add a DatabaseRule field for an optional Kafka write buffer connection string 2021-06-07 09:56:23 -04:00
kodiakhq[bot] 87297f7db4
Merge branch 'main' into cn/delete 2021-06-07 13:32:42 +00:00
Raphael Taylor-Davies 5749a2c119
chore: cleanup legacy TSM -> parquet code (#1639)
* chore: cleanup legacy parquet code

* chore: remove tests of removed functionality

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-07 12:59:33 +00:00
Raphael Taylor-Davies afe88eeb7c
chore: fix flaky test (#1643) 2021-06-07 12:52:11 +00:00
Carol (Nichols || Goulding) f4a9a5ae56 fix: Remove write buffer 2021-06-04 14:40:17 -04:00
Marco Neumann c4a2a7243f fix: formatting 2021-06-04 12:58:25 +02:00
Marco Neumann 34939e37c7
fix: style
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
2021-06-04 12:46:28 +02:00
Marco Neumann e06d65bb2a refactor: migrate "DBs initialized" RPC to "server status" 2021-06-04 11:33:41 +02:00
Marco Neumann b30d7e2821 feat: move DB loading into background worker
Before this change we loaded databases eagerly when a serverID was
passed on startup BEFORE starting up the gRPC server. Since loading
(esp. at its current state without checkpoints and with too many small
parquet files) can take very long, K8s thinks IOx is unhealthy. With
this change we are now loading databases in the server background worker
once a serverID is available. Until then we block all DB-related
interactions including adding new databases (since without inspecting
the object store there is no way we can check if the DB already
exists).

Furthermore, we now load databases regardless of whether the serverID was passed on
startup (via CLI or environment variable) or was set later via gRPC
call. Before this change the latter case was somewhat forgotten.
2021-06-04 11:33:41 +02:00
Marco Neumann bbd73e59be feat: jitter background clean-up job + wait on first job 2021-06-03 11:23:29 +02:00