Commit Graph

169 Commits (2c9c191b17ca4b579a1f49c010a89a72ea37ae40)

Author SHA1 Message Date
Raphael Taylor-Davies efa776bd03
feat: fix log crate output (#2325)
* feat: fix log crate output

* chore: fix doc

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-17 16:19:22 +00:00
Marco Neumann ce35119633 test: fix test case relying on hard buffer limit 2021-08-17 16:53:45 +02:00
Marco Neumann 6847fa97cc feat: drop partition CLI 2021-08-17 09:44:35 +02:00
Marco Neumann fcf2bee443 feat: drop partition gRPC 2021-08-17 09:44:35 +02:00
Marco Neumann 6b907f94da test: simplify `fixture_replay_broken` 2021-08-16 15:31:05 +02:00
Marco Neumann 53d325e8fc test: end2end for skip replay 2021-08-16 13:47:56 +02:00
Marco Neumann c959be2319 test: sort output of `list_chunks` 2021-08-16 13:47:07 +02:00
Marco Neumann 42d5f9f3a1 feat: skip replay via CLI 2021-08-16 13:47:07 +02:00
Marco Neumann 21ebdee5a1 feat: make kafka topic creation code reusable 2021-08-16 13:47:07 +02:00
Raphael Taylor-Davies 0a065b4968
feat: set header on gRPC requests (#2283)
* feat: set header on gRPC requests

* feat: always insert header value

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-13 14:13:50 +00:00
Raphael Taylor-Davies 2344c28f4e
feat: drain database jobs on shutdown (#2239)
* feat: drain database jobs on shutdown

* chore: fmt

* chore: review feedback

* chore: use join() not member directly

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 16:47:37 +00:00
Raphael Taylor-Davies 1f450ef371
feat: add Database abstraction (#2186) (#2203)
* feat: add Database abstraction

* chore: minor tweaks

* chore: remove redundant test fixture restart

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-08 17:14:23 +00:00
Jacob Marble 98d4c9fca1
feat: switch protobuf write service to canonical definition (#2182)
* feat: switch protobuf write service to canonical definition

The protobuf definition used for the proto write endpoint was a WIP. Now
that a canonical definition exists at
https://github.com/influxdata/influxdb-pb-data-protocol/ we can switch
to that.

* chore: lint etc

* chore: fix rustdoc nit in proto definition comment
2021-08-04 00:16:49 +00:00
Carol (Nichols || Goulding) 9864c6b7f1 fix: Return a more helpful error message for no matching sharding rule
Fixes #2127.
2021-08-02 16:49:40 -04:00
kodiakhq[bot] 0297aae17e
Merge branch 'main' into cn/1.54 2021-07-30 17:01:37 +00:00
Jacob Marble 8f05569007
test: add Flight/handshake test (#2156) 2021-07-30 15:43:28 +00:00
Carol (Nichols || Goulding) 9d15798288 fix: Address or allow Clippy warnings new with Rust 1.54 2021-07-30 09:59:59 -04:00
Raphael Taylor-Davies 431774c8b7
refactor: extract resolver from server::Config (#2143)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-29 13:14:58 +00:00
Marko Mikulicic fe7f65bfa7
feat(iox): Implement max_active_compactions_cpu_fraction 2021-07-28 17:31:17 +02:00
Raphael Taylor-Davies 754d647c06
feat: enable row timestamp metrics with environment variable (#2135)
* feat: enable row timestamp metrics with environment variable

* chore: fix test

* chore: fix typo

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-28 13:39:56 +00:00
Marko Mikulicic ec0804900a
feat(iox): Quick&Dirty KafkaProducer sink implementation
RoutingRules such as RoutingConfig and ShardConfig use a sink to decide where to write
the entries.

The write buffer is currently implemented in the `db` and is accessed by using the `write_local_entry`
code path. This PR simply invokes that legacy code path whenever a "kafka" sink is selected.

This lets us benefit immediately from the ShardConfig's ability to select or reject
tables and send some to kafka, some to devnull.

This PR does not yet let us split an input batch into multiple shards and send each
to a different kafka topic. For that, we'll need to pull the write buffer code path out of
the `db` and do something similar to a ConnectionManager but for write buffers. TODO
2021-07-28 10:13:22 +02:00
Marko Mikulicic 094945a72d
feat: Add '/dev/null' sink 2021-07-26 19:19:11 +02:00
Marko Mikulicic 2478547ad7
refactor: Remove deprecated target field in RoutingConfig 2021-07-26 17:29:39 +02:00
Marco Neumann c386ac013c fix: fix flaky `test_unload_partition_chunk`
Do not rely on the fact that the chunk ID is 1, because compaction and
other mechanisms might create chunks using different IDs.

Fixes #2109.
2021-07-26 12:01:28 +02:00
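The idea behind the fix in c386ac013c (look the chunk up instead of assuming its ID is 1) can be sketched roughly as follows. This is not the actual test code: `ChunkSummary` here is a simplified, hypothetical stand-in and `chunk_ids_for` is an illustrative helper name.

```rust
/// Simplified, hypothetical stand-in for the chunk metadata a listing returns.
#[derive(Debug, Clone, PartialEq)]
struct ChunkSummary {
    id: u32,
    partition_key: String,
    table_name: String,
}

/// Returns the IDs of all chunks that belong to the given partition and table,
/// sorted ascending, so a test can pick the ID it needs from the actual listing
/// rather than hard-coding a value such as 1.
fn chunk_ids_for(chunks: &[ChunkSummary], partition_key: &str, table_name: &str) -> Vec<u32> {
    let mut ids: Vec<u32> = chunks
        .iter()
        .filter(|c| c.partition_key == partition_key && c.table_name == table_name)
        .map(|c| c.id)
        .collect();
    ids.sort_unstable();
    ids
}
```

A test built this way keeps passing when compaction or another mechanism has already produced a chunk with a different ID.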
Marco Neumann ceacd6b4e7 test: return chunks from `wait_for[_exact_chunk]_state` 2021-07-26 11:57:36 +02:00
Marko Mikulicic d58a3ccbc7
refactor: Add sink to routing config
This deprecates the "target" field in the RoutingConfig and replaces it with the "sink"
field, which has a variant that accepts a node group.

This commit is backward compatible in that it will accept existing configs.
The configs will roundtrip to the new format though (i.e. `database get` will render
the sink field).
2021-07-26 11:08:01 +02:00
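The backward-compatible handling described in d58a3ccbc7 might look roughly like the sketch below. All type and function names (`StoredRoutingConfig`, `upgrade`, the `NodeGroup` alias) are hypothetical, and the rule that an explicit `sink` wins over a legacy `target` is an assumption, not something the commit message states.

```rust
/// Illustrative stand-in for a group of node IDs.
type NodeGroup = Vec<u32>;

/// Only the node-group variant is modelled here.
#[derive(Debug, Clone, PartialEq)]
enum Sink {
    NodeGroup(NodeGroup),
}

/// Hypothetical shape of a config as read from storage: an old config sets
/// the deprecated `target` field, a new one sets `sink`.
#[derive(Debug, Clone, Default)]
struct StoredRoutingConfig {
    target: Option<NodeGroup>,
    sink: Option<Sink>,
}

/// Normalized config that only knows about `sink`.
#[derive(Debug, Clone, PartialEq)]
struct RoutingConfig {
    sink: Option<Sink>,
}

/// Accepts both formats and always renders the new one.
/// Assumption: if both fields are present, `sink` takes precedence.
fn upgrade(stored: StoredRoutingConfig) -> RoutingConfig {
    let StoredRoutingConfig { target, sink } = stored;
    RoutingConfig {
        sink: sink.or_else(|| target.map(Sink::NodeGroup)),
    }
}
```

Reading an old config and writing it back therefore yields the new format, which matches the round-trip behaviour the commit message mentions.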
Marko Mikulicic 16a82ba350
refactor: Generalize sinks: Rename Shard to Sink
The ShardConfig applies matchers that resolve to a shard number.
The config then applies a mapping between shard numbers to targets.
The type that encapsulated the target that a shard points to was also called
a "Shard". This is confusing. This commit changes it to "Sink", i.e. a destination
for traffic to go to. Subsequent commits will expand the definition of a Sink to
encompass different kinds of sinks (like kafka write buffer, "devnull", ...)

This changes only the names of the protobuf message and the related Rust types;
it doesn't change any names in the JSON-rendered protobuf configs.
2021-07-26 11:08:00 +02:00
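The two-step resolution described in 16a82ba350 (matchers resolve to a shard number, shard numbers map to sinks) can be sketched as below. This is a minimal illustration, not the IOx types: the prefix-based `Matcher`, the numeric node IDs, and the `resolve` method are assumptions made for the example.

```rust
use std::collections::HashMap;

/// A destination for traffic. Variant names are illustrative.
#[derive(Debug, Clone, PartialEq)]
enum Sink {
    /// Forward to a group of nodes (hypothetical numeric IDs).
    NodeGroup(Vec<u32>),
    /// Hand the entry to the Kafka write buffer.
    Kafka,
    /// Discard the entry.
    DevNull,
}

/// Hypothetical matcher: tables whose name starts with `table_prefix`
/// resolve to `shard`.
struct Matcher {
    table_prefix: String,
    shard: u32,
}

/// Matchers resolve a table to a shard number; `sinks` maps shard
/// numbers to the sink that traffic should go to.
struct ShardConfig {
    matchers: Vec<Matcher>,
    sinks: HashMap<u32, Sink>,
}

impl ShardConfig {
    /// Returns the sink for `table`, or `None` if no matcher applies or
    /// the resolved shard number has no sink mapped to it.
    fn resolve(&self, table: &str) -> Option<&Sink> {
        let shard = self
            .matchers
            .iter()
            .find(|m| table.starts_with(m.table_prefix.as_str()))?
            .shard;
        self.sinks.get(&shard)
    }
}
```

Keeping the shard number and the sink as separate concepts is what lets new sink kinds be added without touching the matchers.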
Raphael Taylor-Davies 446af5eb15
fix: consistent write timestamps (#2104)
* fix: consistent write timestamps

* chore: fix benchmarks
2021-07-23 18:04:15 +00:00
Andrew Lamb 01c79f1a1a
fix: Print all timestamps using RFC3339 format (#2098)
* fix: Use IOx pretty printer rather than arrow pretty printer

* chore: update tests in the query crate

* chore: update influxdb_iox tests

* chore: Update end to end tests

* chore: update query_tests

* chore: update mutable_buffer tests

* refactor: update parquet_file tests

* refactor: update db tests

* chore: update kafka integration test output

* fix: merge conflict
2021-07-22 19:04:52 +00:00
Raphael Taylor-Davies 20d06e3225
feat: include more information in system.operations table (#2097)
* feat: include more information in system.operations table

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 17:16:09 +00:00
Raphael Taylor-Davies 8c974beba0
feat: add access timestamps to CatalogChunk (#2075) (#2081)
* feat: add access timestamps to CatalogChunk (#2075)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 12:19:30 +00:00
Raphael Taylor-Davies e4d2c51e8b
fix: update PersistenceWindows on rules update (#2018) (#2060)
* fix: update PersistenceWindows on rules update (#2018)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 12:44:47 +00:00
Raphael Taylor-Davies 767c2a6fe1
refactor: explicit server startup state machine (#2040)
* refactor: explicit server startup state machine

* chore: update `ServerStage` docs

* chore: further docs

* chore: more logging

* chore: format
2021-07-20 10:11:18 +00:00
kodiakhq[bot] 1d1ac12522
Merge branch 'main' into crepererum/write_buffer_multiple_streams 2021-07-19 15:50:42 +00:00
Edd Robinson dfda23f24a test: update e2e tests 2021-07-19 14:00:10 +01:00
Marco Neumann 592424c896 refactor: use one stream per sequencer/partition
Advantages are:

- for large DBs w/ many partitions we can ingest data in parallel
- on top of this change we can implement per-sequencer seeking, which is
  required for replay
2021-07-19 12:26:58 +02:00
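The parallel-ingest advantage named in 592424c896 can be illustrated with a small sketch. It assumes plain OS threads and in-memory `Vec`s as stand-ins for the real per-sequencer streams; `ingest_in_parallel` is a hypothetical name and the "ingest" step only counts entries.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Consumes each per-sequencer stream on its own thread and returns the
/// number of entries ingested per sequencer ID.
fn ingest_in_parallel(streams: BTreeMap<u32, Vec<String>>) -> BTreeMap<u32, usize> {
    // One worker per sequencer: streams are independent, so a slow
    // sequencer does not block the others.
    let handles: Vec<_> = streams
        .into_iter()
        .map(|(sequencer_id, entries)| {
            thread::spawn(move || {
                let mut ingested = 0usize;
                for _entry in entries {
                    // A real consumer would write the entry to the database here.
                    ingested += 1;
                }
                (sequencer_id, ingested)
            })
        })
        .collect();

    // Wait for every worker and collect the per-sequencer results.
    handles
        .into_iter()
        .map(|h| h.join().expect("ingest thread panicked"))
        .collect()
}
```

Because each stream is consumed independently, each one can also track its own position, which is the property per-sequencer seeking would build on.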
Raphael Taylor-Davies 5fc98c7c56
feat: add failure reporting to TaskTracker (#2031)
* feat: add failure reporting to TaskTracker

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-19 09:17:20 +00:00
Raphael Taylor-Davies 00b89cd751
fix: freeze chunks in write path (#2021) (#2022)
* fix: freeze chunks in write path (#2021)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-16 08:51:37 +00:00
Raphael Taylor-Davies 6218957bd8
fix: flaky lifecycle test (#1994) (#2020)
* fix: flaky lifecycle test (#1994)

* chore: fix lint

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 21:07:09 +00:00
Andrew Lamb 3fd6430fb6
fix: rename `estimated_bytes` to `memory_bytes` and expose `object_store_bytes` in ChunkSummary and system.chunks (#2017)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 16:00:24 +00:00
Marco Neumann b5428e53a5 refactor: write buffer testing + better mocking
This refactors the write buffer a bit for:

- **Testing:** Add generic tests for the Kafka and the mocking
  implementation. The same interface can easily be used to add new
  implementations (e.g. via Redis, filesystem, ...).
- **Partition on Write:** The caller of the writer operation must now
  specify the partition/sequencer ID. The implicit partitioning of the
  Kafka writer would have led to broken data since we must never spill
  entries w/ the same primary key over multiple partitions. At the
  moment we will only use partition 0 but we can easily implement
  better logic in the future.
- **Improved Mocking:** The mocked implementation now simulates a system
  that feels more real. Especially the handling around multiple streams
  and "write while read" has been improved. This will be helpful for
  testing and for new features like seeking (during replay). A solid
  realistic mock also helps us to ensure that the tests using the mock
  do not rely on unrealistic behavior too much.
2021-07-15 17:20:45 +02:00
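One possible form of the "better logic" mentioned for partition-on-write in b5428e53a5 is to derive the sequencer ID from the primary key, so that entries with the same key always land in the same partition. The sketch below is an assumption about how that could be done, not what the commit implements (the commit uses partition 0); `sequencer_for` is a hypothetical helper.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Maps a primary key to a sequencer/partition ID in `0..n_sequencers`.
/// The same key always yields the same ID, so entries sharing a primary
/// key are never spread over several partitions.
///
/// Note: `DefaultHasher`'s algorithm is not guaranteed to stay the same
/// across Rust releases; a production system would want a hash function
/// with a fixed, documented algorithm.
fn sequencer_for(primary_key: &str, n_sequencers: u32) -> u32 {
    assert!(n_sequencers > 0, "need at least one sequencer");
    let mut hasher = DefaultHasher::new();
    primary_key.hash(&mut hasher);
    // The remainder is strictly less than `n_sequencers`, so it fits in u32.
    (hasher.finish() % u64::from(n_sequencers)) as u32
}
```

With a single sequencer the function always returns 0, which matches the current behaviour the commit message describes.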
Raphael Taylor-Davies a79c0b4e75
feat: add mub row count threshold to lifecycle rules (#1876) (#2016)
* feat: add mub row count threshold to lifecycle rules (#1876)

* chore: update docstring

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 13:42:17 +00:00
Raphael Taylor-Davies 6a4c08ec28
refactor: extract DatabaseBuilder for end-to-end test cases (#2004)
* refactor: extract DatabaseBuilder for end-to-end test cases

* chore: fix kafka tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 12:25:21 +00:00
Andrew Lamb 243cee530a
test: Fix flaky test by specifying ORDER BY in query (#1996)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-14 14:41:24 +00:00
Jacob Marble b79d9eb0ab
chore: add end-to-end test for PB write service (#1894)
* chore: add end-to-end test for PB write service

* chore: lint

* chore: fix test
2021-07-14 14:20:37 +00:00
Marco Neumann 3d008f4d27 feat: add API+CLI to unload chunks
Closes #1919.
2021-07-12 14:06:01 +02:00
Paul Dix 0c8c81a321 refactor: remove mutable_linger_seconds from lifecycle
The interplay between mutable_linger_seconds, late_arrive_window and persist_age_threshold_seconds can be tricky to reason about. I realized that the lifecycle rules can be simplified by removing mutable_linger_seconds and instead using late_arrive_window_seconds for the same purpose. Semantically, they basically mean the same thing. We want to give data around this amount of time to arrive before the system persists it, which gives it more of an opportunity to persist non-overlapping data.

When a partition goes cold for writes, after we've waited past this window, we should compact and persist that partition. This removes one unnecessary knob from the lifecycle configuration and also removes the potential for conflicting configuration options.
2021-07-10 08:04:33 -04:00
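The rule described in 0c8c81a321 (persist a partition once it has been cold for longer than the late-arrival window) can be sketched as a single predicate. `LifecycleRules` here is a reduced, hypothetical struct holding only the one field the commit message names, and `should_persist` is an illustrative function name.

```rust
use std::time::{Duration, Instant};

/// Reduced, hypothetical view of the lifecycle rules: only the window
/// that data is given to arrive before the partition is persisted.
struct LifecycleRules {
    late_arrive_window_seconds: u32,
}

/// Returns true once the partition has received no writes for longer
/// than the late-arrival window, i.e. it has gone cold and should be
/// compacted and persisted.
fn should_persist(rules: &LifecycleRules, last_write: Instant, now: Instant) -> bool {
    let window = Duration::from_secs(u64::from(rules.late_arrive_window_seconds));
    // Saturates to zero if `last_write` is later than `now`.
    now.saturating_duration_since(last_write) > window
}
```

Using one window for both purposes means there is no second setting that could contradict it.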
Andrew Lamb 9534220035
feat: Add any lifecycle_action to system.chunks and API (#1947) 2021-07-09 17:38:29 +00:00
Raphael Taylor-Davies 7af560aa99
feat: Persist lifecycle action (#1888)
* feat: add split and persist operation

* docs: Improve doc strings

* refactor: use for loop rather than map

* refactor: Make it clear that the lifecycle policy picks the split timestamp

* fix: race condition

* docs: improve comments

* fix: logical merge conflict

* fix: clippy

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2021-07-09 13:21:46 +00:00
Carol (Nichols || Goulding) dd6303e85d test: Make test data conform to Kafka partitioning assumptions 2021-07-08 09:31:52 -04:00