Commit Graph

292 Commits (1d38d400f05a9200debc0d688e6a1eae0b29a289)

Author SHA1 Message Date
Carol (Nichols || Goulding) 8c7b3966de
fix: Organize imports 2021-12-09 08:49:34 -05:00
Carol (Nichols || Goulding) 403dcae93c
feat: Put kafka write_buffer code behind a feature flag
Which is off by default. This makes rdkafka optional to minimize
build-time dependencies for users that don't plan on using a Kafka write
buffer.
2021-12-09 08:49:34 -05:00
Marco Neumann e558f4c708 fix: address review comments 2021-12-07 09:43:47 +01:00
Marco Neumann 8f098d3ca1 fix: improve Kafka error handling during topic discovery
`MetadataTopic` has an `error` attached which we should check (I'm
wondering why this isn't a proper `Result` though).
2021-12-07 09:23:03 +01:00
Carol (Nichols || Goulding) 7499eac067
fix: Disable uuid serde feature; we're not actually serializing any UUIDs
Connects to #3117.
2021-12-06 09:37:31 -05:00
Carol (Nichols || Goulding) 16d8ae5e04
fix: Match tokio features to what's actually in use in each crate
Some crates listed features they don't use; other crates ware relying on
feature flags enabled by something else. I tested these changes by
disabling the workspace hack crate and testing each crate.
2021-12-06 09:37:16 -05:00
Carol (Nichols || Goulding) 02c297e850
fix: Always specify the parking_lot feature of tokio to get potential perf boost 2021-12-06 09:37:15 -05:00
Carol (Nichols || Goulding) 1a899b939e
fix: Remove redundant closures identified by clippy
https://rust-lang.github.io/rust-clippy/master/index.html#redundant_closure
2021-12-02 11:52:02 -05:00
Marco Neumann 332485d2c9 fix: use correct `bytes_read` in `DmlMeta`
- for file-based write buffers: Use headers + payload
- for Kafka-based write buffers: Use the estimation that we also use for
  other metrics
- as a side effect we can now just use `PartialEq` for more types

Fixes #3186.
2021-12-01 15:57:21 +01:00
Marco Neumann 459c14035c test: ensure that Kafka producers also generate sane metrics 2021-11-29 10:55:08 +01:00
Marco Neumann b7952c15a6 refactor: improve Kafka client usage
With the new rust-rdkafka release (merged in #3234) managing multiple
consumer streams becomes a bit easier. Also we can just reuse consumer
clients for multiple metadata requests. In total that provides:

- use only a single client connection for consumers (we had multiple
  connection attempts during startup and one client per stream)
- use only two clients for producers (sadly we need a consumer client to
  probe the partitions during startup)
- consumers no longer need to poll the stream to receive statistics
2021-11-29 10:39:09 +01:00
dependabot[bot] c729fc6a25
chore(deps): bump rdkafka from 0.27.0 to 0.28.0
Bumps [rdkafka](https://github.com/fede1024/rust-rdkafka) from 0.27.0 to 0.28.0.
- [Release notes](https://github.com/fede1024/rust-rdkafka/releases)
- [Changelog](https://github.com/fede1024/rust-rdkafka/blob/master/changelog.md)
- [Commits](https://github.com/fede1024/rust-rdkafka/compare/v0.27.0...v0.28.0)

---
updated-dependencies:
- dependency-name: rdkafka
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-29 08:18:02 +00:00
Marco Neumann 7f2e4f4342 refactor: remove write buffer direction
The direction was required when a database could read or write from/to a
write buffer. Now it is clear from the usage context of a write buffer
context which of the two applications is meant (databases read, routers
write) so the direction flag is no longer required.
2021-11-26 12:38:40 +01:00
Marco Neumann edf7becd20 fix: address review comments 2021-11-24 12:09:52 +01:00
Marco Neumann f75c12351d fix: do not forget outputs of file-based write buffer
The existing channel construction could lead to cases where streams
would consume messages, put them into the channel but then when the
stream gets dropped the message would be gone forever. So lets move from
a channel-based implementation to directly invoke the generator future,
so this buffering doesn't occur.

Fixes #3179.
2021-11-24 11:38:41 +01:00
kodiakhq[bot] a6a0eda142
Merge branch 'main' into crepererum/issue3030 2021-11-23 08:08:34 +00:00
Carol (Nichols || Goulding) 9fd4a560f5
feat: Results of running cargo hakari manage-deps 2021-11-19 09:21:57 -05:00
Raphael Taylor-Davies e32d367e85
feat: flush delete mailbox on persist (#3126) (#3147)
* feat: flush delete mailbox on persist (#3126)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-19 09:45:29 +00:00
Marco Neumann 7c72a993a3 fix: don't retry "forever" sending Kafka messages
When a Kafka broker pod is recreated (for whatever reason) and gets a
new IP while doing so, the following happened:

1. Old broker pod gets terminated, but is still reachable via DNS and
   TCP.
2. rdkafka looses its connection, re-creates it using the old IP. The
   TCP connection can be established (this heavily depends on the K8s
   network setup), but won't be able to send any messages because the
   old broker is already shutting down / dead.
3. New broker gets created w/ new IP (but same DNS name).
4. Somewhat in parallel to step 3: rdkafka gets informed by other
   brokers that the topic lost its leader and then that the topic has
   the new leader (which has the same identity as the old one). Since
   leader changes in Kafka can also happen when brokers are totally
   healthy, it doesn't conclude that its TCP connection might be broken
   and tries to send messages to the new broker via the old TCP
   connection.
5. It takes very long (~130s on my test setup) for the old
   rdkafka->broker TCP connection to break. Since
   `message.send.max.retries` has a default of `2147483647` rdkafka will
   not give up on the application level.
5. rdkafka re-connects, while doing so resolves via DNS the new broker
   IP and is happy.

An alternative fix that was tried: Use the `connect` rdkafka callback to
hook into the place where it would issue the UNIX `connect` call. There
we can manipulate the socket. Setting `TCP_USER_TIMEOUT` to 5000ms also
solves the issue somewhat, but might have different implications (also
it then takes around 5s to kill the connection). Since this is a more
hackish implementation and somewhat an unofficial way to configure
rdkafka, I decided against it.

Test Setup
==========

```rust
\#[tokio::test]
async fn write_forever() {
    maybe_start_logging();
    let conn = maybe_skip_kafka_integration!();
    let adapter = KafkaTestAdapter::new(conn);
    let ctx = adapter.new_context(NonZeroU32::new(1).unwrap()).await;

    let writer = ctx.writing(true).await.unwrap();
    let lp = "upc user=1 100";
    let sequencer_id = set_pop_first(&mut writer.sequencer_ids()).unwrap();

    for i in 1.. {
        println!("{}", i);

        let tables = mutable_batch_lp::lines_to_batches(lp, 0).unwrap();
        let write = DmlWrite::new(tables, DmlMeta::unsequenced(None));
        let operation = DmlOperation::Write(write);
        let res = writer.store_operation(sequencer_id, &operation).await;
        dbg!(res);

        tokio::time::sleep(Duration::from_secs(1)).await;
    }
}
```

Make sure to set the the rdkafka `log` config to `all`. Then use KinD,
setup a 3-node Strimzi cluster and start the test binary within the K8s
cluster. You need to start a debug container that is close enough to
your developer system (e.g. an old Debian DOES NOT work if you run
bleeding edge Arch):

```console
$(host) kubectl run -i --tty --rm debug --image=archlinux --restart=Never -n kafka -- bash
````

Then you copy over the test binary the container using [cargo-with](https://github.com/cbourjau/cargo-with):

```console
$(host) cargo with 'kubectl cp {bin} kafka/debug:/foo' -- test -p write_buffe
````

Within the container shell that you've just created, start the
forever-running test (make sure to set `KAFKA_CONNECT` according to your
Strimzi setup!):

```console
$(container) TEST_INTEGRATION=1 KAFKA_CONNECT=my-cluster-kafka-bootstrap:9092 RUST_BACKTRACE=1 RUST_LOG=debug ./foo write_forever --nocapture
````

The test should run and tell you that it is delivering messages. It also
tells you within the debug logs which broker it sends the messages to.
Now you need to kill the broker (in my example it was `my-cluster-kafka-1`):

```console
$(host) kubectl -n kafka delete pod my-cluster-kafka-1
````

The test should now stop to deliver messages and should error. Without
this patch it might take over 100s for it to recover even after the
deleted pod was re-created. With this patch it quickly is able to
deliver data again after the broker comes back online.

Fixes #3030.
2021-11-19 09:53:57 +01:00
Carol (Nichols || Goulding) a2454b542d
fix: Small cleanups in Cargo.tomls (#3160)
* fix: Add tokio rt-multi-thread feature so cargo test -p client_util compiles

* fix: Alphabetize dependencies

* fix: Add the data_types_conversions feature to get tests passing

* fix: Remove dev dependencies already listed under normal dependencies

* fix: Make sure the workspace is using the new resolver
2021-11-18 22:26:33 +00:00
Raphael Taylor-Davies 8155747735
feat: add write buffer delete encoding (#2731) (#3127)
* feat: add write buffer delete encoding (#2731)

* chore: fix doc

* chore: review feedback

* chore: review feedback

* chore: fmt

* chore: review feedback
2021-11-17 16:12:19 +00:00
Andrew Lamb b5a7bf03da
feat: Add kafka write buffer consumer metrics (#3129)
* feat: Add kafka write buffer consumer metrics

* refactor: use unwrap_or_else

* fix: Update bucket boundaries
2021-11-17 14:35:40 +00:00
Andrew Lamb d6c6e9a6c7
fix: Default kafka timeout to be shorter than gRPC timeout (60 sec --> 10 sec) (#3131)
* fix: Default kafka timeout to be shorter than gRPC timeout

* docs: fix link style
2021-11-17 12:19:53 +00:00
Marco Neumann 79929c8cf4 feat: add more Kafka metrics 2021-11-16 17:18:41 +01:00
Marco Neumann 9ee004946e fix: do not overload rdkafka w/ statistics 2021-11-16 17:18:41 +01:00
Marco Neumann e6fdd79a0f feat: emit Kafka stats as metrics instead of logs
This maps a subset of Kafka stats as metrics. The set can -- of course
-- be changed in the future depending on our needs.

Fixes #3100.
2021-11-16 17:18:41 +01:00
Raphael Taylor-Davies 553e412226
refactor: DMLOperation write path (#2731) (#3121)
* refactor: DMLOperation write path (#2731)

* chore: fmt

* chore: review feedback
2021-11-16 12:42:19 +00:00
Raphael Taylor-Davies a6d83a3026
feat: WriteBufferReader use DmlOperation (#2731) (#3096)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-15 10:19:54 +00:00
Raphael Taylor-Davies 6f268f8260
refactor: extract DML types (#2731) (#3084)
* refactor: extract DML types (#2731)

* chore: fmt
2021-11-11 12:34:07 +00:00
Marco Neumann 11f4a1dee8 feat: add connection management for router 2021-11-08 11:11:52 +01:00
Raphael Taylor-Davies 60f0deaf1e
feat: remove flatbuffer entry (#3045) 2021-11-05 20:19:24 +00:00
Raphael Taylor-Davies 898567e221
feat: migrate server to DbWrite (#2724) (#3035)
* feat: migrate server to DbWrite (#2724)

* chore: print perf log output

* fix: don't suppress CI status code

* chore: review feedback

* fix: don't error on empty line protocol write payloads

* fix: test

* fix: test
2021-11-05 11:09:33 +00:00
Raphael Taylor-Davies 07ba629e2b
feat: migrate write buffer producer to MutableBatch and pbdata (#2743) (#3021)
* feat: migrate write buffer producer to MutableBatch and pbdata (#2743)

* fix: Kafka message content type header

* chore: fix doc
2021-11-04 10:20:40 +00:00
Raphael Taylor-Davies 08fcd87337
feat: use protobuf encoding in write buffer (#2724) (#3018) 2021-11-03 15:19:05 +00:00
Raphael Taylor-Davies 51c6348e54
refactor: extract codec module for write buffer (#2724) (#3017) 2021-11-03 14:07:33 +00:00
Marco Neumann 0d0c0cb42b refactor: move write buffer configs to new home
Write buffer configs will partially be shared by database and router
nodes, so lets move them into a shared home.
2021-11-02 10:17:01 +01:00
Raphael Taylor-Davies f1a6468e7b
feat: migrate write buffer consumer to use DbWrite (#2724) (#3003)
* feat: migrate write buffer consumer to use DbWrite (#2724)

* fix: doc

* chore: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-01 16:38:48 +00:00
dependabot[bot] c540b40f05
chore(deps): bump tokio from 1.12.0 to 1.13.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.12.0 to 1.13.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.12.0...tokio-1.13.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-01 11:21:59 +00:00
Raphael Taylor-Davies 6ceab054ab
refactor: move DbWrite to mutable_batch (#2986)
* refactor: move DbWrite to mutable_batch

* chore: fix doc

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-29 15:13:05 +00:00
Raphael Taylor-Davies 8a2410e161
feat: mutable batch write entry (#2724) (#2973)
* feat: mutable batch write entry (#2724)

* chore: lint

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-28 20:15:28 +00:00
Marco Neumann 3af1504ed2 fix: race condition in file-based write buffer 2021-10-26 10:09:34 +02:00
Marco Neumann 93f6519c34 fix: address review comments 2021-10-26 10:09:34 +02:00
Marco Neumann d527775aec feat: allow gaps in file-based WB + improve error handling 2021-10-26 10:09:34 +02:00
Marco Neumann bc7244c48e chore: use Rust edition 2021 2021-10-25 10:58:20 +02:00
Marco Neumann a5f15e6e76 refactor: improve directory names 2021-10-19 16:15:22 +02:00
Marco Neumann 6ec0bd5bab feat: file-based write write_buffer
Closes #2849.
2021-10-19 15:26:43 +02:00
Marco Neumann b2698cca44 feat: add `format_jaeger_trace_context` 2021-10-19 15:26:05 +02:00
dependabot[bot] 32e18b6436
chore(deps): bump rdkafka from 0.26.0 to 0.27.0
Bumps [rdkafka](https://github.com/fede1024/rust-rdkafka) from 0.26.0 to 0.27.0.
- [Release notes](https://github.com/fede1024/rust-rdkafka/releases)
- [Changelog](https://github.com/fede1024/rust-rdkafka/blob/master/changelog.md)
- [Commits](https://github.com/fede1024/rust-rdkafka/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: rdkafka
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-19 11:49:34 +00:00
kodiakhq[bot] 45c2c26168
Merge branch 'main' into crepererum/write_buffer_optional_span_ctx 2021-10-15 07:25:21 +00:00
Marco Neumann 85be39de40 test: make `headers_case_handling` test easier to understand 2021-10-15 09:20:41 +02:00
Marco Neumann 2850487877 feat: make trace collector in Kafka consumer optional
The whole application might not have a trace collector configured in
which case we don't wanna produce any spans.
2021-10-15 09:20:40 +02:00
Marco Neumann d6da68b762 fix: do not panic when writing to an unknown sequencer 2021-10-14 17:27:19 +02:00
kodiakhq[bot] 61ec559eee
Merge branch 'main' into crepererum/write_buffer_span_ctx 2021-10-14 11:50:07 +00:00
Raphael Taylor-Davies e911cf9ac1
refactor: make WriteBufferConfigFactory interior mutable (#2829)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-14 10:30:59 +00:00
Marco Neumann 5e06519afb feat: propagate trace information through write buffer 2021-10-14 11:07:41 +02:00
Raphael Taylor-Davies 8a82f92c5d
refactor: add TimeProvider abstraction (#2722) (#2815)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-12 21:19:03 +00:00
Marco Neumann 90d2f1d60d refactor: use Kafka headers similar to HTTP 2021-10-12 16:31:06 +02:00
Marco Neumann b955ecc6a7 feat: add format header to Kafka messages
This allows us to transition to new formats in the future.
2021-10-12 16:31:06 +02:00
Raphael Taylor-Davies 0554173684
feat: migrate write buffer to TimeProvider (#2722) (#2804)
* feat: migrate write buffer to TimeProvider (#2722)

* chore: review feedback

Co-authored-by: Marco Neumann <marco@crepererum.net>

Co-authored-by: Marco Neumann <marco@crepererum.net>
2021-10-12 10:32:34 +00:00
Marco Neumann 896ce03415 refactor: use sets and maps for write buffer sequencers
Use the type to reflect that entries are unique and sorted.
2021-10-12 11:13:09 +02:00
Marco Neumann 2e8af77e41 feat: allow write buffer producers to read possible sequencer IDs 2021-10-12 10:37:32 +02:00
Raphael Taylor-Davies f35a49edd0
refactor: move Sequence to data_types (#2780)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-11 09:23:00 +00:00
Marco Neumann 07a72a1daa chore: tell cargo-udeps that `dotenv` is needed for `write_buffer` 2021-09-22 11:09:39 +02:00
Raphael Taylor-Davies c33e5c22e6
feat: pull WriteBuffer consumer out of Db and onto Database (#2243) (#2525)
* feat: pull WriteBuffer consumer out of Db and onto Database (#2243)

* chore: restore WritingOnlyAllowedThroughWriteBuffer error

* refactor: remove WriteBufferConfig

* chore: fix docs

* chore: move WriteBufferConsumer tests out of db.rs

* chore: document WriteBufferFactory member functions

* chore: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-14 16:04:58 +00:00
Raphael Taylor-Davies 7e434d16d2
fix: MockBufferForReading waker registration (#2520)
* fix: MockBufferForReading waker registration

* chore: remove unnecessary sleep

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 21:21:21 +00:00
Marco Neumann bbb8898d36 refactor: make writer buffer auto-creation types nicer to read 2021-09-08 11:13:48 +02:00
Marco Neumann 2cc1297c96 fix: typos
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-09-07 18:24:59 +02:00
Marco Neumann c0cc239781 fix: improve Kafka error handling 2021-09-07 18:24:59 +02:00
Marco Neumann 801cf08be7 feat: auto-creation of sequencers by write buffer
For Kafka, that basically means that we create a topic if it doesn't
exist yet.

Closed #2455.
Fixes #2189.
2021-09-07 18:24:57 +02:00
Marco Neumann 924e460bf7 feat: sequencer auto-creation for mocked write buffer 2021-09-07 18:18:20 +02:00
Marco Neumann ecbd0165fc feat: prepare auto-creation of sequencers for mocked write buffer 2021-09-07 18:18:20 +02:00
Marco Neumann 82f9750ba7 test: prepare write buffer test suite for failable reader and writer creation
This is required to work w/ sequencer auto-creation.
2021-09-07 18:18:20 +02:00
Marco Neumann f1864813f4 docs: document write buffer test suite 2021-09-07 18:18:20 +02:00
Marco Neumann 2ea3b600d0 test: harden `assert_reader_content` a bit
- entries should be sorted by the stream, there is no need to sort the
  results
- ensure that there are no leftover entries in the stream by asserting
  that it is "pending"
2021-09-07 18:18:20 +02:00
Marco Neumann d5662328b0 refactor: `n_sequencers` should be non-zero 2021-09-07 18:18:20 +02:00
dependabot[bot] b67610d9b9
chore(deps): bump tokio from 1.10.1 to 1.11.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.10.1 to 1.11.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.10.1...tokio-1.11.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 09:11:38 +00:00
dependabot[bot] b1bb390893
chore(deps): bump parking_lot from 0.11.1 to 0.11.2
Bumps [parking_lot](https://github.com/Amanieu/parking_lot) from 0.11.1 to 0.11.2.
- [Release notes](https://github.com/Amanieu/parking_lot/releases)
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Amanieu/parking_lot/compare/0.11.1...0.11.2)

---
updated-dependencies:
- dependency-name: parking_lot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 01:18:24 +00:00
Marco Neumann a63eb53ac5 feat: forward connection config to Kafka write buffer 2021-09-02 16:53:31 +02:00
Marco Neumann ecf1f99ddb refactor: more flexible writer buffer config
This allows:

- different types (instead of guessing through the connection URL)
- sequencer counts (not used yet but will be by #2455)
- extensible configs (e.g. to configure Kafka in a more granular way,
  not wired up yet)
- future extensions (since we use a message now instead of a single
  string)

**BREAKING: This requires changes for deployed systems / existing DBs!**
2021-09-02 16:41:35 +02:00
Marco Neumann f81885c172 fix: limit kafka consumer queue to 10MB
I think that while this value limits the maximum memory consumption, it
is too optimistic.
2021-08-20 09:55:49 +02:00
Marco Neumann 6c0cefded3 fix: limit kafka consumer queue to 100MB
By default the queue will be filled to up to 1GB which makes finding
memory leaks quite hard.
2021-08-19 11:36:15 +02:00
Raphael Taylor-Davies 817ed4b46b
feat: enable rdkafka statistics (#2312) (#2314)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-17 09:28:02 +00:00
Marco Neumann 4a3fe01743 test: don't overdo it 2021-08-16 18:31:45 +02:00
Marco Neumann 5caa2ad8ec fix: typo 2021-08-16 18:31:45 +02:00
Marco Neumann 1a7293015b test: allow write buffer mocks that always fail 2021-08-16 18:27:09 +02:00
Marco Neumann e3c263a006 test: add tooling to purge kafka topic data 2021-08-16 13:47:07 +02:00
Marco Neumann 21ebdee5a1 feat: make kafka topic creation code reusable 2021-08-16 13:47:07 +02:00
Marco Neumann a72bacae67 test: use proper ignore instead of commenting out 2021-08-12 11:38:02 +02:00
Marco Neumann a5c74f2798 feat: ability to inject mocked write buffers into server/database 2021-08-12 10:46:16 +02:00
Marco Neumann 55f68f0401 feat: add `type_name` to write buffer interface 2021-08-12 10:46:16 +02:00
Marco Neumann 32a5c8ca3a feat: add ability to clear messages from mocked write buffers 2021-08-12 10:46:16 +02:00
Dom 3de6b44e23
build: use new rustdoc lint name (#2261)
* fix: nocache feature code rot

The MBChunk::snapshot code when using the "nocache" option no longer
compiles - this commit updates it to match the not(nocache) code.

* build: use updated broken_intra_doc_links name

The broken_intra_doc_links lint was renamed
rustdoc::broken_intra_doc_links

https://doc.rust-lang.org/rustdoc/lints.html
2021-08-11 19:48:51 +00:00
Marko Mikulicic 99b358d481
feat(iox): Enable snappy compression in the producer 2021-07-29 16:37:11 +02:00
Marco Neumann eb310ecada fix: ensure that the Kafka producer TS we report is also truncated 2021-07-28 15:48:02 +02:00
Marco Neumann 0fcec6b742 refactor: move ingest timestamp from sequence to sequended entry 2021-07-28 15:40:35 +02:00
Marco Neumann d940d4f6d3 docs: explain how test timestamps are measured 2021-07-28 14:46:33 +02:00
Marco Neumann e736bc6953 feat: add ingest timestamp to `Sequence`
This allows us to track wall-clock ingest time for entries that we
receive via write buffer (e.g. Kafka).
2021-07-28 14:41:18 +02:00
Marco Neumann 55490c279a fix: Kafka watermark error for new partitions 2021-07-21 15:21:52 +02:00
Marco Neumann 5df88c70aa feat: add ability to fetch watermarks from write buffer 2021-07-21 11:59:52 +02:00
kodiakhq[bot] 58dd7e9532
Merge branch 'main' into crepererum/writer_buffer_seek 2021-07-20 12:29:18 +00:00
Marko Mikulicic c01cfbc34c
fix: Increase kafka message size 2021-07-20 14:17:37 +02:00
Marco Neumann ec7ebdff29 refactor: use lifetimes to ensure single stream / no seek while streaming 2021-07-20 13:52:33 +02:00
Marco Neumann b0663a0337 feat: disallow multiple write buffer streams and seeking while streams
Multiple streams will mess up ordering. Seeking while streaming is
likely a bug and should not work.
2021-07-20 12:35:20 +02:00
Marco Neumann 38f4eec20e feat: implement `seek` for write buffer
This is required to control replay ranges.
2021-07-20 10:25:56 +02:00
Marco Neumann 592424c896 refactor: use one stream per sequencer/partition
Advantages are:

- for large DBs w/ many partitions we can ingest data in-parallel
- on top of this change we can implement per-sequencer seeking, which is
  required for replay
2021-07-19 12:26:58 +02:00
Edd Robinson 1676c33113
refactor: update write_buffer/src/kafka.rs 2021-07-16 16:56:03 +01:00
Marko Mikulicic de0dce79f6
docs: Update write_buffer/src/kafka.rs
Co-authored-by: Edd Robinson <me@edd.io>
2021-07-16 17:53:02 +02:00
Marko Mikulicic 92623b5f32
fix: Set acks=all for kafka writes 2021-07-16 17:49:04 +02:00
Marko Mikulicic cbadd65cfe
fix: Update write_buffer/src/kafka.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-07-15 23:00:45 +02:00
Marko Mikulicic 06399e88e0
chore: Add some debug logs to write buffer 2021-07-15 22:18:03 +02:00
Carol (Nichols || Goulding) 7301268b4f fix: Increase the internal librdkafka producer queue size
Given that we've increased the max message size by a factor of 10, also
increase the internal producer queue max size by a factor of 10 to
reduce the number of retries needed to successfully enqueue messages to
Kafka.

Connects to #2007.
2021-07-15 11:35:55 -04:00
Carol (Nichols || Goulding) fa3a2db0d3 fix: Retry adding Kafka messages to queue forever
By using [producer.send][] rather than [producer.send_result][] and
specifying Timeout::never.

Connects to #2007.

[producer.send]: https://docs.rs/rdkafka/0.26.0/rdkafka/producer/future_producer/struct.FutureProducer.html#method.send
[producer.send_result]: https://docs.rs/rdkafka/0.26.0/rdkafka/producer/future_producer/struct.FutureProducer.html#method.send_result
2021-07-15 11:34:23 -04:00
Marco Neumann a064820a70 fix: code comment should match the code
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-07-15 17:20:46 +02:00
Marco Neumann b5428e53a5 refactor: write buffer testing + better mocking
This refactors the write buffer a bit for:

- **Testing:** Add generic tests for the Kafka and the mocking
  implementation. The same interface can be used easily add new
  implementations (e.g. via Redis, filesystem, ...).
- **Partition on Write:** The caller of the writer operation must now
  specify the partition/sequencer ID. The implicit partitioning of the
  Kafka writer would have lead to broken data since we must never spill
  entries w/ the same primary key over multiple partitions. At the
  moment we will only use partition 0 but we can easily implement
  better logic in the future.
- **Improved Mocking:** The mocked implementation now simulates a system
  that feels more real. Especially the handling around multiple streams
  and "write while read" has been improved. This will be helpful for
  testing and for new features like seeking (during replay). A solid
  realistic mock also helps us to ensure that the tests using the mock
  do not rely on unrealistic behavior too much.
2021-07-15 17:20:45 +02:00
Marko Mikulicic 8d23dd6d6d
fix: Set kafka max message size in client 2021-07-14 14:46:49 +02:00
Marco Neumann 9cb9ae0874 chore: move write buffer into its own crate 2021-07-14 14:09:18 +02:00
Carol (Nichols || Goulding) f4a9a5ae56 fix: Remove write buffer 2021-06-04 14:40:17 -04:00
Marco Neumann eddc9319ff docs: deny broken intradoc links 2021-04-27 13:22:28 +02:00
Carol (Nichols || Goulding) 938acb98e8 refactor: Rename WAL within the types, variables, and comments in the write_buffer crate 2021-04-21 17:43:03 +00:00
Carol (Nichols || Goulding) f0b93c5c8c refactor: Rename the wal crate to write_buffer 2021-04-21 17:43:03 +00:00
Andrew Lamb 48c43b136c
refactor: rename write_buffer --> mutable_buffer (#595)
* refactor: git mv write_buffer mutable_buffer

* refactor: update crate name references

* refactor: update some more references
2020-12-22 10:49:53 -05:00
Andrew Lamb 263af1eeac
feat: implement read_group in the write_buffer (#583)
* feat: implement read_group in the write_buffer

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: rustfmt

* fix: adjust tests for min/max

* fix: Update write_buffer/src/table.rs

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-12-22 09:03:20 -05:00
Andrew Lamb 28eac06d8f
refactor: Organize window_bounds the same as selector functions (#594)
* refactor: Organize window_bounds the same as selector functions

* fix: add missing file
2020-12-21 12:51:36 -05:00
Andrew Lamb bb96142564
chore: Update arrow dependencies, remove custom min/max implementation (#585)
* chore: Update arrow dependency

* fix: Update code for changes in datafusion

* fix: use arrow version of min_boolean
2020-12-21 12:31:39 -05:00
Andrew Lamb 699504a022
feat: implement read_group for Sum, Count and Mean aggregates (#557)
* feat: Implement read_group for "normal" aggregates

* test: add tests
2020-12-15 09:35:00 -05:00
Andrew Lamb a6d2c13888
chore: Update arrow + other depenencies (#540)
* chore: Update arrow + other depenencies

* chore: Update write_buffer and query crate
2020-12-15 08:46:27 -05:00
Dom 6f473984d0 style: wrap comments
Runs rustfmt with the new config.
2020-12-11 18:22:26 +00:00
Andrew Lamb ea6b2f6bc8
refactor: remove minor code duplication (#555) 2020-12-11 11:18:00 -05:00
Paul Dix fa3ecbd4ed
feat: Implement write buffer to Parquet snapshotting (#526)
* feat: Implement write buffer to Parquet snapshotting

This introduces snapshot to the server packages to manage snapshotting. It also introduces a new trait for representing a Partition. There is a very crude API wired up in http_routes for testing purposes. Follow on work will bring the server package into http_routes and rework the snapshot API.
2020-12-08 14:20:43 -05:00
Andrew Lamb 4ec75a4f22
fix: Fix gRPC panic` when multiple field selections are provided (#523)
* fix: do not assert when multiple fields are selected

* fix: clippy

* fix: write unit test, fix bug

* fix: tweak comments
2020-12-03 12:31:02 -05:00
Andrew Lamb ecc4eee8e1
refactor: Move SQL functions into is own trait (#511)
* refactor: remove uneeded function table_to_arrow from Trait

* refactor: Move SQL functions into is own trait
2020-12-02 08:23:37 -05:00
Andrew Lamb 5ef499bb63
refactor: rename Database --> TSDatabase to better reflect its purpose (#510)
* refactor: rename Database --> TSDatabase to better reflect its purpose

* refactor: rename field_columns to field_column_names

* fix: clippy?
2020-12-01 12:37:11 -05:00
Andrew Lamb 1646397891
refactor: consolidate GroupedSeriesSet and SeriesSet (#502) 2020-11-30 14:23:58 -05:00
Andrew Lamb 20f421e9c6
fix: Do not send GroupFrames in response to read_window_aggregate (#497)
* fix: Do not send GroupFrames in response to read_window_aggregate

* fix: clippy and test
2020-11-30 05:59:05 -05:00
Andrew Lamb 8c8af66d4d
refactor: use datafusion expression building code in write_buffer (#484) 2020-11-25 14:41:39 -05:00
Andrew Lamb 0eaa90e89d
feat: Hook up read_window_aggregate into the write_buffer, end-to-end tests (#483)
* feat: read_window_aggregate_plans

* fix: clippy sacrifice

* fix: clippy

* fix: clippy
2020-11-25 10:20:49 -05:00
Andrew Lamb 9f6427c94f
refactor: query/src/groupby.rs -> query/src/group_by.rs (#477)
* refactor: query/src/groupby.rs -> query/src/group_by.rs

* refactor: update references
2020-11-25 06:43:11 -05:00
Andrew Lamb cdb26e60e4
refactor: rename `storage` crate to `query` to better reflect what it is (#475)
* refactor: rename storage --> query

* refactor: update a few more referenes
2020-11-24 14:19:29 -05:00
Andrew Lamb 85921fe401
feat: Implement gRPC and storage interface for read_group_aggregate (#474)
* feat: Implement gRPC and storage interface for read_group_aggregate

* fix: clippy

* docs: Tweak comments

* fix: moar clippy

* fix: fmt

* docs: Apply doc improvement suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: improve error creation

* refactor: use match instead of if

* refactor: clearer match

* refactor: clean up storage aggregate matching

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-11-24 13:41:11 -05:00
Andrew Lamb 597933622d
fix: improve error messages with more context (#455) 2020-11-16 16:40:29 -05:00
Andrew Lamb 2fa0e03162
fix: Use datafusion optimizer in IOx query plans (#439)
* chore: update arrow dep to 8e4d9ebef3

* fix: checkin Cargo.lock

* fix: Enable datafusion optimizer, use display_indent_schema
2020-11-11 18:06:21 -05:00
Andrew Lamb a52e0001c5
refactor: rename all crates that start with`delorean_` in preparation for rename (#415)
* refactor: rename delorean_cluster --> cluster

* refactor: rebane delorean_generated_types --> generated_types

* refactor: rename delorean_write_buffer --> write_buffer

* refactor: rename delorean_ingest --> ingest

* refactor: rename delorean_storage --> storage

* refactor: rename delorean_tsm --> tsm

* refactor: rename delorean_test_helpers --> test_helpers

* refactor: rename delorean_arrow --> arrow_deps

* refactor: rename delorean_line_parser --> influxdb_line_protocol
2020-11-05 13:44:36 -05:00