* chore: upgrade rskafka
* refactor: less cloning
* fix: defined behaviour when seeking to an unknown sequence number
The new, defined behavior is: "return an error once and then end the
stream".
Co-authored-by: Edd Robinson <me@edd.io>
For sparse data the PB-encoded data (our Kafka wire format) is way
smaller than the MutableBatch (up to a factor of 20). So let's use the
encoded size to estimate the size during batching.
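As a minimal sketch of the idea (assuming a prost-generated payload type; the helper name is illustrative, not actual IOx code):
```rust
// Sketch only: estimate the batch size from the protobuf wire size rather
// than the in-memory MutableBatch size, since the wire size is what counts
// against the Kafka message limit.
use prost::Message;

/// Hypothetical helper: the size the payload will occupy on the Kafka wire.
fn estimated_wire_size<M: Message>(payload: &M) -> usize {
    payload.encoded_len()
}
```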
Prod has a larger max message size for Kafka (10MB instead of 1MB), but
currently we're unable to wire all the write buffer configs through. As
a quick fix, let's hard-code the config. However, this breaks the write
buffer when running under default Kafka (1MB), so we should revert this
(tracked under #3723).
When creating a new aggregation span, you MUST NOT just create a new
random span context and put its child span into a span recorder, because
then only the child will be reported to the trace collector. Instead,
create a new root span without any parent directly.
This makes Jaeger slightly happier and it won't complain about broken
spans anymore.
* refactor: improve writer buffer consumer interface
The change looks huge but is actually rather simple. To
understand the interface change, let me first explain what we want:
- be able to fetch watermarks for any sequencer
- have streams:
- each stream tracks a sequencer and has an offset state (no read
multiplexing)
- we can seek a stream
- seeking and streaming cannot be done at the same time (that would be
weird and would likely lead to many bugs both in the write buffer and in
the user code)
- ideally we don't need to create streams of all sequencers but can
choose a subset
Before this change we had one mutable consumer struct from which you
could get all streams and watermark functions (this mutably borrows the
consumer) or seek a single stream (this also mutably borrows the
consumer). This is a bit weird for multiple reasons:
- you cannot seek a single stream without dropping all of them
- the mutable-borrow construct makes it really difficult to pass the
streams into separate threads
- the consumer is boxed (because it's mutable) which makes it more
difficult to handle in a large-scale application
What this change does is the following:
- you have an immutable consumer (similar to the producer)
- the consumer offers the following methods:
- get the set of sequencer IDs
- get watermark for any sequencer
- get a stream handler (see next point) for any sequencer
- the stream handler captures the stream state (offset) and provides you
with a standard `Stream<_>` interface as well as a seek function.
Mutable borrows ensure that you cannot use both at the same time.
The stream handler provides you the stream via `handler.stream()`. It
doesn't implement `Stream<_>` itself because of the way boxing, dynamic
dispatch, and pinning interact (i.e. I couldn't get it to work without
the indirection).
As a bonus (which we don't use, however), you can now create multiple
streams for the same sequencer, and each has its own offset.
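A rough sketch of the shape of that interface (names and signatures are approximations based on the description above, via the `async_trait` and `futures` crates, not the exact trait definitions in the crate):
```rust
use std::collections::BTreeSet;

use futures::stream::BoxStream;

// Stand-ins for the real crate types, just to keep the sketch self-contained.
type Error = Box<dyn std::error::Error + Send + Sync>;
struct DmlOperation;

#[async_trait::async_trait]
trait WriteBufferReading: Send + Sync {
    /// The sequencer IDs this consumer knows about.
    fn sequencer_ids(&self) -> BTreeSet<u32>;

    /// Watermark for any sequencer, callable at any time (immutable borrow).
    async fn fetch_high_watermark(&self, sequencer_id: u32) -> Result<u64, Error>;

    /// Create an independent stream handler for one sequencer; each handler
    /// owns its own offset state, so multiple handlers per sequencer work.
    async fn stream_handler(
        &self,
        sequencer_id: u32,
    ) -> Result<Box<dyn WriteBufferStreamHandler>, Error>;
}

#[async_trait::async_trait]
trait WriteBufferStreamHandler: Send {
    /// Mutable borrow: while you hold the stream you cannot seek.
    fn stream(&mut self) -> BoxStream<'_, Result<DmlOperation, Error>>;

    /// Also a mutable borrow, for the same reason.
    async fn seek(&mut self, sequence_number: u64) -> Result<(), Error>;
}
```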
* fix: review comments
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
All features are now covered by rskafka. This also removes the need to
specify a server ID for write buffer consumers. That was only used for
rdkafka, since we had to specify a consumer group there even though we
did not use any transactions.
* chore: upgrade rskafka + enable snappy support
* test: ensure that rskafka and rdkafka work together
Before removing rdkafka ensure that:
- rskafka can consume existing messages produced by rdkafka so we do not
need to drain existing topics
- rdkafka can consume new messages produced by rskafka so we can roll
back
I ran the whole `write_buffer` test suite (including the newly added
tests) using Apache Kafka as well as Redpanda.
* test: ensure we handle consumer offset in error case correctly
* docs: explain test setup
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: Add db_name/namespace to DmlWrite and DmlDelete
This is required for the new ingester to be able to work with the write buffer. The protobuf that gets serialized over Kafka already includes the database name; it just wasn't getting carried through to the marshaled DML operation.
* fix: database != namespace, propagation through write buffer
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
- Use a more standard way to set up the tracing subsystem (as described
in the tracing-subscriber docs)
- Also capture content from the `log` crate
- Play nice w/ Rust's libtest message capture
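A hedged example of that kind of setup (not the exact code from this change), using the standard tracing-subscriber builder plus the `log` bridge:
```rust
use tracing_subscriber::{fmt, EnvFilter};

fn init_test_logging() {
    // Forward records emitted via the `log` crate into `tracing`.
    let _ = tracing_log::LogTracer::init();

    // Standard tracing-subscriber setup; the test writer cooperates with
    // libtest's output capture (output is shown only for failing tests or
    // when running with --nocapture).
    let _ = fmt()
        .with_env_filter(EnvFilter::from_default_env())
        .with_test_writer()
        .try_init();
}
```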
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: Sequencer wrapper
This type wraps an underlying WriteBufferWriter implementation, tagging
it with a sequencer ID it should use when enqueuing operations to the
buffer.
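Roughly, the wrapper looks like this (an illustrative sketch; the struct and method names here are approximations, `WriteBufferWriting`, `DmlOperation`, and `DmlMeta` refer to the existing crate types, and the error type name is assumed):
```rust
use std::sync::Arc;

/// Sketch: pins every operation issued through this handle to one
/// sequencer (shard). `WriteBufferError` stands in for the write buffer's
/// actual error type.
struct Sequencer {
    id: u32,
    inner: Arc<dyn WriteBufferWriting>,
}

impl Sequencer {
    fn new(id: u32, inner: Arc<dyn WriteBufferWriting>) -> Self {
        Self { id, inner }
    }

    /// Enqueue an operation to the sequencer this wrapper was built for.
    async fn enqueue(&self, op: DmlOperation) -> Result<DmlMeta, WriteBufferError> {
        self.inner.store_operation(self.id, &op).await
    }
}
```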
* feat: mock sharder
Implements a mock Sharder impl that returns pre-configured responses to
shard(), and captures the input to the call.
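As a sketch of the pattern (a generic mock shape, not the exact type in the codebase): pre-configured responses are handed out in order and every input is recorded for later assertions.
```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Generic sketch of a call-capturing mock.
struct MockSharder {
    /// Responses returned in order, one per `shard()` call.
    responses: Mutex<VecDeque<u32>>,
    /// Captured inputs, inspectable from the test afterwards.
    calls: Mutex<Vec<String>>,
}

impl MockSharder {
    fn shard(&self, table: &str) -> u32 {
        self.calls.lock().unwrap().push(table.to_string());
        self.responses
            .lock()
            .unwrap()
            .pop_front()
            .expect("no pre-configured shard response left")
    }
}
```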
* feat: sharded write buffer
Implements sharding of ops into an underlying WriteBuffer.
Writes are sharded by some abstract Sharder impl, collated per shard to
maximise the size of each op (and therefore compression efficiency),
converted into a DML operation and then enqueued in parallel to the
underlying WriteBuffer implementation.
Deletes are modelled as being mapped to a single write buffer shard,
which is the case while we support sharding based on the table &
namespace only. Deletes will be extended to support (potentially)
multiple shards when column overrides are implemented.
* refactor: runtime write buffers
Switch from using static dispatch, to using a runtime specified
WriteBufferWriting implementation.
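In sketch form (assuming the crate's `WriteBufferWriting` trait; the struct names here are illustrative), the difference is simply a trait object instead of a generic parameter:
```rust
use std::sync::Arc;

// Before: the write buffer implementation is fixed at compile time.
struct ShardedWriteBufferStatic<W: WriteBufferWriting> {
    shards: Vec<W>,
}

// After: the implementation is chosen at runtime (e.g. Kafka in prod,
// a mock or file-based buffer in tests).
struct ShardedWriteBuffer {
    shards: Vec<Arc<dyn WriteBufferWriting>>,
}
```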
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Some crates listed features they don't use; other crates were relying on
feature flags enabled by something else. I tested these changes by
disabling the workspace hack crate and testing each crate.
- for file-based write buffers: Use headers + payload
- for Kafka-based write buffers: Use the estimation that we also use for
other metrics
- as a side effect we can now just use `PartialEq` for more types
Fixes #3186.
With the new rust-rdkafka release (merged in #3234) managing multiple
consumer streams becomes a bit easier. Also, we can just reuse consumer
clients for multiple metadata requests. In total this provides:
- use only a single client connection for consumers (we had multiple
connection attempts during startup and one client per stream)
- use only two clients for producers (sadly we need a consumer client to
probe the partitions during startup)
- consumers no longer need to poll the stream to receive statistics
The direction flag was required when a database could read or write
from/to a write buffer. Now it is clear from the usage context of a
write buffer which of the two applications is meant (databases read,
routers write), so the direction flag is no longer required.
The existing channel construction could lead to cases where streams
would consume messages and put them into the channel, but when the
stream got dropped the messages would be gone forever. So let's move
from a channel-based implementation to directly invoking the generator
future, so this buffering doesn't occur.
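A small sketch of the idea using futures' `unfold` (illustrative only; the real stream items and fetch logic belong to the write buffer): each item is produced on demand by the fetch future itself, so nothing sits buffered in a channel where it could be lost when the stream is dropped.
```rust
use std::future::Future;

use futures::stream::{self, Stream};

/// Build a stream that fetches each message lazily; `fetch_next` is a
/// hypothetical closure returning (message, next offset).
fn message_stream<F, Fut, M>(fetch_next: F) -> impl Stream<Item = M>
where
    F: FnMut(u64) -> Fut,
    Fut: Future<Output = (M, u64)>,
{
    stream::unfold((0u64, fetch_next), |(offset, mut fetch_next)| async move {
        // The message only exists once the consumer actually polls for it.
        let (msg, next_offset) = fetch_next(offset).await;
        Some((msg, (next_offset, fetch_next)))
    })
}
```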
Fixes #3179.
When a Kafka broker pod is recreated (for whatever reason) and gets a
new IP while doing so, the following happened:
1. Old broker pod gets terminated, but is still reachable via DNS and
TCP.
2. rdkafka loses its connection, re-creates it using the old IP. The
TCP connection can be established (this heavily depends on the K8s
network setup), but won't be able to send any messages because the
old broker is already shutting down / dead.
3. New broker gets created w/ new IP (but same DNS name).
4. Somewhat in parallel to step 3: rdkafka gets informed by other
brokers that the topic lost its leader and then that the topic has
the new leader (which has the same identity as the old one). Since
leader changes in Kafka can also happen when brokers are totally
healthy, it doesn't conclude that its TCP connection might be broken
and tries to send messages to the new broker via the old TCP
connection.
5. It takes very long (~130s on my test setup) for the old
rdkafka->broker TCP connection to break. Since
`message.send.max.retries` has a default of `2147483647` rdkafka will
not give up on the application level.
6. rdkafka re-connects; while doing so it resolves the new broker IP via
DNS and is happy.
An alternative fix that was tried: Use the `connect` rdkafka callback to
hook into the place where it would issue the UNIX `connect` call. There
we can manipulate the socket. Setting `TCP_USER_TIMEOUT` to 5000ms also
solves the issue somewhat, but might have different implications (also
it then takes around 5s to kill the connection). Since this is a more
hackish implementation and somewhat an unofficial way to configure
rdkafka, I decided against it.
Test Setup
==========
```rust
#[tokio::test]
async fn write_forever() {
    maybe_start_logging();
    let conn = maybe_skip_kafka_integration!();
    let adapter = KafkaTestAdapter::new(conn);
    let ctx = adapter.new_context(NonZeroU32::new(1).unwrap()).await;

    let writer = ctx.writing(true).await.unwrap();
    let lp = "upc user=1 100";
    let sequencer_id = set_pop_first(&mut writer.sequencer_ids()).unwrap();

    for i in 1.. {
        println!("{}", i);

        let tables = mutable_batch_lp::lines_to_batches(lp, 0).unwrap();
        let write = DmlWrite::new(tables, DmlMeta::unsequenced(None));
        let operation = DmlOperation::Write(write);

        let res = writer.store_operation(sequencer_id, &operation).await;
        dbg!(res);

        tokio::time::sleep(Duration::from_secs(1)).await;
    }
}
```
Make sure to set the rdkafka `log` config to `all`. Then use KinD, set
up a 3-node Strimzi cluster, and start the test binary within the K8s
cluster. You need to start a debug container that is close enough to
your developer system (e.g. an old Debian DOES NOT work if you run
bleeding-edge Arch):
```console
$(host) kubectl run -i --tty --rm debug --image=archlinux --restart=Never -n kafka -- bash
```
Then you copy the test binary over to the container using [cargo-with](https://github.com/cbourjau/cargo-with):
```console
$(host) cargo with 'kubectl cp {bin} kafka/debug:/foo' -- test -p write_buffer
```
Within the container shell that you've just created, start the
forever-running test (make sure to set `KAFKA_CONNECT` according to your
Strimzi setup!):
```console
$(container) TEST_INTEGRATION=1 KAFKA_CONNECT=my-cluster-kafka-bootstrap:9092 RUST_BACKTRACE=1 RUST_LOG=debug ./foo write_forever --nocapture
```
The test should run and tell you that it is delivering messages. It also
tells you within the debug logs which broker it sends the messages to.
Now you need to kill the broker (in my example it was `my-cluster-kafka-1`):
```console
$(host) kubectl -n kafka delete pod my-cluster-kafka-1
```
The test should now stop delivering messages and should error. Without
this patch it might take over 100s for it to recover even after the
deleted pod was re-created. With this patch it is quickly able to
deliver data again after the broker comes back online.
Fixes #3030.
* fix: Add tokio rt-multi-thread feature so cargo test -p client_util compiles
* fix: Alphabetize dependencies
* fix: Add the data_types_conversions feature to get tests passing
* fix: Remove dev dependencies already listed under normal dependencies
* fix: Make sure the workspace is using the new resolver