When a Kafka broker pod is recreated (for whatever reason) and gets a
new IP in the process, the following happens:
1. Old broker pod gets terminated, but is still reachable via DNS and
TCP.
2. rdkafka loses its connection and re-creates it using the old IP. The
   TCP connection can be established (this heavily depends on the K8s
   network setup), but it won't be able to send any messages because the
   old broker is already shutting down / dead.
3. New broker gets created w/ new IP (but same DNS name).
4. Somewhat in parallel to step 3: rdkafka gets informed by other
brokers that the topic lost its leader and then that the topic has
the new leader (which has the same identity as the old one). Since
leader changes in Kafka can also happen when brokers are totally
healthy, it doesn't conclude that its TCP connection might be broken
and tries to send messages to the new broker via the old TCP
connection.
5. It takes very long (~130s on my test setup) for the old
   rdkafka->broker TCP connection to break. Since
   `message.send.max.retries` has a default of `2147483647`, rdkafka will
   not give up on the application level (see the config sketch after this
   list).
6. rdkafka re-connects, resolves the new broker IP via DNS while doing
   so, and is happy.
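For reference, a minimal sketch of the producer settings involved, using
rust-rdkafka's `ClientConfig` (the bootstrap address matches the Strimzi
setup below; the explicit value just spells out the default mentioned in
step 5, and the helper name is illustrative):
```rust
use rdkafka::config::ClientConfig;
use rdkafka::producer::FutureProducer;

fn build_producer() -> FutureProducer {
    ClientConfig::new()
        // Bootstrap address as used in the Strimzi test setup below.
        .set("bootstrap.servers", "my-cluster-kafka-bootstrap:9092")
        // Spelled out explicitly: this default is why rdkafka keeps retrying
        // over the stale TCP connection instead of surfacing an error.
        .set("message.send.max.retries", "2147483647")
        .create()
        .expect("producer creation failed")
}
```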
An alternative fix that was tried: use the `connect` rdkafka callback to
hook into the place where it issues the UNIX `connect` call. There we
can manipulate the socket. Setting `TCP_USER_TIMEOUT` to 5000ms also
solves the issue somewhat, but might have different implications (it
also then takes around 5s to kill the connection). Since this is a more
hackish implementation and somewhat an unofficial way to configure
rdkafka, I decided against it.
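For illustration, this is roughly what that socket manipulation would
look like on Linux, assuming the connect callback hands us the raw
socket fd (the helper name here is made up):
```rust
use std::os::unix::io::RawFd;

/// Set `TCP_USER_TIMEOUT` (in milliseconds) on an already-created socket so
/// the kernel gives up on unacknowledged data much sooner than the default.
fn set_tcp_user_timeout(fd: RawFd, timeout_ms: u32) -> std::io::Result<()> {
    let ret = unsafe {
        libc::setsockopt(
            fd,
            libc::IPPROTO_TCP,
            libc::TCP_USER_TIMEOUT,
            &timeout_ms as *const u32 as *const libc::c_void,
            std::mem::size_of::<u32>() as libc::socklen_t,
        )
    };
    if ret != 0 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(())
}
```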
Test Setup
==========
```rust
#[tokio::test]
async fn write_forever() {
    maybe_start_logging();
    let conn = maybe_skip_kafka_integration!();
    let adapter = KafkaTestAdapter::new(conn);
    let ctx = adapter.new_context(NonZeroU32::new(1).unwrap()).await;
    let writer = ctx.writing(true).await.unwrap();

    let lp = "upc user=1 100";
    let sequencer_id = set_pop_first(&mut writer.sequencer_ids()).unwrap();

    // Write one line-protocol row per second, forever, printing the result
    // of every store operation.
    for i in 1.. {
        println!("{}", i);

        let tables = mutable_batch_lp::lines_to_batches(lp, 0).unwrap();
        let write = DmlWrite::new(tables, DmlMeta::unsequenced(None));
        let operation = DmlOperation::Write(write);

        let res = writer.store_operation(sequencer_id, &operation).await;
        dbg!(res);

        tokio::time::sleep(Duration::from_secs(1)).await;
    }
}
```
Make sure to set the rdkafka `log` config to `all`. Then use KinD,
set up a 3-node Strimzi cluster, and start the test binary within the K8s
cluster. You need to start a debug container that is close enough to
your developer system (e.g. an old Debian DOES NOT work if you run
bleeding-edge Arch):
```console
$(host) kubectl run -i --tty --rm debug --image=archlinux --restart=Never -n kafka -- bash
```
Then copy the test binary over to the container using [cargo-with](https://github.com/cbourjau/cargo-with):
```console
$(host) cargo with 'kubectl cp {bin} kafka/debug:/foo' -- test -p write_buffer
```
Within the container shell that you've just created, start the
forever-running test (make sure to set `KAFKA_CONNECT` according to your
Strimzi setup!):
```console
$(container) TEST_INTEGRATION=1 KAFKA_CONNECT=my-cluster-kafka-bootstrap:9092 RUST_BACKTRACE=1 RUST_LOG=debug ./foo write_forever --nocapture
```
The test should run and tell you that it is delivering messages. The
debug logs also tell you which broker the messages are sent to.
Now kill that broker (in my example it was `my-cluster-kafka-1`):
```console
$(host) kubectl -n kafka delete pod my-cluster-kafka-1
```
The test should now stop delivering messages and start erroring. Without
this patch it can take over 100s to recover even after the deleted pod
has been re-created. With this patch it is quickly able to deliver data
again after the broker comes back online.
Fixes #3030.
* fix: Add tokio rt-multi-thread feature so cargo test -p client_util compiles
* fix: Alphabetize dependencies
* fix: Add the data_types_conversions feature to get tests passing
* fix: Remove dev dependencies already listed under normal dependencies
* fix: Make sure the workspace is using the new resolver
* chore: update one-shot Dockerfile to not depend on rust:ci
* chore: update Debian to bullseye
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This will break `perf_image` until the new CI image is built due to the
newly required `--all-tags` parameter to `docker push` that isn't
available for the docker version we run on buster.
The std `DefaultHasher` is NOT guaranteed to stay the same, so let's use
`SipHasher13` directly, which at the moment (2021-11-15) is what the
standard library uses.
Fixes #3063.
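For illustration, a stable hash helper along those lines might look like
this, assuming the `siphasher` crate (the fixed keys and helper name are
illustrative):
```rust
use siphasher::sip::SipHasher13;
use std::hash::{Hash, Hasher};

/// Hash a value with SipHash-1-3 using fixed keys, so the result stays
/// stable across Rust releases (std's `DefaultHasher` makes no such promise).
fn stable_hash<T: Hash>(value: &T) -> u64 {
    let mut hasher = SipHasher13::new_with_keys(0, 0);
    value.hash(&mut hasher);
    hasher.finish()
}
```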