Dom Dwyer
bd88ac6149
refactor: parallelise Kafka partition client init
...
Changes the Kafka write buffer impl to parallelise initialisation of the
PartitionClient instances.
Now that the PartitionClient constructor also performs leader discovery
(using cached metadata, influxdata/rskafka#164 ) and establishes a broker
connection (influxdata/rskafka#166 ), running the constructors in parallel
will cause a proportional decrease in the time taken to bring IOx up.
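
A minimal sketch of the idea, assuming blocking setup work and plain OS
threads instead of the async runtime the write buffer actually uses. The
`PartitionClient` struct, `connect` and `init_all` below are hypothetical
stand-ins, not the rskafka API:

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a per-partition client; not the rskafka type.
#[derive(Debug, PartialEq)]
pub struct PartitionClient {
    pub partition: i32,
}

/// Simulates a constructor that performs blocking setup work
/// (for example leader discovery and connection establishment).
pub fn connect(partition: i32, setup_cost: Duration) -> PartitionClient {
    thread::sleep(setup_cost);
    PartitionClient { partition }
}

/// Initialise one client per partition concurrently and return them in
/// the same order as `partitions`.
pub fn init_all(partitions: &[i32], setup_cost: Duration) -> Vec<PartitionClient> {
    thread::scope(|s| {
        // Spawn every constructor first so they all run at the same time...
        let handles: Vec<_> = partitions
            .iter()
            .map(|&p| s.spawn(move || connect(p, setup_cost)))
            .collect();
        // ...then wait for each one, preserving the input order.
        handles
            .into_iter()
            .map(|h| h.join().expect("init thread panicked"))
            .collect()
    })
}
```

With N partitions, the total wait is roughly one `setup_cost` rather than
N of them, which is the effect described above.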
2022-08-12 14:45:23 +02:00
Dom
dbe6b4947c
Merge branch 'main' into dom/bump-rskafka
2022-08-12 09:20:37 +01:00
Dom Dwyer
7118334774
build: bump rskafka
...
Bump rskafka to pick up connection pre-warming:
https://github.com/influxdata/rskafka/pull/166
2022-08-12 10:13:25 +02:00
Carol (Nichols || Goulding)
3a501a4a10
fix: Remove an immediate ref to a deref
...
Caught by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#borrow_deref_ref
2022-08-11 15:04:14 -04:00
Dom Dwyer
faa1db9a24
build: bump rskafka
...
Bump rskafka & fix minor breakage in order to pick up client
pre-warming:
https://github.com/influxdata/rskafka/pull/165
2022-08-11 17:26:06 +02:00
Dom Dwyer
7174f38f3f
build: bump rskafka
...
Bumps rskafka to HEAD to pick up:
https://github.com/influxdata/rskafka/pull/164
2022-08-11 14:00:21 +02:00
Marco Neumann
6b8b922fe7
fix: do not lose data when Kafka reports that offset is above watermark ( #5322 )
...
* fix: do not lose data when Kafka reports that offset is above watermark
This can happen in certain cluster rebalance settings.
This is also linked to https://github.com/influxdata/rskafka/issues/147
but for the upstream issue I currently have no idea how to fix it, so
let's at least harden IOx against it.
Fixes #5128 .
* refactor: panic for `SequenceNumberAfterWatermark`
2022-08-11 07:32:04 +00:00
Andrew Lamb
16ddc5efc6
chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem ( #5360 )
...
* chore: Update datafusion and arrow
* chore: Update Cargo.lock
* chore: update to Decimal128
* chore: Update tonic/prost/pbjson/etc
* chore: Run cargo hakari tasks
* fix: doctest in generated types
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 17:30:44 +00:00
Dom Dwyer
87e4290e1f
refactor(write_buffer): database_name -> topic_name
...
Previously IOx mapped a single database to a single kafka topic - this
is no longer the case, so referring to the kafka topic name as the
"database name" is confusing.
2022-08-08 15:24:35 +02:00
Dom Dwyer
c133cf22c6
refactor: use kafka produce instrumentation
...
This commit changes the IOx write buffer initialisation code to add the
KafkaProducerMetrics instrumentation to the per-partition Kafka clients.
2022-08-08 15:24:35 +02:00
Dom Dwyer
284a3069ce
feat: Kafka client produce() instrumentation
...
Adds a decorator over the underlying kafka client to capture the latency
distribution of the low-level kafka writes, independent of the
aggregation/DML batching framework that sits "above" this client.
The latency measurements include the serialisation overhead, protocol
overhead, and actual network I/O.
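
The decorator shape can be sketched as below. This assumes a synchronous,
hypothetical `Producer` trait and stores raw durations in a `Vec`; the
real client is async and reports to a metrics histogram, so treat every
name here as illustrative only:

```rust
use std::time::{Duration, Instant};

/// Hypothetical minimal producer interface (the real client is async).
pub trait Producer {
    fn produce(&mut self, payload: &[u8]) -> Result<i64, String>;
}

/// Decorator that records the wall-clock latency of every `produce` call
/// made through it, then returns the inner result unchanged.
pub struct Instrumented<P> {
    inner: P,
    pub latencies: Vec<Duration>,
}

impl<P> Instrumented<P> {
    pub fn new(inner: P) -> Self {
        Self { inner, latencies: Vec::new() }
    }
}

impl<P: Producer> Producer for Instrumented<P> {
    fn produce(&mut self, payload: &[u8]) -> Result<i64, String> {
        let started = Instant::now();
        let result = self.inner.produce(payload);
        // Record the latency for successes and failures alike.
        self.latencies.push(started.elapsed());
        result
    }
}

/// Toy in-memory producer, used only to exercise the decorator.
#[derive(Default)]
pub struct InMemory {
    pub next_offset: i64,
}

impl Producer for InMemory {
    fn produce(&mut self, _payload: &[u8]) -> Result<i64, String> {
        let offset = self.next_offset;
        self.next_offset += 1;
        Ok(offset)
    }
}
```

Because `Instrumented<P>` implements the same trait as `P`, callers above
it (such as a batching layer) need no changes to gain the measurement.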
2022-08-08 15:24:35 +02:00
kodiakhq[bot]
0ba3ae1e0d
Merge branch 'main' into dom/instrument-kafka-produce
2022-08-04 15:13:49 +00:00
Dom Dwyer
77fd967517
feat: instrument kafka aggregated DML batch size
...
The Kafka write buffer implementation (and only the Kafka impl) merges
together successive DML writes for the same namespace & partition within
a window of time.
This commit records the number of DML writes that have been merged
together to form a single batched op before it is dispatched to Kafka.
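
A rough sketch of recording the merged-write count at flush time. The
`Aggregator` type is hypothetical, writes are reduced to plain strings,
and a `Vec<usize>` stands in for whatever metric type is really used:

```rust
/// Hypothetical aggregator that counts how many writes went into each batch.
#[derive(Default)]
pub struct Aggregator {
    pending: Vec<String>,
    /// One entry per flushed batch: the number of writes merged into it.
    pub batch_sizes: Vec<usize>,
}

impl Aggregator {
    /// Buffer a write until the next flush.
    pub fn push(&mut self, write: String) {
        self.pending.push(write);
    }

    /// Flush all pending writes as one batch and record its size.
    /// Returns `None` (and records nothing) if there is nothing to flush.
    pub fn flush(&mut self) -> Option<Vec<String>> {
        if self.pending.is_empty() {
            return None;
        }
        let batch = std::mem::take(&mut self.pending);
        self.batch_sizes.push(batch.len());
        Some(batch)
    }
}
```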
2022-08-04 16:48:56 +02:00
Dom Dwyer
1cad7e13ec
build: bump rskafka to latest
...
Includes minor code changes needed to support the rskafka HEAD commit.
Breaking changes made in
https://github.com/influxdata/rskafka/issues/160
2022-08-04 15:02:11 +02:00
Marco Neumann
273b3cc165
chore: replace `dotenv` with `dotenvy` ( #5285 )
...
The latter one is a maintained fork. This avoids having both crates
after #5282 .
2022-08-03 12:41:38 +00:00
Marco Neumann
87bdabb38a
feat: log external span for query gRPC requests ( #5187 )
...
* feat: log external span for query gRPC requests
This should simplify the correlation with our binlog data.
* refactor: address review comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-28 12:53:12 +00:00
dependabot[bot]
9b67de2f43
chore(deps): Bump tokio from 1.19.2 to 1.20.0
...
Bumps [tokio](https://github.com/tokio-rs/tokio ) from 1.19.2 to 1.20.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.19.2...tokio-1.20.0 )
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-07-14 01:21:43 +00:00
Marco Neumann
9e09f77a45
fix: fix overeager Kafka message flushing ( #5113 )
...
* test: add (failing) test to ensure that interleaved partition writes are aggregated correctly
* fix: fix overeager Kafka message flushing
2022-07-13 12:32:03 +00:00
Andrew Lamb
280698f9f5
feat: Increase `DmlWrite` operation throughput by pipelining kafka read and decode ( #5066 )
...
* feat: pipeline kafka read and decode
* docs: Update write_buffer/src/kafka/mod.rs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
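
The commit message gives no detail on the mechanism, so the following is
only a generic illustration of pipelining a read stage and a decode stage.
It assumes a bounded std channel between two threads and uses UTF-8
decoding as a placeholder for the real message decoding; `read_and_decode`
is a hypothetical name:

```rust
use std::sync::mpsc;
use std::thread;

/// Run "read" and "decode" as two overlapping stages.
/// The reader thread hands raw messages to the decoder through a bounded
/// channel, so decoding message N can overlap with reading message N+1.
pub fn read_and_decode(messages: Vec<Vec<u8>>) -> Vec<String> {
    // Bounded channel: the reader can run at most 8 messages ahead.
    let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(8);

    // Stage 1: "read" raw messages (here they are simply iterated).
    let reader = thread::spawn(move || {
        for raw in messages {
            if tx.send(raw).is_err() {
                break; // decoder went away, stop reading
            }
        }
        // `tx` is dropped here, which ends the decoder's iteration.
    });

    // Stage 2: decode on the current thread as messages arrive.
    let decoded: Vec<String> = rx
        .iter()
        .map(|raw| String::from_utf8_lossy(&raw).into_owned())
        .collect();

    reader.join().expect("reader thread panicked");
    decoded
}
```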
2022-07-08 13:18:21 +00:00
Andrew Lamb
8f5210ea3e
test: add test for "duration since production" in kafka `write_buffer` implementation ( #5043 )
...
* test: add test for timestamps in kafka write buffer
* refactor: move timestamp batching test to generic tests
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-07 10:27:27 +00:00
Andrew Lamb
5944f27e77
refactor: avoid write buffer cloning in `store_operation` ( #5042 )
...
* refactor: avoid write buffer cloning in `store_operation`
* fix: update usage
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-06 06:57:03 +00:00
Andrew Lamb
0c705fecf1
refactor: Clean up timestamp handling logic and avoid a conversion ( #4988 )
...
* refactor: Clean up timestamp handling logic
* fix: Remove unused timestamp function
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-01 01:07:21 +00:00
Marco Neumann
751bdce88a
fix: pass write buffer tests w/o Kafka ( #4923 )
...
Fixes interaction of `maybe_skip_kafka_integration!` and `should_panic`
by ensuring that `maybe_skip_kafka_integration!` panics to skip
`should_panic` tests.
Without that it is not possible to just run `cargo test -p write_buffer`.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-22 10:41:40 +00:00
Dom Dwyer
c1f7154031
feat: propagate partition key through kafka
...
Changes the kafka message wire format to include the partition key for
serialised DML writes.
After this commit, the kafka messages will contain the partition key for
each op, but this information will go unused in the ingester - this
enables us to roll out the producer side, before making the value's
presence necessary on the consumer side.
A follow-up PR will change the ingester to utilise this embedded
partition key.
This has the unfortunate side effect of making the partition key part of
the public gRPC write API:
https://github.com/influxdata/influxdb_iox/issues/4866
2022-06-20 13:42:51 +01:00
Marco Neumann
0fbff981ec
chore(deps): Bump sqlx to 0.6.0 and uuid to 1 ( #4894 )
...
Closes #4889 .
Closes #4890 .
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-17 10:28:28 +00:00
Dom Dwyer
43b3f22411
fix: respect partition key when batching dml ops
...
This commit changes the kafka write aggregator to only merge DML ops
destined for the same partition.
Prior to this commit the aggregator was merging DML ops that had
different partition keys, causing data to be persisted in incorrect
partitions:
https://github.com/influxdata/influxdb_iox/issues/4787
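
The fixed behaviour can be sketched as grouping by partition key before
merging. The `Write` struct and `merge_by_partition` are hypothetical
simplifications; the real aggregator also considers the namespace and a
time window, which this sketch leaves out:

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified DML write: a partition key plus some lines.
#[derive(Debug, Clone, PartialEq)]
pub struct Write {
    pub partition_key: String,
    pub lines: Vec<String>,
}

/// Merge writes, combining only those that share a partition key.
/// Writes with different keys always end up in different batches.
pub fn merge_by_partition(writes: Vec<Write>) -> Vec<Write> {
    let mut merged: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for w in writes {
        merged.entry(w.partition_key).or_default().extend(w.lines);
    }
    // BTreeMap iteration yields batches ordered by partition key.
    merged
        .into_iter()
        .map(|(partition_key, lines)| Write { partition_key, lines })
        .collect()
}
```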
2022-06-16 14:05:32 +01:00
Dom Dwyer
4df2964566
refactor: store PartitionKey in DmlWrite
...
Carry the PartitionKey in the DmlWrite, allowing the batch to be
associated with a specific partition key.
2022-06-15 15:48:54 +01:00
Andrew Lamb
50697906b1
refactor: Make `DMLWrite::sequence_number` a `SequenceNumber` ( #4817 )
2022-06-09 19:36:37 +00:00
dependabot[bot]
04c685b3b7
chore(deps): Bump tokio-util from 0.7.2 to 0.7.3 ( #4784 )
...
Bumps [tokio-util](https://github.com/tokio-rs/tokio ) from 0.7.2 to 0.7.3.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.2...tokio-util-0.7.3 )
---
updated-dependencies:
- dependency-name: tokio-util
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-06 14:46:27 +00:00
dependabot[bot]
e03bf94420
chore(deps): Bump tokio from 1.18.2 to 1.19.1 ( #4783 )
...
Bumps [tokio](https://github.com/tokio-rs/tokio ) from 1.18.2 to 1.19.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.18.2...tokio-1.19.1 )
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-06 14:15:12 +00:00
Carol (Nichols || Goulding)
e5e08e5b16
test: Add a test of reset_to_earliest for all write buffer implementations
...
This is the basic test case; I've filed #4651 for the more complex test
needing deletion of records from the write buffer.
2022-05-20 20:48:17 -04:00
Carol (Nichols || Goulding)
ab72c93a5e
docs: Updating wrapping, content, and grammar of comments
2022-05-20 10:51:07 -04:00
Carol (Nichols || Goulding)
c811bebdb7
feat: Add ingester CLI option to skip to oldest available WB seq num
...
The default behavior of the ingester is to panic if the min unpersisted
sequence number in the catalog is unknown to the write buffer due to the
retention policies having evicted that sequence number.
Specifying `--skip-to-oldest-available` changes this behavior to skip to
the oldest sequence number the write buffer does have available and go
from there.
Fixes #4624 .
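
The decision described above can be sketched as a pure function. `Start`
and `resolve_start` are hypothetical names, and the sketch returns an
`Unavailable` value where the ingester would panic by default:

```rust
/// Where reading should begin, or why it cannot.
#[derive(Debug, PartialEq)]
pub enum Start {
    /// Begin reading at this sequence number.
    At(u64),
    /// The wanted sequence number was evicted and skipping is not allowed;
    /// a caller following the default behaviour would panic here.
    Unavailable { wanted: u64, oldest: u64 },
}

/// Decide where to start, given the min unpersisted sequence number
/// (`wanted`), the oldest one the write buffer still holds, and whether
/// `--skip-to-oldest-available` was specified.
pub fn resolve_start(wanted: u64, oldest_available: u64, skip_to_oldest: bool) -> Start {
    if wanted >= oldest_available {
        Start::At(wanted)
    } else if skip_to_oldest {
        Start::At(oldest_available)
    } else {
        Start::Unavailable { wanted, oldest: oldest_available }
    }
}
```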
2022-05-20 10:51:07 -04:00
Carol (Nichols || Goulding)
b3f97bdb9d
test: Capture existing behavior for unknown sequence number
2022-05-20 10:51:06 -04:00
Marco Neumann
12937ee724
feat: add SOCKS5 support to Kafka write buffer ( #4623 )
2022-05-17 15:21:35 +00:00
dependabot[bot]
259d2486c1
chore(deps): Bump tokio-util from 0.7.1 to 0.7.2 ( #4605 )
...
Bumps [tokio-util](https://github.com/tokio-rs/tokio ) from 0.7.1 to 0.7.2.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.1...tokio-util-0.7.2 )
---
updated-dependencies:
- dependency-name: tokio-util
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-16 11:42:31 +00:00
Carol (Nichols || Goulding)
068096e7e1
fix: Rename data_types2 to data_types
2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding)
0541c6e40f
fix: Remove data_types crate where it's no longer used
2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding)
44209faa8e
fix: Move write buffer data types to write_buffer crate
2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)
236edb9181
fix: Move Sequence type to data_types2
2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)
afdff2b1db
fix: Move DatabaseName to data_types2
2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding)
1ea4a40b1f
fix: Move NonEmptyString to data_types2
2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding)
3ab0788a94
fix: Move DeletePredicate types to data_types2
2022-05-06 14:45:37 -04:00
dependabot[bot]
420c306caa
chore(deps): Bump tokio from 1.17.0 to 1.18.0 ( #4453 )
...
Bumps [tokio](https://github.com/tokio-rs/tokio ) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.17.0...tokio-1.18.0 )
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-28 08:21:17 +00:00
二手掉包工程师
4b47d723b1
refactor: Rename time to iox_time ( #4416 )
...
Signed-off-by: hi-rustin <rustin.liu@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-26 00:19:59 +00:00
Andrew Lamb
73bed810da
chore: Update arrow, arrow-flight, parquet, tonic, prost, etc ( #4357 )
...
* chore: Update datafusion
* chore: Update arrow/arrow-flight/parquet to 12
* chore: update datafusion correctly
* chore: Update prost, tonic, and dependents
* fix: Fixup some api changes
* fix: Update test output in db
* fix: Update test output in parquet_file
* fix: remove old pbjson types
* fix: Add "--experimental_allow_proto3_optional" flag
* chore: Run cargo hakari tasks
* fix: compile error
* chore: Update heappy
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-20 11:12:17 +00:00
dependabot[bot]
694ffd2238
chore(deps): Bump httparse from 1.6.0 to 1.7.0 ( #4277 )
...
Bumps [httparse](https://github.com/seanmonstar/httparse ) from 1.6.0 to 1.7.0.
- [Release notes](https://github.com/seanmonstar/httparse/releases )
- [Commits](https://github.com/seanmonstar/httparse/compare/v1.6.0...v1.7.0 )
---
updated-dependencies:
- dependency-name: httparse
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 09:37:08 +00:00
Dom Dwyer
506cdebf38
refactor: remove manual Debug impl
...
Derive the debug impl so it prints all the fields (specifically the
"number of sequencers configured" is pretty helpful in a test).
Manual impls drift over time and are more effort than the derive!
2022-04-05 12:02:07 +01:00
Andrew Lamb
a384448b92
refactor: rename Sequence::id and Sequence::number field names ( #4190 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-31 15:17:58 +00:00
dependabot[bot]
17af5fcbd1
chore(deps): Bump tokio-util from 0.7.0 to 0.7.1 ( #4154 )
...
* chore(deps): Bump tokio-util from 0.7.0 to 0.7.1
Bumps [tokio-util](https://github.com/tokio-rs/tokio ) from 0.7.0 to 0.7.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.0...tokio-util-0.7.1 )
---
updated-dependencies:
- dependency-name: tokio-util
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-29 08:39:02 +00:00