influxdb

Commit Graph

Author	SHA1	Message	Date
Dom Dwyer	0c0a38c484	refactor: more verbose shard reset logs Adds a little more context to the "shard reset" logs.	2022-10-19 12:28:02 +02:00
Dom Dwyer	c63312ce12	refactor: use histogram to record TTBR Changes the TTBR metric from a gauge to a histogram so that observations maintain a time dimension.	2022-10-18 16:29:09 +02:00
Luke Bond	475c8a0704	fix: only emit ttbr metric for applied ops (#5854) * fix: only emit ttbr metric for applied ops * fix: move DmlApplyAction to s/w accessible * chore: test for skipped ingest; comments and log improvements * fix: fixed ingester test re skipping write Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-14 12:06:49 +00:00
Dom Dwyer	9c40d80032	refactor(ingester): log shard_id in op result Include the shard ID in the op apply result to correlate it with other log messages.	2022-10-13 15:41:48 +02:00
Dom Dwyer	dbcbb5b824	refactor: include sequence numbers in apply() logs Include the op sequence number in the error/success apply() log messages.	2022-10-13 14:19:02 +02:00
Dom Dwyer	c4f542bbe2	refactor(ingester): remove tombstone support This commit removes tombstone support from the ingester, and deletes associated code/helpers/tests. This commit does NOT remove tombstone support from any other service, but MAY include removing overlapping test coverage. This also removes the tombstone support from the Ingester -> Querier RPC response message. This has the nice side effect of removing a whole lot of thread spawning in the ingester tests for the Executor, speeding everything up!	2022-10-11 13:10:04 +02:00
Luke Bond	fda1479db0	chore: add trace log to ingester to aid debugging (#5829)	2022-10-11 10:33:42 +00:00
Dom Dwyer	5f2f735c7e	fix: spurious watermark < read offset panic In staging we observed an ingester panic due to the write buffer stream yielding an WriteBufferErrorKind::SequenceNumberAfterWatermark, suggesting the ingester was attempting to read from an offset that exceeds the current max write offset in Kafka (high watermark offset). This turned out not to be the case - the partition had a single write at offset 2, and the ingester was attempting to seek to offset 1. The first read would fail (offset 1 does not exist) and the error handling did not account for the high watermark not being correctly set (-1 in the response). I have no idea why rskafka returns this watermark / doesn't retry / etc but this change will allow the ingesters to recover.	2022-09-28 15:22:34 +02:00
Dom Dwyer	b873297fad	refactor(ingester): limit visibility Marks many internal data structures as non-pub. Many remain as they're used across tests / from multiple callers "peeking", but this limits the scope of false sharing in the future.	2022-09-27 14:27:32 +02:00
Dom Dwyer	ee8cdb48af	style(ingester): fmt imports & long strings Rewrite the imports to be a consistent order; std, external, crate and merge all crate-level imports into one use statement.	2022-09-14 14:20:19 +02:00
Dom Dwyer	2a19606456	feat(ingester): restrict partition row count This limit restricts a single partition to containing at most N rows before it is marked for persistence (note: being marked for persistence does not currently prevent further ingest for that partition.)	2022-08-31 15:48:18 +02:00
Carol (Nichols || Goulding)	dbd27f648f	refactor: Rename more mentions of Kafka to their other name where appropriate	2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding)	74c9529062	fix: Rename KafkaPartition to ShardIndex	2022-08-29 14:07:18 -04:00
Marco Neumann	6b8b922fe7	fix: do not loose data when Kafka reports that offset is above watermark (#5322) * fix: do not loose data when Kafka reports that offset is above watermark This can happen in certain cluster rebalance settings. This is also linked to https://github.com/influxdata/rskafka/issues/147 but for the upstream issue I currently have no idea how to fix it, so let's at least harden IOx against it. Fixes #5128. * refactor: panic for `SequenceNumberAfterWatermark`	2022-08-11 07:32:04 +00:00
Marco Neumann	9fbc95c3ad	feat: add sequencer reset count metric and log to ingester (#5286) Split out from #5253. Helps with #5128. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-03 13:00:36 +00:00
Andrew Lamb	8f5210ea3e	test: add test for "duration since production" in kafka `write_buffer` implementation (#5043) * test: add test for timestamps in kafka write buffer * refactor: move timestamp batching test to generic tests Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-07 10:27:27 +00:00
Markus Westerlind	edf3f08e81	refactor: Replace all uses of lazy_static with once_cell Went through and remove all lazy_static uses with once_cell (while waiting for the project to compile). There are still dependencies using lazy_static so it is still in the crate graph but at least there isn't an explicit dependency on it (and it is easier to update to `std::lazy::Lazy` once that is stable).	2022-06-29 16:22:02 +02:00
Dom Dwyer	75a3fd5e1e	refactor: use propagated partition key in ingester Changes the ingester to use the partition key derived in the router, and transmitted over through the kafka API boundary. This should have no observable behavioural change, but be more resilient as we're no longer assuming the partitioning algorithm produces the same value in both the router (where data is partitioned) and the ingester (where data is persisted, segregated by partition key). This is a pre-requisite to allowing the user to specify partitioning schemes.	2022-06-21 15:57:30 +01:00
Andrew Lamb	74f4006580	fix(ingester): make ingester metrics start with `ingester` (#4870) * fix(ingester): make ingester metrics start with `ingester` * fix: Update ingester/src/stream_handler/handler.rs Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:46:37 +00:00
Dom Dwyer	4df2964566	refactor: store PartitionKey in DmlWrite Carry the PartitionKey in the DmlWrite, allowing the batch to be associated with a specific partition key.	2022-06-15 15:48:54 +01:00
kodiakhq[bot]	dd8d44e24f	Merge branch 'main' into cn/duration	2022-06-10 14:23:09 +00:00
Andrew Lamb	50697906b1	refactor: Make `DMLWrite::sequence_number` a `SequenceNumber` (#4817)	2022-06-09 19:36:37 +00:00
Carol (Nichols || Goulding)	1c7cbaf5ae	refactor: Use DurationHistogram in more places	2022-06-09 14:20:51 -04:00
Andrew Lamb	dde3c3922c	refactor: use consistent spelling of serialize (#4717)	2022-05-27 14:42:59 +00:00
Carol (Nichols || Goulding)	6ce6a38094	fix: Make metric names potentially less confusing	2022-05-25 10:04:39 -04:00
Carol (Nichols || Goulding)	05bd9de4d3	test: Add a test for the sequence number skipping metric Ok, so... this needed lots of... channels. Channels everywhere. The stream method on TestWriteBufferStreamHandler previously assumed it would only be called once. In a test where reset_to_earliest is called, stream might be called again to get the reset stream. We want to be able to control which of the streams gets which operations, so that's why the macro now takes a vec of vec of operations-- one vec of operations per expected call to stream, and the stream will send all the operations in its vec. The test thread needs to wait for the handler stream to consume the last item from the last receiver stream, so when the TestWriteBufferStreamHandler has set up the last expected call to stream, pass back the last transmitter and have it wait until it's at full expected capacity (which means all operations have been consumed by the receiver).	2022-05-20 20:50:02 -04:00
Carol (Nichols || Goulding)	bda231051a	feat: Record metrics when resetting the write buffer and skipping sequence numbers	2022-05-20 20:48:17 -04:00
Carol (Nichols || Goulding)	bcbf7b4f46	refactor: Move error handling logic to be all together	2022-05-20 20:48:17 -04:00
Carol (Nichols || Goulding)	ab72c93a5e	docs: Updating wrapping, content, and grammar of comments	2022-05-20 10:51:07 -04:00
Carol (Nichols || Goulding)	c811bebdb7	feat: Add ingester CLI option to skip to oldest available WB seq num The default behavior of the ingester is to panic if the min unpersisted sequence number in the catalog is unknown to the write buffer due to the retention policies having evicted that sequence number. Specifying `--skip-to-oldest-available` changes this behavior to skip to the oldest sequence number the write buffer does have available and go from there. Fixes #4624.	2022-05-20 10:51:07 -04:00
Carol (Nichols || Goulding)	b3f97bdb9d	test: Capture existing behavior for unknown sequence number	2022-05-20 10:51:06 -04:00
Dom Dwyer	7f3473e19f	refactor(ingester): emit per-op debugging info Emit a TRACE level log containing the op offset & other helpful fields. This will allow us to identify which messages were last successfully decoded, and which caused errors so we can pull them from analysis.	2022-05-11 16:35:35 +01:00
Carol (Nichols || Goulding)	068096e7e1	fix: Rename data_types2 to data_types	2022-05-06 14:45:39 -04:00
Marco Neumann	bd600bbac6	refactor: allow ingester to be integrated into query tests (#4427) * refactor: improve `IngesterData` public interface * feat: impl `Debug` for `Test{Namespace,Sequencer}` * refactor: trait interface for `LifecyleHandle` This is required to mock the lifecycle for query tests. * refactor: trait for partitioner	2022-04-26 13:44:30 +00:00
二手掉包工程师	4b47d723b1	refactor: Rename time to iox_time (#4416) Signed-off-by: hi-rustin <rustin.liu@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-04-26 00:19:59 +00:00
Dom Dwyer	71a278ac7e	refactor: accept !Sync write buffer streams Removes the Sync bound SequencedStreamHandler input stream type, as the BoxStream returned by the WriteBufferStreamHandler is not Sync. This change means the SequencedStreamHandler is not Sync either, but is still Send and therefore can be moved into tokio tasks.	2022-04-08 11:28:39 +01:00
Dom Dwyer	aaa677dec8	docs: describe graceful shutdown behaviour	2022-04-05 11:31:55 +01:00
Dom Dwyer	8edefc415d	refactor: rename ttbr -> write_time in tests	2022-04-05 11:31:55 +01:00
Dom Dwyer	f15275cf96	feat: expose ingest sequencer errors Instruments the SequencedStreamHandler with a series of new metrics that record the various error classes observable in the stream handler. These metrics are labelled with potential_data_loss=true where relevant to surface potential data loss events for alerting & further review.	2022-04-05 11:31:55 +01:00
Dom Dwyer	083ff1f8e3	refactor: ingest stream handler Refactors the stream_in_sequenced_entries() into a new impl in the SequencedStreamHandler type, decoupling the reading / decoding of ops from Kafka (and associated error handling) from the "what happens to those ops" concern to ease testing, encapsulate the specifics of "how to get an op" and improve flexibility. This is intended to provide robust error handling within what is reasonably possible (unexpected errors are always unexpected!) while retaining the existing metrics and functionality. I've also separated out code that exists in the current impl specifically to drive tests from the prod code path, instead driving those behaviours through mocks. As of this commit, the handler is not used - this commit simply adds the new impl.	2022-04-05 11:31:54 +01:00

40 Commits (66035ada480fb4b7d0ef02ea087ee209b819aa24)