Removes the Sync bound on the SequencedStreamHandler input stream type, as the
BoxStream returned by the WriteBufferStreamHandler is not Sync.
This change means the SequencedStreamHandler is not Sync either, but is
still Send and therefore can be moved into tokio tasks.
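As a quick illustration of why Send alone is enough here, a minimal sketch (the `NotSync` type is hypothetical, standing in for the non-Sync BoxStream): tokio::spawn only requires the spawned future to be Send.

```rust
use std::cell::Cell;

// Cell<T> is Send but not Sync, mimicking the non-Sync stream here.
struct NotSync(Cell<u64>);

async fn drive(stream: NotSync) {
    // ...poll the stream to completion...
    let _ = stream.0.get();
}

#[tokio::main]
async fn main() {
    let stream = NotSync(Cell::new(0));
    // OK: tokio::spawn needs the future to be Send, not Sync.
    tokio::spawn(drive(stream)).await.unwrap();
}
```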
This commit adds an adaptor (IngestSinkAdaptor) that provides a DmlSink
implementation for the existing write path (IngesterData). With this,
the existing write path becomes compatible with the new
op stream handler (SequencedStreamHandler).
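A minimal sketch of the adaptor shape, with illustrative trait and method names (the real DmlSink and IngesterData signatures live in the ingester crate):

```rust
use std::sync::Arc;
use async_trait::async_trait;

// Illustrative stand-ins for the real types.
struct DmlOperation;
struct Error;
struct IngesterData;

impl IngesterData {
    async fn buffer_operation(&self, _op: DmlOperation) -> Result<(), Error> {
        Ok(())
    }
}

#[async_trait]
trait DmlSink {
    async fn apply(&self, op: DmlOperation) -> Result<(), Error>;
}

/// Adapts the existing IngesterData write path to the DmlSink interface.
struct IngestSinkAdaptor(Arc<IngesterData>);

#[async_trait]
impl DmlSink for IngestSinkAdaptor {
    async fn apply(&self, op: DmlOperation) -> Result<(), Error> {
        // Delegate straight to the existing write path.
        self.0.buffer_operation(op).await
    }
}
```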
This commit adds the SinkInstrumentation type that decorates an inner
DmlSink with call latency and write buffer metrics.
The write buffer / sink call metrics may be split into two separate
responsibilities in the future if there are multiple DmlSink
implementations that need instrumentation, but adding more types is
deferred until it is needed.
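A minimal sketch of the decorator shape, reusing the illustrative DmlSink trait (and async_trait import) from the sketch above; the real type records into metric histograms rather than discarding the latency:

```rust
use std::time::Instant;

/// Decorates an inner DmlSink, observing the latency of each call.
struct SinkInstrumentation<T> {
    inner: T,
    // ...latency histogram & write buffer metrics in the real type...
}

#[async_trait]
impl<T: DmlSink + Send + Sync> DmlSink for SinkInstrumentation<T> {
    async fn apply(&self, op: DmlOperation) -> Result<(), Error> {
        let started = Instant::now();
        let res = self.inner.apply(op).await;
        let _call_latency = started.elapsed(); // record into a histogram
        res
    }
}
```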
* feat: Add `SequencerProgress` reporting to ingester
* refactor: Use KafkaPartition in write_summary
* fix: Update docstrings
* refactor: Change ingester to use KafkaPartition everywhere
* refactor: add SequencerProgress::combine
* refactor: return new SequencerProgress rather than updating
* fix: distinguish between yes/no/unknown in WriteSummary
* docs: Update data_types2/src/lib.rs
Co-authored-by: Paul Dix <paul@pauldix.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Pass the sort key from the catalog through to compact_persisting_batch.
If the sort key is Some, use that. If the sort key is None, compute it
from the data's cardinality with compute_sort_key.
Connects to #4196.
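The dispatch is simple; a sketch with illustrative stand-in types (the real SortKey and compute_sort_key live elsewhere in the codebase):

```rust
// Illustrative stand-ins for the real types.
struct SortKey;
struct Batch;

fn compute_sort_key(_batch: &Batch) -> SortKey {
    // ...choose columns ordered by ascending cardinality...
    SortKey
}

/// Prefer the catalog's sort key; otherwise compute one from the data.
fn resolve_sort_key(from_catalog: Option<SortKey>, batch: &Batch) -> SortKey {
    from_catalog.unwrap_or_else(|| compute_sort_key(batch))
}
```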
Derive the Debug impl so it prints all the fields (specifically, the
"number of sequencers configured" is pretty helpful in a test).
Manual impls drift over time and are more effort than the derive!
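For example (field and type names hypothetical):

```rust
// The derived impl automatically stays in sync with the fields:
#[derive(Debug)]
struct WriteBufferConfig {
    n_sequencers: usize, // shows up in Debug output for free
    // ...other fields...
}
```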
Adds the PeriodicWatermarkFetcher type responsible for querying the write
buffer / Kafka for the maximum sequence number / offset, surfacing any
errors via both logs & metrics.
This high watermark / max offset value is used within the ingest
instrumentation metrics. This use case is tolerant of caching / stale
values, and as such the value is periodically updated to minimise load
on the write buffer.
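A minimal sketch of the periodic-fetch-and-cache shape, assuming a synchronous fetch callback and an existing tokio runtime (the real fetcher queries the write buffer asynchronously and surfaces errors via logs & metrics):

```rust
use std::sync::{
    atomic::{AtomicI64, Ordering},
    Arc,
};
use std::time::Duration;

struct PeriodicWatermarkFetcher {
    watermark: Arc<AtomicI64>,
}

impl PeriodicWatermarkFetcher {
    fn new<F>(fetch: F, interval: Duration) -> Self
    where
        F: Fn() -> Result<i64, String> + Send + 'static,
    {
        // -1 is a sentinel for "never successfully fetched".
        let watermark = Arc::new(AtomicI64::new(-1));
        let shared = Arc::clone(&watermark);
        tokio::spawn(async move {
            let mut ticker = tokio::time::interval(interval);
            loop {
                ticker.tick().await;
                match fetch() {
                    // Cache the latest max offset; readers tolerate staleness.
                    Ok(max) => shared.store(max, Ordering::Relaxed),
                    Err(_e) => { /* log & bump an error metric here */ }
                }
            }
        });
        Self { watermark }
    }

    /// Returns the cached watermark, or None if never fetched.
    fn watermark(&self) -> Option<i64> {
        match self.watermark.load(Ordering::Relaxed) {
            -1 => None,
            v => Some(v),
        }
    }
}
```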
Instruments the SequencedStreamHandler with a series of new metrics that
record the various error classes observable in the stream handler.
These metrics are labelled with potential_data_loss=true where relevant
to surface potential data loss events for alerting & further review.
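Conceptually, each error class knows whether it implies potential data loss, and that drives the metric label (the names here are illustrative, not the real metric schema):

```rust
/// Illustrative error classes; the real names/labels may differ.
enum StreamError {
    OffsetGap,     // ops skipped: potential data loss
    DecodeFailure, // op dropped: potential data loss
    IoTimeout,     // read retried: no loss expected
}

impl StreamError {
    /// Drives the potential_data_loss="true" metric label.
    fn potential_data_loss(&self) -> bool {
        matches!(self, Self::OffsetGap | Self::DecodeFailure)
    }
}
```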
Refactors stream_in_sequenced_entries() into a new impl in the
SequencedStreamHandler type, decoupling the reading / decoding of ops
from Kafka (and the associated error handling) from the "what happens to
those ops" concern, to ease testing, encapsulate the specifics of "how
to get an op", and improve flexibility.
This is intended to provide robust error handling within what is
reasonably possible (unexpected errors are always unexpected!) while
retaining the existing metrics and functionality. I've also separated
out the code that exists in the current impl purely to drive tests from
the prod code path, driving those behaviours through mocks instead.
As of this commit, the handler is not used; this commit simply adds the
new impl.
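A sketch of the decoupled loop, reusing the illustrative DmlSink / DmlOperation / Error types from the sketches above (the real handler also classifies errors and records the metrics described earlier):

```rust
use futures::StreamExt;

async fn run<S, T>(mut stream: S, sink: T)
where
    S: futures::Stream<Item = Result<DmlOperation, Error>> + Unpin,
    T: DmlSink,
{
    while let Some(res) = stream.next().await {
        match res {
            // "what happens to those ops" is entirely the sink's concern
            Ok(op) => {
                if let Err(_e) = sink.apply(op).await {
                    // record the sink error metric, keep reading
                }
            }
            // "how to get an op" errors are classified & recorded here
            Err(_e) => {}
        }
    }
}
```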
Adds a timeout test helper for futures: this lets us easily write tests
that await a future for a bounded duration.
It sits behind an optional feature to avoid dragging tokio into existing
consumers of the test_helpers crate that don't need it.
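A minimal sketch of such a helper (name and exact shape are illustrative; the real helper lives in test_helpers behind the optional feature):

```rust
use std::future::Future;
use std::time::Duration;

/// Awaits `fut` for at most `duration`, panicking on timeout so the
/// test fails fast instead of hanging.
pub async fn with_timeout<F: Future>(fut: F, duration: Duration) -> F::Output {
    tokio::time::timeout(duration, fut)
        .await
        .expect("future did not complete within the timeout")
}
```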
Fix the ingester to track the max persisted sequence number per partition.
Ensure replay takes in data from unpersisted partitions.
Simplify the table persist info so it no longer returns a max persisted sequence number for the table, as that information isn't needed.
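A sketch of the per-partition tracking and the replay decision it enables, with illustrative stand-in types:

```rust
use std::collections::BTreeMap;

// Illustrative stand-ins for the real ID types.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct PartitionId(i64);
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct SequenceNumber(i64);

#[derive(Default)]
struct PersistState {
    // Max persisted sequence number per partition, not per table.
    max_persisted: BTreeMap<PartitionId, SequenceNumber>,
}

impl PersistState {
    fn record_persisted(&mut self, partition: PartitionId, up_to: SequenceNumber) {
        let entry = self.max_persisted.entry(partition).or_insert(up_to);
        *entry = (*entry).max(up_to);
    }

    /// During replay, an op must be applied unless the partition it
    /// touches has already persisted past its sequence number.
    fn needs_replay(&self, partition: PartitionId, seq: SequenceNumber) -> bool {
        self.max_persisted
            .get(&partition)
            .map_or(true, |max| seq > *max)
    }
}
```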