This commit adds initial support for "soft" namespace deletion, where
the underlying records & data remain, but are no longer queryable /
writable.
Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.
The components treat soft-deleted namespaces differently:
* router: ignore soft deleted namespaces
* ingester: accept soft deleted namespaces
* compactor: accept soft deleted namespaces
* querier: ignore soft deleted namespaces
* various gRPC services: ignore soft deleted namespaces
This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.
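As a rough illustration of that split, the sketch below models a
namespace record with an optional deletion marker and two lookup paths:
a router/querier-style lookup that treats a soft-deleted namespace as
absent, and an ingester/compactor-style lookup that does not. The types
and function names are hypothetical, not the actual catalog or cache
API.

```rust
use std::collections::HashMap;
use std::time::SystemTime;

/// Hypothetical, simplified namespace record; the real catalog schema differs.
#[derive(Debug)]
struct Namespace {
    /// Set when a soft delete is issued; the data itself is retained.
    deleted_at: Option<SystemTime>,
}

/// Router/querier view: a soft-deleted namespace resolves as if it did not exist.
fn resolve_for_write<'a>(
    cache: &'a HashMap<String, Namespace>,
    name: &str,
) -> Option<&'a Namespace> {
    cache.get(name).filter(|ns| ns.deleted_at.is_none())
}

/// Ingester/compactor view: soft-deleted namespaces remain visible, so
/// buffered data keeps flowing through persistence and compaction.
fn resolve_for_persist<'a>(
    cache: &'a HashMap<String, Namespace>,
    name: &str,
) -> Option<&'a Namespace> {
    cache.get(name)
}

fn main() {
    let mut cache = HashMap::new();
    cache.insert(
        "bananas".to_string(),
        Namespace {
            deleted_at: Some(SystemTime::now()), // soft-deleted
        },
    );

    assert!(resolve_for_write(&cache, "bananas").is_none()); // router ignores it
    assert!(resolve_for_persist(&cache, "bananas").is_some()); // ingester accepts it
}
```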
Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to the state at which the delete was
issued (rather than losing the buffered data).
Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seem like a trivial win.
This fixes an issue where persistence that never completes blocks the
periodic enqueuing of persist tasks - this causes the amount of buffered
data in the buffer tree to grow while the persist queue depth stays the
same, instead of the buffer being drained.
This is an issue as the queue depth is designed to act as the
back-pressure mechanism of the ingester - once the depth exceeds a
configurable limit, further writes are rejected until the queue has
drained sufficiently (to 50% of the limit).
After this commit, stalled persistence (e.g. an object store outage)
will not prevent the queue depth from growing, which should enable the
saturation protection to kick in.
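For illustration, here is a minimal sketch of the queue-depth hysteresis
described above - reject writes once the depth reaches a limit, and
accept them again once it drains back to 50%. The type and method names
are invented for the example, not the ingester's actual types.

```rust
/// A minimal sketch of queue-depth based back-pressure, assuming a
/// configurable limit and a "drained to 50%" resume threshold.
#[derive(Debug)]
struct PersistQueueGate {
    /// Writes are rejected once the queue depth reaches this limit.
    max_depth: usize,
    /// Current number of enqueued (not yet completed) persist jobs.
    depth: usize,
    /// Set while saturated; cleared once the queue drains to 50%.
    saturated: bool,
}

impl PersistQueueGate {
    fn new(max_depth: usize) -> Self {
        Self { max_depth, depth: 0, saturated: false }
    }

    /// Called when a persist job is enqueued.
    fn enqueue(&mut self) {
        self.depth += 1;
        if self.depth >= self.max_depth {
            self.saturated = true;
        }
    }

    /// Called when a persist job completes.
    fn complete(&mut self) {
        self.depth = self.depth.saturating_sub(1);
        if self.saturated && self.depth <= self.max_depth / 2 {
            self.saturated = false;
        }
    }

    /// Writes are rejected while saturated, providing back-pressure.
    fn writes_allowed(&self) -> bool {
        !self.saturated
    }
}

fn main() {
    let mut gate = PersistQueueGate::new(4);
    for _ in 0..4 {
        gate.enqueue();
    }
    assert!(!gate.writes_allowed()); // limit hit: reject further writes

    gate.complete();
    gate.complete(); // depth back to 2 == 50% of the limit
    assert!(gate.writes_allowed()); // drained sufficiently: accept writes again
}
```

Note the queue depth must keep growing even when no persist job ever
completes, otherwise the saturation check above never trips - which is
exactly the behaviour this commit restores.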
Adds two metrics:
* Number of files replayed (counted at the start of replay, not at completion)
* Number of applied ops
This will help identify when WAL replay is happening (an indication of
an ungraceful shutdown & potential temporary read unavailability).
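A minimal sketch of the two counters is shown below, using plain atomics
in place of the project's metric registry; the names are illustrative.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical holder for the two WAL replay counters.
#[derive(Debug, Default)]
struct WalReplayMetrics {
    /// Incremented when replay of a WAL segment file *starts*, so a crash
    /// mid-replay is still visible in the metrics.
    files_replayed: AtomicU64,
    /// Incremented for every operation applied to the buffer during replay.
    ops_applied: AtomicU64,
}

impl WalReplayMetrics {
    fn file_replay_started(&self) {
        self.files_replayed.fetch_add(1, Ordering::Relaxed);
    }
    fn op_applied(&self) {
        self.ops_applied.fetch_add(1, Ordering::Relaxed);
    }
}

fn main() {
    let metrics = WalReplayMetrics::default();

    // Hypothetical replay loop: one segment file containing three ops.
    metrics.file_replay_started();
    for _op in 0..3 {
        metrics.op_applied();
    }

    assert_eq!(metrics.files_replayed.load(Ordering::Relaxed), 1);
    assert_eq!(metrics.ops_applied.load(Ordering::Relaxed), 3);
}
```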
* feat: introduce a new way of handling max_sequence_number for ingester, compactor and querier
* chore: cleanup
* feat: new column max_l0_created_at to order files for deduplication
* chore: cleanup
* chore: debug info for changing cpu.parquet
* fix: update test parquet file
Co-authored-by: Marco Neumann <marco@crepererum.net>
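As a rough sketch of how the new max_l0_created_at column can be used,
the example below orders a set of (heavily simplified, hypothetical)
file records by that timestamp so that rows from later-created files
take precedence during deduplication; the real catalog types and query
logic differ.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical, trimmed-down file metadata; the real catalog row has
/// many more fields.
#[derive(Debug)]
struct FileForDedupe {
    id: u64,
    /// New column: when the data in this file was first created at L0.
    max_l0_created_at: SystemTime,
}

/// Order files by max_l0_created_at so deduplication sees older data
/// first and later writes win, regardless of compaction order.
fn sort_for_dedupe(files: &mut [FileForDedupe]) {
    files.sort_by_key(|f| f.max_l0_created_at);
}

fn main() {
    let base = SystemTime::UNIX_EPOCH;
    let mut files = vec![
        FileForDedupe { id: 2, max_l0_created_at: base + Duration::from_secs(20) },
        FileForDedupe { id: 1, max_l0_created_at: base + Duration::from_secs(10) },
    ];
    sort_for_dedupe(&mut files);
    assert_eq!(files[0].id, 1); // oldest data first; newest wins on duplicates
}
```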
Record latency histograms for DmlSink::apply() calls, configuring
ingester2 to report the overall write path latency, and separately the
buffer apply latency.
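The layering can be pictured with the simplified sketch below: a timing
decorator wraps an inner sink, and two decorators are stacked to measure
the overall write path and the buffer apply separately. The trait here
is a synchronous stand-in for DmlSink, and a Vec<Duration> stands in for
a real histogram.

```rust
use std::time::{Duration, Instant};

/// Simplified, synchronous stand-in for the DmlSink trait.
trait Sink {
    fn apply(&mut self, op: &str);
}

/// Decorator recording the latency of each apply() call.
struct SinkInstrumentation<T> {
    inner: T,
    latencies: Vec<Duration>,
}

impl<T: Sink> Sink for SinkInstrumentation<T> {
    fn apply(&mut self, op: &str) {
        let start = Instant::now();
        self.inner.apply(op);
        self.latencies.push(start.elapsed());
    }
}

/// A no-op sink standing in for the buffer tree.
struct Buffer;
impl Sink for Buffer {
    fn apply(&mut self, _op: &str) {}
}

fn main() {
    // Layered as described: the outer decorator measures the overall write
    // path, the inner one measures just the buffer apply.
    let buffer_apply = SinkInstrumentation { inner: Buffer, latencies: Vec::new() };
    let mut write_path = SinkInstrumentation { inner: buffer_apply, latencies: Vec::new() };

    write_path.apply("cpu,host=a usage=1");
    assert_eq!(write_path.latencies.len(), 1);
    assert_eq!(write_path.inner.latencies.len(), 1);
}
```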
Adds metrics to track the distribution of the duration spent actively
persisting a batch of partition data (compacting, generating parquet,
uploading, making DB entries, etc.) and another tracking the duration an
entry spends in the persist queue.
Together these provide a measurement of the latency of persist requests,
and because they contain event counters, they also provide the
throughput and the number of outstanding jobs.
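A simplified sketch of the two measurements, assuming hypothetical types
and using Vec<Duration> in place of real duration histograms: the queue
time runs from enqueue to dequeue, and the active time covers the
persist work itself.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-job timing data.
struct PersistJob {
    enqueued_at: Instant,
}

struct PersistTimings {
    /// Time spent waiting in the persist queue.
    queue_durations: Vec<Duration>,
    /// Time spent actively persisting (compact, parquet, upload, catalog).
    active_durations: Vec<Duration>,
}

impl PersistTimings {
    fn record(&mut self, job: PersistJob, persist: impl FnOnce()) {
        let dequeued_at = Instant::now();
        self.queue_durations.push(dequeued_at - job.enqueued_at);

        persist();
        self.active_durations.push(dequeued_at.elapsed());
    }
}

fn main() {
    let mut timings = PersistTimings {
        queue_durations: Vec::new(),
        active_durations: Vec::new(),
    };

    let job = PersistJob { enqueued_at: Instant::now() };
    timings.record(job, || {
        // Stand-in for compaction + parquet generation + upload + catalog update.
    });

    assert_eq!(timings.queue_durations.len(), 1);
    assert_eq!(timings.active_durations.len(), 1);
}
```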
Changes the persist system to call into an abstract
PersistCompletionObserver after the persist task has completed, but
before releasing the job permit / notifying the enqueuer.
This call happens synchronously, driven to completion by the persist
worker. A sync construct can easily be made async (by enqueuing work
into a channel), but not the other way around, so this gives the best
flexibility.
This trait allows pluggable logic to be inserted into the persist
system, without tightly coupling it to the implementer's logic (for
example, replication). One or more observers may be chained together to
construct an arbitrary sequence of actors.
This commit uses a no-op observer, causing no functional change to the
system.
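A minimal sketch of the shape of this abstraction is below - a
trimmed-down, synchronous version with invented field names; the real
trait and completion type carry more information.

```rust
/// Hypothetical summary of a completed persist job.
struct CompletedPersist {
    namespace_id: i64,
    table_id: i64,
    partition_id: i64,
}

trait PersistCompletionObserver {
    /// Called by the persist worker after the persist task completes, but
    /// before the job permit is released / the enqueuer is notified.
    fn persist_complete(&self, note: &CompletedPersist);
}

/// The default: observe nothing, change nothing.
struct NopObserver;
impl PersistCompletionObserver for NopObserver {
    fn persist_complete(&self, _note: &CompletedPersist) {}
}

/// An example observer - e.g. the hook a replication implementation could use.
struct LoggingObserver;
impl PersistCompletionObserver for LoggingObserver {
    fn persist_complete(&self, note: &CompletedPersist) {
        println!(
            "persisted namespace={} table={} partition={}",
            note.namespace_id, note.table_id, note.partition_id
        );
    }
}

/// Observers can be chained to build an arbitrary sequence of actors.
struct Chain<A, B>(A, B);
impl<A, B> PersistCompletionObserver for Chain<A, B>
where
    A: PersistCompletionObserver,
    B: PersistCompletionObserver,
{
    fn persist_complete(&self, note: &CompletedPersist) {
        self.0.persist_complete(note);
        self.1.persist_complete(note);
    }
}

fn main() {
    let observer = Chain(NopObserver, LoggingObserver);
    observer.persist_complete(&CompletedPersist {
        namespace_id: 1,
        table_id: 2,
        partition_id: 3,
    });
}
```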
Adds an integration test of the persist system, covering:
* Node A starts a persist operation
* Node B starts a persist operation for the same partition
* Node A completes, setting the catalog sort key to a new value
* Node B attempts to update the catalog, observing the new sort key
* Node B re-compacts the data, re-uploads, and drives to completion
This scenario is/was tracked in:
https://github.com/influxdata/influxdb_iox/issues/6439
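The conflict-handling step the test exercises can be sketched as a
compare-and-swap on the partition's sort key, as below; the types and
error shape are illustrative rather than the catalog's actual API.

```rust
/// Illustrative sort key: an ordered list of column names.
#[derive(Debug, Clone, PartialEq)]
struct SortKey(Vec<String>);

enum CasError {
    /// The catalog held a different sort key than expected; the caller must
    /// re-compact with the observed key and try again.
    ValueMismatch(SortKey),
}

struct Partition {
    sort_key: SortKey,
}

impl Partition {
    /// Update the sort key only if it still matches what the caller observed.
    fn cas_sort_key(&mut self, observed: &SortKey, new: SortKey) -> Result<(), CasError> {
        if &self.sort_key != observed {
            return Err(CasError::ValueMismatch(self.sort_key.clone()));
        }
        self.sort_key = new;
        Ok(())
    }
}

fn main() {
    let empty = SortKey(vec![]);
    let mut partition = Partition { sort_key: empty.clone() };

    // Node A completes first and sets a new sort key.
    let a_key = SortKey(vec!["host".into(), "time".into()]);
    assert!(partition.cas_sort_key(&empty, a_key.clone()).is_ok());

    // Node B, still holding the stale (empty) key, observes the mismatch and
    // must re-compact against the key node A wrote before retrying.
    match partition.cas_sort_key(&empty, SortKey(vec!["region".into(), "time".into()])) {
        Err(CasError::ValueMismatch(observed)) => assert_eq!(observed, a_key),
        _ => unreachable!("expected a sort key conflict"),
    }
}
```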
The persist::Context struct carries the data to be persisted, a
reference to the partition from which it came, and various cached fields
to avoid re-acquiring the partition read lock all the time.
Prior to this commit, the Context also had the full persist logic as
methods, invoked by the persist worker. This tightly couples the data &
logic - it's fairly clear a worker should implement the work and operate
on the data, not commingle the two. I even knew the mess I was making
when I wrote it, but effectively copy-pasted it from ingester1 because
of deadlines.
This commit decouples the persist logic from the Context.
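In sketch form (with hypothetical names), the change moves from methods
on the Context to worker-driven functions that merely read from it:

```rust
/// Data carrier only: what to persist and where it came from.
struct Context {
    namespace_id: i64,
    table_id: i64,
    partition_id: i64,
}

/// The persist steps live with the worker and operate on the Context data,
/// rather than being methods on Context itself.
fn compact(ctx: &Context) {
    println!(
        "compacting namespace={} table={} partition={}",
        ctx.namespace_id, ctx.table_id, ctx.partition_id
    );
}

fn upload(ctx: &Context) {
    println!("uploading parquet for partition={}", ctx.partition_id);
}

fn update_catalog(ctx: &Context) {
    println!("updating catalog for partition={}", ctx.partition_id);
}

/// The persist worker drives the sequence to completion.
fn persist_worker(ctx: Context) {
    compact(&ctx);
    upload(&ctx);
    update_catalog(&ctx);
}

fn main() {
    persist_worker(Context { namespace_id: 1, table_id: 2, partition_id: 3 });
}
```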
The query API exposes a unique-per-instance UUID to allow callers to
detect a crash of the ingester process - this was initialised directly
in the query RPC handler.
This commit turns the bare UUID into a type, and initialises it in the
top-level initialisation of the ingester, plumbing it down into the
query RPC handler.
This allows the UUID to be reused by other components/handlers.
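A minimal sketch of the newtype, assuming the uuid crate (with the v4
feature) and invented names for the surrounding types:

```rust
use uuid::Uuid;

/// A newtype over the process-lifetime UUID, generated once at ingester
/// initialisation and plumbed down to the query RPC handler.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct IngesterId(Uuid);

impl IngesterId {
    fn new() -> Self {
        Self(Uuid::new_v4())
    }
}

struct QueryHandler {
    /// Returned with every query response; a change in value tells the
    /// caller the ingester restarted and its previously-read state is gone.
    ingester_id: IngesterId,
}

fn main() {
    // Initialised once, at the top level, then shared with any
    // component/handler that needs it.
    let id = IngesterId::new();
    let handler = QueryHandler { ingester_id: id };
    assert_eq!(handler.ingester_id, id);
}
```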
The ingester no longer needs to access a specific PartitionData by ID
(they are addressed either via an iterator over the BufferTree, or
shared by Arc reference).
This allows us to remove the extra map maintaining ID -> PartitionData
references, and the shared access lock protecting it.
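As a rough sketch (with simplified, hypothetical types), partitions are
now reached either by iterating the tree or via an Arc handle obtained
earlier, so no ID-keyed map or lock is required:

```rust
use std::sync::{Arc, Mutex};

/// Simplified stand-ins; the real BufferTree nests namespaces and tables.
struct PartitionData {
    rows: usize,
}

struct BufferTree {
    partitions: Vec<Arc<Mutex<PartitionData>>>,
}

impl BufferTree {
    /// Callers either iterate over all partitions...
    fn partition_iter(&self) -> impl Iterator<Item = Arc<Mutex<PartitionData>>> + '_ {
        self.partitions.iter().map(Arc::clone)
    }
}

fn main() {
    let tree = BufferTree {
        partitions: vec![Arc::new(Mutex::new(PartitionData { rows: 42 }))],
    };

    // ...or hold onto an Arc they were previously handed - no ID ->
    // PartitionData map (and its lock) is needed for lookups.
    let by_ref: Arc<Mutex<PartitionData>> = tree.partition_iter().next().unwrap();
    assert_eq!(by_ref.lock().unwrap().rows, 42);
}
```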