* feat: introduce a new way of handling max_sequence_number for the ingester, compactor, and querier
* chore: cleanup
* feat: new column max_l0_created_at to order files for deduplication
* chore: cleanup
* chore: debug info for changing cpu.parquet
* fix: update test parquet file
Co-authored-by: Marco Neumann <marco@crepererum.net>
Record latency histograms for DmlSink::apply() calls, configuring
ingester2 to report the overall write path latency, and separately the
buffer apply latency.
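As a rough illustration, one way to record such histograms is a decorator that wraps a sink and times each call. This is only a minimal synchronous sketch: the real DmlSink trait is async, and `DmlOp`, the error type, and `record_histogram` below are hypothetical placeholders.

```rust
use std::time::Instant;

/// Hypothetical stand-in for a DML operation.
struct DmlOp;

/// Simplified, synchronous stand-in for the async DmlSink trait.
trait DmlSink {
    fn apply(&self, op: DmlOp) -> Result<(), String>;
}

/// Hypothetical histogram recorder; real code would write into the process
/// metric registry instead of printing.
fn record_histogram(name: &str, seconds: f64) {
    println!("{name}: {seconds:.6}s");
}

/// Decorator recording the latency of every apply() call on the inner sink.
struct InstrumentedSink<T> {
    inner: T,
    metric_name: &'static str,
}

impl<T: DmlSink> DmlSink for InstrumentedSink<T> {
    fn apply(&self, op: DmlOp) -> Result<(), String> {
        let started = Instant::now();
        let res = self.inner.apply(op);
        record_histogram(self.metric_name, started.elapsed().as_secs_f64());
        res
    }
}
```

Wrapping both the outermost sink (the whole write path) and the inner buffer sink with the same decorator yields the two separate histograms described above.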
Adds metrics to track the distribution of the duration spent actively
persisting a batch of partition data (compacting, generating parquet,
uploading, making DB entries, etc.) and another tracking the duration an
entry spent in the persist queue.
Together these provide a measurement of the latency of persist requests,
and as they contain event counters, they also provide the throughput and
number of outstanding jobs.
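A minimal sketch of the measurement approach, assuming a hypothetical `PersistJob` type and placeholder histogram helpers: the enqueue timestamp travels with the job so the queue wait and the active persist time can be recorded separately.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-ins for the two histograms described above.
fn record_queue_duration(d: Duration) {
    println!("time spent in persist queue: {d:?}");
}
fn record_active_duration(d: Duration) {
    println!("time spent actively persisting: {d:?}");
}

/// A persist job tagged with the instant it was enqueued, so the time it
/// waited in the queue can be derived when a worker dequeues it.
struct PersistJob {
    enqueued_at: Instant,
    // ... partition data to persist ...
}

fn run_persist(job: PersistJob) {
    // Duration the job spent waiting in the persist queue.
    record_queue_duration(job.enqueued_at.elapsed());

    let started = Instant::now();
    // ... compact, generate parquet, upload, make catalog entries ...
    record_active_duration(started.elapsed());
}
```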
Changes the persist system to call into an abstract
PersistCompletionObserver after the persist task has completed, but
before releasing the job permit / notifying the enqueuer.
This call happens synchronously and is driven to completion by the persist
worker. A synchronous construct can easily be made asynchronous (by
enqueuing work into a channel), but not the other way around, so this gives
the most flexibility.
This trait allows pluggable logic to be inserted into the persist
system, without tightly coupling it to the implementer's logic (for
example, replication). One or more observers may be chained together to
construct an arbitrary sequence of actors.
This commit uses a no-op observer, causing no functional change to the
system.
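A sketch of what such an observer abstraction could look like; the trait, method, and type names here are illustrative only (the real trait may be async and pass a richer completion summary).

```rust
/// Illustrative summary of a finished persist job.
struct CompletedPersist;

trait PersistCompletionObserver: Send + Sync {
    /// Invoked synchronously by the persist worker after the persist task
    /// completes, before the job permit is released / the enqueuer notified.
    fn persist_complete(&self, note: &CompletedPersist);
}

/// The no-op observer used by this commit: no functional change.
struct NopObserver;
impl PersistCompletionObserver for NopObserver {
    fn persist_complete(&self, _note: &CompletedPersist) {}
}

/// Observers can be chained to form an arbitrary sequence of actors.
struct Chain<A, B>(A, B);
impl<A, B> PersistCompletionObserver for Chain<A, B>
where
    A: PersistCompletionObserver,
    B: PersistCompletionObserver,
{
    fn persist_complete(&self, note: &CompletedPersist) {
        self.0.persist_complete(note);
        self.1.persist_complete(note);
    }
}
```

Because the worker drives the call synchronously, an implementation that needs async behaviour can simply enqueue the note into a channel and return.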
Adds an integration test of the persist system, covering:
* Node A starts a persist operation
* Node B starts a persist operation for the same partition
* Node A completes, setting the catalog sort key to a new value
* Node B attempts to update the catalog, observing the new sort key
* Node B re-compacts the data, re-uploads, and drives to completion
This scenario is/was tracked in:
https://github.com/influxdata/influxdb_iox/issues/6439
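For illustration, the sort-key race at the heart of this scenario can be sketched as a compare-and-swap retry loop. The catalog trait and types below are simplified placeholders, not the real catalog API.

```rust
/// Simplified sort key: an ordered list of column names.
#[derive(Clone, PartialEq, Debug)]
struct SortKey(Vec<String>);

enum CasError {
    /// Another node updated the sort key first; carries the observed value.
    ValueMismatch(SortKey),
}

trait Catalog {
    /// Compare-and-swap: apply `new` only if the stored key still matches
    /// `observed`.
    fn cas_sort_key(&self, observed: &Option<SortKey>, new: SortKey) -> Result<(), CasError>;
}

/// Node B's path through the scenario above: its CAS fails because node A
/// already updated the key, so it adopts the observed key, re-compacts and
/// re-uploads, then retries the catalog update.
fn update_sort_key(catalog: &dyn Catalog, mut observed: Option<SortKey>, mut wanted: SortKey) {
    loop {
        match catalog.cas_sort_key(&observed, wanted.clone()) {
            Ok(()) => return,
            Err(CasError::ValueMismatch(current)) => {
                // ... re-compact & re-upload the data with `current` merged in ...
                wanted = current.clone();
                observed = Some(current);
            }
        }
    }
}
```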
The persist::Context struct carries the data to be persisted, a
reference to the partition from which it came, and various cached fields
to avoid re-acquiring the partition read lock all the time.
Prior to this commit, the Context also carried the full persist logic as
methods invoked by the persist worker. This tightly coupled the data &
logic - it's fairly clear a worker should implement the work and operate on
the data, not commingle the two. I knew the mess I was making when I wrote
it, but effectively copy-pasted it from ingester1 because of deadlines.
This commit decouples the persist logic from the Context.
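The resulting shape can be sketched as follows (illustrative names only): the Context is a plain data carrier, and the persist logic lives in a free-standing worker function that operates on it.

```rust
/// Illustrative identifier type.
struct PartitionId(i64);

/// The Context is plain data: what to persist, where it came from, and
/// cached fields so the partition read lock need not be re-acquired.
struct Context {
    partition_id: PartitionId,
    cached_sort_key: Option<Vec<String>>,
    // ... the buffered data to persist ...
}

/// The logic lives with the worker, which operates on the data.
fn persist_worker(ctx: Context) {
    // compact(&ctx); upload_parquet(&ctx); update_catalog(&ctx); ...
    let _ = (ctx.partition_id, ctx.cached_sort_key);
}
```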
The query API exposes a unique-per-instance UUID to allow callers to
detect a crash of the ingester process - this was initialised directly
in the query RPC handler.
This commit turns the bare UUID into a type, and initialises it in the
top-level initialisation of the ingester, plumbing it down into the
query RPC handler.
This allows the UUID to be reused by other components/handlers.
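A minimal sketch of such a newtype, assuming the `uuid` crate (with the `v4` feature); the actual type name in the ingester may differ.

```rust
use uuid::Uuid;

/// Per-process instance identifier (illustrative name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IngesterUuid(Uuid);

impl IngesterUuid {
    /// Generated once during top-level ingester initialisation and plumbed
    /// into any handler that needs it (currently the query RPC handler).
    pub fn new() -> Self {
        Self(Uuid::new_v4())
    }
}
```

Callers compare the UUID across responses: a changed value indicates the ingester process restarted.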
The ingester no longer needs to access a specific PartitionData by ID
(they are addressed either via an iterator over the BufferTree, or
shared by Arc reference).
This allows us to remove the extra map maintaining ID -> PartitionData
references, and the shared access lock protecting it.
Prior to this commit, the (happy path) shutdown sequence of an IOx
process was hard-coded to:
1. Stop gRPC & HTTP servers
2. Stop backend server (i.e. ingester2)
After this commit, the execution of step 1 is delegated to the handler
for step 2; the server implementation (router / ingester / querier /
etc) now chooses when to shut down the RPC & HTTP servers.
This allows the server shutdown delegate to correctly sequence the
shutdown of all components of the IOx server. In particular, ingester2 can
now correctly order the shutdown of the query RPC server w.r.t. the
graceful stop & persist, ensuring queries continue to be serviced.
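One way to express this delegation is to hand the backend a callback that stops the frontends and let it decide when to invoke it; the trait below is a conceptual sketch, not the actual IOx interface.

```rust
/// Conceptual sketch of the delegation: the backend receives a closure that
/// stops the frontend (gRPC & HTTP) servers and chooses when to call it.
trait ServerType {
    fn shutdown(&self, stop_frontend: Box<dyn FnOnce() + Send>);
}

struct Ingester2;

impl ServerType for Ingester2 {
    fn shutdown(&self, stop_frontend: Box<dyn FnOnce() + Send>) {
        // 1. Block new writes and persist all buffered data, while the query
        //    RPC server keeps servicing reads...
        // 2. ...then, and only then, stop the RPC & HTTP frontends.
        stop_frontend();
    }
}
```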
Persist all buffered data when gracefully stopping an ingester2
instance.
This implementation accounts for both late-arriving writes and
concurrent persist tasks - it's carefully constructed so that it can
discover the presence of, and wait for, outstanding persist tasks
started by other code without having to know about all the possible
places a persist task can be started (currently WAL rotation & hot
partition persistence, but later also an RPC endpoint).
There is a small race that seems so incredibly unlikely to occur that I
didn't cover it off (doing so would incur an RPC write cost for little
gain); it is documented in the code comments.
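A simplified sketch of how outstanding persist tasks can be discovered without enumerating every place they are started: a shared counter that every task increments on start and decrements on completion, which the graceful-stop path drains before shutting down. All names here are illustrative, and the real implementation is async and handles late-arriving writes more carefully.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Shared tracker of in-flight persist tasks, regardless of who started them
/// (WAL rotation, hot partition persistence, ...).
#[derive(Clone, Default)]
struct PersistTracker(Arc<AtomicUsize>);

impl PersistTracker {
    fn task_started(&self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
    fn task_completed(&self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
    fn outstanding(&self) -> usize {
        self.0.load(Ordering::SeqCst)
    }
}

/// Graceful stop: enqueue persistence of everything currently buffered, then
/// wait for all outstanding persist tasks to drain before shutting down.
fn graceful_stop(tracker: &PersistTracker, persist_all_buffered: impl Fn()) {
    persist_all_buffered();
    while tracker.outstanding() > 0 {
        thread::sleep(Duration::from_millis(50));
    }
}
```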
Prior to this commit, initialising the persist system returned a
PersistState instance used to communicate the saturation status of the
persistence system. The RPC write path used this information to accept or
deny write requests accordingly.
This was unfortunate in that it tightly coupled the ingest handler to
the persist system - in order to initialise the RPC handler, you had to
provide a PersistState; this required us to initialise a persist system
when testing only the RPC handler (which had nothing to do with
persisting). This smells!
This commit inverts the dependency and decouples the subsystems via a
shared type (IngestState). Instead of the persist system telling ingest
to stop, the ingest system provides a means to be told to stop - this
subtle difference decouples the ingest handler from all components that
need to block ingest. It allows a fast O(1) error-state read for N
components and prevents us from having to start N components just to test
an RPC handler.
Additionally, this commit introduces a currently unused ingest error state
(GracefulStop) as part of figuring out the API (it will be used shortly).
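An illustrative sketch of the shared-type shape described here, assuming a small atomic state; the real IngestState may differ in representation and variant names.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Reasons ingest may be blocked; variant names follow the description above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IngestStateError {
    Ok = 0,
    /// The persist system is saturated; shed load.
    PersistSaturated = 1,
    /// The ingester is gracefully stopping; reject new writes.
    GracefulStop = 2,
}

/// Shared between the RPC write path and any component that needs to block
/// ingest; reading the state is a cheap O(1) atomic load per request.
#[derive(Default)]
struct IngestState(AtomicU8);

impl IngestState {
    /// Called by whichever component needs to block ingest.
    fn set(&self, state: IngestStateError) {
        self.0.store(state as u8, Ordering::Relaxed);
    }

    /// Called by the RPC write path before accepting a write.
    fn read(&self) -> IngestStateError {
        match self.0.load(Ordering::Relaxed) {
            1 => IngestStateError::PersistSaturated,
            2 => IngestStateError::GracefulStop,
            _ => IngestStateError::Ok,
        }
    }
}
```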
Include the number of DML operations applied to the persisted buffer
in the "persisted partition" message.
Partly because I'm intrigued / it's useful information, and partly to
ensure LLVM doesn't get snazzy and dead-code-eliminate the sequence number
tracking because it was never read.
Changes the ingester2 buffer FSM to track the sequence numbers that have
been applied to it.
This is a pre-requisite for replication & correct WAL segment dropping.
Previously the ingester (ingester1) required writes to be applied in
order; this requirement has been relaxed, and the corresponding asserts
have been removed in ingester2.
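As a rough sketch of why this tracking matters once ordering is relaxed: a set of applied sequence numbers (rather than a single high-water mark) can answer questions such as "is this WAL segment fully persisted?". The names and representation below are illustrative only; the real tracker likely uses a more compact encoding.

```rust
use std::collections::BTreeSet;

/// Set of sequence numbers that have been applied to a buffer.
#[derive(Debug, Default)]
struct SequenceNumberSet(BTreeSet<u64>);

impl SequenceNumberSet {
    /// Record that the write with this sequence number was applied.
    fn observe(&mut self, seq: u64) {
        self.0.insert(seq);
    }

    /// True if every sequence number in `other` has been applied here - e.g.
    /// to decide whether a WAL segment's contents are fully covered and the
    /// segment can safely be dropped.
    fn contains_all(&self, other: &SequenceNumberSet) -> bool {
        other.0.is_subset(&self.0)
    }
}
```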
* feat: function to get partition candidates from the partition table
* chore: cleanup
* fix: make new_file_at the same value as created_at
* chore: cleanup
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>