influxdb

Commit Graph

Author	SHA1	Message	Date
Dom Dwyer	c8fdd76033	feat(ingester): partition buffer state machine This commit introduces code that is intended to replace the current implicit state machine used by PartitionData. The existing code is still in use, the new code is NOT used in this commit. A follow-up commit will switch over to minimise the diff. This change has two main goals; * encapsulation & simplification for callers * robust implementation so developing correct additions is easier This is a significant refactor of the partition buffering logic to encapsulate the various states of data (buffering, snapshot, persisting and the mixed states between them) within the Partition. This alleviates the rest of the system from having to be concerned with the differences between "buffering" data, and "unpersisted data", "snapshot data", "persisting data", "persisting with snapshots" etc - callers now invoke a method called get_query_data() and they are provided with all the relevant data for a partition. This abstraction change alone significantly reduces code and test complexity in the rest of the ingester. For the second goal, the new implementation leverages an explicit state machine, encoded using typestates. Typestate ensures compile-time correctness of transitions and method calls, and the explicit FSM itself helps ensure the system progresses in the desired manner - this fixes and helps prevent bugs caused by implicit states such as: https://github.com/influxdata/influxdb_iox/issues/5805 This state machine makes the system states explicit and self-descriptive, helping to reduce the cost of developer on-boarding (no prior knowledge of "how this bit works") and reduces ongoing developer burden. This explicit nature also de-risks adding new functionality - it should be relatively easy to add concurrent snapshot generation or incremental compaction without introducing bugs. The state transition logic is abstracted away from callers, minimising the overhead of this strategy.	2022-10-21 14:25:51 +02:00
Andrew Lamb	83e3a96c19	fix: improve ttbr histogram metric description (#5909) Co-authored-by: Dom <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-20 09:03:58 +00:00
Dom	ea7b4a0de6	Merge branch 'main' into dom/ingester-integration-tests	2022-10-19 13:36:54 +01:00
Marco Neumann	eb5a661ab3	refactor: prep work for #5897 (#5907) * refactor: add ID to `ParquetStorage` * refactor: remove duplicate code * refactor: use dedicated `StorageId`	2022-10-19 11:54:42 +00:00
Dom Dwyer	0c0a38c484	refactor: more verbose shard reset logs Adds a little more context to the "shard reset" logs.	2022-10-19 12:28:02 +02:00
Dom Dwyer	40f1937e63	test: write buffer seeking tests Asserts write buffer seeking behaviour, including: * Seeking past already persisted data correctly * Skipping to next available op in non-contiguous offset stream * Skipping to next available op for dropped ops due to retention * Panics when seeking beyond available data (into the future) Removes a pair of tests that covered some of the above due to their tight coupling with ingester internals.	2022-10-19 12:28:02 +02:00
Dom Dwyer	7729494f61	test: write, query & progress API coverage This commit adds a new test that exercises all major external APIs of the ingester: * Writing data via the write buffer * Waiting for data to be readable via the progress API * Querying data and asserting the contents This should provide basic integration coverage for the Ingester internals. This commit also removes a similar test (though with less coverage) that was tightly coupled to the existing buffering structures.	2022-10-19 11:51:15 +02:00
Dom Dwyer	b12d472a17	test(ingester): add integration TestContext Adds a test helper type that maintains the in-memory state for a single ingester integration test, and provides easy-to-use methods to manipulate and inspect the ingester instance.	2022-10-19 11:51:15 +02:00
Dom Dwyer	d0b546109f	refactor: impl converting IngesterQueryResponse An existing function to map the complex IngesterQueryResponse type to a simple set of RecordBatch existed in test code - this has been lifted onto an inherent method on the response type itself for reuse.	2022-10-19 11:51:15 +02:00
dependabot[bot]	b5574c07b7	chore(deps): Bump async-trait from 0.1.57 to 0.1.58 (#5904) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.57 to 0.1.58. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.57...0.1.58) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-10-19 09:40:26 +00:00
Andrew Lamb	d706f8221d	chore: Update datafusion and arrow / parquet / arrow-flight 25.0.0 (#5900) * chore: Update datafusion and `arrow` / `parquet` / `arrow-flight` 25.0.0 * chore: Update for structure changes * chore: Update for new projection pushdown * chore: Run cargo hakari tasks * fix: fmt Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-18 20:58:47 +00:00
Dom Dwyer	c63312ce12	refactor: use histogram to record TTBR Changes the TTBR metric from a gauge to a histogram so that observations maintain a time dimension.	2022-10-18 16:29:09 +02:00
Andrew Lamb	8021b8be0b	fix: Use Display rather than Debug when logging errors (#5859) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-14 14:43:11 +00:00
Luke Bond	475c8a0704	fix: only emit ttbr metric for applied ops (#5854) * fix: only emit ttbr metric for applied ops * fix: move DmlApplyAction to s/w accessible * chore: test for skipped ingest; comments and log improvements * fix: fixed ingester test re skipping write Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-14 12:06:49 +00:00
Carol (Nichols || Goulding)	efb964c390	feat: Enforce table column limits from the schema cache (#5819) * fix: Avoid some allocations by collecting instead of inserting into a vec * refactor: Encode that adding columns is for one table at a time * test: Add another test of column limits * test: Add below/above limit tests for create_or_get_many * fix: Explicitly DO NOT check column limits when inserting many columns * feat: Cache the max_columns_per_table on the NamespaceSchema * feat: Add a function to validate column limits in-memory * fix: Provide more useful information when over column limits * fix: Swap types to remove intermediate allocation * docs: Explain the interactions of the cache and the column limits * test: Actually set up test that showcases column limit race condition * fix: Allow writing to existing columns even if table is over column limit Co-authored-by: Dom <dom@itsallbroken.com>	2022-10-14 11:34:17 +00:00
Andrew Lamb	9134ccd6c3	chore: Update datafusion again (#5855) * chore: Update datafusion * chore: Updates for changes in datafusion * chore: more updates * fix: update doc example Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-13 19:18:57 +00:00
kodiakhq[bot]	3039b5877b	Merge branch 'main' into dom/no-persist-lookups	2022-10-13 15:13:36 +00:00
Dom Dwyer	86d28d3359	fix: update cached sort key Once persist() has successfully updated the sort key in the catalog, set the partition sort key cache to reflect the new value.	2022-10-13 17:12:07 +02:00
Dom Dwyer	9c40d80032	refactor(ingester): log shard_id in op result Include the shard ID in the op apply result to correlate it with other log messages.	2022-10-13 15:41:48 +02:00
Dom Dwyer	3e70dc44a0	refactor(catalog): remove partition_info_by_id() This method used to return a subset of partition metadata, and was used exclusively for persistence in the ingester. It is now no longer necessary.	2022-10-13 15:26:36 +02:00
Dom Dwyer	3fbeaa1314	refactor: assert monotonic partition persistence Copies the existing monotonic partition persistence check into the partition too - this ensures that even if the partitions are persisted in order, they are never marked as persisted OUT of order.	2022-10-13 15:26:36 +02:00
Dom Dwyer	920f7edf75	refactor: defer querying for table schema Do not query for the table schema until it is needed.	2022-10-13 15:26:36 +02:00
Dom Dwyer	e556677192	perf(ingester): remove persist lookup queries Removes the catalog queries previously used to look up various information about the partition/table/namespace that was already in memory. As part of this change, the compaction helper function is changed to accept the inputs it needs, rather than a struct of data from the catalog - this significantly simplifies testing. This commit also adds additional context to all log messages in the persist() fn.	2022-10-13 15:26:36 +02:00
Dom Dwyer	10d77b0ef7	refactor: use deferred sort key loading Changes the persist() implementation in the ingester to load the sort key using the deferred loading mechanism, instead of on-demand.	2022-10-13 15:26:36 +02:00
Dom Dwyer	dbcbb5b824	refactor: include sequence numbers in apply() logs Include the op sequence number in the error/success apply() log messages.	2022-10-13 14:19:02 +02:00
Dom Dwyer	15e153a74c	perf(ingester): cheaper table discovery This commit changes the table ID lookup query from an expensive, JOIN multi-query to a simple, single table, indexed lookup. As this is on the hot path, this should help with the recovery rate of the ingesters.	2022-10-13 13:44:50 +02:00
Andrew Lamb	d57c99638c	chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 (#5792) * chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 * fix: Update for coercion, fix explain plans for change in column name display * chore: Update datafusion lock * fix: Update for other API changes * chore: Update to latest datafusion pin * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-12 16:19:14 +00:00
dependabot[bot]	7202dddab6	chore(deps): Bump tokio-stream from 0.1.10 to 0.1.11 (#5838) Bumps [tokio-stream](https://github.com/tokio-rs/tokio) from 0.1.10 to 0.1.11. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-stream-0.1.10...tokio-stream-0.1.11) --- updated-dependencies: - dependency-name: tokio-stream dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-12 12:37:24 +00:00
Luke Bond	11900cea4d	chore: add some tracing logs to the ingester (#5839)	2022-10-12 12:10:20 +00:00
Dom Dwyer	b294bb98aa	refactor: move query types to query_handler Moves types that are only used for handling queries to the query_handler module.	2022-10-11 17:58:55 +02:00
Dom Dwyer	c4f542bbe2	refactor(ingester): remove tombstone support This commit removes tombstone support from the ingester, and deletes associated code/helpers/tests. This commit does NOT remove tombstone support from any other service, but MAY include removing overlapping test coverage. This also removes the tombstone support from the Ingester -> Querier RPC response message. This has the nice side effect of removing a whole lot of thread spawning in the ingester tests for the Executor, speeding everything up!	2022-10-11 13:10:04 +02:00
Luke Bond	fda1479db0	chore: add trace log to ingester to aid debugging (#5829)	2022-10-11 10:33:42 +00:00
Dom	d2467d0b63	Merge branch 'main' into dependabot/cargo/object_store-0.5.1	2022-10-11 09:56:27 +01:00
dependabot[bot]	933493fab3	chore(deps): Bump object_store from 0.5.0 to 0.5.1 Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.0 to 0.5.1. - [Release notes](https://github.com/apache/arrow-rs/releases) - [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md) - [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.0...object_store_0.5.1) --- updated-dependencies: - dependency-name: object_store dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2022-10-11 01:19:10 +00:00
Dom Dwyer	97c6e0f8ce	refactor: use TableName, not Arc<str> Adds a type wrapper TableName, internally an Arc<str> to leverage the type system instead of passing around untyped strings.	2022-10-10 19:09:43 +02:00
Dom Dwyer	4518bd49d1	test: constify duration seconds	2022-10-10 14:39:35 +02:00
Dom Dwyer	ab78f99ab2	refactor: eager background task abort Changes the get() code path to abort the background load task when the caller will resolve the sort key. Note that an aborted future will leave the DeferredSortKey without a background task to fetch the key, and the next caller will have to query the catalog. Given the rarity of aborted futures, and desire to minimise catalog load, this seems like a decent trade-off. This commit also documents the many-readers eager loading problem.	2022-10-10 14:39:35 +02:00
Dom Dwyer	afcb96ae47	perf(ingester): deferred sort key lookup queries This commit carries the SortKey in the PartitionData, and configures the ingester to use deferred sort key lookups, smearing the lookups across a fixed period of time after initialising the PartitionData, instead of querying for the sort key at persist time. This allows large numbers of PartitionData to be initialised without causing an equally large spike in catalog load to resolve the sort key - instead this load is spread out randomly to reduce peak query rps.	2022-10-06 16:39:54 +02:00
Dom Dwyer	c022ab6786	feat: deferred partition sort key fetcher Adds a new DeferredSortKey type that fetches a partition's sort key from the catalog in the background, or on-demand if not yet pre-fetched. From the caller's perspective, little has changed compared to reading it from the catalog directly - the sort key is always returned when calling get(), regardless of the mechanism, and retries are handled transparently. Internally the sort key MAY have been pre-fetched in the background between the DeferredSortKey being initialised, and the call to get(). The background task waits a (uniformly) random duration of time before issuing the catalog query to pre-fetch the sort key. This allows large numbers of DeferredSortKey to (randomly) smear the lookup queries over a large duration of time. This allows a large number of DeferredSortKey to be initialised in a short period of time, without creating an equally large spike in queries against the catalog in the same time period.	2022-10-06 16:37:04 +02:00
kodiakhq[bot]	ffa1704d96	Merge branch 'main' into dom/namespace-name	2022-10-06 13:58:47 +00:00
Marco Neumann	c4c83e0840	fix: query error propagation (#5801) - treat OOM protection as "resource exhausted" - use `DataFusionError` in more places instead of opaque `Box<dyn Error>` - improve conversion from/into `DataFusionError` to preserve more semantics Overall, this improves our error handling. DF can now return errors like "resource exhausted" and gRPC should now automatically generate a sensible status code for it. Fixes #5799.	2022-10-06 08:54:01 +00:00
Dom Dwyer	abb9122e2c	refactor: carry namespace name in NamespaceData Changes the ingester's NamespaceData to carry a ref-counted string identifier as well as the ID. The backing storage for the name in NamespaceData is shared with the index map in ShardData, so it is effectively free!	2022-10-05 13:03:16 +02:00
Dom Dwyer	1a7eb47b81	refactor: persist() passes all necessary IDs This commit changes the persist() call so that it passes through all relevant IDs so that the impl can locate the partition in the buffer tree - this will enable elimination of many queries against the catalog in the future. This commit also cleans up the persist() impl, deferring queries until the result will be used to avoid unnecessary load, improves logging & error handling, and documents a TOCTOU bug in code: https://github.com/influxdata/influxdb_iox/issues/5777	2022-10-04 14:28:01 +02:00
Dom Dwyer	f9bf86927d	refactor: ref PartitionData by key & ID Changes the TableData to hold a map of partition key -> PartitionData, and partition ID -> PartitionData simultaneously. This allows for cheap lookups when the caller holds an ID. This commit also manages to internalise the partition map within the TableData - one less pub / peeking! This commit also switches from a BTreeMap to a HashMap as the backing collection, as maintaining key ordering doesn't appear to be necessary.	2022-10-04 14:28:01 +02:00
Dom Dwyer	0847cc5458	refactor: PartitionData::id() -> partition_id() Consistent naming is consistent - all the others are thing_id().	2022-10-04 14:28:01 +02:00
Dom Dwyer	66e05b5ea7	refactor: ref NamespaceData by name & ID Changes the ShardData to hold a map of namespace name -> NamespaceData, and namespace ID -> NamespaceData simultaneously. This allows for cheap lookups when the caller holds an ID, and is part of preparatory work to transition away from using string names in the ingester for tables. This commit also switches from a BTreeMap to a HashMap as the backing collection, as maintaining key ordering doesn't appear to be necessary.	2022-10-04 14:28:01 +02:00
Dom Dwyer	9c0e4e98c4	refactor: ref TableData by name & ID Changes the NamespaceData to hold a map of table name -> TableData, and table ID -> TableData simultaneously. This allows for cheap lookups when the caller holds an ID, and is part of preparatory work to transition away from using string names in the ingester for tables. This commit also switches from a BTreeMap to a HashMap as the backing collection, as maintaining key ordering doesn't appear to be necessary.	2022-10-04 14:28:01 +02:00
Dom Dwyer	7efd81a63a	docs: comment write record ordering	2022-10-03 12:23:30 +02:00
Dom Dwyer	b23ad31711	fix: spurious memory accounting for failed write Fixes a case where the ingester may incorrectly record a write as having been buffered in memory, when in fact the buffering failed. This could cause the effective buffer size to be reduced over time as more and more data is spuriously "added" to the buffer, but never released back to the memory tracker as it is never persisted.	2022-10-03 12:13:43 +02:00
Dom Dwyer	20451921d0	test: MockLifecycleHandle captures calls Changes the NoopLifecycleHandle to MockLifecycleCall, and adds code causing it to log all calls made to the log_write() method. This will allow tests to assert calls and their values in DML buffering tests.	2022-10-03 12:13:43 +02:00
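
The partition buffer commit (c8fdd76033) describes an explicit state machine encoded with typestates, with callers reading data through get_query_data() regardless of state. A minimal sketch of that general technique follows. It is not the ingester's code: every type and method name other than get_query_data() is hypothetical, rows are plain strings rather than record batches, and the mixed states mentioned in the commit (such as "persisting with snapshots") are left out.

```rust
// Hypothetical state types; each one owns the rows held while in that state.
pub struct Buffering { rows: Vec<String> }
pub struct Snapshot { rows: Vec<String> }
pub struct Persisting { rows: Vec<String> }

// Read access that every state provides (illustrative trait, not from the source).
pub trait Queryable {
    fn rows(&self) -> &[String];
}
impl Queryable for Buffering { fn rows(&self) -> &[String] { &self.rows } }
impl Queryable for Snapshot { fn rows(&self) -> &[String] { &self.rows } }
impl Queryable for Persisting { fn rows(&self) -> &[String] { &self.rows } }

// The buffer is generic over its current state. Methods exist only on the
// states where they are valid, so an invalid call fails to compile.
pub struct PartitionBuffer<S> { state: S }

impl<S: Queryable> PartitionBuffer<S> {
    // Callers get the data the same way in every state.
    pub fn get_query_data(&self) -> Vec<String> {
        self.state.rows().to_vec()
    }
}

impl PartitionBuffer<Buffering> {
    pub fn new() -> Self {
        PartitionBuffer { state: Buffering { rows: Vec::new() } }
    }

    // Writes are accepted only while buffering.
    pub fn write(&mut self, row: &str) {
        self.state.rows.push(row.to_string());
    }

    // Consumes the buffering state, so no writes can follow the snapshot.
    pub fn snapshot(self) -> PartitionBuffer<Snapshot> {
        PartitionBuffer { state: Snapshot { rows: self.state.rows } }
    }
}

impl PartitionBuffer<Snapshot> {
    pub fn into_persisting(self) -> PartitionBuffer<Persisting> {
        PartitionBuffer { state: Persisting { rows: self.state.rows } }
    }
}

impl PartitionBuffer<Persisting> {
    // Ends the lifecycle and reports how many rows were persisted.
    pub fn mark_persisted(self) -> usize {
        self.state.rows.len()
    }
}

// Walks the full lifecycle once and returns the persisted row count.
pub fn demo() -> usize {
    let mut buf = PartitionBuffer::<Buffering>::new();
    buf.write("cpu,host=a usage=1");
    buf.write("cpu,host=b usage=2");
    let snap = buf.snapshot();
    // snap.write("...") would not compile: write() exists only on Buffering.
    snap.into_persisting().mark_persisted()
}
```

Because each transition takes self by value, the previous state cannot be used after the transition, which is how this style rules out out-of-order calls at compile time.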

435 Commits (c8fdd760335407579544cebf9f91ccac101a1fc7)