influxdb

Commit Graph

Author	SHA1	Message	Date
Carol (Nichols || Goulding)	c27d3a22d2	fix: Remove namespace argument from test helper function	2022-11-14 16:46:04 -05:00
Carol (Nichols || Goulding)	3943faf998	fix: Remove namespace from DmlWrite and DmlDelete constructors	2022-11-14 16:46:04 -05:00
Carol (Nichols || Goulding)	f78195f7c7	fix: Remove namespace name field from DmlWrite and DmlDelete But leave the argument in their constructors for now. Not all numbers in tests can be 42, Dom.	2022-11-14 16:46:04 -05:00
Carol (Nichols || Goulding)	c203e8295f	test: Keep track of namespaces by ID in ingester TestContext	2022-11-14 16:46:04 -05:00
kodiakhq[bot]	6c1e9f04ef	Merge branch 'main' into dom/deferred-table-name	2022-11-14 18:22:46 +00:00
Carol (Nichols || Goulding)	fd898cea2a	docs: Correct grammar and update outdated comment	2022-11-14 13:21:55 -05:00
dependabot[bot]	a969754819	chore(deps): Bump chrono from 0.4.22 to 0.4.23 (#6129) * chore(deps): Bump chrono from 0.4.22 to 0.4.23 Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.22 to 0.4.23. - [Release notes](https://github.com/chronotope/chrono/releases) - [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md) - [Commits](https://github.com/chronotope/chrono/compare/v0.4.22...v0.4.23) --- updated-dependencies: - dependency-name: chrono dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * refactor: chrono future compat Integer->timestamp conversions should not silently panic. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Marco Neumann <marco@crepererum.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-14 13:34:09 +00:00
Dom Dwyer	413b7c8f4a	refactor: use table name from catalog Changes the TableData within the ingester to utilise a TableNameResolver to fetch the TableName via the catalog on demand / in the background, instead of using the table name sent over the write. This change causes the ingester to perform a catalog query in the background (or on demand) to resolve the table name. This is a pre-requisite for removing the table name from the write wire format.	2022-11-14 11:32:22 +01:00
Dom Dwyer	0df6c7877c	refactor: indirect DeferredLoad<TableName> init Like the NamespaceNameProvider, this commit adds a TableNameProvider to provide decoupled initialisation of a DeferredLoad<TableName> instead of hard-coding in a catalog instance / query code, and plumbs it into position to be used when initialising a TableName.	2022-11-14 11:32:21 +01:00
Dom Dwyer	8dae6d3994	perf(ingester): address tables by ID only Changes the buffer tree to address TableData by their ID only (removing support for addressing tables by their string names). This removes the double reference book keeping / twin indexes and associated overhead. As part of this change, the TableName is now wrapped in a DeferredLoad in preparation for removal of the names in the DmlOperation wire format. This commit also switches the map of TableData within the NamespaceData (the parent node) to use the ArcMap for faster lookups and DRY exactly-once initialisation.	2022-11-14 11:27:19 +01:00
Dom Dwyer	d8fc9ff258	test: fix testing deadlocks The MemCatalog suffers from deadlocks when attempting to obtain a second ref to RepoCollection: https://github.com/influxdata/influxdb_iox/issues/3859	2022-11-14 10:50:10 +01:00
Dom Dwyer	9e97866b48	refactor: internalise PartitionProvider Removes the need to leak the PartitionProvider outside of the ingester crate. This will allow the PartitionProvider to utilise a DeferredLoad<TableName> without having to make the DeferredLoad and TableName pub.	2022-11-14 10:50:05 +01:00
Marco Neumann	746032af0f	fix: compatibility after hashbrown upgrade - Some methods need explicit types - `hashbrown::HashMap` now takes 32 bytes, not 64	2022-11-11 13:25:39 -05:00
Jake Goulding	cc17e5a54b	refactor: use a workspace dependency for hashbrown	2022-11-11 13:25:39 -05:00
dependabot[bot]	5024523f00	chore(deps): Bump hashbrown from 0.12.3 to 0.13.1 Bumps [hashbrown](https://github.com/rust-lang/hashbrown) from 0.12.3 to 0.13.1. - [Release notes](https://github.com/rust-lang/hashbrown/releases) - [Changelog](https://github.com/rust-lang/hashbrown/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/hashbrown/compare/v0.12.3...v0.13.1) --- updated-dependencies: - dependency-name: hashbrown dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-11-11 13:24:56 -05:00
Dom	2e7a1391f8	Merge branch 'main' into dom/deferred-namespace-name	2022-11-11 17:39:10 +00:00
Dom Dwyer	0f6470c390	refactor: use correct description for retries Use the correct description for namespace query retries.	2022-11-11 18:38:30 +01:00
Dom Dwyer	1e5d3f31af	docs: clearer code comments / docs Remove redundant comments & clarify returns.	2022-11-11 18:38:29 +01:00
Dom	18c86ca44f	refactor: named unused return Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>	2022-11-11 17:32:42 +00:00
Nga Tran	9c4266c503	refactor: first step to remove unused retention_duration (#6113) * refactor: first step to remove unused retention_duration * refactor: remove retention_duration from update catalog Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-11 15:21:06 +00:00
Dom Dwyer	2521aedb6a	perf(ingester): address namespaces by ID only Removes reliance on string name identifiers for namespaces in the ingester buffer tree, reducing the memory usage of the namespace index and associated overhead. The namespace name is required (though unused by IOx) in the IoxMetadata embedded within a parquet file, and therefore the name is necessary at persist time. For this reason, a DeferredLoad is used to query the catalog (by ID) for the name, at some uniformly random duration of time after initialisation of the NamespaceData, up to a maximum of 1 minute later. This ensures the query remains off the hot ingest path, and the jitter prevents spikes in catalog load during replay/ingester startup. As an additional / easy optimisation, the persist code causes a pre-fetch of the name in the background while compacting, hiding the query latency should it not have already been resolved. In order to keep the ingester buffer & catalog decoupled / easily testable, this commit uses a provider/factory trait NamespaceNameProvider and corresponding implementation (NamespaceNameResolver) in a similar fashion to the PartitionResolver, allowing easy mocking for tests, and composition for prod code, allowing future optimisations such as pre-fetching / caching the "hot" namespace names at startup. Internal string identifier removal is a pre-requisite for removing string identifiers from the write wire format (#4880).	2022-11-11 14:37:21 +01:00
Dom Dwyer	611acc1ad2	refactor: plumb in DeferredLoad<NamespaceName> Changes the ingester's buffer tree to use the deferred loading primitive to resolve the namespace name for NamespaceData. Note that the loader is initialised with the name in the first place - this commit just introduces the use of the deferred loading primitive, and doesn't change where the name is sourced from.	2022-11-11 14:37:20 +01:00
Dom Dwyer	3adc66a4b2	feat: Display impl for DeferredLoad This lets deferred loads be used in place of a non-deferred T, such as log context fields. If the value has not been resolved, the display impl returns "<unresolved>".	2022-11-11 14:37:19 +01:00
Dom Dwyer	76ed1afb01	perf(ingester): support prefetch deferred loads Allow a caller to signal to the DeferredLoad that the value it may or may not have to materialise will be used imminently, optimistically hiding the latency of resolving the value (typically a catalog query).	2022-11-11 14:37:18 +01:00
Dom Dwyer	d1cfa9d08b	refactor: remove redundant shard data init Removes confusingly unused shard data initialisation.	2022-11-11 13:27:15 +01:00
Dom	02be6ba7e4	refactor: generic deferred loader helper (#6095) * refactor: generic deferred loader helper Splits the DeferredSortKey loader introduced in #5807 into two parts - a generic helper type that implements deferred/background loading of values, and SortKey specific logic for use with it. As this will be more widely used, this implementation features improved behaviour of the deferred loader under concurrent demand requests (multiple calls to get() do not attempt to concurrently resolve the value), as well as complete cancellation safety (cancelling the get() doesn't affect the liveness of the background task). * docs: doc-link & minor comment amendments Fixes naming, adds missing doc-links, and expands some code comments. * test: bound wait times to avoid hangs Adds timeouts to all .await of the code under test, ensuring tests don't hang if something goes wrong.	2022-11-10 19:16:51 +00:00
Nga Tran	93e11d4c91	chore: Revert "feat: flag partitions for delete (#6075)" (#6111) This reverts commit `77a2541172`.	2022-11-10 17:01:39 +00:00
Carol (Nichols || Goulding)	dd013c5402	fix: Update the expected size in a test I tracked down the source of the size difference to the difference in `mem::size_of::<mutable_batch::column::ColumnData>`. I believe this enum is now able to take advantage of this niche-filling optimization: <https://github.com/rust-lang/rust/pull/94075/>	2022-11-09 10:54:18 -05:00
Carol (Nichols || Goulding)	fa46951524	fix: Remove needless deref done by auto deref, thanks Clippy!	2022-11-09 10:54:18 -05:00
Nga Tran	77a2541172	feat: flag partitions for delete (#6075) * feat: flag partition for delete * fix: compare the right date and time * chore: Run cargo hakari tasks * chore: cleanup * fix: typos * chore: rust style tidy ups in catalog Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Luke Bond <luke.n.bond@gmail.com>	2022-11-09 12:06:23 +00:00
Dom	d9c97795fc	feat: use IDs in ingester query API (#6093) * refactor: NS+table ID (instead of name) in querier<>ingester * feat(ingester): use IDs for query API Changes the ingester to utilise the ID fields (instead of names) sent over the query wire message wrapped within the Flight API. BREAKING: this changes the "query-ingester" CLI command arguments which now expects the namespace & table IDs, rather than their names. * refactor(ingester): add more query logging context Updates the log messages during query execution to include more context fields. * style: remove unused import Co-authored-by: Marco Neumann <marco@crepererum.net>	2022-11-09 11:25:13 +00:00
Dom Dwyer	38b0459994	test: simplify tests / remove catalog Remove the catalog from tests that only initialised an implementation in order to call buffer_operation().	2022-11-08 17:02:01 +01:00
Dom Dwyer	226f14a97f	perf(ingester): remove table lookup query Now DML operations contain the table ID, the ingester has all necessary data to initialise the TableData buffer node without having to query the catalog. This also removes the catalog from the buffer_operation() call path, simplifying testing.	2022-11-08 17:00:44 +01:00
Dom Dwyer	225c3b97c1	perf(ingester): remove namespace lookup query Now DML operations contain the namespace ID, the ingester has all necessary data to initialise the NamespaceData buffer node without having to query the catalog.	2022-11-08 16:57:53 +01:00
Dom Dwyer	8ebea0df37	feat: table/namespace IDs in write protocol Expose the Table and Namespace IDs encoded within the serialised DML write (added in #6036). This makes the IDs available for use in the consumers, ending the transition period. This commit DOES NOT remove the strings sent over the wire.	2022-11-08 16:57:53 +01:00
Dom	b7f7ee6a13	Merge branch 'main' into dom/mutex-pushdown	2022-11-08 14:57:32 +00:00
Dom Dwyer	b73d07c22b	perf(ingester): granular per-partition locking This commit pushes the existing table-level mutex down to the partition. This allows the ingester to gather data from multiple partitions within a single table in parallel, and reduces contention between ingest/query workloads.	2022-11-08 15:45:59 +01:00
Dom Dwyer	b8181119e1	refactor: push down per-partition op skipping This moves the logic that skips operations that do not need to be applied to a partition during shard replay from the table level, to the partition level.	2022-11-08 15:45:52 +01:00
Dom Dwyer	4c8882e33a	docs: ref link to fix PR	2022-11-08 15:17:46 +01:00
Dom Dwyer	d71f023a57	refactor: inline helpers Inline the hash generation & key comparator.	2022-11-08 15:17:46 +01:00
Dom Dwyer	8dd7f2c603	refactor: accept owned key for insert() Changes the bounds on the ArcMap to accept an owned key, avoiding an extra allocation. Cleans up the bounds on other fn to ensure the borrowed key impl Eq and is the ref type of K.	2022-11-08 15:17:46 +01:00
Dom Dwyer	bbc2afe2a1	refactor: extract key equality checking Creates a shared fn for checking key equality to DRY the various chaining checks.	2022-11-08 15:17:46 +01:00
Dom Dwyer	8eaccd518b	fix: cross-thread map entry visibility This commit changes the ArcMap HashBuilder to use the same instance as the underlying HashMap hasher. This prevents divergent hashing across threads that MAY initialise a hasher with a different seed.	2022-11-08 15:17:46 +01:00
Dom Dwyer	66a6e8e929	test: cross-thread hashmap entry visibility At the time of this commit, this test fails. Performing a get() on a key previously inserted by another thread should not fail.	2022-11-08 15:17:46 +01:00
Dom Dwyer	fbd25a06d0	revert: push down per-partition op skipping This reverts commit `425fd46def`.	2022-11-08 10:31:51 +01:00
Dom Dwyer	7ac0857a28	revert: granular per-partition locking This reverts commit `79d24fa350`.	2022-11-08 10:31:37 +01:00
Dom Dwyer	79d24fa350	perf(ingester): granular per-partition locking This commit pushes the existing table-level mutex down to the partition. This allows the ingester to gather data from multiple partitions within a single table in parallel, and reduces contention between ingest/query workloads.	2022-11-07 13:45:03 +01:00
Dom Dwyer	425fd46def	refactor: push down per-partition op skipping This moves the logic that skips operations that do not need to be applied to a partition during shard replay from the table level, to the partition level.	2022-11-07 13:45:03 +01:00
kodiakhq[bot]	5e297e259b	Merge branch 'main' into dom/arcmap-get_or_insert_with	2022-11-07 11:47:00 +00:00
Andrew Lamb	034d9b371d	chore: Update datafusion and arrow/arrow-flight/parquet to `26.0.0` (#6061) * chore: Update datafusion and arrow/arrow-flight/parquet to `26.0.0` * fix: Update query_functions * fix: update for TimestampNanosecondArray API changes * fix: update for TimestampNanosecondArray API changes * chore: Update flatbuffers and remove rustsec warning * chore: Update text * fix: update more test * fix: Lock ahash to exactly 0.8.0 * fix: Update datafusion pin * chore: Run cargo hakari tasks Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-07 11:01:58 +00:00

513 Commits (20f1ae1c8fb5a81dc806b2f327cc14a5394c992c)