Commit Graph

9830 Commits (6991b033291208e28db7f8d2f5165e7d1cb5c006)

Author SHA1 Message Date
Carol (Nichols || Goulding) 4f081a8405
fix: Remove 'legacy' from JSON ticket format as it's still used by Go clients 2022-11-14 12:46:58 -05:00
dependabot[bot] a969754819
chore(deps): Bump chrono from 0.4.22 to 0.4.23 (#6129)
* chore(deps): Bump chrono from 0.4.22 to 0.4.23

Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.22 to 0.4.23.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.22...v0.4.23)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* refactor: chrono future compat

Integer->timstamp conversions should not silently panic.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-14 13:34:09 +00:00
Dom Dwyer 413b7c8f4a
refactor: use table name from catalog
Changes the TableData within the ingester to utilise a TableNameResolver
to fetch the TableName via the catalog on demand / in the background,
instead of using the table name sent over the write.

This change causes the ingester to perform a catalog query in the
background (or on demand) to resolve the table name. This is a
pre-requisite for removing the table name from the write wire format.
2022-11-14 11:32:22 +01:00
Dom Dwyer 0df6c7877c
refactor: indirect DeferredLoad<TableName> init
Like the NamespaceNameProvider, this commit adds a TableNameProvider to
provide decoupled initialisation of a DeferredLoad<TableName> instead of
hard-coding in a catalog instance / query code, and plumbs it into
position to be used when initialising a TableName.
2022-11-14 11:32:21 +01:00
Dom Dwyer 8dae6d3994
perf(ingester): address tables by ID only
Changes the buffer tree to address TableData by their ID only (removing
support for addressing tables by their string names). This removes the
double reference book keeping / twin indexes and associated overhead.

As part of this change, the TableName is now wrapped in a DeferredLoad
in preparation for removal of the names in the DmlOperation wire format.

This commit also switches the map of TableData within the NamespaceData
(the parent node) to use the ArcMap for faster lookups and DRY
exactly-once initialisation.
2022-11-14 11:27:19 +01:00
kodiakhq[bot] d2c515b069
Merge pull request #6122 from influxdata/dom/internalise-partition-resolver
refactor: internalise PartitionProvider
2022-11-14 09:57:20 +00:00
Dom Dwyer d8fc9ff258
test: fix testing deadlocks
The MemCatalog suffers from deadlocks when attempting to obtain a second
ref to RepoCollection:

    https://github.com/influxdata/influxdb_iox/issues/3859
2022-11-14 10:50:10 +01:00
Dom Dwyer 9e97866b48
refactor: internalise PartitionProvider
Removes the need to leak the PartitionProvider outside of the ingester
crate.

This will allow the PartitionProvider to utilise a
DeferredLoad<TableName> without having to make the DeferredLoad and
TableName pub.
2022-11-14 10:50:05 +01:00
Dom 96a06c060d
Merge pull request #6130 from influxdata/dependabot/cargo/clap-4.0.23
chore(deps): Bump clap from 4.0.22 to 4.0.23
2022-11-14 09:45:10 +00:00
dependabot[bot] f144231e8c
chore(deps): Bump clap from 4.0.22 to 4.0.23
Bumps [clap](https://github.com/clap-rs/clap) from 4.0.22 to 4.0.23.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v4.0.22...v4.0.23)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-11-14 01:07:26 +00:00
kodiakhq[bot] 6fc1bb4e70
Merge pull request #6116 from influxdata/dependabot/cargo/hashbrown-0.13.1
chore(deps): Bump hashbrown from 0.12.3 to 0.13.1
2022-11-11 21:33:28 +00:00
kodiakhq[bot] 05d7d1495e
Merge branch 'main' into dependabot/cargo/hashbrown-0.13.1 2022-11-11 21:26:40 +00:00
kodiakhq[bot] e7ac4eb7fe
Merge pull request #6127 from influxdata/cn/database
fix: Rename more instances of "database" to "namespace"
2022-11-11 21:21:06 +00:00
Carol (Nichols || Goulding) 98621f560b
fix: Remove obsolete SERVER_MODE from the Dockerfile 2022-11-11 16:14:13 -05:00
Carol (Nichols || Goulding) 51afe53d23
fix: Rename database to namespace in service_grpc_flight types, vars, and errors 2022-11-11 16:14:13 -05:00
Carol (Nichols || Goulding) 042bf545fe
fix: Rename database types/vars to namespace in the storage command code 2022-11-11 16:14:12 -05:00
Carol (Nichols || Goulding) 3dde82b3b9
fix: Rename QueryDatabaseProvider to QueryNamespaceProvider 2022-11-11 16:14:12 -05:00
Carol (Nichols || Goulding) 0657ad9600
fix: Rename QueryDatabase to QueryNamespace 2022-11-11 16:14:12 -05:00
Carol (Nichols || Goulding) 621560a0dc
fix: Rename QueryDatabaseMeta to QueryNamespaceMeta 2022-11-11 16:14:12 -05:00
Carol (Nichols || Goulding) d965004e52
fix: Rename DmlError::DatabaseNotFound to NamespaceNotFound 2022-11-11 15:46:05 -05:00
Carol (Nichols || Goulding) bdff4e8848
fix: Consistently use 'namespace' instead of 'database' in comments and other internal text 2022-11-11 15:46:04 -05:00
Nga Tran a3f2fe489c
refactor: remove retention_duration field from namespace catalog table (#6124) 2022-11-11 20:30:42 +00:00
CircleCI[bot] 55f880a498 chore: Run cargo hakari tasks 2022-11-11 18:47:12 +00:00
Jake Goulding 6ca2915d9b fix: missing ahash lockfile entry 2022-11-11 13:27:05 -05:00
Marco Neumann 746032af0f fix: compatibility after hashbrown upgrade
- Some methods need explicit types
- `hashbrown::HashMap` now takes 32 bytes, not 64
2022-11-11 13:25:39 -05:00
Jake Goulding cc17e5a54b refactor: use a workspace dependency for hashbrown 2022-11-11 13:25:39 -05:00
dependabot[bot] 5024523f00 chore(deps): Bump hashbrown from 0.12.3 to 0.13.1
Bumps [hashbrown](https://github.com/rust-lang/hashbrown) from 0.12.3 to 0.13.1.
- [Release notes](https://github.com/rust-lang/hashbrown/releases)
- [Changelog](https://github.com/rust-lang/hashbrown/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/hashbrown/compare/v0.12.3...v0.13.1)

---
updated-dependencies:
- dependency-name: hashbrown
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-11-11 13:24:56 -05:00
Dom 1b322a6a6e
Merge pull request #6121 from influxdata/dom/deferred-namespace-name
perf(ingester): deferred namespace name discovery
2022-11-11 17:45:31 +00:00
Dom 2e7a1391f8
Merge branch 'main' into dom/deferred-namespace-name 2022-11-11 17:39:10 +00:00
Dom Dwyer 0f6470c390
refactor: use correct description for retries
Use the correct description for namespace query retries.
2022-11-11 18:38:30 +01:00
Dom Dwyer 1e5d3f31af
docs: clearer code comments / docs
Remove redundant comments & clarify returns.
2022-11-11 18:38:29 +01:00
Dom 18c86ca44f
refactor: named unused return
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-11-11 17:32:42 +00:00
Nga Tran ea2559a640
chore: update SQL syntax after adding the new column to_delete into table partition (#6123) 2022-11-11 16:21:45 +00:00
Nga Tran 9c4266c503
refactor: first step to remove unused retention_duration (#6113)
* refactor: first step to remove unused retention_duration

* refactor: remove retenion_duration from update catalog

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-11 15:21:06 +00:00
Dom Dwyer 2521aedb6a
perf(ingester): address namespaces by ID only
Removes reliance on string name identifiers for namespaces in the
ingester buffer tree, reducing the memory usage of the namespace index
and associated overhead.

The namespace name is required (though unused by IOx) in the IoxMetadata
embedded within a parquet file, and therefore the name is necessary at
persist time. For this reason, a DeferredLoad is used to query the
catalog (by ID) for the name, at some uniformly random duration of time
after initialisation of the NamespaceData, up to a maximum of 1 minute
later. This ensures the query remains off the hot ingest path, and the
jitter prevents spikes in catalog load during replay/ingester startup.

As an additional / easy optimisation, the persist code causes a
pre-fetch of the name in the background while compacting, hiding the
query latency should it not have already been resolved.

In order to keep the the ingester buffer & catalog decoupled / easily
testable, this commit uses a provider/factory trait
NamespaceNameProvider and corresponding implementation
(NamespaceNameResolver) in a similar fashion to the PartitionResolver,
allowing easy mocking for tests, and composition for prod code, allowing
future optimisations such as pre-fetching / caching the "hot" namespace
names at startup.

Internal string identifier removal is a pre-requisite for removing
string identifiers from the write wire format (#4880).
2022-11-11 14:37:21 +01:00
Dom Dwyer 611acc1ad2
refactor: plumb in DeferredLoad<NamespaceName>
Changes the ingester's buffer tree to use the deferred loading primitive
to resolve the namespace name for NamespaceData.

Note that the loader is initialised with the name in the first place -
this commit just introduces the use of the deferred loading primitive,
and doesn't change where the name is sourced from.
2022-11-11 14:37:20 +01:00
Dom Dwyer 3adc66a4b2
feat: Display impl for DeferredLoad
This lets deferred loads be used in place of a non-differed T, such as
log context fields.

If the value has not been resolved, the display impl returns
"<unresolved>".
2022-11-11 14:37:19 +01:00
Dom Dwyer 76ed1afb01
perf(ingester): support prefetch deferred loads
Allow a caller to signal to the DeferredLoad that the value it may or
may not have to materialise will be used imminently, optimistically
hiding the latency of resolving the value (typically a catalog query).
2022-11-11 14:37:18 +01:00
Dom e60b3191c8
Merge pull request #6120 from influxdata/dom/remove-redundant
refactor: remove redundant shard data init
2022-11-11 13:01:36 +00:00
Dom Dwyer d1cfa9d08b
refactor: remove redundant shard data init
Removes confusingly unused shard data initialisation.
2022-11-11 13:27:15 +01:00
Marco Neumann 7b83d6ec64
fix: traverse some more DataFusion error to extract tonic error (#6118)
See https://github.com/apache/arrow-datafusion/issues/4172 .
2022-11-11 12:24:47 +00:00
Marco Neumann 5398a1b986
chore: update blake2 because version 0.10.4 was yanked (#6119) 2022-11-11 11:36:11 +00:00
dependabot[bot] 358bdf6316
chore(deps): Bump ahash from 0.8.1 to 0.8.2 (#6115)
Bumps [ahash](https://github.com/tkaitchuck/ahash) from 0.8.1 to 0.8.2.
- [Release notes](https://github.com/tkaitchuck/ahash/releases)
- [Commits](https://github.com/tkaitchuck/ahash/compare/v0.8.1...v0.8.2)

---
updated-dependencies:
- dependency-name: ahash
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-11 10:47:00 +00:00
Dom 02be6ba7e4
refactor: generic deferred loader helper (#6095)
* refactor: generic deferred loader helper

Splits the DeferredSortKey loader introduced in #5807 into two parts - a
generic helper type that implements deferred/background loading of
values, and SortKey specific logic for use with it.

As this will be more widley used, this implementation features improved
behaviour of the deferred loader under concurrent demand requests
(multiple calls to get() do not attempt to concurrently resolve the
value), as well as complete cancellation safety (cancelling the get()
doesn't affect the liveness of the background task).

* docs: doc-link & minor comment amendments

Fixes naming, adds missing doc-links, and expands some code comments.

* test: bound wait times to avoid hangs

Adds timeouts to all .await of the code under test, ensuring tests don't
hang if something goes wrong.
2022-11-10 19:16:51 +00:00
Nga Tran 93e11d4c91
chore: Revert "feat: flag partitions for delete (#6075)" (#6111)
This reverts commit 77a2541172.
2022-11-10 17:01:39 +00:00
Nga Tran e9c8f40af2
chore: Revert "feat: call soft delete partitions from garbage collector (#6090)" (#6110)
This reverts commit 9fe7a50129.
2022-11-10 16:15:13 +00:00
Nga Tran e81ff1f4d5
chore: Revert "feat: catalog delete old partitions (#6099)" (#6109)
This reverts commit 664b0578e9.
2022-11-10 15:31:16 +00:00
Andrew Lamb 694443bb87
chore: Rename DatabaseName to NamespaceName (#6100)
* chore: Rename DatabaseName to NamespaceName

* fix: fmt

* chore: Updates some more references

* chore: more cleanup

* fix: adjust test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-10 14:13:59 +00:00
Andrew Lamb 6c17ee29a5
feat: make logging clearer when parquet files upload is retried (#6056)
* feat: log success when parquet files are retried

* fix: Update parquet_file/src/storage.rs

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: fmt

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-10 11:59:54 +00:00
Luke Bond 664b0578e9
feat: catalog delete old partitions (#6099)
* feat: catalog delete old partitions

* chore: remove debug println

* chore: remove debug println

* chore: clippy

* chore: sql statement refactor for deleting partitions

* chore: improve delete partition test

* chore: clippy
2022-11-10 10:51:22 +00:00