Include the minimum acceptable timestamp (the retention bound) and the
observed timestamp that exceeds this bound in the retention enforcement
write error response.
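A minimal sketch of an error carrying both values (the type and field names here are illustrative, not the actual error type):

```rust
/// Hypothetical error illustrating a retention-enforcement write rejection
/// that carries both bounds, so the caller can see *why* the write was refused.
#[derive(Debug)]
struct RetentionError {
    /// The minimum timestamp (nanoseconds since the epoch) still inside the
    /// retention period.
    min_acceptable_ns: i64,
    /// The timestamp observed in the rejected write.
    observed_ns: i64,
}

impl std::fmt::Display for RetentionError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(
            f,
            "write timestamp {} is outside the retention period (minimum acceptable timestamp: {})",
            self.observed_ns, self.min_acceptable_ns
        )
    }
}

impl std::error::Error for RetentionError {}

fn main() {
    let err = RetentionError {
        min_acceptable_ns: 1_600_000_000_000_000_000,
        observed_ns: 1_500_000_000_000_000_000,
    };
    println!("{err}");
}
```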
Still insert them into the database and associate them with namespaces,
but don't ever query them back out.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
And return the "deletes unsupported" error sooner.
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Most of the test cases set up a context with or without namespace
autocreation, while only a few set a retention period. This removes the
requirement to pass a retention period when enabling autocreation, or
vice versa.
Using the builder pattern enables a more ergonomic workflow
for setting optional namespace creation semantics when constructing
the TestContext for the router.
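A hedged sketch of what the builder-style construction can look like (the method and type names are invented for illustration, not the router's real test API):

```rust
use std::time::Duration;

/// Illustrative stand-in for the router's test context builder.
#[derive(Default)]
struct TestContextBuilder {
    autocreate_ns: bool,
    ns_autocreate_retention: Option<Duration>,
}

struct TestContext {
    autocreate_ns: bool,
    ns_autocreate_retention: Option<Duration>,
}

impl TestContextBuilder {
    /// Enable namespace autocreation; a retention period is no longer required.
    fn with_autocreate_namespace(mut self) -> Self {
        self.autocreate_ns = true;
        self
    }

    /// Optionally set a retention period, independently of autocreation.
    fn with_retention(mut self, retention: Duration) -> Self {
        self.ns_autocreate_retention = Some(retention);
        self
    }

    fn build(self) -> TestContext {
        TestContext {
            autocreate_ns: self.autocreate_ns,
            ns_autocreate_retention: self.ns_autocreate_retention,
        }
    }
}

fn main() {
    // Most tests only need autocreation; a retention period is now optional.
    let ctx = TestContextBuilder::default()
        .with_autocreate_namespace()
        .build();
    assert!(ctx.autocreate_ns && ctx.ns_autocreate_retention.is_none());

    // ...but retention can still be set independently when a test needs it.
    let ctx = TestContextBuilder::default()
        .with_retention(Duration::from_secs(3600))
        .build();
    assert!(!ctx.autocreate_ns && ctx.ns_autocreate_retention.is_some());
}
```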
This commit adds initial support for "soft" namespace deletion, where
the actual records & data remain, but are no longer queryable /
writeable.
Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.
The components treat soft-deleted namespaces differently:
* router: ignore soft deleted namespaces
* ingester: accept soft deleted namespaces
* compactor: accept soft deleted namespaces
* querier: ignore soft deleted namespaces
* various gRPC services: ignore soft deleted namespaces
This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.
Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to the state at which the delete was
issued (rather than losing the buffered data).
Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seem like a trivial win.
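One way to picture the split above is a per-caller visibility policy applied when namespaces are read; the names below are assumptions for the sketch, not the real catalog interface:

```rust
/// Whether a caller should observe soft-deleted namespaces.
#[derive(Clone, Copy)]
enum SoftDeletedRows {
    /// Router, querier & gRPC services: hide soft-deleted namespaces.
    ExcludeDeleted,
    /// Ingester & compactor: keep seeing soft-deleted namespaces so rows never
    /// appear to "vanish" and forward progress continues.
    AllRows,
}

struct Namespace {
    name: String,
    /// Set when the namespace was soft-deleted; records and data remain.
    deleted_at: Option<i64>,
}

/// Filter an in-memory namespace listing according to the caller's policy.
fn visible<'a>(
    namespaces: &'a [Namespace],
    policy: SoftDeletedRows,
) -> impl Iterator<Item = &'a Namespace> + 'a {
    namespaces.iter().filter(move |ns| match policy {
        SoftDeletedRows::ExcludeDeleted => ns.deleted_at.is_none(),
        SoftDeletedRows::AllRows => true,
    })
}

fn main() {
    let namespaces = vec![
        Namespace { name: "active".into(), deleted_at: None },
        Namespace { name: "deleted".into(), deleted_at: Some(42) },
    ];

    // The router/querier-style view hides the soft-deleted namespace...
    let router_view: Vec<_> = visible(&namespaces, SoftDeletedRows::ExcludeDeleted)
        .map(|ns| ns.name.as_str())
        .collect();
    assert_eq!(router_view, vec!["active"]);

    // ...while the ingester/compactor-style view still sees both rows.
    assert_eq!(visible(&namespaces, SoftDeletedRows::AllRows).count(), 2);
}
```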
* feat: create namespace API call in router
Co-authored-by: Nga Tran <nga-tran@live.com>
* chore: treat retention as ns except in CLI
* fix: overflow in nanosecond calc (see the sketch after this list)
* fix: retention test after changing it from hours to ns
* chore: comment clarification in cli; better response type for error in ns API
* fix: correct some rebase mistakes
* chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const
* fix: ns autocreation test was wrong after rebase
* fix: mem catalog has default 1hr retention, accidentally removed in rebase
* chore: remove mem catalog's default 1hr retention; make it settable in tests & router
Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
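As a rough illustration of the nanosecond-overflow concern flagged above (a hypothetical helper, not the actual CLI code), converting an hour-based value into the internal nanosecond representation should use checked arithmetic:

```rust
const NANOS_PER_HOUR: i64 = 60 * 60 * 1_000_000_000;

/// Convert a retention period given in hours into nanoseconds, rejecting
/// values that would overflow an i64 instead of silently wrapping.
fn retention_hours_to_ns(hours: i64) -> Option<i64> {
    hours.checked_mul(NANOS_PER_HOUR)
}

fn main() {
    assert_eq!(retention_hours_to_ns(1), Some(3_600_000_000_000));
    // A naive `hours * NANOS_PER_HOUR` would overflow for very large inputs
    // (panicking in debug builds, wrapping in release); checked_mul surfaces it.
    assert_eq!(retention_hours_to_ns(i64::MAX), None);
}
```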
* feat: reject writes that are outside the retention period
* feat: add retention validator into handler stack (see the sketch after this list)
* chore: Apply suggestions from code review
Co-authored-by: Dom <dom@itsallbroken.com>
* refactor: address review comments
* test: unit tests for retention validation
* chore: address review comments
* test: more unit tests and integration tests
* refactor: make time inside retention period for emphemeral_mode test
* fix: 2 hours
Co-authored-by: Dom <dom@itsallbroken.com>
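A simplified sketch of a retention-validation step in the write path (the function and error type are invented for illustration; the real handler stack is async and more involved):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Illustrative write-path validation error.
#[derive(Debug)]
struct OutsideRetention {
    min_acceptable_ns: i64,
    observed_ns: i64,
}

/// Reject any row timestamp older than `now - retention_period_ns`.
fn validate_retention(
    row_timestamps_ns: &[i64],
    retention_period_ns: i64,
) -> Result<(), OutsideRetention> {
    let now_ns = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before epoch")
        .as_nanos() as i64;
    let min_acceptable_ns = now_ns - retention_period_ns;

    match row_timestamps_ns.iter().find(|&&t| t < min_acceptable_ns) {
        Some(&observed_ns) => Err(OutsideRetention {
            min_acceptable_ns,
            observed_ns,
        }),
        None => Ok(()),
    }
}

fn main() {
    let now_ns = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_nanos() as i64;
    let one_hour_ns: i64 = 3_600_000_000_000;

    // A fresh timestamp passes; a day-old one is rejected with both bounds.
    assert!(validate_retention(&[now_ns], one_hour_ns).is_ok());
    if let Err(e) = validate_retention(&[now_ns - 24 * one_hour_ns], one_hour_ns) {
        println!(
            "rejected: observed {} < minimum acceptable {}",
            e.observed_ns, e.min_acceptable_ns
        );
    }
}
```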
Expose the Table and Namespace IDs encoded within the serialised DML
write (added in #6036).
This makes the IDs available for use in the consumers, ending the
transition period. This commit DOES NOT remove the strings sent over the
wire.
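Roughly, the consumer-facing shape is typed IDs exposed alongside (not instead of) the names; the accessor sketch below uses assumed types, not the actual wire format:

```rust
/// Typed IDs, so namespace and table identifiers cannot be confused.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NamespaceId(i64);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TableId(i64);

/// Hypothetical view of a decoded DML write: the legacy strings are still
/// present on the wire alongside the newly exposed IDs.
struct DecodedWrite {
    namespace_name: String,
    namespace_id: NamespaceId,
    tables: Vec<(String, TableId)>,
}

impl DecodedWrite {
    fn namespace_id(&self) -> NamespaceId {
        self.namespace_id
    }

    fn table_ids(&self) -> impl Iterator<Item = TableId> + '_ {
        self.tables.iter().map(|(_, id)| *id)
    }
}

fn main() {
    let write = DecodedWrite {
        namespace_name: "bananas".to_string(),
        namespace_id: NamespaceId(42),
        tables: vec![("cpu".to_string(), TableId(1))],
    };
    // Consumers can now use the IDs directly, while the name remains available.
    assert_eq!(write.namespace_id(), NamespaceId(42));
    assert_eq!(write.table_ids().collect::<Vec<_>>(), vec![TableId(1)]);
    assert_eq!(write.namespace_name, "bananas");
}
```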
* chore: delete duplicate character in metric
* fix: failing CI test case
* fix: failing CI test case
* fix: failing CI test case
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Changes the DML handler transformers to pass through the TableId once it
has been resolved during schema validation.
This value is collated by shard but is otherwise unused for now. The
collated TableId map will be used in a follow-up PR.
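The collation itself is conceptually a per-shard grouping of the resolved table IDs; a sketch with invented ShardId/TableId newtypes:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ShardId(i32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TableId(i64);

/// Group resolved table IDs by the shard their rows were routed to; the
/// resulting map is what gets carried along for later use.
fn collate_by_shard(routed: &[(ShardId, TableId)]) -> HashMap<ShardId, Vec<TableId>> {
    let mut collated: HashMap<ShardId, Vec<TableId>> = HashMap::new();
    for &(shard, table) in routed {
        collated.entry(shard).or_default().push(table);
    }
    collated
}

fn main() {
    let routed = [
        (ShardId(0), TableId(1)),
        (ShardId(0), TableId(2)),
        (ShardId(1), TableId(3)),
    ];
    let collated = collate_by_shard(&routed);
    assert_eq!(collated[&ShardId(0)], vec![TableId(1), TableId(2)]);
    assert_eq!(collated[&ShardId(1)], vec![TableId(3)]);
}
```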
This commit introduces a new (composable) trait; a NamespaceResolver is
an abstraction responsible for taking a string namespace from a user
request, and mapping it to its catalog ID.
This allows the NamespaceId to be injected through the DmlHandler chain
in addition to the namespace name.
As part of this change, the NamespaceAutocreation layer was changed from
an implementer of the DmlHandler trait to a NamespaceResolver, as it
is a more appropriate abstraction for the functionality it provides.
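In outline, the abstraction maps a user-supplied namespace name to its catalog ID and composes like a decorator; the trait below is a simplified, synchronous guess for illustration only:

```rust
use std::collections::HashMap;

/// Typed namespace ID, as stored in the catalog.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NamespaceId(i64);

/// Illustrative (synchronous) version of the abstraction: resolve a string
/// namespace from a user request to its catalog ID.
trait NamespaceResolver {
    fn get_namespace_id(&self, namespace: &str) -> Result<NamespaceId, String>;
}

/// A resolver backed by a fixed lookup table, standing in for the catalog.
struct CatalogResolver(HashMap<String, NamespaceId>);

impl NamespaceResolver for CatalogResolver {
    fn get_namespace_id(&self, namespace: &str) -> Result<NamespaceId, String> {
        self.0
            .get(namespace)
            .copied()
            .ok_or_else(|| format!("namespace {namespace} not found"))
    }
}

/// Composable layer: autocreate missing namespaces, then delegate. This mirrors
/// NamespaceAutocreation becoming a resolver layer instead of a DmlHandler.
struct Autocreation<T>(T);

impl<T: NamespaceResolver> NamespaceResolver for Autocreation<T> {
    fn get_namespace_id(&self, namespace: &str) -> Result<NamespaceId, String> {
        match self.0.get_namespace_id(namespace) {
            Ok(id) => Ok(id),
            // In this sketch, pretend the namespace was created and an ID issued.
            Err(_) => Ok(NamespaceId(1)),
        }
    }
}

fn main() {
    let resolver = Autocreation(CatalogResolver(HashMap::new()));
    assert_eq!(resolver.get_namespace_id("bananas"), Ok(NamespaceId(1)));
}
```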
* fix: Avoid some allocations by collecting instead of inserting into a vec
* refactor: Encode that adding columns is for one table at a time
* test: Add another test of column limits
* test: Add below/above limit tests for create_or_get_many
* fix: Explicitly DO NOT check column limits when inserting many columns
* feat: Cache the max_columns_per_table on the NamespaceSchema
* feat: Add a function to validate column limits in-memory (see the sketch after this list)
* fix: Provide more useful information when over column limits
* fix: Swap types to remove intermediate allocation
* docs: Explain the interactions of the cache and the column limits
* test: Actually set up test that showcases column limit race condition
* fix: Allow writing to existing columns even if table is over column limit
Co-authored-by: Dom <dom@itsallbroken.com>
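A rough model of the in-memory check described above (field and function names are assumptions): new columns count against the cached max_columns_per_table, while writes that only touch existing columns always pass.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the cached schema of one table.
struct TableSchema {
    columns: HashSet<String>,
}

/// Simplified stand-in for the namespace schema, carrying the cached limit.
struct NamespaceSchema {
    max_columns_per_table: usize,
}

#[derive(Debug, PartialEq)]
enum ColumnCheck {
    Ok,
    OverLimit { existing: usize, attempted_new: usize, max: usize },
}

/// Validate a write's column set entirely in memory, without a catalog
/// round-trip. Writes referencing only existing columns always pass, even if
/// the table is already over the limit.
fn check_column_limit(
    ns: &NamespaceSchema,
    table: &TableSchema,
    write_columns: &[&str],
) -> ColumnCheck {
    let new: Vec<_> = write_columns
        .iter()
        .filter(|c| !table.columns.contains(*c))
        .collect();

    if new.is_empty() || table.columns.len() + new.len() <= ns.max_columns_per_table {
        ColumnCheck::Ok
    } else {
        ColumnCheck::OverLimit {
            existing: table.columns.len(),
            attempted_new: new.len(),
            max: ns.max_columns_per_table,
        }
    }
}

fn main() {
    let ns = NamespaceSchema { max_columns_per_table: 3 };
    let table = TableSchema {
        columns: ["time", "cpu", "usage"].iter().map(|s| s.to_string()).collect(),
    };

    // Existing columns only: allowed even though the table is at the limit.
    assert_eq!(check_column_limit(&ns, &table, &["time", "usage"]), ColumnCheck::Ok);
    // A brand new column would exceed the limit: rejected with the details.
    assert!(matches!(
        check_column_limit(&ns, &table, &["time", "new_tag"]),
        ColumnCheck::OverLimit { .. }
    ));
}
```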
This doesn't really need to be fallible, but it forces propagation of a
ton of error handling: having no shards is always a sign of something
being very wrong, and it can be caught in the caller if it is, for some
reason, an acceptable / recoverable state.
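In other words, something like the following, where an empty shard set is treated as a programmer error rather than a recoverable Err (illustrative only, not the actual constructor):

```rust
#[derive(Debug)]
struct Shard(u32);

/// Illustrative shard set: constructing one with no shards is a configuration
/// bug, so panic rather than force every caller to thread a Result through.
#[derive(Debug)]
struct ShardSet(Vec<Shard>);

impl ShardSet {
    fn new(shards: Vec<Shard>) -> Self {
        assert!(!shards.is_empty(), "router requires at least one shard");
        Self(shards)
    }
}

fn main() {
    let shards = ShardSet::new(vec![Shard(0), Shard(1)]);
    println!("constructed {shards:?}");
    // `ShardSet::new(vec![])` would panic; a caller for which an empty set is
    // acceptable should check before constructing one.
}
```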
The Sequencer (which will be renamed shortly) is a type that represents
a single sequencer/shard/kafka partition in the router.
In order to minimise confusion with all the various IDs floating around,
we have a KafkaPartition - this commit changes the Sequencer to return
the Kafka partition index as a typed value rather than a usize, to help
eliminate any inconsistencies.
As a side effect of these conversion changes, I've tightened up the
casting to ensure we assert on any overflows - we juggle a lot of
numeric types!
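A sketch of the typed index and the overflow-checked conversion (the types are approximated for illustration):

```rust
/// Typed Kafka partition index, so it cannot be mixed up with the many other
/// numeric IDs in the write path.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct KafkaPartition(i32);

/// Illustrative sequencer/shard wrapper holding its position as a usize.
struct Sequencer {
    partition_index: usize,
}

impl Sequencer {
    /// Return the Kafka partition as a typed value, asserting (rather than
    /// silently truncating) if the usize does not fit into an i32.
    fn kafka_partition(&self) -> KafkaPartition {
        let idx = i32::try_from(self.partition_index)
            .expect("partition index overflows i32");
        KafkaPartition(idx)
    }
}

fn main() {
    let s = Sequencer { partition_index: 7 };
    assert_eq!(s.kafka_partition(), KafkaPartition(7));
}
```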
* feat: Log time spent requesting ingester partitions
Fixes #4558.
* feat: Record a metric for the duration queriers wait on ingesters
* fix: Use DurationHistogram instead of U64 Histogram
* test: Add a test for the ingester ms metric
* feat: Add back the logging to provide both logging and metrics for ingester duration
* refactor: Use sample_count method on metrics
* feat: Record ingester duration separately for success or failure (see the sketch after this list)
* fix: Create a separate test for the ingester metrics
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
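Conceptually, the querier times each ingester request and records the elapsed duration into a histogram per outcome, alongside the existing log line; a dependency-free sketch of that shape (the histogram here is a stand-in, not the real DurationHistogram type):

```rust
use std::time::{Duration, Instant};

/// Stand-in for a duration histogram: just collects observations here.
#[derive(Default)]
struct DurationHistogram {
    samples: Vec<Duration>,
}

impl DurationHistogram {
    fn record(&mut self, d: Duration) {
        self.samples.push(d);
    }
    fn sample_count(&self) -> usize {
        self.samples.len()
    }
}

/// Separate series per outcome, mirroring "record ingester duration
/// separately for success or failure".
#[derive(Default)]
struct IngesterWaitMetrics {
    success: DurationHistogram,
    error: DurationHistogram,
}

/// Placeholder for the real ingester RPC.
fn query_ingester() -> Result<(), ()> {
    Ok(())
}

fn main() {
    let mut metrics = IngesterWaitMetrics::default();

    let start = Instant::now();
    let result = query_ingester();
    let elapsed = start.elapsed();

    // Record into the histogram matching the request outcome, and log it too.
    match result {
        Ok(_) => metrics.success.record(elapsed),
        Err(_) => metrics.error.record(elapsed),
    }
    println!("ingester request took {elapsed:?}");

    assert_eq!(metrics.success.sample_count(), 1);
    assert_eq!(metrics.error.sample_count(), 0);
}
```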