This commit removes tombstone support from the ingester, and deletes
associated code/helpers/tests. This commit does NOT remove tombstone
support from any other service, but MAY include removing overlapping
test coverage.
This also removes the tombstone support from the Ingester -> Querier RPC
response message.
This has the nice side effect of removing a whole lot of thread spawning
in the ingester tests for the Executor, speeding everything up!
* feat: Add back rdkafka dependency
* feat: Remove RSKafkaProducer
* feat: Remove write buffer RecordAggregator
* feat: Add back rdkafka producer
Using code from 58a2a0b9c8311303c796495db4f167c99a2ea3aa then getting it
to compile with the latest
* feat: Add a metric around enqueue
* fix: Remove unused imports
* fix: Increase Kafka timeout to 20s
* docs: Clarify that Kafka topics should only be created in test/dev envs
* fix: Remove metrics that aren't needed for this experiment
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Changes the get() code path to abort the background load task when the
caller will resolve the sort key.
Note that an aborted future will leave the DeferredSortKey without a
background task to fetch the key, and the next caller will have to query
the catalog. Given the rarity of aborted futures, and desire to minimise
catalog load, this seems like a decent trade-off.
This commit also documents the many-readers eager loading problem.
* feat: initial step to identify where the projection should be provided
* feat: start getting columns of all expressions
* chore: format
* test: test for the table_chunk_stream
* fix: fix a compile error. Thanks @alamb
* test: full tests for table_chunk_stream
* chore: cleanup
* fix: do not cut any columns in case all fields are needed
* test: add one more test case of reading all columns
* refactor: move code that identify columbs ot push down to a function. Add the use of field_columns
* chore: cleanup
* refactor: make sream_from_batch support empty batches
* chore: cleanup
* chore: fix clippy after auto merge
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This commit carries the SortKey in the PartitionData, and configures the
ingester to use deferred sort key lookups, smearing the lookups across a
fixed period of time after initialising the PartitionData, instead of
querying for the sort key at persist time.
This allows large numbers of PartitionData to be initialised without
causing a equally large spike in catalog load to resolve the sort key -
instead this load is spread out randomly to reduce peak query rps.
Adds a new DeferredSortKey type that fetches a partition's sort key from
the catalog in the background, or on-demand if not yet pre-fetched.
From the caller's perspective, little has changed compared to reading it
from the catalog directly - the sort key is always returned when calling
get(), regardless of the mechanism, and retries are handled
transparently. Internally the sort key MAY have been pre-fetched in the
background between the DeferredSortKey being initialised, and the call
to get().
The background task waits a (uniformly) random duration of time before
issuing the catalog query to pre-fetch the sort key. This allows large
numbers of DeferredSortKey to (randomly) smear the lookup queries over a
large duration of time. This allows a large number of DeferredSortKey to
be initialised in a short period of time, without creating an equally
large spike in queries against the catalog in the same time period.
- treat OOM protection as "resource exhausted"
- use `DataFusionError` in more places instead of opaque `Box<dyn Error>`
- improve conversion from/into `DataFusionError` to preserve more
semantics
Overall, this improves our error handling. DF can now return errors like
"resource exhausted" and gRPC should now automatically generate a
sensible status code for it.
Fixes#5799.
Changes the ingester's NamespaceData to carry a ref-counted string
identifier as well as the ID.
The backing storage for the name in NamespaceData is shared with the
index map in ShardData, so it is effectively free!
This commit changes the persist() call so that it passes through all
relevant IDs so that the impl can locate the partition in the buffer
tree - this will enable elimination of many queries against the catalog
in the future.
This commit also cleans up the persist() impl, deferring queries until
the result will be used to avoid unnecessary load, improves logging &
error handling, and documents a TOCTOU bug in code:
https://github.com/influxdata/influxdb_iox/issues/5777
Changes the TableData to hold a map of partition key -> PartitionData,
and partition ID -> PartitionData simultaneously. This allows for cheap
lookups when the caller holds an ID.
This commit also manages to internalise the partition map within the
TableData - one less pub / peeking!
This commit also switches from a BTreeMap to a HashMap as the backing
collection, as maintaining key ordering doesn't appear to be necessary.
Changes the ShardData to hold a map of namespace name -> NamespaceData,
and namespace ID -> NamespaceData simultaneously.
This allows for cheap lookups when the caller holds an ID, and is part
of preparatory work to transition away from using string names in the
ingester for tables.
This commit also switches from a BTreeMap to a HashMap as the backing
collection, as maintaining key ordering doesn't appear to be necessary.
Changes the NamespaceData to hold a map of table name -> TableData, and
table ID -> TableData simultaneously.
This allows for cheap lookups when the caller holds an ID, and is part
of preparatory work to transition away from using string names in the
ingester for tables.
This commit also switches from a BTreeMap to a HashMap as the backing
collection, as maintaining key ordering doesn't appear to be necessary.