`None` was only used for testing, and even then we should probably have a
proper executor instead of panicking in some methods.
Found while working on #6216.
This commit implements the QueryExec trait for the BufferTree, allowing it
to be queried for the partition data it contains. With this change, the
BufferTree now provides "read your writes" functionality.
Notably, the implementation streams the contents of individual partitions
to the caller on demand (pull-based execution), deferring acquisition of
the partition lock until it is actually needed and minimising how long a
strong reference to a specific RecordBatch is held, keeping the memory
overhead low.
During query execution a client sees a consistent snapshot of
partitions: once a client begins streaming the query response, incoming
writes that create new partitions do not become visible. However,
incoming writes to an existing partition that forms part of the snapshot
set do become visible iff they are ordered before the acquisition of that
partition's lock when its data is streamed to the client.
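A minimal sketch of this pull-based approach, using stand-in types rather
than the real IOx ones (`PartitionData`'s fields, the lock layout, and the
stream item type are all assumptions):

```rust
use std::sync::{Arc, Mutex};

use futures::{stream, Stream, StreamExt};

/// Hypothetical per-partition buffer; the real structure differs.
struct PartitionData {
    name: String,
    rows: Vec<u64>,
}

/// A simplified stand-in for the BufferTree: a flat list of partitions,
/// each behind its own lock.
struct BufferTree {
    partitions: Vec<Arc<Mutex<PartitionData>>>,
}

impl BufferTree {
    /// Stream the partitions known at call time.
    ///
    /// The snapshot of *which* partitions exist is taken here, so new
    /// partitions created later are not visible. Each partition's lock is
    /// acquired only when the caller polls that element, so writes ordered
    /// before that acquisition still show up in the response.
    fn query_exec(&self) -> impl Stream<Item = (String, Vec<u64>)> {
        let snapshot = self.partitions.clone();
        stream::iter(snapshot).map(|p| {
            let guard = p.lock().expect("partition lock poisoned");
            (guard.name.clone(), guard.rows.clone())
        })
    }
}
```

Cloning the Arc handles up front fixes the partition set, while all
per-partition work is deferred until the client actually asks for it.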
Allow the return type of the QueryExec trait's query_exec() method to be
parametrised by the implementer.
This allows the trait to be reused across different data sources that
return differing concrete types.
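A rough sketch of what such a parametrised trait could look like; the
associated type name, method parameters, and error type below are
assumptions for illustration, not the actual ingester2 API.

```rust
use async_trait::async_trait;

/// Illustrative error type; the real crate defines its own.
#[derive(Debug)]
struct QueryError(String);

#[async_trait]
trait QueryExec: Send + Sync {
    /// The concrete response type produced by each implementer, letting
    /// the same trait front data sources that return different types.
    type Response: Send;

    async fn query_exec(
        &self,
        namespace_id: i64,
        table_id: i64,
        columns: Vec<String>,
    ) -> Result<Self::Response, QueryError>;
}
```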
Changes the query code (taken from the ingester crate) to stream data for
query execution, tidies up unnecessary Result types, and removes
unnecessary indirection/boxing.
Previously the query data sourcing would collect the set of RecordBatch
for a query response during execution, prior to sending the data to the
caller. If buffered data was dropped or modified during this window, the
underlying ref-counted data could not be released from memory until every
outstanding query referencing it had completed. With multiple concurrent
queries and ongoing ingest, this meant multiple copies of the data could
be held in memory at any one time.
After this commit, data is streamed to the user, minimising the duration
of time a reference to specific partition data is held, and therefore
eliminating the memory overhead of holding onto all the data necessary
for a query for as long as the client takes to read the data.
When combined with an upcoming PR to stream RecordBatch out of the
BufferTree, this should provide performant query execution with minimal
memory overhead, even for a maliciously slow reading client.
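The difference can be sketched roughly as follows; `Batch` stands in for a
ref-counted `Arc<RecordBatch>` and the frame encoding is a placeholder, so
none of these names are the real IOx types.

```rust
use std::sync::Arc;

use futures::{Stream, StreamExt};

/// Stand-in for a ref-counted RecordBatch.
type Batch = Arc<[u8]>;

/// Before: materialise every batch up front. All of them stay referenced
/// (and therefore pinned in memory) until the slowest client has read the
/// whole response.
async fn query_response_eager(source: impl Stream<Item = Batch>) -> Vec<Batch> {
    source.collect::<Vec<_>>().await
}

/// After: map each batch to its response frame lazily. A batch reference
/// is only taken when the client polls for it and is dropped once the
/// frame is built, so a slow reader pins at most one batch at a time.
fn query_response_streaming(
    source: impl Stream<Item = Batch> + Send + 'static,
) -> impl Stream<Item = Vec<u8>> + Send + 'static {
    source.map(|batch| batch.to_vec()) // placeholder for Flight encoding
}
```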
Useful because it updates `zstd` to 0.12. With the upcoming `parquet`
update, we can then drop `zstd` 0.11.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: avoid channels to create a one-element stream (see the sketch below)
* refactor: move `StreamWithPermit` into its own module
* refactor: make `QueryCompletedToken` handling stream-based
For #6216.
* docs: improve
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
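For the one-element-stream case specifically (first bullet above), the
channel round-trip can be replaced with `futures::stream::once`; this
snippet just illustrates the pattern and is not the code from the PR.

```rust
use futures::{executor::block_on, future::ready, stream, StreamExt};

fn main() {
    block_on(async {
        // Instead of allocating an mpsc channel, pushing a single element
        // into it, and exposing the receiver as a stream, build the
        // one-element stream directly from the value:
        let mut one = stream::once(ready(42_u64));

        assert_eq!(one.next().await, Some(42));
        assert_eq!(one.next().await, None);
    });
}
```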
* fix: ignore fields when considering tag predicates
* chore: update test to not use time column in predicate
* chore: update with review feedback
* chore: update tests to avoid fields refs in RPC preds
This is more like what would be coming off the wire from
Influx RPC.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: avoid `tokio::spawn` for cache requests
During the happy path, we don't need any tokio task to drive a cache
loader request because the future issuing the request simply acts as the
driver. Only if that future is cancelled do we place the cache request in
an extra task. This avoids latencies due to task overhead and (task)
context switches for most requests. It may only shave a millisecond or two
off latency, but it also makes the whole thing easier to analyze/profile
because we don't spawn a truckload of tasks.
This trick was borrowed from rskafka. (A sketch of the pattern follows
this list.)
* refactor: split up code
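A sketch of the spawn-only-on-cancellation trick described under "avoid
`tokio::spawn` for cache requests" above; the wrapper name and the
simplified `()` output are illustrative assumptions, not the actual IOx or
rskafka code.

```rust
use std::{
    future::Future,
    pin::Pin,
    task::{Context, Poll},
};

use futures::future::BoxFuture;

/// Drives `inner` as part of the caller's own poll loop. If the caller is
/// cancelled (this wrapper is dropped) before `inner` completes, the
/// remaining work is handed to a dedicated tokio task so the cache load
/// still finishes. Assumes a tokio runtime is running.
struct CompleteOnDrop {
    inner: Option<BoxFuture<'static, ()>>,
}

impl CompleteOnDrop {
    fn new(fut: BoxFuture<'static, ()>) -> Self {
        Self { inner: Some(fut) }
    }
}

impl Future for CompleteOnDrop {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let fut = self.inner.as_mut().expect("polled after completion");
        match fut.as_mut().poll(cx) {
            Poll::Ready(()) => {
                // Completed in-line on the caller's task: nothing to spawn.
                self.inner = None;
                Poll::Ready(())
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

impl Drop for CompleteOnDrop {
    fn drop(&mut self) {
        if let Some(fut) = self.inner.take() {
            // Caller went away mid-request: only now pay for a task.
            tokio::spawn(fut);
        }
    }
}
```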
Implement a QueryExec decorator that emits named tracing spans covering
the inner delegate's query_exec() execution.
Captures the result, emitting the error string in the span on failure.
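A sketch of what such a decorator could look like over a simplified
version of the QueryExec trait (see the earlier sketch); it uses the
`tracing` crate for illustration, whereas the real code uses IOx's own
span types, and the plain String error is an assumption.

```rust
use async_trait::async_trait;
use tracing::Instrument;

#[async_trait]
trait QueryExec: Send + Sync {
    type Response: Send;
    async fn query_exec(&self, table_id: i64) -> Result<Self::Response, String>;
}

/// Decorator that wraps an inner `QueryExec`, covering each call with a
/// named span and recording the error string when the call fails.
struct QueryExecTracing<T> {
    inner: T,
}

#[async_trait]
impl<T> QueryExec for QueryExecTracing<T>
where
    T: QueryExec,
{
    type Response = T::Response;

    async fn query_exec(&self, table_id: i64) -> Result<Self::Response, String> {
        // Declare the error field up front so it can be filled in later.
        let span = tracing::info_span!("query_exec", error = tracing::field::Empty);

        let res = self
            .inner
            .query_exec(table_id)
            .instrument(span.clone())
            .await;

        if let Err(e) = &res {
            // Surface the failure on the span itself.
            span.record("error", e.as_str());
        }
        res
    }
}
```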
This commit implements the gRPC direct-write RPC interface (largely
copied from the ingester crate), and adds a much improved RPC query
handler.
Compared to the ingester crate, the query API is now split into two
defined halves: the API handler side and the types necessary to support it
(server/grpc/query.rs), and the ingester query execution side (a stub in
query/exec.rs). These two halves maintain a separation of concerns and are
interfaced by an abstract QueryExec trait (in query/trait.rs).
I also added the catalog RPC interface as it is currently exposed on the
ingester, though I am unsure if it is used by anything.
This commit also introduces the "init" module, and the
IngesterRpcInterface trait within it. This trait forms the public
ingester2 crate API, defining the complete set of methods external
crates can expect to utilise in a stable, unchanging and decoupled way.
The IngesterRpcInterface trait also serves as a means of type erasure over
the underlying handler implementations, avoiding the need to expose/pub
the types, abstractions, and internal implementation details of the
ingester to external crates.
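A sketch of the type-erasure idea behind IngesterRpcInterface; the
associated type and method names below are assumptions, not the real API,
and the service traits are stand-ins for the generated tonic services.

```rust
use std::fmt::Debug;

// Stand-ins for the generated tonic service traits.
trait CatalogService {}
trait WriteService {}
trait FlightService {}

/// The public surface the ingester2 crate exposes to the rest of the
/// system. Callers depend only on this trait, never on the concrete
/// handler types behind it, so internals can change freely.
trait IngesterRpcInterface: Send + Sync + Debug {
    /// Concrete handler types stay private to the crate; only the traits
    /// they implement are visible to callers.
    type CatalogHandler: CatalogService;
    type WriteHandler: WriteService;
    type FlightHandler: FlightService;

    fn catalog_service(&self) -> Self::CatalogHandler;
    fn write_service(&self) -> Self::WriteHandler;
    fn query_service(&self) -> Self::FlightHandler;
}
```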
* fix: `identifier` parser consumes preceding whitespace
* chore: Update module docs
* feat: Add function to identify when an identifier requires quotes
* feat: Add ability to deref OneOrMore to its vector representation (see the sketch after this list)
This feature was used as part of the InfluxQL logical planner in IOx.
* chore: Feedback, prefer slice `[T]` rather than `Vec<T>`
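To make the last two items concrete, here is a rough sketch of a non-empty
collection that derefs to a slice; the field name and constructor are
assumptions, not the actual influxdb_influxql_parser implementation.

```rust
use std::ops::Deref;

/// A non-empty collection: construction guarantees at least one element.
#[derive(Debug)]
struct OneOrMore<T> {
    contents: Vec<T>,
}

impl<T> OneOrMore<T> {
    fn new(contents: Vec<T>) -> Self {
        assert!(!contents.is_empty(), "OneOrMore requires at least one element");
        Self { contents }
    }
}

impl<T> Deref for OneOrMore<T> {
    // Deref to a slice rather than `Vec<T>`, per the review feedback:
    // callers get iteration and indexing without access to Vec-specific
    // methods that could grow or shrink the collection.
    type Target = [T];

    fn deref(&self) -> &Self::Target {
        &self.contents
    }
}

fn main() {
    let idents = OneOrMore::new(vec!["cpu", "mem"]);
    assert_eq!(idents.len(), 2);
    assert_eq!(idents.first(), Some(&"cpu"));
}
```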