influxdb/schema/src
Dom Dwyer 7d0e3637ed
perf(ingester): projection pushdown to data source
Prior to this change, projection pushdown was implemented as a filter,
which meant a query using it would take the following steps:

    * Query arrives
    * Find necessary partition data
    * Copy all the partition data into a RecordBatch
    * Filter that RecordBatch to apply the projection
    * Return results to the caller

This is far from ideal, as the underlying partition data is copied in
its entirety and the unneeded columns are then discarded - a pure waste!

After this PR, the projection is pushed down to the point of RecordBatch
generation:

    * Query arrives
    * Find necessary partition data
    * Copy only the projected columns to a RecordBatch
    * Return results to the caller

This minimises the amount of data copying, which for large amounts of
data should lead to a meaningful performance improvement when querying
for a subset of columns. It also uses a slightly more efficient
projection implementation that makes a single pass over the columns
(still O(n), but with less constant overhead).
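
The difference between the two flows can be sketched roughly as follows.
This is an illustration only, not the ingester code: `Column`,
`copy_then_filter`, `project_at_source` and `example_partition` are
hypothetical names, plain `Vec<i64>` columns stand in for Arrow arrays,
and the returned counter exists purely to make the amount of copying
visible.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one buffered column of partition data.
#[derive(Debug, Clone, PartialEq)]
pub struct Column {
    pub name: String,
    pub values: Vec<i64>,
}

/// Old flow: copy every column, then filter the copy down to the projection.
/// Returns the projected columns and the number of columns that were copied.
pub fn copy_then_filter(partition: &[Column], projection: &[&str]) -> (Vec<Column>, usize) {
    // Every column is cloned, including those the query never asked for.
    let all: Vec<Column> = partition.to_vec();
    let copied = all.len();

    let wanted: HashSet<&str> = projection.iter().copied().collect();
    let kept: Vec<Column> = all
        .into_iter()
        .filter(|c| wanted.contains(c.name.as_str()))
        .collect();
    (kept, copied)
}

/// New flow: a single pass over the stored columns that clones only the
/// columns named in the projection.
pub fn project_at_source(partition: &[Column], projection: &[&str]) -> (Vec<Column>, usize) {
    let wanted: HashSet<&str> = projection.iter().copied().collect();
    let kept: Vec<Column> = partition
        .iter()
        .filter(|c| wanted.contains(c.name.as_str()))
        .cloned()
        .collect();
    let copied = kept.len();
    (kept, copied)
}

/// Small made-up partition with three columns.
pub fn example_partition() -> Vec<Column> {
    ["time", "host_id", "temp"]
        .iter()
        .map(|name| Column {
            name: name.to_string(),
            values: vec![1, 2, 3],
        })
        .collect()
}
```

Both functions return the same projected columns in storage order. With
the three-column example partition and a projection of `time` and `temp`,
the first copies three columns while the second copies only two.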
2023-07-05 13:44:11 +02:00
builder.rs perf(ingester): projection pushdown to data source 2023-07-05 13:44:11 +02:00
interner.rs refactor: cleanup schema boxing (#6511) 2023-01-06 10:57:39 +00:00
lib.rs build: remove unused dependencies from crates 2023-05-23 14:55:43 +02:00
merge.rs refactor: don't Arc-wrap RecordBatch instances 2023-07-05 13:44:09 +02:00
projection.rs fix: Move variables within format strings. Thanks clippy! 2023-02-03 13:06:17 -05:00
sort.rs refactor: sort key cleanups (#7113) 2023-03-02 16:08:21 +00:00