83f77712b1
* refactor: querier<>ingester flight protocol adjustments

  This makes a few adjustments to the querier<>ingester flight protocol.

  Query Scope
  ===========
  The querier will request data for ALL sequencer IDs for now. There is no reason to have a request per sequencer ID. We can add a range/set filter later if we want, but this is not required for now.

  Partition-level
  ===============
  The only time when the querier cares about sequencer IDs (i.e. sharding) at all is when it selects which ingesters to ask for unpersisted data (this is currently not implemented, it just asks all ingesters). Afterwards the querier only cares about partitions (which are bound to specific sequencers anyways) because this is the level where parquet file persistence and compaction as well as deduplication happen. So we make partitions a first-class citizen in the ingester response.

  Metadata VS RecordBatches
  =========================
  The global app-metadata will list all partitions and their max persisted parquet files and tombstones (theoretically tombstones are at table-level, but the ingester could in the future break them down to the partition-level). The querier then receives a stream of record batches. Each record batch is tagged (via key-value metadata in its schema) so it can be assigned to a partition. At the moment the ingester returns 0 or 1 batches per unpersisted partition (0 in case we've filtered out all the data via the predicate), but in the future it is free to return multiple batches.

  This setup gives the ingester more freedom over memory management and (potentially parallel) query processing, while at the same time keeping the set of duplicated information minimal and allowing easy extensions (since the global metadata is a full-blown protobuf message).

  Querier
  =======
  At the moment the querier ignores all the metadata. Follow-up PRs will change that.
* docs: improve

  Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: make code clearer

  Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

- protos
- src
- .gitignore
- Cargo.toml
- build.rs
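
The partition tagging described in the commit message can be illustrated with a small sketch. It is not the code from this crate: `TaggedBatch`, `PartitionStatus`, `tag_batch`, `group_by_partition`, the `partition_id` key name, and the `parquet_max_sequence_number` field are all hypothetical stand-ins, and plain standard-library types replace the real Arrow record batches and protobuf metadata. It only shows the idea that the global metadata lists every partition while each batch carries a key-value tag that lets the querier assign it to one of them.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical metadata key; the real key name is not given in the commit message.
const PARTITION_ID_KEY: &str = "partition_id";

/// Stand-in for a record batch: only the schema-level key-value metadata
/// and a row count are modelled here.
#[derive(Debug, Clone, PartialEq)]
pub struct TaggedBatch {
    pub schema_metadata: HashMap<String, String>,
    pub num_rows: usize,
}

/// Stand-in for the per-partition entry in the global app-metadata
/// (hypothetical field; tombstones are omitted).
#[derive(Debug, Clone, PartialEq, Default)]
pub struct PartitionStatus {
    pub parquet_max_sequence_number: Option<i64>,
}

/// Ingester side: tag a batch with the partition it belongs to.
pub fn tag_batch(partition_id: i64, num_rows: usize) -> TaggedBatch {
    let mut schema_metadata = HashMap::new();
    schema_metadata.insert(PARTITION_ID_KEY.to_string(), partition_id.to_string());
    TaggedBatch { schema_metadata, num_rows }
}

/// Querier side: assign each batch of the stream to a partition listed in the metadata.
/// Every listed partition appears in the result, possibly with zero batches.
pub fn group_by_partition(
    statuses: &BTreeMap<i64, PartitionStatus>,
    batches: Vec<TaggedBatch>,
) -> Result<BTreeMap<i64, Vec<TaggedBatch>>, String> {
    // Start with an empty batch list for every partition named in the metadata.
    let mut grouped: BTreeMap<i64, Vec<TaggedBatch>> =
        statuses.keys().map(|id| (*id, Vec::new())).collect();

    for batch in batches {
        let raw = batch
            .schema_metadata
            .get(PARTITION_ID_KEY)
            .ok_or_else(|| format!("batch is missing the `{}` tag", PARTITION_ID_KEY))?;
        let id: i64 = raw
            .parse()
            .map_err(|e| format!("invalid partition tag `{}`: {}", raw, e))?;
        // A batch for a partition that the metadata never mentioned is treated as an error.
        grouped
            .get_mut(&id)
            .ok_or_else(|| format!("partition {} is not listed in the metadata", id))?
            .push(batch);
    }
    Ok(grouped)
}
```

Because grouping is keyed on the tag rather than on stream position, a sketch like this would keep working if the ingester later returned several batches per partition, or interleaved batches from different partitions.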