docs: high-level documentation for `compactor2` (#6818)

* docs: high-level documentation for `comapctor2` * docs: typos & clarification Co-authored-by: Nga Tran <nga-tran@live.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com> * docs: improve wording * fix: typo --------- Co-authored-by: Nga Tran <nga-tran@live.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-06 09:37:57 +01:00 · 2023-02-06 09:37:57 +01:00 · cd6d69c9b1
parent 6f4e287a3a
commit cd6d69c9b1
2 changed files with 204 additions and 0 deletions
--- a/compactor2/img/driver.svg
+++ b/compactor2/img/driver.svg
--- a/compactor2/src/lib.rs
+++ b/compactor2/src/lib.rs
@ -1,4 +1,162 @@
 //! The compactor.
+//!
+//!
+//! # Mission statement
+//!
+//! This processes parquet files produced by the ingesters so that queriers can read them more efficiently. More precisely it
+//! tries to meet the following -- partially conflicting -- objectives:
+//!
+//! - **O1 -- duplicates:** Duplicate across files should be removed[^duplicates_within_files]. Removing duplicates
+//!   during query is costly.
+//! - **O2 -- overlaps:** The key-space (tags + timestamp) within parquet files should be strictly distinct. Without this
+//!   attribute the querier has now way to detect the inter-file key uniqueness and hence must run costly
+//!   de-duplications during query time.
+//! - **O3 -- minimize number of  files:** There should be as few files as possible. Every file comes with
+//!   overhead both within the catalog and on the querier side.
+//! - **O4 -- max file size:** Parquet files must have a maximum (configurable) size (with a certain margin on top due
+//!   for technical reasons[^max_parquet_size]). Files that are too large will cause memory issues with querier
+//! - **O5 -- minimal changes:** The changes to the set of parquet files should be as minimal as possible to increase cache
+//!   efficiency on the querier side,  reduce write amplification (rewriting the same row over and over again) and
+//!   decrease the catalog load (each file creation / deletion requires catalog updates).
+//! - **O6 -- catch-up:** The compactor should be able to not "fall behind" the ingest tier.
+//! - **O7 -- costs:** The compactor shall be cost-efficient.
+//!
+//! From these objectives, we can derive the following additional requirements -- some more and some less obvious ones:
+//!
+//! - **R1 -- avoid re-compaction:** Avoid re-processing the same data multiple times. This can be derived from *O5*,
+//!   *O7*, and potentially *O6*.
+//! - **R2 -- parallelization:** Be able to run multiple compaction jobs at the same time. Derived from *O6* and *O7*.
+//! - **R3 -- scale-out:** Be able to scale to multiple nodes. Derived from *O6* but may very much depend on *R2*.
+//! - **R4 -- concurrent compute & IO:** Be able to decouple IO (and the involved latencies) from the actual
+//!   computation. Derived from *O6* and *O7*.
+//!
+//!
+//! # Crate Layout
+//!
+//! This crate tries to decouple "when to do what" from "how to do what". The "when" is described by the [driver] which
+//! provides an abstract framework. The "how" part is described by [components] which act as puzzle pieces that are
+//! plugged into the [driver].
+//!
+//! The concrete selection of components and their setup is currently [hardcoded](crate::components::hardcoded). This
+//! setup process take a [config] and creates a [component setup](crate::components::Components).
+//!
+//! The final flow graph looks like this:
+//!
+//! ```text
+//! (CLI/Env) --> (Config) --[hardcoded]--> (Components) --> (Driver)
+//! ```
+//!
+//! ## Driver
+//! The driver is the heart of the compactor and describes and executes the framework.
+//!
+//! This is the data and control flow of driver:
+#![doc = include_str!("../img/driver.svg")]
+//!
+//! In general modification of the driver should be rare. Most changes should be implement by using components.
+//!
+//! ## Components
+//! Components can be thought as puzzle pieces used by the driver. Technically there are two parts to it: the driver
+//! interfaces (traits) and the implementations.
+//!
+//! **Note: The code samples used in the "components" section are simplified and may not reflect actual components. They are
+//! mostly illustrations.**
+//!
+//! ### Interfaces
+//! Component interfaces should describe one single aspect of the framework. For example:
+//!
+//! ```
+//! # use std::fmt::{Debug, Display};
+//! # use data_types::ParquetFile;
+//! /// Calculate importance of a file.
+//! pub trait FileImportants: Debug + Display + Send + Sync {
+//!     /// Rates file by importance.
+//!     ///
+//!     /// Higher ratings mean "more important".
+//!     fn rate(&self, file: &ParquetFile) -> u64;
+//! }
+//! ```
+//!
+//! Looking this interface you will note the following aspects:
+//!
+//! - **`Debug`:** Allows to use the component in our structs that implement [`Debug`].
+//! - **`Display`:** Allows to produce easy human (and machine) readable dumps of the current component setup. Also
+//!   allows other components to implement [`Display`] more easily.
+//! - **immutable:** All methods take `&self`. Interior mutability (e.g. for caching) is allowed. This allows easy parallelization.
+//! - **not-consuming:** No method takes `self`. This ensures that the trait is object-safe and can be put into an [`Arc`].
+//! - **`Send + Sync`:** Together with the the two previous points this allows the usage within an [`Arc`].
+//! - **single method:** Most interfaces will only have a single method. This is because they have one dedicated job.
+//!
+//! The above interface will not allow any IO because it is not async. It is important to keep the interfaces that
+//! perform IO and the ones that don't clearly apart, so it is visible to the user and allows a proper driver design. An
+//! interface that performs IO looks like this:
+//!
+//! ```
+//! # use std::fmt::{Debug, Display};
+//! # use async_trait::async_trait;
+//! # use data_types::{Partition, PartitionId};
+//! /// Fetches partition info.
+//! #[async_trait]
+//! pub trait PartitionSource: Debug + Display + Send + Sync {
+//!     /// Fetches partition info.
+//!     ///
+//!     /// This method retries internally.
+//!     async fn fetch(&self, id: PartitionId) -> Partition;
+//! }
+//! ```
+//!
+//! For IO interfaces, all the points of the normal interface apply. In addition, the following points are important:
+//!
+//! - **error & retries:** Most IO interfaces are NOT allowed to return an error. Instead they shall implement internal
+//!   retries (e.g. using [`backoff`]).
+//!
+//! ### Implementations
+//! The implementations of the aforementioned interfaces roughly fall into the following categories:
+//!
+//! - **true sources/sinks:** These are the actual (mostly object-store and catalog backed) implementations used in
+//!   production. There may be multiple variants that are used depending on the config. NO metrics or logging is
+//!   implemented directly within these (see "wrappers"  below).
+//! - **assemblies:** Assembles components of other types into this type. E.g. a filter for a set of files
+//!   (`[file] -> [file]`) can be assembled by a filter for a single file (`file -> bool`).
+//! - **mocks:** Mocks can be controlled during initialization or later and may record the interfaction with them. They
+//!   are mostly used for testing or development purposes.
+//! - **wrappers:** Wrap a component of the same type but add functionality on top. This can be logging, metrics,
+//!   assertions, data dumping, filtering, randomness, etc.
+//!
+//! This modular approach allows easy testing of components as well consistency. E.g. swapping out a data source for
+//! another one will preserve the same logs and metrics (given that the same wrappers are used).
+//!
+//! ## Configuration
+//! The [config] allows the compactor to set up a wide range of component setups. The config options fall roughly into
+//! the following categories:
+//!
+//! - **parameters:** Information required to run the compactor, e.g. the object store connection or sharding information.
+//! - **tuning knobs:** Options like "maximum desired file size", or "number of concurrent jobs" allow deployments to
+//!   fine-tune the compactor behavior. They do NOT fundamentally change the behavior though.
+//! - **behavior switches:** Switches like "ignore that partitions where flagged", "run once and exit", or "use this
+//!   pre-defined set of partitions" change the behavior of the system. This may be used for emergency situations or to
+//!   debug / develop the compactor.
+//! - **algorithm versions:** This sets [`AlgoVersion`] and fundamentally changes the behavior, e.g. from a simple demo
+//!   approach to some experimental new system. Note that this only changes the setup of the components, it does NOT
+//!   change the driver.
+//!
+//!
+//! [^duplicates_within_files]: The compactor design would not fundamentally change if the ingester would produce output
+//!     that contains duplicates within in single file. The current implementation however assumes that this is NOT the case.
+//!
+//! [^max_parquet_size]: It is technically hard to determine the size of a parquet size before writing it. It may be
+//!     estimated by assuming that "sum size of input files >= sum size of output files". However there are certain
+//!     cases where this does NOT hold due to the way the data within the output parquet file is split into data pages,
+//!     is then potentially dictionary- and RLE-encoded and [ZSTD] compressed.
+//!
+//!
+//! [`AlgoVersion`]: crate::config::AlgoVersion
+//! [`Arc`]: std::sync::Arc
+//! [components]: crate::components
+//! [config]: crate::config
+//! [`Debug`]: std::fmt::Debug
+//! [`Display`]: std::fmt::Display
+//! [driver]: crate::driver
+//! [ZSTD]: https://github.com/facebook/zstd
 #![deny(rustdoc::broken_intra_doc_links, rust_2018_idioms)]
 #![warn(
    missing_copy_implementations,
@ -10,6 +168,7 @@
    clippy::todo,
    clippy::dbg_macro
 )]
+#![allow(rustdoc::private_intra_doc_links)]

 pub mod compactor;
 mod components;