`persist_row_threshold` should limit the rows of the post-compaction
output chunk (and hence the sum of rows over the input chunks), not
the number of rows of each individual input chunk.
Fixes#1874.
There is no need to recompile the entire `query_tests` crate when the
CONTENT (not the SET) of the test cases changes, e.g. due to new
optimizations, datafusion upgrades, query additions, etc. We now check
if `cases.rs` really changed before touching it, so that Cargo can rely
on the files mtime.
Beforehand:
```text
❯ env CARGO_LOG=cargo::core::compiler::fingerprint=info cargo test -p query_tests
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] stale: changed "/home/mneumann/src/influxdb_iox/query_tests/cases"
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] (vs) "/home/mneumann/src/influxdb_iox/target/debug/build/query_tests-0e8f741dfb84437f/output"
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] FileTime { seconds: 1625474716, nanos: 436081357 } != FileTime { seconds: 1625474752, nanos: 52625167 }
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/Test/TargetInner { ..: lib_target("query_tests", ["lib"], "/home/mneumann/src/influxdb_iox/query_tests/src/lib.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] err: current filesystem status shows we're outdated
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/RunCustomBuild/TargetInner { ..: custom_build_target("build-script-build", "/home/mneumann/src/influxdb_iox/query_tests/build.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] err: current filesystem status shows we're outdated
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/Build/TargetInner { ..: lib_target("query_tests", ["lib"], "/home/mneumann/src/influxdb_iox/query_tests/src/lib.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] err: current filesystem status shows we're outdated
Compiling query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)
```
The issue is that both the input and the test output files are located
under `cases/`. `build.rs` used `cargo:rerun-if-changed=cases` which per
Cargo doc will scan ALL files in that directory. Note that the normal
`exclude` directive in `Cargo.toml` does NOT work, see
https://github.com/rust-lang/cargo/issues/4587 .
So we need to split input and output files into separate directories
(`cases/{in,out}`).
The entire persistence windows data structures (including the
checkpoints) have nothing to do with the mutable buffer per se. So lets
move them into their own crate. This also makes `parquet_file` not
longer depend on `mutable_buffer`.
Since adding new features like "sequencer replay" or init retries would
make the current code too complex, a refactor is required:
Config:
The config struct now holds a `DatabaseState` which is a simple linear
state machine representing the different stages of the database init.
Init:
The init module now has a fixpoint-loop which looks at the state,
decides what to do based on it and repeats until either the DB is
initialized or an error occured. This also makes it easier to continue
the init process "in the middle", e.g. when the preserved catalog is
broken or the sequencer (e.g. Kafka) could not be reached.