* refactor: split influxdb_ioxd, clap_blocks, and serving_readiness out of influxdb_iox
split out serving readiness, get compiling
* fix: hakari
* fix: hakari again
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: port sqlx-hotswap-pool over from conductor
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
* chore: workspace hack fixes
* fix: unique schema per test db connection
* fix: adjust search path in catalog pg tests to see if it fixes test schema issue
* fix: actually fixed sqlx hotswap pool test
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: remove references to perf_image in CI
* chore: adding gitops adapter image build in CI
* chore: gitops adapter bin now same as dir & package so docker build works
* fix: circle config package change after renaming gitops adapter package
* feat: Add a way to run ingester with an in-memory catalog from the CLI
If you set the --catalog-dsn string to "mem", rather than using that as
a Postgres connection URL, create an in-memory catalog.
Planning on using this in tests, so not documenting.
* fix: Set default topic to the same value as SHARED_KAFKA_TOPIC
Namely, both should use an underscore. I don't think there's a way to
directly share these values between a constant and an annotation.
* feat: Add a flight API (handshake only) to ingester
* fix: Create partitions if using file-based write buffer
* fix: Change the server fixture to handle ingester server type
For now, the ingester doesn't implement the deployment API. Not sure if
it should or not.
* feat: Start implementing ingester do_get, namely decoding the query
Skip serialization of the predicate for the moment.
* refactor: Rename ingest protos to ingester to match crate name
* refactor: Rename QueryResults to QueryData
* feat: Move ingester flight client to new querier crate
* fix: Off by one error, different starting indexes in sequencers
* fix: Create new CLI argument to pick the catalog type
* fix: Create a CLI option to set the number of topics to auto-create in the write buffer
* fix: Check the arrow flight service's health to tell that the ingester gRPC is up
* fix: Set postgres as the default catalog type
* fix: Return an error rather than panicking if CLI args aren't right
* refactor: Extract JobRegistry from the server crate
Both the server crate and a db crate that I'm about to extract depend on
JobRegistry, so to avoid making circular dependencies, extract the
JobRegistry to its own crate.
* refactor: Move db out of server into its own crate
Fixes #2821.
* fix: Add tokio rt-multi-thread feature so cargo test -p client_util compiles
* fix: Alphabetize dependencies
* fix: Add the data_types_conversions feature to get tests passing
* fix: Remove dev dependencies already listed under normal dependencies
* fix: Make sure the workspace is using the new resolver
Use `codegen-units = 1`, thin-LTO and debug section compression to make our binary smaller (which is good for deploy and
test times) and faster.
# Summary
The binary size of `influxdb_iox` was measured after building with:
```console
$ cargo build --release --no-default-features --features="aws,gcp,azure,jemalloc_replacing_malloc"
```
The profile was:
```toml
[profile.release]
debug = true
```
The commit was:
```text
89ece8b493
```
The size results are:
| Method | Size |
| ------------------------------------------ | ----- |
| baseline | 833MB |
| baseline + dbg compression | 222MB |
| baseline + strip | 49MB |
| codegen-units | 520MB |
| codegen-units + strip | 40MB |
| codegen-units + dbg compression | 143MB |
| thin LTO | 715MB |
| thin LTO + strip | 49MB |
| thin LTO + dbg compression | 199MB |
| codegen-units + thin LTO | 449MB |
| codegen-units + thin LTO + strip | 40MB |
| codegen-units + thin LTO + dbg compression | 130MB |
For the methods that were measured successfully, I couldn't see any meaningful compile-time differences on my laptop.
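To put the table in relative terms, the reduction against the 833MB baseline can be computed directly. The following is only a small sketch: the sizes are copied from the table above, and the helper name `reduction_percent` is made up for illustration.

```rust
/// Percentage by which `size_mb` is smaller than `baseline_mb`.
/// (Hypothetical helper, not part of the IOx code base.)
fn reduction_percent(baseline_mb: f64, size_mb: f64) -> f64 {
    (baseline_mb - size_mb) / baseline_mb * 100.0
}

fn main() {
    // Sizes in MB, copied from the results table above.
    let baseline = 833.0;
    let rows = [
        ("baseline + dbg compression", 222.0),
        ("codegen-units + dbg compression", 143.0),
        ("thin LTO + dbg compression", 199.0),
        ("codegen-units + thin LTO + dbg compression", 130.0),
    ];

    for (method, size) in rows {
        let pct = reduction_percent(baseline, size);
        println!("{method}: {pct:.1}% smaller than baseline");
    }
}
```

For the combined configuration (codegen-units + thin LTO + dbg compression, 130MB) this works out to a reduction of roughly 84% while keeping the debug symbols.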
# Methods
## Strip
Remove debug symbols. We don't really want this, so it is only here to give an idea of the size:
```console
$ strip baseline
```
## Debug Sections compression
Debug sections make up a large part of our binary size (a stripped executable is 49MB instead of 833MB). Since we like
to have debug symbols, we cannot just strip them. However, these symbols are only used for:
- backtrace generation (something went wrong, not BAU)
- profiling
- debugging
So in normal operation and in most test scenarios, we're just wasting memory. We can compress them instead:
```console
$ objcopy --compress-debug-sections baseline baseline-dbg_compressed
```
There is also elfutils:
```console
$ eu-elfcompress test
```
Elfutils ends up at nearly the same size (220MB instead of the 222MB that objcopy achieves), but takes more time and is
probably not worth it.
Note that compressed debug sections have existed for many years. The Rust ecosystem has supported reading them for over
a year, see:
- <https://github.com/gimli-rs/gimli/issues/195>
- <https://github.com/rust-lang/backtrace-rs/issues/342>
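If the compression step were to be driven from Rust (for example from a build helper), the `objcopy` invocation above could be wrapped roughly as follows. This is a sketch only: the function name `compress_debug_sections_cmd` and the command-line handling in `main` are assumptions for illustration, and it expects `objcopy` to be on the `PATH`.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) the `objcopy --compress-debug-sections` invocation.
/// (Hypothetical helper, not part of the IOx code base.)
fn compress_debug_sections_cmd(input: &Path, output: &Path) -> Command {
    let mut cmd = Command::new("objcopy");
    cmd.arg("--compress-debug-sections").arg(input).arg(output);
    cmd
}

fn main() -> std::io::Result<()> {
    // Expect exactly two arguments: the input binary and the output path.
    let args: Vec<String> = std::env::args().skip(1).collect();
    if let [input, output] = args.as_slice() {
        let status = compress_debug_sections_cmd(Path::new(input), Path::new(output)).status()?;
        if !status.success() {
            eprintln!("objcopy exited with {status}");
        }
    } else {
        eprintln!("usage: <input-binary> <output-binary>");
    }
    Ok(())
}
```

Keeping the command construction separate from running it makes the helper easy to check without having `objcopy` installed.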
## Codegen Units
The Rust compiler parallelizes codegen work. However, this split into units means that optimizations are somewhat
limited. This can be changed by:
```toml
[profile.release]
...
codegen-units = 1
```
As a nice side effect this should also make our code faster.
## Thin LTO
Get LLVM to run "thin" Link Time Optimization:
```toml
[profile.release]
...
lto = "thin"
```
As a nice side effect this should also make our code faster.
## Fat LTO
Get LLVM to run "fat" Link Time Optimization:
```toml
[profile.release]
...
lto = "fat"
```
There are no results for this because it took a massive amount of memory and CPU time and did not finish on my system.
Kafka is now sufficiently tested via the `write_buffer` crate. The
end2end tests can now use the in-memory mock implementation or -- if
servers can only be controlled via CLI -- the file-based implementation.