* refactor: use new ingester<>querier wire protocol
Use and document the new and more flexible ingester<>querier wire
protocol.
Note that the ingester does NOT stream the response data yet, but the
internal data structures would allow that. A follow-up change will
adjust the ingester code to stream the data.
Ref #4849.
* fix: typos
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: clarify naming and public interface
* test: add schema assertion to `ingester_response_to_record_batches`
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* feat: extend flight client to accept multiple (changing) schemas
See #4849.
Originally I intended not to use Flight at all for the new
ingester<>querier protocol. However since flight also deals with
dictionary batches and multiple batches and the gRPC protocol that I
would write would look very similar, I will use Flight with a bit more
flexible message types.
The rough idea for the protocol is the following stream:
- for each partition:
1. "none" message with partition metadata
2. for each chunk (can have different schemas under certain
circumstances):
1. "schema" message (resets dictionary state)
2. (optional) dictionary batch messages
3. one or more "record batch" message
The nice thing about it is that the same arrow client works also for the
existing client<>querier protocol since there we just send:
1. "schema" message (no app metadata)
2. (optional) dictionary batch messages
3. zero, one or more "record batch" message (no app metadata)
* refactor: separate high- and low-level flight client
It is very unlikely that a user will use the high-level batch-producing
functionality and the low-level stuff within the same session. So let's
split this into to clients (high-level uses the low-level one
internally) to avoid confusion.
Also add documentation on our protocol handling.
* refactor: enumerate all variants in match statement to better catch errors in the future
Unset the all env vars for the following CLI e2e tests:
* default_mode_is_run_all_in_one
* default_run_mode_is_all_in_one
This prevents them from executing against the "prod" catalog, running
migrations and inserting values to the prod database specified in the
prod DSN env (INFLUXDB_IOX_CATALOG_DSN).
Warn when downloading files to an in-memory object store.
The "remote partition pull" command downloads parquet files from an
object store via a router, and saves them locally. It's pretty unlikely
the user intends to download those files to memory of the CLI process
which then exits when the pull is complete, throwing away the downloaded
files, but this is the default.
* feat: enable debugging of failed querier->ingester requests
- extend `query-ingester` CLI to allow usage of predicates
- on failed requests: log all information that required for the CLI
- test the "ingester fails" scenario
* test: explain
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* docs: improve
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: move b64 pred. serde into a single crate
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* feat: Increase logging to investigate multi ingester flaky test
* feat: Temporarily disable a test while logging is increased in CI
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Add lookup of partitions by table id to catalog.
Add API to catalog to return partitions by table id.
Add to client to return partitions by table id.
Add CLI to pull remote schema, partition, and parquet files into a local catalog and object store.
* feat: share server processes in end to end test
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Prior to this commit, adding any metric to the catalog (and only the
catalog) would cause the end_to_end_ng_cases::metrics::test_metrics test
to fail due to asserting an exact number of metrics observed.
This commit changes the check condition to a more permisive >= rather
than ==.
* refactor: Expose data generation tool for wider use
* feat: Add a step for retrieving the server metrics
* refactor: Copy the OG end-to-end metrics test to NG
* feat: Rewrite the NG end-to-end metrics test
This is still broken because the the row timestamp metrics don't exist
in NG.
* fix: Test metrics relevant to NG
* refactor: Move the data generator to the test helper crate
* refactor: Extract a ReadFilter request builder into the test helper crate
* refactor: Make test helper request builder able to build other gRPC requests
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
* feat: Copy end-to-end logging test to NG
This was created with:
cp influxdb_iox/tests/end_to_end_cases/influxdb_ioxd.rs influxdb_iox/tests/end_to_end_ng_cases/logging.rs
* feat: Port logging test to NG end-to-end tests
And re-enable it, it was ignored.
* fix: Specify that an in-memory catalog should be used for the logging test
* fix: Check for gRPC instead of HTTP
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: use `info!` rather `println!` in end to end tests
* chore: change from println to info in end_to_end_ng_cases too
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>