* chore: Update DataFusion pin
* chore: Update for new API
* fix: fix test
* fix: only check error messages
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Rather than always having to request all of a namespace's schema then
filtering to the one you want. Will make this more consistent with
upserting schema by namespace+table.
Fixes#4997.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
There are a bunch of dependencies in `Cargo.lock` that are related to
mysql. These are NOT compiled at all, and are also not part of `cargo
tree`. The reason for the inclusion is a bug in cargo:
https://github.com/rust-lang/cargo/issues/10801
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: add `influxdb_iox debug build-catalog` command
* fix: tests
* fix: Use info! logs instead of println for status
* fix: Set partition_hash_id as well
* fix: remove leftover code
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
An existing pair of e2e steps allows the caller to assert an increase in
the number of persisted parquet files, but there was no primitive to
assert the current count.
Allow the timeout to be referenced in docs, and increase it a bit - it's
it a bit close to being too low for ingester shutdown persistence to
complete.
This will hold the deterministic ID for partitions.
Until all existing partitions have this value, this is optional/nullable.
The row ID still exists and is used as the main foreign key in the
parquet_file and skipped_compaction tables.
The hash_id has a unique index so that we can look up records based on
it (if it's available).
If the parquet file record has a partition_hash_id value, use that to
generate the object storage path instead of the partition_id.
This changes the e2e test to delete the WAL segment file, restart the
ingester and ensure the results returned by an ingester query after
feeding the regenerated line proto in are the same as those before.
* test: integration test for tracing of queries to the ingester
* chore: add FlightFrameEncodeRecorder to record spans per each polling result
* refactor(trace): impl TraceCollector for Arc
Allow any Arc-wrapped TraceCollector implementation to be used as a
TraceCollector. This avoids needing to as_any() and downcast later.
* test: assert FlightFrameEncodeRecorder trace spans
This test exercises the FlightDataEncoder wrapped with the trace
decorator (FlightFrameEncodeRecorder) when executing against a data
source that yields data after varying numbers of Stream polls.
This test passing will validate the FlightFrameEncodeRecorder correctly
instruments the amount of time a client spends waiting on the
FlightDataEncoder to acquire or encode a protocol frame, but also
ensures the decorator correctly accounts for varying behaviours allowed
through the Stream abstraction. It does this by simulating a data source
that is not always immediately ready to provide data, such as a buffer
wrapped in a contended async mutex.
* refactor: move tracing decorator into separate mod
* fix: record spans
* refactor(test): update test
The frame encoder is not one-to-one - it emits two frames for the first
data payload, a schema and a payload. This commit updates the test to
account for it!
* refactor: remove unneeded mut ref, and use enum state method which panics when in a (should be unreachable) state
* chore: add more docs to FlightFrameEncodeRecorder and related
---------
Co-authored-by: Dom Dwyer <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).
I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!
https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html
This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
Introduce a new header called `iox-debug` which when set enables certain
debug features. The first one will be the `system.queries` table which
is a process-local, namespace-scoped query log. In most prod setups this
is only useful for debugging and will confuse the user a lot because
when multiple queries are deployed then the K8s routing decides which
pod/process the users hits. This leads to an inconsistent view. However
the log is still useful for debugging.
This also wires the "debug header set" flag through the Flight ticket,
because JDBC proved (integration tests FTW!) that headers are only
passed to `GetFlightInfo` but not to `DoGet` and the ticket must encode
all the relevant information.
Closes#7119.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This adds a command to `influxdb_iox` that can take a WAL segment file
and regenerate all write operation entries, writing to stdout or namespaced
files within a target directory, using table ID as the measurement name
in the case where there is no catalog access at point of regeneration.
* refactor: Change catalog configuration so it is entirely dsn based / support end to end testing without postgres
Restores code from https://github.com/influxdata/influxdb_iox/pull/7708
Revert "revert: PR #7708"
This reverts commit c9cfe05f8d.
* fix: merge
* fix: Update new test