* chore: Update DataFusion pin
* chore: Update for new API
* fix: fix test
* fix: only check error messages
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
When an upstream ingester goes offline, the "circuit breaker" detects it
as unhealthy, and prevents further requests being sent to it.
Periodically a small number of requests are allowed ("probe requests")
to check for recovery.
If a write request is selected as a "probe request", it SHOULD be sent -
a limited number of writes are selected as probes, and enough have to be
successful to drive recovery. If no probes are ever sent/successful, the
upstream will never be marked as healthy.
Additionally the RPC handler applies an optimisation: if the number of
ingesters selected to service a write is less than the number needed to
successfully reach the desired replication factor, no requests are sent
and an error is returned immediately, preventing unnecessary system load
for writes that would never succeed.
This optimisation conflicts with the probe request requirement when a
replication factor of >= 2 is specified:
* All ingesters are offline
* Write comes in
* UpstreamSnapshot is populated with a probe request for 1 ingester
only - no other healthy candidate ingesters exist.
* Optimisation applied: 1 probe candidate < 2 needed for replication
This results in a probe request never being sent, and in turn, never
allowing further requests to the recovered upstream.
This fix changes the optimisation, applying it only when there are no
probes in the candidate ingester list. When a probe is present but there
are too few candidates to satisfy replication, the write will still fail,
but it will drive detection of recovered ingesters and maintain liveness
of the system.
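
The changed check can be sketched roughly as follows. This is a minimal
illustration, not the actual RPC handler code: the `Candidate` and
`Decision` types and the `decide` function are hypothetical names made up
for this example.

```rust
/// A candidate upstream ingester for a write (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Candidate {
    /// The circuit breaker considers this ingester healthy.
    Healthy,
    /// The ingester is unhealthy, but this write was selected as a probe for it.
    Probe,
}

/// What to do with an incoming write (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// Return an error immediately without sending any requests.
    RejectEarly,
    /// Send the write to the candidates; it may still fail to replicate.
    Send,
}

/// Apply the early-reject optimisation only when no candidate is a probe.
pub fn decide(candidates: &[Candidate], replication_factor: usize) -> Decision {
    let has_probe = candidates.iter().any(|c| *c == Candidate::Probe);
    if candidates.len() < replication_factor && !has_probe {
        // Too few candidates and nothing to probe: sending would only add load.
        Decision::RejectEarly
    } else {
        // Either replication is possible, or a probe must go out to drive recovery.
        Decision::Send
    }
}
```

With a replication factor of 2 and a single probe candidate, `decide`
returns `Decision::Send`, so the probe goes out even though the write
itself cannot reach the replication factor.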
Even though all subfields of `CachedPartition` are `Arc`ed, the structure
keeps growing, and copying more and more fields around for every cache
access gets quite expensive. `Arc` the whole thing and simplify
management a bit.
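
A rough sketch of the idea, assuming a simplified cache: the fields of
`CachedPartition` shown here and the `PartitionCache` type are
hypothetical stand-ins, not the real definitions.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the cached value; the real struct has other fields.
#[derive(Debug)]
pub struct CachedPartition {
    pub sort_key: Vec<String>,
    pub column_ranges: HashMap<String, (i64, i64)>,
}

/// Hypothetical cache keyed by partition ID, storing the whole value behind one `Arc`.
#[derive(Default)]
pub struct PartitionCache {
    entries: HashMap<i64, Arc<CachedPartition>>,
}

impl PartitionCache {
    pub fn insert(&mut self, id: i64, partition: CachedPartition) {
        self.entries.insert(id, Arc::new(partition));
    }

    /// A lookup clones a single `Arc` (one reference-count increment),
    /// no matter how many fields `CachedPartition` has.
    pub fn get(&self, id: i64) -> Option<Arc<CachedPartition>> {
        self.entries.get(&id).cloned()
    }
}
```

Adding a field to `CachedPartition` then no longer adds a per-access
clone; every reader shares the same allocation.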
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: Update DataFusion pin
* chore: Update for new API
* fix: Update for API
* fix: update compactor test
* fix: Update to patched version of arrow 46.0.0
* fix: map `DataFusionError::Configuration` to an internal error
* fix: do not use deprecated API
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* fix(compactor): prevent sort order mismatches from creating overlapping regions
* chore: test additions for incorrectly created regions
* fix(compactor): more sort order mismatch fixes
* chore: insta updates
* chore: insta updates after merge
This avoids always having to request all of a namespace's schema and
then filtering down to the table you want. It also makes this more
consistent with upserting schema by namespace+table.
Fixes #4997.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* fix(influxql): FILL(linear) for selectors
Ensure that selector functions such as FIRST, LAST, MIN and MAX can
use LINEAR filling in the same way as InfluxDB 1.8.
* chore: review suggestions
Apply suggestions from the review. This adds more tests and support
for interpolation in SQL.
* fix: lint
* fix: lint
* chore: buffered input for struct arrays
For linear interpolation, ensure that buffering the input of a struct
field only stops once there is a non-null struct containing a non-null
value.
* fix: integration test
* fix(iox_query): make clippy happy
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
The layer that serializes our requests. This also contains the logic to
leave out non-serializable filters like the V1 version (same tests, just
slightly differently arranged).
For #8349.
* feat: more `TestResponse` constructors
* feat: "logging" layer for i->q V2 client
Logging layer for #8349. This mostly logs in debug mode but emits errors
to the log. Simple implementation that can be extended later.
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
For #8350 we want to be able to stream record batches from the ingester
instead of waiting to buffer them fully before the query starts. Hence
we can no longer inspect the batches in the "display" implementation of
the plan.
This change mostly contains the display change, not the actual streaming
part. I'll do that in a follow-up.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: have ingester's SortKeyState include sort_key_ids
* fix: test failures
* chore: address review comments
* chore: address review comments by adding asserts to catch bugs if any
* chore: fix typo
* test: get column IDs for the tests
* refactor: reuse function
* chore: address review comments
This will enable some subsystems to trivially respect any `IngestStateError`
set while ignoring specific errors which they may be responsible for
resolving (such as WAL replay needing to ingest from disk when `DiskFull`
is set).
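
One way this could look, as a sketch only: the `IngestState` type, the
`PersistSaturated` variant, the bit layout and the method names below are
assumptions made for illustration, not the actual API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Error states, one bit each (variants other than `DiskFull` are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IngestStateError {
    PersistSaturated = 1 << 0,
    DiskFull = 1 << 1,
}

/// Hypothetical shared state holding the currently set errors as a bitmask.
#[derive(Debug, Default)]
pub struct IngestState {
    bits: AtomicUsize,
}

impl IngestState {
    /// Mark `error` as currently set.
    pub fn set(&self, error: IngestStateError) {
        self.bits.fetch_or(error as usize, Ordering::Relaxed);
    }

    /// Return the first set error that is not listed in `ignored`.
    pub fn read_with_exceptions(
        &self,
        ignored: &[IngestStateError],
    ) -> Result<(), IngestStateError> {
        // Build a mask of the errors the caller is responsible for resolving.
        let mask = ignored.iter().fold(0usize, |acc, e| acc | (*e as usize));
        let bits = self.bits.load(Ordering::Relaxed) & !mask;
        if (bits & (IngestStateError::PersistSaturated as usize)) != 0 {
            return Err(IngestStateError::PersistSaturated);
        }
        if (bits & (IngestStateError::DiskFull as usize)) != 0 {
            return Err(IngestStateError::DiskFull);
        }
        Ok(())
    }
}
```

In this sketch, WAL replay would call
`read_with_exceptions(&[IngestStateError::DiskFull])` and keep ingesting
from disk, while still stopping on any other error that is set.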