* fix: hoist repeated computation out of chunk creation
We have hundreds of chunks per table, so it is beneficial to only
do common work once.
* chore: remove TableCache as it is no longer used
* fix: prune chunks both before and after metadata fetch
Fetching the metadata for all the chunks in a table is expensive,
especially when we have a narrow time range query that only
needs a few chunks.
* chore: fix clippy
* fix: fix up some last tests
* fix: review comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This doesn't really need to be fallible but forces propagation of a ton
of error handling - no shards is always a sign of something being very
wrong, and can be caught in the caller if it's for some reason an
acceptable state / can be recovered from.
* refactor: allow `ChangeRequest` to carry a lifetime
Let's not restrict our change functions to `'static` because this would
require us to clone loads of data to achieve predicate-based
`remove_if`.
* refactor: convert `remove_if` feature to policy framework
Decided to drop the "shared" functionality. We only use the small
`remove_if` bit which is way easier to reason about.
For #5320.
* refactor: address review comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: port TTL backend to policy framework
Note that this is "just" a port, it does NOT change how TTL works. This
will be done in #5318.
Helps with #5320.
* fix: ensure inner backend is empty
* test: add some smoke test
We already prune all chunks in the query-access layer. There's no need
to do that another time (which is actually the first time) in
`QuerierTable::chunks`. The time savings we get from feeding less chunks
into the state reconciling should be negligible. On the pro-side however
we get a more streamlined data flow and actually correct chunk pruning
metrics. Also see #5336.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
- emit a warning if we cannot even attempt to prune chunks due to an
error. This is always either a missing feature or a bug (even though
it does not impact correctness but _only_ performance). Also see
https://github.com/influxdata/conductor/issues/1107
- change metrics to clearly differentiate between "could not prune" and
"not pruned"
- add new "not pruned" observer hook (this was missing for some reason,
the "pruned" hook existed though)
* refactor: make could-not-prune reason a static string
* refactor: introduce `QuerierTableArgs`
* feat: chunk pruning metrics
Closes#4974.
* refactor: address review comments
* refactor: use static typing for not-pruned reason
* refactor: pass chunk to not-pruned observer and use it for some metrics
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: make querier RAM pool split a proper feature
- use propre pool names
- expose sizing via CLI/env
Closes https://github.com/influxdata/conductor/issues/1102.
* refactor: improve naming and docs
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Quick&Dirty implementation of a RAM-pool split to see if this has any
effect. I expect the querier performance to improve due to this because
large read buffers can no longer evict precious metadata.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This is what DataFusion uses by default and I don't see a reason why we
should use such small batch sizes.
The affect is probably only visible in certain filter-aggregate queries
that don't focus on a single series (because there we likely end up with
1 or 2 batches only, esp. after #5250) for coarse-grained filters, esp.
when the filter key is not the first sort key.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: `QueryChunk::as_any`
* feat: allo `ChunkPruner::prune_chunks` to fail
* feat: limit per-table chunk data for every query
Closes#5211.
* fix: address review comments
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: remove min_sequnce_number
* fix: typos
* fix: remove min_sequencer_number from new files from merging main
* fix: add back throwing error if the compactor compacts files persisted by the ingester after the ingester sends max seq_num back to querier
* test: add test_compactor_collision back but modify the input to make it work woth new changes
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: cache tracing
Add tracing to the metrics cache wrapper. The extra arguments for GET
and PEEK make this quite simple, because the wrapper can just extend the
inner args with the trace information.
We currently terminate the span in `querier::cache` (i.e. only pass in
`None`, so no tracing will occur) to keep this PR rather small. This
will be changed in subsequent PRs.
For #5129.
* fix: typo
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
- remove `IOxSessionContext::default()` because untracked contexts
should only be created by tests
- remove `Option<IOxSessionContext>` because it is a typed workaround
for `IOxSessionContext::default`
Tests should use `IOxSessionContext::testing` and all _normal_ users
should create proper contexts.
I suspect this will help tracing or at least prevent silent regressions.
See #5129.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This will be used to pass spans down to `CacheWithMetrics` (or a new
wrapper specific to tracing) and will help with #5129.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This adds tracing of querire->ingester request up to the point where we
perform the network request, i.e. the trace will only appear on the
querier side. We may extend this at some point to carry the tracing
information to the ingester as well.
Ref #5129.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
I forgot to address a TODO in #5091. Extends to test to actually check
the chunk stage and removes the function for manual force-loads.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>