1. add GET support for the subscribers (this is needed so that
TTL/refreshers and LRU systems "know" when a key was used)
3. improve the multi-threading test so it no longer relies on a wait
loop but uses a second barrier instead
3. ensure that panics in the testing background thread are propagated
and make the test fail
4. implement defaults for `Subscriber` methods to reduce boilerplate for
implementors
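Items 1 and 4 above could be sketched roughly like this. This is an illustrative mock, not the actual `cache_system` API: the trait name `Subscriber` comes from the list above, but the method names, the `&str` keys, and the `UsageTracker` type are assumptions. Default no-op method bodies mean implementors only override the events they care about, and the GET hook is what lets TTL/LRU policies record when a key was used.

```rust
/// Hypothetical subscriber trait with default no-op methods.
pub trait Subscriber {
    /// Called when a key is read via GET; TTL/LRU policies can use this
    /// to record "last used" information.
    fn get(&mut self, _key: &str) {}

    /// Called when a new key/value pair is stored.
    fn set(&mut self, _key: &str) {}

    /// Called when a key is removed.
    fn remove(&mut self, _key: &str) {}
}

/// An LRU-style subscriber that only cares about GET events.
pub struct UsageTracker {
    pub last_used: Vec<String>,
}

impl Subscriber for UsageTracker {
    fn get(&mut self, key: &str) {
        self.last_used.push(key.to_owned());
    }
    // `set` and `remove` fall back to the default no-ops.
}
```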
Helps with #5320.
* feat: new policy system for caches
This is the framework part for the policy system outlined in #5320. It
does NOT port any existing policies (likely TTL, LRU and shared) over to
the new system. This will be a follow-up.
* docs: improve example
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* feat: cache tracing
Add tracing to the metrics cache wrapper. The extra arguments for GET
and PEEK make this quite simple, because the wrapper can just extend the
inner args with the trace information.
We currently terminate the span in `querier::cache` (i.e. only pass in
`None`, so no tracing will occur) to keep this PR rather small. This
will be changed in subsequent PRs.
For #5129.
* fix: typo
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This prepares the test system for the usage of extra args for GET and
PEEK by pulling all generics and the whole interface into a single
trait. This is similar to how the write buffer tests work.
This is needed to introduce extra args into the metrics wrapper (i.e. to
pass down spans) while still being able to use the generic tests in a
follow-up PR. Overall this change is required for #5129.
The tests themselves are unchanged.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This will be used to pass spans down to `CacheWithMetrics` (or a new
wrapper specific to tracing) and will help with #5129.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Removes an erroneous line and reworks one sentence for clarity.
Co-authored-by: pierwill <pierwill@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore(arrow_util): readability improvements
Signed-off-by: Ryan Russell <git@ryanrussell.org>
* chore(tracker): readability improvements
Signed-off-by: Ryan Russell <git@ryanrussell.org>
* chore(cache_system): improve readability
Signed-off-by: Ryan Russell <git@ryanrussell.org>
* refactor(lru test): rename `test_panic_id_collision`
Signed-off-by: Ryan Russell <git@ryanrussell.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
To roughly gauge how much data we re-load into caches (i.e. data that
was already loaded but was later evicted due to LRU pressure or TTL
eviction), this change introduces a new metric that estimates whether a
cache entry requested from the loader was already seen before (using a
probabilistic filter).
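The "already seen before" check can be illustrated with a tiny Bloom-style filter. This is a minimal sketch of the idea only, not the actual implementation or its parameters; `SeenFilter` and `observe` are made-up names, and a real filter would size its bit array and hash count for a target false-positive rate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy Bloom-style filter: records keys and answers "was this probably
/// seen before?". May report false positives, never false negatives.
pub struct SeenFilter {
    bits: Vec<bool>,
}

impl SeenFilter {
    pub fn new(size: usize) -> Self {
        Self { bits: vec![false; size] }
    }

    /// Hash `key` with a seed to pick a bit position.
    fn index(&self, key: &impl Hash, seed: u64) -> usize {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        key.hash(&mut h);
        (h.finish() as usize) % self.bits.len()
    }

    /// Record `key` and return whether it was (probably) seen before.
    pub fn observe(&mut self, key: &impl Hash) -> bool {
        let idxs = [self.index(key, 0), self.index(key, 1)];
        let seen = idxs.iter().all(|&i| self.bits[i]);
        for &i in idxs.iter() {
            self.bits[i] = true;
        }
        seen
    }
}
```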
* refactor: expose `CacheGetStatus` (and improve tests)
- add a `CacheGetStatus` which tells the user if the request was a hit
or miss (or something in between)
- adapt some tests to use the status (only the tests where this could be
relevant)
- move the test suite from using `sleep` to proper barriers (more stable
under high load, more correct, potentially faster)
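The statuses described above could be sketched as the following enum. The variant names `Hit`, `Miss`, and `MissAlreadyLoading` follow the commits here, but the exact shape (and the `name` helper for metric labels) is an assumption for illustration.

```rust
/// Sketch of the status a cache GET can report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheGetStatus {
    /// Value was already present in the cache.
    Hit,
    /// Value was missing and this request triggered the loader.
    Miss,
    /// Value was missing but another request was already loading it, so
    /// this request just waited for that load ("something in between").
    MissAlreadyLoading,
}

impl CacheGetStatus {
    /// Label suitable for metrics.
    pub fn name(&self) -> &'static str {
        match self {
            Self::Hit => "hit",
            Self::Miss => "miss",
            Self::MissAlreadyLoading => "miss_already_loading",
        }
    }
}
```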
* refactor: improve `abort_and_wait` checks
* docs: typos and improve wording
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: `FutureExt2` -> `EnsurePendingExt`
* refactor: `Queried` -> `MissAlreadyLoading`
* docs: explain `abort_or_wait` more
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: make `Cache` a trait
To insert more high-level metrics (e.g. cache misses/hits) it would be
helpful if we could easily instrument the layer right above the cache
driver (that combines the backend and the loader). To do that without
polluting the types too much, let's introduce a trait that describes the
driver interface and that we could later wrap with instrumentation.
This also pulls out the test into a generic setup, similar to how this
is done for the cache storage backends.
This does NOT include any functionality changes.
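The wrapping idea could look roughly like this. It is a deliberately simplified sketch (sync code, `String` keys/values, a made-up `EchoCache` driver); the real trait is generic and async, and `MetricsCache` here only counts GET calls to show how instrumentation wraps the driver without polluting its types.

```rust
/// Simplified trait describing the cache driver interface.
pub trait Cache {
    fn get(&mut self, key: &str) -> String;
}

/// Trivial driver used for illustration: "loads" by upper-casing the key.
pub struct EchoCache;

impl Cache for EchoCache {
    fn get(&mut self, key: &str) -> String {
        key.to_uppercase()
    }
}

/// Wrapper that instruments any `Cache` by counting GET calls.
pub struct MetricsCache<C: Cache> {
    inner: C,
    pub gets: u64,
}

impl<C: Cache> MetricsCache<C> {
    pub fn new(inner: C) -> Self {
        Self { inner, gets: 0 }
    }
}

impl<C: Cache> Cache for MetricsCache<C> {
    fn get(&mut self, key: &str) -> String {
        self.gets += 1;
        self.inner.get(key)
    }
}
```

Because the wrapper implements the same trait, callers can use it anywhere a plain driver is expected.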
* fix: typo
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: require `Resource`s to be convertible to `u64`
* refactor: require `Resource`s to have a unit name
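The two `Resource` requirements above (convertible to `u64`, plus a unit name) might be expressed like this. The trait bounds and the `RamSize` example type are illustrative assumptions, not the exact codebase API; the point is that metrics can then report any resource as a labeled number.

```rust
/// Sketch of a resource contract: comparable, convertible to `u64`,
/// and carrying a unit name for metric labels.
pub trait Resource: Copy + Into<u64> + Ord {
    fn unit() -> &'static str;
}

/// Example resource: RAM usage measured in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct RamSize(pub u64);

impl From<RamSize> for u64 {
    fn from(s: RamSize) -> u64 {
        s.0
    }
}

impl Resource for RamSize {
    fn unit() -> &'static str {
        "bytes"
    }
}
```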
* refactor: make LRU cache IDs static
* feat: add LRU cache metrics
* docs: improve type names in LRU doctest
* docs: explain `MeasuredT`
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* docs: explain `test_metrics`
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* feat: `SortKey::size`
* feat: `FunctionEstimator`
* feat: querier RAM pool
Let's put all the caches into a single RAM pool, so we can at least
somewhat control RAM usage. Note that this does NOT limit peak memory
during query execution, but it should at least stop unbounded cache
growth. A follow-up PR will add metrics.
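The pooling idea can be sketched as a shared budget that all caches reserve from. This is a minimal single-threaded illustration with made-up names (`RamPool`, `try_reserve`, `release`); the real pool is shared across caches and drives eviction rather than just refusing reservations.

```rust
/// Toy shared RAM budget: reservations against a common limit bound
/// total cache memory (query peak memory is untouched).
pub struct RamPool {
    limit: u64,
    used: u64,
}

impl RamPool {
    pub fn new(limit: u64) -> Self {
        Self { limit, used: 0 }
    }

    /// Try to reserve `n` bytes; returns `false` (so the caller must
    /// evict something) when the reservation would exceed the limit.
    pub fn try_reserve(&mut self, n: u64) -> bool {
        if self.used + n <= self.limit {
            self.used += n;
            true
        } else {
            false
        }
    }

    /// Return `n` bytes to the pool, e.g. after evicting an entry.
    pub fn release(&mut self, n: u64) {
        self.used = self.used.saturating_sub(n);
    }

    pub fn used(&self) -> u64 {
        self.used
    }
}
```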
* refactor: improve some size calculations
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>