* feat: rework cache refresh logic
Instead of issuing a single refresh when a GET request for a cached key
comes in, start a background job per key (using some efficient logic to
not overload tokio) that refreshes the key with exponential backoff. The
timer is reset when a new GET request comes in. This has the following
advantages:
- our backoff logic decorrelates the requests
- the longer a key was not used, the less often it will be updated
All tests (esp. integration tests) are adjusted accordingly, mostly to
account for the fact that no extra GET is required to start the refresh
timer.
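As a rough illustration of the idea (not the actual querier code; `spawn_refresh_task`, the backoff constants, and the `refresh_key` callback are made up for this sketch), a per-key task could look like this:

```rust
use std::{sync::Arc, time::Duration};

use tokio::{sync::Notify, task::JoinHandle};

/// Sketch of a per-key refresh task: the interval grows exponentially while
/// the key is idle, and a new GET resets it via `reset.notify_one()`.
/// `refresh_key` stands in for the real loader call.
fn spawn_refresh_task<F>(key: String, reset: Arc<Notify>, refresh_key: F) -> JoinHandle<()>
where
    F: Fn(&str) + Send + Sync + 'static,
{
    tokio::spawn(async move {
        let mut backoff = Duration::from_secs(1);
        loop {
            tokio::select! {
                // a new GET request for this key resets the backoff
                _ = reset.notified() => {
                    backoff = Duration::from_secs(1);
                }
                // otherwise refresh after the current backoff and double it,
                // so rarely used keys are refreshed less and less often
                _ = tokio::time::sleep(backoff) => {
                    refresh_key(&key);
                    backoff = (backoff * 2).min(Duration::from_secs(3600));
                }
            }
        }
    })
}
```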
Closes #5720.
* docs: improve
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* refactor: simplify rng overwrite
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Criterion comes with some extra cargo tooling called `cargo criterion`
which can be used instead of `cargo bench`. The advantage is that we
don't need to compile the entire reporting infrastructure into our
benchmarks. So let's embrace this separation of concerns.
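For reference, a benchmark written against the standard Criterion API runs unchanged under both `cargo bench` and `cargo criterion`; only the reporting moves out of the bench binary with the latter (the function below is just a stand-in workload):

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// stand-in workload; any cheap function works for the sketch
fn fibonacci(n: u64) -> u64 {
    (1..=n).fold((0u64, 1u64), |(a, b), _| (b, a + b)).0
}

fn bench_fibonacci(c: &mut Criterion) {
    c.bench_function("fibonacci 20", |b| b.iter(|| fibonacci(black_box(20))));
}

criterion_group!(benches, bench_fibonacci);
criterion_main!(benches);
```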
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* ci: use same feature set in `build_dev` and `build_release`
* ci: also enable unstable tokio for `build_dev`
* chore: update tokio to 1.21 (to fix console-subscriber 0.1.8)
* fix: "must use"
* feat: refresh policy for caches
For #5318 we want to have a policy that refreshes keys before they are
too old. I initially tried to fold both TTL and the refresh system into
a single policy but then decided that they would basically be two
policies in one with a harder-to-test interface. Semantically, TTL and
refresh are also a bit different (but will usually be used together):
- **TTL:** Prevents a user from getting data that is too old. It is a kind
of "soft correctness". In some sense this is related to the "remove
if" policy, where some part of the system knows for sure (or with
reasonable likelihood) that a cache entry is outdated. Note that TTL's
primary job is NOT to clean up memory from old keys (even though it
indirectly does that). There is no reason cached entries should be
removed except for correctness (TTL and remove-if) or resource
pressure -- and the latter is handled by the LRU policy.
- **Refresh:** While TTL is a kind of deadline, we often have good
reason to refresh the key before we pull the plug, namely when an
entry is used and a bit old (but not too old). The concrete mechanism
to achieve this is flexible. At the moment the policy is rather
simple -- just start a refresh task if a key is old and we receive a
GET request (see the sketch after this list) -- but it can be extended in the future.
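To make the distinction concrete, here is a minimal sketch of the decision on a GET. The type and function names (`GetAction`, `on_get`) are made up for illustration; the real implementation is expressed as separate policies against the shared policy backend:

```rust
use std::time::{Duration, Instant};

/// Hypothetical decision type for illustration only.
enum GetAction {
    /// entry is fresh enough, just return it
    Use,
    /// entry is old but not expired: return it and refresh in the background
    UseAndRefresh,
    /// entry exceeded its TTL: treat as a miss and reload
    Expired,
}

/// Sketch of how TTL and refresh stay separate concerns: TTL is a hard
/// deadline ("soft correctness"), refresh kicks in earlier on a GET.
fn on_get(loaded_at: Instant, ttl: Duration, refresh_after: Duration, now: Instant) -> GetAction {
    let age = now.duration_since(loaded_at);
    if age >= ttl {
        GetAction::Expired
    } else if age >= refresh_after {
        GetAction::UseAndRefresh
    } else {
        GetAction::Use
    }
}
```

`Expired` corresponds to what the TTL policy already enforces on its own, while `UseAndRefresh` is the new behavior added here.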
This also adds some integration tests for TTL+refresh. There will be
follow-up changes to test the interaction with LRU as well, although I am
pretty certain that there won't be any surprises due to the extensive
testing we have in place for the policy backend itself as well as all
the policies.
This change also does NOT integrate the refresh with the querier for the
sake of keeping the changeset "small" (i.e. it is already rather large).
* docs: improve
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: port TTL backend to policy framework
Note that this is "just" a port, it does NOT change how TTL works. This
will be done in #5318.
Helps with #5320.
* fix: ensure inner backend is empty
* test: add some smoke test
* feat: cache tracing
Add tracing to the metrics cache wrapper. The extra arguments for GET
and PEEK make this quite simple, because the wrapper can just extend the
inner args with the trace information.
We currently terminate the span in `querier::cache` (i.e. only pass in
`None`, so no tracing will occur) to keep this PR rather small. This
will be changed in subsequent PRs.
For #5129.
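Schematically (with made-up trait and type names, not the actual cache abstraction in this repo), the wrapper only has to widen the extra argument:

```rust
/// placeholder for the real trace span type
struct Span;

/// Hypothetical cache trait for illustration: GET takes an extra argument.
trait Cache {
    type Key;
    type Value;
    type GetExtra;

    fn get(&self, key: Self::Key, extra: Self::GetExtra) -> Self::Value;
}

struct TracedCache<C> {
    inner: C,
}

impl<C> Cache for TracedCache<C>
where
    C: Cache,
{
    type Key = C::Key;
    type Value = C::Value;
    // the wrapper's extra arg = inner extra arg + optional span;
    // `None` means "no tracing", as currently passed in `querier::cache`
    type GetExtra = (C::GetExtra, Option<Span>);

    fn get(&self, key: Self::Key, extra: Self::GetExtra) -> Self::Value {
        let (inner_extra, _span) = extra;
        // a real implementation would record events on `_span` around this call
        self.inner.get(key, inner_extra)
    }
}
```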
* fix: typo
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
To roughly gauge how much data we re-load into the cache (i.e. data that
was already loaded but was later evicted due to LRU pressure or TTL
eviction), this change introduces a new metric that estimates whether a
cache entry that is requested from the loader was already seen before
(using a probabilistic filter).
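A minimal illustration of the "already seen" check (the real metric wiring and filter choice may differ; this hand-rolled Bloom-style filter is just for the sketch):

```rust
use std::{
    collections::hash_map::RandomState,
    hash::{BuildHasher, Hash, Hasher},
};

/// Tiny illustrative Bloom-style filter: `check_and_insert` may return false
/// positives but never false negatives, which is enough to *estimate* how
/// often the loader is asked for a key it has already loaded before.
struct SeenFilter {
    bits: Vec<bool>,
    hashers: Vec<RandomState>,
}

impl SeenFilter {
    fn new(n_bits: usize, n_hashes: usize) -> Self {
        Self {
            bits: vec![false; n_bits],
            hashers: (0..n_hashes).map(|_| RandomState::new()).collect(),
        }
    }

    /// Returns whether the key was (probably) seen before and marks it as seen.
    fn check_and_insert<K: Hash>(&mut self, key: &K) -> bool {
        let mut seen = true;
        for state in &self.hashers {
            let mut hasher = state.build_hasher();
            key.hash(&mut hasher);
            let idx = (hasher.finish() as usize) % self.bits.len();
            if !self.bits[idx] {
                seen = false;
                self.bits[idx] = true;
            }
        }
        seen
    }
}
```

The loader-side metric then simply counts how often such a check returns `true`.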
* refactor: require `Resource`s to be convertible to `u64`
* refactor: require `Resource`s to have a unit name
* refactor: make LRU cache IDs static
* feat: add LRU cache metrics
* docs: improve type names in LRU doctest
* docs: explain `MeasuredT`
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* docs: explain `test_metrics`
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>