* feat: `Cache::set`
This is useful for filling caches when we already have the information
from somewhere else.
For #3985.
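A minimal sketch of the idea, assuming a synchronous, single-threaded cache. The `CacheSketch` type and its closure-based loader are hypothetical and exist only for illustration; they are not the real `Cache` API:

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical, synchronous stand-in for a cache with a loader.
pub struct CacheSketch<K, V, F> {
    /// Called on a cache miss to produce the value for a key.
    loader: F,
    /// Values that were either loaded or set explicitly.
    entries: HashMap<K, V>,
}

impl<K, V, F> CacheSketch<K, V, F>
where
    K: Clone + Eq + Hash,
    V: Clone,
    F: FnMut(&K) -> V,
{
    pub fn new(loader: F) -> Self {
        Self {
            loader,
            entries: HashMap::new(),
        }
    }

    /// Return the cached value, invoking the loader only on a miss.
    pub fn get(&mut self, key: K) -> V {
        if let Some(v) = self.entries.get(&key) {
            return v.clone();
        }
        let v = (self.loader)(&key);
        self.entries.insert(key, v.clone());
        v
    }

    /// Side-load a value that was obtained elsewhere, bypassing the loader.
    pub fn set(&mut self, key: K, value: V) {
        self.entries.insert(key, value);
    }
}
```

In this sketch, a `get` that follows a `set` for the same key never reaches the loader.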
* docs: improve
Co-authored-by: Edd Robinson <me@edd.io>
* docs: explain lock gap
* feat: add debug log to `Cache`
Co-authored-by: Edd Robinson <me@edd.io>
* feat: `CacheBackend::as_any`
* refactor: add TTL cache backend
This is based on the new `AddressableHeap`, which simplifies the
implementation quite a lot.
For #3985.
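A rough sketch of TTL eviction, under several assumptions: every entry shares one fixed TTL, the caller passes in the current time, and a plain `BTreeSet` stands in for the `AddressableHeap`. `TtlSketch` and its method signatures are hypothetical, not the real `TtlBackend`:

```rust
use std::collections::{BTreeSet, HashMap};
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical TTL store: each entry expires `ttl` after it was last set.
pub struct TtlSketch<K, V> {
    ttl: Duration,
    /// Key -> (value, expiration time).
    values: HashMap<K, (V, Instant)>,
    /// (expiration time, key) pairs, kept in expiration order.
    expirations: BTreeSet<(Instant, K)>,
}

impl<K, V> TtlSketch<K, V>
where
    K: Clone + Eq + Hash + Ord,
    V: Clone,
{
    pub fn new(ttl: Duration) -> Self {
        Self {
            ttl,
            values: HashMap::new(),
            expirations: BTreeSet::new(),
        }
    }

    /// Drop every entry whose expiration time is at or before `now`.
    pub fn evict_expired(&mut self, now: Instant) {
        while let Some(expires_at) = self.expirations.first().map(|(t, _)| *t) {
            if expires_at > now {
                break;
            }
            if let Some((_, key)) = self.expirations.pop_first() {
                self.values.remove(&key);
            }
        }
    }

    /// Store a value and (re)start its TTL.
    pub fn set(&mut self, key: K, value: V, now: Instant) {
        self.evict_expired(now);
        let expires_at = now + self.ttl;
        if let Some((_, old)) = self.values.insert(key.clone(), (value, expires_at)) {
            // The key was already present: forget its previous expiration.
            self.expirations.remove(&(old, key.clone()));
        }
        self.expirations.insert((expires_at, key));
    }

    /// Look up a value after evicting everything that has expired.
    pub fn get(&mut self, key: &K, now: Instant) -> Option<V> {
        self.evict_expired(now);
        self.values.get(key).map(|(v, _)| v.clone())
    }
}
```

Passing `now` explicitly is a choice made for this sketch so that expiration can be tested without sleeping.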
* refactor: `TtlBackend::{update->evict_expired}`
* docs: explain ttl cache eviction
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: add addressable heap for query cache
This will be used as a helper data structure for TTL and LRU. It's
probably not the most performant implementation but it's good enough for
now.
This is for #3985.
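For illustration only, one way such a structure could look: a min-heap ordered by a priority, whose entries can also be updated or removed by key. `SketchHeap` is a hypothetical name, and this version breaks ties by key, which may differ from what `AddressableHeap` does:

```rust
use std::collections::{BTreeSet, HashMap};
use std::hash::Hash;

/// Hypothetical addressable heap: ordered by `O`, addressable by `K`.
pub struct SketchHeap<K, O> {
    /// Entries sorted by (order, key); equal orders are tie-broken by key.
    queue: BTreeSet<(O, K)>,
    /// Key -> current order, which makes entries addressable.
    index: HashMap<K, O>,
}

impl<K, O> SketchHeap<K, O>
where
    K: Clone + Eq + Hash + Ord,
    O: Clone + Ord,
{
    pub fn new() -> Self {
        Self {
            queue: BTreeSet::new(),
            index: HashMap::new(),
        }
    }

    /// Insert a key or update its order; returns the previous order, if any.
    pub fn insert(&mut self, key: K, order: O) -> Option<O> {
        let previous = self.index.insert(key.clone(), order.clone());
        if let Some(old) = &previous {
            self.queue.remove(&(old.clone(), key.clone()));
        }
        self.queue.insert((order, key));
        previous
    }

    /// Look at the entry with the smallest order without removing it.
    pub fn peek(&self) -> Option<(&K, &O)> {
        self.queue.iter().next().map(|(o, k)| (k, o))
    }

    /// Remove and return the entry with the smallest order.
    pub fn pop(&mut self) -> Option<(K, O)> {
        let (order, key) = self.queue.pop_first()?;
        self.index.remove(&key);
        Some((key, order))
    }

    /// Remove an entry by key, wherever it sits in the heap.
    pub fn remove(&mut self, key: &K) -> Option<O> {
        let order = self.index.remove(key)?;
        self.queue.remove(&(order.clone(), key.clone()));
        Some(order)
    }
}
```

TTL would use an expiration time as the order and LRU a last-used time; in both cases eviction pops from the front.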
* fix: test + explain tie breaking in `AddressableHeap`
* feat: extract "backend" from querier cache
The backend will implement pruning policies like LRU and TTL as well as
where/how the data is stored. Having a proper interface for that
simplifies the implementation since we don't need to have one massive
`Cache` object with a super complex mechanism.
This is for #3985.
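As a sketch of what such an interface might look like (the method set and trait bounds here are assumptions, not necessarily the real trait), with a plain `HashMap` as the simplest backend that never prunes:

```rust
use std::any::Any;
use std::collections::HashMap;
use std::fmt::Debug;
use std::hash::Hash;

/// Hypothetical backend interface: decides how entries are stored and pruned.
pub trait CacheBackend: Debug + Send + 'static {
    type K;
    type V;

    /// Look up a value; `&mut self` lets policies such as LRU record the access.
    fn get(&mut self, k: &Self::K) -> Option<Self::V>;

    /// Store a value, replacing any existing one.
    fn set(&mut self, k: Self::K, v: Self::V);

    /// Remove a value if present.
    fn remove(&mut self, k: &Self::K);

    /// Allow downcasting to the concrete backend, e.g. in tests.
    fn as_any(&self) -> &dyn Any;
}

/// The simplest possible backend: keep everything, prune nothing.
impl<K, V> CacheBackend for HashMap<K, V>
where
    K: Clone + Eq + Hash + Debug + Send + 'static,
    V: Clone + Debug + Send + 'static,
{
    type K = K;
    type V = V;

    fn get(&mut self, k: &K) -> Option<V> {
        HashMap::get(self, k).cloned()
    }

    fn set(&mut self, k: K, v: V) {
        self.insert(k, v);
    }

    fn remove(&mut self, k: &K) {
        HashMap::remove(self, k);
    }

    fn as_any(&self) -> &dyn Any {
        self
    }
}
```

With this shape, a cache can hold a `Box<dyn CacheBackend<K = ..., V = ...>>` and stay unaware of the pruning policy behind it.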
* refactor: `Backend` -> `CacheBackend`
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: querier test system, ground work
See #3985 for the motivation.
This introduces a cache system for the querier which can later be
extended to support the remaining features listed in #3985 (e.g.
metrics, LRU/TTL).
All current caches are wired up to go through the new cache system. Once
we move away from (ab)using `db`, the set of caches will be different
but the system will remain.
* test: explain it
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
* refactor: simplify cache result broadcast
* refactor: introduce `Loader` crate
* fix: docs
* docs: explain why we manually drop removed hashmap entries
* docs: fix intra-doc link
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
- This is not used by the query engine at all.
- The query engine should not care about ALL chunks but only about the
chunks it gets via `QueryDatabase::chunks` (which includes a table
name and a predicate).
- All other users of that API are NOT really query-related.
- This was not actually used by the query engine.
- The query engine doesn't have a concept of a "partition", it only
cares about chunks.
- Unbounded access to all partitions in the database is quite expensive
(esp. on NG).
* refactor: wire execution context to Deduplicator
* feat: example trace to chunk read_filter
* refactor: make execution context required
* refactor: expose metadata API
* refactor: more span context for chunk read_filter
* refactor: fix build
* refactor: push context into result stream
* refactor: make executor optional
* feat: add `success` column to system.queries
* refactor: Remove lifetime from QueryCompletedToken and thread through flight
* test: update test to make incomplete query clearer
* refactor: use better pattern to set complete
* fix: logical merge conflict
Before adding more and more features, here is a bit of a clean up and
prep work:
- Pull out caching into its own module and add proper tests for it.
- Start to build a test infrastructure so tests are shorter and easier
to read. This doesn't fully pay off just yet but gets more and more
important when we actually sync tables and chunks.
* feat: skeleton of querier CLI
* chore: wrap metrics in opt&arc in querier to satisfy new api
* chore: derive debug in querier handler
* chore: add join handles and their shutdown to nascent querier server
* chore: querier server http unimpl -> 404
* fix: join/shutdown fix in querier; removed unused delegates
* feat: Add a way to run ingester with an in-memory catalog from the CLI
If `--catalog-dsn` is set to "mem", an in-memory catalog is created
instead of treating the value as a Postgres connection URL.
Planning on using this in tests, so not documenting.
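The selection logic amounts to something like the following sketch. `CatalogChoice` and `catalog_choice` are hypothetical names for illustration; the actual CLI wiring is not shown here:

```rust
/// Hypothetical result of interpreting the `--catalog-dsn` value.
#[derive(Debug, PartialEq, Eq)]
pub enum CatalogChoice {
    /// The special value "mem" selects an in-memory catalog.
    Memory,
    /// Anything else is treated as a Postgres connection URL.
    Postgres { dsn: String },
}

/// Interpret the DSN string as described above.
pub fn catalog_choice(dsn: &str) -> CatalogChoice {
    if dsn == "mem" {
        CatalogChoice::Memory
    } else {
        CatalogChoice::Postgres {
            dsn: dsn.to_string(),
        }
    }
}
```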
* fix: Set default topic to the same value as SHARED_KAFKA_TOPIC
Namely, both should use an underscore. I don't think there's a way to
directly share these values between a constant and an annotation.
* feat: Add a flight API (handshake only) to ingester
* fix: Create partitions if using file-based write buffer
* fix: Change the server fixture to handle ingester server type
For now, the ingester doesn't implement the deployment API. Not sure if
it should or not.
* feat: Start implementing ingester do_get, namely decoding the query
Skip serialization of the predicate for the moment.
* refactor: Rename ingest protos to ingester to match crate name
* refactor: Rename QueryResults to QueryData
* feat: Move ingester flight client to new querier crate
* fix: Off-by-one error, different starting indexes in sequencers
* fix: Create new CLI argument to pick the catalog type
* fix: Create a CLI option to set the number of topics to auto-create in the write buffer
* fix: Check the arrow flight service's health to tell that the ingester gRPC is up
* fix: Set postgres as the default catalog type
* fix: Return an error rather than panicking if CLI args aren't right