Specifying a large timestamp value and a non-default precision can cause
the multiplication to panic if it overflows.
This commit prevents the overflow, returning an error to the user.
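A minimal sketch of the approach (function name and error message are illustrative, not the actual IOx code): use checked multiplication and surface the overflow as an error instead of panicking.

```rust
/// Scale a timestamp into the target precision without panicking:
/// `checked_mul` returns `None` on overflow, which we map to an error.
fn scale_timestamp(ts: i64, multiplier: i64) -> Result<i64, String> {
    ts.checked_mul(multiplier)
        .ok_or_else(|| format!("timestamp {} overflows the requested precision", ts))
}
```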
We're currently emitting ~5GB of logs every 30 minutes, and a quick scan
through the logs shows the lines this PR changes to be the most frequent
(multiple times per second).
I don't believe any of these are important enough to be INFO, but if
needed, an appropriate log filter will bring them back.
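For illustration, the change boils down to demoting the hot call sites, roughly like this (assuming `tracing`-style macros; the field and message are made up):

```rust
use tracing::debug;

fn on_hot_path_event(bytes: usize) {
    // before: info!(bytes, "flushed buffer");
    debug!(bytes, "flushed buffer"); // opt back in with a DEBUG log filter
}
```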
fix: refactor table & col limit enforcement in catalog into single SQL statement
fix: borked rebase
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This makes it way easier to dyn-type database implementations. The only
real change is that we make `QueryChunk::Error` opaque. Nobody is going
to inspect it anyway; it's just printed to the user.
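Roughly, "opaque" means the error is a boxed trait object; a sketch (the real trait has more methods):

```rust
use std::error::Error;

/// Callers can only display this error, never match on a concrete type.
type DynError = Box<dyn Error + Send + Sync + 'static>;

trait QueryChunk {
    // before, each implementation exposed its own concrete error type;
    // now they all return the same opaque one
    fn read_filter(&self) -> Result<Vec<u8>, DynError>;
}
```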
This is a follow-up of #4053.
Ref #3934.
To test `db::Db` as well as the `querier` with the same test
framework, the two need a shared interface. Ideally this interface is
dynamically typed rather than statically dispatched via generics because:
- `query_tests` already take ages to compile
- we often hold a list of scenarios and a single scenario should (in a
future PR) be able to represent both OG as well as NG
The vision here is that we basically keep the whole test setup but add
new scenarios which are NG-specific later on.
Now the issue w/ many query-related types is that they are NOT
object-safe, because they have methods that don't take `&self` or have
associated types that we cannot specify in general for OG and NG at the
same time.
So we need a bunch of wrappers that make dynamic dispatch possible. They
mostly call into an internal "interface" crate which is the actual `dyn`
part. The interface is currently only implemented for OG.
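A condensed sketch of the pattern (all names illustrative): the non-object-safe trait is adapted to an object-safe interface trait through a generic wrapper.

```rust
// Object-safe "interface" trait: only `&self` methods, no associated types.
trait DatabaseInterface {
    fn name(&self) -> String;
}

// Not usable as a bare `dyn Database`: the associated type would have
// to be fixed, and OG/NG use different chunk types.
trait Database {
    type Chunk;
    fn name(&self) -> String;
    fn chunks(&self) -> Vec<Self::Chunk>;
}

struct Wrapper<D>(D);

impl<D: Database> DatabaseInterface for Wrapper<D> {
    fn name(&self) -> String {
        self.0.name()
    }
}

/// Erase the concrete database type behind the object-safe interface.
fn as_dyn<D: Database + 'static>(db: D) -> Box<dyn DatabaseInterface> {
    Box::new(Wrapper(db))
}
```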
The scenarios currently also only contain OG databases. However,
creating a dynamic interface that can be used in all `query_tests` is
already a huge step.
Note that there are two places where we downcast the dynamic/abstract
database to `db::Db` again:
1. To create one scenario based on another, where we need to
   manipulate `db::Db` with OG-specific semantics.
2. `server_benchmarks`. These contain OG databases only and there is no
   point in benchmarking through the dynamic dispatch interface because
prod (`influxdb_ioxd`) also uses static dispatch.
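The downcast itself is the usual `Any` dance; a self-contained sketch (trait and method names are illustrative):

```rust
use std::any::Any;

struct Db; // stands in for the concrete OG `db::Db`

trait AbstractDb {
    fn as_any(&self) -> &dyn Any;
}

impl AbstractDb for Db {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Recover the concrete OG database where OG-specific semantics are needed.
fn as_og_db(db: &dyn AbstractDb) -> Option<&Db> {
    db.as_any().downcast_ref::<Db>()
}
```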
Ref #3934.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Emit a counter metric "ingest_paused_duration_ms_total" that records,
with millisecond granularity, how long an ingester stream is paused.
This metric will allow us to measure the frequency and severity of, and
alert on, an ingester stopping ingest due to memory limits enforced by
the LifecycleManager. This will help us tune these config params.
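The shape of the recording, sketched with a plain atomic instead of the actual `metric` crate types:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Instant;

static INGEST_PAUSED_DURATION_MS_TOTAL: AtomicU64 = AtomicU64::new(0);

/// Called when the stream resumes; adds the pause duration in ms.
fn record_pause(paused_at: Instant) {
    let ms = paused_at.elapsed().as_millis() as u64;
    INGEST_PAUSED_DURATION_MS_TOTAL.fetch_add(ms, Ordering::Relaxed);
}
```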
Configure the postgres catalog to close unused connections after 1
minute, rather than 500 seconds, to introduce a bit of fluidity into
how connections are acquired from the pool.
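Sketch of the relevant pool option (assuming sqlx's pool builder, which the catalog uses; the connection string is a placeholder):

```rust
use std::time::Duration;

use sqlx::postgres::PgPoolOptions;

async fn connect_catalog() -> Result<sqlx::PgPool, sqlx::Error> {
    PgPoolOptions::new()
        .idle_timeout(Duration::from_secs(60)) // previously 500s
        .connect("postgres://localhost/iox_catalog")
        .await
}
```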
* feat: add "dual" cache pattern
This will be useful for certain parts that are addressed internally via
ID but where the user-facing APIs use names.
For #3985.
* refactor: rework "dual" cache construct to be backend based
Pros:
- easier to reason about the locking and consistency, esp. in
concurrent applications
Cons:
- we no longer cancel running queries for the dual cache
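A self-contained sketch of the backend-based construct (types are illustrative): both directions live behind a single lock, so they can never diverge.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Default)]
struct DualCache {
    state: Mutex<State>,
}

#[derive(Default)]
struct State {
    by_id: HashMap<u64, String>,
    by_name: HashMap<String, u64>,
}

impl DualCache {
    /// Insert into both directions under one lock; readers never
    /// observe one side without the other.
    fn insert(&self, id: u64, name: String) {
        let mut state = self.state.lock().unwrap();
        state.by_id.insert(id, name.clone());
        state.by_name.insert(name, id);
    }

    fn id_by_name(&self, name: &str) -> Option<u64> {
        self.state.lock().unwrap().by_name.get(name).copied()
    }
}
```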
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
For OG we can determine the chunks w/o any IO; for NG, however, this
might require a few catalog queries.
This is likely not the last change of this sort, e.g. the whole schema
handling is currently sync as well.
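The signature change, sketched with illustrative names (using the async-trait crate for the object-safe async method):

```rust
use async_trait::async_trait;

struct Chunk;

#[async_trait]
trait QueryDatabase {
    // before: fn chunks(&self, table_name: &str) -> Vec<Chunk>;
    // async so the NG implementation can await catalog queries
    async fn chunks(&self, table_name: &str) -> Vec<Chunk>;
}
```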
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Quite a few caches will request data from the catalog w/o knowing if it
exists (e.g. a table by name). We should have different TTLs for "exists"
and "unknown" w/o writing much boilerplate code.
For #3985.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
In theory on a multi-threaded tokio executor, the following could have
happened:
| Thread 1 | Thread 2 |
| --------------------- | ----------------------------------- |
| | Running query begin |
| | ... |
| | `loader.await` finished |
| `Cache::set` begin | |
| state locked | |
| | try state lock, blocking |
| running query removed | |
| ... | |
| state unlocked | |
| `Cache::set` end | |
| | state locked |
| | panic because running query is gone |
Another issue that could happen is if we:
1. issue a get request, loader takes a while, this results in task1
2. side-load data into the running query (task1 still running)
3. the underlying cache backend drops the result very quickly (task1
still running)
4. we request the same data again, resulting in yet another query task
(task2), task1 is still running at this point
In this case the original not-yet-finalized query task (task1) would
remove the new query task (task2) from the active query set, even
though task2 is actually not done.
We fix this with the following measures:
- **task tagging:** tasks are tagged so that if two tasks for the same
  key are running, we can tell them apart (see the sketch after this
  list)
- **task->backend propagation:** let the query task only write to the
underlying backend if it is actually sure that it is running
- **prefer side-loaded results:** restructure the query task to strongly
prefer side-loaded data over whatever comes from the loader
- **async `Cache::set`:** let `Cache::set` wait until a running query
  task completes. This has NO correctness implications; it's probably
  just nicer for resource management.
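A self-contained sketch of the task-tagging measure (types are illustrative): only the task whose tag still matches the active entry may remove it.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Default)]
struct RunningQueries {
    /// key -> tag of the task currently responsible for it
    active: Mutex<HashMap<String, u64>>,
}

impl RunningQueries {
    fn start(&self, key: String, tag: u64) {
        self.active.lock().unwrap().insert(key, tag);
    }

    /// Remove the entry only if it still belongs to `tag`; a stale
    /// task (task1 in the example above) becomes a no-op here.
    fn finish(&self, key: &str, tag: u64) -> bool {
        let mut active = self.active.lock().unwrap();
        if active.get(key) == Some(&tag) {
            active.remove(key);
            true
        } else {
            false
        }
    }
}
```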
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Prior to this commit, calling shutdown() on the CompactorServerType (the
server layer run by the iox binary) would cancel its own
CancellationToken, while the CompactorHandler (the actual compaction
workload entrypoint) would be watching its own, different token.
This commit removes the redundant CancellationToken in the
CompactorServerType, instead using the inner CompactorHandler for
cancellation notification & completion.
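Sketched with `tokio_util::sync::CancellationToken` (struct bodies trimmed): the server type now cancels and observes the handler's token instead of owning a second one.

```rust
use tokio_util::sync::CancellationToken;

struct CompactorHandler {
    shutdown: CancellationToken,
}

struct CompactorServerType {
    handler: CompactorHandler,
    // before: shutdown: CancellationToken, // redundant, never watched
}

impl CompactorServerType {
    fn shutdown(&self) {
        // Cancel the token the compaction workload actually watches.
        self.handler.shutdown.cancel();
    }

    async fn join(&self) {
        self.handler.shutdown.cancelled().await;
    }
}
```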
Prior to this commit, calling shutdown() on the QuerierServer (the
server layer run by the iox binary) would cancel its own
CancellationToken, while the QuerierHandlerImpl (the actual querier
workload entrypoint) would be watching its own, different token.
This commit removes the redundant CancellationToken in the
QuerierServer, instead using the inner QuerierHandlerImpl for
cancellation notification & completion.
Changes all consumers of the object store to use the dynamically
dispatched DynObjectStore type, instead of using a hardcoded concrete
implementation type.
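A self-contained sketch of the pattern (the trait here is a stand-in for the real object store trait): consumers hold `Arc<DynObjectStore>` rather than a concrete store.

```rust
use std::sync::Arc;

trait ObjectStore: Send + Sync {}

type DynObjectStore = dyn ObjectStore;

// before: fn new_server(store: Arc<InMemoryObjectStore>) { ... }
fn new_server(store: Arc<DynObjectStore>) {
    let _ = store; // dynamically dispatched; backend chosen at runtime
}
```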