Commit Graph

7251 Commits (66cf34c2e27e3894d3d1b21e42b82c533d450e83)

Author SHA1 Message Date
dependabot[bot] 66cf34c2e2
chore(deps): Bump tokio-rustls from 0.23.2 to 0.23.3 (#4074)
Bumps [tokio-rustls](https://github.com/tokio-rs/tls) from 0.23.2 to 0.23.3.
- [Release notes](https://github.com/tokio-rs/tls/releases)
- [Commits](https://github.com/tokio-rs/tls/commits)

---
updated-dependencies:
- dependency-name: tokio-rustls
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-21 10:43:07 +00:00
kodiakhq[bot] c543b61f6c
Merge pull request #4079 from influxdata/crepererum/issue3934c
refactor: steps towards dynamic database type dispatch
2022-03-21 10:17:37 +00:00
Marco Neumann ca152e7934 refactor: avoid generics in `QueryDatabase`
A step to make this trait object-safe.

Ref #3934.
2022-03-21 10:45:05 +01:00
Marco Neumann 0071b85c22 refactor: make `ExecutionContextProvider` object-safe
Ref #3934.
2022-03-21 10:40:53 +01:00
dependabot[bot] 836aecc7ad
chore(deps): Bump libc from 0.2.120 to 0.2.121 (#4076)
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.120 to 0.2.121.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.120...0.2.121)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-21 09:28:39 +00:00
kodiakhq[bot] 0e5dc716e3
Merge pull request #4061 from influxdata/crepererum/issue3934b
refactor: make `QueryChunk` object-safe
2022-03-19 07:05:51 +00:00
kodiakhq[bot] 67939fb37d
Merge branch 'main' into crepererum/issue3934b 2022-03-19 06:56:30 +00:00
kodiakhq[bot] c75be65a46
Merge pull request #4067 from influxdata/dom/router-precision
feat(router2): write timestamp precision
2022-03-18 17:45:02 +00:00
kodiakhq[bot] 171dac8cd6
Merge branch 'main' into dom/router-precision 2022-03-18 17:35:56 +00:00
Paul Dix 85287abc4e
feat: add ttbr metric to ingester (#4068) 2022-03-18 17:35:30 +00:00
Dom Dwyer d40b018493 feat(router2): support lp timestamp precision
Adds support for non-default timestamp precision in HTTP writes.
2022-03-18 17:15:42 +00:00
Dom Dwyer 4c4f84871e fix: timestamp overflow applying lp precisions
Specifying a large timestamp value and a non-default precision can cause
the multiply to panic if it overflows.

This commit prevents the overflow, returning an error to the user.
2022-03-18 17:14:19 +00:00
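A minimal sketch of the overflow fix described in 4c4f84871e, assuming the precision is converted to nanoseconds with a single multiply. The `Precision` enum, `TimestampOverflow` error and `to_nanos` function are hypothetical names for illustration, not the router's actual types.

```rust
/// Hypothetical timestamp precisions; names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Precision {
    Seconds,
    Milliseconds,
    Microseconds,
    Nanoseconds,
}

impl Precision {
    /// Factor that converts a timestamp in this precision to nanoseconds.
    fn multiplier(self) -> i64 {
        match self {
            Precision::Seconds => 1_000_000_000,
            Precision::Milliseconds => 1_000_000,
            Precision::Microseconds => 1_000,
            Precision::Nanoseconds => 1,
        }
    }
}

/// Hypothetical error returned to the user instead of panicking.
#[derive(Debug, PartialEq, Eq)]
pub struct TimestampOverflow {
    pub value: i64,
    pub precision: Precision,
}

/// Scale `value` to nanoseconds, reporting overflow as an error.
/// `checked_mul` returns `None` on overflow, where a plain `*` would
/// panic in builds with overflow checks enabled.
pub fn to_nanos(value: i64, precision: Precision) -> Result<i64, TimestampOverflow> {
    value
        .checked_mul(precision.multiplier())
        .ok_or(TimestampOverflow { value, precision })
}
```

A caller would map `TimestampOverflow` to whatever user-facing error the HTTP write path uses.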
kodiakhq[bot] 950d78e749
Merge pull request #4065 from influxdata/dom/log-verbosity
refactor: reduce INFO verbosity
2022-03-18 16:08:24 +00:00
Dom Dwyer 1b26bf1d78 refactor: reduce INFO verbosity
We're currently emitting ~5GB of logs every 30 minutes, and a quick scan
through the logs shows the lines this PR changes to be the most frequent
(multiple times per second).

I don't believe any of these are important enough to be INFO, but if
needed, an appropriate log filter will bring them back.
2022-03-18 15:57:55 +00:00
Luke Bond da517bd8e2
feat: impl table & column limits in catalog (#3832)
fix: refactor table & col limit enforcement in catalog into single SQL statement

fix: borked rebase

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-18 13:54:07 +00:00
Marco Neumann 169fa2fb2f refactor: make `QueryChunk` object-safe
This makes it way easier to dyn-type database implementations. The only
real change is that we make `QueryChunk::Error` opaque. Nobody is going
to inspect that anyways, it's just printed to the user.

This is a follow-up of #4053.

Ref #3934.
2022-03-18 11:40:31 +01:00
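To illustrate the "opaque error" idea from 169fa2fb2f: a sketch, assuming the associated error type is replaced by a boxed `std::error::Error`. The `Chunk` trait and the two implementations are hypothetical stand-ins, not the real `QueryChunk`.

```rust
use std::error::Error;
use std::fmt;

/// Opaque error: callers can print it but are not expected to match on it.
pub type OpaqueError = Box<dyn Error + Send + Sync>;

/// Hypothetical chunk trait. With `type Error;` a `dyn Chunk` would have to
/// name one concrete error type shared by all implementations; the boxed
/// error removes that constraint.
pub trait Chunk {
    fn id(&self) -> u32;
    fn read(&self) -> Result<Vec<i64>, OpaqueError>;
}

#[derive(Debug)]
pub struct Unavailable;

impl fmt::Display for Unavailable {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "chunk data unavailable")
    }
}

impl Error for Unavailable {}

pub struct MemChunk {
    pub id: u32,
    pub data: Vec<i64>,
}

pub struct BrokenChunk {
    pub id: u32,
}

impl Chunk for MemChunk {
    fn id(&self) -> u32 {
        self.id
    }
    fn read(&self) -> Result<Vec<i64>, OpaqueError> {
        Ok(self.data.clone())
    }
}

impl Chunk for BrokenChunk {
    fn id(&self) -> u32 {
        self.id
    }
    fn read(&self) -> Result<Vec<i64>, OpaqueError> {
        Err(Box::new(Unavailable))
    }
}

/// Different implementations can now live in one list of trait objects.
pub fn describe(chunks: &[Box<dyn Chunk>]) -> Vec<String> {
    chunks
        .iter()
        .map(|c| match c.read() {
            Ok(rows) => format!("chunk {}: {} rows", c.id(), rows.len()),
            Err(e) => format!("chunk {}: error: {}", c.id(), e),
        })
        .collect()
}
```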
Marco Neumann a122b1e2ca
refactor: dyn-typed DB for `query_tests` (#4053)
To test the `db::Db` as well as the `querier` with the same test
framework, they require a shared interface. Ideally this interface is
dynamically typed instead of static dispatched via generics because:

- `query_tests` already take ages to compile
- we often hold a list of scenarios and a single scenario should (in a
  future PR) be able to represent both OG as well as NG

The vision here is that we basically keep the whole test setup but add
new scenarios which are NG-specific later on.

Now the issue w/ many query-related types is that they are NOT
object-safe because they have methods that don't take `&self` or they
have associated types that we cannot specify in general for OG and NG at
the same time.

So we need a bunch of wrappers that make dynamic dispatch possible. They
mostly call to an internal "interface" crate which is the actual `dyn`
part. The interface is currently only implemented for OG.

The scenarios currently also only contain OG databases. However,
creating a dynamic interface that can be used in all `query_tests` is
already a huge step.

Note that there are two places where we downcast the dynamic/abstract
database to `db::Db` again:

1. To create one scenario based on another and where we need to
   manipulate `db::Db` with OG-specific semantics.
2. `server_benchmarks`. These contain OG databases only and there is no
   point in benchmarking through the dynamic dispatch interface because
   prod (`influxdb_ioxd`) also uses static dispatch.

Ref #3934.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-18 10:11:17 +00:00
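The downcast mentioned in a122b1e2ca can be sketched roughly as follows, assuming the abstract database exposes itself as `std::any::Any`. `AbstractDb`, `OgDb` and `NgDb` are hypothetical names, and the real wrappers may well be structured differently.

```rust
use std::any::Any;

/// Hypothetical dynamic database interface used by the test scenarios.
pub trait AbstractDb {
    fn name(&self) -> &str;
    /// Escape hatch for code that needs the concrete type again.
    fn as_any(&self) -> &dyn Any;
}

/// Stand-in for the OG database type.
pub struct OgDb {
    pub name: String,
    pub chunk_count: usize,
}

/// Stand-in for a future NG database type.
pub struct NgDb {
    pub name: String,
}

impl AbstractDb for OgDb {
    fn name(&self) -> &str {
        &self.name
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl AbstractDb for NgDb {
    fn name(&self) -> &str {
        &self.name
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// OG-specific helper: returns `None` when the scenario holds another type.
pub fn og_chunk_count(db: &dyn AbstractDb) -> Option<usize> {
    db.as_any().downcast_ref::<OgDb>().map(|og| og.chunk_count)
}
```

Most test code would only use the `AbstractDb` methods; the downcast stays confined to the few OG-specific call sites.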
kodiakhq[bot] d831e22f8b
Merge pull request #4057 from influxdata/dom/ingester-metrics
feat: more ingester metrics
2022-03-17 17:31:42 +00:00
Dom Dwyer d9900f661b refactor: ingester table & namespace count metrics
Record the number of tables / namespaces an ingester process has
observed.
2022-03-17 17:20:30 +00:00
Dom Dwyer c0d5c6a559 feat: ingester pause metrics
Emit a counter metric "ingest_paused_duration_ms_total" that records the
duration of time an ingester stream is paused with millisecond
granularity.

This metric will allow us to measure the frequency and severity of, and
alert on, an ingester stopping ingest due to memory limits enforced by
the LifecycleManager. This will help us tune these config params.
2022-03-17 17:20:30 +00:00
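For the pause metric in c0d5c6a559, a rough sketch of a monotonically increasing millisecond counter. `PausedDurationMs` is a hypothetical type built on a plain atomic; the actual commit presumably goes through the project's metric registry, which is not shown here.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical counter in the spirit of "ingest_paused_duration_ms_total".
#[derive(Default)]
pub struct PausedDurationMs {
    total: AtomicU64,
}

impl PausedDurationMs {
    /// Add one completed pause to the running total.
    /// Sub-millisecond remainders are truncated by `as_millis`.
    pub fn record(&self, paused: Duration) {
        self.total
            .fetch_add(paused.as_millis() as u64, Ordering::Relaxed);
    }

    /// Total paused time observed so far, in milliseconds.
    pub fn total_ms(&self) -> u64 {
        self.total.load(Ordering::Relaxed)
    }
}
```

A caller would typically take `Instant::now()` when the stream pauses and pass `elapsed()` to `record` when it resumes.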
kodiakhq[bot] 0a2ff2eae2
Merge pull request #4052 from influxdata/dom/pg-conn-ttl
refactor: lower pg idle connection timeout
2022-03-17 16:14:22 +00:00
kodiakhq[bot] c8aeab29e7
Merge branch 'main' into dom/pg-conn-ttl 2022-03-17 16:04:33 +00:00
Luke Bond fe51c52c5f
Merge pull request #4051 from influxdata/fix/docker-server-fixture
fix: docker server fixture had mistake in run command
2022-03-17 16:03:41 +00:00
kodiakhq[bot] 1e37c43fb2
Merge branch 'main' into fix/docker-server-fixture 2022-03-17 13:52:44 +00:00
Dom Dwyer 0d4949cd1b refactor: lower pg idle connection timeout
Configure the postgres catalog to close unused connections after 1
minute, rather than 500s, to introduce a bit of fluidity to the
connection pool.
2022-03-17 13:44:59 +00:00
Marco Neumann 98c8475e3b
feat: add "dual" cache pattern (#4039)
* feat: add "dual" cache pattern

This will be useful for certain parts that are addressed internally via
ID but where the user-facing APIs use names.

For #3985.

* refactor: rework "dual" cache construct to be backend based

Pros:
- easier to reason about the locking and consistency, esp. in
  concurrent applications

Cons:
- we are not canceling running queries for the dual cache any longer

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-17 13:39:09 +00:00
kodiakhq[bot] e52d10aa99
Merge branch 'main' into fix/docker-server-fixture 2022-03-17 13:24:09 +00:00
Carol (Nichols || Goulding) cd9c483864
feat: Group files by whether they overlap in time (#4048)
Fixes #3949.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-17 13:05:18 +00:00
Marco Neumann 0850a93f20
refactor: make `QueryDatabase::chunks` async (#4047)
For OG we can determine the chunks w/o any IO, for NG however this might
require a few catalog queries.

This is likely not the last change of this sort, i.e. the whole schema
handling is currently sync as well.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-17 12:55:25 +00:00
Luke Bond f17a93ac6e fix: docker server fixture had mistake in run command 2022-03-17 12:21:42 +00:00
dependabot[bot] 58256c32ff
chore(deps): Bump syn from 1.0.88 to 1.0.89 (#4050)
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.88 to 1.0.89.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.88...1.0.89)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-17 11:30:42 +00:00
Marco Neumann dc67570e1c
feat: `OptionalValueTtlProvider` (#4040)
Quite a few caches will request data from the catalog w/o knowing if it
exists (e.g. a table by name). We should have different TTLs for "exists"
and "unknown" w/o writing much boilerplate code.

For #3985.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-16 11:56:21 +00:00
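The idea behind dc67570e1c, separate TTLs for "exists" and "unknown", might look like the sketch below. The `TtlProvider` trait and the `OptionalTtl` struct are assumptions made for illustration and do not mirror the real `OptionalValueTtlProvider` signature.

```rust
use std::time::Duration;

/// Hypothetical trait: decides how long a cached value may live.
/// `None` means the entry never expires.
pub trait TtlProvider<V> {
    fn ttl(&self, value: &V) -> Option<Duration>;
}

/// Picks one TTL for present values and another for absent ones.
pub struct OptionalTtl {
    pub ttl_some: Option<Duration>,
    pub ttl_none: Option<Duration>,
}

impl<V> TtlProvider<Option<V>> for OptionalTtl {
    fn ttl(&self, value: &Option<V>) -> Option<Duration> {
        match value {
            // The catalog confirmed the entry exists.
            Some(_) => self.ttl_some,
            // Lookup found nothing: re-check sooner.
            None => self.ttl_none,
        }
    }
}
```

With this shape, a cache for "table by name" can keep known tables indefinitely while retrying unknown names after a short interval.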
Marco Neumann 31d6a7e6b3
fix: race condition in `Cache::set` (#4038)
In theory on a multi-threaded tokio executor, the following could have
happened:

| Thread 1              | Thread 2                            |
| --------------------- | ----------------------------------- |
|                       | Running query begin                 |
|                       | ...                                 |
|                       | `loader.await` finished             |
| `Cache::set` begin    |                                     |
| state locked          |                                     |
|                       | try state lock, blocking            |
| running query removed |                                     |
| ...                   |                                     |
| state unlocked        |                                     |
| `Cache::set` end      |                                     |
|                       | state locked                        |
|                       | panic because running query is gone |

Another issue that could happen is if we:
1. issue a get request, loader takes a while, this results in task1
2. side-load data into the running query (task1 still running)
3. the underlying cache backend drops the result very quickly (task1
   still running)
4. we request the same data again, resulting in yet another query task
   (task2), task1 is still running at this point

In this case the original not-yet-finalized query task (task1) would
remove the new query task (task2) from the active query set, even
though task2 is actually not done.

We fix this by the following measures:

- **task tagging:** tasks are tagged so if two tasks for the same key
  are running, we can tell them apart
- **task->backend propagation:** let the query task only write to the
  underlying backend if it is actually sure that it is running
- **prefer side-loaded results:** restructure the query task to strongly
  prefer side-loaded data over whatever comes from the loader
- **async `Cache::set`:** Let `Cache::set` wait until a running query
  task completes. This has NO correctness implications, it's probably
  just nicer for resource management.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-16 11:36:20 +00:00
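The "task tagging" measure from 31d6a7e6b3 can be sketched synchronously as below. `RunningQueries` is a hypothetical bookkeeping type; the real cache is async and holds task handles, which this sketch leaves out.

```rust
use std::collections::HashMap;

/// Hypothetical registry of in-flight query tasks, keyed by cache key.
pub struct RunningQueries {
    next_tag: u64,
    running: HashMap<String, u64>,
}

impl RunningQueries {
    pub fn new() -> Self {
        Self {
            next_tag: 0,
            running: HashMap::new(),
        }
    }

    /// Register a new task for `key` and return its unique tag.
    pub fn start(&mut self, key: &str) -> u64 {
        let tag = self.next_tag;
        self.next_tag += 1;
        self.running.insert(key.to_string(), tag);
        tag
    }

    /// Side-loading data removes whatever task is registered for `key`.
    pub fn side_load(&mut self, key: &str) {
        self.running.remove(key);
    }

    /// Called when a task completes. Only the task whose tag is still
    /// registered may remove the entry (and write to the backend); a stale
    /// task gets `false` and must leave the newer task alone.
    pub fn finish(&mut self, key: &str, tag: u64) -> bool {
        if self.running.get(key) == Some(&tag) {
            self.running.remove(key);
            true
        } else {
            false
        }
    }

    pub fn is_running(&self, key: &str) -> bool {
        self.running.contains_key(key)
    }
}
```

Replaying the second scenario from the commit message: task1 starts, data is side-loaded, task2 starts for the same key, and task1's late `finish` returns `false` instead of evicting task2.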
Marco Neumann 1cbd878379
feat: do not attempt to store entries that will immediately expire (#4045)
Let's keep the TTL cache as clean as possible.

For #3985.
2022-03-16 11:16:46 +00:00
Paul Dix d3ea361337
feat: add ingester lifecycle metrics (#4031) 2022-03-15 19:32:58 +00:00
kodiakhq[bot] 943cc3731c
Merge pull request #4044 from influxdata/dom/compactor-shutdown
fix: propagate shutdown into CompactorHandler
2022-03-15 17:40:00 +00:00
kodiakhq[bot] 026b415eea
Merge branch 'main' into dom/compactor-shutdown 2022-03-15 17:30:03 +00:00
kodiakhq[bot] b48f8ca6f9
Merge pull request #4043 from influxdata/dom/query-shutdown
fix: propagate shutdown into QuerierHandlerImpl
2022-03-15 17:23:45 +00:00
Dom Dwyer 5c16f28a17 fix: propagate shutdown into CompactorHandler
Prior to this commit, calling shutdown() on the CompactorServerType (the
server layer run by the iox binary) would cancel its own
CancellationToken, while the CompactorHandler (the actual compaction
workload entrypoint) would be watching its own, different token.

This commit removes the redundant CancellationToken in the
CompactorServerType, instead using the inner CompactorHandler for
cancellation notification & completion.
2022-03-15 17:21:00 +00:00
Dom Dwyer 22eab934cf fix: propagate shutdown into QuerierHandlerImpl
Prior to this commit, calling shutdown() on the QuerierServer (the
server layer run by the iox binary) would cancel its own
CancellationToken, while the QuerierHandlerImpl (the actual querier
workload entrypoint) would be watching its own, different token.

This commit removes the redundant CancellationToken in the
QuerierServer, instead using the inner QuerierHandlerImpl for cancellation
notification & completion.
2022-03-15 17:10:41 +00:00
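The shutdown bug fixed in 22eab934cf and 5c16f28a17 can be reproduced in miniature. This sketch uses a hand-rolled `Token` over an `AtomicBool` as a stand-in for a real cancellation token, and `Handler`, `BuggyServer` and `FixedServer` are hypothetical names, not the IOx types.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Minimal stand-in for a cancellation token (an assumption for this
/// sketch, not the token type used in the codebase).
#[derive(Clone, Default)]
pub struct Token(Arc<AtomicBool>);

impl Token {
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// The workload entrypoint; it watches its own token.
#[derive(Default)]
pub struct Handler {
    shutdown: Token,
}

impl Handler {
    pub fn shutdown(&self) {
        self.shutdown.cancel();
    }
    pub fn is_shutting_down(&self) -> bool {
        self.shutdown.is_cancelled()
    }
}

/// Before: the server layer cancels a token nobody else is watching.
pub struct BuggyServer {
    pub handler: Arc<Handler>,
    own_token: Token,
}

impl BuggyServer {
    pub fn new(handler: Arc<Handler>) -> Self {
        Self {
            handler,
            own_token: Token::default(),
        }
    }
    pub fn shutdown(&self) {
        self.own_token.cancel(); // the handler never observes this
    }
}

/// After: the redundant token is gone and shutdown is delegated.
pub struct FixedServer {
    pub handler: Arc<Handler>,
}

impl FixedServer {
    pub fn shutdown(&self) {
        self.handler.shutdown();
    }
}
```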
kodiakhq[bot] 8c294ecb8e
Merge pull request #4042 from influxdata/dom/dyn-object-store
feat: emit object store metrics
2022-03-15 16:50:47 +00:00
Dom Dwyer 6fb1a9b592 feat(querier): enable object store metrics 2022-03-15 16:32:52 +00:00
Dom Dwyer f4d836eed7 feat(ingester): enable object store metrics 2022-03-15 16:32:52 +00:00
Dom Dwyer 65273721b6 feat(compactor): enable object store metrics 2022-03-15 16:32:52 +00:00
Dom Dwyer 5585dd3c21 refactor: switch to using DynObjectStore
Changes all consumers of the object store to use the dynamically
dispatched DynObjectStore type, instead of using a hardcoded concrete
implementation type.
2022-03-15 16:32:52 +00:00
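A rough sketch of what 5585dd3c21 describes: consumers holding a dynamically dispatched store instead of a concrete type. The two-method `ObjectStore` trait, the `InMemory` backend and the `Compactor` consumer are simplified assumptions, and the real `DynObjectStore` definition may differ.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified, hypothetical object store interface.
pub trait ObjectStore: Send + Sync {
    fn put(&self, path: &str, data: Vec<u8>);
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// Alias for the trait object; the exact definition is assumed here.
pub type DynObjectStore = dyn ObjectStore;

/// One possible backend, kept in memory for illustration.
#[derive(Default)]
pub struct InMemory {
    objects: Mutex<HashMap<String, Vec<u8>>>,
}

impl ObjectStore for InMemory {
    fn put(&self, path: &str, data: Vec<u8>) {
        self.objects.lock().unwrap().insert(path.to_string(), data);
    }
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.lock().unwrap().get(path).cloned()
    }
}

/// A consumer that no longer names the concrete backend type.
pub struct Compactor {
    store: Arc<DynObjectStore>,
}

impl Compactor {
    pub fn new(store: Arc<DynObjectStore>) -> Self {
        Self { store }
    }
    pub fn write_marker(&self) {
        self.store.put("compacted/marker", b"done".to_vec());
    }
}
```

Because the consumer only sees `Arc<DynObjectStore>`, a wrapper that implements the same trait (for example one that records metrics) could be swapped in without changing `Compactor`.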
Dom Dwyer b727d26dab refactor: path_from_dirs_and_filename trait method
Moves the path_from_dirs_and_filename from an ObjectStoreImpl method to
a trait method, completing the abstraction over all object store
backends.
2022-03-15 16:29:43 +00:00
Dom Dwyer 1d5066c421 refactor: rename ObjectStore -> ObjectStoreImpl
Frees up the name so we can use `dyn ObjectStore` throughout the
code instead of `ObjectStoreApi`.
2022-03-15 16:29:43 +00:00
Andrew Lamb 9b3f946c10
feat: all in 1 IOx NG mode (#3965)
* feat: Add all_in_one mode

* fix: doc

* docs: fix truncated docs

* refactor: correctly identify PG connections

* refactor: resolve failed merge

Co-authored-by: Dom Dwyer <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-15 16:28:37 +00:00
kodiakhq[bot] c0e0bcbb1a
Merge pull request #4041 from influxdata/dom/cargo-deny
build: manually clone advisory-db repo
2022-03-15 16:04:41 +00:00
kodiakhq[bot] 7a9618b218
Merge branch 'main' into dom/cargo-deny 2022-03-15 15:53:38 +00:00