Commit Graph

452 Commits (ac426fe5e1f7dab9d21b3f67fd0eae49b5356cdf)

Author SHA1 Message Date
Dom ffa3c39dbc
Merge branch 'main' into dom/rpcwrite-health-livelock 2023-09-11 17:37:07 +01:00
Dom Dwyer a513301e23
fix(router): health probe livelock / missing probe
When an upstream ingester goes offline, the "circuit breaker" detects it
as unhealthy, and prevents further requests being sent to it.
Periodically a small number of requests are allowed ("probe requests")
to check for recovery.

If a write request is selected as a "probe request", it SHOULD be sent -
a limited number of writes are selected as probes, and enough have to be
successful to drive recovery. If no probes are ever sent/successful, the
upstream will never be marked as healthy.

Additionally the RPC handler applies an optimisation: if the number of
ingesters selected to service a write is less than the number needed to
successfully reach the desired replication factor, no requests are sent
and an error is returned immediately, preventing unnecessary system load
for writes that would never succeed.

This optimisation conflicts with the probe request requirement when a
replication factor of >= 2 is specified:

   * All ingesters are offline
   * Write comes in
   * UpstreamSnapshot is populated with a probe request for 1 ingester
     only - no other healthy candidate ingesters exist.
   * Optimisation applied: 1 probe candidate < 2 needed for replication

This results in a probe request never being sent, and in turn, never
allowing further requests to the recovered upstream.

This fix changes the optimisation, applying it only when there are no
probes in the candidate ingester list - the write will always fail, but
it will drive detection of recovered ingesters and maintain liveness of
the system.
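
The changed short-circuit condition can be sketched as below. This is a
minimal illustration only: `Candidates` and `should_short_circuit` are
hypothetical names standing in for the UpstreamSnapshot and the RPC handler
check, and are not taken from the router code.

```rust
/// Hypothetical stand-in for the candidate set handed to the RPC layer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Candidates {
    /// Number of upstream ingesters selected for this write.
    pub len: usize,
    /// True if at least one candidate was selected as a health probe.
    pub contains_probe: bool,
}

/// Returns true if the write should be rejected without sending any request.
///
/// Before the fix (as described above): reject whenever
/// `len < replication_factor`.
/// After the fix: reject only if, additionally, no probe is present, so a
/// selected probe is always sent even though the write itself will fail.
pub fn should_short_circuit(c: Candidates, replication_factor: usize) -> bool {
    c.len < replication_factor && !c.contains_probe
}
```

With a replication factor of 2, a single probe candidate is no longer
short-circuited, while a single non-probe candidate still is.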
2023-09-11 15:24:40 +02:00
Dom Dwyer 03a15aee62
refactor: UpstreamSnapshot aware of probe requests
Allows the UpstreamSnapshot to be initialised with a "contains probe"
boolean indicator that's passed through to the RPC layer.
2023-09-11 14:06:30 +02:00
dependabot[bot] 5cd9c37519
chore(deps): Bump base64 from 0.21.3 to 0.21.4 (#8701)
Bumps [base64](https://github.com/marshallpierce/rust-base64) from 0.21.3 to 0.21.4.
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.21.3...v0.21.4)

---
updated-dependencies:
- dependency-name: base64
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-11 08:14:43 +00:00
dependabot[bot] cb2d6d1d25
chore(deps): Bump chrono from 0.4.29 to 0.4.30 (#8693)
Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.29 to 0.4.30.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.29...v0.4.30)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-08 13:03:27 +00:00
dependabot[bot] 7f20b0faa0
chore(deps): Bump bytes from 1.4.0 to 1.5.0 (#8692)
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.4.0 to 1.5.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/compare/v1.4.0...v1.5.0)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-08 12:17:12 +00:00
dependabot[bot] 4f6864c0b9
chore(deps): Bump chrono from 0.4.28 to 0.4.29 (#8677)
* chore(deps): Bump chrono from 0.4.28 to 0.4.29

Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.28 to 0.4.29.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.28...v0.4.29)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix: deprecations

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-06 09:20:58 +00:00
Dom f5bb59ded9
Merge branch 'main' into dom/memory-cache-arc 2023-09-05 14:35:01 +01:00
Dom Dwyer 021e22b5bf
refactor: remove Arc wrap from ReadThroughCache
This Arc was unnecessary in most uses.
2023-09-05 14:15:35 +02:00
Dom Dwyer 529f11e85d
refactor: remove Arc wrap from InstrumentedCache
This Arc is unnecessary in most calls.
2023-09-05 14:11:17 +02:00
Dom Dwyer b200d82d0f
docs: remove outdated cache race warning
Concurrent writes to the cache no longer overwrite each other - entries
are now merged.
2023-09-05 13:55:56 +02:00
Dom Dwyer bcdafa5f25
refactor: remove Arc wrapper from ShardedCache
This Arc wrapper is unnecessary.
2023-09-05 13:49:57 +02:00
Dom Dwyer 51096119be
refactor: remove Arc from MemoryNamespaceCache
Prior to this commit, the NamespaceCache was only implemented for
Arc<MemoryNamespaceCache> instead of the cache type itself.

In the vast majority of cases, this Arc wrapper is completely
unnecessary - it adds both runtime overhead, and code/type complexity.

This commit impls NamespaceCache for any Arc-wrapped NamespaceCache, and
removes all unnecessary Arc wrapping of the MemoryNamespaceCache.
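
A blanket impl of this shape might look like the sketch below. The trait
here is a simplified, hypothetical stand-in (string schemas, two
synchronous methods) and not the real NamespaceCache signature.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified, hypothetical stand-in for the router's NamespaceCache trait.
pub trait NamespaceCache {
    fn get_schema(&self, namespace: &str) -> Option<String>;
    fn put_schema(&self, namespace: &str, schema: String);
}

/// In-memory cache implementing the trait directly (no Arc required).
#[derive(Default)]
pub struct MemoryNamespaceCache {
    entries: Mutex<HashMap<String, String>>,
}

impl NamespaceCache for MemoryNamespaceCache {
    fn get_schema(&self, namespace: &str) -> Option<String> {
        self.entries.lock().unwrap().get(namespace).cloned()
    }

    fn put_schema(&self, namespace: &str, schema: String) {
        self.entries
            .lock()
            .unwrap()
            .insert(namespace.to_string(), schema);
    }
}

/// Blanket impl: any Arc-wrapped cache is itself a cache, delegating to the
/// inner value. Callers that need shared ownership can still use an Arc.
impl<T: NamespaceCache + ?Sized> NamespaceCache for Arc<T> {
    fn get_schema(&self, namespace: &str) -> Option<String> {
        (**self).get_schema(namespace)
    }

    fn put_schema(&self, namespace: &str, schema: String) {
        (**self).put_schema(namespace, schema)
    }
}
```

Both `MemoryNamespaceCache` and `Arc<MemoryNamespaceCache>` satisfy the
trait bound, so the Arc is only paid for where sharing is actually needed.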
2023-09-05 13:47:00 +02:00
dependabot[bot] f631b13fb0
chore(deps): Bump chrono from 0.4.27 to 0.4.28 (#8622)
Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.27 to 0.4.28.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.27...v0.4.28)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-31 09:49:55 +00:00
dependabot[bot] 4ce11fd9f2
chore(deps): Bump chrono from 0.4.26 to 0.4.27 (#8607)
* chore(deps): Bump chrono from 0.4.26 to 0.4.27

Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.26 to 0.4.27.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.26...v0.4.27)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

* fix: Update deprecated chrono methods to their now-recommended versions

`chrono::DateTime::<Tz>::from_utc` has been deprecated and it's now
recommended to use `chrono::DateTime::from_naive_utc_and_offset`
instead.

<https://github.com/chronotope/chrono/pull/1175>

Note that the `Timestamp` type in `influxdb_influxql_parser` is an alias
for `chrono::DateTime<FixedOffset>`.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-31 09:18:25 +00:00
Dom Dwyer 4414c6940b
refactor: move MST et al into module
Adds a "mst" (merkle search tree) submodule in anti_entropy, and moves
all the MST code into it.

This makes space for a gossip-based sync primitive to live here too.
2023-08-30 14:47:15 +02:00
Dom Dwyer e120935a61
fix(bench): reuse same namespace cache
And add another thread for the hashing to happen on, now that there are
two async tasks to run.
2023-08-30 12:07:15 +02:00
Dom Dwyer 38f67ae736
perf(anti_entropy): add AntiEntropyActor for MST
Separate the management of the Merkle Search Tree state into an actor
to manage concurrent access.

This moves hashing and tree updates off of the hot request path, and
into an asynchronous background process, practically eliminating
overhead for maintaining the MST structure.

This decoupling will allow convergence runs between peers to proceed
without causing contention on the lock in the request hot path.
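
The actor arrangement can be illustrated with the sketch below. It is an
assumption-laden simplification: it uses a std thread and channel rather
than an async task, a BTreeMap hashed with DefaultHasher rather than an
MST, and every name in it (`Op`, `ActorHandle`, `spawn_actor`) is
hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Messages understood by the (hypothetical) actor.
enum Op {
    /// Record that `key` now has content `value`.
    Upsert { key: String, value: String },
    /// Ask for the current content hash; the reply is sent on the channel.
    ContentHash(Sender<u64>),
}

/// Cheap handle used on the request path: sending never hashes anything.
#[derive(Clone)]
pub struct ActorHandle {
    tx: Sender<Op>,
}

impl ActorHandle {
    pub fn upsert(&self, key: &str, value: &str) {
        // A send error only means the actor has stopped; ignore it here.
        let _ = self.tx.send(Op::Upsert {
            key: key.into(),
            value: value.into(),
        });
    }

    pub fn content_hash(&self) -> Option<u64> {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.tx.send(Op::ContentHash(reply_tx)).ok()?;
        reply_rx.recv().ok()
    }
}

/// Spawns the actor. It exclusively owns the state, so no lock is needed.
pub fn spawn_actor() -> (ActorHandle, JoinHandle<()>) {
    let (tx, rx) = mpsc::channel();
    let task = thread::spawn(move || {
        let mut state: BTreeMap<String, String> = BTreeMap::new();
        // The loop ends once every handle (sender) has been dropped.
        for op in rx {
            match op {
                Op::Upsert { key, value } => {
                    state.insert(key, value);
                }
                Op::ContentHash(reply) => {
                    let mut hasher = DefaultHasher::new();
                    state.hash(&mut hasher);
                    let _ = reply.send(hasher.finish());
                }
            }
        }
    });
    (ActorHandle { tx }, task)
}
```

The request path only enqueues a message; all hashing happens on the
actor's own thread, and messages are processed in the order they were sent.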
2023-08-30 12:07:15 +02:00
Dom Dwyer 934e4bc9c6
test(bench): MerkleTree overhead
Layers the MerkleTree over the NamespaceCache stack in benchmarks to
measure the overhead.
2023-08-30 12:07:14 +02:00
Dom Dwyer 9055bc1754
ci: lint failures as warnings
Don't fail to compile / run tests because of an unreachable pub, or
missing debug impl - just emit a compiler warning.

This lets the compilation complete, but isn't accepted in PRs as CI runs
with "deny warnings".
2023-08-29 16:29:31 +02:00
Dom 15da02b59f
Merge branch 'main' into dom/merkle-cache 2023-08-29 14:21:26 +01:00
Dom Dwyer c1ba3918a4
test: cache content hash fixture
Assert the hash representing static cache content does not change.
2023-08-29 13:57:39 +02:00
Dom Dwyer 19e7c90fc1
test: compose prop::Strategy for schema generation
Accept an arbitrary ID generation strategy, composing it into a
NamespaceSchema generation strategy to simplify the call args / usage.
2023-08-29 13:47:22 +02:00
Dom Dwyer 932532c3e3
fix(bench): schema validator bench panic
The benchmark code was completely broken - running the benches would
immediately panic.
2023-08-29 12:41:47 +02:00
Dom Dwyer 124b3d2b42
test: MerkleTree cache decorator
Adds an integration test asserting the derived MST content hashes
accurately track updates to an underlying cache entry merge
implementation.

This ensures the merge implementation and the content hashes do not
become out-of-sync.
2023-08-29 12:26:42 +02:00
Dom Dwyer b694b9f494
feat(router): Merkle tree content hash for cache
Adds a (currently unused) NamespaceCache decorator that observes the
post-merge content of the cache to maintain a content hash.

This makes use of a Merkle Search Tree
(https://inria.hal.science/hal-02303490) to track the CRDT content of
the cache in a deterministic structure that allows for synchronisation
between peers (itself a CRDT).

The hash of two routers' NamespaceCache will be equal iff their cache
contents are equal - this can be used to (very cheaply) identify
out-of-sync routers, and trigger convergence. The MST structure used
here provides functionality to compare two compact MST representations,
and identify subsets of the cache that are out-of-sync, allowing for
cheap convergence.
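
The idea of comparing compact summaries to locate out-of-sync subsets can
be illustrated with a much simpler structure than an MST. The sketch below
uses a flat array of bucket hashes; it is not a Merkle Search Tree, the
bucket count and all names are hypothetical, and equality of hashes
implies equal content only barring hash collisions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Number of buckets in the compact summary (an arbitrary illustrative value).
pub const BUCKETS: usize = 8;

fn hash_of<T: Hash + ?Sized>(v: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    v.hash(&mut hasher);
    hasher.finish()
}

/// Summarises cache content (table name -> column names) as one hash per
/// bucket. Tables are assigned to buckets by the hash of their name.
pub fn summarise(content: &BTreeMap<String, Vec<String>>) -> [u64; BUCKETS] {
    let mut out = [0u64; BUCKETS];
    // BTreeMap iterates in key order, so the result does not depend on the
    // order in which entries were inserted.
    for (table, columns) in content {
        let bucket = (hash_of(table.as_str()) % BUCKETS as u64) as usize;
        // Chain the previous bucket hash with this entry.
        out[bucket] = hash_of(&(out[bucket], table, columns));
    }
    out
}

/// Returns the indices of buckets whose hashes differ between two peers.
pub fn diff(a: &[u64; BUCKETS], b: &[u64; BUCKETS]) -> Vec<usize> {
    (0..BUCKETS).filter(|&i| a[i] != b[i]).collect()
}
```

Two peers exchange only the fixed-size summaries; an empty `diff` means
nothing needs converging, and a non-empty one narrows the work to the
tables falling in the listed buckets.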

Note this content hash only covers the tables, their set of columns, and
those column schemas - this reflects the fact that only these values may
currently be converged by gossip. Future work will enable full
convergence of all fields.
2023-08-29 12:19:43 +02:00
Carol (Nichols || Goulding) 12b8095c46
feat: Upgrade to Rust 1.72.0 (#8589)
* feat: Upgrade to Rust 1.72.0

* fix: Allow a warning about an error we're intentionally creating

This is a test for an error. This lint warns that this code will cause
an error. Thanks lint, that's what we wanted!

* chore: rustfmt 1.72

* fix: Remove unnecessary hashes in raw string literals

Thanks Clippy!
https://rust-lang.github.io/rust-clippy/master/index.html#/needless_raw_string_hashes

Note that there are a number of false negatives with this lint; see
https://github.com/rust-lang/rust-clippy/issues/11420

* fix: Remove unnecessary explicit iteration

Looks like clippy::explicit_iter_loop was improved.
https://rust-lang.github.io/rust-clippy/master/index.html#/explicit_iter_loop

* fix: Allow clippy::manual_try_fold in a few places

Some of these might not be possible to rewrite with try_fold, or at
least not trivially. I don't feel confident enough to change these, in
any case. I think the lint is good to have on for future code though, so
that new code can be written with try_fold.

* fix: Remove useless creation of vectors when an array will do

Mostly in tests. Also fix some long lines.

Thanks Clippy!
https://rust-lang.github.io/rust-clippy/master/index.html#/useless_vec

* fix: Allow a single range in a vec init, which is actually what we want

Looks like Clippy's trying to catch a common mistake here, but for realz
we actually want `Vec<Range<usize>>` not `Vec<usize>`

https://rust-lang.github.io/rust-clippy/master/index.html#/single_range_in_vec_init

* fix: Remove a useless conversion

This looks like removing explicit iteration, but it's actually caught by
useless_conversion.

https://rust-lang.github.io/rust-clippy/master/index.html#/useless_conversion

* fix: Remove redundant pattern matching

Thanks Clippy!
https://rust-lang.github.io/rust-clippy/master/index.html#/redundant_pat

* fix: Allow an unwrap on a literal None in a test

This matches with the other tests better, and also when I tried to
remove the `unwrap_or_default` it changed the JSON sent from something
with an empty value to `null`, so I think the `or_default` part is
actually changing from one `None` to another `None`.

https://rust-lang.github.io/rust-clippy/master/index.html#/unnecessary_literal_unwrap
2023-08-29 05:57:38 +00:00
dependabot[bot] aae478d0f5
chore(deps): Bump base64 from 0.21.2 to 0.21.3
Bumps [base64](https://github.com/marshallpierce/rust-base64) from 0.21.2 to 0.21.3.
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.21.2...v0.21.3)

---
updated-dependencies:
- dependency-name: base64
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-08-28 09:05:14 +00:00
Dom Dwyer ee063057b3
refactor(router): use gossip_schema
Replace the bespoke schema gossip logic in the router with the reusable
gossip_schema crate.
2023-08-23 12:42:33 +02:00
Dom Dwyer d35bd48f65
refactor(gossip): rename GossipMessage
Now there's a Topic, there's no need for a giant "all message types"
enum.

As part of this shift, the gossip_message::GossipMessage used for schema
gossiping is sounding overly generic. This commit changes the name to
schema_message::SchemaMessage and updates the code.

This is a backwards-compatible change (and if anything goes wrong, the
"old" routers simply log a warning if a message is unreadable).
2023-08-22 12:06:49 +02:00
Dom Dwyer ca29e9b0d8
feat(gossip): topic support
Adds "topic" support, allowing a node to subscribe to one or more types
of application payloads independently.

A gossip node is optionally initialised with a set of topics (defaulting
to "all topics") and this set of topic interests is propagated
throughout the cluster via the usual PEX mechanism, alongside the
existing connection & identity information.

When broadcasting an application payload, the sender only transmits it
to nodes that have registered an interest in this payload type. This
prevents wasted network bandwidth and CPU for all nodes, and allows
multiple, distinct payload types to be propagated independently to keep
subsystems that rely on gossip decoupled from each other (no giant,
brittle payload enum type).
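
Sender-side topic filtering could be sketched as follows. The bitmask
encoding, the topic names, and the `Peer` / `recipients` helpers are all
assumptions made for illustration; the commit does not describe how the
gossip crate represents topic interests.

```rust
/// Hypothetical topic identifiers; each maps to one bit in an interest mask.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Topic {
    SchemaChanges = 0,
    Other = 1,
}

/// A set of topics encoded as a bitmask.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TopicInterests(u64);

impl TopicInterests {
    /// The default described above: interested in all topics.
    pub fn all() -> Self {
        Self(u64::MAX)
    }

    pub fn none() -> Self {
        Self(0)
    }

    pub fn with(self, topic: Topic) -> Self {
        Self(self.0 | (1u64 << (topic as u64)))
    }

    pub fn contains(self, topic: Topic) -> bool {
        (self.0 & (1u64 << (topic as u64))) != 0
    }
}

/// A known peer and the interests it advertised.
pub struct Peer {
    pub name: &'static str,
    pub interests: TopicInterests,
}

/// Selects the peers a payload of the given topic should be sent to.
pub fn recipients(peers: &[Peer], topic: Topic) -> Vec<&'static str> {
    peers
        .iter()
        .filter(|p| p.interests.contains(topic))
        .map(|p| p.name)
        .collect()
}
```

A peer that did not register an interest in a topic is simply skipped when
broadcasting, so it never spends bandwidth or CPU on that payload type.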
2023-08-17 14:53:40 +02:00
Fraser Savage a3ab4d33da
refactor(router): Revert configurable health-check `ERROR_WINDOW`
Configuring the `ERROR_WINDOW` of the router's on-path health check
did not provide a consistent improvement for low write volume clusters.
Now that the `NUM_PROBES` parameter is configurable, this can be
un-exposed to simplify configuration options and clean up boilerplate.
2023-08-08 17:49:14 +01:00
Dom Dwyer a017d1d7f9
test: simplify integration test setup
Remove the redundant async mutex (previously required, but I refactored
the code to make it unnecessary) and DRY the node setup.
2023-08-03 18:02:33 +02:00
Dom Dwyer 757ecc1d03
perf(router): schema gossip between peers
This commit allows schema gossiping to be enabled on router nodes.

Enabling gossiping allows any schema changes made on router A to be sent
to the N-1 other routers, populating their internal caches in
anticipation of handling a similar request.

By populating their cache, they avoid incurring a catalog lookup to
populate their local state upon a cache miss, therefore reducing request
latency, and reducing catalog load.

Enabling gossip on the routers automatically enables schema gossiping -
enabling gossip remains optional, and off by default.
2023-08-03 17:10:17 +02:00
Dom Dwyer 8928c838a8
test: schema gossip w/ default partition keys
Ensure gossiping namespace & tables with empty partition keys is
correct.
2023-08-03 17:10:16 +02:00
Dom Dwyer 00542f7041
test: schema gossip integration
Adds an integration test ensuring the schema gossip layer added to one
instance ("node A") propagates schema diffs to another ("node B").
2023-08-03 17:10:16 +02:00
Dom Dwyer 16c115d5cb
docs(router): gossip subsystem types / topology
Describes the router's schema gossiping types and how they fit together.
2023-08-03 17:10:15 +02:00
Dom Dwyer 3133318e16
refactor: remove redundant NamespaceCache impl
The NamespaceCache does not need to be a decorator itself - it can
operate using a reference to the cache without needing access to cache
requests.
2023-08-03 17:10:14 +02:00
Dom Dwyer b1cdb928f6
refactor: always log error message
Always log the actual error as it may change.
2023-08-03 16:59:06 +02:00
Dom Dwyer fc903b8102
test: preserve duplicates in column set assertions
Don't collect into a BTreeSet for sorting as it drops duplicates.
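
A small illustration of the difference, using hypothetical helper names
rather than the actual test code:

```rust
use std::collections::BTreeSet;

/// Sorts by collecting into a BTreeSet: duplicates are silently dropped.
pub fn sorted_via_set(cols: &[&str]) -> Vec<String> {
    cols.iter()
        .map(|c| c.to_string())
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}

/// Sorts a Vec in place: duplicates are preserved and can be asserted on.
pub fn sorted_keep_duplicates(cols: &[&str]) -> Vec<String> {
    let mut v: Vec<String> = cols.iter().map(|c| c.to_string()).collect();
    v.sort();
    v
}
```

For the input `["b", "a", "b"]` the set-based version yields two entries
and the Vec-based version yields three, so only the latter lets a test
notice an unexpected duplicate column.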
2023-08-03 16:56:56 +02:00
Dom Dwyer 7a4ed257a2
feat: send-side schema gossip implementation
This commit adds the SchemaChangeObserver, the delegate which is handed
a schema diff, and is responsible for computing the gossip message and
handing it off to the gossip system.

This sits between the cache layer, and the gossip layer, converting
schema things into gossip things.

This isn't connected up, so no messages will be sent.
2023-08-03 12:42:16 +02:00
Dom a32c3d0fa8
Merge branch 'main' into dom/gossip-namespace-cache 2023-08-02 16:39:32 +01:00
Fraser Savage ff207ec158
fix(router): Use BatchSize::NumIterations(1) for namespace schema cache benchmark
Batches share the same set-up step between iterations, so using a batch
size of more than 1 per setup provides inaccurate readings.
2023-08-02 13:35:55 +01:00
Dom Dwyer 10a3a048d8
feat: NamespaceSchemaGossip cache decorator
This commit adds the NamespaceSchemaGossip type, a decorator of
[`NamespaceCache`] implementations utilising peer gossiping to provide
best-effort convergence of the local cache state.

This decorator will sit in the NamespaceCache stack, allowing it to
receive incoming schema gossip messages, and update the local cache
through the regular NamespaceCache abstraction methods.

This currently implements the message handlers only - no messages are
sent yet!
2023-08-02 14:08:06 +02:00
Fraser Savage 33e4098cf8
perf(router): Add benchmark for additions to namespace schema cache
This benchmark covers two axes of performance for calls to the
namespace cache's `put_schema()` stack. These are the cost of adding
varying numbers of new columns to an existing table in the namespace, as
well as adding new tables with their own set of columns to an existing
namespace.
2023-08-02 12:45:30 +01:00
Dom Dwyer 41c9604e46
feat(router): schema gossip skeleton
Adds the supporting types required to integrate the generic gossip crate
into a schema-specific broadcast primitive.

This commit implements the two "halves":

    * GossipMessageDispatcher: async processing of incoming gossip msgs
    * Handle: the send-side handle for async sending of gossip msgs

These types are responsible for converting into/from the serialised
bytes sent over the gossip primitive into application-level / protobuf
types.
2023-08-01 17:11:09 +02:00
Fraser Savage df2c1850fb
refactor(router): Try to fix rustfmt having a nap 2023-08-01 14:51:20 +01:00
Fraser Savage e643014900
docs(router): Fix typo in circuit breaker document comment 2023-08-01 14:46:17 +01:00
Fraser Savage e4a5d2efaa
feat(router): Expose `num_probes` request count used to health-check ingesters as config option
This allows routers to be configured with the number of probe requests
that are collected in order to transition the health checker's circuit
state to healthy/unhealthy.
2023-08-01 14:21:56 +01:00
Dom Dwyer 8da08fa574
feat(router): optionally enable gossip subsystem
Allows the router to optionally enable and start the gossip subsystem
(disabled by default).

No code uses the gossip system, so no application-level messages are
exchanged, but this allows the gossip subsystem to run and exchange
control frames / perform discovery / etc.
2023-07-31 11:01:30 +02:00