Commit Graph

126 Commits (7b385600fd08165ab645b0cfe2fcd75bf47603f5)

Author SHA1 Message Date
Fraser Savage e74a7a7dd4
test(wal): Test correct assignment of write per-partition sequence numbers
This adds extra test coverage for the ingester's WAL replay & RPC write
paths, as well as the WAL E2E tests, to ensure that all sequence numbers
present in a WriteOperation/WalOperation are encoded and present when
decoded.
2023-07-05 14:42:47 +01:00
Fraser Savage e6e09d0c15
feat(ingester): Assign individual sequence numbers for writes per partition
This commit asks the oracle for a new sequence number for each table
batch of a write operation (and thus each partition of a write) when
handling an RPC write operation before appending the operation to the
WAL. The ingester now honours the sequence numbers per-partition when
WAL replay is performed.
2023-07-05 14:29:27 +01:00
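The per-partition assignment described in e6e09d0c15 can be sketched roughly as below. This is a minimal illustration rather than the ingester's actual code: `SequenceOracle`, `TableId` and `assign_sequence_numbers` are hypothetical names, and an atomic counter stands in for the real oracle.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the ingester's sequence number oracle.
#[derive(Debug, Default)]
pub struct SequenceOracle(AtomicU64);

impl SequenceOracle {
    /// Return the next sequence number; every call yields a distinct value.
    pub fn next(&self) -> u64 {
        self.0.fetch_add(1, Ordering::Relaxed)
    }
}

/// Hypothetical table identifier newtype used as the map key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TableId(pub i64);

/// Ask the oracle for one sequence number per table batch of a write,
/// instead of sharing a single number across the whole operation.
pub fn assign_sequence_numbers(
    oracle: &SequenceOracle,
    tables: &[TableId],
) -> BTreeMap<TableId, u64> {
    tables.iter().map(|&id| (id, oracle.next())).collect()
}
```

The resulting map would be built before the operation is appended to the WAL, so that replay can read back the same per-table numbers.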
Fraser Savage 30939cfe96
refactor(wal): Remove op-level `sequence_number`, use per table map
This commit removes the op-level sequence number from the proto
definition, now reading and writing solely to the per table (and thus
per partition) sequence number map. Tables/partitions within the same
write op are still assigned the same number for now, so there should be
no semantic difference.
2023-07-05 14:20:43 +01:00
dependabot[bot] b15c6062a9
chore(deps): Bump tokio from 1.28.2 to 1.29.0 (#8100)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.28.2 to 1.29.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.28.2...tokio-1.29.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-28 13:18:08 +00:00
Fraser Savage 71e47b59ab
refactor(wal): Make more use of combinators for WAL segment reading logic 2023-06-12 12:27:20 +01:00
Fraser Savage fa69994358
refactor(wal): Implement `Iterator` for ClosedSegmentFileReader
The ClosedSegmentFileReader is pretty much an iterator anyway; this
just enables using all the juicy combinators with it more easily.
2023-06-09 17:30:53 +01:00
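As an illustration of the idea in fa69994358, the sketch below implements `Iterator` for a reader of length-prefixed entries. It assumes a made-up format (a little-endian `u32` length followed by the payload) and a hypothetical `EntryReader` type; it does not reproduce the real segment file layout or the `ClosedSegmentFileReader` API.

```rust
use std::io::{self, Read};

/// Hypothetical reader of length-prefixed entries.
pub struct EntryReader<R> {
    inner: R,
}

impl<R: Read> EntryReader<R> {
    pub fn new(inner: R) -> Self {
        Self { inner }
    }

    /// Read one entry. `Ok(None)` means a clean end of file; a partially
    /// written header or payload is reported as an error.
    fn next_entry(&mut self) -> io::Result<Option<Vec<u8>>> {
        let mut header = [0u8; 4];
        let mut filled = 0;
        while filled < header.len() {
            match self.inner.read(&mut header[filled..])? {
                0 if filled == 0 => return Ok(None),
                0 => return Err(io::ErrorKind::UnexpectedEof.into()),
                n => filled += n,
            }
        }
        let mut payload = vec![0u8; u32::from_le_bytes(header) as usize];
        self.inner.read_exact(&mut payload)?;
        Ok(Some(payload))
    }
}

impl<R: Read> Iterator for EntryReader<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        // Result<Option<T>, E> -> Option<Result<T, E>>
        self.next_entry().transpose()
    }
}
```

With the trait in place, callers can use adaptors such as `map`, `filter` or `collect::<Result<Vec<_>, _>>()` directly on the reader.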
Fraser Savage fad34c375e
refactor(wal): Use TableId type for look-aside map key
This adds a little extra layer of type safety and should be optimised
by the compiler. This commit also makes sure the ingester's WAL sink
tests assert the behaviour for partitioned sequence numbering on an
operation that hits multiple tables & thus partitions.
2023-06-08 11:39:23 +01:00
Fraser Savage 7de98a6f11
refactor(wal): Associate sequence numbers to table ID in `SequencedWalOp`s
Writes are partitioned before being placed in the buffer tree. This
has the effect of splitting up the persistence of a DmlWrite's contents
and thus the persistence of data referred to by write operations placed
into a single WAL entry for a write op.

This change associates the currently assigned sequence number
with every `TableId` in the write, so that persist events for a single
write can be tracked on a per table/partition level.
Making this partial change enables a transition period where changes
can be rolled back and WAL files can still be processed.

A future change will produce a new sequence number per table
ID.
2023-06-06 17:49:09 +01:00
Fraser Savage 51d59f8216
refactor(`wal_inspect`): Make `LineProtoWriter` namespace unaware
Instead, the type responsible for initialising it handles namespaced
`Write` initialisation and management, as well as the failure paths that
may need handling. This commit introduces a `NamespaceDemultiplexer`
type with a generic implementation allowing fallible `async` lazy init
of any type from a given `NamespaceId`. This paves the way for catalog-aware
initialisation of `LineProtoWriter`s.
2023-05-26 17:12:35 +01:00
Dom Dwyer 928a4d163e
build: remove unused dependencies from crates
This commit fixes loads of crates (47!) that had unused dependencies, or
mis-configured dependencies (test deps as normal deps).

I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!

    https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html

This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
2023-05-23 14:55:43 +02:00
kodiakhq[bot] b9bcaf1aa0
Merge branch 'main' into savage/wal-regenerate-lp-cli-command 2023-05-22 16:18:44 +00:00
Dom Dwyer 0719928800
chore: remove unused deps
The wal crate imports a bunch of stuff it never uses!
2023-05-22 13:38:49 +02:00
Fraser Savage fd5d5e0758
fix(wal): Assert WriteOpDecoder handles tail-corrupt WAL files correctly
WAL read errors encountered by the new WAL WriteOpDecoder were being
discarded and presented as a "happy path" end of file to callers due
to a bug in handling a nested result type. This moves the test for
decoding a tail-corrupt WAL into the crate itself and asserts the
error is reported correctly.
2023-05-18 17:18:19 +01:00
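The failure mode fixed in fd5d5e0758 (a read error collapsing into a "happy path" end of file) can be illustrated with a small decoder that skips non-write entries but still surfaces errors. All names here (`Op`, `DecodeError`, `WriteDecoder`) are hypothetical and the types are simplified; this is a sketch of the pattern, not the `wal` crate's decoder.

```rust
/// Simplified WAL operation: only writes are of interest to the decoder.
#[derive(Debug, PartialEq)]
pub enum Op {
    Write(u64),
    Other,
}

/// Simplified decode error.
#[derive(Debug, PartialEq)]
pub struct DecodeError(pub String);

/// Hypothetical decoder yielding only the payload of write ops.
pub struct WriteDecoder<I> {
    pub inner: I,
}

impl<I> Iterator for WriteDecoder<I>
where
    I: Iterator<Item = Result<Op, DecodeError>>,
{
    type Item = Result<u64, DecodeError>;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            // `?` ends iteration only when the inner reader is exhausted.
            match self.inner.next()? {
                Ok(Op::Write(n)) => return Some(Ok(n)),
                // Skip non-write entries and keep reading.
                Ok(Op::Other) => continue,
                // Report the error; mapping it to `None` here would make a
                // corrupt tail look like a normal end of file.
                Err(e) => return Some(Err(e)),
            }
        }
    }
}
```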
Fraser Savage a4a22b2732
refactor(wal): Tidy up WriteOpEntryDecoder next() body
Refactor out op comparison for `wal` decode tests to be
more general.
2023-05-16 12:06:54 +01:00
Fraser Savage fcd80060be
feat(wal): Make `wal` WriteOpEntryDecoder an iterator
Also, implement drop on `wal_inspect`'s LineProtoWriter and
bubble up flush to the caller.
2023-05-15 20:46:56 +01:00
Fraser Savage 6cdc95e49d
refactor(wal): Use a separate DecodeError type for WriteOpEntryDecoder
Having a ginormous error enum returned for this method means that
the catch-all behaviour gets leaked into the error naming and
semantics of callers. The decoder is a new type and could benefit from
not adding to the existing error enum.
2023-05-04 12:36:23 +01:00
Fraser Savage b2e5ea2266
refactor(wal): Add test & docs for WriteOpEntryDecoder
Adds some documentation for the WriteOpEntryDecoder and
a unit test that asserts it skips over non write entries
and can continue to be consumed from.
2023-05-03 11:51:00 +01:00
Fraser Savage f6dea224e8
feat(wal): Add wal_inspect crate & a write op entry decoder
This adds a new crate with a type capable of converting
decoded WAL Write Op entries to line protocol and writing
the result to a namespaced destination. The wal crate
now exports a type which reads the sequenced wal ops and
decodes them as namespaced table batch writes.
2023-05-03 11:50:59 +01:00
dependabot[bot] bdf7f316d7
chore(deps): Bump tokio from 1.27.0 to 1.28.0 (#7667)
* chore(deps): Bump tokio from 1.27.0 to 1.28.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.27.0 to 1.28.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.27.0...tokio-1.28.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-26 12:53:26 +00:00
dependabot[bot] 850d7f7011
chore(deps): Bump regex from 1.8.0 to 1.8.1 (#7627)
Bumps [regex](https://github.com/rust-lang/regex) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/commits/1.8.1)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-24 08:25:26 +00:00
dependabot[bot] 91ecf3e820
chore(deps): Bump regex from 1.7.3 to 1.8.0 (#7616)
* chore(deps): Bump regex from 1.7.3 to 1.8.0

Bumps [regex](https://github.com/rust-lang/regex) from 1.7.3 to 1.8.0.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/commits)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-21 14:58:46 +00:00
dependabot[bot] 9cbcdc7672
chore(deps): Bump tokio from 1.26.0 to 1.27.0 (#7373)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.26.0 to 1.27.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.26.0...tokio-1.27.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-30 09:36:04 +00:00
dependabot[bot] 44551e7519
chore(deps): Bump regex from 1.7.2 to 1.7.3 (#7338)
Bumps [regex](https://github.com/rust-lang/regex) from 1.7.2 to 1.7.3.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.7.2...1.7.3)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-27 09:20:35 +00:00
dependabot[bot] 6172e6c513
chore(deps): Bump regex from 1.7.1 to 1.7.2 (#7299)
Bumps [regex](https://github.com/rust-lang/regex) from 1.7.1 to 1.7.2.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.7.1...1.7.2)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-22 14:45:20 +00:00
Dom Dwyer d9ca8f948a
refactor(wal): const SegmentId constructor
Allow a SegmentId to be constructed in a const context.
2023-03-14 17:29:15 +01:00
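A `const` constructor of the kind d9ca8f948a describes might look like the following. The wrapped `u64`, the `new`/`get` method names and the `FIRST_SEGMENT` constant are assumptions made for illustration.

```rust
/// Hypothetical segment identifier newtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SegmentId(u64);

impl SegmentId {
    /// `const fn` allows construction in const and static initialisers.
    pub const fn new(v: u64) -> Self {
        Self(v)
    }

    pub const fn get(&self) -> u64 {
        self.0
    }
}

/// Usable at compile time because `new` is a `const fn`.
pub const FIRST_SEGMENT: SegmentId = SegmentId::new(0);
```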
kodiakhq[bot] 0c530aa9f7
Merge branch 'main' into dom/wal-flusher-task-leak 2023-03-02 20:44:02 +00:00
Dom 8fe874a7f0
Merge branch 'main' into dom/record-wal-seqnum-sets 2023-03-02 15:13:36 +00:00
Dom 160e93ea48
Merge branch 'main' into dom/perf-batch-buffer-reuse 2023-03-02 10:43:06 +00:00
Dom Dwyer f3caf604b5
refactor(wal): last batch length for preallocation
There's no need to sub 1 from the batch length to shrink the buffer over
time - the capacity of the new batch will be the length of the last. A
large batch followed by a small batch will cause the pre-allocated next
batch to be small too.
2023-03-02 11:40:38 +01:00
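The sizing rule in f3caf604b5 (the new batch's capacity is the length of the last one) can be sketched as follows. `BatchBuffer` and its `u64` ops are hypothetical simplifications of the real batch type.

```rust
/// Hypothetical batch buffer that sizes each new batch from the last one.
#[derive(Debug, Default)]
pub struct BatchBuffer {
    ops: Vec<u64>,
}

impl BatchBuffer {
    pub fn push(&mut self, op: u64) {
        self.ops.push(op);
    }

    /// Capacity of the batch currently being filled.
    pub fn capacity(&self) -> usize {
        self.ops.capacity()
    }

    /// Take the current batch, leaving behind a new one pre-allocated to
    /// hold as many ops as the batch just taken.
    pub fn take_batch(&mut self) -> Vec<u64> {
        let next = Vec::with_capacity(self.ops.len());
        std::mem::replace(&mut self.ops, next)
    }
}
```

Because the capacity tracks the previous batch's length, a small batch following a large one naturally shrinks the next allocation, with no explicit decrement needed.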
Dom Dwyer 0b40e0d17c
feat(wal): SequenceNumberSet for rotated file
Changes Wal::rotate() to return the SequenceNumberSet containing the IDs
of all writes in the segment file that is rotated out.
2023-03-02 10:58:03 +01:00
Dom Dwyer b22643350f
refactor(wal): track segment sequence numbers
Changes the WAL to maintain a SequenceNumberSet containing every ID
written to the currently open segment file.

The sets are derived from batched data for efficiency, rather than
recorded per write, to prevent any overhead in the hot path. The batch
set is merged with the file set off the hot path, in a separate I/O
thread (not the async runtime).
2023-03-02 10:58:02 +01:00
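The bookkeeping described in b22643350f and 0b40e0d17c can be sketched as below: one set is derived per batch, merged into the per-file set after the flush, and handed back on rotation. A `BTreeSet<u64>` stands in for the real set representation, and `OpenSegment`, `record_batch` and `rotate` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Stand-in for the WAL's set of sequence numbers.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SequenceNumberSet(BTreeSet<u64>);

impl SequenceNumberSet {
    pub fn add(&mut self, n: u64) {
        self.0.insert(n);
    }

    /// Merge another set into this one.
    pub fn add_set(&mut self, other: &Self) {
        self.0.extend(other.0.iter().copied());
    }

    pub fn contains(&self, n: u64) -> bool {
        self.0.contains(&n)
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }
}

/// Hypothetical state for the currently open segment file.
#[derive(Debug, Default)]
pub struct OpenSegment {
    ids: SequenceNumberSet,
}

impl OpenSegment {
    /// Merge a whole batch's IDs at once, after the batch is flushed,
    /// rather than recording each write individually on the hot path.
    pub fn record_batch(&mut self, batch_ids: &SequenceNumberSet) {
        self.ids.add_set(batch_ids);
    }

    /// Rotate the segment: return the IDs of the file being closed and
    /// start an empty set for the next file.
    pub fn rotate(&mut self) -> SequenceNumberSet {
        std::mem::take(&mut self.ids)
    }
}
```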
dependabot[bot] c538cac4ef
chore(deps): Bump tokio from 1.25.0 to 1.26.0 (#7107)
* chore(deps): Bump tokio from 1.25.0 to 1.26.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.25.0 to 1.26.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.25.0...tokio-1.26.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-02 09:50:39 +00:00
Dom Dwyer a55bbebbee
perf(wal): avoid batch buffer reallocations
This change causes the WAL to pre-allocate the write batch buffer,
reducing the reallocations & copies that occur in the hot path (this
buffer can grow to be moderately large).

This should automatically size to the correct capacity and (slowly)
reduce buffer overrun.
2023-03-01 15:58:45 +01:00
Dom Dwyer 79f9411e11
fix: wal flusher task / memory leak
Although not a problem in conventional usage, leaking this task prevents
the memory used by the wal (which can be substantial) from ever being
deallocated. In turn, this prevents the WAL writer I/O thread from
stopping too.
2023-03-01 15:32:33 +01:00
Carol (Nichols || Goulding) faae5eb438 chore: Rerun cargo hakari manage-deps 2023-02-27 11:56:15 +01:00
Dom Dwyer b9f7f12c0c
perf(wal): avoid buffer allocation in writer
Eliminate buffer allocation (& growing) in the WAL file writer by
reusing a single buffer each time.

This implementation shrinks the buffer size down to 128KiB if it grows
above that amount to prevent one large write from consuming memory
forever more (128KiB should be plenty more than the common write size).
2023-02-23 18:05:06 +01:00
Dom Dwyer c180d3d8ac
perf(wal): reduce I/O syscall count
Each WAL entry is prepended by a two field header, followed by the
payload bytes. Previously a syscall was made for each header field, and
then at least one more to write the payload bytes.

This commit reduces the syscalls down to a single write call by building
the entire record in memory before calling write(). This adds 8 bytes to
the in-memory buffer size compared to prior to this commit.

This is effectively a reimplementation of a BufWriter but optimised for
our expected memory usage and (more importantly) capable of issuing the
fsync calls necessary for WAL durability.
2023-02-23 18:05:06 +01:00
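The two writer changes above (c180d3d8ac and b9f7f12c0c) can be sketched together: the whole record is built in one reusable buffer and written with a single call, and the buffer is shrunk back to 128KiB if a large write grew it. The header layout (a `u32` checksum then a `u32` length, 8 bytes in total), the big-endian encoding, the `RecordWriter` type and the toy checksum are assumptions for illustration only.

```rust
use std::io::{self, Write};

/// Soft cap on the reusable buffer, mirroring the 128KiB figure above.
const SOFT_MAX_BUF: usize = 128 * 1024;

/// Stand-in checksum (wrapping byte sum); a real WAL would use a proper CRC.
fn toy_checksum(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}

/// Hypothetical record writer that reuses one buffer across writes.
pub struct RecordWriter<W> {
    dst: W,
    buf: Vec<u8>,
}

impl<W: Write> RecordWriter<W> {
    pub fn new(dst: W) -> Self {
        Self { dst, buf: Vec::new() }
    }

    /// Build header + payload in memory, then hand it to the destination
    /// in a single `write_all` call.
    pub fn write_record(&mut self, payload: &[u8]) -> io::Result<()> {
        if payload.len() > u32::MAX as usize {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
        }
        self.buf.clear();
        self.buf.extend_from_slice(&toy_checksum(payload).to_be_bytes());
        self.buf.extend_from_slice(&(payload.len() as u32).to_be_bytes());
        self.buf.extend_from_slice(payload);
        let res = self.dst.write_all(&self.buf);
        // Do not let one large write pin a large allocation forever.
        if self.buf.capacity() > SOFT_MAX_BUF {
            self.buf.clear();
            self.buf.shrink_to(SOFT_MAX_BUF);
        }
        res
    }

    pub fn into_inner(self) -> W {
        self.dst
    }
}
```

When the destination is a `std::fs::File`, a durability-minded caller would follow the write with `sync_data()` or `sync_all()`; that step is omitted here because the sketch is generic over `Write`.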
Dom Dwyer 6d147ec008
refactor: warn! -> error! and spelling
Fix a typo, use "error" level instead of "warn".
2023-02-23 11:13:57 +01:00
Dom Dwyer e3498e3925
perf(wal): use dedicated writer I/O thread
Change the WAL buffer flusher to use a dedicated I/O thread instead of
performing serialisation & blocking file I/O on the async runtime
threads. This should reduce runtime blocking / latency variance on the
async threads.

The added overhead is 1 channel send, but this is per WAL batch of
writes (not per DML write, or worse, per file write). This impl also
amortises allocation of the serialisation buffer, rather than growing
one incrementally for each batch.
2023-02-23 11:13:56 +01:00
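The dedicated I/O thread in e3498e3925 follows a common pattern: callers send one message per batch over a channel, and a plain OS thread does the serialisation and blocking writes. The sketch below uses only the standard library; `spawn_writer` and the `Vec<Vec<u8>>` batch type are hypothetical, and the real implementation's hand-off from the async runtime is not shown.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;

/// Spawn a writer thread that owns `dst`. Each channel message is a whole
/// batch of already-encoded entries, so the cost is one send per batch.
pub fn spawn_writer<W>(
    mut dst: W,
) -> (mpsc::Sender<Vec<Vec<u8>>>, thread::JoinHandle<io::Result<W>>)
where
    W: Write + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<Vec<Vec<u8>>>();
    let handle = thread::spawn(move || -> io::Result<W> {
        // One serialisation buffer, reused for every batch.
        let mut buf = Vec::new();
        // The loop ends once every sender has been dropped.
        for batch in rx {
            buf.clear();
            for entry in &batch {
                buf.extend_from_slice(entry);
            }
            dst.write_all(&buf)?;
            dst.flush()?;
        }
        Ok(dst)
    });
    (tx, handle)
}
```

Dropping the last sender lets the thread exit, so joining the handle is enough to shut the writer down cleanly.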
Dom Dwyer c72c9d2dba
refactor: derive Debug on WAL types
Deriving debug is highly encouraged so that Result::unwrap() and friends
can print the state of an object if it is causing a panic (it's
impossible to call unwrap() otherwise!)
2023-02-23 11:13:56 +01:00
Carol (Nichols || Goulding) 30fea67701
fix: Move variables within format strings. Thanks clippy!
Changes made automatically using `cargo clippy --fix`.
2023-02-03 13:06:17 -05:00
Dom Dwyer b5ce0e4c4d
refactor: remove test-only checksum
The correctness of data checksumming is validated by the tests as a
reader property (corrupt checksum -> error), the actual value of the
checksum is irrelevant.
2023-02-03 14:26:32 +01:00
Dom Dwyer 6e6a439ef6
refactor: remove unused checksum field
This unreachable checksum is meaningless outside of the WAL
reader/writer implementations.
2023-02-03 14:23:04 +01:00
Andrew Lamb 4e650110cb
chore: reduce scope of allow_deadcode in wal (#6822)
Co-authored-by: Dom <dom@itsallbroken.com>
2023-02-03 10:33:41 +00:00
Stuart Carnie 63d0a77daf
feat: Updating to new services for all-in-one (#6811)
* feat: Updating to new services for all-in-one

* fix: Use correct shard id for ingester2

* fix: clippy

* fix: use wal directory

* fix: end to end tests

* fix: Update tracing cases for new ingest reality

* fix: update metrics test

* fix: Use rpc mode

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-02 20:42:29 +00:00
dependabot[bot] d0e6b16450
chore(deps): Bump bytes from 1.3.0 to 1.4.0
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/compare/v1.3.0...v1.4.0)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-02-01 00:30:56 +00:00
dependabot[bot] ed7d02a225
chore(deps): Bump tokio from 1.24.2 to 1.25.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.2 to 1.25.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits/tokio-1.25.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-01-30 01:57:27 +00:00
dependabot[bot] c68049c37a
chore(deps): Bump regex from 1.7.0 to 1.7.1 (#6546)
Bumps [regex](https://github.com/rust-lang/regex) from 1.7.0 to 1.7.1.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.7.0...1.7.1)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:55:41 +00:00
dependabot[bot] b49cc2e35e
chore(deps): Bump tokio from 1.24.0 to 1.24.1 (#6545)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.0 to 1.24.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:48:44 +00:00
dependabot[bot] 0aacef3c59
chore(deps): Bump once_cell from 1.16.0 to 1.17.0 (#6473)
* chore(deps): Bump once_cell from 1.16.0 to 1.17.0

Bumps [once_cell](https://github.com/matklad/once_cell) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.16.0...v1.17.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Change once_cell version specifier to major.minor for less churn

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
2023-01-02 17:07:15 +00:00