Commit Graph

10005 Commits (14a9bc92e99638725df15c32a8a46d4e11a09e03)

Author SHA1 Message Date
Andrew Lamb 14a9bc92e9
Revert "Revert "chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0` (#6279)" (#6294)" (#6296)
This reverts commit b7e52c0d8d.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-01 14:20:43 +00:00
kodiakhq[bot] b427213ae9
Merge pull request #6299 from influxdata/dom/more-partition-cache
refactor(ingester2): cache more hot partitions
2022-12-01 13:55:41 +00:00
kodiakhq[bot] 523b041319
Merge branch 'main' into dom/more-partition-cache 2022-12-01 13:49:09 +00:00
Marco Neumann 6cecc439d4
refactor: revert "simplify `SeriesSet` (#6277)" (#6298)
This reverts commit c41200536e.
2022-12-01 13:30:19 +00:00
Dom Dwyer 1be9ffb409
refactor(ingester2): cache more hot partitions
Now that partition cache entries are smaller, the number of entries held in
memory can be increased - this now uses ~2MiB of memory and drains the
cache during execution, amortising to 0.
2022-12-01 13:45:19 +01:00
kodiakhq[bot] 8515b770e3
Merge pull request #6293 from influxdata/dom/remove-ordering
refactor(ingester2): document reordering / remove seqnum ranges
2022-12-01 12:44:40 +00:00
kodiakhq[bot] 047bcc6e7e
Merge branch 'main' into dom/remove-ordering 2022-12-01 12:37:54 +00:00
Marco Neumann 88de327f70
Merge pull request #6295 from influxdata/crepererum/revert_dad6dee924ef71b414e4fc3b79864e454f4f7fea
refactor: revert stream-based `SeriesSetConvert::convert` interface (#6282)
2022-12-01 12:20:46 +00:00
Marco Neumann 4a8bb871dc
refactor: revert stream-based `SeriesSetConvert::convert` interface (#6282)
This reverts commit dad6dee924.
2022-12-01 12:51:56 +01:00
Andrew Lamb b7e52c0d8d
Revert "chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0` (#6279)" (#6294)
This reverts commit 039a45ddd1.
2022-12-01 11:38:42 +00:00
Carol (Nichols || Goulding) c008219692
feat: Add a feature flag to switch to the router RPC write path (#6247)
* feat: Add a feature flag to switch to the router RPC write path

Fixes #6242.

* refactor: Remove a weird arc clone/rename that's not needed

I'm sure this was needed at some point, but it doesn't make much sense.
I wasn't going to change this, but I'm now trying to minimize the
differences between this function and the write path init function, so
make this one better too.

* fix: Add the namespace autocreation to the RPC write path too

The topic/query pool don't really apply to this case, but use them
anyway to be able to use the existing catalog methods.

Also add a bunch of comments pointing out where the RPC write path
initializer and the old router's initializer are the same and where
they're different, so that perhaps it'll be easier to keep them in sync
while they both exist.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-01 11:05:39 +00:00
Dom Dwyer 6639568665
refactor: remove unimplemented!()
This lets us keep the existing test coverage for the new impl.
2022-12-01 11:35:00 +01:00
Dom Dwyer 464242ebc6
refactor: remove ordering asserts, sequence ranges
This commit removes the invariant asserts of monotonicity carried over
from the "ingester" crate - ingester2 does not define any ordering of
writes within the system.

This commit also removes the SequenceNumberRange as it is no longer
useful to indirectly check the equality of two sets of ops -
non-monotonic writes mean overlapping ranges do not guarantee a full
set of overlapping operations (gaps may be present). Likewise bounding
values (such as "min persisted") provide no useful meaning in an
out-of-order system.
2022-12-01 10:38:20 +01:00
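The point made in 464242ebc6 about ranges and gaps can be illustrated with a small sketch. It is not code from the repository: `bounds` and `example_sets` are hypothetical helpers, and a plain `BTreeSet<u64>` stands in for a set of sequenced ops. It shows two sets whose min/max bounds match even though one of them is missing operations.

```rust
use std::collections::BTreeSet;

/// Smallest and largest sequence number in a set of ops, if the set is non-empty.
/// (Hypothetical helper; a stand-in for comparing sequence number ranges.)
pub fn bounds(ops: &BTreeSet<u64>) -> Option<(u64, u64)> {
    let min = *ops.iter().next()?;
    let max = *ops.iter().next_back()?;
    Some((min, max))
}

/// Two sets of ops with identical bounds but different contents.
pub fn example_sets() -> (BTreeSet<u64>, BTreeSet<u64>) {
    // Every op from 1 to 4 is present.
    let complete: BTreeSet<u64> = vec![1, 2, 3, 4].into_iter().collect();
    // Ops 2 and 3 are absent, as can happen with out-of-order writes.
    let gappy: BTreeSet<u64> = vec![1, 4].into_iter().collect();
    (complete, gappy)
}
```

Comparing `bounds` for the two sets reports them as equal, while comparing the sets themselves does not, which is why a range check alone says little in an out-of-order system.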
Dom Dwyer 50d5e4a6f1
docs: arbitrarily reordering & sequence numbers
Document the arbitrary reordering of concurrent writes in an ingester,
and the potential divergence of WAL entries / buffered state.

Also documents that in ingester2, sequence numbers only identify writes,
not their ordering.
2022-12-01 10:38:20 +01:00
kodiakhq[bot] 7ab21ddf32
Merge pull request #6280 from influxdata/dom/wal-write
feat(ingester2): commit DML ops to WAL
2022-11-30 18:41:07 +00:00
Carol (Nichols || Goulding) b6b8e6ac10
Merge remote-tracking branch 'origin/main' into dom/wal-write 2022-11-30 13:27:28 -05:00
Carol (Nichols || Goulding) 096d850fd5
fix: Maintain WAL segment file ordering (#6287)
Rather than naming WAL files with a UUID, give them a number that
indicates the order they were created in so that they can be read back
in order.

Fixes #6227.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 17:37:37 +00:00
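As a rough illustration of the numbered naming described in 096d850fd5, the sketch below derives an ordering from file names. The zero-padded format, the `.dat` extension, and the function names are all assumptions made for the example; the commit message only says that files get a number reflecting creation order.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical naming scheme: a zero-padded number, so that lexical and
/// numeric order agree. The ".dat" extension is an assumption.
pub fn segment_file_name(id: u64) -> String {
    format!("{id:020}.dat")
}

/// Parse the numeric id back out of a segment path, if the stem is a number.
pub fn segment_id(path: &Path) -> Option<u64> {
    path.file_stem()?.to_str()?.parse().ok()
}

/// Keep only paths that look like numbered segments and sort them by id,
/// i.e. in the order they were created.
pub fn in_creation_order(mut paths: Vec<PathBuf>) -> Vec<PathBuf> {
    paths.retain(|p| segment_id(p).is_some());
    paths.sort_by_key(|p| segment_id(p));
    paths
}
```

Sorting on the parsed number rather than the raw string means unpadded names such as `2.dat` and `10.dat` still come back in the right order.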
Marco Neumann dad6dee924
refactor: stream-based `SeriesSetConvert::convert` interface (#6282)
Change the interface of `SeriesSetConvert::convert` to be stream-based.
This is the final interface-prep step before actually implementing #6216.
2022-11-30 17:12:54 +00:00
kodiakhq[bot] 6b83b3d9ea
Merge branch 'main' into dom/wal-write 2022-11-30 17:10:31 +00:00
Carol (Nichols || Goulding) f326baa5d0
test: Update to correctly expect old open files are closed on replay 2022-11-30 12:09:36 -05:00
Marco Neumann c41200536e
refactor: simplify `SeriesSet` (#6277)
`RecordBatch` offers zero-copy slicing, so there is no need to store the
row range manually. This makes #6216 simpler.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 16:48:31 +00:00
Dom Dwyer f34d4db994
docs: reference WAL cancellation safety ticket
Reference https://github.com/influxdata/influxdb_iox/issues/6281
2022-11-30 16:37:06 +01:00
Dom Dwyer d2a3d0920b
feat(ingester2): commit writes to write-ahead log
Adds WalSink, an implementation of the DmlSink trait that commits DML
operations to the write-ahead log before passing the DML op into the
decorated inner DmlSink chain.

By structuring the WAL logic as a decorator, the chain of inner DmlSink
handlers within it is only ever invoked after the WAL commit has
successfully completed, which keeps all the WAL commit code in a
singly-responsible component. This also lets us layer the WAL commit
logic onto the DML sink chain after replaying any existing WAL files,
avoiding a circular WAL mess.

The application-logic level WAL code abstracts over the underlying WAL
implementation & codec through the WalAppender trait. This decouples the
business logic from the WAL implementation both for testing purposes,
and for trying different WAL implementations in the future.
2022-11-30 16:37:05 +01:00
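The decorator arrangement described in d2a3d0920b can be sketched roughly as follows. This is a simplified, synchronous illustration and not the repository's code: the trait signatures, the `DmlOp` struct, the `String` error type, and the in-memory `MemWal` and `BufferSink` doubles are all assumptions. Only the names WalSink, DmlSink and WalAppender come from the commit message.

```rust
/// Hypothetical stand-in for a DML operation.
#[derive(Debug, Clone, PartialEq)]
pub struct DmlOp(pub String);

/// Simplified, synchronous stand-in for the DmlSink trait.
pub trait DmlSink {
    fn apply(&mut self, op: DmlOp) -> Result<(), String>;
}

/// Abstraction over the underlying WAL implementation and codec.
pub trait WalAppender {
    fn append(&mut self, op: &DmlOp) -> Result<(), String>;
}

/// Decorator: commit the op to the WAL, then pass it to the inner sink.
pub struct WalSink<T, W> {
    pub inner: T,
    pub wal: W,
}

impl<T: DmlSink, W: WalAppender> WalSink<T, W> {
    pub fn new(inner: T, wal: W) -> Self {
        Self { inner, wal }
    }
}

impl<T: DmlSink, W: WalAppender> DmlSink for WalSink<T, W> {
    fn apply(&mut self, op: DmlOp) -> Result<(), String> {
        // If the WAL commit fails, return early: the inner chain never runs.
        self.wal.append(&op)?;
        self.inner.apply(op)
    }
}

/// In-memory WAL double that can be told to fail.
#[derive(Default)]
pub struct MemWal {
    pub entries: Vec<DmlOp>,
    pub fail: bool,
}

impl WalAppender for MemWal {
    fn append(&mut self, op: &DmlOp) -> Result<(), String> {
        if self.fail {
            return Err("wal unavailable".to_string());
        }
        self.entries.push(op.clone());
        Ok(())
    }
}

/// Inner sink double that records every op it is given.
#[derive(Default)]
pub struct BufferSink {
    pub applied: Vec<DmlOp>,
}

impl DmlSink for BufferSink {
    fn apply(&mut self, op: DmlOp) -> Result<(), String> {
        self.applied.push(op);
        Ok(())
    }
}
```

Because the WAL sits behind a trait, a test can swap in `MemWal` and check that a failed append leaves the inner sink untouched. Wrapping the chain only after replay would likewise be a matter of constructing `WalSink::new(...)` once replay has finished.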
Dom Dwyer c48a3b49fb
refactor(wal): rename next_ops -> next_op
It only returns one op, so remove the plural.
2022-11-30 16:37:05 +01:00
Dom Dwyer 3029146f5a
refactor(wal): remove SequenceNumberNg
This actually starts getting more confusing than passing the bare u64
around.
2022-11-30 16:37:00 +01:00
Carol (Nichols || Goulding) eafc0ea131
fix: Get the file stem rather than file name for the UUID (#6284)
Oops. Stupid mistake, behavior that should have had a test but didn't.

Fixes #6270.
2022-11-30 15:22:03 +00:00
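For context on eafc0ea131, the difference between a file name and a file stem in Rust's standard library looks like this. The helper names are hypothetical, and the UUID and `.dat` extension in the note below are made up for illustration.

```rust
use std::path::Path;

/// Final path component, including any extension.
pub fn file_name_str(path: &Path) -> Option<&str> {
    path.file_name()?.to_str()
}

/// Final path component with the extension stripped.
pub fn file_stem_str(path: &Path) -> Option<&str> {
    path.file_stem()?.to_str()
}
```

For a path such as `wal/6f9619ff-8b86-d011-b42d-00c04fc964ff.dat`, the file name still carries the `.dat` suffix and so would not parse as a UUID, whereas the stem is the bare identifier.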
Andrew Lamb 039a45ddd1
chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0` (#6279)
* chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0`

* chore: Update thrift to 0.17

* fix: use workspace arrow-flight in ingester2

* chore: Update for API changes

* fix: test

* chore: Update hakari

* chore: Update hakari again

* chore: Update trace_exporters to latest thrift

* fix: update test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 14:12:30 +00:00
kodiakhq[bot] be726a8327
Merge pull request #6274 from influxdata/dom/sequence-rpc-write
feat: sequence rpc writes
2022-11-30 13:38:55 +00:00
Dom 4bddd370e9
Merge branch 'main' into dom/sequence-rpc-write 2022-11-30 13:30:59 +00:00
Marco Neumann fa6f7ee926
refactor: stream-based(TM) `to_series_and_groups`, part 3 (#6275)
* refactor: stream-based(TM) `to_series_and_groups`, part 3

* refactor: remove dead code

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 13:21:39 +00:00
Luke Bond d07658282c
feat: add router config parameter for retention (#6278)
* chore: remove unused/moved ns_autocreation dml handler

* feat(router): expose new ns retention as config

* fix: forgot to set default value for router retention arg

* chore: make new namespace retention param an option
2022-11-30 13:14:39 +00:00
Dom 8249396705
Merge branch 'main' into dom/sequence-rpc-write 2022-11-30 12:25:53 +00:00
dependabot[bot] 9356868562
chore(deps): Bump nix from 0.25.0 to 0.26.1 (#6273)
Bumps [nix](https://github.com/nix-rust/nix) from 0.25.0 to 0.26.1.
- [Release notes](https://github.com/nix-rust/nix/releases)
- [Changelog](https://github.com/nix-rust/nix/blob/master/CHANGELOG.md)
- [Commits](https://github.com/nix-rust/nix/compare/v0.25.0...v0.26.1)

---
updated-dependencies:
- dependency-name: nix
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-30 10:31:46 +00:00
dependabot[bot] f9c9e49e10
chore(deps): Bump tonic-reflection from 0.5.0 to 0.6.0 (#6271)
Bumps [tonic-reflection](https://github.com/hyperium/tonic) from 0.5.0 to 0.6.0.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: tonic-reflection
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 09:57:08 +00:00
Dom Dwyer 5f1635bfc5
feat(ingester2): sequence write ops
Sequence all gRPC write requests to (internally) order the resulting DML
operations.

These sequence numbers are assigned from a timestamp oracle and passed
through to the downstream DmlSink implementers.
2022-11-30 10:40:26 +01:00
Dom Dwyer ace4b7f669
feat: operation timestamp sequencer
Adds a TimestampOracle to provide an ingester-internal ordering to
incoming DmlOperations using a logical clock.
2022-11-30 10:40:22 +01:00
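A logical clock of the kind ace4b7f669 describes might look something like the sketch below. It assumes an atomic counter; the constructor argument and the `next_seq` method name are hypothetical, and the actual TimestampOracle may differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Minimal logical clock: hands out strictly increasing u64 values.
/// Overflow handling is out of scope for this sketch.
#[derive(Debug, Default)]
pub struct TimestampOracle(AtomicU64);

impl TimestampOracle {
    /// Start the clock so that the first value returned is `last_value + 1`.
    pub fn new(last_value: u64) -> Self {
        Self(AtomicU64::new(last_value))
    }

    /// Return the next value. `fetch_add` is atomic, so concurrent callers
    /// each observe a distinct value.
    pub fn next_seq(&self) -> u64 {
        self.0.fetch_add(1, Ordering::Relaxed) + 1
    }
}
```

A counter like this gives each operation a unique identifier within one process. As 50d5e4a6f1 notes, in ingester2 such numbers identify writes rather than define the order in which they are applied.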
Dom f7a6be4042
Merge pull request #6269 from influxdata/dom/ingester2-init
feat(ingester2): initialise an ingester2 instance
2022-11-30 09:39:32 +00:00
Dom 8625ce4048
Merge branch 'main' into dom/ingester2-init 2022-11-30 09:30:56 +00:00
Dom a3155fb04c
Merge pull request #6272 from influxdata/dependabot/cargo/tonic-build-0.8.4
chore(deps): Bump tonic-build from 0.8.3 to 0.8.4
2022-11-30 09:30:47 +00:00
dependabot[bot] b8e6a89b9b
chore(deps): Bump tonic-build from 0.8.3 to 0.8.4
Bumps [tonic-build](https://github.com/hyperium/tonic) from 0.8.3 to 0.8.4.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.8.3...v0.8.4)

---
updated-dependencies:
- dependency-name: tonic-build
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-11-30 01:18:23 +00:00
Andrew Lamb 3d74790191
chore: update dependencies (#6267)
* chore: update dependencies

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 16:07:20 +00:00
Dom Dwyer 9648207f01
feat(ingester2): initialise an ingester2 instance
Adds a public constructor to initialise an ingester2 instance.
2022-11-29 17:05:42 +01:00
Nga Tran 55508ea794
docs: data retention (#6245)
* docs: data retention

* docs: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 15:34:23 +00:00
Dom 366e60383f
Merge pull request #6263 from influxdata/dom/buffer-tree-query
perf(ingester2): streaming buffer tree queries
2022-11-29 15:27:21 +00:00
Dom 0b1449e908
Merge branch 'main' into dom/buffer-tree-query 2022-11-29 15:06:41 +00:00
Marco Neumann 6eb13712c4
refactor: stream-based(TM) `to_series_and_groups`, part 2 (#6265)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 15:00:21 +00:00
Marco Neumann 297ea8be55
refactor: make `IOxSessionContext::exec` non-optional (#6266)
`None` was only used for testing and even then we should probably have a
proper executor instead of panicking for some methods.

Found while working on #6216.
2022-11-29 14:52:32 +00:00
Marco Neumann 514aa60f91
refactor: stream-based(TM) `to_series_and_groups`, part 1 (#6261)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 14:16:22 +00:00
Andrew Lamb fc520e0c0f
refactor: Remove unnecessary optimize_record_batch (#6262)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 13:35:46 +00:00
Marco Neumann a216c4d0f5
refactor: stream-based series-to-frame conversion (#6260)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 12:42:28 +00:00