Commit Graph

10085 Commits (6f7cb5ccf05544fb3efe40b0f7148046d12f34d3)

Author SHA1 Message Date
Marco Neumann c41200536e
refactor: simplify `SeriesSet` (#6277)
`RecordBatch` offers zero-copy slicing, so there is no need to store the
row range manually. This makes #6216 simpler.
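
The idea behind zero-copy slicing can be sketched with the standard library alone. This is not the Arrow API: `Batch`, `slice`, `values` and `shares_buffer_with` are hypothetical names chosen for illustration, and a reference-counted buffer stands in for Arrow's column data. A slice only records a new offset and length over the shared buffer, so there is no separate row range to track.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a batch of rows backed by a shared buffer.
#[derive(Debug, Clone)]
pub struct Batch {
    buffer: Arc<[i64]>,
    offset: usize,
    len: usize,
}

impl Batch {
    /// Build a batch that owns its buffer (copied once, here only).
    pub fn new(values: Vec<i64>) -> Self {
        let len = values.len();
        Self { buffer: values.into(), offset: 0, len }
    }

    /// Return a view over `len` rows starting at `offset`, relative to this
    /// view. Only the reference count of the buffer changes; no rows are copied.
    pub fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        Self {
            buffer: Arc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        }
    }

    /// The rows visible through this view.
    pub fn values(&self) -> &[i64] {
        &self.buffer[self.offset..self.offset + self.len]
    }

    /// True if both views point at the same underlying allocation.
    pub fn shares_buffer_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.buffer, &other.buffer)
    }
}
```

Because the slice carries its own offset and length, a caller can pass it around in place of a (batch, row range) pair.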

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 16:48:31 +00:00
Dom Dwyer f34d4db994
docs: reference WAL cancellation safety ticket
Reference https://github.com/influxdata/influxdb_iox/issues/6281
2022-11-30 16:37:06 +01:00
Dom Dwyer d2a3d0920b
feat(ingester2): commit writes to write-ahead log
Adds WalSink, an implementation of the DmlSink trait that commits DML
operations to the write-ahead log before passing the DML op into the
decorated inner DmlSink chain.

By structuring the WAL logic as a decorator, the chain of inner DmlSink
handlers within it is only ever invoked after the WAL commit has
successfully completed, which keeps all the WAL commit code in a
singly-responsible component. This also lets us layer on the WAL commit
logic to the DML sink chain after replaying any existing WAL files,
avoiding a circular WAL mess.

The application-logic level WAL code abstracts over the underlying WAL
implementation & codec through the WalAppender trait. This decouples the
business logic from the WAL implementation both for testing purposes,
and for trying different WAL implementations in the future.
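
A minimal sketch of the decorator shape described above, under loud assumptions: the traits here are synchronous, simplified stand-ins (the signatures of `DmlSink` and `WalAppender` in the repository are not shown in this log), and `DmlOp`, `EventLog`, `RecordingWal` and `RecordingSink` are hypothetical helpers that exist only to make the ordering observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Simplified DML operation (hypothetical shape, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct DmlOp {
    pub seq: u64,
}

/// Simplified, synchronous stand-in for the DmlSink trait.
pub trait DmlSink {
    fn apply(&self, op: DmlOp) -> Result<(), String>;
}

/// Simplified stand-in for the WalAppender abstraction.
pub trait WalAppender {
    fn append(&self, op: &DmlOp) -> Result<(), String>;
}

/// Decorator: commit the op to the WAL, then hand it to the inner sink.
pub struct WalSink<T, W> {
    inner: T,
    wal: W,
}

impl<T: DmlSink, W: WalAppender> WalSink<T, W> {
    pub fn new(inner: T, wal: W) -> Self {
        Self { inner, wal }
    }
}

impl<T: DmlSink, W: WalAppender> DmlSink for WalSink<T, W> {
    fn apply(&self, op: DmlOp) -> Result<(), String> {
        // If the WAL commit fails, `?` returns early and the inner
        // chain is never invoked.
        self.wal.append(&op)?;
        self.inner.apply(op)
    }
}

/// Shared event log used by the test doubles below.
pub type EventLog = Rc<RefCell<Vec<String>>>;

/// Test double: records WAL appends, or fails when `fail` is set.
pub struct RecordingWal {
    pub log: EventLog,
    pub fail: bool,
}

impl WalAppender for RecordingWal {
    fn append(&self, op: &DmlOp) -> Result<(), String> {
        if self.fail {
            return Err(format!("wal commit failed for op {}", op.seq));
        }
        self.log.borrow_mut().push(format!("wal:{}", op.seq));
        Ok(())
    }
}

/// Test double: records every op that reaches the inner sink.
pub struct RecordingSink {
    pub log: EventLog,
}

impl DmlSink for RecordingSink {
    fn apply(&self, op: DmlOp) -> Result<(), String> {
        self.log.borrow_mut().push(format!("sink:{}", op.seq));
        Ok(())
    }
}
```

Because `WalSink` is generic over `WalAppender`, a test can swap in a recording or failing appender without touching the sink chain, which is the decoupling the message describes.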
2022-11-30 16:37:05 +01:00
Dom Dwyer c48a3b49fb
refactor(wal): rename next_ops -> next_op
It only returns one op, so remove the plural.
2022-11-30 16:37:05 +01:00
Dom Dwyer 3029146f5a
refactor(wal): remove SequenceNumberNg
This actually starts getting more confusing than passing the bare u64
around.
2022-11-30 16:37:00 +01:00
Carol (Nichols || Goulding) eafc0ea131
fix: Get the file stem rather than file name for the UUID (#6284)
Oops. Stupid mistake, behavior that should have had a test but didn't.

Fixes #6270.
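
For context, the difference between the two `std::path::Path` accessors can be shown in a few lines. The helper name `id_from_path` and the `.parquet` extension in the usage note are assumptions for illustration; the log does not show the actual code or file naming.

```rust
use std::path::Path;

/// Hypothetical helper: take the identifier from a path by using the file
/// stem (name without the final extension) rather than the full file name.
pub fn id_from_path(path: &Path) -> Option<&str> {
    path.file_stem()?.to_str()
}

/// For comparison: the full file name, extension included.
pub fn name_from_path(path: &Path) -> Option<&str> {
    path.file_name()?.to_str()
}
```

For a path such as `/data/1b4e28ba.parquet`, `file_name` yields `1b4e28ba.parquet` while `file_stem` yields `1b4e28ba`, so only the latter is a candidate for parsing as an identifier.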
2022-11-30 15:22:03 +00:00
Andrew Lamb 039a45ddd1
chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0` (#6279)
* chore: Update Datafusion and arrow/arrow-flight/parquet to `28.0.0`

* chore: Update thrift to 0.17

* fix: use workspace arrow-flight in ingester2

* chore: Update for API changes

* fix: test

* chore: Update hakari

* chore: Update hakari again

* chore: Update trace_exporters to latest thrift

* fix: update test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 14:12:30 +00:00
kodiakhq[bot] be726a8327
Merge pull request #6274 from influxdata/dom/sequence-rpc-write
feat: sequence rpc writes
2022-11-30 13:38:55 +00:00
Dom 4bddd370e9
Merge branch 'main' into dom/sequence-rpc-write
2022-11-30 13:30:59 +00:00
Marco Neumann fa6f7ee926
refactor: stream-based(TM) `to_series_and_groups`, part 3 (#6275)
* refactor: stream-based(TM) `to_series_and_groups`, part 3

* refactor: remove dead code

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 13:21:39 +00:00
Luke Bond d07658282c
feat: add router config parameter for retention (#6278)
* chore: remove unused/moved ns_autocreation dml handler

* feat(router): expose new ns retention as config

* fix: forgot to set default value for router retention arg

* chore: make new namespace retention param an option
2022-11-30 13:14:39 +00:00
Dom 8249396705
Merge branch 'main' into dom/sequence-rpc-write
2022-11-30 12:25:53 +00:00
dependabot[bot] 9356868562
chore(deps): Bump nix from 0.25.0 to 0.26.1 (#6273)
Bumps [nix](https://github.com/nix-rust/nix) from 0.25.0 to 0.26.1.
- [Release notes](https://github.com/nix-rust/nix/releases)
- [Changelog](https://github.com/nix-rust/nix/blob/master/CHANGELOG.md)
- [Commits](https://github.com/nix-rust/nix/compare/v0.25.0...v0.26.1)

---
updated-dependencies:
- dependency-name: nix
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-30 10:31:46 +00:00
dependabot[bot] f9c9e49e10
chore(deps): Bump tonic-reflection from 0.5.0 to 0.6.0 (#6271)
Bumps [tonic-reflection](https://github.com/hyperium/tonic) from 0.5.0 to 0.6.0.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: tonic-reflection
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-30 09:57:08 +00:00
Dom Dwyer 5f1635bfc5
feat(ingester2): sequence write ops
Sequence all gRPC write requests to (internally) order the resulting DML
operations.

These sequence numbers are assigned from a timestamp oracle and passed
through to the downstream DmlSink implementers.
2022-11-30 10:40:26 +01:00
Dom Dwyer ace4b7f669
feat: operation timestamp sequencer
Adds a TimestampOracle to provide an ingester-internal ordering to
incoming DmlOperations using a logical clock.
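
A logical clock of this kind can be sketched with a single atomic counter. This is an assumption-laden illustration, not the repository's implementation: the method names `new` and `next` and the bare `u64` return type are chosen here for brevity.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Sketch of a timestamp oracle: hands out strictly increasing sequence
/// numbers that give an internal ordering to incoming operations.
#[derive(Debug, Default)]
pub struct TimestampOracle(AtomicU64);

impl TimestampOracle {
    /// Start the clock so that the first value returned is `last + 1`.
    pub fn new(last: u64) -> Self {
        Self(AtomicU64::new(last))
    }

    /// Return the next sequence number. `fetch_add` is a single atomic
    /// read-modify-write, so concurrent callers never observe the same value.
    pub fn next(&self) -> u64 {
        self.0.fetch_add(1, Ordering::Relaxed) + 1
    }
}
```

Taking `&self` rather than `&mut self` lets many request handlers share one oracle without a lock.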
2022-11-30 10:40:22 +01:00
Dom f7a6be4042
Merge pull request #6269 from influxdata/dom/ingester2-init
feat(ingester2): initialise an ingester2 instance
2022-11-30 09:39:32 +00:00
Dom 8625ce4048
Merge branch 'main' into dom/ingester2-init
2022-11-30 09:30:56 +00:00
Dom a3155fb04c
Merge pull request #6272 from influxdata/dependabot/cargo/tonic-build-0.8.4
chore(deps): Bump tonic-build from 0.8.3 to 0.8.4
2022-11-30 09:30:47 +00:00
dependabot[bot] b8e6a89b9b
chore(deps): Bump tonic-build from 0.8.3 to 0.8.4
Bumps [tonic-build](https://github.com/hyperium/tonic) from 0.8.3 to 0.8.4.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.8.3...v0.8.4)

---
updated-dependencies:
- dependency-name: tonic-build
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-11-30 01:18:23 +00:00
Andrew Lamb 3d74790191
chore: update dependencies (#6267)
* chore: update dependencies

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 16:07:20 +00:00
Dom Dwyer 9648207f01
feat(ingester2): initialise an ingester2 instance
Adds a public constructor to initialise an ingester2 instance.
2022-11-29 17:05:42 +01:00
Nga Tran 55508ea794
docs: data retention (#6245)
* docs: data retention

* docs: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 15:34:23 +00:00
Dom 366e60383f
Merge pull request #6263 from influxdata/dom/buffer-tree-query
perf(ingester2): streaming buffer tree queries
2022-11-29 15:27:21 +00:00
Dom 0b1449e908
Merge branch 'main' into dom/buffer-tree-query
2022-11-29 15:06:41 +00:00
Marco Neumann 6eb13712c4
refactor: stream-based(TM) `to_series_and_groups`, part 2 (#6265)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 15:00:21 +00:00
Marco Neumann 297ea8be55
refactor: make `IOxSessionContext::exec` non-optional (#6266)
`None` was only used for testing and even then we should probably have a
proper executor instead of panicking for some methods.

Found while working on #6216.
2022-11-29 14:52:32 +00:00
Marco Neumann 514aa60f91
refactor: stream-based(TM) `to_series_and_groups`, part 1 (#6261)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 14:16:22 +00:00
Andrew Lamb fc520e0c0f
refactor: Remove unnecessary optimize_record_batch (#6262)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 13:35:46 +00:00
Marco Neumann a216c4d0f5
refactor: stream-based series-to-frame conversion (#6260)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 12:42:28 +00:00
Dom 08be4ec162
Merge branch 'main' into dom/buffer-tree-query
2022-11-29 12:36:32 +00:00
Andrew Lamb f22b1e1a09
chore: Update datafusion (to get memory limiting code) (#6246)
* chore: Update datafusion

* chore: Run cargo hakari tasks

* fix: Update to newer api

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 12:05:17 +00:00
dependabot[bot] bce4902e63
chore(deps): Bump serde from 1.0.147 to 1.0.148 (#6257)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.147 to 1.0.148.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.147...v1.0.148)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-29 11:38:07 +00:00
dependabot[bot] d1983fc385
chore(deps): Bump zstd-sys from 2.0.3+zstd.1.5.2 to 2.0.4+zstd.1.5.2 (#6253)
Bumps [zstd-sys](https://github.com/gyscos/zstd-rs) from 2.0.3+zstd.1.5.2 to 2.0.4+zstd.1.5.2.
- [Release notes](https://github.com/gyscos/zstd-rs/releases)
- [Commits](https://github.com/gyscos/zstd-rs/commits)

---
updated-dependencies:
- dependency-name: zstd-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 11:28:27 +00:00
Dom Dwyer 95216055d8
perf(ingester2): stream BufferTree partition data
This commit implements the QueryExec trait for the BufferTree, allowing it
to be queried for the partition data it contains. With this change, the
BufferTree now provides "read your writes" functionality.

Notably the implementation streams the contents of individual partitions
to the caller on demand (pull-based execution), deferring acquiring the
partition lock until actually necessary and minimising the duration of
time a strong reference to a specific RecordBatch is held in order to
minimise the memory overhead.

During query execution a client sees a consistent snapshot of
partitions: once a client begins streaming the query response, incoming
writes that create new partitions do not become visible. However,
incoming writes to an existing partition that forms part of the snapshot
set become visible iff they are ordered before the acquisition of the
partition lock when streaming that partition data to the client.
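
The visibility rules above can be sketched with a snapshot of partition handles that are locked lazily. Everything here is a simplified assumption: `BufferTree` is reduced to a list of partitions, rows are plain `u32` values, and `add_partition` and `query` are hypothetical names, not the crate's API.

```rust
use std::sync::{Arc, Mutex};

/// A partition is a lock around its buffered rows.
pub type Partition = Arc<Mutex<Vec<u32>>>;

/// Drastically simplified stand-in for the buffer tree.
#[derive(Default)]
pub struct BufferTree {
    partitions: Mutex<Vec<Partition>>,
}

impl BufferTree {
    /// Create a new partition and return a handle for later writes.
    pub fn add_partition(&self, rows: Vec<u32>) -> Partition {
        let p = Arc::new(Mutex::new(rows));
        self.partitions.lock().unwrap().push(Arc::clone(&p));
        p
    }

    /// Snapshot the *set* of partitions now, but read each partition's
    /// rows only when the caller pulls the next item.
    pub fn query(&self) -> impl Iterator<Item = Vec<u32>> {
        let snapshot: Vec<Partition> = self.partitions.lock().unwrap().clone();
        snapshot.into_iter().map(|p| {
            // The partition lock is taken here, on demand, and released
            // at the end of this statement.
            let rows = p.lock().unwrap().clone();
            rows
        })
    }
}
```

With this shape, a write to an already-snapshotted partition is visible if it lands before that partition is pulled from the iterator, while a partition created after `query` returns is not part of the response.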
2022-11-29 12:01:47 +01:00
Dom Dwyer de6f0468d8
refactor: associated QueryExec return type
Allow the return type of the QueryExec trait's query_exec() method to be
parametrised by the implementer.

This allows the trait to be reused across different data sources that
return differing concrete types.
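
A sketch of the pattern, assuming a much simpler trait than the real one: the actual `query_exec()` signature is not shown in this log, so the `table: &str` argument and the `InMemory`, `Streaming` and `count_rows` names are hypothetical.

```rust
/// Simplified stand-in: the response type is chosen by each implementer.
pub trait QueryExec {
    type Response;

    fn query_exec(&self, table: &str) -> Self::Response;
}

/// A source that returns a fully materialised response.
pub struct InMemory(pub Vec<u32>);

impl QueryExec for InMemory {
    type Response = Vec<u32>;

    fn query_exec(&self, _table: &str) -> Self::Response {
        self.0.clone()
    }
}

/// A source that returns an iterator the caller drains on demand.
pub struct Streaming(pub Vec<u32>);

impl QueryExec for Streaming {
    type Response = std::vec::IntoIter<u32>;

    fn query_exec(&self, _table: &str) -> Self::Response {
        self.0.clone().into_iter()
    }
}

/// Generic callers can still constrain the response where they need to.
pub fn count_rows<Q>(source: &Q, table: &str) -> usize
where
    Q: QueryExec,
    Q::Response: IntoIterator,
{
    source.query_exec(table).into_iter().count()
}
```

An associated type, unlike a generic parameter on the trait, gives each implementer exactly one response type, so callers never have to disambiguate which one they mean.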
2022-11-29 12:01:36 +01:00
Dom b28bba51f8
Merge pull request #6259 from influxdata/dom/streaming-queries
perf(ingester2): streaming queries
2022-11-29 11:00:56 +00:00
Dom 12d7d79e86
Merge branch 'main' into dom/streaming-queries
2022-11-29 10:52:55 +00:00
dependabot[bot] be1e5ad8c2
chore(deps): Bump syn from 1.0.103 to 1.0.104 (#6250)
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.103 to 1.0.104.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.103...1.0.104)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-29 10:43:21 +00:00
dependabot[bot] ffd63f564b
chore(deps): Bump tonic-health from 0.7.1 to 0.8.0 (#6254)
Bumps [tonic-health](https://github.com/hyperium/tonic) from 0.7.1 to 0.8.0.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.7.1...v0.8.0)

---
updated-dependencies:
- dependency-name: tonic-health
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 10:23:48 +00:00
dependabot[bot] 8224b087cc
chore(deps): Bump zstd-safe from 5.0.2+zstd.1.5.2 to 6.0.2+zstd.1.5.2 (#6248)
* chore(deps): Bump zstd-safe from 5.0.2+zstd.1.5.2 to 6.0.2+zstd.1.5.2

Bumps [zstd-safe](https://github.com/gyscos/zstd-rs) from 5.0.2+zstd.1.5.2 to 6.0.2+zstd.1.5.2.
- [Release notes](https://github.com/gyscos/zstd-rs/releases)
- [Commits](https://github.com/gyscos/zstd-rs/commits)

---
updated-dependencies:
- dependency-name: zstd-safe
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 10:17:00 +00:00
Dom 10d85696c9
Merge pull request #6256 from influxdata/dependabot/cargo/tonic-build-0.8.3
chore(deps): Bump tonic-build from 0.8.2 to 0.8.3
2022-11-29 10:09:24 +00:00
Dom Dwyer 1a379f5f16
perf(ingester2): streaming query data sourcing
Changes the query code (taken from the ingester crate) to stream data
for query execution, tidying up unnecessary Result types and removing
unnecessary indirection/boxing.

Previously the query data sourcing would collect the set of RecordBatch
for a query response during execution, prior to sending the data to the
caller. Any data that was dropped or modified during this time meant the
underlying ref-counted data could not be released from memory until all
outstanding queries referencing it completed. When faced with multiple
concurrent queries and ongoing ingest, this meant multiple copies of
data could be held in memory at any one time.

After this commit, data is streamed to the user, minimising the duration
of time a reference to specific partition data is held, and therefore
eliminating the memory overhead of holding onto all the data necessary
for a query for as long as the client takes to read the data.

When combined with an upcoming PR to stream RecordBatch out of the
BufferTree, this should provide performant query execution with minimal
memory overhead, even for a maliciously slow reading client.
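
The memory argument can be illustrated with reference counts. This is a toy model under stated assumptions: `Batch` is a ref-counted byte vector standing in for a RecordBatch, and `collect_response` and `stream_response` are hypothetical names for the "before" and "after" shapes.

```rust
use std::sync::Arc;

/// Stand-in for a ref-counted RecordBatch.
pub type Batch = Arc<Vec<u8>>;

/// Eager shape: take a reference to every batch before sending anything.
/// All batches stay pinned for as long as the response is alive.
pub fn collect_response(batches: &[Batch]) -> Vec<Batch> {
    batches.iter().map(Arc::clone).collect()
}

/// Streaming shape: take a reference to a batch only when the consumer
/// pulls it, so at most the in-flight batch is pinned by the query.
pub fn stream_response(batches: &[Batch]) -> impl Iterator<Item = Batch> + '_ {
    batches.iter().map(Arc::clone)
}
```

`Arc::strong_count` makes the difference visible: after `collect_response` every batch has an extra strong reference, whereas after one `next()` on the streaming iterator only the first batch does, and that reference goes away as soon as the consumer drops it.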
2022-11-29 11:08:06 +01:00
Dom Dwyer 2ed9780f6b
refactor(ingester2): explicit PartitionStream type
Simplify the streaming types by introducing explicitly named wrappers to
improve visibility.
2022-11-29 11:08:02 +01:00
dependabot[bot] 8129887c1f
chore(deps): Bump tonic-build from 0.8.2 to 0.8.3
Bumps [tonic-build](https://github.com/hyperium/tonic) from 0.8.2 to 0.8.3.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.8.2...v0.8.3)

---
updated-dependencies:
- dependency-name: tonic-build
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-11-29 10:02:37 +00:00
Marco Neumann 896a03fdbc
chore: update `rskafka` (#6258)
Useful because it updates `zstd` to 0.12. With the upcoming `parquet`
update, we can then drop `zstd` 0.11.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 10:00:13 +00:00
dependabot[bot] b5aa39db4b
chore(deps): Bump tonic from 0.8.2 to 0.8.3 (#6249)
Bumps [tonic](https://github.com/hyperium/tonic) from 0.8.2 to 0.8.3.
- [Release notes](https://github.com/hyperium/tonic/releases)
- [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/tonic/compare/v0.8.2...v0.8.3)

---
updated-dependencies:
- dependency-name: tonic
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-29 09:53:30 +00:00
Marco Neumann 7b6ce7da5d
refactor: clean-up and stream-based `QueryCompletedToken` handling (#6244)
* refactor: avoid channels to create a one-element stream

* refactor: move `StreamWithPermit` into its own module

* refactor: make `QueryCompletedToken` handling stream-based

For #6216.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-11-29 09:46:52 +00:00
Marco Neumann 5e64c2e4b7
refactor: make `ReadResponse` chunking stream-based (#6239)
* refactor: make `ReadResponse` chunking stream-based

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* refactor: error out on oversized frames

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-28 17:50:09 +00:00
Andrew Lamb fc5697b8e7
chore: Update datafusion again (N of N) (#6218)
* chore: Update datafusion again (4 of N)

* fix: Update plans

* fix: Update for renamed API

* fix: Update more plans

* chore: Update to datafusion @ d355f69aae2cc951cfd021e5c0b690861ba0c4ac

* fix: update explain plan tests

* fix: update test after schema error

* chore: Update datafusion again

* fix: Add size() calculation to selectors

* chore: Run cargo hakari tasks

* fix: Update newly added test

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-28 17:09:40 +00:00