Dom Dwyer
f3caf604b5
refactor(wal): last batch length for preallocation
...
There's no need to subtract 1 from the batch length to shrink the buffer
over time: the capacity of the new batch will be the length of the last,
so a large batch followed by a small batch will cause the pre-allocated
next batch to be small too.
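A minimal sketch of the sizing rule described above, assuming a hypothetical `BatchBuffer` type holding `u64` placeholders (neither is taken from the WAL code): each new batch is pre-allocated with the length of the batch it replaces, so the capacity follows the most recent batch size in both directions.

```rust
/// Hypothetical stand-in for the WAL write batch buffer (illustration only).
pub struct BatchBuffer {
    batch: Vec<u64>,
}

impl BatchBuffer {
    pub fn new() -> Self {
        Self { batch: Vec::new() }
    }

    /// Append one (placeholder) write to the current batch.
    pub fn push(&mut self, op: u64) {
        self.batch.push(op);
    }

    /// Take the current batch, leaving behind an empty buffer that is
    /// pre-allocated to hold as many entries as the batch just taken.
    pub fn take_batch(&mut self) -> Vec<u64> {
        let next = Vec::with_capacity(self.batch.len());
        std::mem::replace(&mut self.batch, next)
    }

    /// Capacity of the buffer that will receive the next batch.
    pub fn capacity(&self) -> usize {
        self.batch.capacity()
    }
}
```

Because the capacity is taken from the previous length, no explicit decrement is needed for the buffer to shrink after a burst.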
2023-03-02 11:40:38 +01:00
Dom Dwyer
0b40e0d17c
feat(wal): SequenceNumberSet for rotated file
...
Changes Wal::rotate() to return the SequenceNumberSet containing the IDs
of all writes in the segment file that is rotated out.
2023-03-02 10:58:03 +01:00
Dom Dwyer
b22643350f
refactor(wal): track segment sequence numbers
...
Changes the WAL to maintain a SequenceNumberSet containing every ID
written to the currently open segment file.
The sets are derived from batched data for efficiency, rather than
recorded per write, to prevent any overhead in the hot path. The batch
set is merged with the file set off the hot path, in a separate I/O
thread (not the async runtime).
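A rough sketch of the batching idea, under stated assumptions: plain `u64` IDs and a `BTreeSet` stand in for the real sequence number types, and `merge_batches_off_hot_path` is a hypothetical helper, not a WAL function. The writer only hands over whole batches; a dedicated thread derives one set per batch and merges it into the per-file set.

```rust
use std::collections::BTreeSet;
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: feed batches of write IDs to a dedicated thread,
/// which builds one ID set per batch and merges it into the segment set.
pub fn merge_batches_off_hot_path(batches: Vec<Vec<u64>>) -> BTreeSet<u64> {
    let (tx, rx) = mpsc::channel::<Vec<u64>>();

    // Stand-in for the WAL I/O thread: a plain OS thread, not an async task.
    let io_thread = thread::spawn(move || {
        let mut segment_ids = BTreeSet::new();
        for batch in rx {
            // Derive the set once per batch, not once per write.
            let batch_ids: BTreeSet<u64> = batch.into_iter().collect();
            segment_ids.extend(batch_ids);
        }
        segment_ids
    });

    // The "hot path" only sends the batch; it does no set bookkeeping.
    for batch in batches {
        tx.send(batch).expect("I/O thread is running");
    }
    drop(tx); // Close the channel so the I/O thread's loop ends.

    io_thread.join().expect("I/O thread panicked")
}
```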
2023-03-02 10:58:02 +01:00
Dom Dwyer
6aa33ef380
feat: impl FromIterator for SequenceNumberSet
...
Allow a SequenceNumberSet to be instantiated from an iterator of
SequenceNumber.
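For illustration, a `FromIterator` implementation of this shape could look like the following. The type definitions here are simplified stand-ins backed by a `BTreeSet` (an assumption; the real types' internals are not shown in this log), and `new`, `len` and `contains` are hypothetical helpers.

```rust
use std::collections::BTreeSet;
use std::iter::FromIterator;

/// Simplified stand-in for the real SequenceNumber type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct SequenceNumber(u64);

impl SequenceNumber {
    pub fn new(v: u64) -> Self {
        Self(v)
    }
}

/// Simplified stand-in for the real SequenceNumberSet type.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct SequenceNumberSet(BTreeSet<SequenceNumber>);

impl SequenceNumberSet {
    pub fn len(&self) -> usize {
        self.0.len()
    }

    pub fn contains(&self, n: SequenceNumber) -> bool {
        self.0.contains(&n)
    }
}

impl FromIterator<SequenceNumber> for SequenceNumberSet {
    fn from_iter<T: IntoIterator<Item = SequenceNumber>>(iter: T) -> Self {
        // Duplicates collapse because the backing store is a set.
        Self(iter.into_iter().collect())
    }
}
```

With this in place, a set can be built with `.collect()` on any iterator of `SequenceNumber`.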
2023-03-02 10:58:02 +01:00
Dom Dwyer
6532fb752b
feat: impl Extend for SequenceNumberSet
...
Allow a SequenceNumberSet to be efficiently extended from any iterator
of SequenceNumber instances.
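As a sketch only, an `Extend` implementation might delegate to the backing collection. The types below are simplified stand-ins backed by a `BTreeSet` (an assumption, not the actual representation), and `new` and `len` are hypothetical helpers.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for the real SequenceNumber type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct SequenceNumber(u64);

impl SequenceNumber {
    pub fn new(v: u64) -> Self {
        Self(v)
    }
}

/// Simplified stand-in for the real SequenceNumberSet type.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct SequenceNumberSet(BTreeSet<SequenceNumber>);

impl SequenceNumberSet {
    pub fn len(&self) -> usize {
        self.0.len()
    }
}

impl Extend<SequenceNumber> for SequenceNumberSet {
    fn extend<T: IntoIterator<Item = SequenceNumber>>(&mut self, iter: T) {
        // Insert in bulk via the backing set; existing IDs are ignored.
        self.0.extend(iter);
    }
}
```

Callers can then grow an existing set in place with `set.extend(iter)` instead of inserting IDs one at a time.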
2023-03-02 10:58:01 +01:00
dependabot[bot]
c538cac4ef
chore(deps): Bump tokio from 1.25.0 to 1.26.0 ( #7107 )
...
* chore(deps): Bump tokio from 1.25.0 to 1.26.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.25.0 to 1.26.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.25.0...tokio-1.26.0)
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-02 09:50:39 +00:00
dependabot[bot]
06b2c7a329
chore(deps): Bump sqlparser from 0.30.0 to 0.31.0 ( #7108 )
...
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.30.0 to 0.31.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.30.0...v0.31.0)
---
updated-dependencies:
- dependency-name: sqlparser
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-03-02 09:43:43 +00:00
Marco Neumann
c95d078e46
feat: add `NestedUnion` opt ( #7092 )
...
* docs: typo
* feat: add `NestedUnion` opt
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-02 09:09:05 +00:00
kodiakhq[bot]
f3267f992a
Merge pull request #7102 from influxdata/cn/share-normalization
...
fix: Use the same normalization code for explain tests as e2e tests do
2023-03-02 03:18:51 +00:00
kodiakhq[bot]
f730991602
Merge branch 'main' into cn/share-normalization
2023-03-02 03:12:16 +00:00
Nga Tran
04ee075a73
chore: remove folder that was accidentally added by VS Code ( #7106 )
2023-03-02 02:43:13 +00:00
Christopher M. Wolff
d1a54cf0d4
feat: allow no lower bound gap fill implementation ( #7104 )
...
* feat: allow no lower bound gap fill implementation
* chore: clippy
* refactor: code review feedback
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 23:32:57 +00:00
Nga Tran
c8b3827b20
test(compactor2): end-to-end data-tests with large overlap files ( #7103 )
...
* test: end-to-end data-tests with large overlap files
* chore: address review comments
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 21:48:33 +00:00
Carol (Nichols || Goulding)
3bf0f2779e
refactor: Move query plan normalizer to arrow_util
2023-03-01 15:44:22 -05:00
Andrew Lamb
525c48de2c
feat(compactor2): add invariant checks to the compactor tests ( #7096 )
...
* feat(compactor2): adding invariant checks to the compactor tests
* fix: Update tests
* fix: remove unneeded change
* fix: filter out deleted files from invariant checks
2023-03-01 19:52:04 +00:00
Carol (Nichols || Goulding)
bbfff8699c
fix: Use the same normalization code for explain tests as e2e tests do
...
The regex for replacing UUIDs needed the same change as the normalizer's
regex, so keep them in sync by using the same code.
This might point to the normalizer needing to be moved somewhere else,
or changing these tests to be e2e?
2023-03-01 13:00:04 -05:00
Nga Tran
ce215c2c67
test(compactor2): layout tests for scenarios of large size overlap files ( #7097 )
...
* test: layout tests for scenarios of large size overlap files
* fix: Update compactor2/tests/layouts/large_overlaps.rs
Co-authored-by: Nga Tran <nga-tran@live.com>
* fix: Update compactor2/tests/layouts/large_overlaps.rs
Co-authored-by: Nga Tran <nga-tran@live.com>
---------
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 16:43:15 +00:00
kodiakhq[bot]
59c14fc6bb
Merge pull request #7074 from influxdata/cn/more-querier-tests-to-kafkaless
...
test: Change more querier tests to only use Kafkaless
2023-03-01 16:12:43 +00:00
kodiakhq[bot]
b7170e41fb
Merge branch 'main' into cn/more-querier-tests-to-kafkaless
2023-03-01 16:05:41 +00:00
Dom
36d85bf712
Merge pull request #7091 from influxdata/dom/ingester-docs
...
docs: ingester overview + subsystem docs
2023-03-01 15:31:34 +00:00
Dom Dwyer
a55bbebbee
perf(wal): avoid batch buffer reallocations
...
This change causes the WAL to pre-allocate the write batch buffer,
reducing the reallocations & copies that occur in the hot path (this
buffer can grow to be moderately large).
This should automatically size to the correct capacity and (slowly)
reduce buffer overrun.
2023-03-01 15:58:45 +01:00
Dom Dwyer
79f9411e11
fix: wal flusher task / memory leak
...
Although not a problem in conventional usage, leaking this task prevents
the memory used by the WAL (which can be substantial) from ever being
deallocated. It also prevents the WAL writer I/O thread from stopping.
2023-03-01 15:32:33 +01:00
Dom Dwyer
bbd471718d
docs: hot partition persistence
...
Document the hot partition persistence configuration values.
2023-03-01 14:27:07 +01:00
Dom Dwyer
cfd377d12a
refactor: use as_ for cheap conversion methods
...
This follows Rust's naming conventions: cheap conversions are prefixed
"as_", not "to_".
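To illustrate the convention with a hypothetical `NamespaceName` wrapper (not a type from this change): an `as_` method hands out a borrowed view at no cost, while a `to_` method does real work such as allocating.

```rust
/// Hypothetical wrapper type used only to illustrate the naming convention.
pub struct NamespaceName(String);

impl NamespaceName {
    pub fn new(name: &str) -> Self {
        Self(name.to_string())
    }

    /// Cheap: borrows the inner string, no allocation or copy.
    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// Costly: allocates and returns a new String.
    pub fn to_lowercase(&self) -> String {
        self.0.to_lowercase()
    }
}
```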
2023-03-01 14:27:06 +01:00
Dom Dwyer
e4089acbae
docs(persist): fix-up handle usage
...
Previously a PersistHandle was cloned for sharing; now it is not (it is
shared by wrapping it in an Arc).
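The sharing pattern can be sketched as follows. The `PersistHandle` below is a hypothetical stand-in with a made-up counter and `enqueue` method, not the ingester's type; it only shows a non-`Clone` handle being shared through an `Arc`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a handle that deliberately does not impl Clone.
#[derive(Debug, Default)]
pub struct PersistHandle {
    enqueued: AtomicUsize,
}

impl PersistHandle {
    /// Record one (imaginary) persist request.
    pub fn enqueue(&self) {
        self.enqueued.fetch_add(1, Ordering::Relaxed);
    }

    pub fn enqueued(&self) -> usize {
        self.enqueued.load(Ordering::Relaxed)
    }
}

/// Share a single handle between `workers` threads by cloning the Arc,
/// not the handle itself. Returns the number of requests recorded.
pub fn share_across_threads(workers: usize) -> usize {
    let handle = Arc::new(PersistHandle::default());

    let threads: Vec<_> = (0..workers)
        .map(|_| {
            let handle = Arc::clone(&handle);
            thread::spawn(move || handle.enqueue())
        })
        .collect();

    for t in threads {
        t.join().expect("worker panicked");
    }

    handle.enqueued()
}
```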
This fixes the documentation to reflect the change in expected usage.
2023-03-01 14:27:06 +01:00
Dom Dwyer
be661890c5
docs: module-level overviews
...
Adds one-liner documentation of what each module contains - this is
helpful to understand what is where, when looking at the rendered docs.
2023-03-01 14:27:05 +01:00
Dom Dwyer
2a8731dd90
docs: ingester overview documentation
...
Adds "overview" documentation for the ingester, including the high-level
purpose & design. Each subsystem is briefly documented, with links for
jumping-off points into more specific documentation.
2023-03-01 14:27:05 +01:00
Andrew Lamb
e19ce98407
chore: Update datafusion + arrow/arrow-flight/parquet to 34.0.0 ( #7084 )
...
* chore: Update datafusion + arrow/arrow-flight/parquet to 34.0.0
* chore: Run cargo hakari tasks
* chore: Update plans
* chore: Update querier expected output
* chore: Update querier tests to use insta
* fix: sort output too
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 11:25:01 +00:00
Dom
f1efeae1c8
Merge pull request #7082 from influxdata/dom/router-docs
...
docs: update router diagrams for kafkaless
2023-03-01 10:42:14 +00:00
Dom
9ed7aa7257
Merge branch 'main' into dom/router-docs
2023-03-01 10:28:46 +00:00
dependabot[bot]
2b2c75e840
chore(deps): Bump clap from 4.1.7 to 4.1.8 ( #7088 )
...
Bumps [clap](https://github.com/clap-rs/clap) from 4.1.7 to 4.1.8.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v4.1.7...v4.1.8)
---
updated-dependencies:
- dependency-name: clap
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 10:22:52 +00:00
dependabot[bot]
90ceb2c896
chore(deps): Bump crossbeam-utils from 0.8.14 to 0.8.15 ( #7089 )
...
Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.14 to 0.8.15.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.14...crossbeam-utils-0.8.15)
---
updated-dependencies:
- dependency-name: crossbeam-utils
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-03-01 10:16:09 +00:00
Marco Neumann
8f11372eac
feat: predicate pushdown phys. optimizer rule ( #7083 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 09:44:57 +00:00
Marco Neumann
7fb562cb01
feat: "collect chunks" phys. optimizer rule ( #7086 )
...
* feat: "collect chunks" phys. optimizer rule
Required to clean up the plan a bit after all the dedup split and
removal passes.
For #6098 .
* refactor: `collect` -> `combine`
* fix: submodule vis
2023-03-01 09:38:11 +00:00
Marco Neumann
b85869778d
fix: `extract_chunks` schema handling ( #7085 )
...
I forgot that both `RecordBatchExec` and `ParquetExec` can have schemas
with more columns than the chunks they contain, i.e. both provide null
column creation. When extracting the schema for the chunks within a
plan, the full schemas should be preserved, otherwise the physical
optimizer rules will create invalid plan nodes (i.e. with missing
columns).
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 09:17:31 +00:00
Christopher M. Wolff
24b9bfacb5
refactor: rewrite gap filling code to be more intuitive and extendable ( #7076 )
...
* refactor: rewrite gap filling code to be more intuitive and extendable
* chore: address clippy issue
2023-02-28 22:18:52 +00:00
Andrew Lamb
f3a16a1221
feat(compactor2): add catalog upgrade information to tests ( #7075 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-28 19:28:42 +00:00
Dom Dwyer
32ce2564dc
refactor: consistent validator/validation
...
Use a consistent name for the retention validation (matching the schema
validation module).
2023-02-28 14:45:45 +01:00
Dom Dwyer
f6c9f9b0e9
docs: update router diagram / overview docs
...
Updates the router documentation to reflect the "kafkaless"
implementation we'll use going forward.
2023-02-28 14:45:32 +01:00
Marco Neumann
6d8fd37e26
feat: add "split dedup by time" optimizer rule ( #7041 )
...
* feat: add "split dedup by time" optimizer rule
For #6098 .
* docs: fix typo
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* feat: add log messages for skipped optimizations
---------
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-28 11:29:42 +00:00
Marco Neumann
e8b6e66f75
feat: misc logging/tracing improvements ( #7081 )
...
* feat: print query language/variant in info log
* feat: allow overwriting trace header name in CLI
2023-02-28 11:14:42 +00:00
dependabot[bot]
d474f9a8cc
chore(deps): Bump clap from 4.1.6 to 4.1.7 ( #7080 )
...
Bumps [clap](https://github.com/clap-rs/clap) from 4.1.6 to 4.1.7.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v4.1.6...v4.1.7)
---
updated-dependencies:
- dependency-name: clap
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-28 09:23:33 +00:00
Marco Neumann
04f3296d7b
feat: add "remove de-duplication" optimizer pass ( #7042 )
...
For #6098 .
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-28 07:57:19 +00:00
Carol (Nichols || Goulding)
312a9bb56b
test: Change more querier tests to only use Kafkaless
2023-02-27 14:20:46 -05:00
Nga Tran
22fe629f54
refactor: rename files and function to remove target level ( #7073 )
...
* refactor: rename files and function to remove target level
* chore: update a comment
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-27 19:09:37 +00:00
Andrew Lamb
26b97482df
chore(compactor2): Split up tests into smaller modules ( #7072 )
2023-02-27 17:53:41 +00:00
Andrew Lamb
5194999d62
feat: Use ? for id of uncreated parquet files ( #7066 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-27 16:35:51 +00:00
kodiakhq[bot]
63bf663b94
Merge pull request #7056 from influxdata/cn/test-rpc-write-in-querier
...
test: Switch most querier tests to use the RPC write path
2023-02-27 15:24:39 +00:00
kodiakhq[bot]
731a131a85
Merge branch 'main' into cn/test-rpc-write-in-querier
2023-02-27 15:17:51 +00:00
Andrew Lamb
dd5d4f4435
chore(compactor2): document and test `split_percentage` and `percentage_max_file_size` knobs ( #7026 )
...
* chore: document and test split_percentage and percentage_max_file_size
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* chore: add test with both max file size and split percentage
* docs: whitespace engineering and small typo
---------
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2023-02-27 15:01:06 +00:00