22 Commits (86dd72ef1f31b62a97d7bd22a69e63dd7b62bc52)

Author SHA1 Message Date
Nga Tran 9e9e689a30
feat: handle large-size overlapped files (#7079)
* feat: split start-level files that overlap with many files

* test: split files and their split times

* test: split test for L1 and L2 files

* feat: full implementation that supports large-size overlapped files

* chore: modify comments to reflect the changes

* fix: typo

* chore: update test output

* docs: clearer comments

* chore: remove empty test files. Will add in a separate PR

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: address review comments

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* refactor: add a knob to turn large-size overlaps on and off

* fix: typo

* chore: update test output after merging main

* fix: split_times should not include the max_time of the file

* fix: fix an overlap bug while limiting number of files to compact

* test: unit tests for different overlap cases of limit files to compact

* chore: increase time range of the tests to let the split files work correctly

* fix: skip compacting files of tiny ranges

* test: add tests for time range 1

* chore: address review comments

* chore: remove enable_large_size_overlap_files knob

* fix: fix a bug that sorts L1 files by their min_time instead of l0_max_created_at

* refactor: use the same order_files function after merging main into branch

* chore: typos and clearer comments

* chore: remove obsolete comments

* chore: add asserts per review suggestion

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-03-07 18:51:59 +00:00
Andrew Lamb 8c0e23098f
feat(compactor2): Verify invariants for intermediate parquet files created by compactor2 (#7140)
* feat(compactor2): Verify invariants for compactor2 always

* fix: update tests

* fix: update actual time range and test output

---------

Co-authored-by: NGA-TRAN <nga-tran@live.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-07 15:50:20 +00:00
Joe-Blount 87ae7e72cd
chore: add warnings to compaction simulator for excessively oversized files (#7126)
* chore: add warnings to compaction simulator for excessively oversized file

* chore: Update comment in compactor2_test_utils/src/lib.rs

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-03-06 15:54:38 +00:00
dependabot[bot] 8f3a9396d0
chore(deps): Bump async-trait from 0.1.64 to 0.1.66 (#7129)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.64 to 0.1.66.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.64...0.1.66)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-06 10:13:29 +00:00
Andrew Lamb dfd87f3e20
test(compactor): Add test for large amounts of data with a single timestamp (#7123)
* test(compactor): Add test for large amounts of data with a single timestamp

* fix: Update compactor2/tests/layouts/single_timestamp.rs

Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>

---------

Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>
2023-03-03 20:12:23 +00:00
dependabot[bot] 3256fcc72e
chore(deps): Bump object_store from 0.5.4 to 0.5.5
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.4 to 0.5.5.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.4...object_store_0.5.5)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-03-03 02:00:51 +00:00
Joe-Blount caa6a84488
chore: verify split time invariants within simulator (#7114) 2023-03-02 20:58:09 +00:00
Andrew Lamb 9f0645a775
fix(compactor2): fix off by one error in time ranges of simulator (#7098)
* fix(compactor2): fix off by one error in time ranges of simulator

* chore: update a test that was added recently and that this PR fixes

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Co-authored-by: NGA-TRAN <nga-tran@live.com>
2023-03-02 14:43:53 +00:00
Nga Tran c8b3827b20
test(compactor2): end-to-end data-tests with large overlap files (#7103)
* test: end-to-end data-tests with large overlap files

* chore: address review comments

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 21:48:33 +00:00
Andrew Lamb 525c48de2c
feat(compactor2): add invariant checks to the compactor tests (#7096)
* feat(compactor2): adding invariant checks to the compactor tests

* fix: Update tests

* fix: remove unneeded change

* fix: filter out deleted files from invariant checks
2023-03-01 19:52:04 +00:00
Andrew Lamb f3a16a1221
feat(compactor2): add catalog upgrade information to tests (#7075)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-28 19:28:42 +00:00
Andrew Lamb 5194999d62
feat: Use ? for id of uncreated parquet files (#7066)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-27 16:35:51 +00:00
Carol (Nichols || Goulding) faae5eb438 chore: Rerun cargo hakari manage-deps 2023-02-27 11:56:15 +01:00
Joe-Blount 88d2882350
Merge branch 'main' into alamb/remove_old_algorithm 2023-02-21 09:02:35 -06:00
Andrew Lamb b785f751b3
feat(compactor): add simulator output (#7021) 2023-02-17 15:04:26 +00:00
Andrew Lamb 21a3c8c40d refactor: delete all at once algorithm 2023-02-17 06:24:26 -05:00
Nga Tran f69c8adc7c
feat: Compact partition with many L0 files (#7007)
* feat: initial implementation of the split

* feat: split many L0 files in groups and compact them into new and fewer L0 files

* test: remove inappropriate AllAtOnce test

* refactor: move file classification for initial target to its own function

* fix: pop the branch from start to end

* chore: address review comments

* feat: support splitting to many L1 files

* feat: only add extra round to compact level-n files to same level-n files if their files plus overlapped level-n-plus-1 files are over the limit

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: final cleanup and address comments

* chore: run fmt

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-16 21:17:25 +00:00
dependabot[bot] a06f64b198
chore(deps): Bump insta from 1.26.0 to 1.28.0 (#7016)
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.26.0 to 1.28.0.
- [Release notes](https://github.com/mitsuhiko/insta/releases)
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.26.0...1.28.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-16 18:14:25 +00:00
Andrew Lamb 04bd47e64a
feat(compactor): Add more tests, improve sizes to simulator run display more (#6981)
* refactor: Split layout tests into their own module

* feat: Add more tests, improve sizes to simulator run display more

* fix: Apply suggestions from code review

Co-authored-by: Nga Tran <nga-tran@live.com>

* fix: fix comment wording

* fix: reporting order of skipped compactions

* chore: Run cargo hakari tasks

* fix: revert changes to Cargo.lock

* fix: revert workspace hack change

---------

Co-authored-by: Nga Tran <nga-tran@live.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-02-14 19:34:19 +00:00
Nga Tran 5c506058da
feat: skip partitions of wide tables (#6978)
* feat: skip partitions of wide tables

* test: one more test

* refactor: address review comments
2023-02-14 16:42:13 +00:00
Andrew Lamb 263d8fe21f
chore: Layout tests with `TargetLevel` algorithm + update display (#6977)
* refactor: move ParquetFileSimulator to compactor2_test_utils

* chore: Test with new algorithm + update display

* chore: Updates

* chore: Update setting to match prod
2023-02-13 22:12:55 +00:00
Andrew Lamb 4d7aa1e48b
refactor: extract compactor2 test utils into `compactor2_test_utils` and integration test (#6960)
* refactor: extract compactor2 test utils into `compactor2_test_utils` and integration test

* fix: Update compactor2/src/components/mod.rs

Co-authored-by: Marco Neumann <marco@crepererum.net>

---------

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-13 12:06:42 +00:00