47 Commits (73d44ec9a1b06118107e419a3301c3da1202e90d)

Author SHA1 Message Date
Joe-Blount e27a8e815d
chore: add tracking of bytes written in simulator (#7445)
* chore: add tracking of bytes written in simulator; display in final output header

* chore: insta output churn corresponding to tracking bytes written

* chore: address comment
2023-04-04 20:58:50 +00:00
Joe-Blount 80a91142b5
Merge branch 'main' into jrb_24_backlogged_test_case 2023-04-04 11:21:54 -05:00
dependabot[bot] 66982f988b
chore(deps): Bump object_store from 0.5.5 to 0.5.6 (#7433)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/commits)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-04 08:43:34 +00:00
Joe-Blount 48407e78da chore: add option to simulator to optionally suppress the output of compaction runs 2023-04-03 15:02:04 -05:00
dependabot[bot] 4eedb7ea77
chore(deps): Bump async-trait from 0.1.66 to 0.1.68 (#7374)
* chore(deps): Bump async-trait from 0.1.66 to 0.1.68

Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.66 to 0.1.68.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.66...0.1.68)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-03-30 10:14:36 +00:00
dependabot[bot] 4b888c7255
chore(deps): Bump insta from 1.28.0 to 1.29.0 (#7322)
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.28.0 to 1.29.0.
- [Release notes](https://github.com/mitsuhiko/insta/releases)
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.28.0...1.29.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-24 18:25:01 +00:00
kodiakhq[bot] 60e2cb0e2b
Merge branch 'main' into cn/simulator-rules 2023-03-24 17:45:43 +00:00
Andrew Lamb 5dd71998a1
chore: Update datafusion (#7318)
* chore: Update datafusion

* chore: Update for API change

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-24 15:07:23 +00:00
Carol (Nichols || Goulding) 3fe8bda51a
feat: Record why compact, split, or nothing plans were picked
And display the reason in the simulator tests to ensure decisions are
consistent.

Fixes influxdata/idpe#17306.
2023-03-23 16:58:17 -04:00
kodiakhq[bot] acb491aa98
Merge branch 'main' into cn/refactor 2023-03-23 15:27:46 +00:00
Joe-Blount 77948e3341
chore(iox/compactor): add test cases tracking max_l0_created_at (#7181)
* chore(iox/compactor): add test cases tracking max_l0_created_at

* chore(iox/compactor): add invariant check to simulator for misordered max_l0_created_at

* chore(iox/compactor): test updates for max_l0_created_at

This commit updates the newly added tests to comply with the new invariant check.
Several test cases are deleted because the changes to comply with the invariant make the permutations pointless.

* chore(iox/simulator): Add test cases hitting new invariant check for illegal max_l0_created_at

* chore: update previous tests to comply with new invariant

* chore: address comments

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-22 22:14:06 +00:00
Carol (Nichols || Goulding) a58ef38bb0
refactor: Move target_level into PlanIR
Because this info needs to stay with the rest of the plan info
2023-03-22 16:15:16 -04:00
Nga Tran f780aba353
test: set max_l0_created_at to reasonable values for the tests and al… (#7286)
* test: set max_l0_created_at to reasonable values for the tests and also verify it using both test layout and catalog function

* fix: typo

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-21 18:57:10 +00:00
Joe-Blount dd63b5c52d
chore: expose partition_timeout in the compactor simulator (#7291)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-21 18:48:43 +00:00
Carol (Nichols || Goulding) 493f331e4b
fix: Remove the max_compact_size knob and hardcode a multiple (#7197)
* fix: Remove the max_compact_size knob and hardcode a multiple

Rather than panic if the user hasn't set this knob in a particular way,
set the max_compact_size to the minimum value we need by multiplying
max_desired_file_size_bytes by MIN_COMPACT_SIZE_MULTIPLE.

Fixes influxdata/idpe#17259.

* refactor: Move computation of max_compact_size_bytes into compactor config

* test: change test setups to reflect the purposes of the tests

---------

Co-authored-by: NGA-TRAN <nga-tran@live.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-15 11:21:28 +00:00
Nga Tran ad9e3bd1ef
feat: further split the smallest possible set to compact but too large to do so (#7180)
* test: add a panic to the limit_files_to_compact function to ensure it is used in the right way

* test: provide correct output to the tests

* chore: remove no-longer valid comments

* feat: have the function limit_files_to_compact also return files_to_further_split if the minimum set to compact is too large to compact

* refactor: rename files_to_split to start_level_files_to_split

* refactor: rename identify_files_to_split to identify_start_level_files_to_split before adding new split function

* feat: split 2 files of minimum set of compacting files if they are over max compact size

* test: since we may now split files in different levels, remove the misleading "at level" from the simulation tests

* chore: clearer comments

* test: add tests for tiny time ranges

* chore: address review comments
2023-03-14 20:08:58 +00:00
Andrew Lamb 3bb5347d4c
chore: Remove unused dependencies found with cargo-machete @alamb (#7183)
* chore: Remove unused dependencies found with cargo-machete

* chore: Remove unused dependencies

* fix: fixup

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-13 11:06:53 +00:00
Nga Tran 5c00819727
test: provide correct input to the limit_files_to_compact function (#7176)
* test: add a panic to the limit_files_to_compact function to ensure it is used in the right way

* test: provide correct output to the tests

* chore: Apply suggestions from code review

Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>

---------

Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>
2023-03-10 19:18:00 +00:00
Joe-Blount 5f0059ed00 Merge remote-tracking branch 'origin/main' into jrb_16_test_improvements
# Conflicts:
#	compactor2/tests/layouts/core.rs
#	compactor2/tests/layouts/knobs.rs
#	compactor2/tests/layouts/large_files.rs
#	compactor2/tests/layouts/large_overlaps.rs
#	compactor2/tests/layouts/many_files.rs
2023-03-10 08:55:15 -06:00
Joe-Blount a2928db0a5 chore(iox/compactor): doc updates for max_l0_created_at addition 2023-03-10 08:45:54 -06:00
Joe-Blount c5a8cd2ac8 chore(iox/compactor): update width for simulator file printing 2023-03-10 08:37:45 -06:00
Joe-Blount 6e2f5bae22 chore(iox/compactor): 3 more insta test updates 2023-03-10 08:31:05 -06:00
Andrew Lamb 6a85e8644b
chore(compactor2): Assign all rows and sizes in simulator (#7124)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-10 12:55:10 +00:00
Joe-Blount 250946fff5 feat(iox/compactor/simulator): include max_l0_created_at in parquet file info printed 2023-03-09 16:59:30 -06:00
Joe-Blount c87113ccbf
chore(iox/compactor): rename max_input_parquet_bytes_per_partition (#7160) 2023-03-08 17:08:08 +00:00
Nga Tran 9e9e689a30
feat: handle large-size overlapped files (#7079)
* feat: split start-level files that overlap with many files

* test: split files and their split times

* test: split test for L1 and L2 files

* feat: full implementation that supports large-size overlapped files

* chore: modify comments to reflect the changes

* fix: typo

* chore: update test output

* docs: clearer comments

* chore: remove empty test files. Will add in a separate PR

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: address review comments

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* refactor: add a knob to turn large-size overlaps on and off

* fix: typo

* chore: update test output after merging main

* fix: split_times should not include the max_time of the file

* fix: fix an overlap bug while limiting number of files to compact

* test: unit tests for different overlap cases of limit files to compact

* chore: increase time range of the tests to let the split files work correctly

* fix: skip compacting files of tiny ranges

* test: add tests for time range 1

* chore: address review comments

* chore: remove enable_large_size_overlap_files knob

* fix: fix a bug that sorts L1 files by their min_time instead of l0_max_created_at

* refactor: use the same order_files function after merging main into branch

* chore: typos and clearer comments

* chore: remove obsolete comments

* chore: add asserts per review suggestion

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-03-07 18:51:59 +00:00
Andrew Lamb 8c0e23098f
feat(compactor2): Verify invariants for intermediate parquet files created by compactor2 (#7140)
* feat(compactor2): Verify invariants for compactor2 always

* fix: update tests

* fix: update actual time range and test output

---------

Co-authored-by: NGA-TRAN <nga-tran@live.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-07 15:50:20 +00:00
Joe-Blount 87ae7e72cd
chore: add warnings to compaction simulator for excessively oversized files (#7126)
* chore: add warnings to compaction simulator for excessively oversized file

* chore: Update comment in compactor2_test_utils/src/lib.rs

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-03-06 15:54:38 +00:00
dependabot[bot] 8f3a9396d0
chore(deps): Bump async-trait from 0.1.64 to 0.1.66 (#7129)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.64 to 0.1.66.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.64...0.1.66)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-06 10:13:29 +00:00
Andrew Lamb dfd87f3e20
test(compactor): Add test for large amounts of data with a single timestamp (#7123)
* test(compactor): Add test for large amounts of data with a single timestamp

* fix: Update compactor2/tests/layouts/single_timestamp.rs

Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>

---------

Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>
2023-03-03 20:12:23 +00:00
dependabot[bot] 3256fcc72e
chore(deps): Bump object_store from 0.5.4 to 0.5.5
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.4 to 0.5.5.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.4...object_store_0.5.5)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-03-03 02:00:51 +00:00
Joe-Blount caa6a84488
chore: verify split time invariants within simulator (#7114) 2023-03-02 20:58:09 +00:00
Andrew Lamb 9f0645a775
fix(compactor2): fix off by one error in time ranges of simulator (#7098)
* fix(compactor2): fix off by one error in time ranges of simulator

* chore: update a test that was added recently and that this PR fixes

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Co-authored-by: NGA-TRAN <nga-tran@live.com>
2023-03-02 14:43:53 +00:00
Nga Tran c8b3827b20
test(compactor2): end-to-end data-tests with large overlap files (#7103)
* test: end-to-end data-tests with large overlap files

* chore: address review comments

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 21:48:33 +00:00
Andrew Lamb 525c48de2c
feat(compactor2): add invariant checks to the compactor tests (#7096)
* feat(compactor2): adding invariant checks to the compactor tests

* fix: Update tests

* fix: remove unneeded change

* fix: filter out deleted files from invariant checks
2023-03-01 19:52:04 +00:00
Andrew Lamb f3a16a1221
feat(compactor2): add catalog upgrade information to tests (#7075)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-28 19:28:42 +00:00
Andrew Lamb 5194999d62
feat: Use ? for id of uncreated parquet files (#7066)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-27 16:35:51 +00:00
Carol (Nichols || Goulding) faae5eb438 chore: Rerun cargo hakari manage-deps 2023-02-27 11:56:15 +01:00
Joe-Blount 88d2882350
Merge branch 'main' into alamb/remove_old_algorithm 2023-02-21 09:02:35 -06:00
Andrew Lamb b785f751b3
feat(compactor): add simulator output (#7021) 2023-02-17 15:04:26 +00:00
Andrew Lamb 21a3c8c40d refactor: delete all at once algorithm 2023-02-17 06:24:26 -05:00
Nga Tran f69c8adc7c
feat: Compact partition with many L0 files (#7007)
* feat: initial implementation of the split

* feat: split many L0 files in groups and compact them into new and fewer L0 files

* test: remove inappropriate AllAtOnce test

* refactor: move file classification for initial target to its own function

* fix: pop the branch from start to end

* chore: address review comments

* feat: support splitting to many L1 files

* feat: only add an extra round to compact level-n files into the same level n if those files plus the overlapping level-n-plus-1 files are over the limit

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: final cleanup and address comments

* chore: run fmt

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-16 21:17:25 +00:00
dependabot[bot] a06f64b198
chore(deps): Bump insta from 1.26.0 to 1.28.0 (#7016)
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.26.0 to 1.28.0.
- [Release notes](https://github.com/mitsuhiko/insta/releases)
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.26.0...1.28.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-16 18:14:25 +00:00
Andrew Lamb 04bd47e64a
feat(compactor): Add more tests, improve sizes to simulator run display more (#6981)
* refactor: Split layout tests into their own module

* feat: Add more tests, improve sizes to simulator run display more

* fix: Apply suggestions from code review

Co-authored-by: Nga Tran <nga-tran@live.com>

* fix: fix comment wording

* fix: reporting order of skipped compactions

* chore: Run cargo hakari tasks

* fix: revert changes to Cargo.lock

* fix: revert workspace hack change

---------

Co-authored-by: Nga Tran <nga-tran@live.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-02-14 19:34:19 +00:00
Nga Tran 5c506058da
feat: skip partitions of wide tables (#6978)
* feat: skip partitions of wide tables

* test: one more test

* refactor: address review comments
2023-02-14 16:42:13 +00:00
Andrew Lamb 263d8fe21f
chore: Layout tests with `TargetLevel` algorithm + update display (#6977)
* refactor: move ParquetFileSimulator to compactor2_test_utils

* chore: Test with new algorithm + update display

* chore: Updates

* chore: Update setting to match prod
2023-02-13 22:12:55 +00:00
Andrew Lamb 4d7aa1e48b
refactor: extract compactor2 test utils into `compactor2_test_utils` and integration test (#6960)
* refactor: extract compactor2 test utils into `compactor2_test_utils` and integration test

* fix: Update compactor2/src/components/mod.rs

Co-authored-by: Marco Neumann <marco@crepererum.net>

---------

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-13 12:06:42 +00:00