* chore: add tracking of bytes written in simulator; display in final output header
* chore: insta output churn corresponding to tracking bytes written
* chore: address comment
* chore(iox/compactor): add test cases tracking max_l0_created_at
* chore(iox/compactor): add invariant check to simulator for misordered max_l0_created_at
* chore(iox/compactor): test updates for max_l0_created_at
This commit updates the newly added tests to comply with the new invariant check.
Several test cases are deleted because the changes to comply with the invariant make the permutations pointless.
* chore(iox/simulator): Add test cases hitting new invariant check for illegal max_l0_created_at
* chore: update previous tests to comply with new invariant
* chore: address comments
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* test: set max_l0_created_at to reasonable values for the tests and also verify it using both test layout and catalog function
* fix: typo
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* fix: Remove the max_compact_size knob and hardcode a multiple
Rather than panic if the user hasn't set this knob in a particular way,
set the max_compact_size to the minimum value we need by multiplying
max_desired_file_size_bytes by MIN_COMPACT_SIZE_MULTIPLE.
Fixes influxdata/idpe#17259.
* refactor: Move computation of max_compact_size_bytes into compactor config
* test: change test setups to reflect the purposes of the tests
---------
Co-authored-by: NGA-TRAN <nga-tran@live.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
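Note on the max_compact_size fix above: a minimal sketch of the derived value, assuming the config simply multiplies the two numbers. The constant's value (3), the function name, and the use of saturating arithmetic are illustrative assumptions, not taken from the compactor config itself.

```rust
/// Hypothetical value: the real MIN_COMPACT_SIZE_MULTIPLE is not stated in
/// this commit message, so 3 is only a placeholder for illustration.
const MIN_COMPACT_SIZE_MULTIPLE: usize = 3;

/// Derive the max compact size from the desired file size instead of
/// reading (and validating) a separate user-supplied knob.
/// `saturating_mul` is an assumption here to avoid overflow on huge inputs.
fn max_compact_size_bytes(max_desired_file_size_bytes: usize) -> usize {
    max_desired_file_size_bytes.saturating_mul(MIN_COMPACT_SIZE_MULTIPLE)
}
```

Because the value is computed rather than configured, there is no combination of settings that can fall below the minimum, which is why the panic is no longer needed.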
* test: add a panic to the limit_files_to_compact function to ensure it is used in the right way
* test: provide correct output to the tests
* chore: remove no-longer valid comments
* feat: have limit_files_to_compact also return files_to_further_split if the minimum set to compact is too large to compact
* refactor: rename files_to_split to start_level_files_to_split
* refactor: rename identify_files_to_split to identify_start_level_files_to_split before adding new split function
* feat: split the 2 files in the minimum set of files to compact if they are over the max compact size
* test: since we may now split files in different levels, remove the misleading "at level" from the simulation tests
* chore: clearer comments
* test: add tests for tiny time ranges
* chore: address review comments
* test: add a panic to the limit_files_to_compact function to ensure it is used in the right way
* test: provide correct output to the tests
* chore: Apply suggestions from code review
Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>
---------
Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>
* feat: split start-level files that overlap with many files
* test: split files and their split times
* test: split test for L1 and L2 files
* feat: full implementation that supports large-size overlapped files
* chore: modify comments to reflect the changes
* fix: typo
* chore: update test output
* docs: clearer comments
* chore: remove empty test files. Will add them in a separate PR
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* chore: address review comments
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* refactor: add a knob to turn large-size overlaps on and off
* fix: typo
* chore: update test output after merging main
* fix: split_times should not include the max_time of the file
* fix: fix an overlap bug while limiting number of files to compact
* test: unit tests for different overlap cases of limit files to compact
* chore: increase time range of the tests to let the split files work correctly
* fix: skip compacting files of tiny ranges
* test: add tests for time range 1
* chore: address review comments
* chore: remove enable_large_size_overlap_files knob
* fix: fix a bug that sorted L1 files by their min_time instead of l0_max_created_at
* refactor: use the same order_files function after merging main into branch
* chore: typos and clearer comments
* chore: remove obsolete comments
* chore: add asserts per review suggestion
---------
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
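Note on the split_times fix above: a minimal sketch of why max_time is excluded. It assumes a split time `t` means "rows with time <= t go to the left part, the rest go to the right", so a split at max_time would produce an empty right-hand part. The helper name, signature and even spacing are hypothetical, not the compactor's actual function.

```rust
/// Illustrative helper (not the compactor's real code): choose evenly spaced
/// split times for a file covering [min_time, max_time].
/// Every returned time is strictly less than max_time.
fn split_times(min_time: i64, max_time: i64, parts: i64) -> Vec<i64> {
    // Nothing to split for a single part or a file covering one timestamp.
    if parts < 2 || max_time <= min_time {
        return vec![];
    }
    // Use i128 so large nanosecond ranges cannot overflow while scaling.
    let range = max_time as i128 - min_time as i128;
    let mut times: Vec<i64> = (1..parts)
        .map(|i| (min_time as i128 + range * i as i128 / parts as i128) as i64)
        .filter(|t| *t < max_time)
        .collect();
    // Tiny ranges can yield the same time more than once; keep each once.
    times.dedup();
    times
}
```

For a file spanning 0..=100 split four ways this yields 25, 50 and 75. For a range of 1 it collapses to a single split at min_time, which is the kind of tiny-range case the tests above exercise.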
* feat(compactor2): Verify invariants for compactor2 always
* fix: update tests
* fix: update actual time range and test output
---------
Co-authored-by: NGA-TRAN <nga-tran@live.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* test(compactor): Add test for large amounts of data with a single timestamp
* fix: Update compactor2/tests/layouts/single_timestamp.rs
Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>
---------
Co-authored-by: Joe-Blount <73478756+Joe-Blount@users.noreply.github.com>
* fix(compactor2): fix off by one error in time ranges of simulator
* chore: update a recently added test that this PR fixes
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Co-authored-by: NGA-TRAN <nga-tran@live.com>
* feat: initial implementation of the split
* feat: split many L0 files in groups and compact them into new and fewer L0 files
* test: remove inappropriate AllAtOnce test
* refactor: move file classification for initial target to its own function
* fix: pop the branch from start to end
* chore: address review comments
* feat: support splitting to many L1 files
* feat: only add an extra round to compact level-n files into the same level n if those files plus the overlapping level-n-plus-1 files are over the limit
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* chore: final cleanup and address comments
* chore: run fmt
---------
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: Split layout tests into their own module
* feat: Add more tests, improve simulator run display to show more sizes
* fix: Apply suggestions from code review
Co-authored-by: Nga Tran <nga-tran@live.com>
* fix: fix comment wording
* fix: reporting order of skipped compactions
* chore: Run cargo hakari tasks
* fix: revert changes to Cargo.lock
* fix: revert workspace hack change
---------
Co-authored-by: Nga Tran <nga-tran@live.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
* refactor: move ParquetFileSimulator to compactor2_test_utils
* chore: Test with new algorithm + update display
* chore: Updates
* chore: Update setting to match prod
* refactor: extract compactor2 test utils into `compactor2_test_utils` and integration test
* fix: Update compactor2/src/components/mod.rs
Co-authored-by: Marco Neumann <marco@crepererum.net>
---------
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>