293 Commits (7cb30480350b26af8596eb62192a19c3afbb6e0d)

Author SHA1 Message Date
Carol (Nichols || Goulding) 5e6dbec909
fix: Remove tombstones as they aren't functional currently 2023-04-14 13:36:08 -04:00
Joe-Blount 7dd221aee0
chore: add logging around compaction job semaphore (#7523)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-12 15:47:02 +00:00
Carol (Nichols || Goulding) 3199d65c2f
feat: Add the ability to specify a max threshold duration on CatalogToCompactPartitionsSource 2023-04-12 11:08:51 -04:00
Carol (Nichols || Goulding) a244e5b078
test: Add some tests for CatalogToCompactPartitionsSource's existing behavior 2023-04-12 11:07:43 -04:00
Joe-Blount f05be907cb
chore: increasing concurrency a little more (#7510)
* chore: increasing concurrency a little more

This raises the threshold for single-threading compactions to 100-column partitions. With the non-linear scaling, a 70-column partition takes 49% of the concurrency limit, so only 2 partitions of that size can compact concurrently. Anything over 70 columns can only compact alongside something smaller than itself.

I'm gradually walking these up, partly to avoid causing OOMs in prod, and partly because I want to get a feel for how reactive the average concurrency is to these changes.

* chore: fix comment typo
2023-04-11 22:04:44 +00:00
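
The arithmetic in the commit message above (70 columns taking 49% of the limit at a threshold of 100) is consistent with a quadratic curve, since (70/100)^2 = 0.49. The sketch below illustrates only that reading. The quadratic form, the function names `concurrency_share` and `max_concurrent_same_size`, and their parameters are assumptions for illustration and are not taken from the compactor's source.

```rust
/// Hypothetical helper: fraction of the total concurrency budget consumed by a
/// partition with `columns` columns. ASSUMPTION: the share grows quadratically
/// and reaches 100% at `threshold` columns (inferred from 70 -> 49%).
fn concurrency_share(columns: u32, threshold: u32) -> f64 {
    // Partitions at or above the threshold take the entire budget.
    let capped = columns.min(threshold) as f64;
    let ratio = capped / threshold as f64;
    ratio * ratio
}

/// Hypothetical helper: how many partitions of the same size fit in the
/// budget at once. Always at least 1, so the largest partitions still run
/// (single-threaded). A zero-column input saturates to `u32::MAX`.
fn max_concurrent_same_size(columns: u32, threshold: u32) -> u32 {
    let share = concurrency_share(columns, threshold);
    ((1.0 / share).floor() as u32).max(1)
}
```

Under these assumptions, `max_concurrent_same_size(70, 100)` gives 2, while 71 columns already gives 1, which matches the statement that anything over 70 can only run alongside something smaller than itself.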
Joe-Blount 980589504d
chore: make compactor concurrency scale non-linearly (#7509)
* chore: make compactor concurrency scale non-linearly

* chore: rust formatter making the test cases harder to read
2023-04-11 20:10:20 +00:00
Andrew Lamb 8b25a3a64c
chore: Remove unused dependencies, found by cargo-machete (#7491)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-11 08:32:32 +00:00
Joe-Blount ef62f439b8
feat: scale compactor concurrency based on table column count (#7492)
* feat: scale compactor concurrency based on table column count

* chore: address review comments
2023-04-10 21:44:29 +00:00
Phil Bracikowski a99da831e1
Merge branch 'main' into jrb_27_adjust_split_time_selection 2023-04-06 12:07:07 -07:00
Marco Neumann 5f43f2a719
refactor: remove old query planning code (#7449)
Closes #7406.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-06 16:05:08 +00:00
Joe-Blount dcdf253a7a
chore: insta updates for split count adjustment 2023-04-06 09:37:05 -05:00
Joe-Blount abde5aa543
chore: adjust how many splits applied in backlogged cases 2023-04-06 09:36:37 -05:00
Joe-Blount e4b4f79c6b
feat: expand 'vertical splitting' to improve compaction efficiency (#7450)
* feat: expand vertical splitting and coordinate compactions

* chore: insta updates for prior commit

* chore: pr review nits
2023-04-05 21:02:16 +00:00
Joe-Blount e27a8e815d
chore: add tracking of bytes written in simulator (#7445)
* chore: add tracking of bytes written in simulator; display in final output header

* chore: insta output churn corresponding to tracking bytes written

* chore: address comment
2023-04-04 20:58:50 +00:00
Joe-Blount 80a91142b5
Merge branch 'main' into jrb_24_backlogged_test_case 2023-04-04 11:21:54 -05:00
dependabot[bot] 66982f988b
chore(deps): Bump object_store from 0.5.5 to 0.5.6 (#7433)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/commits)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-04 08:43:34 +00:00
Joe-Blount cd708183db
chore: update large backfill case to suppress output of runs 2023-04-03 15:02:36 -05:00
Joe-Blount 48407e78da
chore: add option to simulator to optionally suppress the output of compaction runs 2023-04-03 15:02:04 -05:00
Joe-Blount 711ccc153e
fix: address panic with single L0 overlapping multiple L1s 2023-04-03 14:09:55 -05:00
Carol (Nichols || Goulding) 90d07412ff
refactor: Extract functions for the different purposes of partition filters 2023-03-31 12:53:41 -04:00
Carol (Nichols || Goulding) 9a27736c65
docs: Fix some typos 2023-03-31 12:44:12 -04:00
Carol (Nichols || Goulding) a32d536262
refactor: Extract a function to make the post-classification filters 2023-03-31 12:36:26 -04:00
Carol (Nichols || Goulding) 86dbd5c529
refactor: Extract function for creating the file classifier 2023-03-31 12:36:26 -04:00
Carol (Nichols || Goulding) 63d45532fb
refactor: Extract function for making the parquet files sink 2023-03-31 12:36:26 -04:00
Carol (Nichols || Goulding) eef943ceec
refactor: Extract function for making the scratchpad gen 2023-03-31 12:36:26 -04:00
Carol (Nichols || Goulding) fe0d3c17fd
refactor: Extract function for creating the df plan exec 2023-03-31 12:36:26 -04:00
Carol (Nichols || Goulding) 07c2c768e9
refactor: Extract a function for creating the df planner 2023-03-31 12:36:25 -04:00
Carol (Nichols || Goulding) 7bbf0fcd79
refactor: Import all components from super, not crate 2023-03-31 12:36:25 -04:00
Carol (Nichols || Goulding) 7d2d9dd6b7
refactor: Extract a function for creating the IR planner 2023-03-31 12:36:25 -04:00
Carol (Nichols || Goulding) d7fe50b7ed
refactor: Move logging and metrics of the commit component into where it's created 2023-03-31 12:36:25 -04:00
Carol (Nichols || Goulding) 821ad7f38c
refactor: Move logging and metrics into where the rest of the partition done sink is created 2023-03-31 12:36:25 -04:00
Carol (Nichols || Goulding) 682ed14b9e
refactor: Extract function for creating the round info source 2023-03-31 12:36:25 -04:00
Carol (Nichols || Goulding) 3ce062fd2e
refactor: Extract function for creating partition files source 2023-03-31 12:36:25 -04:00
Carol (Nichols || Goulding) 338ca030ab
refactor: Extract function for creating the partition info source 2023-03-31 12:36:25 -04:00
Carol (Nichols || Goulding) b5f233f037
refactor: Move all partition filter creation into the function for that purpose 2023-03-31 12:36:24 -04:00
Carol (Nichols || Goulding) b9727d2e17
refactor: Extract a function for creating partitions source, commit, and done sink 2023-03-31 12:36:24 -04:00
Carol (Nichols || Goulding) b7b15dff26
refactor: Extract function for making the partition stream
Trying to make the inputs and outputs more clear.
2023-03-31 12:36:24 -04:00
Carol (Nichols || Goulding) c51ec1cc9a
docs: Clean up typos and line wrapping
Found while reading.
2023-03-31 12:36:24 -04:00
Carol (Nichols || Goulding) e4d5c777d9
feat: Make catalog method not specific to compacting and take optional end time 2023-03-31 12:36:24 -04:00
Carol (Nichols || Goulding) 5afb9ccb73
fix: Remove TODO comment that is now done 2023-03-31 12:36:24 -04:00
kodiakhq[bot] a1389e5962
Merge branch 'main' into cn/redo 2023-03-30 20:27:25 +00:00
Carol (Nichols || Goulding) 8718aaa148
fix: Change test file ID to match intent 2023-03-30 16:04:52 -04:00
Carol (Nichols || Goulding) c37f908349
docs: Update comments based on the new criteria for changed files 2023-03-30 16:04:51 -04:00
Carol (Nichols || Goulding) bf026d1f74
fix: Only log that we've detected changed files we're about to compact 2023-03-30 15:10:35 -04:00
Carol (Nichols || Goulding) 48b102f037
fix: Only check that existing files continue to exist at their current compaction level 2023-03-30 14:06:39 -04:00
Joe-Blount 0a51fd55a6
chore: move compaction progress notification up (to be more frequent) (#7389) 2023-03-30 17:49:09 +00:00
Carol (Nichols || Goulding) 956d7bcee4
revert: "revert: Merge pull request #7369 from influxdata/cn/parquet-file-saved-status"
This reverts commit 0d7393f2c1.
2023-03-30 12:39:03 -04:00
Carol (Nichols || Goulding) 0d7393f2c1
revert: Merge pull request #7369 from influxdata/cn/parquet-file-saved-status
This reverts commit c320ed11d4, reversing
changes made to 555f2a67aa.
2023-03-30 12:37:36 -04:00
Carol (Nichols || Goulding) 68abc42cda
test: Add more unit cases for SavedParquetFileState comparisons 2023-03-30 11:05:13 -04:00
Carol (Nichols || Goulding) a0890bf8d3
refactor: Extract a function for repeated code to get and save parquet file state 2023-03-30 11:05:13 -04:00