Commit Graph

28 Commits (1df5948c97c29cc34da4e530364a8e07a614ab9b)

Author SHA1 Message Date
Joe-Blount 1df5948c97
feat: Add Compaction Regions (#8559)
* feat: add CompactRanges RoundInfo type

* chore: insta test updates for adding CompactRange

* feat: simplify/improve ManySmallFiles logic, now that its problem set is simpler

* chore: insta test updates for ManySmallFiles improvement

* chore: upgrade files more aggressively

* chore: insta updates from more aggressive file upgrades

* chore: addressing review comments
2023-08-28 12:59:12 +00:00
Nga Tran 2eb74ddb87
chore: revert teaching compactor to use sort_key_ids (#8574) 2023-08-25 13:21:12 +00:00
Nga Tran 246918feb6
feat: teach compactor to use sort_key_ids instead of sort_key (#8560)
* feat: teach compactor to use sort_key_ids instead of sort_key

* test: update the test output after chatting with Joe and learning the reason for the changes
2023-08-24 16:16:12 +00:00
Joe-Blount 53915f0653
feat: move vertical splitting & detect non-linear data (#8506)
* chore: test changes and additions in preparation for functional changes

* feat: move vertical splitting to RoundInfo calculation, align splits to L1 files

* chore: insta test churn

* feat: detect non-linear data distribution in vertical splitting

* chore: add tests for non-linear data distribution

* chore: insta churn

* chore: cleanup & comment additions

* chore: some variable renaming
2023-08-21 18:22:25 +00:00
Joe-Blount 1cc0926a7f
feat: track why bytes are written in compactor simulator (#8493)
* feat: add tracking of why bytes are written in simulator

* chore: enable breakdown of why bytes are written in a few larger tests

* chore: enable writes breakdown in another test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-16 13:39:37 +00:00
Joe-Blount 6d4729db1d chore: Insta test updates 2023-08-14 15:41:37 -05:00
Joe-Blount 964b2f6b97
fix: compactor simulator math error creates 0 byte files (#8478)
* fix: math error in simulator results in 0 byte files during simulations

* chore: insta churn from simulator file size fix
2023-08-14 20:00:19 +00:00
Joe-Blount 3213396caf
feat: force early L1 compaction, avoid invariant violations (#8444)
* feat: force early L1 compaction, avoid invariant violations

* chore: variable renaming

* chore: make SplitBasedFileClassifier honor file selection for SimulatedLeadingEdge
2023-08-08 20:10:36 +00:00
wiedld 81b5d80a91
feat(idpe-17935): move filtering of skipped partitions to the scheduler (#8358)
* catalog.get_in_skipped_compaction() should handle multiple partitions

* add the ability to perform transformation on sets of partitions (rather than filtering one by one). Start with the transformation to remove skipped partitions, in the scheduler.

* move the env var and cli flag setting, for when to ignore skipped partitions, to the scheduler config.
2023-08-03 11:43:09 -07:00
Joe-Blount 44e266d000
fix: compaction looping fixes (#8363)
* fix: selectively merge L1 to L2 when L0s still exist

* fix: avoid grouping files that undo previous splits

* chore: add test case for new fixes

* chore: insta test churn

* chore: lint cleanup
2023-07-31 13:15:49 +00:00
Joe-Blount f5a41592da Merge remote-tracking branch 'origin/main' into jrb_73_stuck
# Conflicts:
#	compactor/tests/layouts/stuck.rs
2023-07-27 08:54:50 -05:00
Joe-Blount 525f8ec0cb
fix: compactor loop splitting then undoing it (#8338) 2023-07-27 13:17:30 +00:00
Joe-Blount 6246275c4a chore: insta churn updates 2023-07-26 15:59:49 -05:00
Joe-Blount f1e088aa0e fix: no percent split during ManySmallFiles compaction 2023-07-26 15:59:20 -05:00
Joe-Blount 7622358518 fix: avoid compacting 1 L0 to 1 L0 file (stuck looping) 2023-07-21 13:55:04 -05:00
Marco Neumann 0173c50ba1
fix: use correct error code when querier is shutting down (#8282)
When a long-running query is in progress and the querier is shutting
down, it might happen that the executor (= thread pool and tokio
executor responsible for the CPU-bound DataFusion execution) is shut
down while the query is running. From a "systems interaction" PoV I
think this is totally fine and I would like to avoid some weird
ref-counting. Or in other words: if the system is shutting down, shut it
down.

However the error was treated as "internal" which is not useful. The
client should rather be informed that its server was gone and that it is
OK (and desired) to retry. So as per
<https://grpc.github.io/grpc/core/md_doc_statuscodes.html> I think this
should signal "unavailable".

This change wires the error code in such a way that the gRPC service
layer can properly inspect it and then changes the error mapping.

Ref https://github.com/influxdata/idpe/issues/17917 .

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-20 12:08:22 +00:00
Joe-Blount e1e1d5ab38 fix: compactor loop in highly backlogged case 2023-07-13 16:47:59 -05:00
Joe-Blount ac9cc24315
fix: compactor shouldn't leave small L1s in non-overlap leading edge pattern (#8101)
* fix: compactor shouldn't leave tiny L1s with non-overlapped leading edge pattern

* chore: insta updates for prior commit
2023-06-28 17:02:21 +00:00
Joe-Blount 40865e011c
fix: compactor loop on L1 files (#8082)
* chore: suppress insta run output on some long tests

* fix: prevent L1 compaction looping

* chore: insta updates from prior commit

* chore: address comments
2023-06-26 21:21:24 +00:00
Joe-Blount 99d0530a21
fix: compactor stuck looping with unproductive compactions (needs vertical split) (#8056)
* chore: adjust with_max_num_files_per_plan to more common setting

This significantly increases write amplification (see change in `written` at the conclusion of the cases)

* fix: compactor looping with unproductive compactions

* chore: formatting cleanup

* chore: fix typo in comment

* chore: add test case that compacts too many files at once

* fix: enforce max file count for compaction

* chore: insta churn from prior commit

---------

Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-23 09:19:06 +00:00
Joe-Blount e0ccc2e345
Revert "fix: compactor stuck looping with unproductive compactions (needs vertical split) (#8039)" (#8041)
This reverts commit b219b4b003.
2023-06-21 23:14:35 +00:00
Joe-Blount b219b4b003
fix: compactor stuck looping with unproductive compactions (needs vertical split) (#8039)
* chore: adjust with_max_num_files_per_plan to more common setting

This significantly increases write amplification (see change in `written` at the conclusion of the cases)

* fix: compactor looping with unproductive compactions

* chore: formatting cleanup

* chore: fix typo in comment
2023-06-21 20:23:50 +00:00
wiedld 7a1f54ac64
refactor: remove compactor type (#8011)
* refactor: remove cold compactions
* refactor: remove compaction_type
2023-06-16 09:40:13 -07:00
Joe-Blount a21596f604
chore: add L1/L2 accumulated size test (#8012)
This adds 4 small test cases intended to test how compaction decisions affect the final size of L1/L2 files.
The assumption is that when a steady stream of small L0 files is arriving, the compactor needs to be rewriting L1s so they grow to a reasonable size instead of getting left small.
2023-06-16 13:28:59 +00:00
Carol (Nichols || Goulding) bf699a8b60
fix: Remove partition ID from the metadata serialized into Parquet files (#7947)
Nothing gets the partition ID out of the metadata. The parts of the code
interacting with object storage that need the ID to create the object
store path were using the partition ID from the metadata out of
convenience, but I changed those places to pass in the partition ID in a
separate argument instead.

This will make the transition to deterministic partition IDs a bit
smoother.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-08 14:03:21 +00:00
Joe-Blount c2423d8a5c feat: Order L0s more deterministically 2023-05-30 15:52:03 -05:00
Carol (Nichols || Goulding) 9229ce5668
fix: Rename compactor2_test_utils to compactor_test_utils 2023-05-09 11:02:11 +02:00
Carol (Nichols || Goulding) dd9c5d1b13
fix: Rename compactor2 to compactor 2023-05-09 10:58:55 +02:00