Commit Graph

22 Commits (6347d833e74fec91e5734299aed4b761ee9c4144)

Author SHA1 Message Date
Joe-Blount 85a9e13262
Merge branch 'main' into jrb_63_compactor_spans 2023-07-17 09:52:27 -05:00
dependabot[bot] 4c0e5db3a5
chore(deps): Bump insta from 1.30.0 to 1.31.0 (#8242)
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.30.0 to 1.31.0.
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.30.0...1.31.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-17 14:01:21 +00:00
wiedld d43300635e
Revert "feat(idpe-17789): scheduler job_status() (#8202)" (#8213)
This reverts commit 3dabccd84b.
2023-07-11 10:33:56 -07:00
wiedld 3dabccd84b
feat(idpe-17789): scheduler job_status() (#8202)
* feat(idpe-17789): scheduler job_status() (#8121)

This block of work moves into the scheduler some of the specific downstream actions affiliated with compaction outcomes. Which responsibilities stay in the compactor, versus moved to the scheduler, roughly followed the heuristic of whether the action (a) had an impact on global catalog state (a.k.a. commits and partition skipping), (b) whether it's logging affiliated with compactor health (e.g. ParitionDoneSink logging outcomes) versus system health (e.g. logging commits), and (c) reporting to the scheduler on any errors encountered during compaction. This boundary is subject to change as we move forward.

Also, a noted caveat (TODO) on this commit. We have a CompactionJob which is used to track work handed off to each compactor. Currently it still uses the partition_id for tracking, but the followup PR will start moving the compactor to have more CompactionJob uuid awareness.
2023-07-11 08:41:12 -07:00
Joe-Blount 16939c849d chore: add tracing to compactor 2023-07-10 16:36:24 -05:00
Joe-Blount 9f522bfd30
Revert "feat(idpe-17789): scheduler job_status() (#8121)" (#8175)
This reverts commit 5d19fa3635.
2023-07-06 18:52:25 +00:00
wiedld 5d19fa3635
feat(idpe-17789): scheduler job_status() (#8121)
This block of work moves into the scheduler some of the specific downstream actions affiliated with compaction outcomes. Which responsibilities stay in the compactor, versus moved to the scheduler, roughly followed the heuristic of whether the action (a) had an impact on global catalog state (a.k.a. commits and partition skipping), (b) whether it's logging affiliated with compactor health (e.g. ParitionDoneSink logging outcomes) versus system health (e.g. logging commits), and (c) reporting to the scheduler on any errors encountered during compaction. This boundary is subject to change as we move forward.

Also, a noted caveat (TODO) on this commit. We have a CompactionJob which is used to track work handed off to each compactor. Currently it still uses the partition_id for tracking, but the followup PR will start moving the compactor to have more CompactionJob uuid awareness.
2023-07-06 09:15:59 -07:00
dependabot[bot] 26a6113a37
chore(deps): Bump async-trait from 0.1.70 to 0.1.71 (#8163)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.70 to 0.1.71.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.70...0.1.71)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-06 09:58:51 +00:00
dependabot[bot] b5c9628f0f
chore(deps): Bump async-trait from 0.1.69 to 0.1.70 (#8148)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.69 to 0.1.70.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.69...0.1.70)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-05 09:05:13 +00:00
wiedld 3a8a8a153e
feat(idpe 17789): provide scheduler interface (#8057)
* feat: provide convenience methods to create Scheduler, and keep the scheduler implementations crate private. External crates can only create a Scheduler based upon configs.

* feat: provide Scheduler as a component to compactor. Specifically, the scheduler configs are present within the compactor run config, and the scheduler in created within the compactor hardcoded components.

* feat: within the compactor ScheduledPartitionsSource, utilize the dyn Scheduler and Scheduler.get_jobs()

* feat: CompactionJob should be per partition, and have a uniqueness characteristic independent of the partition

* feat: keep compactor_scheduler separate from clap_blocks. Only interface is within ioxd_compactor where the CLI configs are transformed into ShardConfig and PartitionsSourceConfig.

* chore: make IdOnlyPartitionFilter into only pub(crate)

* chore: update scheduler display to include any report information (a.k.a. shard_config, if present)
2023-06-28 15:04:00 -07:00
Joe-Blount ac9cc24315
fix: compactor shouldn't leave small L1s in non-overlap leading edge pattern (#8101)
* fix: compactor shouldn't leave tiny L1s with non-overlapped leading edge pattern

* chore: insta updates for prior commit
2023-06-28 17:02:21 +00:00
dependabot[bot] 6e7b838b52
chore(deps): Bump insta from 1.29.0 to 1.30.0 (#8059)
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.29.0 to 1.30.0.
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.29.0...1.30.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-23 07:45:41 +00:00
Carol (Nichols || Goulding) 62ba18171a
feat: Add a new hash column on the partition and parquet file tables
This will hold the deterministic ID for partitions.

Until all existing partitions have this value, this is optional/nullable.

The row ID still exists and is used as the main foreign key in the
parquet_file and skipped_compaction tables.

The hash_id has a unique index so that we can look up records based on
it (if it's available).

If the parquet file record has a partition_hash_id value, use that to
generate the object storage path instead of the partition_id.
2023-06-22 09:01:22 -04:00
Dom Dwyer d1cbbd27b1
feat(compactor): config partition query rate limit
Allow the partition fetch queries to be (optionally) rate limited via
runtime config.
2023-06-21 15:50:12 +02:00
wiedld e29b453e0d
refactor: move PartitionsSourceConfig into local scheduler (#8026) 2023-06-20 16:05:59 -07:00
wiedld 7a1f54ac64
refactor: remove compactor type (#8011)
* refactor: remove cold compactions
* refactor: remove compaction_type
2023-06-16 09:40:13 -07:00
Joe-Blount 5d0bb68c5b
chore: add compactor option to disable scratchpad (#7995) 2023-06-15 14:35:55 +00:00
Andrew Lamb 1ff76b7bf2 chore: use workspace dependencies for `object_store` 2023-05-26 07:03:42 -04:00
Marco Neumann 103e814f22
refactor: clean up catalog `parquet_files` interface (#7853)
* feat: `ParquetFileRepo::list_all`

* refactor: remove `ParquetFileRepo::list_by_table`

* refactor: simlify `ParquetFileRepo::list_by_table`

* refactor: remove `ParquetFileRepo::count`

* refactor: remove `ParquetFileRepo::update_compaction_level`

* refactor: remove `ParquetFileRepo::exists`

* fix: test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-24 09:15:03 +00:00
Dom Dwyer 928a4d163e
build: remove unused dependencies from crates
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).

I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!

    https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html

This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
2023-05-23 14:55:43 +02:00
Andrew Lamb 6344fe8c3f
chore: Add rationale for `clippy::future_not_send` (#7822)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-18 16:58:56 +00:00
Carol (Nichols || Goulding) 9229ce5668
fix: Rename compactor2_test_utils to compactor_test_utils 2023-05-09 11:02:11 +02:00