influxdb

Commit Graph

Author	SHA1	Message	Date
Carol (Nichols || Goulding)	4a9e76b8b7	feat: Make parquet_file.partition_id optional in the catalog (#8339 ) * feat: Make parquet_file.partition_id optional in the catalog This will acquire a short lock on the table in postgres, per: <https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads> This allows us to persist data for new partitions and associate the Parquet file catalog records with the partition records using only the partition hash ID, rather than both that are used now. * fix: Support transition partition ID in the catalog service * fix: Use transition partition ID in import/export This commit also removes support for the `--partition-id` flag of the `influxdb_iox remote store get-table` command, which Andrew approved. The `--partition-id` filter was getting the results of the catalog gRPC service's query for Parquet files of a table and then keeping only the files whose partition IDs matched. The gRPC query is no longer returning the partition ID from the Parquet file table, and really, this command should instead be using `GetParquetFilesByPartitionId` to only request what's needed rather than filtering. * feat: Support looking up Parquet files by either kind of Partition id Regardless of which is actually stored on the Parquet file record. That is, say there's a Partition in the catalog with: Partition { id: 3, hash_id: abcdefg, } and a Parquet file that has: ParquetFile { partition_hash_id: abcdefg, } calling `list_by_partition_not_to_delete(PartitionId(3))` should still return this Parquet file because it is associated with the partition that has ID 3. This is important for the compactor, which is currently only dealing in PartitionIds, and I'd like to keep it that way for now to avoid having to change Even More in this PR. * fix: Use and set new partition ID fields everywhere they want to be --------- Co-authored-by: Dom <dom@itsallbroken.com>	2023-07-31 12:40:56 +00:00
wiedld	02088995b2	feat(idpe 17789): compactor to scheduler communication. `update_job_status()` and `end_job()` (#8216 ) * feat(idpe-17789): scheduler job_status() (#8121) This block of work moves into the scheduler some of the specific downstream actions affiliated with compaction outcomes. Which responsibilities stay in the compactor, versus moved to the scheduler, roughly followed the heuristic of whether the action (a) had an impact on global catalog state (a.k.a. commits and partition skipping), (b) whether it's logging affiliated with compactor health (e.g. PartitionDoneSink logging outcomes) versus system health (e.g. logging commits), and (c) reporting to the scheduler on any errors encountered during compaction. This boundary is subject to change as we move forward. Also, a noted caveat (TODO) on this commit. We have a CompactionJob which is used to track work handed off to each compactor. Currently it still uses the partition_id for tracking, but the followup PR will start moving the compactor to have more CompactionJob uuid awareness. * fix(idpe-17789): need to remove partition from uniqueness tracking, so it becomes available again * refactor(idpe-17789): split up the single-use end_job() from the multi-use update_job_status() * feat(idpe-17789): Commit is now a scheduler trait, only used externally in the compactor_test_utils * feat(idpe-17789): Propagate errors pertaining to commit, in both the scheduler and the compactor. * feat(idpe-17789): PartitionDoneSink should have different crate-private traits for scheduler versus compactor. * feat(idpe-17789): PartitionDoneSink should propagate errors * test(idpe-17789): integration tests suite * test(idpe-17789): test documenting what skip request does (as outcome) * refactor(idpe-17789): make the validate of the upgrade commit, versus replacement commit, more explicit. * feat(idpe-17789): switch to using parking_lot Mutex within the scheduler	2023-07-24 12:01:28 -07:00
dependabot[bot]	cd31492e5b	chore(deps): Bump async-trait from 0.1.71 to 0.1.72 (#8317 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.71 to 0.1.72. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.71...0.1.72) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-07-24 10:07:18 +00:00
Joe-Blount	85a9e13262	Merge branch 'main' into jrb_63_compactor_spans	2023-07-17 09:52:27 -05:00
dependabot[bot]	4c0e5db3a5	chore(deps): Bump insta from 1.30.0 to 1.31.0 (#8242 ) Bumps [insta](https://github.com/mitsuhiko/insta) from 1.30.0 to 1.31.0. - [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md) - [Commits](https://github.com/mitsuhiko/insta/compare/1.30.0...1.31.0) --- updated-dependencies: - dependency-name: insta dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-07-17 14:01:21 +00:00
wiedld	d43300635e	Revert "feat(idpe-17789): scheduler job_status() (#8202 )" (#8213 ) This reverts commit `3dabccd84b`.	2023-07-11 10:33:56 -07:00
wiedld	3dabccd84b	feat(idpe-17789): scheduler job_status() (#8202 ) * feat(idpe-17789): scheduler job_status() (#8121) This block of work moves into the scheduler some of the specific downstream actions affiliated with compaction outcomes. Which responsibilities stay in the compactor, versus moved to the scheduler, roughly followed the heuristic of whether the action (a) had an impact on global catalog state (a.k.a. commits and partition skipping), (b) whether it's logging affiliated with compactor health (e.g. PartitionDoneSink logging outcomes) versus system health (e.g. logging commits), and (c) reporting to the scheduler on any errors encountered during compaction. This boundary is subject to change as we move forward. Also, a noted caveat (TODO) on this commit. We have a CompactionJob which is used to track work handed off to each compactor. Currently it still uses the partition_id for tracking, but the followup PR will start moving the compactor to have more CompactionJob uuid awareness.	2023-07-11 08:41:12 -07:00
Joe-Blount	16939c849d	chore: add tracing to compactor	2023-07-10 16:36:24 -05:00
Joe-Blount	9f522bfd30	Revert "feat(idpe-17789): scheduler job_status() (#8121 )" (#8175 ) This reverts commit `5d19fa3635`.	2023-07-06 18:52:25 +00:00
wiedld	5d19fa3635	feat(idpe-17789): scheduler job_status() (#8121 ) This block of work moves into the scheduler some of the specific downstream actions affiliated with compaction outcomes. Which responsibilities stay in the compactor, versus moved to the scheduler, roughly followed the heuristic of whether the action (a) had an impact on global catalog state (a.k.a. commits and partition skipping), (b) whether it's logging affiliated with compactor health (e.g. PartitionDoneSink logging outcomes) versus system health (e.g. logging commits), and (c) reporting to the scheduler on any errors encountered during compaction. This boundary is subject to change as we move forward. Also, a noted caveat (TODO) on this commit. We have a CompactionJob which is used to track work handed off to each compactor. Currently it still uses the partition_id for tracking, but the followup PR will start moving the compactor to have more CompactionJob uuid awareness.	2023-07-06 09:15:59 -07:00
dependabot[bot]	26a6113a37	chore(deps): Bump async-trait from 0.1.70 to 0.1.71 (#8163 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.70 to 0.1.71. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.70...0.1.71) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-07-06 09:58:51 +00:00
dependabot[bot]	b5c9628f0f	chore(deps): Bump async-trait from 0.1.69 to 0.1.70 (#8148 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.69 to 0.1.70. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.69...0.1.70) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-07-05 09:05:13 +00:00
wiedld	3a8a8a153e	feat(idpe 17789): provide scheduler interface (#8057 ) * feat: provide convenience methods to create Scheduler, and keep the scheduler implementations crate private. External crates can only create a Scheduler based upon configs. * feat: provide Scheduler as a component to compactor. Specifically, the scheduler configs are present within the compactor run config, and the scheduler is created within the compactor hardcoded components. * feat: within the compactor ScheduledPartitionsSource, utilize the dyn Scheduler and Scheduler.get_jobs() * feat: CompactionJob should be per partition, and have a uniqueness characteristic independent of the partition * feat: keep compactor_scheduler separate from clap_blocks. Only interface is within ioxd_compactor where the CLI configs are transformed into ShardConfig and PartitionsSourceConfig. * chore: make IdOnlyPartitionFilter into only pub(crate) * chore: update scheduler display to include any report information (a.k.a. shard_config, if present)	2023-06-28 15:04:00 -07:00
Joe-Blount	ac9cc24315	fix: compactor shouldn't leave small L1s in non-overlap leading edge pattern (#8101 ) * fix: compactor shouldn't leave tiny L1s with non-overlapped leading edge pattern * chore: insta updates for prior commit	2023-06-28 17:02:21 +00:00
dependabot[bot]	6e7b838b52	chore(deps): Bump insta from 1.29.0 to 1.30.0 (#8059 ) Bumps [insta](https://github.com/mitsuhiko/insta) from 1.29.0 to 1.30.0. - [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md) - [Commits](https://github.com/mitsuhiko/insta/compare/1.29.0...1.30.0) --- updated-dependencies: - dependency-name: insta dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-23 07:45:41 +00:00
Carol (Nichols || Goulding)	62ba18171a	feat: Add a new hash column on the partition and parquet file tables This will hold the deterministic ID for partitions. Until all existing partitions have this value, this is optional/nullable. The row ID still exists and is used as the main foreign key in the parquet_file and skipped_compaction tables. The hash_id has a unique index so that we can look up records based on it (if it's available). If the parquet file record has a partition_hash_id value, use that to generate the object storage path instead of the partition_id.	2023-06-22 09:01:22 -04:00
Dom Dwyer	d1cbbd27b1	feat(compactor): config partition query rate limit Allow the partition fetch queries to be (optionally) rate limited via runtime config.	2023-06-21 15:50:12 +02:00
wiedld	e29b453e0d	refactor: move PartitionsSourceConfig into local scheduler (#8026 )	2023-06-20 16:05:59 -07:00
wiedld	7a1f54ac64	refactor: remove compactor type (#8011 ) * refactor: remove cold compactions * refactor: remove compaction_type	2023-06-16 09:40:13 -07:00
Joe-Blount	5d0bb68c5b	chore: add compactor option to disable scratchpad (#7995 )	2023-06-15 14:35:55 +00:00
Andrew Lamb	1ff76b7bf2	chore: use workspace dependencies for `object_store`	2023-05-26 07:03:42 -04:00
Marco Neumann	103e814f22	refactor: clean up catalog `parquet_files` interface (#7853 ) * feat: `ParquetFileRepo::list_all` * refactor: remove `ParquetFileRepo::list_by_table` * refactor: simplify `ParquetFileRepo::list_by_table` * refactor: remove `ParquetFileRepo::count` * refactor: remove `ParquetFileRepo::update_compaction_level` * refactor: remove `ParquetFileRepo::exists` * fix: test --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-24 09:15:03 +00:00
Dom Dwyer	928a4d163e	build: remove unused dependencies from crates This commit fixes loads of crates (47!) that had unused dependencies, or mis-configured dependencies (test deps as normal deps). I added the "unused_crate_dependencies" to all crates to help prevent this mess from growing again! https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html This has the minor downside of false-positives when specifying dev-dependencies for test/bench binaries - these are files in /test or /benches (not normal tests). This commit includes a workaround, importing them in lib.rs (gated by a feature flag). I think the trade-off of better dependency management is worth it!	2023-05-23 14:55:43 +02:00
Andrew Lamb	6344fe8c3f	chore: Add rationale for `clippy::future_not_send` (#7822 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-18 16:58:56 +00:00
Carol (Nichols || Goulding)	9229ce5668	fix: Rename compactor2_test_utils to compactor_test_utils	2023-05-09 11:02:11 +02:00

25 Commits (fd147f871b14c1f1a18d6f66acdb28f646fa0d19)