Commit Graph

140 Commits (2ed5758ddba42f0626e9f8d2b38d3d99e9aa4d78)

Author SHA1 Message Date
Andrew Lamb 7e31b2638d
fix: Understandable compactor2 config report (#7028)
* fix: Understandable compactor2 config report

* fix: do not log postgres dsn

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-22 23:43:31 +00:00
Joe-Blount 88d2882350
Merge branch 'main' into alamb/remove_old_algorithm 2023-02-21 09:02:35 -06:00
Andrew Lamb b785f751b3
feat(compactor): add simulator output (#7021) 2023-02-17 15:04:26 +00:00
Andrew Lamb d90443d9e6 refactor: remove files_filter too 2023-02-17 09:45:00 -05:00
Andrew Lamb 21a3c8c40d refactor: delete all at once algorithm 2023-02-17 06:24:26 -05:00
Nga Tran ae58831467
test: add a test that has over 2 times max limit files per plan (#7017)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-17 10:42:31 +00:00
Nga Tran f69c8adc7c
feat: Compact partition with many L0 files (#7007)
* feat: initial implementation of the split

* feat: split many L0 files in groups and compact them into new and fewer L0 files

* test: remove inappropriate AllAtOnce test

* refactor: move file classification for initial target to its own function

* fix: pop the branch from start to end

* chore: address review comments

* feat: support splitting to many L1 files

* feat: only add an extra round to compact level-n files to same level-n files if their files plus overlapped level-n-plus-1 files are over the limit

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: final cleanup and address comments

* chore: run fmt

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-16 21:17:25 +00:00
dependabot[bot] a06f64b198
chore(deps): Bump insta from 1.26.0 to 1.28.0 (#7016)
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.26.0 to 1.28.0.
- [Release notes](https://github.com/mitsuhiko/insta/releases)
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.26.0...1.28.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-16 18:14:25 +00:00
Andrew Lamb 63877ab314 chore: Move target level choice into round info 2023-02-15 14:00:35 -05:00
Andrew Lamb 063ac9f2cb
test: Add another layout test to compactor (#7005) 2023-02-15 18:57:36 +00:00
Andrew Lamb 999aba9d56 feat(compactor): Add RoundInfo structure 2023-02-15 13:00:58 -05:00
Joe-Blount 391e64772c
Merge branch 'main' into jrb_5_add_shutdown_log 2023-02-15 08:17:59 -06:00
Marco Neumann f499022511
feat: add compaction level to commit metrics (#6985)
* feat: add compaction level to commit metrics

* test: more realism
2023-02-15 09:28:19 +00:00
Joe-Blount e9f4b8f769 chore: add shutdown logging 2023-02-14 17:37:33 -06:00
Nga Tran 0ffb211c54
test: more compactor layout tests (#6988)
* test: more compactor layout tests

* chore: address review comments
2023-02-14 22:14:06 +00:00
Andrew Lamb 04bd47e64a
feat(compactor): Add more tests, improve sizes to simulator run display more (#6981)
* refactor: Split layout tests into their own module

* feat: Add more tests, improve sizes to simulator run display more

* fix: Apply suggestions from code review

Co-authored-by: Nga Tran <nga-tran@live.com>

* fix: fix comment wording

* fix: reporting order of skipped compactions

* chore: Run cargo hakari tasks

* fix: revert changes to Cargo.lock

* fix: revert workspace hack change

---------

Co-authored-by: Nga Tran <nga-tran@live.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-02-14 19:34:19 +00:00
Nga Tran 5c506058da
feat: skip partitions of wide tables (#6978)
* feat: skip partitions of wide tables

* test: one more test

* refactor: address review comments
2023-02-14 16:42:13 +00:00
Andrew Lamb 263d8fe21f
chore: Layout tests with `TargetLevel` algorithm + update display (#6977)
* refactor: move ParquetFileSimulator to compactor2_test_utils

* chore: Test with new algorithm + update display

* chore: Updates

* chore: Update setting to match prod
2023-02-13 22:12:55 +00:00
Marco Neumann 13ce6da3df
refactor: extract `FileClassifer` component (#6946)
* refactor: extract `FileClassifer` component

Make the driver slightly smaller. Also makes the "all-in-one" mode
easier to understand.

* docs: add some

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-13 16:25:05 +00:00
Dom 9d20f26006
Merge branch 'main' into dom/namespace-soft-delete-catalog 2023-02-13 12:07:03 +00:00
Andrew Lamb 4d7aa1e48b
refactor: extract compactor2 test utils into `compactor2_test_utils` and integration test (#6960)
* refactor: extract compactor2 test utils into `compactor2_test_utils` and integration test

* fix: Update compactor2/src/components/mod.rs

Co-authored-by: Marco Neumann <marco@crepererum.net>

---------

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-13 12:06:42 +00:00
Dom Dwyer 2d46a364dc
feat: namespace soft-delete support
This commit adds initial support for "soft" namespace deletion, where
the actual records & data remain, but are no longer queryable /
writeable.

Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.

The components treat soft-deleted namespaces differently:

    * router: ignore soft deleted namespaces
    * ingester: accept soft deleted namespaces
    * compactor: accept soft deleted namespaces
    * querier: ignore soft deleted namespaces
    * various gRPC services: ignore soft deleted namespaces

This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.

Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to the state at which the delete was
issued (rather than losing the buffered data).

Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seem like a trivial win.
2023-02-13 12:01:35 +01:00
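
The per-component handling listed in the soft-delete commit message above can be illustrated with a minimal Rust sketch. This is not the code from the commit: the `Component` enum, the simplified `Namespace` struct, and the `is_visible` helper are hypothetical names introduced for illustration, and modelling `deleted_at` as an optional integer timestamp is an assumption.

```rust
/// Components named in the commit message (hypothetical enum, for illustration only).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Component {
    Router,
    Ingester,
    Compactor,
    Querier,
    GrpcService,
}

/// Simplified, hypothetical namespace record.
/// Assumption: `deleted_at` is `Some(timestamp)` once a soft delete was issued.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Namespace {
    name: String,
    deleted_at: Option<i64>,
}

impl Namespace {
    /// A namespace counts as soft-deleted as soon as `deleted_at` is set.
    fn is_soft_deleted(&self) -> bool {
        self.deleted_at.is_some()
    }
}

/// Returns `true` if `component` should keep working with `ns`.
///
/// Ingester and compactor accept soft-deleted namespaces so that they keep
/// making forward progress; router, querier, and gRPC services ignore them.
fn is_visible(component: Component, ns: &Namespace) -> bool {
    if !ns.is_soft_deleted() {
        return true;
    }
    match component {
        Component::Ingester | Component::Compactor => true,
        Component::Router | Component::Querier | Component::GrpcService => false,
    }
}
```

In this sketch a namespace that has not been deleted is visible to every component, while a soft-deleted one stays visible only to the ingester and the compactor. The eventual consistency mentioned in the message (caches that still hold the old record) is not modelled here.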
dependabot[bot] 0cbd9f6a82
chore(deps): Bump tokio-util from 0.7.5 to 0.7.7 (#6964)
---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-13 10:10:53 +00:00
Andrew Lamb 0db37f564c
chore: Update display, and add initial scenario tests (#6954)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-10 20:20:19 +00:00
Andrew Lamb 779fb93ce7
refactor: move test builders out of compactor2 code (#6953)
* refactor: move test builders out of compactor2 code

* fix: docs
2023-02-10 18:28:09 +00:00
Andrew Lamb d790406085
feat(compactor): Implement ParquetFileSimulator and use it to show layout testing (#6932)
* feat(compactor): implement ParquetFileSimulator, show it working in tests

* fix: Update compactor2/src/components/parquet_files_sink/simulator.rs

Co-authored-by: Nga Tran <nga-tran@live.com>

* feat: Improve display of plan_ir

* refactor: return CompactResult, avoid `mut TestSetup`

---------

Co-authored-by: Nga Tran <nga-tran@live.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-10 14:51:17 +00:00
Dom Dwyer a85dcd745b
refactor(catalog): expose deleted_at on Namespace
Add the new catalog column to the Namespace representation/model.
2023-02-10 14:15:01 +01:00
dependabot[bot] c0c9b51b9e
chore(deps): Bump tokio-util from 0.7.4 to 0.7.5 (#6941)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.4 to 0.7.5.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.4...tokio-util-0.7.5)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-10 09:42:00 +00:00
Marco Neumann 6a470252ac
refactor: `PartitionInfoSource` (#6930)
* refactor: `PartitionInfoSource`

Clean up the driver code a bit. There is certainly a good point in
having all these three sources (partition, table, namespace) separate,
but the driver doesn't really need to know that. In the end, it just
wants to have a `PartitionInfo` instance.

* docs: typo

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-09 16:29:28 +00:00
Andrew Lamb fd42f94fb8
test(compactor): show compactor working (#6910)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-09 16:06:37 +00:00
Marco Neumann 4a97620664
refactor: `ParquetFilesSink` (#6928)
* refactor: pass `PlanIR` by ref

* refactor: `ParquetFilesSink`
2023-02-09 15:53:19 +00:00
Marco Neumann 0e5f31c576
feat: add log around job semaphore and per-partition (#6927)
May help w/ debugging OOMs.
2023-02-09 15:02:34 +00:00
Marco Neumann 7f4fd1013c
feat: measure scratchpad store (#6925)
* refactor: create `objec_store` submodule

* feat: add gauge for scratchpad size
2023-02-09 13:03:26 +00:00
Marco Neumann e33ac920a5
refactor: introduce IR before creating actual DF plan (#6922)
* refactor: introduce IR before creating actual DF plan

Let's have an IR that presents a machine-readable form of what output
files may look like.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* feat: also log plan type

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-09 11:31:18 +00:00
Andrew Lamb bd2a72a4b6
test(compactor): Improve file size display in compactor tests (#6909)
* test(compactor): Improve file size display in compactor tests

* fix: try different display

* fix: update

* fix: use b to show size
2023-02-08 22:35:12 +00:00
Marco Neumann 18d5924dfd
test: allow testing the compactor w/o any real data (#6908)
* test: allow testing the compactor w/o any real data

Things that are missing:

- output files have nondeterministic IDs which interferes w/ snapshot
  testing. We should probably normalize the IDs somehow.
- time ranges of output files are not captured correctly (because the
  mock sink doesn't know how to calculate them)

* fix: Add output assertion

* fix: fmt

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* fix: fmt

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-08 19:10:28 +00:00
Marco Neumann 43b11b8be8
fix: metric labels (#6904)
Metric labels are key-value based.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-08 18:18:56 +00:00
Marco Neumann 5e287ac8e6
refactor: make `TestSetup` immutable (#6907)
* refactor: split off file creation

* refactor: make `TestSetup` immutable
2023-02-08 13:17:15 +00:00
Marco Neumann 6f7f608685
refactor: avoid nested test modules (#6906)
Just call it `tests` and avoid nesting a module in a module.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-08 11:16:36 +00:00
dependabot[bot] 0ecde75af5
chore(deps): Bump object_store from 0.5.3 to 0.5.4 (#6900)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.3...object_store_0.5.4)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-08 09:40:11 +00:00
Marco Neumann dcba47ab58
feat: allow the compactor to process all known partitions (#6887)
* feat: `PartitionRepo::list_ids`

* refactor: `CatalogPartitionsSource` => `CatalogToCompactPartitionsSource`

* feat: allow the compactor to process all known partitions

Closes #6648.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-08 09:32:21 +00:00
Andrew Lamb 47b47f225b
chore: add more insta snapshots for initial setup (#6894)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-07 23:44:59 +00:00
Andrew Lamb d4e07eb6ac
chore: Update target_level_upgrade_split to use insta snapshots for partition verification (#6892) 2023-02-07 20:36:46 +00:00
Andrew Lamb ea0fc79340
chore: Update tests in compactor2 to use insta (#6891)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-07 19:58:43 +00:00
NGA-TRAN c5025d6271 chore: merge main to branch 2023-02-07 14:04:03 -05:00
NGA-TRAN aab6ac7424 refactor: address review comments 2023-02-07 13:35:11 -05:00
Nga Tran 76172daf39
feat: to prevent OOMs/crashes that will lead to skipped compaction, let us compact L1 files when they are large enough (#6877)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-07 17:46:44 +00:00
NGA-TRAN db51d77eb8 docs: even more detailed info about skipped partitions 2023-02-07 10:13:58 -05:00
Andrew Lamb f56023b222
feat: use insta_snapshots (#6884)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-07 15:08:47 +00:00
Nga Tran 6297ad206f
Merge branch 'main' into ntran/c2-skip 2023-02-07 09:38:23 -05:00