kodiakhq[bot]
c0f2ba09ee
Merge branch 'main' into cn/compactor2
2022-12-19 14:22:56 +00:00
dependabot[bot]
299f0e99f9
chore(deps): Bump thiserror from 1.0.37 to 1.0.38
...
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.37 to 1.0.38.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.37...1.0.38)
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-12-19 10:33:32 +00:00
Carol (Nichols || Goulding)
dfd979477c
fix: Update warm compaction code to optionally take shard ID
2022-12-16 17:41:57 -05:00
Carol (Nichols || Goulding)
d7e75d43ea
fix: Make shard ID optional for compactor queries in RPC write mode
2022-12-16 17:28:53 -05:00
Carol (Nichols || Goulding)
2406cdb24b
feat: Create a compactor2 cli
2022-12-16 17:22:06 -05:00
Luke Bond
f419e2c378
feat: warm compaction (#6192)
...
* feat: warm compaction
chore: add missing warm compaction config
chore: tests for warm compaction
chore: modify count usage in warm compaction sql
chore: catalog test for warm compaction; sql fixes
feat: settable target level for compact w/ budget
chore: tests for warm compaction
chore: clarifying comments in warm compaction test
chore: fixed erroneous comment in catalog test
chore: improve warm compactor test by checking file exists
chore: tests for warm compaction
chore: warm compactor test tidy-ups
* chore: improve test for warm compaction
* chore: fix erroneous comment in warm compaction code
2022-12-16 15:59:45 +00:00
dependabot[bot]
1d38d400f0
chore(deps): Bump object_store from 0.5.1 to 0.5.2 (#6339)
...
* chore(deps): Bump object_store from 0.5.1 to 0.5.2
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.1 to 0.5.2.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2)
---
updated-dependencies:
- dependency-name: object_store
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-06 07:53:54 +00:00
Nga Tran
77cbc880f6
feat: Add cap limit on number of partitions to be compacted in parallel (#6305)
...
* feat: Add cap limit on number of partitions to be compacted in parallel
* chore: cleanup
* chore: clearer comments
2022-12-01 21:23:44 +00:00
Luke Bond
7c813c170a
feat: reintroduce compactor first file in partition exception (#6176)
...
* feat: compactor ignores max file count for first file
chore: typo in comment in compactor
* feat: restore special first file in partition compaction logic; add limit
* fix: calculation in compaction max file count
chore: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 15:58:59 +00:00
Luke Bond
f9316decee
chore: expose compactor's hot compaction hours thresholds as cfg (#6060)
...
* chore: expose compactor's hot compaction hours thresholds as cfg
* fix: add missing compactor arg envar; fix some comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-07 15:29:17 +00:00
Nga Tran
654ed98d1f
feat: config param to set when partition is cold (#6044)
...
* feat: config param to set when partition is cold
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* fix: make the default 8 hours and avoid using 8 * 60 because it is a string, not an expression, which makes a test fail
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-03 15:03:56 +00:00
Carol (Nichols || Goulding)
dad1ad1318
feat: Add the catalog service to ingester, querier, and compactor
...
So that `remote get` that uses the catalog service can work no matter
what kind of server you contact.
2022-10-28 10:49:26 -04:00
Carol (Nichols || Goulding)
2e83e04eab
feat: Use workspace package metadata to reduce differences and repetition
2022-10-24 13:04:09 -04:00
Marco Neumann
e0062f2d40
refactor: do NOT use fake DF context for parquet reading (#5942)
...
Use the proper top-level DataFusion context and register the object
store there.
Note that we still hide the `ParquetExec` behind an opaque record batch
stream. Fixing that is next on my list.
Helps with #5897.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-24 08:20:26 +00:00
Carol (Nichols || Goulding)
0132a33946
fix: Rename SkippedCompactionService to CompactionService
...
To make a good place for other compactor-related gRPC actions in the
future.
2022-10-21 13:40:37 -04:00
Carol (Nichols || Goulding)
ba25300b01
feat: Create compactor service to list skipped compactions
2022-10-21 13:40:31 -04:00
Marco Neumann
eb5a661ab3
refactor: prep work for #5897 (#5907)
...
* refactor: add ID to `ParquetStorage`
* refactor: remove duplicate code
* refactor: use dedicated `StorageId`
2022-10-19 11:54:42 +00:00
dependabot[bot]
933493fab3
chore(deps): Bump object_store from 0.5.0 to 0.5.1
...
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.0...object_store_0.5.1)
---
updated-dependencies:
- dependency-name: object_store
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-10-11 01:19:10 +00:00
dependabot[bot]
227dde1dfc
chore(deps): Bump thiserror from 1.0.36 to 1.0.37 (#5753)
...
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.36 to 1.0.37.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.36...1.0.37)
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-29 10:37:14 +00:00
dependabot[bot]
b1740f45d6
chore(deps): Bump thiserror from 1.0.35 to 1.0.36 (#5737)
...
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.35 to 1.0.36.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.35...1.0.36)
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-26 14:44:36 +00:00
Nga Tran
e3deb23bcc
feat: add minimum row_count per file in estimating compacting memory… (#5715)
...
* feat: add minimum row_count per file in estimating compacting memory budget and limit number of files per compaction
* chore: cleanup
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* test: add test per review comments
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* test: add one more test that has limit num files larger than total input files
* fix: make the L1 files in tests not overlapped
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-22 14:37:39 +00:00
dependabot[bot]
b4a25fdb0e
chore(deps): Bump thiserror from 1.0.34 to 1.0.35 (#5629)
...
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.34 to 1.0.35.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.34...1.0.35)
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-14 12:54:12 +00:00
Andrew Lamb
f86d3e31da
chore: Update datafusion + object_store (#5619)
...
* chore: Update datafusion pin
* chore: update object_store to 0.5.0
* chore: Run cargo hakari tasks
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-13 12:34:54 +00:00
Carol (Nichols || Goulding)
dfd7255c46
fix: Remove now-unused cold_input_file_count_threshold
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
3a368c02c2
fix: Remove now-unused cold_input_size_threshold_bytes
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
eefc71ac90
fix: Remove now unused max_cold_concurrent_size_bytes
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
6436afc3d9
fix: Remove cold max bytes CLI option; use existing max bytes CLI option
...
As discussed in https://github.com/influxdata/influxdb_iox/issues/5330#issuecomment-1218170063
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
10ba3fef47
feat: Compact cold partitions completely
...
Fixes #5330.
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding)
b5ca99a3d5
refactor: Make CompactorConfig fields pub
...
I'm spending way too long with the wrong number of arguments to
CompactorConfig::new and not a lot of help from the compiler. If these
struct fields are pub, they can be set directly and destructured, etc,
which the compiler gives way more help on. This also reduces duplication
and boilerplate that has to be updated when the config fields change.
2022-09-07 13:28:19 -04:00
dependabot[bot]
9f0b0328f7
chore(deps): Bump thiserror from 1.0.33 to 1.0.34 (#5556)
...
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.33 to 1.0.34.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.33...1.0.34)
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-06 09:18:41 +00:00
dependabot[bot]
00ed79ff1b
chore(deps): Bump thiserror from 1.0.32 to 1.0.33 (#5524)
...
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.32 to 1.0.33.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.32...1.0.33)
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-01 09:11:31 +00:00
Nga Tran
cb10a7c6d8
feat: More accurate memory estimate for compaction (#5471)
...
* feat: initial implementation of memory estimation for a compaction
* feat: estimate size of files and have the right actions for the needed budget
* feat: run candidates in parallel
* fix: have the right name for the column field of the output struct
* feat: add metrics for estimated budgets
* chore: cleanup
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: fix syntax after applying review's suggestions
* refactor: Convert a Vec to VecDeque to go well with pop and push
* chore: remove max_concurrent_size_bytes and input_size_threshold_bytes
* chore: remove input_file_count_threshold
* test: tests for estimate_arrow_bytes_for_file
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-30 13:44:44 +00:00
Carol (Nichols || Goulding)
58f0b63cdc
refactor: Rename KafkaTopic to Topic or TopicMetadata or topic name as appropriate
2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding)
74c9529062
fix: Rename KafkaPartition to ShardIndex
2022-08-29 14:07:18 -04:00
Carol (Nichols || Goulding)
6443858870
fix: Rename compactor option from sequencer to shard
2022-08-29 14:06:45 -04:00
Jake Goulding
4abf21c724
refactor: Rename Sequencer (and its entourage) to Shard
2022-08-29 14:06:43 -04:00
Nga Tran
3220c6f88b
feat: add file_count_threshold for compacting cold partitions (#5456)
...
* feat: add file_count_threshold for compacting cold partitions to make it consistent with the hot case and make it easier to configure to avoid OOM
* chore: remove unnecessary comments
2022-08-23 20:12:21 +00:00
Andrew Lamb
7f0ae53d6f
chore: Update to (almost) released object_store 0.4.0 (#5419)
...
* chore: update object_store
* chore: update hakari config
* chore: Run cargo hakari tasks
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-17 13:44:48 +00:00
Jake Goulding
49c5281454
refactor: Supersede old CompactorHandlerImpl constructor
2022-08-10 11:28:51 -04:00
Jake Goulding
e9140df476
refactor: extract method to build `compactor` from CLI configuration
2022-08-10 11:28:51 -04:00
Jake Goulding
ce908c8678
refactor: Use CompactorHandlerImpl::new_with_compactor in service
2022-08-10 11:28:51 -04:00
Carol (Nichols || Goulding)
da0b031c44
feat: Add parameters to limit total memory usage of cold partition compaction
2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding)
d55f45a5c2
feat: Run compaction of hot partitions a configurable number of times more than cold
2022-08-04 16:55:48 -04:00
dependabot[bot]
55e1e2ec2b
chore(deps): Bump thiserror from 1.0.31 to 1.0.32 (#5294)
...
* chore(deps): Bump thiserror from 1.0.31 to 1.0.32
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.31 to 1.0.32.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.31...1.0.32)
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 16:20:36 +00:00
Carol (Nichols || Goulding)
07e10852a8
feat: Add an input file count threshold to the compactor settings
2022-07-18 15:41:17 -04:00
Carol (Nichols || Goulding)
128833e7d9
fix: Change placeholder new_param to input_size_threshold_bytes
2022-07-18 15:16:43 -04:00
Carol (Nichols || Goulding)
d62b1ed7ee
feat: Select a subset of parquet files for a partition to compact
...
Fixes #5120.
2022-07-18 15:14:22 -04:00
Carol (Nichols || Goulding)
4416f1ce37
fix: Remove max number of level 0 files configuration option
2022-07-18 15:08:16 -04:00
Carol (Nichols || Goulding)
57c70fcec5
fix: Remove redundant 'compaction' naming from CompactorConfig fields
2022-07-18 15:03:33 -04:00
Nga Tran
c8f4000f04
feat: Select compaction candidates (#5131)
...
* feat: initial implementation for selecting compaction candidates
* feat: 2 catalog functions to choose the highest-throughput partitions to compact and the candidate-selection function itself
* test: tests for the new 2 queries
* feat: more tests and metrics for choosing compaction candidates
* chore: Apply self suggestions from self review
* chore: cleanup
* chore: fix doc comment
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: address review comments
* fix: get the right time provider for the tests
* refactor: remove the left over compaction_
* fix: typos
* fix: make the param name and env name consistent
* refactor: make relevant iSomething to uSomething
* fix: typo
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-07-18 18:05:13 +00:00