Carol (Nichols || Goulding)
f5497a3a3d
refactor: Extract a conversion for convenience in tests
2022-09-15 12:48:36 -04:00
Carol (Nichols || Goulding)
dcab9d0ffc
refactor: Combine relevant data with the FilterResult state
...
This encodes the result directly and has each FilterResult variant hold
only the data relevant to that state, so there is no longer any need to
create or check for empty vectors or 0 budget_bytes. Also creates a new
type after checking the filter result state and handling the budget, as
actual compaction doesn't need to care about that.
This could still use more refactoring to become a clearer pipeline of
different states, but I think this is a good start.
2022-09-15 11:13:18 -04:00
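The idea in dcab9d0ffc of keeping data together with the FilterResult state can be sketched as below. This is only an illustration under stated assumptions: the variant names, the `FileInfo` type, and the `filter` function are hypothetical and are not taken from the compactor's actual code.

```rust
/// Hypothetical file record; the real compactor type carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct FileInfo {
    pub id: u64,
    pub estimated_bytes: u64,
}

/// Each variant holds only the data that is meaningful in that state
/// (variant names are assumptions made for this sketch).
#[derive(Debug, PartialEq)]
pub enum FilterResult {
    /// No files were selected, so there is nothing to carry.
    NothingToCompact,
    /// The selected files need more memory than is available.
    OverBudget { needed_bytes: u64 },
    /// Files to compact and the budget they are expected to use.
    Proceed { files: Vec<FileInfo>, budget_bytes: u64 },
}

/// Classify a set of files against the remaining memory budget.
pub fn filter(files: Vec<FileInfo>, remaining_budget_bytes: u64) -> FilterResult {
    if files.is_empty() {
        return FilterResult::NothingToCompact;
    }
    let needed_bytes: u64 = files.iter().map(|f| f.estimated_bytes).sum();
    if needed_bytes > remaining_budget_bytes {
        FilterResult::OverBudget { needed_bytes }
    } else {
        FilterResult::Proceed { files, budget_bytes: needed_bytes }
    }
}
```

With this shape a caller matches on the variant and never has to test for an empty vector or a zero budget to work out which state it is in.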
Carol (Nichols || Goulding)
e57387b8e4
refactor: Extract an inner function so partition isn't needed in tests
2022-09-15 11:10:14 -04:00
Carol (Nichols || Goulding)
a284cebb51
refactor: Store estimated bytes on the CompactorParquetFile
2022-09-15 11:10:14 -04:00
Carol (Nichols || Goulding)
70094aead0
refactor: Make estimating bytes a responsibility of the Partition
...
Table columns for a partition don't change, so rather than carrying
around table columns for the partition and parquet files to look up
repeatedly, have the `PartitionCompactionCandidateWithInfo` keep track
of its column types and be able to estimate bytes given a number of rows
from a parquet file.
2022-09-15 11:10:14 -04:00
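A rough sketch of the approach described in 70094aead0, where the partition keeps its column types and estimates bytes from a row count. The `ColumnType` variants, the per-value byte sizes, and the `PartitionInfo` name are all assumptions made for this example; the real `PartitionCompactionCandidateWithInfo` and its estimation rules may differ.

```rust
/// Hypothetical column types; the real schema types live elsewhere.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ColumnType {
    I64,
    F64,
    Bool,
    Time,
    String,
    Tag,
}

impl ColumnType {
    /// Assumed average in-memory size of one value, in bytes.
    /// These numbers are placeholders chosen for the sketch.
    fn assumed_bytes_per_value(self) -> u64 {
        match self {
            ColumnType::I64 | ColumnType::F64 | ColumnType::Time => 8,
            ColumnType::Bool => 1,
            ColumnType::String => 32,
            ColumnType::Tag => 16,
        }
    }
}

/// Stand-in for the partition candidate: it owns its column types so
/// callers no longer pass table columns around for every lookup.
pub struct PartitionInfo {
    column_types: Vec<ColumnType>,
}

impl PartitionInfo {
    pub fn new(column_types: Vec<ColumnType>) -> Self {
        Self { column_types }
    }

    /// Infallible estimate: bytes per row times the file's row count.
    pub fn estimated_arrow_bytes(&self, row_count: u64) -> u64 {
        let bytes_per_row: u64 = self
            .column_types
            .iter()
            .map(|c| c.assumed_bytes_per_value())
            .sum();
        bytes_per_row.saturating_mul(row_count)
    }
}
```

Because the column types are fixed for the partition, the estimate needs only the row count from each parquet file.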
Nga Tran
7c4c918636
chore: add partition id into panic message ( #5641 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-15 02:21:13 +00:00
kodiakhq[bot]
08e2523295
Merge branch 'main' into cn/always-get-extra-info
2022-09-14 17:01:59 +00:00
Nga Tran
44e12aa512
feat: add needed budget and memory budget into the message for us to diagnose and increase our memory budget as needed ( #5640 )
2022-09-14 16:06:19 +00:00
Carol (Nichols || Goulding)
e16306d21c
refactor: Move fetching of extra partition info into the method because it's always needed
2022-09-14 11:14:17 -04:00
kodiakhq[bot]
85641efa6f
Merge branch 'main' into cn/infallible-estimated-bytes
2022-09-14 01:00:10 +00:00
Nga Tran
f21cb43624
feat: add a few more buckets for the histograms ( #5621 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-13 13:52:23 +00:00
Andrew Lamb
f86d3e31da
chore: Update datafusion + object_store ( #5619 )
...
* chore: Update datafusion pin
* chore: update object_store to 0.5.0
* chore: Run cargo hakari tasks
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-13 12:34:54 +00:00
Carol (Nichols || Goulding)
d971980fd3
fix: Box a source error to please clippy
2022-09-12 17:38:40 -04:00
Carol (Nichols || Goulding)
c3937308f4
fix: Make estimate_arrow_bytes_for_file infallible
2022-09-12 16:50:25 -04:00
Andrew Lamb
1fd31ee3bf
chore: Update datafusion / `arrow` / `arrow-flight` / `parquet` to version 22.0.0 ( #5591 )
...
* chore: Update datafusion / `arrow` / `arrow-flight` / `parquet` to version 22.0.0
* fix: enable dynamic comparison flag
* chore: derive Eq for clippy
* chore: update explain plans
* chore: Update sizes for ReadBuffer encoding
* chore: update more tests
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-12 17:45:03 +00:00
Carol (Nichols || Goulding)
e7a3f15ecf
test: Remove outdated description
2022-09-12 13:13:30 -04:00
Carol (Nichols || Goulding)
8981cbbd84
test: Reduce time from 18 to 9 hours
2022-09-12 13:13:29 -04:00
Carol (Nichols || Goulding)
2ceb779c28
test: Correct a comment that I missed in the 24 hr -> 8 hr switch
2022-09-12 13:13:29 -04:00
Carol (Nichols || Goulding)
baec40a313
test: Correct and expand assertions and descriptions
2022-09-12 13:13:29 -04:00
Carol (Nichols || Goulding)
2aef7c7936
feat: Temporarily disable cold full compaction
2022-09-12 13:13:29 -04:00
Carol (Nichols || Goulding)
743b67f0e9
fix: Re-enable full cold compaction, in serial for now
2022-09-12 13:13:29 -04:00
Carol (Nichols || Goulding)
6e1b06c435
fix: Work with Arc of PartitionCompactionCandidateWithInfo
2022-09-12 13:13:29 -04:00
Carol (Nichols || Goulding)
dfd7255c46
fix: Remove now-unused cold_input_file_count_threshold
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
3a368c02c2
fix: Remove now-unused cold_input_size_threshold_bytes
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
eefc71ac90
fix: Remove now-unused max_cold_concurrent_size_bytes
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
2a22d79c94
feat: Make cold compaction like hot compaction except for candidate selection
...
Temporarily disable full compaction from level 1 to 2.
Re-use the memory budget estimation and parallelization for cold
compaction. Rather than choosing cold compaction candidates and then in
parallel compacting each partition from level 0 to 1 and then 1 to 2,
this commit switches to compacting in parallel (by memory budget) all
candidates from level 0 to 1. The next commit will re-enable full
compaction of all partitions in parallel (by memory budget).
2022-09-12 13:13:28 -04:00
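One way to picture "compacting in parallel (by memory budget)" from 2a22d79c94 is to group candidates into batches whose estimated bytes fit the budget, then run each batch concurrently. The sketch below shows only the grouping step with a simple greedy rule. `Candidate`, `batch_by_budget`, and the rule itself are assumptions for illustration, not the compactor's actual algorithm.

```rust
/// Hypothetical compaction candidate with a precomputed memory estimate.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub partition_id: i64,
    pub estimated_bytes: u64,
}

/// Greedily group candidates so each batch fits within `budget_bytes`.
/// Returns the batches plus any candidates too large to ever fit.
pub fn batch_by_budget(
    candidates: Vec<Candidate>,
    budget_bytes: u64,
) -> (Vec<Vec<Candidate>>, Vec<Candidate>) {
    let mut batches = Vec::new();
    let mut skipped = Vec::new();
    let mut current: Vec<Candidate> = Vec::new();
    let mut used: u64 = 0;

    for candidate in candidates {
        // A candidate that exceeds the whole budget can never run.
        if candidate.estimated_bytes > budget_bytes {
            skipped.push(candidate);
            continue;
        }
        // Close the current batch when this candidate would overflow it.
        if used.saturating_add(candidate.estimated_bytes) > budget_bytes {
            batches.push(std::mem::take(&mut current));
            used = 0;
        }
        used += candidate.estimated_bytes;
        current.push(candidate);
    }
    if !current.is_empty() {
        batches.push(current);
    }
    (batches, skipped)
}
```

Under this sketch the same grouping works for hot and cold compaction; only the way the candidates are selected would differ.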
Carol (Nichols || Goulding)
76228c9fd6
refactor: Move compact_in_parallel and compact_one_partition to lib and make more general
...
Cold compaction is going to use these too.
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
7a3dffb750
refactor: Create wrapper fns that don't take size overrides
...
So that we don't have to pass an empty hashmap in as many places in real
code; the size overrides are only for tests.
2022-09-12 13:13:28 -04:00
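The wrapper pattern from 7a3dffb750 might look roughly like this. The function names and the shape of the override map (file id to size in bytes) are assumptions for the sketch, not the real signatures.

```rust
use std::collections::HashMap;

/// Test-facing variant: a per-file override replaces the default size.
/// (Hypothetical signature; the override map is keyed by file id here.)
pub fn file_size_with_overrides(
    file_id: u64,
    default_bytes: u64,
    size_overrides: &HashMap<u64, u64>,
) -> u64 {
    size_overrides.get(&file_id).copied().unwrap_or(default_bytes)
}

/// Wrapper for real code: no overrides, so callers never have to
/// construct an empty map themselves.
pub fn file_size(file_id: u64, default_bytes: u64) -> u64 {
    file_size_with_overrides(file_id, default_bytes, &HashMap::new())
}
```

Production call sites use the wrapper, and only tests call the variant that accepts overrides.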
Carol (Nichols || Goulding)
608290b83d
fix: Make some hot compaction code more general/parameterized
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
2a5ef3058c
refactor: Move compact_candidates_with_memory_budget to share with cold
2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding)
955e7ea824
fix: Remove unused Error struct
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
ee3e1b851d
fix: Clean up some long lines, comments
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
77f3490246
refactor: Extract cold compaction code into a module like hot
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
c12b3fbb03
refactor: Move to a module named hot to reduce naming duplication
...
My fingers are tired of typing 🤣
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
e3f9984878
docs: Clean up some comments while reading through
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
f2f99727ba
feat: Add metrics for files going into cold compaction
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
ad2db51ac2
refactor: Extract a function to share logic for compacting to L1 or L2
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
6436afc3d9
fix: Remove cold max bytes CLI option; use existing max bytes CLI option
...
As discussed in https://github.com/influxdata/influxdb_iox/issues/5330#issuecomment-1218170063
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding)
723aedfbca
test: Add more cases for cold compaction
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding)
7cd78a3020
fix: Extract and test logic that groups files for cold compaction
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding)
da201ba87f
fix: Select by num of both l0 and l1 files for cold compaction
...
Now that we're going to compact level 1 files into level 2 files as
well.
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding)
6bba3fafaa
fix: If full compaction group has only 1 file, upgrade level
...
As opposed to running full compaction.
Makes the catalog function general and take the level as a parameter
rather than only upgrade to level 1.
2022-09-12 13:13:26 -04:00
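The decision described in 6bba3fafaa, upgrading a lone file's level instead of running full compaction on it, could be sketched as follows. `Action`, `plan_group`, and the use of a plain `u8` for the level are hypothetical; the actual change parameterizes a catalog function by target level.

```rust
/// What to do with one group of files (hypothetical type for the sketch).
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Empty group: nothing to do.
    Nothing,
    /// Single file: just record the new level, no rewrite needed.
    UpgradeLevel { file_id: u64, to_level: u8 },
    /// Several files: compact them into the target level.
    Compact { file_ids: Vec<u64>, to_level: u8 },
}

/// Pick an action for a group, taking the target level as a parameter
/// rather than hard-coding an upgrade to level 1.
pub fn plan_group(file_ids: &[u64], target_level: u8) -> Action {
    match file_ids {
        [] => Action::Nothing,
        [only] => Action::UpgradeLevel {
            file_id: *only,
            to_level: target_level,
        },
        many => Action::Compact {
            file_ids: many.to_vec(),
            to_level: target_level,
        },
    }
}
```

Skipping compaction for a single file avoids rewriting data that would come out unchanged.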
Carol (Nichols || Goulding)
10ba3fef47
feat: Compact cold partitions completely
...
Fixes #5330 .
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding)
327446f0cd
fix: Change default cold hours threshold from 24 hours to 8
...
As requested in https://github.com/influxdata/influxdb_iox/issues/5330#issuecomment-1212468682
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding)
a64a705b60
refactor: Extract a fn for the first step of cold compaction
...
Which is currently the only step, compacting any remaining level 0 files
into level 1. Make a TODO function for performing full compaction of all
level 1 files next.
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding)
7249ef4793
fix: Don't record cold compaction metrics if compaction fails
2022-09-12 13:13:25 -04:00
Marco Neumann
8933f47ec1
refactor: make `QueryChunk::partition_id` non-optional ( #5614 )
...
In our data model, a chunk always belongs to a partition[^1], so let's
not make this attribute optional. The optional value only leads to
-- mostly surprising -- conditional behavior, ranging from "do not equalize
the partition sort key" (querier) to "always consider the chunk overlapping"
(iox_query when dealing with ingester chunks).
[^1]: This is even true when the chunk belongs to a parquet file that is not
yet added to the catalog, contrary to what a comment in the ingester
stated. The catalog and data model used by the querier are two totally
different things.
2022-09-12 13:52:51 +00:00
Carol (Nichols || Goulding)
13de7ac954
feat: Record reasons for skipping compaction of a partition in the database
...
Closes #5458 .
2022-09-09 16:40:48 -04:00
Nga Tran
f03e370ecc
refactor: allocate more accurate length for a hashmap ( #5592 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-09 15:37:29 +00:00
dependabot[bot]
786ce75e26
chore(deps): Bump tokio-util from 0.7.3 to 0.7.4 ( #5596 )
...
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.3 to 0.7.4.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.3...tokio-util-0.7.4)
---
updated-dependencies:
- dependency-name: tokio-util
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-09 07:40:16 +00:00