influxdb

Commit Graph

Author	SHA1	Message	Date
Nga Tran	f21cb43624	feat: add a few more buckets for the histograms (#5621) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-13 13:52:23 +00:00
Carol (Nichols \|\| Goulding)	e7a3f15ecf	test: Remove outdated description	2022-09-12 13:13:30 -04:00
Carol (Nichols \|\| Goulding)	8981cbbd84	test: Reduce time from 18 to 9 hours	2022-09-12 13:13:29 -04:00
Carol (Nichols \|\| Goulding)	2ceb779c28	test: Correct a comment that I missed in the 24 hr -> 8 hr switch	2022-09-12 13:13:29 -04:00
Carol (Nichols \|\| Goulding)	baec40a313	test: Correct and expand assertions and descriptions	2022-09-12 13:13:29 -04:00
Carol (Nichols \|\| Goulding)	2aef7c7936	feat: Temporarily disable cold full compaction	2022-09-12 13:13:29 -04:00
Carol (Nichols \|\| Goulding)	743b67f0e9	fix: Re-enable full cold compaction, in serial for now	2022-09-12 13:13:29 -04:00
Carol (Nichols \|\| Goulding)	6e1b06c435	fix: Work with Arc of PartitionCompactionCandidateWithInfo	2022-09-12 13:13:29 -04:00
Carol (Nichols \|\| Goulding)	dfd7255c46	fix: Remove now-unused cold_input_file_count_threshold	2022-09-12 13:13:28 -04:00
Carol (Nichols \|\| Goulding)	3a368c02c2	fix: Remove now-unused cold_input_size_threshold_bytes	2022-09-12 13:13:28 -04:00
Carol (Nichols \|\| Goulding)	eefc71ac90	fix: Remove now unused max_cold_concurrent_size_bytes	2022-09-12 13:13:28 -04:00
Carol (Nichols \|\| Goulding)	2a22d79c94	feat: Make cold compaction like hot compaction except for candidate selection Temporarily disable full compaction from level 1 to 2. Re-use the memory budget estimation and parallelization for cold compaction. Rather than choosing cold compaction candidates and then in parallel compacting each partition from level 0 to 1 and then 1 to 2, this commit switches to compacting in parallel (by memory budget) all candidates from level 0 to 1. The next commit will re-enable full compaction of all partitions in parallel (by memory budget).	2022-09-12 13:13:28 -04:00
Carol (Nichols \|\| Goulding)	76228c9fd6	refactor: Move compact_in_parallel and compact_one_partition to lib and make more general Cold compaction is going to use these too.	2022-09-12 13:13:28 -04:00
Carol (Nichols \|\| Goulding)	7a3dffb750	refactor: Create wrapper fns that don't take size overrides So that we don't have to pass an empty hashmap in as many places in real code, because the size overrides are only for tests	2022-09-12 13:13:28 -04:00
Carol (Nichols \|\| Goulding)	608290b83d	fix: Make some hot compaction code more general/parameterized	2022-09-12 13:13:28 -04:00
Carol (Nichols \|\| Goulding)	2a5ef3058c	refactor: Move compact_candidates_with_memory_budget to share with cold	2022-09-12 13:13:28 -04:00
Carol (Nichols \|\| Goulding)	955e7ea824	fix: Remove unused Error struct	2022-09-12 13:13:27 -04:00
Carol (Nichols \|\| Goulding)	ee3e1b851d	fix: Clean up some long lines, comments	2022-09-12 13:13:27 -04:00
Carol (Nichols \|\| Goulding)	77f3490246	refactor: Extract cold compaction code into a module like hot	2022-09-12 13:13:27 -04:00
Carol (Nichols \|\| Goulding)	c12b3fbb03	refactor: Move to a module named hot to reduce naming duplication My fingers are tired of typing 🤣	2022-09-12 13:13:27 -04:00
Carol (Nichols \|\| Goulding)	e3f9984878	docs: Clean up some comments while reading through	2022-09-12 13:13:27 -04:00
Carol (Nichols \|\| Goulding)	f2f99727ba	feat: Add metrics for files going into cold compaction	2022-09-12 13:13:27 -04:00
Carol (Nichols \|\| Goulding)	ad2db51ac2	refactor: Extract a function to share logic for compacting to L1 or L2	2022-09-12 13:13:27 -04:00
Carol (Nichols \|\| Goulding)	6436afc3d9	fix: Remove cold max bytes CLI option; use existing max bytes CLI option As discussed in https://github.com/influxdata/influxdb_iox/issues/5330#issuecomment-1218170063	2022-09-12 13:13:27 -04:00
Carol (Nichols \|\| Goulding)	723aedfbca	test: Add more cases for cold compaction	2022-09-12 13:13:26 -04:00
Carol (Nichols \|\| Goulding)	7cd78a3020	fix: Extract and test logic that groups files for cold compaction	2022-09-12 13:13:26 -04:00
Carol (Nichols \|\| Goulding)	da201ba87f	fix: Select by num of both l0 and l1 files for cold compaction Now that we're going to compact level 1 files into level 2 files as well.	2022-09-12 13:13:26 -04:00
Carol (Nichols \|\| Goulding)	6bba3fafaa	fix: If full compaction group has only 1 file, upgrade level As opposed to running full compaction. Makes the catalog function general and take the level as a parameter rather than only upgrade to level 1.	2022-09-12 13:13:26 -04:00
Carol (Nichols \|\| Goulding)	10ba3fef47	feat: Compact cold partitions completely Fixes #5330.	2022-09-12 13:13:26 -04:00
Carol (Nichols \|\| Goulding)	327446f0cd	fix: Change default cold hours threshold from 24 hours to 8 As requested in https://github.com/influxdata/influxdb_iox/issues/5330#issuecomment-1212468682	2022-09-12 13:13:26 -04:00
Carol (Nichols \|\| Goulding)	a64a705b60	refactor: Extract a fn for the first step of cold compaction Which is currently the only step, compacting any remaining level 0 files into level 1. Make a TODO function for performing full compaction of all level 1 files next.	2022-09-12 13:13:26 -04:00
Carol (Nichols \|\| Goulding)	7249ef4793	fix: Don't record cold compaction metrics if compaction fails	2022-09-12 13:13:25 -04:00
Marco Neumann	8933f47ec1	refactor: make `QueryChunk::partition_id` non-optional (#5614) In our data model, a chunk always belongs to a partition[^1], so let's not make this attribute optional. The optional value only leads to -- mostly surprising -- conditional behavior, ranging from "do not equalize the partition sort key" (querier) to "always consider the chunk overlapping" (iox_query when dealing with ingester chunks). [^1]: This is even true when the chunk belongs to a parquet file that is not yet added to the catalog, contrary to what a comment in the ingester stated. The catalog and data model used by the querier are two totally different things.	2022-09-12 13:52:51 +00:00
Carol (Nichols \|\| Goulding)	13de7ac954	feat: Record reasons for skipping compaction of a partition in the database Closes #5458.	2022-09-09 16:40:48 -04:00
Nga Tran	f03e370ecc	refactor: allocate more accurate length for a hashmap (#5592) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-09 15:37:29 +00:00
Joe-Blount	333cfa4f3c	chore: address comments - use TimestampMinMax passed by reference	2022-09-07 16:36:39 -05:00
Joe-Blount	97ebad5adb	chore: rustfmt changes	2022-09-07 13:22:36 -05:00
Joe-Blount	4188230694	fix: avoid splitting compaction output for time ranges with no chunks	2022-09-07 13:01:14 -05:00
Carol (Nichols \|\| Goulding)	b5ca99a3d5	refactor: Make CompactorConfig fields pub I'm spending way too long with the wrong number of arguments to CompactorConfig::new and not a lot of help from the compiler. If these struct fields are pub, they can be set directly and destructured, etc, which the compiler gives way more help on. This also reduces duplication and boilerplate that has to be updated when the config fields change.	2022-09-07 13:28:19 -04:00
Carol (Nichols \|\| Goulding)	54eea79773	refactor: Make filtering the parquet files into a closure argument too So that the cold compaction can use different filtering but still use the memory budget function. Not sure I'm happy with this yet, but it's a start.	2022-09-07 13:26:42 -04:00
Carol (Nichols \|\| Goulding)	3e76a155f7	refactor: Make memory budget compaction group function more general In preparation for using it for cold compaction too.	2022-09-07 13:26:42 -04:00
Carol (Nichols \|\| Goulding)	1f69d11d46	refactor: Move hot compaction function into hot compaction module	2022-09-07 13:26:40 -04:00
Carol (Nichols \|\| Goulding)	85fb0acea6	refactor: Extract read_parquet_file test helper function to iox_tests::utils	2022-09-07 13:21:28 -04:00
Marco Neumann	064f0e9b29	refactor: use DataFusion to read parquet files (#5531) Remove our own hand-rolled logic and let DataFusion read the parquet files. As a bonus, this now supports predicate pushdown to the deserialization step, so we can use parquets as an in-mem buffer. Note that this currently uses some "nested" DataFusion hack due to the way the `QueryChunk` interface works. Midterm I'll change the interface so that the `ParquetExec` nodes are directly visible to DataFusion instead of some opaque `SendableRecordBatchStream`.	2022-09-05 09:25:04 +00:00
Marco Neumann	f45cbfb88d	refactor: fine-grained file size mocking (#5541) * refactor: do not override parquet file size in querier This is going to be an issue when we actually rely on the size for reading, see #5531. * refactor: use selected file size mocking in compactor Do not blindly override parquet file sizes for all subsystems. This is going to be an issue when we actually rely on the size for reading, see #5531. * refactor: remove ability to override file sizes in catalog Blindly overriding data for all subsystems is dangerous, because some parts of our stack actually rely on the actual file size. See #5531. * docs: explain `size_overrides`	2022-09-05 08:50:04 +00:00
Nga Tran	dde65fa7ef	fix: remove timestamp functions from SQLs to be able to use index for improving performance (#5547)	2022-09-02 19:43:52 +00:00
kodiakhq[bot]	b9959fa2d8	Merge branch 'main' into cn/even-more-compactor-tests	2022-09-01 21:02:04 +00:00
Nga Tran	c8cbc5299b	feat: make compactors to select candidates based on the last n minutes (#5535) * feat: make compactors to select candidates based on the last n minutes to reduce workload for postgres catalog query * refactor: remove 1-minute case per review comment	2022-09-01 20:07:26 +00:00
Carol (Nichols \|\| Goulding)	16d631a247	test: Add test for current behavior of skipping a table without columns	2022-08-31 16:26:02 -04:00
Carol (Nichols \|\| Goulding)	1120b49821	refactor: Extract the mock compactor function into a type	2022-08-31 16:17:43 -04:00

274 Commits (f21cb4362479b25fd00e04c9effef96c57dade87)