influxdb

Commit Graph

Author	SHA1	Message	Date
Marco Neumann	adeacf416c	ci: fix (#5569) * ci: use same feature set in `build_dev` and `build_release` * ci: also enable unstable tokio for `build_dev` * chore: update tokio to 1.21 (to fix console-subscriber 0.1.8) * fix: "must use"	2022-09-06 14:13:28 +00:00
Marco Neumann	064f0e9b29	refactor: use DataFusion to read parquet files (#5531) Remove our own hand-rolled logic and let DataFusion read the parquet files. As a bonus, this now supports predicate pushdown to the deserialization step, so we can use parquets as an in-mem buffer. Note that this currently uses some "nested" DataFusion hack due to the way the `QueryChunk` interface works. Midterm I'll change the interface so that the `ParquetExec` nodes are directly visible to DataFusion instead of some opaque `SendableRecordBatchStream`.	2022-09-05 09:25:04 +00:00
Marco Neumann	f45cbfb88d	refactor: fine-grained file size mocking (#5541) * refactor: do not override parquet file size in querier This is going to be an issue when we actually rely on the size for reading, see #5531. * refactor: use selected file size mocking in compactor Do not blindly override parquet file sizes for all subsystems. This is going to be an issue when we actually rely on the size for reading, see #5531. * refactor: remove ability to override file sizes in catalog Blindly overriding data for all subsystems is dangerous, because some parts of our stack actually rely on the actual file size. See #5531. * docs: explain `size_overrides`	2022-09-05 08:50:04 +00:00
Nga Tran	dde65fa7ef	fix: remove timestamp functions from SQLs to be able to use index for improving performance (#5547)	2022-09-02 19:43:52 +00:00
kodiakhq[bot]	b9959fa2d8	Merge branch 'main' into cn/even-more-compactor-tests	2022-09-01 21:02:04 +00:00
Nga Tran	c8cbc5299b	feat: make compactors to select candidates based on the last n minutes (#5535) * feat: make compactors to select candidates based on the last n minutes to reduce workload for postgres catalog query * refactor: remove 1-minute case per review comment	2022-09-01 20:07:26 +00:00
Carol (Nichols || Goulding)	16d631a247	test: Add test for current behavior of skipping a table without columns	2022-08-31 16:26:02 -04:00
Carol (Nichols || Goulding)	1120b49821	refactor: Extract the mock compactor function into a type	2022-08-31 16:17:43 -04:00
Carol (Nichols || Goulding)	b893251efc	test: Add a test that compacting no candidates compacts nothing	2022-08-31 15:30:25 -04:00
Carol (Nichols || Goulding)	b0e871196c	test: Use more iox test utils in this compactor test	2022-08-31 14:37:59 -04:00
Nga Tran	a32d5180b3	fix: loop forever in compact_hot_partition_candidates (#5518) * fix: loop forever in compact_hot_partition_candidates * chore: cleanup * fix: avoid using continues that will cause bugs in corner cases * fix: Pass compaction fn as a closure instead to allow collection of groups in test * fix: Add Send bound as suggested by clippy * fix: fix the test to return data of round 3 instead of round 2 Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-31 17:46:59 +00:00
Andrew Lamb	6669d85fb4	chore: Update datafusion + arrow/parquet to `21.0.0` (#5519) * chore: Update arrow/arrow-flight/parquet to 21.0.0 * chore: Update datafusion pin * chore: Fix arrow update script * chore: Update Cargo.lock * chore: Update for new API	2022-08-31 13:30:47 +00:00
Nga Tran	cb10a7c6d8	feat: More accurate memory estimate for compaction (#5471) * feat: initial implementation of memory estimation for a compaction * feat: estimate size of files and have the right actions for the needed budget * feat: run candidates in parallel * fix: have the right name for the column field of the output struct * feat: add metrics for estimated budgets * chore: cleanup * chore: Apply suggestions from code review Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com> * fix: fix syntax after applying review's suggestions * refactor: Convert a Vec to VecDeque to go well with pop and push * chore: remove max_concurrent_size_bytes and input_size_threshold_bytes * chore: remove input_file_count_threshold * test: tests for estimate_arrow_bytes_for_file Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-30 13:44:44 +00:00
Dom Dwyer	2fc0ddbea1	fix: compactor tolerates empty output Changes the compactor code to tolerate a SplitExec yielding an empty partition (with no rows). This raises a WARN as the situation in which this is acceptable is very rare, and is more likely indicative of an opportunity to improve the SplitExec usage (i.e. pruning out unnecessary split points).	2022-08-30 14:52:31 +02:00
Carol (Nichols || Goulding)	58f0b63cdc	refactor: Rename KafkaTopic to Topic or TopicMetadata or topic name as appropriate	2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding)	74c9529062	fix: Rename KafkaPartition to ShardIndex	2022-08-29 14:07:18 -04:00
Carol (Nichols || Goulding)	c9567cad7d	fix: Rename some more sequencer to shard	2022-08-29 14:06:45 -04:00
Carol (Nichols || Goulding)	6443858870	fix: Rename compactor option from sequencer to shard	2022-08-29 14:06:45 -04:00
Carol (Nichols || Goulding)	fe9c474620	fix: rustfmt	2022-08-29 14:06:45 -04:00
Carol (Nichols || Goulding)	f6c93f7e67	fix: Remove moot comment	2022-08-29 14:06:44 -04:00
Carol (Nichols || Goulding)	698f1a47ff	refactor: Rename test structures from sequencer to shard where appropriate	2022-08-29 14:06:44 -04:00
Jake Goulding	4abf21c724	refactor: Rename Sequencer (and its entourage) to Shard	2022-08-29 14:06:43 -04:00
Nga Tran	3220c6f88b	feat: add file_count_threshold for compacting cold partitions (#5456) * feat: file file_count_threshold for compacting cold partitions to make it consistent with the hot case and help set up to avoid oom easier * chore: remove unnecessary comments	2022-08-23 20:12:21 +00:00
kodiakhq[bot]	2b3ca54168	Merge branch 'main' into cn/upgrade-l0-metrics	2022-08-17 16:01:42 +00:00
Andrew Lamb	7f0ae53d6f	chore: Update to (almost) released object_store 0.4.0 (#5419) * chore: update object_store * chore: update hakari config * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-08-17 13:44:48 +00:00
Carol (Nichols || Goulding)	ef716a5b90	fix: Remove compaction level attribute from the compaction_input_file_bytes metric	2022-08-15 10:50:04 -04:00
Carol (Nichols || Goulding)	a9ed32df89	fix: Remove compaction_counter as it's now redundant with the compaction_input_file_bytes histogram	2022-08-15 10:23:29 -04:00
Carol (Nichols || Goulding)	af95ce7ca6	feat: Add a histogram tracking sizes of files used as inputs to compaction Fixes #5348.	2022-08-15 10:13:54 -04:00
Carol (Nichols || Goulding)	cd6c809fe0	fix: Change metric tracking sizes of files selected for compaction to a histogram Connects to #5348.	2022-08-15 10:13:54 -04:00
Carol (Nichols || Goulding)	b982bdaf2f	fix: Derive Eq when we derive PartialEq and members can derive Eq Allow this in generated code that we don't control, though. Recommended by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq	2022-08-11 15:04:06 -04:00
Marco Neumann	90fec1365f	feat: intern schemas during query planning (#5215) * feat: intern schemas during query planning Helps with #5202. * refactor: `SchemaMerger::build` shall return an `Arc` * feat: `SchemaMerger::with_interner` * refactor: hash-based schema interning	2022-08-11 12:28:51 +00:00
Jake Goulding	68e64af4d1	refactor: extract compactor loop body to call it separately	2022-08-10 11:28:51 -04:00
Jake Goulding	49c5281454	refactor: Supersede old CompactorHandlerImpl constructor	2022-08-10 11:28:51 -04:00
Jake Goulding	cc061b6ce9	refactor: add CompactorHandlerImpl::new_with_compactor This will allow us to refactor the code a level up to create a `Compactor` directly.	2022-08-10 11:28:51 -04:00
Andrew Lamb	c0fc91c627	chore: Warn if a parquet file has no sort key (#5368)	2022-08-10 11:56:50 +00:00
Andrew Lamb	16ddc5efc6	chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem (#5360) * chore: Update datafusion and arrow * chore: Update Cargo.lock * chore: update to Decimal128 * chore: Update tonic/prost/pbjson/etc * chore: Run cargo hakari tasks * fix: doctest in generated types Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-08-09 17:30:44 +00:00
Nga Tran	b71c1a09ea	feat: only sleep when there are neither hot nor cold partitions to compact (#5329) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-05 16:36:36 +00:00
Carol (Nichols || Goulding)	facc967320	fix: Specify hot or cold in more log messages	2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding)	c9d66c30b1	fix: Make this field name consistent With the other fields on this struct and with the corresponding field on the clap block struct.	2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding)	da0b031c44	feat: Add parameters to limit total memory usage of cold partition compaction	2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding)	9d8f94d0d7	fix: Remove an unneeded sleep The cold case won't make a hot busy loop (hah), we'll just go back to working on the hot partitions if there's no cold partitions to do.	2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding)	e1c45e836a	test: Remove copypastaed assertions that duplicate a different test	2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding)	cb6442018e	test: Add more test cases varying number of partitions per sequencer	2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding)	d55f45a5c2	feat: Run compaction of hot partitions a configurable number of times more than cold	2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding)	827e82cfb8	feat: Upgrade one level 0, non-overlapping file without compacting Fixes #1078.	2022-08-04 16:55:47 -04:00
Carol (Nichols || Goulding)	c1d016a00a	feat: Upgrade cold level 0 files when they have no overlaps	2022-08-04 16:55:47 -04:00
Carol (Nichols || Goulding)	9052eabe50	feat: Separate out hot/cold partition compaction and filtering Cold partition compaction will (in the next commit) upgrade a level 0 file without any overlaps rather than running compaction. Cold partition filtering gathers all level 0 files in the (already deemed cold) partition with all overlapping level 1 files, and does not limit the set of files being compacted by their number or size.	2022-08-04 16:55:47 -04:00
Carol (Nichols || Goulding)	fc62c82722	feat: Select cold partitions	2022-08-04 16:55:47 -04:00
Carol (Nichols || Goulding)	6e9c752230	refactor: Extract current compaction into a fn for 'hot' partitions	2022-08-04 16:55:47 -04:00
Marco Neumann	eea8270e83	fix: `compute_split_time` with small step sizes (#5309) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-04 13:40:30 +00:00


265 Commits (d24fb0eae7159a986a92c4e8be04e99321f4fc3f)