Commit Graph

100 Commits (fc0634792b9971a2267e1e048b9dc5ff6db5ba1c)

Author SHA1 Message Date
Dom Dwyer cd4087e00d style: add no todo!() or dbg!() lints
Some crates had theme, some not - lets be consistent and have the
compiler spot dbg!() and todo!() macro calls - they should never be in
prod code!
2022-09-29 13:10:07 +02:00
Nga Tran e3deb23bcc
feat: add minimum row_count per file in estimating compacting memory… (#5715)
* feat: add minimum row_count per file in estiumating compacting memory budget and limit number files per compaction

* chore: cleanup

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* test: add test per review comments

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* test: add one more test that has limit num files larger than total input files

* fix: make the L1 files in tests not overlapped

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-22 14:37:39 +00:00
Andrew Lamb f86d3e31da
chore: Update datafusion + object_store (#5619)
* chore: Update datafusion pin

* chore: update object_store to 0.5.0

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-13 12:34:54 +00:00
Carol (Nichols || Goulding) dfd7255c46
fix: Remove now-unused cold_input_file_count_threshold 2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding) 3a368c02c2
fix: Remove now-unused cold_input_size_threshold_bytes 2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding) eefc71ac90
fix: Remove now unused max_cold_concurrent_size_bytes 2022-09-12 13:13:28 -04:00
Carol (Nichols || Goulding) e3f9984878
docs: Clean up some comments while reading through 2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding) 6436afc3d9
fix: Remove cold max bytes CLI option; use existing max bytes CLI option
As discussed in https://github.com/influxdata/influxdb_iox/issues/5330#issuecomment-1218170063
2022-09-12 13:13:27 -04:00
Carol (Nichols || Goulding) 10ba3fef47
feat: Compact cold partitions completely
Fixes #5330.
2022-09-12 13:13:26 -04:00
Dom Dwyer d1ca29c029 fix(ingester): connect to assigned Kafka partition
During initialisation, the ingester connects to the Kafka brokers - this
involves per-partition leadership discovery & connection establishment.
These connections are then retained for the lifetime of the process.

Prior to this commit, the ingester would establish a connection to all
partition leaders for a given topic. After this commit, the ingester
connects to only the partition leaders it is going to consume from
(for those shards that it is assigned.)
2022-09-07 13:21:06 +02:00
Dom Dwyer 2a19606456 feat(ingester): restrict partition row count
This limit restricts a single partition to containing at most N rows
before it is marked for persistence (note: being marked for persistence
does not currently prevent further ingest for that partition.)
2022-08-31 15:48:18 +02:00
Nga Tran cb10a7c6d8
feat: More accurate memory estimate for compaction (#5471)
* feat: initial implementation of memory estimation for a compaction

* feat: estimate size of files and have the right actions for the needed budget

* feat: run candidates in parallel

* fix: have the right name for the column field of the output struct

* feat: add metrics for estimated budgets

* chore: cleanup

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: fix syntax after applying review's suggestions

* refactor: Convert a Vec to VecDeque to go well with pop and push

* chore: remove max_concurrent_size_bytes and input_size_threshold_bytes

* chore: remove input_file_count_threshold

* test: tests for estimate_arrow_bytes_for_file

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-30 13:44:44 +00:00
Raphael Taylor-Davies 711ba77341
chore: update object_store to test IMDSv1 fallback (#5509)
* chore: update object_store to test IMDSv1 fallback

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-30 12:31:49 +00:00
Carol (Nichols || Goulding) 74c9529062
fix: Rename KafkaPartition to ShardIndex 2022-08-29 14:07:18 -04:00
Carol (Nichols || Goulding) 6443858870
fix: Rename compactor option from sequencer to shard 2022-08-29 14:06:45 -04:00
Nga Tran 3220c6f88b
feat: add file_count_threshold for comapcting cold partitions (#5456)
* feat: file file_count_threshold for comapcting cold partitions to make it consistent with the hot case and help set up to avoid oom easier

* chore: remove unecessary commments
2022-08-23 20:12:21 +00:00
Andrew Lamb 7f0ae53d6f
chore: Update to (almost) released object_store 0.4.0 (#5419)
* chore: update object_store

* chore: update hakari config

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-17 13:44:48 +00:00
Carol (Nichols || Goulding) b982bdaf2f
fix: Derive Eq when we derive PartialEq and members can derive Eq
Allow this in generated code that we don't control, though.

Recommended by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq
2022-08-11 15:04:06 -04:00
Carol (Nichols || Goulding) 1b77abdda7
fix: Explain the purpose of this macro, make arg name better 2022-08-10 11:33:43 -04:00
Carol (Nichols || Goulding) 96acf3c54b
fix: Don't rustfmt this module because of a rustfmt bug 2022-08-10 11:30:39 -04:00
Carol (Nichols || Goulding) 75bdd470a2
docs: Rewrap doc comments to 100 cols now that they're indented more 2022-08-10 11:30:22 -04:00
Jake Goulding 3915841a53
feat: Introduce a separate config for the compactor command 2022-08-10 11:30:21 -04:00
Jake Goulding 21864f35e1
refactor: Generate the CompactorConfig in a macro
This will allow us to have related but different configurations for
service and command mode.
2022-08-10 11:30:20 -04:00
Marco Neumann 3446127b65
chore: enable and fix warnings for `clap_blocks` (#5365)
Esp. this fixes "unused import" warnings when not all features are
enabled, so developer IDEs don't shout.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 10:14:34 +00:00
Raphael Taylor-Davies dadcc369b1
chore: update object_store to fix credentials client (#5359)
* chore: update object_store to fix credentials client

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 13:17:43 +00:00
Raphael Taylor-Davies dfa862fd53
chore: temporary allow http always (#5357) 2022-08-09 10:54:42 +00:00
Raphael Taylor-Davies ccb45d7bac
chore: update to rusoto-less object_store (#5342)
* chore: update to rusoto-less object_store

* chore: Run cargo hakari tasks

* chore: further fixes

* chore: document workaround

* chore: review feedback

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 09:06:03 +00:00
Carol (Nichols || Goulding) da0b031c44
feat: Add parameters to limit total memory usage of cold partition compaction 2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding) d55f45a5c2
feat: Run compaction of hot partitions a configurable number of times more than cold 2022-08-04 16:55:48 -04:00
dependabot[bot] e8231b2986
chore(deps): Bump serde_json from 1.0.82 to 1.0.83 (#5297)
* chore(deps): Bump serde_json from 1.0.82 to 1.0.83

Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.82 to 1.0.83.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.82...v1.0.83)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-04 14:28:29 +00:00
Marco Neumann 840e4801b8
feat: make querier RAM pool split a proper feature (#5283)
* feat: make querier RAM pool split a proper feature

- use propre pool names
- expose sizing via CLI/env

Closes https://github.com/influxdata/conductor/issues/1102.

* refactor: improve naming and docs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 15:27:23 +00:00
Marco Neumann 663a20d743
refactor: remove `--ingster-address` (#5255)
Closes #5002.
2022-08-03 15:05:01 +00:00
Nga Tran 8f1b6f2465
chore: reduce log info (#5254)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-01 16:00:34 +00:00
Marco Neumann 9a9a1a4777
feat: limit per-table chunk data for every query (#5223)
* feat: `QueryChunk::as_any`

* feat: allo `ChunkPruner::prune_chunks` to fail

* feat: limit per-table chunk data for every query

Closes #5211.

* fix: address review comments

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-27 13:20:05 +00:00
Nga Tran d05f383a98
refactor: reduce compacting size and compacted file size to prevent compactor from waiting for reading a large file forever (#5206)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-25 20:08:11 +00:00
Andrew Lamb 17231b4001
fix: warn if `--data-dir is specified but `--object_store` type is not file (#5147)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-19 01:00:54 +00:00
Carol (Nichols || Goulding) 07e10852a8
feat: Add an input file count threshold to the compactor settings 2022-07-18 15:41:17 -04:00
Carol (Nichols || Goulding) 128833e7d9
fix: Change placeholder new_param to input_size_threshold_bytes 2022-07-18 15:16:43 -04:00
Carol (Nichols || Goulding) d62b1ed7ee
feat: Select a subset of parquet files for a partition to compact
Fixes #5120.
2022-07-18 15:14:22 -04:00
Carol (Nichols || Goulding) 4416f1ce37
fix: Remove max number of level 0 files configuration option 2022-07-18 15:08:16 -04:00
Carol (Nichols || Goulding) 57c70fcec5
fix: Remove redundant 'compaction' naming from CompactorConfig fields 2022-07-18 15:03:33 -04:00
Carol (Nichols || Goulding) 0828fb5376
fix: Use more accurate number of bytes for MB and GB 2022-07-18 15:01:41 -04:00
Nga Tran c8f4000f04
feat: Select compaction candidates (#5131)
* feat: initial implementation for selecting compaction candidates

* feat: 2 catalog functions to choose the most thorughput partitions to compact and the selecting candidate function itself

* test: tests for the new 2 queries

* feat: more tests and metrics for chooing compaction candidates

* chore: Apply self suggestions from self review

* chore: cleanup

* chore: fix doc comment

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: address review comments

* fix: get the right time provider for the tests

* refactor: remove the left over compaction_

* fix: typos

* fix: make the param name and env name consistent

* refactor: make relevant iSomething to uSomething

* fix: typo

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-07-18 18:05:13 +00:00
Nga Tran 5c5c964dfe
feat: config params for Compactor (#5108)
* feat: config params for Compactor

* refactor: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-13 13:50:07 +00:00
Carol (Nichols || Goulding) 311d4c1f9a
fix: All-in-one mode only supports one partition/sequencer 2022-07-06 11:00:55 -04:00
Nga Tran 1de022136c
feat: add max desired file size config param (#5025)
* feat: add max desired file size config param

* fix: comment typos

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-05 15:32:45 +00:00
Sam Arnold 9438570ba1
test: document how to run tests (#4982)
* test: document how to run tests

Also fix a few issues for local runs.

* docs: add back one-liner for running end to end tests

* docs: add comment for clap_blocks test

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: add comment in influxdb_iox/tests/end_to_end_cases/cli.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-06-30 14:01:35 +00:00
Carol (Nichols || Goulding) 3049479b78
feat: Implement new querier to ingester config design 2022-06-30 08:26:50 -04:00
Carol (Nichols || Goulding) 59da2dccb8
feat: Assert if no ingester addresses are found
Temporarily support `--ingester-addresses` (and always return all
ingesters) so that this PR can be deployed during the switchover.
2022-06-30 08:22:47 -04:00
Carol (Nichols || Goulding) 0e450deca8
feat: Support a sequencer being mapped to multiple ingesters 2022-06-30 08:22:47 -04:00