It seems that prod was hanging last night. This is pretty hard to debug
and in general we should protect the compactor against hanging /
malformed partitions that take forever. This is similar to the fact that
the querier also has a timeout for every query. Let's see if this shows
anything in prod (and if not, it's still a desired safety net).
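As a rough illustration of such a safety net, the per-partition work could be wrapped in a `tokio` timeout (the function and names below are placeholders, not the actual compactor code):

```rust
use std::time::Duration;
use tokio::time::timeout;

// Hypothetical entry point; the real compactor differs, this only illustrates
// wrapping per-partition work in a timeout so a bad partition cannot hang us.
async fn compact_partition_with_timeout(
    partition_id: i64,
    max_duration: Duration,
) -> Result<(), String> {
    match timeout(max_duration, compact_partition(partition_id)).await {
        Ok(res) => res,
        // The partition took too long: report/skip it instead of hanging forever.
        Err(_elapsed) => Err(format!("compaction of partition {partition_id} timed out")),
    }
}

async fn compact_partition(_partition_id: i64) -> Result<(), String> {
    // Placeholder for the actual compaction work.
    Ok(())
}
```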
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: address review comment of previous PR
* refactor: execute compact plan
* refactor: we will now compact all L0 and L1 files of a partition and split them as needed
* chore: comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
- use a single data structure for CLI args (not two)
- set mem limit default to 8GB (same as querier). We can always tune
this later, but we should not run with "unlimited" to begin with.
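For illustration only, a single clap-derived args struct with an 8 GB default could look roughly like this (the field, flag, and env names are placeholders, assuming clap with the `derive` and `env` features):

```rust
use clap::Parser;

/// Illustrative compactor CLI config; not the actual IOx flag surface.
#[derive(Debug, Parser)]
struct CompactorConfig {
    /// Memory limit for execution, in bytes (default 8 GB = 8 * 1024^3).
    #[clap(
        long = "exec-mem-pool-bytes",
        env = "COMPACTOR_EXEC_MEM_POOL_BYTES",
        default_value = "8589934592"
    )]
    exec_mem_pool_bytes: usize,
}
```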
Sets up crate and wires up the main binary. No tests yet, no algorithm
framework, just the bare minimum.
Also I decided to not offer a gRPC server in `compactor2` at the moment
and hence did not implement any handle/delegate infrastructure. We add
this later if we need it. This also means compactor2 does NOT provide a
catalog service for now.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: cold
* chore: debug info
* feat: only compact qualified cold partition candidates
* fix: catalog test
* chore: cleanup
* chore: add new config flag for cold partition candidates
* chore: implement display for CompactionType and add tests for max num partitions
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: Update replication.proto
* Remove the PartitionId in the replicate request as a single replicate request can have the data for many partitions.
* Add namespace_id and table_id to persist complete request to make data easier to lookup in buffer.
* feat: Initial ingest_replica skeleton
A bunch of copy pasta here from ingester2, but this takes out a ton of stuff that isn't used in replicas.
Also lays the groundwork for the simpler buffer structure to keep the data and a basic cache for catalog information that will be required.
* feat: update replication.proto GetPartitionBufferResponse
* chore: PR cleanup
* chore: PR cleanup
This PR uses the MutableBatch persist cost estimation added in #6425 to
selectively mark "hot" partitions for persistence.
This uses a (composable!) "post-write" observer that is invoked after
each buffer call - this allows the HotPartitionPersister in this commit
to inspect the cost of the partition after applying the write, and if it
exceeds the configurable cost threshold, enqueue it for persistence
(rotating the buffer within the partition in the process).
Unlike ingester(1), this implementation prevents overrun - applying the
write that exceeds the cost limit and enqueueing the partition for
persistence are atomic.
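A rough sketch of the composable post-write observer idea described above; the trait, struct, and field names are assumptions for illustration, not the exact ingester2 API:

```rust
/// Hypothetical observer invoked after every buffer (write) call.
trait PostWriteObserver: Send + Sync {
    fn observe(&self, partition: &mut Partition);
}

/// Illustrative partition state with a persist cost estimate for buffered data.
struct Partition {
    persist_cost_estimate: usize,
    marked_for_persist: bool,
}

/// Marks a partition for persistence once its estimated cost exceeds a threshold.
struct HotPartitionPersister {
    max_cost: usize,
}

impl PostWriteObserver for HotPartitionPersister {
    fn observe(&self, partition: &mut Partition) {
        // The write has already been applied; checking the cost and marking the
        // partition here (within the same critical section in a real system)
        // keeps "exceed the limit" and "enqueue for persistence" atomic.
        if partition.persist_cost_estimate >= self.max_cost && !partition.marked_for_persist {
            partition.marked_for_persist = true;
            // A real implementation would rotate the partition's buffer and hand
            // the data to the persist subsystem here.
        }
    }
}
```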
This commit changes the behaviour of the persist system to enable
optimal parallelism of persist operations, and improve the accuracy of
the outstanding job bound / back-pressure.
Previously all persist operations for a given partition were
consistently hashed to a single worker task. This serialised persistence
per partition, ensuring all updates to the partition sort key were
serialised. However, this also unnecessarily serialises persist
operations that do not need to update the sort key, reducing the
potential throughput of the system; in the worst case of a single
partition receiving all the writes, only one worker would be persisting,
and the other N-1 workers would be idle.
After this change, the sort key is inspected when enqueuing the persist
operation and if it can be determined that no sort key update is
necessary (the typical case), then the persist task is placed into a
global work queue from which all workers consume. This allows for
maximal parallelisation of these jobs and removes the per-worker
head-of-line blocking.
In the case that the sort key does need updating, these jobs continue to
be consistently hashed to a single worker, ensuring serialised sort key
updates only where necessary.
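A simplified sketch of that routing decision (names are illustrative):

```rust
/// Where a persist job is executed.
enum Assignment {
    /// No sort key update needed: any worker may pick the job up.
    GlobalQueue,
    /// Sort key must be updated: route to a fixed worker so updates for the
    /// same partition stay serialised.
    Worker(usize),
}

fn assign(partition_id: u64, needs_sort_key_update: bool, n_workers: usize) -> Assignment {
    if needs_sort_key_update {
        // Consistent hashing: the same partition always maps to the same worker.
        Assignment::Worker((partition_id % n_workers as u64) as usize)
    } else {
        Assignment::GlobalQueue
    }
}
```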
To support these changes, the back-pressure system has been changed to
account for all outstanding persist jobs in the system, regardless of
type or assigned worker - a logical, bounded queue is composed together
of a semaphore limiting the number of persist tasks overall, and a
series of physical, unbounded queues - one to each worker & the global
queue. The overall system remains bounded by the
INFLUXDB_IOX_PERSIST_QUEUE_DEPTH value, and is now simpler to reason
about (it is independent of the number of workers, etc).
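A sketch of how such a logical bound can be composed from a global semaphore plus unbounded physical channels, using `tokio::sync::Semaphore`; names and structure are illustrative:

```rust
use std::sync::Arc;
use tokio::sync::{mpsc, OwnedSemaphorePermit, Semaphore};

/// A persist job holding its queue-depth permit; dropping the job once it
/// completes releases the permit and admits the next one.
struct PersistJob {
    _permit: OwnedSemaphorePermit,
    // ...job payload...
}

/// Logical bounded queue: one global semaphore plus (here) a single unbounded
/// channel; the real system also fans out to per-worker channels.
struct PersistQueue {
    semaphore: Arc<Semaphore>, // sized by the overall queue depth bound
    global_tx: mpsc::UnboundedSender<PersistJob>,
}

impl PersistQueue {
    /// Applies back-pressure once the total number of outstanding persist jobs,
    /// regardless of which physical queue they sit in, reaches the bound.
    async fn enqueue(&self) {
        let permit = Arc::clone(&self.semaphore)
            .acquire_owned()
            .await
            .expect("semaphore closed");
        let _ = self.global_tx.send(PersistJob { _permit: permit });
    }
}
```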
Removes the submission queue from the persist fan-out, instead the
PersistHandle now carries the shared state internally (cheaply cloned
via ref counts).
This also resolves the persist deadlock when under load.
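A minimal illustration of the ref-counted shared-state idea (the real `PersistHandle` fields differ):

```rust
use std::sync::Arc;

/// Shared persist state: queues, semaphore, metrics, ... (illustrative).
struct Inner {
    // ...
}

/// Cheap to clone: copying the handle only bumps a reference count, so it can
/// be handed to each caller without a separate submission queue.
#[derive(Clone)]
struct PersistHandle {
    inner: Arc<Inner>,
}
```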
* refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics
Closes #6310.
* refactor: rename and tune default exec mem limits
* fix: ingester2 bits after rebase
* chore: remove unused/moved ns_autocreation dml handler
* feat(router): expose new ns retention as config
* fix: forgot to set default value for router retention arg
* chore: make new namespace retention param an option
* feat: compactor ignores max file count for first file
chore: typo in comment in compactor
* feat: restore special first file in partition compaction logic; add limit
* fix: calculation in compaction max file count
chore: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: config param to set when partition is cold
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* fix: make the default 8 hours and avoid using 8 * 60 because it is a string, not an expression, which makes a test fail
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* revert: "revert: rdkafka/rskafka swapping (#5800)"
This reverts commit b77c3540e1.
* test: Verify write buffer connection_config is parsed as expected
* test: Failing test reproducing the error seen when deploying rdkafka
* fix: Translate k8s-idpe configs to rdkafka configs
* feat: add minimum row_count per file when estimating the compaction memory budget and limit the number of files per compaction
* chore: cleanup
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* test: add test per review comments
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* test: add one more test where the limit on the number of files is larger than the total input files
* fix: make the L1 files in tests non-overlapping
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
During initialisation, the ingester connects to the Kafka brokers - this
involves per-partition leadership discovery & connection establishment.
These connections are then retained for the lifetime of the process.
Prior to this commit, the ingester would establish a connection to all
partition leaders for a given topic. After this commit, the ingester
connects only to the partition leaders it is going to consume from
(for the shards it is assigned).
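An illustrative filter over the shard assignment (names are assumptions, not the actual ingester code):

```rust
/// Keep only the Kafka partitions this ingester is assigned to consume, so
/// leader discovery and connection establishment happen just for those.
fn partitions_to_connect(all_partitions: &[i32], assigned_shards: &[i32]) -> Vec<i32> {
    all_partitions
        .iter()
        .copied()
        .filter(|p| assigned_shards.contains(p))
        .collect()
}
```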
This limit restricts a single partition to containing at most N rows
before it is marked for persistence (note: being marked for persistence
does not currently prevent further ingest for that partition.)
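A minimal sketch of the row-count check, with the limit and helper names as placeholders:

```rust
/// Returns true once a partition's buffered row count reaches the limit and it
/// should be marked for persistence. Marking does not stop further ingest; it
/// only enqueues the partition for persistence.
fn over_row_limit(buffered_rows: usize, max_rows_per_partition: usize) -> bool {
    buffered_rows >= max_rows_per_partition
}
```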
* feat: initial implementation of memory estimation for a compaction
* feat: estimate file sizes and take the right actions for the needed budget
* feat: run candidates in parallel
* fix: use the right name for the column field of the output struct
* feat: add metrics for estimated budgets
* chore: cleanup
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: fix syntax after applying review's suggestions
* refactor: Convert a Vec to VecDeque to go well with pop and push
* chore: remove max_concurrent_size_bytes and input_size_threshold_bytes
* chore: remove input_file_count_threshold
* test: tests for estimate_arrow_bytes_for_file
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: add file_count_threshold for compacting cold partitions to make it consistent with the hot case and help avoid OOM
* chore: remove unnecessary comments
Esp. this fixes "unused import" warnings when not all features are
enabled, so developer IDEs don't shout.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: make querier RAM pool split a proper feature
- use proper pool names
- expose sizing via CLI/env
Closes https://github.com/influxdata/conductor/issues/1102.
* refactor: improve naming and docs
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: `QueryChunk::as_any`
* feat: allow `ChunkPruner::prune_chunks` to fail
* feat: limit per-table chunk data for every query
Closes #5211.
* fix: address review comments
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* feat: initial implementation for selecting compaction candidates
* feat: two catalog functions to choose the highest-throughput partitions to compact, plus the candidate-selection function itself
* test: tests for the new 2 queries
* feat: more tests and metrics for choosing compaction candidates
* chore: Apply self suggestions from self review
* chore: cleanup
* chore: fix doc comment
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: address review comments
* fix: get the right time provider for the tests
* refactor: remove the leftover compaction_
* fix: typos
* fix: make the param name and env name consistent
* refactor: convert relevant iSomething types to uSomething
* fix: typo
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* test: document how to run tests
Also fix a few issues for local runs.
* docs: add back one-liner for running end to end tests
* docs: add comment for clap_blocks test
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* docs: add comment in influxdb_iox/tests/end_to_end_cases/cli.rs
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
This is a rather quick fix for prod. In the mid-term we probably want to
rethink our deployment strategy, e.g. by using "one query per pod" and
by deploying queryd w/ IOx into the same pod.
* fix: let us not compact no-data
* fix: split time must be greater min_time, too
* fix: resolve merge conflict
* chore: increase size of a compactor job and level of concurrency
Co-authored-by: Dom <dom@itsallbroken.com>
* refactor: split compact_partition into two functions to handle concurrency better
* feat: limit number of files to compact
* test: add test for limit num files
* chore: fix clippy
* feat: split group if over max size
* fix: split the overlapped group to limit size or file num
* chore: reduce config values
* test: add tests and clearer comments for the split_overlapped_groups and test_limit_size_and_num_files
* chore: more comments
* chore: cleanup
* feat: enable debugging of failed querier->ingester requests
- extend `query-ingester` CLI to allow usage of predicates
- on failed requests: log all information required for the CLI
- test the "ingester fails" scenario
* test: explain
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* docs: improve
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: move b64 pred. serde into a single crate
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
The default behavior of the ingester is to panic if the min unpersisted
sequence number in the catalog is unknown to the write buffer due to the
retention policies having evicted that sequence number.
Specifying `--skip-to-oldest-available` changes this behavior to skip to
the oldest sequence number the write buffer does have available and go
from there.
Fixes #4624.
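A hedged sketch of the two behaviours (panic vs. skip); the function and its inputs are stand-ins, not the real write buffer interface:

```rust
/// Illustrative start-offset resolution for the ingester.
fn resolve_start_sequence(
    min_unpersisted: u64,
    oldest_available: u64,
    skip_to_oldest_available: bool,
) -> u64 {
    if min_unpersisted >= oldest_available {
        // The catalog's sequence number is still retained: resume exactly there.
        min_unpersisted
    } else if skip_to_oldest_available {
        // Retention evicted the requested offset: skip to what is left.
        oldest_available
    } else {
        // Default behaviour: refuse to silently drop data.
        panic!(
            "sequence number {min_unpersisted} unknown to write buffer \
             (oldest available: {oldest_available})"
        );
    }
}
```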
* feat: `SortKey::size`
* feat: `FunctionEstimator`
* feat: querier RAM pool
Let's put all the caches into a single RAM pool, so we can at least
somewhat control RAM usage. Note that this does NOT limit peak memory
during query execution, but it should at least stop unlimited cache
growth. A follow-up PR will add metrics.
* refactor: improve some size calculations
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: move noisy debug to trace and fix some comments
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* chore: fix format
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This is useful for local instances that run against a prod system,
because port forwarding can lead to long connection delays.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Connects to #4399.
Only the file-based write buffer is supported. If `--data-dir` is specified,
the write buffer is stored there; otherwise it is stored in a temp directory to be ephemeral.
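A sketch of that directory selection (paths and names are illustrative):

```rust
use std::path::PathBuf;

/// Pick where the file-based write buffer lives: under --data-dir when given,
/// otherwise in an ephemeral temp directory. (Sub-directory names are made up.)
fn write_buffer_dir(data_dir: Option<PathBuf>) -> PathBuf {
    match data_dir {
        Some(dir) => dir.join("write_buffer"),
        None => std::env::temp_dir().join("iox-ephemeral-write-buffer"),
    }
}
```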