influxdb

Commit Graph

Author	SHA1	Message	Date
Martin Hilton	13657d5bcc	feat(authz): authorization service client and write integration (#7216 ) * feat(authz): add authorization client. Add a new authz crate to provide the interface for making authorization checks from within IOx. This includes the default client that uses the influxdata.iox.authz.v1 gRPC protocol. This feature is not used by any IOx component yet. * feat: optional authorization on write path Support optionally enabling authorization checks on the /api/v2/write handler. If an authrorizer is configured then the handler will attempt to retrieve a token from the request's Authorization header. If no such token exists then a response with a 401 error code is returned. If the token is not valid, or does not have write permission for the requested namespace then a response with a 403 error is returned. * chore: add unit test for authz in write handler Add unit tests that test the correct functioning of the /api/v2/write handler when an Authorizer is configured. * chore(authz): use lazy connection Change the initialization of the authz client to use a lazy connection. This allows the client to be initialised synchronously. * chore: Run cargo hakari tasks * fix(authz): protolint complaints * fix: authz tests * fix: benches and lint * chore: Update clap_blocks/src/authz.rs Co-authored-by: Marko Mikulicic <mkm@influxdata.com> * chore: Update authz/src/lib.rs Co-authored-by: Marko Mikulicic <mkm@influxdata.com> * chore: Update clap_blocks/src/authz.rs Co-authored-by: Marko Mikulicic <mkm@influxdata.com> * chore: review suggestions * chore: review suggestions Apply a number of suggestions from review comments. The main behavioural change is that if the authz service is configured applictions will perform a probe request to ensure it can communicate before continuing startup. * chore: Update router/src/server/http.rs Co-authored-by: Dom <dom@itsallbroken.com> --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Marko Mikulicic <mkm@influxdata.com> Co-authored-by: Dom <dom@itsallbroken.com>	2023-03-17 15:20:14 +00:00
Carol (Nichols \|\| Goulding)	493f331e4b	fix: Remove the max_compact_size knob and hardcode a multiple (#7197 ) * fix: Remove the max_compact_size knob and hardcode a multiple Rather than panic if the user hasn't set this knob in a particular way, set the max_compact_size to the minimum value we need by multiplying max_desired_file_size_bytes by MIN_COMPACT_SIZE_MULTIPLE. Fixes influxdata/idpe#17259. * refactor: Move computation of max_compact_size_bytes into compactor config * test: change test setups to reflect the purposes of the tests --------- Co-authored-by: NGA-TRAN <nga-tran@live.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-03-15 11:21:28 +00:00
Joe-Blount	c87113ccbf	chore(iox/compactor): rename max_input_parquet_bytes_per_partition (#7160 )	2023-03-08 17:08:08 +00:00
dependabot[bot]	535f3d92a2	chore(deps): Bump serde_json from 1.0.93 to 1.0.94 (#7132 ) Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.93 to 1.0.94. - [Release notes](https://github.com/serde-rs/json/releases) - [Commits](https://github.com/serde-rs/json/compare/v1.0.93...v1.0.94) --- updated-dependencies: - dependency-name: serde_json dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-06 12:12:23 +00:00
dependabot[bot]	3256fcc72e	chore(deps): Bump object_store from 0.5.4 to 0.5.5 Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.4 to 0.5.5. - [Release notes](https://github.com/apache/arrow-rs/releases) - [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md) - [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.4...object_store_0.5.5) --- updated-dependencies: - dependency-name: object_store dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-03-03 02:00:51 +00:00
Andrew Lamb	dd5d4f4435	chore(compactor2): document and test `split_percentage` and `percentage_max_file_size` knobs (#7026 ) * chore: document and test split_percentage and percentage_max_file_size * fix: Apply suggestions from code review Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * chore: add test with both max file size and split percentage * docs: whitespace engineering and small typo --------- Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com>	2023-02-27 15:01:06 +00:00
Carol (Nichols \|\| Goulding)	faae5eb438	chore: Rerun cargo hakari manage-deps	2023-02-27 11:56:15 +01:00
dependabot[bot]	4cf066680e	chore(deps): Bump tempfile from 3.3.0 to 3.4.0 (#7069 ) * chore(deps): Bump tempfile from 3.3.0 to 3.4.0 Bumps [tempfile](https://github.com/Stebalien/tempfile) from 3.3.0 to 3.4.0. - [Release notes](https://github.com/Stebalien/tempfile/releases) - [Changelog](https://github.com/Stebalien/tempfile/blob/master/NEWS) - [Commits](https://github.com/Stebalien/tempfile/commits) --- updated-dependencies: - dependency-name: tempfile dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Dom <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-27 10:02:21 +00:00
Marco Neumann	08578cded5	refactor: n_threads and n_target_partitions are non-zero (#7047 ) * refactor: n_threads and n_target_partitions are non-zero Zero values will just panic. Prevent that earlier. * fix: typo Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> --------- Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com>	2023-02-23 16:57:00 +00:00
Joe-Blount	88d2882350	Merge branch 'main' into alamb/remove_old_algorithm	2023-02-21 09:02:35 -06:00
Joe-Blount	19be6df6cd	Merge branch 'main' into jrb_6_compactor_query_thread_default	2023-02-21 08:17:42 -06:00
Andrew Lamb	716c469324	feat: Implement all-in-one local persistence testing mode (#7027 ) * feat: Implement all-in-one local persistence testing mode * fix: Apply suggestions from code review Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com> --------- Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-21 12:45:25 +00:00
dependabot[bot]	77ee4d512a	chore(deps): Bump http from 0.2.8 to 0.2.9 Bumps [http](https://github.com/hyperium/http) from 0.2.8 to 0.2.9. - [Release notes](https://github.com/hyperium/http/releases) - [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md) - [Commits](https://github.com/hyperium/http/compare/v0.2.8...v0.2.9) --- updated-dependencies: - dependency-name: http dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-02-20 02:54:56 +00:00
Joe-Blount	7b97cdd69c	chore: change default for compactor_config.query_exec_thread_count to be 1 less than the CPU count	2023-02-17 15:29:44 -06:00
Andrew Lamb	d82d00b847	docs(compactor2): Update compactor2 config parameter documentation (#7022 ) * chore: Update compactor2 config parameter documentaton * fix: clarify ording	2023-02-17 21:09:17 +00:00
Andrew Lamb	21a3c8c40d	refactor: delete all at once algorithm	2023-02-17 06:24:26 -05:00
Nga Tran	f69c8adc7c	feat: Compact partition with many L0 files (#7007 ) * feat: initial implementation of the split * feat: split many L0 files in groups and compact them into new and fewer L0 files * test: remove iappropriate AllAtOnce test * refactor: move file classification for initial target to its own function * fix: pop the branch from start to end * chore: address review comments * feat: support splitting to many L1 files * feat: only add extra round to compact level-n files to same level-n files if their files plus overlapped level-n-plus-1 over limit * chore: Apply suggestions from code review Co-authored-by: Andrew Lamb <alamb@influxdata.com> * chore: final cleanup and address comments * chore: run fmt --------- Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-16 21:17:25 +00:00
Nga Tran	0ffb211c54	test: more compactor layout tests (#6988 ) * test: more compactor layout tests * chore: address review comments	2023-02-14 22:14:06 +00:00
Nga Tran	5c506058da	feat: skip partitions of wide tables (#6978 ) * feat: skip partitions of wide tables * test: one more test * refactor: address review comments	2023-02-14 16:42:13 +00:00
Marco Neumann	2283b3d401	fix: `clap_blocks` clippy warning (#6945 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-10 12:29:45 +00:00
Carol (Nichols \|\| Goulding)	d7150b83bf	fix: Consolidate ingester address specification Fixes #6418. Makes sure the querier, the router, and the ingest replica CLI all accept and validate ingester addresses the same, except whether or not at least one value is required.	2023-02-09 15:23:23 -05:00
dependabot[bot]	6327e3d9c0	chore(deps): Bump serde_json from 1.0.92 to 1.0.93 (#6918 ) Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.92 to 1.0.93. - [Release notes](https://github.com/serde-rs/json/releases) - [Commits](https://github.com/serde-rs/json/compare/v1.0.92...v1.0.93) --- updated-dependencies: - dependency-name: serde_json dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-09 10:39:33 +00:00
dependabot[bot]	0ecde75af5	chore(deps): Bump object_store from 0.5.3 to 0.5.4 (#6900 ) Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.3 to 0.5.4. - [Release notes](https://github.com/apache/arrow-rs/releases) - [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md) - [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.3...object_store_0.5.4) --- updated-dependencies: - dependency-name: object_store dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-08 09:40:11 +00:00
Marco Neumann	dcba47ab58	feat: allow the compactor to process all known partitions (#6887 ) * feat: `PartitionRepo::list_ids` * refactor: `CatalogPartitionsSource` => `CatalogToCompactPartitionsSource` * feat: allow the compactor to process all known partitions Closes #6648. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> --------- Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2023-02-08 09:32:21 +00:00
Stuart Carnie	eb245d6774	feat: Initial SQLite catalog schema (#6851 ) * feat: Initial SQLite catalog schema * chore: Run cargo hakari tasks * feat: impls, many TODOs * feat: completed `todo!()`'s * chore: add remaining tests from postgres module * feat: add SQLite to get_catalog API * chore: Add docs * chore: Placate clippy * chore: Placate clippy * chore: PR feedback from @domodwyer --------- Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2023-02-06 22:55:14 +00:00
Marco Neumann	52b43c40bc	refactor: use "endless" stream for compactor work (#6803 ) Instead of looping and polling a fresh set of partitions and constructing a stream from that, use an endless stream instead. This helps w/ efficiency during roll-overs since we can already start to process the next set of partitions while the last ones from the previous round are still in-progress. Closes #6750. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-06 11:11:39 +00:00
dependabot[bot]	6f4e287a3a	chore(deps): Bump serde_json from 1.0.91 to 1.0.92 (#6860 ) Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.91 to 1.0.92. - [Release notes](https://github.com/serde-rs/json/releases) - [Commits](https://github.com/serde-rs/json/compare/v1.0.91...v1.0.92) --- updated-dependencies: - dependency-name: serde_json dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-02-06 08:27:41 +00:00
Nga Tran	e85de74a5d	feat: partition filters for TargetLevel version and a complete test (#6858 ) * feat: partition filters for TargetLevel version and a complete test * chore: Apply suggestions from code review Co-authored-by: Andrew Lamb <alamb@influxdata.com> * chore: run fmt after applying review suggestions in git --------- Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-04 19:54:41 +00:00
Carol (Nichols \|\| Goulding)	30fea67701	fix: Move variables within format strings. Thanks clippy! Changes made automatically using `cargo clippy --fix`.	2023-02-03 13:06:17 -05:00
Nga Tran	1535366666	refactor: rename compact algo versions to reflect their actual work (#6841 ) * refactor: rename compact algo versions to reflect thier actual work * chore: Apply suggestions from code review Co-authored-by: Andrew Lamb <alamb@influxdata.com> --------- Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-02-03 15:06:38 +00:00
Marco Neumann	0a20bd404e	feat: allow selection of different compactor algo versions (#6800 ) Required to lift&shift to hot-cold compaction w/ keeping the codebase maintainable.	2023-02-01 15:33:41 +00:00
Marco Neumann	62697265c1	feat: compactor sharding (#6729 ) I'm not saying we have to use this, but this is a demonstration how easy it would be to add sharding to the compaction tier and also acts as a "backup / insurance" if we ever need it. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-31 14:37:06 +00:00
Marco Neumann	515f0eef64	feat: simple compactor2 self-protection (#6728 ) Add some rough "partition is too big" filter for now until we can deal with them (the framework allows that but we need to set up the proper divide-and-conquer components). This will hopefully prevent our prod compactor from dying that often. Note that this is also duct-tape around two issues: - DataFusion not accounting in-flight data all the time - Our wide fan-out query plans (see https://github.com/influxdata/idpe/issues/16768#issuecomment-1387056833 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-30 10:57:47 +00:00
Dom	a7770f0f7a	Merge branch 'main' into dom/reduce-write-timeout	2023-01-30 09:59:37 +00:00
Christopher M. Wolff	55257b46c9	chore: validate ingester URIs on querier CLI (#6740 ) * chore: add validate for ingesters on querier CLI * chore: fix typo and tests * chore: clippy * chore: review feedback --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-27 21:13:52 +00:00
Dom Dwyer	353b1ad575	feat: configurable RPC write request timeout Allows the user to configure the timeout used for a single RPC write request, and changes the default to a more sensible value (30 -> 3 seconds).	2023-01-27 14:53:48 +01:00
Dom Dwyer	6797eab5fc	feat(router): configurable partition key Allows the partition key to be set at runtime, though it's probably best no one does so for now.	2023-01-27 14:26:18 +01:00
Christopher M. Wolff	c78088b043	fix: update clap parser for --ingester-addresses (#6723 ) * fix: update clap parser for --ingester-addresses * fix: make querier2 specify ingester addrs same as router2 * fix: update clap parser args but do not prepend http:// * chore: cargo fmt	2023-01-27 02:54:57 +00:00
Marco Neumann	30d411dc95	feat: shadow mode (#6712 ) * refactor: remove untyped durations from `compactor2` * feat: shadow mode Closes #6645. * refactor: split input and output store	2023-01-26 14:20:55 +00:00
Marco Neumann	ed694d3be4	feat: introduce scratchpad store for compactor (#6706 ) * feat: introduce scratchpad store for compactor Use an intermediate in-memory store (can be a disk later if we want) to stage all inputs and outputs of the compaction. The reasons are: - fewer IO ops: DataFusion's streaming IO requires slightly more IO requests (at least 2 per file) due to the way it is optimized to read as little as possible. It first reads the metadata and then decides which content to fetch. In the compaction case this is (esp. w/o delete predicates) EVERYTHING. So in contrast to the querier, there is no advantage of this approach. In contrary this easily adds 100ms latency to every single input file. - less traffic: For divide&conquer partitions (i.e. when we need to run multiple compaction steps to deal with them) it is kinda pointless to upload an intermediate result just to download it again. The scratchpad avoids that. - higher throughput: We want to limit the number of concurrent DataFusion jobs because we don't wanna blow up the whole process by having too much in-flight arrow data at the same time. However while we perform the actual computation, we were waiting for object store IO. This was limiting our throughput substantially. - shadow mode: De-coupling the stores in this way makes it easier to implement #6645. Note that we assume here that the input parquet files are WAY SMALLER than the uncompressed Arrow data during compaction itself. Closes #6650. * fix: panic on shutdown * refactor: remove shadow scratchpad (for now) * refactor: make scratchpad safe to use	2023-01-26 10:03:08 +00:00
Andrew Lamb	6caf31acf3	chore: Move garbage collection configuration into clap_blocks (#6678 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-25 11:31:48 +00:00
Marco Neumann	40e6a1a437	feat: job semaphore (#6696 ) * refactor: avoid too-many-arguments * refactor: extract `fetch_partition_info` * feat: job semaphore	2023-01-25 10:35:07 +00:00
Marco Neumann	4521516147	feat: add per-partition timeout (#6686 ) It seems that prod was hanging last night. This is pretty hard to debug and in general we should protect the compactor against hanging / malformed partitions that take forever. This is similar to the fact that the querier also has a timeout for every query. Let's see if this shows anything in prod (and if not it's still a desired safety net). Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-24 16:53:47 +00:00
Nga Tran	840923abab	refactor: execute compaction plan (#6654 ) * chore: address review comment of previous PR * refactor: execute compact plan * refactor: we will now compact all L0 and L1 files of a partition and split them as needed * chore: comnents Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-20 22:34:50 +00:00
Marco Neumann	5e297b4667	refactor: lift up compactor2 CLI args, set mem limit to 8GB (#6631 ) - use a single data structure for CLI args (not two) - set mem limit default to 8GB (same as querier). We can always tune this later, but we should not run with "unlimited" to begin with.	2023-01-19 12:21:51 +00:00
Nga Tran	9ae03b16d6	feat: invokes catalog functions for compactor2 (#6619 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-19 10:33:57 +00:00
Marco Neumann	380a855aab	feat: basic compactor2 algo layout (#6616 )	2023-01-18 18:51:59 +00:00
Marco Neumann	e72173d58d	feat: very basic compactor2 skeleton (#6614 ) Sets up crate and wires up the main binary. No tests yet, no algorithm framework, just the bare minimum. Also I decided to not offer a gRPC server in `compactor2` at the moment and hence did not implement any handle/delegate infrastructure. We add this later if we need it. This also means compactor2 does NOT provide a catalog service for now. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-18 16:36:40 +00:00
Nga Tran	fa0893819c	fix: have warm compaction work with compactor2 (#6571 ) * refactor: same function to select partition candidates * fix: have warm compaction work with compactor2 * fix: format * chore: cleanup	2023-01-12 02:32:39 +00:00
Nga Tran	62c0f3dbdd	feat: have cold compaction work with Compactor2 (#6542 ) * feat: cold * chore: debug info * feat: only compact qualified cold partition candidates * fix: catalog test * chore: cleanup * chore: add new config flag for cold partition candidates * chore: implement display for CompactionType and add tests for max num partitions Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 16:42:57 +00:00

1 2 3 4 5

203 Commits (93ebb42fddde40d2686afbc42bac1701802b9c41)