* feat: provide convenience methods to create Scheduler, and keep the scheduler implementations crate private. External crates can only create a Scheduler based upon configs.
* feat: provide Scheduler as a component to compactor. Specifically, the scheduler configs are present within the compactor run config, and the scheduler in created within the compactor hardcoded components.
* feat: within the compactor ScheduledPartitionsSource, utilize the dyn Scheduler and Scheduler.get_jobs()
* feat: CompactionJob should be per partition, and have a uniqueness characteristic independent of the partition
* feat: keep compactor_scheduler separate from clap_blocks. Only interface is within ioxd_compactor where the CLI configs are transformed into ShardConfig and PartitionsSourceConfig.
* chore: make IdOnlyPartitionFilter into only pub(crate)
* chore: update scheduler display to include any report information (a.k.a. shard_config, if present)
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).
I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!
https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html
This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
The replication flag defines the total number of copies of each write -
slightly less confusing than the additional copies it was previously,
and matches with the actual code.
* refactor: Change catalog configuration so it is entirely dsn based / support end to end testing without postgres
Restores code from https://github.com/influxdata/influxdb_iox/pull/7708
Revert "revert: PR #7708"
This reverts commit c9cfe05f8d.
* fix: merge
* fix: Update new test
This PR increases the batch/page size of list operations in the gc 10x
to 10,000; it introduces a new cli config for the sleep interval between
batches. Previously a single sleep config was used between batches and
between entirely new list operations. This PR also moves to debug some
noisy logging.
* tag to #7689
Still insert them into the database and associate them with namespaces,
but don't ever query them back out.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This PR renames the CLI option and variables throughout the compactor to be `df_concurrency` (and similar) instead of `job_concurrency` (and similar).
From the perspective of the compactor, the most noteworthy consequence of this limit is on concurrency within Data Fusion. This is the limit on the number of DataFusion jobs the compactor may start concurrently.
* refactor: move authz-addr flag into router-specific config
* refactor: move authz-addr flag into querier-specific config
* refactor: remove global AuthzConfig which is now redundant with the pushdown to individual configs. Keep constant the env vars used universally.
* chore: make errors lowercase, and use the required bool for the authz-addr flag
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Provide a configuration item for the ingester2 that controls the maximum
incoming RPC message size.
Raises the maximum from the default 4MiB to a more reasonable 100MiB.
Provide a configuration item for the router (in RPC mode) that controls
the maximum outgoing RPC message size when communicating with an
Ingester.
Raises the maximum from the default 4MiB to 100MiB. This does not
increase exposure to memory-based DOS, as writes are size-limited by the
HTTP layer to 10MiB, preventing a user from submitting a write this
large (or larger!) across the RPC boundary.
This is helpful to test changes in our defaults but also for testing.
Required for https://github.com/influxdata/idpe/issues/17474 .
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This creates a separate option for the number of minutes *without* a
write that a partition must have before being considered for cold
compaction.
This is a new CLI flag so that it can have a different default from hot
compaction's compaction_partition_minute_threshold.
I didn't add "hot" to compaction_partition_minute_threshold's name so
that k8s-idpe doesn't have to change to continue running hot compaction
as it is today.
Then use the relevant threshold earlier, when creating the
PartitionsSourceConfig, to make it clearer which threshold is used
where.
Right now, this will silently ignore any CLI flag specified that isn't
relevant to the current compaction mode. We might want to change that to
warn or error to save debugging time in the future.
Adds a single-tenant mode (CST) to the IOx routers.
Single-tenancy mode differs in two main ways:
* V1 write endpoint is partially supported
* V2 write endpoint ignores "org" parameter
The "normal" mode is "multi tenant" which is the default operational
mode, and all existing behaviour remains unchanged. Single tenant mode
can be enabled by specifying INFLUXDB_IOX_SINGLE_TENANCY=true.
Request parsing is delegated to two implementations of the
WriteParamExtractor trait, one each for CST and MT - the logic of each
"mode" is defined within these files and all other functionality is
common between the two.
This commit also renames some of the error types for clarity
(NoSpecified -> NoOrgBucketSpecified, other NotSpecified ->
NoQueryParams, etc).
Note: single tenant code requires testing
* feat(authz): add authorization client.
Add a new authz crate to provide the interface for making authorization
checks from within IOx. This includes the default client that uses
the influxdata.iox.authz.v1 gRPC protocol. This feature is not used
by any IOx component yet.
* feat: optional authorization on write path
Support optionally enabling authorization checks on the /api/v2/write
handler. If an authrorizer is configured then the handler will
attempt to retrieve a token from the request's Authorization header.
If no such token exists then a response with a 401 error code is
returned. If the token is not valid, or does not have write permission
for the requested namespace then a response with a 403 error is
returned.
* chore: add unit test for authz in write handler
Add unit tests that test the correct functioning of the /api/v2/write
handler when an Authorizer is configured.
* chore(authz): use lazy connection
Change the initialization of the authz client to use a lazy connection.
This allows the client to be initialised synchronously.
* chore: Run cargo hakari tasks
* fix(authz): protolint complaints
* fix: authz tests
* fix: benches and lint
* chore: Update clap_blocks/src/authz.rs
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
* chore: Update authz/src/lib.rs
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
* chore: Update clap_blocks/src/authz.rs
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
* chore: review suggestions
* chore: review suggestions
Apply a number of suggestions from review comments. The main
behavioural change is that if the authz service is configured
applictions will perform a probe request to ensure it can communicate
before continuing startup.
* chore: Update router/src/server/http.rs
Co-authored-by: Dom <dom@itsallbroken.com>
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
* fix: Remove the max_compact_size knob and hardcode a multiple
Rather than panic if the user hasn't set this knob in a particular way,
set the max_compact_size to the minimum value we need by multiplying
max_desired_file_size_bytes by MIN_COMPACT_SIZE_MULTIPLE.
Fixesinfluxdata/idpe#17259.
* refactor: Move computation of max_compact_size_bytes into compactor config
* test: change test setups to reflect the purposes of the tests
---------
Co-authored-by: NGA-TRAN <nga-tran@live.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>