* feat(authz): add authorization client.
Add a new authz crate to provide the interface for making authorization
checks from within IOx. This includes the default client that uses
the influxdata.iox.authz.v1 gRPC protocol. This feature is not used
by any IOx component yet.
* feat: optional authorization on write path
Support optionally enabling authorization checks on the /api/v2/write
handler. If an authrorizer is configured then the handler will
attempt to retrieve a token from the request's Authorization header.
If no such token exists then a response with a 401 error code is
returned. If the token is not valid, or does not have write permission
for the requested namespace then a response with a 403 error is
returned.
* chore: add unit test for authz in write handler
Add unit tests that test the correct functioning of the /api/v2/write
handler when an Authorizer is configured.
* chore(authz): use lazy connection
Change the initialization of the authz client to use a lazy connection.
This allows the client to be initialised synchronously.
* chore: Run cargo hakari tasks
* fix(authz): protolint complaints
* fix: authz tests
* fix: benches and lint
* chore: Update clap_blocks/src/authz.rs
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
* chore: Update authz/src/lib.rs
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
* chore: Update clap_blocks/src/authz.rs
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
* chore: review suggestions
* chore: review suggestions
Apply a number of suggestions from review comments. The main
behavioural change is that if the authz service is configured
applictions will perform a probe request to ensure it can communicate
before continuing startup.
* chore: Update router/src/server/http.rs
Co-authored-by: Dom <dom@itsallbroken.com>
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
This implements the RPC "delete_namespace" handler, allowing a namespace
to be soft-deleted.
Adds unit coverage for all handlers & e2e test coverage for the new
handler (the rest were already covered).
The tests also highlight the caching issue documented here:
https://github.com/influxdata/influxdb_iox/issues/6175
This commit adds initial support for "soft" namespace deletion, where
the actual records & data remain, but are no longer queryable /
writeable.
Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.
The components treat soft-deleted namespaces differently:
* router: ignore soft deleted namespaces
* ingester: accept soft deleted namespaces
* compactor: accept soft deleted namespaces
* querier: ignore soft deleted namespaces
* various gRPC services: ignore soft deleted namespaces
This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.
Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to a the state at which the delete was
issued (rather than loosing the buffered data).
Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seems like a trivial win.
Use the namespace schema cache in the router to enforce the
per-namespace table limit (service protection limit), adding O(1)
overhead to the existing column limit evaluation logic.
Prior to this commit, each request that would breach the table limit
would be (potentially partially) applied to the catalog and return an
error. Every subsequent request creating a new table continued to cause
a catalog query, unnecessarily adding load proportional to request
counts.
After this commit, catalog requests are sent when the router instance
can determine (to the best of it's ability, see below) that the request
will not cause the namespace to exceed the table limit.
Because this uses cached schemas, the actual state set of tables may
have changed - this will cause inconsistent enforcement and spurious
errors in the same way it currently does for the column limit. For more
details (and to track a resolution) see:
https://github.com/influxdata/influxdb_iox/issues/5957
The maximum number of tables is part of the Namespace, which is already
loaded in its entirety. This commit copies the value into the
NamespaceSchema, making it available for the router to utilise.
Envoy will connect to an endpoint on demand, and return an
application-level error if it fails with a gRPC status code of
"Unavailable".
It also embeds a metadata entry of {"server": "envoy"} - this commit
uses the two signals (error status code + metadata entry) to drive an
immediate reconnection when observed, assuming the connection is bad.
Prior to this commit, namespaces that had been created on one router
could not be used on another router until the latter was restarted.
Effectively, newly created namespaces couldn't be used.
After this commit, the catalog is also checked when a cache miss occurs,
ensuring the router discovers new, not-yet-cached namespaces.
Service limits are enforced on two values:
* Number of tables in a namespace
* Number of columns in a table
This commit labels the existing service limit hit metric with the type
of limit reached, and adds this information to the log lines emitted.
Ensure a "probe" node is always returned as the first candidate, driving
it to recovery faster.
This also includes a fix for the balancer metrics that would report
probe candidate nodes as healthy nodes.