Commit Graph

319 Commits (fcd80060befb1080ecb46656d222268dd8fc489e)

Author SHA1 Message Date
Dom Dwyer c76129a7e8
refactor: fix lint failures 2023-04-27 13:19:06 +02:00
Dom be3256d1a7
Merge branch 'main' into dom/proptest-cache 2023-04-26 14:55:58 +01:00
Martin Hilton 4b24c988ad
feat(service_grpc_flight): JDBC compatible Handshake (#7660)
* refactor(authz): move extract_header_token into authz

Move the extract_header_token method into the authz package so that
it can be shared by the query path. The method is renamed to reflect
the fact that it can now also extract a token from gRPC metadata.

The extract_token function is now a little more generic to allow
it to be used with HTTP header values and gRPC metadata values.

* feat(service_grpc_flight): JDBC compatible Handshake

While testing some JDBC based clients we found that some, Tableau
in this case, cannot be configured with authorization tokens. In
these cases we need to be able to support username/password. The
approach taken is to ignore the username and make the token the
password. This is the same approach being taken throughout the
product.

To facilitate this the Flight RPC Handshake command has been extended
to look for Basic authorization credentials and respond with the
appropriate Bearer authorization header.

While adding end-to-end tests the subprocess commands were causing
a deadlock. These have been changed to using the tokio::process
module.

There are also some small changes to the JDBC test application where
the hardcoded values were clashing with the authorization parameters.

* fix: lint

* chore: apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: review suggestion

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-26 13:52:49 +00:00
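The Basic-to-Bearer exchange described in the Handshake commit above might look roughly like the sketch below. It is illustrative only and is not the code from #7660: the function names `decode_base64` and `basic_to_bearer` are hypothetical, and the base64 decoder is hand-rolled purely so the example needs nothing beyond the standard library.

```rust
/// Decode standard (RFC 4648) base64 with optional `=` padding.
/// Hand-rolled only to keep this sketch free of external crates.
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    const ALPHABET: &[u8] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = Vec::new();
    let mut buf: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits currently in `buf`
    for byte in input.trim_end_matches('=').bytes() {
        // Any character outside the alphabet makes the input invalid.
        let value = ALPHABET.iter().position(|&c| c == byte)? as u32;
        buf = (buf << 6) | value;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1;
        }
    }
    Some(out)
}

/// Turn a `Basic <base64(username:password)>` authorization value into a
/// `Bearer <token>` value. The username is ignored and the password is used
/// as the token, as the commit message describes. Returns `None` when the
/// value is not well-formed Basic credentials.
fn basic_to_bearer(authorization: &str) -> Option<String> {
    let encoded = authorization.strip_prefix("Basic ")?;
    let decoded = String::from_utf8(decode_base64(encoded.trim())?).ok()?;
    // Split on the first ':' only, so the password may itself contain ':'.
    let (_username, password) = decoded.split_once(':')?;
    if password.is_empty() {
        return None;
    }
    Some(format!("Bearer {}", password))
}
```

For example, `basic_to_bearer("Basic dXNlcjp0b2tlbg==")` decodes `user:token` and yields `Bearer token`, while a value that is not Basic credentials yields `None`.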
Dom Dwyer 9cc58eb4e4
test: property test namespace schema merging
This commit adds a randomised property test, that compares the results
of the new namespace cache schema merging (#7555) with a known-good
stdlib HashSet union (the cache implementation is effectively a more
specialised set union operation).

This property test also validates the "last writer wins" semantics for
other, non-schema data within the namespace.

Additionally the ChangeSet values returned over a pair of updates are
asserted to reflect the actual values added to the cache (but not each
call individually) to ensure accurate metrics are reported.
2023-04-26 15:45:54 +02:00
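The idea of checking a specialised merge against a known-good `HashSet` union, as in the property test commit above, can be sketched as follows. This is an assumption-laden toy: `merge_columns` stands in for the cache merge, columns are plain `u32` IDs, and a small xorshift generator (hypothetical, named `XorShift` here) replaces a real property-testing framework so the example stays dependency-free.

```rust
use std::collections::HashSet;

/// Toy stand-in for the cache merge: keep every existing column and add any
/// incoming column that is not already present.
fn merge_columns(existing: &[u32], incoming: &[u32]) -> Vec<u32> {
    let mut merged = existing.to_vec();
    for col in incoming {
        if !merged.contains(col) {
            merged.push(*col);
        }
    }
    merged
}

/// Minimal xorshift generator; the seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }

    /// Generate up to 7 distinct column IDs in the range 0..16.
    fn columns(&mut self) -> Vec<u32> {
        let len = (self.next_u64() % 8) as usize;
        let mut cols: Vec<u32> = Vec::new();
        for _ in 0..len {
            let col = (self.next_u64() % 16) as u32;
            if !cols.contains(&col) {
                cols.push(col);
            }
        }
        cols
    }
}

/// Run `cases` random trials. Returns true only if every merge result
/// contains no duplicates and equals the HashSet union of its inputs.
fn merge_matches_set_union(seed: u64, cases: usize) -> bool {
    let mut rng = XorShift(seed);
    (0..cases).all(|_| {
        let (a, b) = (rng.columns(), rng.columns());
        let merged = merge_columns(&a, &b);
        let got: HashSet<u32> = merged.iter().copied().collect();
        let want: HashSet<u32> = a.iter().chain(b.iter()).copied().collect();
        merged.len() == got.len() && got == want
    })
}
```

The oracle is deliberately simpler than the code under test: a set union has no ordering or bookkeeping to get wrong, so any disagreement points at the merge.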
Fraser Savage ffe4747cf2
fix(router): Fix new_columns calculation for namespace cache table merges
This commit adds logic to ensure that all pre-existing columns are
counted when no merge takes place and a test covering that.
2023-04-26 14:00:10 +01:00
Fraser Savage d9111e2a1a
Merge branch 'main' into savage/additive-namespace-schema-caching 2023-04-26 12:30:52 +01:00
Fraser Savage 2921a79ac3
refactor(router): Use sum to count new_columns instead of fold
Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-26 12:21:45 +01:00
Fraser Savage 41ee990d68
fix(router): Re-introduce cache put metric insert/update attribute 2023-04-26 12:04:43 +01:00
Fraser Savage c837a6e8dc
docs(router): Explicitly document use of get() and insert() for schema merge
Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-26 11:55:55 +01:00
dependabot[bot] 09d6b4ae50
chore(deps): Bump tokio-stream from 0.1.12 to 0.1.13 (#7666)
Bumps [tokio-stream](https://github.com/tokio-rs/tokio) from 0.1.12 to 0.1.13.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-stream-0.1.12...tokio-stream-0.1.13)

---
updated-dependencies:
- dependency-name: tokio-stream
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-26 09:31:04 +00:00
Fraser Savage 5a9c68e428
perf(router): Perform NamespaceCache schema merge out of lock
This re-introduces the potential for racy, conflicting schema updates, to
optimise for the expected read-heavy workload. This limits the point at
which write requests may race with schema updates to overlapping calls
to put, rather than the write call-path as a whole.
2023-04-24 16:13:01 +01:00
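The trade-off in the commit above (merge outside the lock, accept a race between overlapping `put` calls) can be illustrated with a minimal sketch. The `Cache` type, its field, and the use of a set of column names as the "schema" are all assumptions made for illustration; the real NamespaceCache is richer.

```rust
use std::collections::{BTreeSet, HashMap};
use std::sync::{Arc, RwLock};

/// Simplified schema: just a set of column names.
type Schema = BTreeSet<String>;

#[derive(Default)]
struct Cache {
    entries: RwLock<HashMap<String, Arc<Schema>>>,
}

impl Cache {
    /// Readers only ever take the (cheap, shared) read lock.
    fn get(&self, name: &str) -> Option<Arc<Schema>> {
        self.entries.read().unwrap().get(name).cloned()
    }

    fn put(&self, name: &str, mut schema: Schema) -> Arc<Schema> {
        // 1. Briefly take the read lock to snapshot the existing entry.
        let existing = self.get(name);

        // 2. Merge while holding no lock at all.
        if let Some(old) = existing {
            schema.extend(old.iter().cloned());
        }
        let merged = Arc::new(schema);

        // 3. Take the write lock only for the insert itself. A concurrent
        //    put() that lands between steps 1 and 3 can be overwritten;
        //    this is the race the commit message accepts.
        self.entries
            .write()
            .unwrap()
            .insert(name.to_string(), Arc::clone(&merged));
        merged
    }
}
```

Holding the write lock across the whole merge would remove the race, at the cost of blocking readers for the duration of the merge, which is the opposite of what a read-heavy workload wants.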
Fraser Savage 065018be11
Merge branch 'main' into savage/additive-namespace-schema-caching 2023-04-24 15:26:42 +01:00
wiedld daabe9663c chore(idpe-17434): make restrictive whitelist of chars accepted, for any NamespaceName 2023-04-21 16:36:00 -07:00
wiedld b870242ec7 chore(idpe-17434): remove utf8-percent encoding on v2 write path, such that it matches v1 writes and onCreate 2023-04-21 16:31:55 -07:00
wiedld 781d6c040d
fix: process query param for token, even when header is not present. (#7619)
* Move the or_else conditional out of the Some() chain
2023-04-21 17:44:59 +00:00
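The shape of the fix above, where the query-parameter fallback must run even when no header is present, can be sketched like this. The function name `extract_token` and the `Token ` header scheme are assumptions for illustration, not the router's actual signature.

```rust
/// Prefer a token from the Authorization header, falling back to a token
/// supplied as a query parameter.
///
/// The bug class being illustrated: if the fallback lives *inside* a
/// `header.map(...)` / `Some(..)` chain, it is skipped whenever the header
/// is absent. Placing `or_else` on the outer Option makes it run in both
/// the "no header" and the "header without a usable token" cases.
fn extract_token<'a>(
    header: Option<&'a str>,
    query_param: Option<&'a str>,
) -> Option<&'a str> {
    header
        .and_then(|value| value.strip_prefix("Token "))
        .or_else(|| query_param)
        .filter(|token| !token.is_empty())
}
```

With this ordering, `extract_token(None, Some("xyz"))` returns `Some("xyz")`, which is the case the commit title calls out.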
wiedld 1d2003d385
feat(idpe-17265): cst write authorization (#7527)
* feat(idpe-17265): authorization should occur as part of the single_tenant specific mod
* authz service is accessed only through the single_tenant mod handler
* authz service is wrapped in auth mod
* move auth integration test into auth mod
* push down the authorize() call into the query params parser call, in order to access query params in the extract_token
* provide configuration error when authz or single_tenant mode are not co-presented
* update authz e2e fixtures

* feat(idpe-17265): extract tokens based upon preferred ordering in spec, and write tests to verify behavior.

* chore(idpe-17265): update naming conventions for a unifying parser

* test: make MockAuthorizer have default, and add a test_delegate_to_authz for CST

* chore: record authz duration metric, and include in delegation test.

* chore: use authz terminology instead of auth_service

* chore: more explicit naming

* Revert "chore: record authz duration metric, and include in delegation test."

This reverts commit 05c36888ca7247b6953343d759a5185098fae679.

* refactor: extract_header_token versus the else condition

* refactor: make single_tenant mod and move auth within

* chore: make unreachable explicitly panic in the build

* test: make token values be const, to be consumed when MockAuthorizer is used

* test: use locking for calls_counter in test

* fix: add base64 encoding as expected for Basic header

* fix: merge conflict resolution. The AuthorizationHeaderExtension is now under the authz::http mod, which is a required feature for router package.

* chore: run rustfmt nightly with preferred import handling, on files with modified imports

* chore: code cleanup, to have minimal code needed
2023-04-19 15:28:10 +00:00
Dom Dwyer 03c5ea5488
feat(router): configurable RPC write message size
Provide a configuration item for the router (in RPC mode) that controls
the maximum outgoing RPC message size when communicating with an
Ingester.

Raises the maximum from the default 4MiB to 100MiB. This does not
increase exposure to memory-based DOS, as writes are size-limited by the
HTTP layer to 10MiB, preventing a user from submitting a write this
large (or larger!) across the RPC boundary.
2023-04-19 14:57:53 +02:00
Dom Dwyer cf38e3bae5
chore: use http in router authz deps
The router should be using the "http" feature - this prevents
crate-specific tests from compiling otherwise.
2023-04-19 14:57:53 +02:00
Fraser Savage 851eda92a3
Merge branch 'main' into savage/additive-namespace-schema-caching 2023-04-18 14:25:52 +01:00
Fraser Savage ae99a8725f
feat(router): Additively merge tables in NamespaceCache
This commit adds additive merge behaviour for tables missing
from the new NamespaceCache entry, as well as moving calculation
of change stat metrics down to the in-memory implementation.

The metrics no longer distinguish between insert and update
cache ops as a result of the change to the `put_schema()` interface.
2023-04-18 14:18:48 +01:00
kodiakhq[bot] f8f57ceeec
Merge branch 'main' into dom/router-deps 2023-04-18 10:53:43 +00:00
Dom Dwyer f46a29aa42
refactor(router): remove unused deps
Removes more unused dependencies in the router specifically.
2023-04-18 12:34:14 +02:00
Dom Dwyer 2b9a809de4
refactor: move HTTP authz helpers into authz
The "server_util" crate exists only to support HTTP authz operations, so
this commit moves it under the authz crate. This helper is gated by a
feature flag allowing callers to opt into this extra HTTP dependency
(disabled by default).
2023-04-18 12:30:56 +02:00
Dom Dwyer c5bb88e173
chore: remove unused dependencies
Some crates import dependencies they never use.
2023-04-18 12:07:13 +02:00
Carol (Nichols || Goulding) d60e4d5823
feat: Delete delete parsing code from router (#7573)
And return the "deletes unsupported" error sooner.

Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-18 09:57:02 +00:00
Fraser Savage dae06b4587
refactor(router): Use by-ref lookup API for NamespaceCache
Assert namespace ID does not change for the same named cache entry,
as it is an invariant.
2023-04-18 10:14:56 +01:00
Fraser Savage b99141f880
docs(router): Clarify namespace cache doc comment
This documents the behaviour on merge more clearly.

Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-17 15:46:46 +01:00
Fraser Savage 32c863ccb5
refactor(router): Unexport NamespaceStats, clean-up column merge loop 2023-04-17 10:56:07 +01:00
Fraser Savage f0615f59d8
docs(router): Update documentation for change to NamespaceCache semantics
The changes to `put_schema()` change its signature and introduce a
side-effect transform. This needed to be documented at the trait level.
2023-04-14 10:15:31 +01:00
Fraser Savage 3edc05884c
refactor(router): Move NamespaceSchema into cache on put
By moving the namespace schema into the Put cache method and returning
the new value wrapped in an Arc, it allows for the cache to merge the
new schema and the existing schema without calling clone() on either.

This has a side effect of allowing the metrics and stats capture
behaviour to be achieved without leaking into the traits definition.
2023-04-14 10:01:36 +01:00
Fraser Savage 3425bc176e
refactor(router): Surface stats for new namespace schema in cache
The previous behaviour of the router's NamespaceCache was to provide
put semantics where the entire schema in the cache is replaced. With
the addition of the additive merging side-effect, the metrics decorator
could not compute the correct statistics. This calculates them during
the merge and surfaces the result to the caller.
2023-04-14 10:01:35 +01:00
Fraser Savage 96365fc1c6
feat(router): Merge schema in namespace cache on write
Rather than unconditionally overwriting the whole namespace schema
in the namespace cache if an entry already exists the in-memory
cache will now merge any column schema missing from the new entry.
In order to calculate correct metrics for column count, the cache needs
to return extra data for an insert.
2023-04-14 10:01:34 +01:00
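A rough sketch of an additive merge that also returns the "extra data" needed for metrics, as the commit above describes. The `Tables` alias, `ChangeStats` struct, and `merge_schema` function are hypothetical names, and the schema is reduced to table and column names only.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified namespace schema: table name -> set of column names.
type Tables = BTreeMap<String, BTreeSet<String>>;

/// What the incoming schema added relative to the cached one.
#[derive(Debug, PartialEq, Eq)]
struct ChangeStats {
    new_tables: usize,
    new_columns: usize,
}

/// Merge `incoming` with the cached `existing` schema (if any). Nothing is
/// ever dropped: tables and columns missing from `incoming` are carried over.
fn merge_schema(existing: Option<&Tables>, mut incoming: Tables) -> (Tables, ChangeStats) {
    let old = match existing {
        Some(old) => old,
        None => {
            // No cached entry: everything in `incoming` counts as new.
            let stats = ChangeStats {
                new_tables: incoming.len(),
                new_columns: incoming.values().map(|cols| cols.len()).sum(),
            };
            return (incoming, stats);
        }
    };

    // Count additions before mutating `incoming`.
    let new_tables = incoming.keys().filter(|t| !old.contains_key(*t)).count();
    let new_columns: usize = incoming
        .iter()
        .map(|(table, cols)| match old.get(table) {
            Some(old_cols) => cols.difference(old_cols).count(),
            None => cols.len(),
        })
        .sum();

    // Additive step: carry over anything the incoming schema lacks.
    for (table, old_cols) in old {
        incoming
            .entry(table.clone())
            .or_default()
            .extend(old_cols.iter().cloned());
    }

    (incoming, ChangeStats { new_tables, new_columns })
}
```

Computing the statistics inside the merge is what lets a metrics layer report accurate counts without re-deriving them from the before and after schemas.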
wiedld ca492b09d2 fix(idpe-17449): accept content-encoding identity for the parseBody 2023-04-13 17:09:21 -07:00
dependabot[bot] e811a69a1e
chore(deps): Bump serde_json from 1.0.95 to 1.0.96 (#7535)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.95 to 1.0.96.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.95...v1.0.96)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-13 10:07:32 +00:00
kodiakhq[bot] 53ddca45d8
Merge branch 'main' into cn/remove-write-summary 2023-04-12 16:07:35 +00:00
Andrew Lamb 20e9c91866
refactor: Use workspace dependencies for `tonic`, `tonic-build`, etc (#7515)
* refactor: Use workspace dependencies for `tonic`, `tonic-build`, etc

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-12 16:07:19 +00:00
Carol (Nichols || Goulding) 6387a9576a
fix: Remove the write_summary crate and write info service 2023-04-12 11:31:23 -04:00
Carol (Nichols || Goulding) d025362ce0
fix: Remove old router 2023-04-12 10:15:48 -04:00
Fraser Savage dc6053bfba
refactor(router): Apply further code review changes, clean up docs 2023-04-12 14:40:02 +01:00
Fraser Savage 8a2b88398f
refactor(router): Apply suggestions from code review
Assert an invariant, document existing edge cases and a little cleanup.

Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-12 14:12:12 +01:00
Fraser Savage a6ccb05caf
refactor(router): DML handler tests use helper fn to set-up NS cache 2023-04-11 16:47:48 +01:00
Fraser Savage 728b7293b9
feat(router): Use read-through namespace cache for NamespaceResolver
The NamespaceResolver was using its own very similar look-aside caching
to the DML handlers, this commit leverages the read-through cache
implementation to deduplicate more code and makes the read through
behavioural expectation explicit for namespace autocreation.
2023-04-11 15:38:18 +01:00
Fraser Savage d590d19e3b
feat(router): Use read-through NamespaceCache with DML handlers
This removes the look-aside cache from the retention_validation
and schema_validation DML handlers, instead setting up the new
NamespaceCache decorator and using that to handle cache misses.
2023-04-11 15:38:17 +01:00
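The read-through decorator pattern used in these NamespaceCache commits can be sketched as below. This is a synchronous simplification under stated assumptions: the real trait is async, the `catalog` closure is a stand-in for catalog I/O, and the names `MemoryCache`, `ReadThrough`, `CacheMiss`, and `ReadThroughError` are illustrative rather than taken from the router crate.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified schema: a list of column names.
type Schema = Vec<String>;

/// Each implementation picks its own error type, so a plain in-memory cache
/// can report only "miss" while a decorator can report richer failures.
trait NamespaceCache {
    type Error;
    fn get_schema(&self, name: &str) -> Result<Arc<Schema>, Self::Error>;
    fn put_schema(&self, name: &str, schema: Schema) -> Arc<Schema>;
}

#[derive(Debug, PartialEq)]
struct CacheMiss;

#[derive(Default)]
struct MemoryCache(Mutex<HashMap<String, Arc<Schema>>>);

impl NamespaceCache for MemoryCache {
    type Error = CacheMiss;
    fn get_schema(&self, name: &str) -> Result<Arc<Schema>, CacheMiss> {
        self.0.lock().unwrap().get(name).cloned().ok_or(CacheMiss)
    }
    fn put_schema(&self, name: &str, schema: Schema) -> Arc<Schema> {
        let schema = Arc::new(schema);
        self.0.lock().unwrap().insert(name.to_string(), Arc::clone(&schema));
        schema
    }
}

/// Distinguishes "namespace does not exist" from a transient backend error.
#[derive(Debug, PartialEq)]
enum ReadThroughError {
    NotFound,
    Catalog(String),
}

/// Decorator: on a miss in `inner`, consult `catalog` and populate `inner`.
struct ReadThrough<C, F> {
    inner: C,
    catalog: F,
}

impl<C, F> NamespaceCache for ReadThrough<C, F>
where
    C: NamespaceCache,
    F: Fn(&str) -> Result<Option<Schema>, String>,
{
    type Error = ReadThroughError;
    fn get_schema(&self, name: &str) -> Result<Arc<Schema>, ReadThroughError> {
        if let Ok(hit) = self.inner.get_schema(name) {
            return Ok(hit);
        }
        match (self.catalog)(name) {
            Ok(Some(schema)) => Ok(self.inner.put_schema(name, schema)),
            Ok(None) => Err(ReadThroughError::NotFound),
            Err(e) => Err(ReadThroughError::Catalog(e)),
        }
    }
    fn put_schema(&self, name: &str, schema: Schema) -> Arc<Schema> {
        self.inner.put_schema(name, schema)
    }
}
```

Because the decorator implements the same trait as the cache it wraps, callers such as the DML handlers need no look-aside logic of their own: a miss is resolved (or reported as an error) behind `get_schema()`.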
Fraser Savage 0bb88dcd4f
refactor(router): Return Result from NamespaceCache, use GAT for Error
This commit refactors the NamespaceCache trait to return a result
instead of an option for calls to `get_schema()`, allowing callers and
decorators to differentiate between cache misses, namespaces not
existing and transient I/O errors. This allows implementations to
interact with backend catalog storage.
2023-04-11 15:38:17 +01:00
Fraser Savage 082e8db9ef
refactor(router): Make NamespaceCache an async_trait
In order to implement a read-through NamespaceCache
decorator the `get_cache()` call will need to interact
with async catalog methods, so this allows implementations
to call await within the `get_cache()` body.
2023-04-11 15:38:16 +01:00
Dom Dwyer 73d44ec9a1
Merge remote-tracking branch 'origin/main' into dom/req-mode-parsing 2023-04-11 13:34:52 +02:00
Martin Hilton d2585002fe
chore(authz): Change "namespace" to "database" (#7502)
Part of the wider effort to consistently use the term "database"
for the user-facing terminology, update the authorization system.
Whilst this system is technically user-facing, it is unlikely many
users will see it. It is however new enough that the change is
relatively little effort.
2023-04-11 11:04:51 +00:00
wiedld 9a56d08ddc test: add namespace char validation tests, to highlight the current contracts for v1/v2 and MT/CST. Contracts will be iterated with followup issue 7489 2023-04-10 12:59:20 -07:00
wiedld 9288155ac4 test: add multi-tenant missing params test, invalid namespaceError test for v1 single tenant, and v2 single-tenant missing bucket should have consistent message. 2023-04-10 12:59:20 -07:00
Dom Dwyer 306bffb4b7 docs: fix comment
Copy/paste.
2023-04-10 12:59:20 -07:00