Commit Graph

452 Commits (ac426fe5e1f7dab9d21b3f67fd0eae49b5356cdf)

Author SHA1 Message Date
Fraser Savage a930be45f7
refactor(router): Use map & sum over values instead of fold over iter
Also add a nice comment explaining what the string keys are for
[`ChangeStats`].

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-28 15:11:13 +01:00
Fraser Savage e00a5cab13
perf(router): Pre-compute `ChangeStats` new column total during schema merge
During the schema merge the new tables are iterated over already (to find
which tables and columns are new), so the number needed for the metrics
can be pre-computed to spare two extra loops over the new tables and new
columns returned in `ChangeStats`.
2023-07-27 14:01:50 +01:00
Fraser Savage 5453ad8ba4
feat(router): Include table/column diff for namespace schema cache update
This adds some computational overhead during the merging of new
namespace schema with what's in the router's local cache, but will allow
gossiping of changes.
2023-07-27 13:37:47 +01:00
Fraser Savage c818f90aef
docs(router): Remove code doc ref from router CLI flag text 2023-07-26 11:01:13 +01:00
Fraser Savage 61e79374e0
feat(router): Expose circuit breaker healthcheck config
Exposes the `ERROR_WINDOW` parameter that controls the router's
downstream error-gate health check behaviour as an environment
variable/command line flag. This allows tuning, per-environment, the
period over which the error rate of 80% must be exceeded to cause an
ingester to appear unhealthy.
2023-07-26 09:48:55 +01:00
Fraser Savage c834ec171f
test(router): Custom partition template API create using `time` tag value is rejected
This removes the double negative from the error message and adds
coverage at the router's gRPC API level for the rejection of the bad
TagValue value.
2023-07-24 13:07:04 +01:00
wiedld efae0f108a
feat(idpe-17887): enable `/` in db name for v1 write. (#8235)
* test case for proposed new behavior in v1 write endpoint.
* autogen and default are equivalent reserved words for rp
* have write endpoint match query endpoint, in that db and rp are always concated
2023-07-18 09:36:25 -07:00
dependabot[bot] e33a078128
chore(deps): Bump paste from 1.0.13 to 1.0.14 (#8244)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.13 to 1.0.14.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.13...1.0.14)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-17 16:10:02 +00:00
Carol (Nichols || Goulding) 10a0f8e3bf
fix: Remove ::default() when constructing unit structs
As recommended by https://rust-lang.github.io/rust-clippy/master/index.html#default_constructed_unit_structs
2023-07-14 10:50:55 -04:00
Carol (Nichols || Goulding) d40bc54b71
fix: Remove unneeded double derefs found with new lint suspicious_double_ref_op 2023-07-14 10:25:21 -04:00
Fraser Savage 7e595eca88
test(router): Assert RPC write span contexts can be parsed as encoded
This test aims to add some assertion that the span context is correctly
encoded into an RPC write request as long as the [`TraceHeaderParser`]
is responsible for decorating the requests extensions with the added
information.
2023-07-12 16:41:40 +01:00
Fraser Savage 5a37c92c2c
feat(router): Send tracing SpanContext header to ingester during RPC write 2023-07-12 11:30:50 +01:00
dependabot[bot] 8b000862e1
chore(deps): Bump pretty_assertions from 1.3.0 to 1.4.0 (#8182)
Bumps [pretty_assertions](https://github.com/rust-pretty-assertions/rust-pretty-assertions) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/rust-pretty-assertions/rust-pretty-assertions/releases)
- [Changelog](https://github.com/rust-pretty-assertions/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/rust-pretty-assertions/rust-pretty-assertions/compare/v1.3.0...v1.4.0)

---
updated-dependencies:
- dependency-name: pretty_assertions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-07 09:35:18 +00:00
dependabot[bot] bc6bf2d8e5
chore(deps): Bump smallvec from 1.10.0 to 1.11.0 (#8164)
Bumps [smallvec](https://github.com/servo/rust-smallvec) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/servo/rust-smallvec/releases)
- [Commits](https://github.com/servo/rust-smallvec/compare/v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: smallvec
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-06 09:43:27 +00:00
dependabot[bot] 9a03d9c9fe
chore(deps): Bump paste from 1.0.12 to 1.0.13 (#8139)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.12 to 1.0.13.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.12...1.0.13)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-04 07:57:41 +00:00
Dom Dwyer 8dd159456a
test: assert partitioner row counts
Assert the number of rows yielded by the partitioner matches the number
of input rows.
2023-06-16 14:14:03 +02:00
Dom Dwyer f92b866979
test: better proptest timestamp for DML partition
Changes the proptest for the router's partitioner handler to use a
timestamp generation strategy that more accurately models the
distribution of timestamps in real-world requests.
2023-06-16 12:40:33 +02:00
Dom Dwyer 0a2a315b91
chore: limit chrono features
See https://rustsec.org/advisories/RUSTSEC-2020-0071
2023-06-15 16:41:20 +02:00
Dom Dwyer 5388e49734
test: router partition handler
Asserts the partitioning code within the router (that drives the
low-level partitioning logic) generates partitions with rows with
timestamps that belong in those partitions.
2023-06-15 14:54:46 +02:00
Marco Neumann 335d9f7357
chore: minimize proptest features (#7993) 2023-06-14 12:28:18 +00:00
dependabot[bot] 2ffa9f3cda
chore(deps): Bump crossbeam-utils from 0.8.15 to 0.8.16
Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.8.15 to 0.8.16.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.8.15...crossbeam-utils-0.8.16)

---
updated-dependencies:
- dependency-name: crossbeam-utils
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-06-13 02:00:14 +00:00
Marko Mikulicic d26ad8e079
feat: Allow passing service protection limits in create db gRPC call (#7941)
* feat: Allow passing service protection limits in create db gRPC call

* fix: Move the impl into the catalog namespace trait

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-08 14:28:32 +00:00
Carol (Nichols || Goulding) d0db1194e2
feat: Validate custom partition templates on their creation
Make sure custom partition templates have:

- At least one part
- No more than 8 parts
- Only nonempty, valid strftime formats
2023-06-07 11:38:12 -04:00
Carol (Nichols || Goulding) ac26ceef91
feat: Make a place to do partition template validation
- Create data_types::partition_template::ValidationError
- Make creation of NamespacePartitionTemplateOverride and
  TablePartitionTemplateOverride fallible
- Move SerializationWrapper into a module to make its inner field
  private to force creation through one fallible constructor; this is
  where the validation logic will go to be shared among all uses of
  partition templates
2023-06-07 11:38:12 -04:00
Dom Dwyer 8e61dc5aef
refactor: remove InvalidStrftime value
It's big, it's annoying, it's already available to the user.
2023-06-05 11:31:02 +02:00
Dom Dwyer a873e119c4
test(bench): router partitioner
Adds a benchmark that exercises the router's partitioning DmlHandler
implementation against a set of three files (very small, small, medium)
with 4 different partitioning schemes:

    * Single tag, which occurs in all rows
    * Single tag, which does not occur in any row
    * Default strftime formatter (YYYY-MM-DD)
    * Long and complicated strftime formatter

This covers the entire partitioning overhead - building the formatters,
evaluating each row, grouping the values into per-partition buckets, and
returning to the caller, where it normally would be passed to the next
handler in the pipeline.

Note that only one template part is evaluated in each case - this
measures the overhead of each type of formatter. In reality, we'd expect
partitioning with custom schemes to utilise more than one part,
increasing the cost of partitioning proportionally. This is a
lower-bound measurement!
2023-06-02 16:04:09 +02:00
Dom Dwyer f0832818ee
test(router): invalid strftime partition template
An integration test asserting that a router returns an error when
attempting to partition a write with an invalid strftime partition
formatter, rather than panicking.
2023-06-01 17:44:44 +02:00
Dom Dwyer 47214ec9a0
fix: prevent panics in partitioning logic
Changes the partitioning logic to be fallible. This prevents an invalid
partition template from causing a panic, previously possible through two
known code paths:

    * TagValue formatter referencing a non-tag column
    * Time formatter using an invalid strftime format string

If either occurs, the write attempt is now aborted and an error returned
to the user with a HTTP 500 status code.

Additionally unexpected partitioner errors now map to a catch-all error
instead of panicking.
2023-06-01 17:44:44 +02:00
Dom Dwyer 27bef292a3
feat: unambiguously reversible partition keys
This commit changes the format of partition keys when generated with
non-default partition key templates ONLY. A prior fixture test is
unchanged by this commit, ensuring the default partition keys remain
the same.

When a custom partition key template is provided, it may specify one or
more parts, with the TagValue template causing values extracted from tag
columns to appear in the derived partition key.

This commit changes the generated partition key in the following ways:

    * The delimiter of multi-part partition keys; the character used to
      delimit partition key parts is changed from "/" to "|" (the pipe
      character) as it is less likely to occur in user-provided input,
      reducing the encoding overhead.

    * The format of the extracted TagValue values (see below).

Building on the work of custom partition key overrides, where an
immutable partition template is resolved and set at table creation time,
the changes in this PR enable the derived partition key to be
unambiguously reversed into the set of tag (column_name, column_value)
tuples it was generated from for use in query pruning logic. This is
implemented by the build_column_values() method in this commit, which
requires both the template, and the derived partition key.

Prior to this commit, a partition key value extracted from a tag column
was in the form "tagname_x" where "x" is the value and "tagname" is the
name of the tag column it was extracted from. After this commit, the
partition key value is in the form "x"; the column name is removed from
the derived string to reduce the catalog storage overhead (a key driver
of COGS). In the case of a NULL tag value, the sentinel value "!" is
inserted instead of the prior "tagname_" marker. In the case of an empty
string tag value (""), the sentinel "^" value is inserted instead of the
"tagname_-" marker, ensuring the distinction between an empty value and
a not-present tag is preserved.

Additionally tag values utilise percent encoding to encode reserved
characters (part delimiter, empty sentinel character, % itself) to
eliminate deserialisation ambiguity.

Examples of how this has changed derived partition keys, for a template
of [Time(YYYY-MM-DD), TagValue(region), TagValue(bananas)]:

    Write: time=1970-01-01,region=west,other=ignored
        Old: "1970-01-01-region_west-bananas"
        New: "1970-01-01|west|!"

    Write: time=1970-01-01,other=ignored
        Old: "1970-01-01-region-bananas"
        New: "1970-01-01|!|!"
2023-05-30 15:58:25 +02:00
Dom Dwyer 9e0570f2bf
refactor: explicit submod for partition_template
Move the import into the submodule itself, rather than re-exporting it
at the crate level.

This will make it possible to link to the specific module/logic.
2023-05-30 15:13:20 +02:00
Andrew Lamb 1ff76b7bf2 chore: use workspace dependencies for `object_store` 2023-05-26 07:03:42 -04:00
dependabot[bot] ececd0ada7
chore(deps): Bump base64 from 0.21.1 to 0.21.2 (#7874)
Bumps [base64](https://github.com/marshallpierce/rust-base64) from 0.21.1 to 0.21.2.
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.21.1...v0.21.2)

---
updated-dependencies:
- dependency-name: base64
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-05-26 09:28:43 +00:00
Carol (Nichols || Goulding) de243ad823
test: Verify default template usage 2023-05-25 10:55:51 -04:00
Carol (Nichols || Goulding) fe07e34714
test: Add router tests that set templates and verify writes 2023-05-25 10:44:57 -04:00
Carol (Nichols || Goulding) 17219d71fe
feat: Use the table service in the router 2023-05-25 10:44:57 -04:00
Carol (Nichols || Goulding) fb53faaa2f
refactor: Only use Partitioner::default and derive it 2023-05-24 10:34:31 -04:00
Carol (Nichols || Goulding) 9c0faa66f0
feat: Set a table partition template explicitly or from the namespace
And use the table partition template when partitioning writes to that
table.
2023-05-24 10:34:30 -04:00
Carol (Nichols || Goulding) 604bab9508
fix: Make Table create_or_get be only create 2023-05-24 10:34:30 -04:00
Carol (Nichols || Goulding) afb3838437
feat: Optionally supply the namespace partition template when creating a namespace 2023-05-24 10:10:34 -04:00
Carol (Nichols || Goulding) 6f92bccc99
feat: Use protobuf for PartitionTemplate in CreateNamespace gRPC API
The service implementation doesn't use this field yet.
2023-05-24 10:10:34 -04:00
dependabot[bot] 24a4f36d24
chore(deps): Bump proptest from 1.1.0 to 1.2.0 (#7857)
Bumps [proptest](https://github.com/proptest-rs/proptest) from 1.1.0 to 1.2.0.
- [Release notes](https://github.com/proptest-rs/proptest/releases)
- [Changelog](https://github.com/proptest-rs/proptest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/proptest-rs/proptest/compare/v1.1.0...v1.2.0)

---
updated-dependencies:
- dependency-name: proptest
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-24 09:21:32 +00:00
dependabot[bot] b7fbfa6fb2
chore(deps): Bump criterion from 0.4.0 to 0.5.0 (#7856)
Bumps [criterion](https://github.com/bheisler/criterion.rs) from 0.4.0 to 0.5.0.
- [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bheisler/criterion.rs/compare/0.4.0...0.5.0)

---
updated-dependencies:
- dependency-name: criterion
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-24 09:08:37 +00:00
Dom Dwyer 928a4d163e
build: remove unused dependencies from crates
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).

I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!

    https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html

This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
2023-05-23 14:55:43 +02:00
Dom Dwyer 6cf180738b
test: more exacting retention validation tests
The old tests used partial error string matching, with the whole error
message! So when I added more to the error message, the fixture tests
didn't fail.

This changes the tests to match the full string, and validate the
timestamps are included.
2023-05-23 12:04:52 +02:00
Dom Dwyer ec0d1375d4
feat(router): put timestamps in retention error
Include the minimum acceptable timestamp (the retention bound) and the
observed timestamp that exceeds this bound in the retention enforcement
write error response.
2023-05-23 11:50:32 +02:00
dependabot[bot] 6cb7619d83
chore(deps): Bump base64 from 0.21.0 to 0.21.1 (#7832)
Bumps [base64](https://github.com/marshallpierce/rust-base64) from 0.21.0 to 0.21.1.
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/marshallpierce/rust-base64/commits)

---
updated-dependencies:
- dependency-name: base64
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-22 09:50:06 +00:00
Andrew Lamb 6344fe8c3f
chore: Add rationale for `clippy::future_not_send` (#7822)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-18 16:58:56 +00:00
Dom Dwyer 82500720e4
refactor(cli): update replication help text
The replication flag defines the total number of copies of each write -
slightly less confusing than the additional copies it was previously,
and matches with the actual code.
2023-05-18 16:01:12 +02:00
wiedld 506bd80f6f
Merge branch 'main' into chore/router-metrics-for-auth-v2 2023-05-16 10:24:42 -07:00
Dom 3c47226244
Merge branch 'main' into dom/parallel-replicate 2023-05-16 11:05:10 +01:00
wiedld 2e2aac9ac8 refactor: with updated Authorizer interface, update the metric to delineate the different scenarios 2023-05-15 11:25:01 -07:00
Carol (Nichols || Goulding) 57bedb1c2d
refactor: Extract a test helper function to create a basic namespace 2023-05-15 14:20:38 -04:00
wiedld d087160112 chore: update naming conventions, and use assert_histogram in tests 2023-05-15 09:26:15 -07:00
wiedld 199daee0f6 chore: make AuthorizerInstrumentation use a constant topic (metric name) within the registry 2023-05-15 08:52:09 -07:00
wiedld d8661d043b chore: use new authorizer metric decorator, in the router 2023-05-15 08:52:06 -07:00
wiedld 867fd39dbf
Merge branch 'main' into authz/refactor-interface 2023-05-15 08:03:10 -07:00
Kaya Gökalp 5fe8affb18
refactor: accept NamespaceName with Namespace create (#7774)
Co-authored-by: Dom <dom@itsallbroken.com>
2023-05-15 10:03:55 +00:00
wiedld 4c30e7e04d refactor: Authorizer trait should have a single interface for requested permissions()
* returns an intersection of requested_perms and actual perms_on_token
* returns ok if any of the requested_perms is within the actual perms_on_token
2023-05-12 15:28:58 -05:00
Dom Dwyer dfe1a7dec8
perf(router): parallel write replication
This commit changes the write replication loop to concurrently write to
N distinct upstream ingesters, instead of the previous sequential logic.
2023-05-12 17:04:32 +02:00
Dom Dwyer bf93014bb7
feat: concurrent lending iterator
Changes the UpstreamSnapshot to be suitable for concurrent use. This
type contains the core logic to enable a caller to uphold the
responsibility of ensuring replicated writes land on distinct ingesters
in the presence of concurrent replication.

The clients within the snapshot are returned to at most one concurrent
caller at a time, by tracking the state of each client as a FSM:

                        ┌────────────────┐
                     ┌─▶│   Available    │
                     │  └────────────────┘
                     │           │
                   drop       next()
                     │           │
                     │           ▼
                     │  ┌────────────────┐
                     └──│    Yielded     │
                        └────────────────┘
                                 │
                              remove
                                 │
                                 ▼
                        ┌────────────────┐
                        │      Used      │
                        └────────────────┘

Once a client has been yielded it will not be yielded again until it is
dropped (transitioning the FSM from "yielded" to "available" again,
returning it to the candidate pool of clients) or removed (transitioning
to "used", permanently preventing it from being yielded to another
caller).
2023-05-12 17:04:32 +02:00
Dom Dwyer cdaf99268c
refactor: owned client in UpstreamSnapshot
Changes then UpstreamSnapshot to return owned clients, instead of
references to those clients.

This will allow the snapshot to have a 'static lifetime, suitable for
use across tasks.
2023-05-12 16:59:49 +02:00
Dom Dwyer dc27ae5fbf
refactor: eliminate impossible error
Because the number of candidate upstreams is checked to exceed the
number of desired data copies before starting the write loop, and
because the parallelism of the write loop matches the number of desired
data copies, it's not possible for any thread to observe an empty
snapshot.

This commit removes the unreachable error condition for clarity.
2023-05-12 16:59:49 +02:00
Dom Dwyer 465158e08e
test(router): replication prop/invariant fuzzing
Adds a property-based test of the RPC write handler's replication logic,
ensuring:

    1. If the number of healthy upstreams is 0, NoHealthyUpstreams is
       returned and no requests are attempted.

    2. Given N healthy upstreams (> 0) and a replication factor of R:
       if N < R, "not enough replicas" is returned and no requests are
       attempted.

    3. Upstreams that return an error are retried until the entire
       write succeeds or times out.

    4. Writes are replicated to R distinct upstreams successfully, or
       an error is returned.

    5. One an upstream write is ack'd as successful, it is never
       requested again.

    6. An upstream reporting as unhealthy at the start of the write is
       never requested (excluding probe requests).

These properties describe a mixture of invariants (don't replicate your
two copies of a write to the same ingester) and expected behaviour of
the replication logic (optimisations like "don't try writes when you
already know they'll fail").

This passes for the single-threaded replication logic used at the time
of this commit, and will be used to validate correctness of a concurrent
replication implementation - a concurrent approach should uphold these
properties the same way a single-threaded implementation does.
2023-05-12 16:59:48 +02:00
Dom Dwyer cf622c1b91
test: MockWriteClient tracks ACK response count
Changes the MockWriteClient to track how many success responses it has
returned in response to a write request.
2023-05-12 16:59:48 +02:00
Dom Dwyer ac656ab1f9
refactor: clearer NoHealthyUpstreams error name
Renames NoUpstreams -> NoHealthyUpstreams as it's confusing because we
also have "not enough replicas" which could be no upstreams? This has a
slightly clearer meaning.
2023-05-12 16:59:47 +02:00
Dom Dwyer 8e74f0a568
test: proptest upstream snapshot cycles
Adds a proptest that ensures the set of upstream ingesters is cycled
over indefinitely, with each element yielded an equal number of times.
2023-05-12 16:59:44 +02:00
Dom 3f0e7745c2
Merge branch 'main' into dom/rpc-client-error-split 2023-05-09 14:41:12 +01:00
Dom Dwyer ab666ea5fa
refactor: owned ColumnsByName constructor only
Refactors the From<BtreeMap> impl that accepted a &str name for
ColumnsByName construction, instead allowing only the owned String, and
updating the test that makes use of it appropriately.
2023-05-09 14:55:03 +02:00
Carol (Nichols || Goulding) 23c0110b32
feat: Create newtypes for different partition templates
So that the different kinds aren't mixed up. Also extracts the logic
having to do with which template takes precedence onto the
PartitionTemplate type itself.
2023-05-09 14:55:02 +02:00
Carol (Nichols || Goulding) ebceabb608
feat: Add an assertion for the new invariant that namespace partition templates never change 2023-05-09 14:55:02 +02:00
Carol (Nichols || Goulding) 70dca8f60b
fix: Pass the NamespaceSchema through the dml write traits 2023-05-09 14:55:02 +02:00
Carol (Nichols || Goulding) c062d2d890
fix: Change NamespaceResolver to return the whole cached NamespaceSchema
Rather than picking out the ID and partition template to be passed
around separately
2023-05-09 14:55:01 +02:00
Carol (Nichols || Goulding) e8a480f5f6
fix: Give up ownership of Column when adding to a table
To enable reuse of existing allocations rather than borrowing, creating
new allocations, then dropping them.
2023-05-09 14:55:00 +02:00
Carol (Nichols || Goulding) e8655af52d
fix: Change ColumnsByName::new to enable taking ownership if caller wants to give it 2023-05-09 14:55:00 +02:00
Carol (Nichols || Goulding) cc41216382
fix: Undo the addition of a TableInfo type; store partition_template on TableSchema 2023-05-09 14:54:59 +02:00
Carol (Nichols || Goulding) 596673d515
refactor: Create a new ColumnsByName type to abstract over TableSchema columns
And allow usage of just the columns when that's all that's needed
without leaking the BTreeMap implementation detail everywhere
2023-05-09 14:54:58 +02:00
Carol (Nichols || Goulding) 1f1dcc947d
fix: Don't change how the compactor gets the table schema 2023-05-09 14:54:58 +02:00
Carol (Nichols || Goulding) 58d9c40ffd
feat: If namespace or table partition templates are specified, use those 2023-05-09 14:54:57 +02:00
Carol (Nichols || Goulding) c1a8408572
fix: Consolidate the default partition template; remove --partition-key-pattern CLI option 2023-05-09 14:54:57 +02:00
Carol (Nichols || Goulding) 2aa8713d1d
fix: Remove partition TemplatePart::Table; partitioning is already per-table 2023-05-09 14:54:57 +02:00
Dom Dwyer 6aa657a649
refactor(router): separate RPC client error types
This commit splits out the RPC-request-centric errors in RpcWriteError
into their own RpcWriteClientError type.

This improves the separation of concerns - an RpcWriteError comes from
the RPC write handler, whereas an RpcWriteClientError comes from an
underlying client. It's definitely less confusing!
2023-05-09 14:33:56 +02:00
Carol (Nichols || Goulding) 3d5df5574a
fix: Remove vestiges of shards 2023-05-08 20:24:36 -04:00
Carol (Nichols || Goulding) b0959667d5
fix: Move topic and query pool within iox catalog (#7734)
Still insert them into the database and associate them with namespaces,
but don't ever query them back out.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-04 13:45:56 +00:00
Dom Dwyer c76129a7e8
refactor: fix lint failures 2023-04-27 13:19:06 +02:00
Dom be3256d1a7
Merge branch 'main' into dom/proptest-cache 2023-04-26 14:55:58 +01:00
Martin Hilton 4b24c988ad
feat(service_grpc_flight): JDBC compatible Handshake (#7660)
* refactor(authz): move extract_header_token into authz

Move the extract_header_token method into the authz package so that
it can be shared by the query path. The method is renamed to reflect
the fact that it can now also extract a token from gRPC metadata.

The extract_token function is now a little more generic to allow
it to be used with HTTP header values and gRPC metadata values.

* feat(service_grpc_flight): JDBC compatible Handshake

While testing some JDBC based clients we found that some, Tableau
in this case,  cannot be configured with authoriztion tokens. In
these cases we need to be able to support username/password. The
approach taken is to ignore the username and make the token the
password. This is the same approach being taken throughout the
product.

To facilitate this the Flight RPC Handshake command has been extended
to look for Basic authorization credentials and respond with the
appropriate Bearer authorization header.

While adding end-to-end tests the subprocess commands were causing
a deadlock. These have been changed to using the tonic::process
module.

There are also some small changes to the JDBC test application where
the hardcoded values were clashing with the authorization parameters.

* fix: lint

* chore: apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: review suggestion

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-26 13:52:49 +00:00
Dom Dwyer 9cc58eb4e4
test: property test namespace schema merging
This commit adds a randomised property test, that compares the results
of the new namespace cache schema merging (#7555) with a known-good
stdlib HashSet union (the cache implementation is effectively a more
specialised set union operation).

This property test also validates the "last writer wins" semantics for
other, non-schema data within the namespace.

Additionally the ChangeSet values returned over a pair of updates are
asserted to reflect the actual values added to the cache (but not each
call individually) to ensure accurate metrics are reported.
2023-04-26 15:45:54 +02:00
Fraser Savage ffe4747cf2
fix(router): Fix new_columns calculation for namespace cache table merges
This commit adds logic to ensure that all pre-existing columns are
counted when no merge takes place and a test covering that.
2023-04-26 14:00:10 +01:00
Fraser Savage d9111e2a1a
Merge branch 'main' into savage/additive-namespace-schema-caching 2023-04-26 12:30:52 +01:00
Fraser Savage 2921a79ac3
refactor(router): Use sum to count new_columns instead of fold
Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-26 12:21:45 +01:00
Fraser Savage 41ee990d68
fix(router): Re-introduce cache put metric insert/update attribute 2023-04-26 12:04:43 +01:00
Fraser Savage c837a6e8dc
docs(router): Explicitly document use of get() and insert() for schema merge
Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-26 11:55:55 +01:00
dependabot[bot] 09d6b4ae50
chore(deps): Bump tokio-stream from 0.1.12 to 0.1.13 (#7666)
Bumps [tokio-stream](https://github.com/tokio-rs/tokio) from 0.1.12 to 0.1.13.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-stream-0.1.12...tokio-stream-0.1.13)

---
updated-dependencies:
- dependency-name: tokio-stream
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-26 09:31:04 +00:00
Fraser Savage 5a9c68e428
perf(router): Perform NamespaceCache schema merge out of lock
This re-introduces the potential racy conflicting schema updates, to
optimise for the expected read-heavy workload. This limits the point at
which write requests may race with schema updates to overlapping calls
to put, rather than the write call-path as a whole.
2023-04-24 16:13:01 +01:00
Fraser Savage 065018be11
Merge branch 'main' into savage/additive-namespace-schema-caching 2023-04-24 15:26:42 +01:00
wiedld daabe9663c chore(idpe-17434): make restrictive whitelist of chars accepted, for any NamespaceName 2023-04-21 16:36:00 -07:00
wiedld b870242ec7 chore(idpe-17434): remove utf8-percent encoding on v2 write path, such that it matches v1 writes and onCreate 2023-04-21 16:31:55 -07:00
wiedld 781d6c040d
fix: process query param for token, even when header is not present. (#7619)
* Move the or_else conditional out of the Some() chain
2023-04-21 17:44:59 +00:00
wiedld 1d2003d385
feat(idpe-17265): cst write authorization (#7527)
* feat(idpe-17265): authorization should occur as part of the single_tenant specific mod
* authz service is accessed only through the single_tenant mod handler
* authz service is wrapped in auth mod
* move auth integration test into auth mod
* push down the authorize() call into the query params parser call, in order to access query params in the extract_token
* provide configuration error when authz or single_tenant mode are not co-presented
* update authz e2e fixtures

* feat(idpe-17265): extract tokens based upon preferred ordering in spec, and write tests to verify behavior.

* chore(idpe-17265): update naming conventions for a unifying parser

* test: make MockAuthorizer have default, and add a test_delegate_to_authz for CST

* chore: record authz duration metric, and include in delegation test.

* chore: use authz terminology instead of auth_service

* chore: more explicit naming

* Revert "chore: record authz duration metric, and include in delegation test."

This reverts commit 05c36888ca7247b6953343d759a5185098fae679.

* refactor: extract_header_token versus the else condition

* refactor: make single_tenant mod and move auth within

* chore: make unreachable explicitly panic in the build

* test: make token values be const, to be consumed when MockAuthorizer is used

* test: use locking for calls_counter in test

* fix: add base64 encoding as expected for Basic header

* fix: merge conflict resolution. The AuthorizationHeaderExtension is now under the authz::http mod, which is a required feature for router package.

* chore: run rustfmt nightly with preferred import handling, on files with modified imports

* chore: code cleanup, to have minimal code needed
2023-04-19 15:28:10 +00:00
Dom Dwyer 03c5ea5488
feat(router): configurable RPC write message size
Provide a configuration item for the router (in RPC mode) that controls
the maximum outgoing RPC message size when communicating with an
Ingester.

Raises the maximum from the default 4MiB to 100MiB. This does not
increase exposure to memory-based DOS, as writes are size-limited by the
HTTP layer to 10MiB, preventing a user from submitting a write this
large (or larger!) across the RPC boundary.
2023-04-19 14:57:53 +02:00