This commit changes the format of partition keys when generated with
non-default partition key templates ONLY. A prior fixture test is
unchanged by this commit, ensuring the default partition keys remain
the same.
When a custom partition key template is provided, it may specify one or
more parts, with the TagValue template causing values extracted from tag
columns to appear in the derived partition key.
This commit changes the generated partition key in the following ways:
* The delimiter of multi-part partition keys; the character used to
delimit partition key parts is changed from "/" to "|" (the pipe
character) as it is less likely to occur in user-provided input,
reducing the encoding overhead.
* The format of the extracted TagValue values (see below).
Building on the work of custom partition key overrides, where an
immutable partition template is resolved and set at table creation time,
the changes in this PR enable the derived partition key to be
unambiguously reversed into the set of tag (column_name, column_value)
tuples it was generated from for use in query pruning logic. This is
implemented by the build_column_values() method in this commit, which
requires both the template and the derived partition key.
Prior to this commit, a partition key value extracted from a tag column
was in the form "tagname_x" where "x" is the value and "tagname" is the
name of the tag column it was extracted from. After this commit, the
partition key value is in the form "x"; the column name is removed from
the derived string to reduce the catalog storage overhead (a key driver
of COGS). In the case of a NULL tag value, the sentinel value "!" is
inserted instead of the prior "tagname_" marker. In the case of an empty
string tag value (""), the sentinel "^" value is inserted instead of the
"tagname_-" marker, ensuring the distinction between an empty value and
a not-present tag is preserved.
Additionally, tag values are percent-encoded to escape reserved
characters (the part delimiter, the empty sentinel character, and %
itself), eliminating deserialisation ambiguity.
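The rules above can be sketched roughly as follows. This is an
illustration only, not the code in this commit: TemplatePart,
encode_value() and build_partition_key() are hypothetical names, and
treating "!" as a reserved character alongside "|", "^" and "%" is an
assumption.

```rust
/// One already-resolved part of a hypothetical partition template.
enum TemplatePart<'a> {
    /// A pre-formatted time part, e.g. "1970-01-01".
    Time(&'a str),
    /// A tag value: `None` for a NULL tag, `Some("")` for an empty string.
    TagValue(Option<&'a str>),
}

/// Percent-encode characters that carry meaning inside a partition key.
/// Assumption: "!" is escaped as well as "|", "^" and "%".
fn encode_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for c in value.chars() {
        match c {
            '|' | '!' | '^' | '%' => out.push_str(&format!("%{:02X}", c as u32)),
            _ => out.push(c),
        }
    }
    out
}

/// Join the parts with "|", substituting the "!" and "^" sentinels.
fn build_partition_key(parts: &[TemplatePart<'_>]) -> String {
    parts
        .iter()
        .map(|part| match part {
            TemplatePart::Time(t) => (*t).to_string(),
            TemplatePart::TagValue(None) => "!".to_string(),
            TemplatePart::TagValue(Some("")) => "^".to_string(),
            TemplatePart::TagValue(Some(v)) => encode_value(v),
        })
        .collect::<Vec<_>>()
        .join("|")
}
```

Because a literal "|" inside a tag value is escaped, splitting the key
on "|" always yields exactly one field per template part.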
Examples of how this has changed derived partition keys, for a template
of [Time(YYYY-MM-DD), TagValue(region), TagValue(bananas)]:
Write: time=1970-01-01,region=west,other=ignored
Old: "1970-01-01-region_west-bananas"
New: "1970-01-01|west|!"
Write: time=1970-01-01,other=ignored
Old: "1970-01-01-region-bananas"
New: "1970-01-01|!|!"
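For illustration, reversing a key in the new format back into tag
(column_name, column_value) tuples might look like the sketch below. It
is written in the spirit of build_column_values() but is not that
method: Part, decode_value() and column_values() are hypothetical names,
and omitting NULL tags from the output is an assumption.

```rust
/// Hypothetical template part; only tag parts carry a column name.
enum Part<'a> {
    Time,
    TagValue(&'a str),
}

/// Undo percent-encoding; returns None for a malformed escape sequence.
fn decode_value(encoded: &str) -> Option<String> {
    let bytes = encoded.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // Two hex digits must follow the "%".
            let hex = encoded.get(i + 1..i + 3)?;
            out.push(u8::from_str_radix(hex, 16).ok()?);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

/// Pair each "|"-delimited field with its template part.
/// Assumption: NULL ("!") tags produce no tuple at all.
fn column_values<'a>(template: &[Part<'a>], key: &str) -> Option<Vec<(&'a str, String)>> {
    let mut out = Vec::new();
    let mut fields = key.split('|');
    for part in template {
        let field = fields.next()?;
        match (part, field) {
            (Part::Time, _) => {}
            (Part::TagValue(_), "!") => {}
            (Part::TagValue(name), "^") => out.push((*name, String::new())),
            (Part::TagValue(name), v) => out.push((*name, decode_value(v)?)),
        }
    }
    // A key with more fields than the template has parts is rejected.
    if fields.next().is_some() {
        return None;
    }
    Some(out)
}
```

With the template above, "1970-01-01|west|!" yields only
("region", "west"), and "1970-01-01|!|!" yields no tuples.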
This commit fixes loads of crates (47!) that had unused dependencies, or
mis-configured dependencies (test deps as normal deps).
I added the "unused_crate_dependencies" lint to all crates to help
prevent this mess from growing again!
https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html
This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /tests or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
So that the different kinds aren't mixed up. Also extracts the logic
having to do with which template takes precedence onto the
PartitionTemplate type itself.
There was a mix of different ways of returning errors - this commit
unifies them, adds some documentation to the returned errors, and
removes the capitalisation.
Errors should be lower-case so they compose nicely like this:
"something failed: super important error: inner error"
rather than:
"something failed: Super important error: Inner error"
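A minimal sketch of how such a chain gets rendered, using only the
standard library. The error types and the render_chain() helper are
made up for illustration and are not taken from this commit.

```rust
use std::error::Error;
use std::fmt;

/// Innermost (hypothetical) error.
#[derive(Debug)]
struct InnerError;

impl fmt::Display for InnerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Lower-case, no trailing punctuation.
        write!(f, "inner error")
    }
}

impl Error for InnerError {}

/// Wrapping (hypothetical) error that exposes its cause via source().
#[derive(Debug)]
struct ImportantError {
    source: InnerError,
}

impl fmt::Display for ImportantError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "super important error")
    }
}

impl Error for ImportantError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Render "context: error: cause: ..." by walking the source() chain.
fn render_chain(context: &str, err: &dyn Error) -> String {
    let mut msg = format!("{}: {}", context, err);
    let mut cause = err.source();
    while let Some(e) = cause {
        msg.push_str(&format!(": {}", e));
        cause = e.source();
    }
    msg
}
```

Calling render_chain("something failed", &ImportantError { source:
InnerError }) produces the first form shown above.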
Adds proptests to assert correctness of set operations against a
SequenceNumberSet, by ensuring they match those of a known-good
implementation from the stdlib (HashSet).
Ensure (de-)serialisation correctness by asserting a round trip results
in equal sets.
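The model-checking idea can be sketched without any external crates as
below. This is not the proptest code from this commit: a hand-rolled
LCG stands in for proptest's input generation, and TinySet is a
hypothetical stand-in for SequenceNumberSet. Only insert/remove are
exercised here, not (de-)serialisation.

```rust
use std::collections::HashSet;

/// Hypothetical bitmap-backed set standing in for the type under test.
#[derive(Default)]
struct TinySet {
    words: Vec<u64>,
}

impl TinySet {
    fn insert(&mut self, v: u32) {
        let (w, b) = ((v / 64) as usize, v % 64);
        if self.words.len() <= w {
            self.words.resize(w + 1, 0);
        }
        self.words[w] |= 1u64 << b;
    }

    fn remove(&mut self, v: u32) {
        let (w, b) = ((v / 64) as usize, v % 64);
        if let Some(word) = self.words.get_mut(w) {
            *word &= !(1u64 << b);
        }
    }

    fn contains(&self, v: u32) -> bool {
        let (w, b) = ((v / 64) as usize, v % 64);
        self.words.get(w).map_or(false, |word| (*word & (1u64 << b)) != 0)
    }

    fn len(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }
}

/// Apply pseudo-random inserts/removes to both sets and compare them
/// after every step; returns false on the first divergence.
fn check_against_model(seed: u64, steps: usize) -> bool {
    let mut state = seed;
    let mut got = TinySet::default();
    let mut want: HashSet<u32> = HashSet::new();
    for _ in 0..steps {
        // Simple 64-bit LCG; a stand-in for a real generator.
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let value = ((state >> 33) % 512) as u32;
        if ((state >> 32) & 1) == 0 {
            got.insert(value);
            want.insert(value);
        } else {
            got.remove(value);
            want.remove(&value);
        }
        if got.contains(value) != want.contains(&value) || got.len() != want.len() {
            return false;
        }
    }
    true
}
```

Unlike proptest, this sketch does no shrinking, so a failing seed would
have to be minimised by hand.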
Adds a space-efficient encoding of a set of SequenceNumber, backed by
roaring bitmaps.
Memory utilisation will change as the number of elements changes,
according to the underlying roaring bitmap design, but is intended to be
"relatively" cheap.
The std `DefaultHasher` is NOT guaranteed to stay the same, so let's
directly use the `SipHasher13` which at the moment (2021-11-15) is used
by the standard lib.
Fixes #3063.
* feat: include more information in system.operations table
* chore: review feedback
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This is a strawman for what routing rules might look like in DatabaseRules. Once there's a chance for discussion, I'd move next to looking at how the Server would split up an incoming write into separate FB blobs to be sent to remote IOx servers. That might change what the API/configuration looks like as that's how it would be used (at least for writes).
After that it would make sense to move to adding the proto definitions with conversions and gRPC and CLI CRUD to configure routing rules.