Currently we see some prod panics:
```
'assertion failed: len <= std::u32::MAX as usize', tonic/src/codec/encode.rs:127:5
```
This is due to an upstream bug in tonic:
https://github.com/hyperium/tonic/issues/1141
However, the fix will only turn this into an error instead of a panic.
We should instead NOT return such overlarge results in the first place,
especially because InfluxRPC supports streaming.
While we currently don't perform a streaming conversion (i.e. streaming
the data out of the query stack into the gRPC layer), the 4GB size limit
can easily be triggered in prod with enough RAM. So let's re-chunk our
in-memory responses so that they stream nicely to the client.
We may later implement proper streaming conversion, see #4445 and #503.
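As a sketch of the re-chunking idea only (the `Frame` type, its field, and the size limit below are illustrative, not the actual InfluxRPC types or limits), the in-memory response can be split so that each streamed gRPC message stays well under the hard limit:
```rust
/// Hypothetical frame with a known encoded size; stands in for the
/// in-memory response data produced by the query stack.
struct Frame {
    encoded_len: usize,
}

/// Illustrative per-message budget, far below the ~4GB protobuf/tonic
/// hard limit that triggered the panic above.
const MAX_RESPONSE_BYTES: usize = 4 * 1024 * 1024;

/// Re-chunk an in-memory response so each chunk fits into one streamed
/// gRPC message.
fn rechunk(frames: Vec<Frame>) -> Vec<Vec<Frame>> {
    let mut out: Vec<Vec<Frame>> = vec![];
    let mut current: Vec<Frame> = vec![];
    let mut current_size = 0;

    for frame in frames {
        // Flush the current chunk if adding this frame would exceed the
        // budget (but never emit an empty chunk for an oversized frame).
        if !current.is_empty() && current_size + frame.encoded_len > MAX_RESPONSE_BYTES {
            out.push(std::mem::take(&mut current));
            current_size = 0;
        }
        current_size += frame.encoded_len;
        current.push(frame);
    }
    if !current.is_empty() {
        out.push(current);
    }
    out
}
```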
I tracked the size difference down to a change in
`mem::size_of::<mutable_batch::column::ColumnData>`. I believe this enum
is now able to take advantage of the niche-filling optimization:
<https://github.com/rust-lang/rust/pull/94075/>
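For illustration only (these enums are made up, not the real `ColumnData`): when a variant's payload contains unused bit patterns, the compiler can fold the enum discriminant into that niche, and the effect is directly visible via `mem::size_of`:
```rust
use std::mem::size_of;
use std::num::NonZeroU64;

// Illustrative enums, not mutable_batch::column::ColumnData.
#[allow(dead_code)]
enum WithNiche {
    A(NonZeroU64), // zero is an unused bit pattern the compiler can reuse
    B,
}

#[allow(dead_code)]
enum WithoutNiche {
    A(u64), // every bit pattern is a valid value, so a separate tag is needed
    B,
}

fn main() {
    // The discriminant of `WithNiche` is stored in the niche of
    // `NonZeroU64`, so no extra tag word (plus padding) is required.
    assert_eq!(size_of::<WithNiche>(), 8);
    assert_eq!(size_of::<WithoutNiche>(), 16);
}
```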
* feat: flag partition for delete
* fix: compare the right date and time
* chore: Run cargo hakari tasks
* chore: cleanup
* fix: typos
* chore: rust style tidy ups in catalog
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
* refactor: NS+table ID (instead of name) in querier<>ingester
* feat(ingester): use IDs for query API
Changes the ingester to utilise the ID fields (instead of names) sent
in the query wire message wrapped within the Flight API.
BREAKING: this changes the "query-ingester" CLI command arguments, which
now expect the namespace & table IDs rather than their names.
* refactor(ingester): add more query logging context
Updates the log messages during query execution to include more context
fields.
* style: remove unused import
Co-authored-by: Marco Neumann <marco@crepererum.net>
* test: Ensure router's HTTP error messages are stable
If you change the text of an error, the tests will fail.
If you add a new error variant to the `Error` enum but don't add it to
the test, test compilation will fail with a "non-exhaustive patterns"
message.
If you remove an error variant, test compilation will fail with a "no
variant named `RemovedError`" message.
You can get the list of error variants and their current text via
`cargo test -p router -- print_out_error_text --nocapture`.
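The mechanism, sketched with a made-up two-variant `Error` enum and `thiserror` for the `Display` impls (the router's real error type is much larger): the exhaustive `match` is what turns added or removed variants into compile errors, and the `assert_eq!`s are what catch text changes:
```rust
#[derive(Debug, thiserror::Error)]
enum Error {
    #[error("namespace {0} not found")]
    NamespaceNotFound(String),
    #[error("request body too large")]
    RequestTooLarge,
}

#[test]
fn print_out_error_text() {
    // Exhaustive match: adding a variant without covering it here is a
    // "non-exhaustive patterns" compile error; removing one makes the
    // corresponding arm a "no variant named ..." compile error.
    let check = |e: &Error| match e {
        Error::NamespaceNotFound(_) => {
            assert_eq!(e.to_string(), "namespace ns not found")
        }
        Error::RequestTooLarge => {
            assert_eq!(e.to_string(), "request body too large")
        }
    };

    for e in [
        Error::NamespaceNotFound("ns".to_string()),
        Error::RequestTooLarge,
    ] {
        // `--nocapture` makes this visible when listing the current text.
        println!("{e}");
        check(&e);
    }
}
```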
A step towards accomplishing #5863
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
* fix: Remove optional commas and document macro arguments
* docs: Clarify the purpose of the tests the check_errors macro generates
* fix: Add tests for inner mutable batch LP error variants
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* test: use `EXPLAIN ANALYZE` for SQL metric tests
This needs a bit more infrastructure (due to output normalization), but
it seems worth it so we can easily hook up more metrics in the future.
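A rough sketch of the normalization step, assuming the `regex` crate (the patterns, operator names, and plan text are illustrative, not the real test infrastructure): run-dependent values in the `EXPLAIN ANALYZE` output are replaced with stable placeholders before comparison:
```rust
use regex::Regex;

/// Replace run-dependent values (timings, UUIDs in file paths) in an
/// `EXPLAIN ANALYZE` plan with stable placeholders.
fn normalize_explain(plan: &str) -> String {
    let elapsed = Regex::new(r"elapsed_compute=[^\s,\]]+").unwrap();
    let uuid = Regex::new(
        r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    )
    .unwrap();

    let plan = elapsed.replace_all(plan, "elapsed_compute=<t>");
    uuid.replace_all(&plan, "<uuid>").into_owned()
}

fn main() {
    let raw = "ParquetExec: metrics=[elapsed_compute=1.2ms], \
               file=1/2/d3b07384-d9a0-4c2b-8f6e-1a2b3c4d5e6f.parquet";
    assert_eq!(
        normalize_explain(raw),
        "ParquetExec: metrics=[elapsed_compute=<t>], file=1/2/<uuid>.parquet"
    );
}
```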
* docs: explain regexes
* feat: Added single and inline comment combinators
* chore: Add tests for ws0 function
* feat: Add ws1 combinator
* feat: Use ws0 and ws1 combinators to properly handle comments
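A minimal sketch of what such combinators can look like with `nom` (the comment syntax, names, and signatures here are assumptions for illustration, not the actual parser code):
```rust
use nom::{
    branch::alt,
    bytes::complete::{tag, take_till},
    character::complete::multispace1,
    combinator::{recognize, value},
    multi::{many0_count, many1_count},
    sequence::pair,
    IResult,
};

/// A single `-- line comment` (illustrative comment syntax).
fn comment(i: &str) -> IResult<&str, ()> {
    value((), pair(tag("--"), take_till(|c| c == '\n' || c == '\r')))(i)
}

/// Zero or more whitespace characters and/or comments.
fn ws0(i: &str) -> IResult<&str, &str> {
    recognize(many0_count(alt((value((), multispace1), comment))))(i)
}

/// One or more whitespace characters and/or comments.
fn ws1(i: &str) -> IResult<&str, &str> {
    recognize(many1_count(alt((value((), multispace1), comment))))(i)
}

fn main() {
    let (rest, _) = ws1("  -- a comment\n  SELECT 1").unwrap();
    assert_eq!(rest, "SELECT 1");
}
```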
* feat: deletion flagging in GC based on retention policy
* chore: typo in comment
* fix: only soft delete parquet files that aren't yet soft deleted
* fix: guard against flakiness in catalog test
* chore: some better tests for parquet file delete flagging
Co-authored-by: Nga Tran <nga-tran@live.com>
Now that DML operations contain the table ID, the ingester has all the
necessary data to initialise the TableData buffer node without having to
query the catalog.
This also removes the catalog from the buffer_operation() call path,
simplifying testing.
Now that DML operations contain the namespace ID, the ingester has all
the necessary data to initialise the NamespaceData buffer node without
having to query the catalog.
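A condensed sketch of the shape of this change (all type and method names below are hypothetical stand-ins for the ingester's real types): the buffer node is keyed and created from the ID carried by the DML operation, with no catalog lookup on the write path:
```rust
use std::collections::HashMap;

// Hypothetical stand-in for the ingester's table ID newtype.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct TableId(i64);

// Hypothetical per-table buffer node.
struct TableData {
    table_id: TableId,
    // ... buffered writes would live here ...
}

#[derive(Default)]
struct NamespaceData {
    tables: HashMap<TableId, TableData>,
}

impl NamespaceData {
    /// Apply a DML write. Because the operation already carries the table
    /// ID, the buffer node can be created on demand without resolving the
    /// table name through the catalog.
    fn buffer_operation(&mut self, table_id: TableId /*, batch: MutableBatch */) {
        self.tables
            .entry(table_id)
            .or_insert_with(|| TableData { table_id });
        // ... append the write to the table's buffer ...
    }
}
```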
Expose the Table and Namespace IDs encoded within the serialised DML
write (added in #6036).
This makes the IDs available for use in the consumers, ending the
transition period. This commit DOES NOT remove the strings sent over the
wire.
This commit pushes the existing table-level mutex down to the partition.
This allows the ingester to gather data from multiple partitions within
a single table in parallel, and reduces contention between ingest/query
workloads.
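Illustratively (types simplified, not the actual ingester code), the locking now looks roughly like this: the table-level lock only guards the partition map, while each partition buffers data behind its own lock:
```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct PartitionBuffer {
    rows: usize,
}

#[derive(Default)]
struct TableData {
    // Previously one mutex guarded the whole table; now the map lock is
    // held only briefly, and each partition has an independent lock.
    partitions: Mutex<HashMap<String, Arc<Mutex<PartitionBuffer>>>>,
}

impl TableData {
    fn partition(&self, key: &str) -> Arc<Mutex<PartitionBuffer>> {
        let mut map = self.partitions.lock().unwrap();
        Arc::clone(map.entry(key.to_string()).or_insert_with(Default::default))
    }

    fn write(&self, key: &str, rows: usize) {
        // Only this partition is locked while buffering; other partitions
        // of the same table remain available to concurrent writes/queries.
        self.partition(key).lock().unwrap().rows += rows;
    }
}
```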
This moves the logic that skips operations which do not need to be
applied to a partition during shard replay from the table level down to
the partition level.
Changes the bounds on the ArcMap to accept an owned key, avoiding an
extra allocation.
Cleans up the bounds on other functions to ensure the borrowed key
implements Eq and is the reference type of K.
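These bounds follow the standard library's `Borrow` pattern; a condensed illustration (not the actual ArcMap API):
```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::Arc;

struct ArcMap<K, V> {
    map: HashMap<K, Arc<V>>,
}

impl<K: Hash + Eq, V> ArcMap<K, V> {
    /// Insertion takes an owned key, so a caller that already owns a `K`
    /// (e.g. a `String`) doesn't pay for an extra clone/allocation.
    fn insert(&mut self, key: K, value: V) {
        self.map.insert(key, Arc::new(value));
    }

    /// Lookups take any borrowed form `Q` of the key (`K: Borrow<Q>`,
    /// `Q: Hash + Eq`), e.g. a `&str` for a `String` key.
    fn get<Q>(&self, key: &Q) -> Option<Arc<V>>
    where
        K: Borrow<Q>,
        Q: Hash + Eq + ?Sized,
    {
        self.map.get(key).map(Arc::clone)
    }
}
```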
This commit changes the ArcMap's HashBuilder to be the same instance as
the underlying HashMap's hasher.
This prevents divergent hashing across threads, which MAY otherwise
initialise their hashers with different seeds.
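The hazard, illustrated with `std`'s `RandomState` (the real ArcMap may use a different map and hasher): two independently seeded hashers disagree on the same key, so any hash computed outside the map must come from the same `BuildHasher` instance the map was constructed with:
```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::BuildHasher;

fn main() {
    // Two RandomState instances are seeded independently, so the same key
    // hashes to different values under each of them.
    let a = RandomState::new();
    let b = RandomState::new();
    assert_ne!(a.hash_one("bananas"), b.hash_one("bananas"));

    // Sharing a single instance keeps externally computed hashes
    // consistent with the map's own hashing.
    let shared = RandomState::new();
    let map: HashMap<&str, i32, RandomState> = HashMap::with_hasher(shared.clone());
    let precomputed = shared.hash_one("bananas");
    assert_eq!(precomputed, map.hasher().hash_one("bananas"));
}
```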