* fix: account for memory allocations in InfluxRPC group outputs
This should prevent the querier from OOMing.
See https://github.com/influxdata/idpe/issues/16614 .
* docs: improve
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* refactor: pull out constant
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* fix: support InfluxRPC OR-chains w/ arbitrary child nodes
Also convert another assertion regarding child nodes of Eq-nodes into a
proper error.
See https://github.com/influxdata/idpe/issues/16582 .
* test: more tests
`RecordBatch` offers zero-copy slicing, so there is no need to store the
row range manually. This makes #6216 simpler.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: avoid channels to create a one-element stream
* refactor: move `StreamWithPermit` into its own module
* refactor: make `QueryCompletedToken` handling stream-based
For #6216.
* docs: improve
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* fix: ignore fields when considering tag predicates
* chore: update test to not use time column in predicate
* chore: update with review feedback
* chore: update tests to avoid fields refs in RPC preds
This is more like what would be coming off the wire from
Influx RPC.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: marshal InfluxDbError into status details
* chore: address feedback and CI issues
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Currently we see some prod panics:
```
'assertion failed: len <= std::u32::MAX as usize', tonic/src/codec/encode.rs:127:5
```
This is due to an upstream bug in tonic:
https://github.com/hyperium/tonic/issues/1141
However, the fix will only turn this into an error instead of a panic.
We should instead NOT return such overlarge results, esp. because
InfluxRPC supports streaming.
While we currently don't perform streaming conversion (like streaming
the data out of the query stack into the gRPC layer), the 4GB size limit
can easily be triggered (in prod) w/ enough RAM. So let's re-chunk our
in-memory responses so that they stream nicely to the client.
We may later implement proper streaming conversion, see #4445 and #503.
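The re-chunking idea can be sketched roughly as follows. This is a minimal, std-only illustration and not the actual IOx code: `Frame`, `rechunk`, and `max_bytes` are hypothetical names, and a real implementation would work on the encoded gRPC response frames.

```rust
/// Hypothetical stand-in for one encoded response frame; only its size matters here.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub payload: Vec<u8>,
}

/// Groups `frames` into consecutive chunks whose summed payload size stays
/// at or below `max_bytes`. A single frame larger than `max_bytes` is
/// emitted alone in its own chunk (this sketch does not split frames).
pub fn rechunk(frames: Vec<Frame>, max_bytes: usize) -> Vec<Vec<Frame>> {
    let mut chunks = Vec::new();
    let mut current: Vec<Frame> = Vec::new();
    let mut current_bytes = 0usize;

    for frame in frames {
        let size = frame.payload.len();
        // Start a new chunk if adding this frame would exceed the budget.
        if !current.is_empty() && current_bytes + size > max_bytes {
            chunks.push(std::mem::take(&mut current));
            current_bytes = 0;
        }
        current_bytes += size;
        current.push(frame);
    }
    // Flush the trailing, partially filled chunk.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Each resulting chunk would then be sent as its own streamed response message, so no single message gets close to the 4GB limit.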
It should always be clear from the context which table a chunk
belongs to.
I think having a table name bound to a chunk goes back to a time when
chunks had multiple tables.
Helps with #6049.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: Move response creation into a single location
* fix: add storage-type=iox header to influxrpc responses
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
- treat OOM protection as "resource exhausted"
- use `DataFusionError` in more places instead of opaque `Box<dyn Error>`
- improve conversion from/into `DataFusionError` to preserve more
semantics
Overall, this improves our error handling. DF can now return errors like
"resource exhausted" and gRPC should now automatically generate a
sensible status code for it.
Fixes #5799.
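A rough sketch of the kind of mapping described above, using std only. `QueryError` is a hypothetical, simplified stand-in for `DataFusionError`, and `Code` is a hand-written subset of the gRPC status codes (numeric values as defined by gRPC); the real conversion lives in the service layer and covers more cases.

```rust
/// Hypothetical, simplified error type; the real `DataFusionError` has more variants.
#[derive(Debug)]
pub enum QueryError {
    /// E.g. the memory pool refused an allocation (OOM protection).
    ResourcesExhausted(String),
    /// The query could not be planned, typically a client mistake.
    Plan(String),
    /// Anything unexpected on our side.
    Internal(String),
}

/// Subset of gRPC status codes with their standard numeric values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code {
    InvalidArgument = 3,
    ResourceExhausted = 8,
    Internal = 13,
}

impl From<&QueryError> for Code {
    fn from(e: &QueryError) -> Self {
        match e {
            // Keeping the variant (instead of an opaque boxed error) is what
            // lets us pick a meaningful status code here.
            QueryError::ResourcesExhausted(_) => Code::ResourceExhausted,
            QueryError::Plan(_) => Code::InvalidArgument,
            QueryError::Internal(_) => Code::Internal,
        }
    }
}
```

With an opaque `Box<dyn Error>` this match would not be possible without downcasting, which is the motivation for carrying the typed error further up.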
In our data model, a chunk always belongs to a partition[^1], so let's
not make this attribute optional. The optional value only leads to
-- mostly surprising -- conditional behavior, ranging from "do not equalize
the partition sort key" (querier) to "always consider the chunk overlapping"
(iox_query when dealing with ingester chunks).
[^1]: This is even true when the chunk belongs to a parquet file that is not
yet added to the catalog, contrary to what a comment in the ingester
stated. The catalog and data model used by the querier are two totally
different things.
* fix: apply selection in `TestChunk::read_filter`
TBH I have no idea how this worked so well before, but the chunks are
expected to apply the given selection. This is because
`IOxReadFilterNode::execute` will wrap the `QueryChunk::read_filter`
output into a `SchemaAdapterStream` and this one expects that there are
no input columns that are absent in the output schema (i.e. it will only
add null columns, it won't remove any). Funnily the `SchemaAdapterStream`
error will blame DataFusion for the mess.
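To illustrate the contract described above, here is a small std-only model of the adapter behavior. It is a sketch under assumptions, not the real `SchemaAdapterStream`: `Column` and `adapt_to_schema` are hypothetical names, and columns are modeled as plain vectors instead of Arrow arrays.

```rust
use std::collections::BTreeMap;

/// One column of optional values; `None` models a null.
pub type Column = Vec<Option<i64>>;

/// Adapts `input` columns to `output_schema`:
/// - columns missing from the input are filled with nulls,
/// - input columns that are NOT part of the output schema are an error
///   (the adapter only adds columns, it never removes any).
pub fn adapt_to_schema(
    input: &BTreeMap<String, Column>,
    output_schema: &[&str],
    num_rows: usize,
) -> Result<Vec<(String, Column)>, String> {
    // Reject input columns that the output schema does not know about.
    for name in input.keys() {
        if !output_schema.contains(&name.as_str()) {
            return Err(format!(
                "input column '{}' is not in the output schema",
                name
            ));
        }
    }
    // Emit columns in output-schema order, filling gaps with nulls.
    Ok(output_schema
        .iter()
        .map(|name| {
            let col = input
                .get(*name)
                .cloned()
                .unwrap_or_else(|| vec![None; num_rows]);
            (name.to_string(), col)
        })
        .collect())
}
```

In this model, a chunk that ignores the selection and returns extra columns hits the error branch, which is why `TestChunk::read_filter` has to apply the selection itself.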
* test: make `test_storage_rpc_tag_values_grouped_by_measurement_and_tag_key` a bit harder
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* ci: use same feature set in `build_dev` and `build_release`
* ci: also enable unstable tokio for `build_dev`
* chore: update tokio to 1.21 (to fix console-subscriber 0.1.8)
* fix: "must use"
* test: add tests for regex_match_on_field
* feat: more general `_field` predicate handling
* fix: remove old comment
* fix: update tests
* fix: improve test a little more
* fix: fmt
* fix: Update predicate/src/rpc_predicate/field_rewrite.rs
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
* fix: Handle predicates that cannot be evaluated
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>