- treat OOM protection as "resource exhausted"
- use `DataFusionError` in more places instead of opaque `Box<dyn Error>`
- improve conversion from/into `DataFusionError` to preserve more
semantics
Overall, this improves our error handling. DF can now return errors like
"resource exhausted" and gRPC should now automatically generate a
sensible status code for it.
Fixes#5799.
In our data model, a chunk always belongs to a partition[^1], so let's
not make this attribute optional. The optional value only leads to
-- mostly surprising -- conditional behavior, ranging from "do not equalize
the partition sort key" (querier) to "always consider the chunk overlapping"
(iox_query when dealing with ingester chunks).
[^1]: This is even true when the chunk belongs to a parquet file that is not
yet added to the catalog, contrary to what a comment in the ingester
stated. The catalog and data model used by the querier are two totally
different things.
* fix: apply selection in `TestChunk::read_filter`
TBH I have no idea how this worked so well before, but the chunks are
expected to apply the given selection. This is because
`IOxReadFilterNode::execute` will wrap the `QueryChunk::read_filter`
output into a `SchemaAdapterStream` and this one expects that there are
no input columns that are absent in the output schema (i.e. it will only
add null columns, it won't remove any). Funnily the `SchemaAdapterStream`
error will blame DataFusion for the mess.
* test: make `test_storage_rpc_tag_values_grouped_by_measurement_and_tag_key` a bit harder
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* ci: use same feature set in `build_dev` and `build_release`
* ci: also enable unstable tokio for `build_dev`
* chore: update tokio to 1.21 (to fix console-subscriber 0.1.8
* fix: "must use"
* test: add tests for regex_match_on_field
* feat: more general `_field` predicate handling
* fix: remove old comment
* fix: update tests
* fix: improve test a little more
* fix: fmt
* fix: Update predicate/src/rpc_predicate/field_rewrite.rs
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
* fix: Handle predicates that can not be evaluated
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
InfluxQL queries can send (technically incorrect) ranges like this, meaning all time
but excluding the max nanosecond time.
Since this is an important case, we should handle it specially and use the optimized
'all time' handling for meta queries even though this is technically wrong in that
it does not filter out column names / measurement names at MAX_NANO_TIME exactly.
Closes: https://github.com/influxdata/conductor/issues/1072
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
- ensure that logging is done BEFORE the DB/namespace is requested (i.e.
any actual work is done) but AFTER the query semaphore is acquired
- simplify tag key decoding (so that the logging statements are simpler
to write)
* fix: Fix SeriesKey sort order for special _measurement and _field
* fix: Update expected test output
* fix: Update more tests
* fix: Re-sort tag key when using binary encoding
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* fix: optimize field columns for all-time predicates
Also fix timestamp range to allow selecting points at MAX_NANO_TIME
* fix: clamp end to MIN_NANO_TIME for safety
* refactor: add contains_all method to TimestampRange
* docs: fix comment
* test: add test for delete behaviour
* fix: tag_keys optimization for empty predicate
Also need to eliminate 'true' predicates from simplified predicate so
is_empty works correctly.
* refactor: use lit instead of spelling out literal true
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: TEMP Update DataFusion to pre-release
* chore: update arrow et al to 16.0.0
* chore: Run cargo hakari tasks
* fix: update reader read_dictionary API
* chore: Update to real Datafusion release
* fix: Update parquet API
* fix: update test
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
* test: query limits
This was left out of #4760.
* test: additional debugging
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: rework querier concurrency limiting
With #4752 we introduced a concurrency limit into the querier. It works
by drawing permits from a central semaphore whenever we create a
`QuerierNamespace`. This however only limits concurrency during query
planning and not query execution, because the objects contained within
the plan (chunks and some metadata) neither reference the permit nor the
`QuerierNamespace`.
Now one approach to fix that would be to wire up the permit all the down
into all the query-related data structures. This however is very fiddly
and potentially will get lost at some point, because as soon as we
transform these data structures -- e.g. into streams -- the permit might
get lost again. This will be potentially query-dependent and very hard
to debug.
So instead we reverse the approach and track the permits at the upper
layer of the stack: the gRPC service entry points. There we also need to
be careful -- e.g. when we return streams to tonic -- but it's way
easier to review that then the deeply nested object hierarchy that is
involved with queries. Also the separation of concerns is a bit clearer,
because why would a "chunk" care about the "query concurrency" as a
whole.
* refactor: improve gRPC permit keeping and prepare tests
* ci: fix cargo deny
* chore: downgrade `socket2`, version 0.4.5 was yanked
* chore: rename `query` to `iox_query`
`query` is already taken on crates.io and yanked and I am getting tired
of working around that.