This is used to gather state shared across the whole query, as opposed
to `Select`, which represents state for each `SELECT` statement.
The first example is tracking whether a statement projects multiple
unique measurements. This change improved one of the plans, as it
no longer needs to sort the `iox::measurement` column.
This is an improvement over the previous version, and prepares the
planner for implementing subqueries and passing schema to the
`project_select` function.
The end goal is that each `Select` node will contain a schema to be
referenced directly by the InfluxQL planner. Additionally, further
refinement of the field data types used by the `Select` node
is expected, to remove ambiguity from the planner.
* refactor: only use struct-style `select` in InfluxQL planner
For #7533 we need to track more columns than just `time` and `value`,
and having one simple variant plus multiple complex ones quickly gets
overly complicated. The aggregator is internally identical anyway, so
let's only use one and then pull out the struct fields that we need.
I'll also change the InfluxRPC planner to use the struct variant next,
so we have a single `select` system both in the planners and in `query_functions`.
* docs: improve
* docs: explain test
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
---------
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Rewriter drops `MeasurementSelection` entries from the `FROM` clause
that do not project any fields directly from a measurement. Subqueries
are an exception, such that an outer query may project only tags
from the inner subquery.
* feat: Evaluate data types of binary expressions
This is necessary to ensure column data types of projections of a
subquery are accurately determined.
* chore: rustfmt 🧹
- No `ON` clause
- No `WHERE` clause
- No time restriction yet
- No `FROM <db>.<retention>`
Ref https://github.com/influxdata/idpe/issues/17360 .
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Returns a `NotImplemented` error when attempting to execute a
selector query, i.e. one that projects a single selector function and
additional tags or fields, until #7533 is implemented.
Introduced `error` module to simplify error handling and ensure
consistency of error messages.
This will be used to complete queries that have selector semantics,
meaning they project a single selector function and therefore
use the timestamp for the time column.
* feat: Specialises test output formatting for each language
* Also fixes an error uncovered in `write_columnar` when tag
columns are `NULL`
Closes #7145
* chore: Run cargo hakari tasks
* chore: Add sorted output until #7513 is addressed
* chore: clippy 📋
* feat: Add `options` to `write_columnar`
* Added ability to configure border rendering, including removing
borders. This helps avoid variable width issues with EXPLAIN output,
which tends to vary and cause flaky test failures.
* chore: rustfmt 🧹
* chore: update expected output
* chore: clarify what "this" is
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
This was "internal". The mapping works like this: we take the
`DataFusionError` and call `find_root`, which should traverse the
`External(...)` chain (even through Arrow) to find the last error that
is not within Arrow/DataFusion land. This is then mapped by us.
`DataFusionError::External(...)` is not inspected any further and is
mapped straight to "internal". I think this is fine because in the end
we're mostly dealing with DataFusion stuff anyway.
I've slightly changed the error mapping in the planner to emit
`DataFusionError::Plan(...)` instead which we map to "invalid argument".
I think this is way better for the user.
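For reference, a minimal sketch of the mapping described above. It uses
a toy error type, not the real `DataFusionError`: `ToyError`, `Status`
and `map_status` are hypothetical names, and this `find_root` only
approximates the real one by peeling nested `External` wrappers.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for `DataFusionError` (only three variants).
#[derive(Debug)]
enum ToyError {
    Plan(String),
    Internal(String),
    External(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for ToyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Plan(m) => write!(f, "plan error: {m}"),
            Self::Internal(m) => write!(f, "internal error: {m}"),
            Self::External(e) => write!(f, "external error: {e}"),
        }
    }
}

impl Error for ToyError {}

impl ToyError {
    /// Follow `External(...)` wrappers for as long as they contain
    /// another `ToyError`, and return the innermost one.
    fn find_root(&self) -> &ToyError {
        let mut current = self;
        while let ToyError::External(inner) = current {
            match inner.downcast_ref::<ToyError>() {
                Some(e) => current = e,
                None => break, // foreign error: stop at the wrapper
            }
        }
        current
    }
}

/// Hypothetical user-facing status codes.
#[derive(Debug, PartialEq)]
enum Status {
    InvalidArgument,
    Internal,
}

/// Plan errors become "invalid argument"; everything else, including
/// an `External` holding a foreign error, becomes "internal".
fn map_status(err: &ToyError) -> Status {
    match err.find_root() {
        ToyError::Plan(_) => Status::InvalidArgument,
        ToyError::Internal(_) | ToyError::External(_) => Status::Internal,
    }
}
```

With this shape, a planner error wrapped in several `External` layers
still surfaces as "invalid argument", while an opaque foreign error
stays "internal".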
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* fix: Add sort operator after window aggregate operator
Closes #7460
* fix: Refactor `LIMIT` and `OFFSET` implementation
These changes should allow the `limit` function to be used
generically with any plan following the same conventions.
* chore: No need to reorder this
* chore: Add documentation to the `limit` function
* feat: Support LIMIT and OFFSET with GROUP BY
* fix: Compile error
* chore: Improve function name and comment
* chore: rustfmt
* chore: fix clippy warnings
Allow the too-many-arguments warning for `project_select`, as
fixing it would require refactoring after this PR has already
been reviewed. It may be refactored in the future when subqueries are
implemented.
* chore: Simplify insta snapshots
* chore: Extract struct-like enums to structs
This is in line with DataFusion, which also represents many of its
expression types as structs. The change permits explicit visit
methods for these new types.
These changes will be used by rewriters and visitors to treat the
types, such as `Call`, as an atomic unit that can be replaced.
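A rough illustration of the idea, not the actual parser types: the
`Expr`, `Call`, `Visitor`, `walk` and `CallNames` definitions below are
simplified, hypothetical stand-ins. Once the call data lives in its own
struct, a visitor can take a `&Call` directly instead of destructuring
an enum variant.

```rust
/// The data that previously lived inline in a struct-like enum
/// variant, e.g. `Expr::Call { name, args }`.
#[derive(Debug, Clone, PartialEq)]
struct Call {
    name: String,
    args: Vec<Expr>,
}

/// The enum now wraps the extracted struct in a tuple variant.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Call(Call),
    Literal(i64),
}

/// A visitor with an explicit method for the extracted type.
trait Visitor {
    fn visit_call(&mut self, _call: &Call) {}
}

/// Pre-order walk that hands every `Call` to the visitor as a unit.
fn walk(expr: &Expr, v: &mut impl Visitor) {
    if let Expr::Call(call) = expr {
        v.visit_call(call);
        for arg in &call.args {
            walk(arg, v);
        }
    }
}

/// Example visitor: collects the names of all calls it sees.
struct CallNames(Vec<String>);

impl Visitor for CallNames {
    fn visit_call(&mut self, call: &Call) {
        self.0.push(call.name.clone());
    }
}
```

The same shape lets a rewriter accept and return a whole `Call` value,
which is what makes the type replaceable as a unit.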
* chore: Update DataFusion
* refactor: Update predicate crate for new transform API
* refactor: Update iox_query crate for new APIs
* refactor: Update influxql for new API
* chore: Run cargo hakari tasks
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: Display failed query
Allows a user to immediately identify the failed query.
* feat: API improvements to InfluxQL parser
* feat: Extend `SchemaProvider` trait to query for UDFs
* fix: We don't want the parser to panic on overflows
* fix: ensure `map_type` maps the timestamp data type
* feat: API to map an InfluxQL duration expression to a DataFusion interval
* chore: Copied APIs from DataFusion SQL planner
These APIs are private but useful for InfluxQL planning.
* feat: Initial aggregate query support
* feat: Add an API to fetch a field by name
* chore: Fixes to handling NULLs in aggregates
* chore: Add ability to test expected failures for InfluxQL
* chore: appease rustfmt and clippy 😬
* chore: produce same error as InfluxQL
* chore: appease clippy
* chore: Improve docs
* chore: Simplify aggregate and raw planning
* feat: Add support for GROUP BY TIME(stride, offset)
* chore: Update docs
* chore: remove redundant `is_empty` check
Co-authored-by: Christopher M. Wolff <chris.wolff@influxdata.com>
* chore: PR feedback to clarify purpose of function
* chore: The series_sort can't be empty, as `time` is always added
This was originally intended as an optimisation when executing an
aggregate query that did not group by time or tags, as it will produce
N rows, where N is the number of measurements queried.
* chore: update comment for clarity
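
For the `GROUP BY TIME(stride, offset)` support listed above, the
bucketing arithmetic can be sketched as follows. This is an
illustration only: `bucket_start` is a hypothetical helper, not the
planner's code, and it assumes buckets are aligned to the epoch shifted
by `offset`, with all values in the same integer time unit.

```rust
/// Returns the inclusive start of the bucket that contains `ts`.
/// Buckets are `stride` wide and their boundaries sit at
/// `offset + k * stride` for integer `k` (an assumption of this sketch).
fn bucket_start(ts: i64, stride: i64, offset: i64) -> i64 {
    assert!(stride > 0, "stride must be positive");
    // `rem_euclid` keeps the remainder non-negative, so timestamps
    // before the epoch still round down to the start of their bucket.
    ts - (ts - offset).rem_euclid(stride)
}
```

With a stride of 60 and no offset, timestamp 125 falls into the bucket
starting at 120; with an offset of 10 the boundaries shift to
..., 10, 70, 130, ... and the same timestamp falls into the bucket
starting at 70.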
---------
Co-authored-by: Christopher M. Wolff <chris.wolff@influxdata.com>
* refactor: Break unnecessary dependencies from `iox_query` crate
In the process, the test code has been simplified.
* refactor: Move InfluxQL plan module to iox_query_influxql crate
* refactor: Move remaining behaviour from iox_query to iox_query_influxql
* chore: rustfmt 🙄
I was under the impression `clippy` would catch formatting