* feat: WIP support CommandGetXdbcTypeInfo metadata endpoint with tests
* chore: update test case and add jdbc test
* chore: uncomment jdbc getCoumns test
* chore: lint
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* test: add tests on month and year date_bin
* fix: add IOX_COMPARE: uuid to get deterministics name for output parquet_file in the explain
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore(flightsql): rename Namespace to Database in error message
* chore(flightsql): rename Namespace to Database in test error msg
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
- No `ON` clause
- No `WHERE` clause
- No time restriction yet
- No `FROM <db>.<retention>`
Ref https://github.com/influxdata/idpe/issues/17360 .
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Returns a `NotImplemented` error when attempting to execute a
selector query, which projects a single selector function and additional
tags or fields until #7533 is implemented.
Introduced `error` module to simplify error handling and ensure
consistency of error messages.
* chore: Update datafusion and arrow/parquet to 37, tonic to 0.9.1
* refactor: Update for FieldRef and other API changes
* fix: Update field size calculation
* fix: Use `NullBuffer` directly
* fix: remove outdated comment
* chore: Update test for tonic
* chore: Run cargo hakari tasks
* chore: cargo update
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: support CommandGetCrossReference metadata endpoint with tests
* chore: create two tables in the test for GetCrossReference endpoint
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: support returning the database name if all the keys refer to the same database
* test: add test cases to check for same, different, and no database in request header
* chore: lint
* chore: more lint
* refactor: replace empty string with None for database_name
* refactor: simplify logic for NoFlightSQLDatabase error
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: support CommandGetImportedKeys metadata endpoint with tests
* chore: remove comments that is no longer valid
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: Specialises test output formatting for each language
* Also fixes an error uncovered in the `write_columnar` when tag
columns are `NULL`
Closes#7145
* chore: Run cargo hakari tasks
* chore: Add sorted output until #7513 is addressed
* chore: clippy 📋
* feat: Add `options` to `write_columnar`
* Added ability to configure border rendering, including removing
borders. This helps avoid variable width issues with EXPLAIN output,
which tends to vary and cause flaky test failures.
* chore: rustfmt 🧹
* chore: update expected output
* chore: clarify what "this" is
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
* feat: support `database`, `bucket`, and `bucket-name` as grpc header names
* chore: lint
* chore: update doc to accept `database`, `bucket`, and `bucket-name` as parameter names
* chore: update doc to only show `database` as the parameter name
* refactor: consolidate header names into a const vec and update comments on database
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This was "internal". The mapping works like this: we take the
`DataFusionError` and call `find_root` which should traverse the
`External(...)` chain (even through Arrow) to find the last error that
is not within the Arrow/DataFusion land. This is then mapped by us.
`DataFusionError::External(...)` is no further inspected and mapped
straight to "internal". I think this if fine because in the end we're
mostly dealing w/ DataFusion stuff anyways.
I've slightly changed the error mapping in the planner to emit
`DataFusionError::Plan(...)` instead which we map to "invalid argument".
I think this is way better for the user.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Within our query tests and our CLI, we've used to print out empty
query responses as:
```text
++
++
```
This is pretty misleading. Why are there no columns?! The reason is that
while Flight provides us with schema information, we often have zero
record batches (because why would the querier send an empty batch). Now
lets fix this by creating an empty batch on the client side based on the
schema data we've received. This way, people know that there are columns
but no rows:
```text
+-------+--------+------+------+
| count | system | time | town |
+-------+--------+------+------+
+-------+--------+------+------+
```
An alternative fix would be to pass the schema in addition to
`Vec<RecordBatch>` to the formatting code, but that seemed to be more
effort.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: add CommandGetPrimaryKeys metadata endpoint and tests
* chore: update schema for the returned record batch
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: end-to-end tests for authorization
Add tests to validate the behaviour of the authorization machinery
in the write and query paths.
In order to facilitate this an authorizer implentation has been
added to the the test helpers that runs an authorizer gRPC service
for the use of tests. The gRPC service is started in the process
that is running the test and listens on a OS-assigned port number.
The authorization service cannot be shared between tests so a
non-shared cluster must be used when the authorizer is configured.
The influxdb_iox_client has been enhanced so that the user can
configure additional headers in the flight client, which is used
for SQL and InfluxQL queries. This uses the same interface as the
Flight SQL client has for the same job.
* chore: fix lint errors
* chore: review suggestion
Consolate the authorization tests into fewer tests in order to avoid
repeating set-up and tear-down unnecessarily.
* fix: Add sort operator after window aggregate operator
Closes#7460
* fix: Refactor `LIMIT` and `OFFSET` implementation
These changes should allow the `limit` function to be used
generically with any plan following the same conventions.
* chore: No need to reorder this
* chore: Add documentation to the `limit` function
* feat: Support LIMIT and OFFSET with GROUP BY
* fix: Compile error
* chore: Improve function name and comment
* chore: rustfmt
* chore: fix clippy warnings
Allowing the too-many-arguments warning for project_select,
as it will require some refactoring after this PR has already
been reviewed. It may be refactored in the future when subqueries are
implemented
This commit adds a client method to invoke the
UpdateNamespaceServiceProtectionLimits RPC API, providing a user
friendly way to do this through the IOx command line.
- Only redact the actual time value in `FilteExec`, not the entire
expression. This preserves important information about filter
pushdowns.
- Apply similar time filter to `ParquetExec` because with #6098 we will
push down more filters into `ParquetExec`, including retention
policies.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: update gap fill planner rule to use LOCF
* chore: cargo fmt
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* test: Add an e2e test for write replication
* fix: Pass through rpc_write_replicas configuration to RpcWrite handler
---------
Co-authored-by: Dom <dom@itsallbroken.com>
* feat(flightsql): Add support for table_schema in GetTables and complies
feat(fightsql): Add support for table_schema in GetTables
support actual schema
it compiles / does not fail
* chore: resolve merge conflict
* chore: make table_schema optional
* test: update e2e test for `include_schema` = true
* chore: remove info!() and update test `flightsql_schema_matches`
* chore(deps): Bump rustix from 0.36.11 to 0.37.3 (#7308)
* chore(deps): Bump rustix from 0.36.11 to 0.37.3
Bumps [rustix](https://github.com/bytecodealliance/rustix) from 0.36.11 to 0.37.3.
- [Release notes](https://github.com/bytecodealliance/rustix/releases)
- [Commits](https://github.com/bytecodealliance/rustix/compare/v0.36.11...v0.37.3)
---
updated-dependencies:
- dependency-name: rustix
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: Run cargo hakari tasks
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: make real error for existing columns
* chore: user match instead of unwrap() on column names
* chore: use datafusion::physical_plan::collect() to get record batches
* chore: use `concat_batches` to combine multiple batches into single one and fix db schema test
* chore: add doc comment for GetTables
* chore: remove pretty print
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: Display failed query
Allows a user to immediately identify the failed query.
* feat: API improvements to InfluxQL parser
* feat: Extend `SchemaProvider` trait to query for UDFs
* fix: We don't want the parser to panic on overflows
* fix: ensure `map_type` maps the timestamp data type
* feat: API to map a InfluxQL duration expression to a DataFusion interval
* chore: Copied APIs from DataFusion SQL planner
These APIs are private but useful for InfluxQL planning.
* feat: Initial aggregate query support
* feat: Add an API to fetch a field by name
* chore: Fixes to handling NULLs in aggregates
* chore: Add ability to test expected failures for InfluxQL
* chore: appease rustfmt and clippy 😬
* chore: produce same error as InfluxQL
* chore: appease clippy
* chore: Improve docs
* chore: Simplify aggregate and raw planning
* feat: Add support for GROUP BY TIME(stride, offset)
* chore: Update docs
* chore: remove redundant `is_empty` check
Co-authored-by: Christopher M. Wolff <chris.wolff@influxdata.com>
* chore: PR feedback to clarify purpose of function
* chore: The series_sort can't be empty, as `time` is always added
This was originally intended as an optimisation when executing an
aggregate query that did not group by time or tags, as it will produce
N rows, where N is the number of measurements queried.
* chore: update comment for clarity
---------
Co-authored-by: Christopher M. Wolff <chris.wolff@influxdata.com>
* feat: add optional param to GetTables
* chore: add the third param to query plan
* feat: add table_types param
* chore: clippy
* test: add test cases with filters
* chore: update query to avoid SQL injection
* refactor: update where clause and cleanup
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* test: Add getTables jdbc_client example
* feat: add `CommandGetTables` in FlightSqlClient
* feat: add `CommandGetTables` in flightsql cmd and planner
* test: add e2e test for `CommandGetTables`
* chore: clippy
* chore: comment out the test with filters
* test: update jdbc test expected value for tables
---------
Co-authored-by: Chunchun <14298407+appletreeisyellow@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: Normalise name of Call expression to lowercase
Simplifies matching functions in planner, as they are guaranteed to be
lowercase.
This also ensures compatibility with InfluxQL when generating column
alias names, which are reflected in updated tests.
* chore: Ensure aggregate functions fail gracefully.
* feat: GROUP BY tag support
* feat: Ensure schema-level metadata is propagated
Requires: https://github.com/apache/arrow-rs/issues/3779
* chore: Add some tests to validate GROUP BY output
* chore: Add clarifying comment
* chore: Declare message in flight.proto
The metadata is public API, so best practice is to encode this in a way
that is most compatible for clients in other languages, and will also
document the history of schema changes.
Added tests to validate the metadata is encoded correctly.
* chore: Placate linters
* chore: Use correct column in test cases
* chore: Add `is_projected` to the TagKeyColumn message
`is_projected` is necessary to inform a client whether it should include
the tag key is used exclusively for the group key (false) or also
projected in the `SELECT` column list.
* refactor: Move constants to `schema` crate per PR feedback
* chore: rustfmt 🙄
* chore: Update docs for InfluxQlMetadata
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
---------
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* chore: Move to inline snapshots
* chore: Container for the DataFusion and IOx schema
* chore: Simplify using logical expression helper functions
* feat: Rewrite conditional expressions using InfluxQL rules
* feat: Add tests to validation conditional expression rewriting
* feat: Rewrite column expressions
* chore: Rewrite expression to use false when possible
This allows the planner to optimise away the entire logical plan to an
empty plan in many cases.
* feat: Complete cast postfix operator support
Added `unsigned` postfix operator, as the feature was mostly complete.
Closes#6895
* chore: Remove redundant attribute
* chore: Update datafusion
* chore: update the plans
* fix: update some plans
* chore: Update plans and port some explain plans to use insta snapshots
* fix: another plan
* chore: Run cargo hakari tasks
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: Add more tests
* chore: Fix default ordering; implement ORDER BY
* feat: Add EXPLAIN support
* chore: Add additional tests to validate GROUP BY expansion
* chore: More test cases for TZ, and failing log scalar function
This debugging tool was more useful in previous situations where it was
harder to get real data as input for the compactor.
It's currently causing a flaky test that isn't worth investigating.
Fixes#6190 by making it moot.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Ensure a HTTP error response contains a well-formed JSON structure
containing "code" and "message" fields (for backwards compatibility with
existing InfluxDB versions) and a correct "content-type" header.
* feat: Add support for data-driven InfluxQL tests
* chore: Added more tests
* chore: Check if GIT_HASH has changed, to avoid rebuilds
This speeds up the edit-test cycle, when changing data files, as
cargo won't rebuild the test binary 🥳
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* feat: Updating to new services for all-in-one
* fix: Use correct shard id for ingester2
* fix: clippy
* fix: use wal directory
* fix: end to end tests
* fix: Update tracing cases for new ingest reality
* fix: update metrics test
* fix: Use rpc mode
---------
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* fix: Reading file error reported the wrong path
When the `.expected` SQL file couldn't be found, this error reported
the input file path instead.
* test: Port SQL query_tests to end-to-end tests
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
These two tests weren't actually ensuring that the combination of these
predicates worked, because the tests would still pass if some of the
predicate parts were removed.
* refactor: Start a new file for tag keys tests; move the one existing test
* test: Port tag_keys query_tests to end-to-end tests
---------
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: Start a new file for measurement names tests; move the one existing test
* fix: Pass on predicate when sending a measurement names request with GrpcRequestBuilder
* feat: Support literal integer queries too
* test: Port measurement_names/table_names query_tests to end-to-end tests
* fix: merge conflict error
---------
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* fix: Delete the read filter *file*; last PR only deleted the *contents*
* test: Port read_group query_tests to end-to-end tests
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: Move sql script files from query_tests and into end to end query tests
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier
* chore: cleanup
* feat: new column max_l0_created_at to order files for deduplication
* chore: cleanup
* chore: debug info for chnaging cpu.parquet
* fix: update test parquet file
Co-authored-by: Marco Neumann <marco@crepererum.net>
* test: Port a test that's not actually supported through the full gRPC API
* test: Port remaining field column/measurement fields tests
* test: Remove unsupported measurement predicate and clarify purposes of tests
Andrew confirmed that the only way to invoke a Measurement Fields
request is with a measurement/table name specified: <0249b5018e/generated_types/protos/influxdata/platform/storage/service.proto (L43)>
so testing with a `_measurement` predicate is not valid.
I thought this test would become redundant with some other tests, but
they're actually still different enough; I took this opportunity to
better highlight the differences in the test names.
* refactor: Move all measurement fields tests to their own file
* test: Remove field columns tests that are now covered in end-to-end measurement fields tests
Make a new trait, `InfluxRpcTest`, that types can implement to define
how to run a test on a specific Storage gRPC API. `InfluxRpcTest` takes
care of iterating through the two architectures, running the setups, and
creating the custom test step.
Implementers of the trait can define aspects of the tests that differ
per run, to make the parameters of the test clearer and highlight what
different tests are testing.
To support a case where someone calls WriteLineProtocol twice in
a row to simulate two write requests. The test should be able to
record this state before the two write requests and not twice.
Fixes#6506.
Also has the pleasant side effect of making this code simpler and less
hacky-- it now checks the number of Parquet files for the whole
namespace, which is useful in cases where the line protocol writes to
several tables.
This test was relying on a graceful shutdown of the ingester to drive a
WAL replay, restoring the buffered state at startup.
Now the shutdown causes the data to be persisted and not replayed, this
didn't work.
* feat: InfluxQL learns how to plan some queries
Also added a means to test the planner and execution
* chore: Update module docs
* chore: Document the planner functions
* chore: Update end_to_end_cases crate
* chore: Clarify why `SLIMIT` and `SOFFSET` return `NotImplemented`
* chore: Address lint issues
* chore: Fix rustdoc link issue
* chore: Remove InfluxQL tests from query_tests crate
Will follow conventions established by @carols10cents when
new query_tests crate is merged.
* chore: `now` field
`now` is a DataFusion built-in scalar function
* chore: remove unused code
* chore: Add additional arithmetic expression tests
* chore: Establish pattern for identifying and tracking InfluxQL issues
* chore: Add tests for case sensitivity issues
* chore: group tests into modules and functions
This avoids mass rewriting of insta snapshots as new
tests are added to each function. When tests are added in the middle,
existing snapshots are renamed (-N+1, -N+2, etc) resulting in
having to review numerous additional snapshots.
* feat: Add basic Flight and FlightSQL client into IOx codebase
Basic flight end to end test
* fix: Apply suggestions from code review
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
I couldn't find any end2end tests for these cases and I was kinda
worried that our error codes were wrong. Turns out they are correct, but
let's have some nice tests for this behavior.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* fix: gRPC errors regarding group cols
- missing group col prev. produced an "internal error" but should be
"invalid argument"
- duplicate group cols produced a panic but should also be "invalid
argument"
* docs: clarify
* refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics
Closes#6310.
* refactor: rename and tune default exec mem limits
* fix: ingester2 bits after rebase