Commit Graph

662 Commits (196c589ef64f73677eb3e89e60b219f862bde19a)

Author SHA1 Message Date
Marco Neumann 18783f9462
test: harden `end_to_end_cases::debug::build_catalog` (#8537)
This seems to fail a lot in CI, so try to work around it. A proper fix is
tracked under #8287.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-22 08:56:19 +00:00
Andrew Lamb 967aef0e9d
chore: Update datafusion (#8515)
* chore: Update datafusion

* fix: update for API

* fix: Verify unsupported statements, with tests

* fix: update tests

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-21 17:49:21 +00:00
Carol (Nichols || Goulding) 0fd5af64bc
refactor: Extract a builder of namespace commands 2023-08-18 15:59:30 -04:00
Carol (Nichols || Goulding) 64eb3be4c2
refactor: Split long strings for readability and rustfmt
When there are long strings in a file, rustfmt just gives up and
ignores the whole file.
2023-08-18 15:59:06 -04:00
Carol (Nichols || Goulding) defd20c7ed
refactor: Reduce duplication and increase consistency in namespace names in tests
Create consts outside the test steps when possible to share values that
need to be the same. Call all of these namespace_name to distinguish
from a namespace object or command.
2023-08-18 15:52:38 -04:00
Carol (Nichols || Goulding) d575873ceb
refactor: Extract CLI test helper functions into their own module 2023-08-18 15:52:37 -04:00
Carol (Nichols || Goulding) 412df8dc4e
fix: Remove debugging printlns from tests 2023-08-18 15:52:37 -04:00
Carol (Nichols || Goulding) 65207b2f9c
fix: Remove unneeded use of lazy_static
This can just be a static slice rather than a vec.
2023-08-18 15:52:37 -04:00
Carol (Nichols || Goulding) 3d1e49e57a
refactor: Extract table CLI tests to their own module 2023-08-18 15:52:37 -04:00
Carol (Nichols || Goulding) 22f1d6f469
refactor: Extract namespace CLI tests to their own module 2023-08-18 15:52:37 -04:00
Nga Tran 5d17a99dbb
feat: read null sort_key_ids (#8489)
* feat: read null sort_key_ids

* chore: clearer explanation about test strategy

* chore: Apply suggestions from code review

Co-authored-by: Marco Neumann <marco@crepererum.net>

* test: tests that add partition with NULL sort_key_ids

* chore: address review comments

* chore: remove unnecessary comments and tests

* fix: typos

* chore: remove unnecessary tests

* fix: check duplicates for SortedColumnSet

---------

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-18 14:15:27 +00:00
Andrew Lamb af8967f9e1
chore: Update DataFusion to get fix for string functions on tags (#8479)
* chore: Update DataFusion pin

* test: add test

* fix: Update test with correct query
2023-08-17 17:00:04 +00:00
Andrew Lamb 25ab230898
refactor(flightsql): Use upstream implementation of XdbcTypeInfo builder, support filter by datatype (#8455)
* refactor(flightsql): Use upstream implementation of XdbcTypeInfo builder

* fix: Update test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-08 20:16:39 +00:00
Martin Hilton 8e2662befb
feat(influxql): minimal SHOW RETENTION POLICIES implementation (#8433)
For backwards compatibility with version 1 clients, support a minimal
implementation of SHOW RETENTION POLICIES. This advertises a single,
default, retention policy for any database. This is sufficient for
compatibility with the Grafana plugin and avoids the need for
cross-database catalog queries in the querier.

The values used for the "duration", "shardGroupDuration", and
"resourceN" are the same canned values used in the cloud 2
implementation.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-07 13:39:53 +00:00
Andrew Lamb 6e13ff8cb8
chore: Update DataFusion pin (#8390)
* chore: Update DataFusion pin

* chore: Update for API

* fix: update plans

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-02 14:58:16 +00:00
Chunchun Ye c8242c7469
chore(cli): add `--partition-template` to `namespace create` (#8365)
* chore(cli): add `--partition-template` to namespace create

* chore: fix typo in doc for `PartitionTemplateConfig`

chore: add max limit 8 for partition template in doc

* chore: add e2e tests

* chore: fmt

* chore: add more e2e tests for namespace create with partition template

* chore: show doc comments in cli help interface

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 14:37:00 +00:00
Andrew Lamb de79619e71
chore: Update datafusion (#8355)
* chore: Update datafusion pin

* fix: Update for change in API

* chore: Update plan

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-31 15:41:00 +00:00
Carol (Nichols || Goulding) 4a9e76b8b7
feat: Make parquet_file.partition_id optional in the catalog (#8339)
* feat: Make parquet_file.partition_id optional in the catalog

This will acquire a short lock on the table in postgres, per:
<https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads>

This allows us to persist data for new partitions and associate the
Parquet file catalog records with the partition records using only the
partition hash ID, rather than both that are used now.

* fix: Support transition partition ID in the catalog service

* fix: Use transition partition ID in import/export

This commit also removes support for the `--partition-id` flag of the
`influxdb_iox remote store get-table` command, which Andrew approved.

The `--partition-id` filter was getting the results of the catalog gRPC
service's query for Parquet files of a table and then keeping only the
files whose partition IDs matched. The gRPC query is no longer returning
the partition ID from the Parquet file table, and really, this command
should instead be using `GetParquetFilesByPartitionId` to only request
what's needed rather than filtering.

* feat: Support looking up Parquet files by either kind of Partition id

Regardless of which is actually stored on the Parquet file record.

That is, say there's a Partition in the catalog with:

Partition {
    id: 3,
    hash_id: abcdefg,
}

and a Parquet file that has:

ParquetFile {
    partition_hash_id: abcdefg,
}

calling `list_by_partition_not_to_delete(PartitionId(3))` should still
return this Parquet file because it is associated with the partition
that has ID 3.

This is important for the compactor, which is currently only dealing in
PartitionIds, and I'd like to keep it that way for now to avoid having
to change Even More in this PR.

* fix: Use and set new partition ID fields everywhere they want to be

---------

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-31 12:40:56 +00:00
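
The lookup rule described in commit 4a9e76b8b7 (a Parquet file that stores only a partition hash ID must still be found when querying by the numeric partition ID) can be illustrated with a minimal sketch. The struct layouts and the `files_for_partition` helper below are hypothetical simplifications for illustration, not the actual IOx catalog types or API.

```rust
/// Hypothetical, simplified catalog record for a partition.
#[derive(Debug, Clone, PartialEq)]
struct Partition {
    id: i64,
    hash_id: Option<String>,
}

/// Hypothetical, simplified catalog record for a Parquet file.
/// A file may carry the numeric partition ID, the hash ID, or both.
#[derive(Debug, Clone, PartialEq)]
struct ParquetFile {
    name: String,
    partition_id: Option<i64>,
    partition_hash_id: Option<String>,
}

/// Return the files that belong to the partition whose numeric ID is
/// `wanted`, regardless of which kind of ID the file record stores.
fn files_for_partition<'a>(
    partitions: &[Partition],
    files: &'a [ParquetFile],
    wanted: i64,
) -> Vec<&'a ParquetFile> {
    // Resolve the numeric ID to the partition record first, so that the
    // partition's hash ID (if any) is known.
    let partition = match partitions.iter().find(|p| p.id == wanted) {
        Some(p) => p,
        None => return Vec::new(),
    };
    files
        .iter()
        .filter(|f| {
            let by_id = f.partition_id == Some(partition.id);
            let by_hash = f.partition_hash_id.is_some()
                && f.partition_hash_id == partition.hash_id;
            by_id || by_hash
        })
        .collect()
}
```

With a partition `{ id: 3, hash_id: abcdefg }` and a file that records only `partition_hash_id: abcdefg`, calling `files_for_partition(&partitions, &files, 3)` returns that file, which mirrors the example in the commit message.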
NGA-TRAN afd5b12324 chore: address review comments and fix tests due to previous commit 2023-07-26 16:40:54 -04:00
NGA-TRAN 1ddc64d68d test: modify and add tests to check valid and invalid strftime 2023-07-26 11:44:13 -04:00
NGA-TRAN e6cf9c9d61 fix: rename namespaces of different tests to avoid test failures 2023-07-25 17:07:28 -04:00
NGA-TRAN 44e1c1abdb feat: implement partition template as json and more tests as well as verify the partition key after inserting data 2023-07-25 16:51:57 -04:00
NGA-TRAN 144778430e Merge branch 'main' into ntran/table_cli 2023-07-21 14:49:02 -04:00
NGA-TRAN 2aff6a7495 chore: remove unused comments 2023-07-21 14:34:19 -04:00
NGA-TRAN a340fd5a6a feat: create table CLI 2023-07-21 14:17:42 -04:00
Martin Hilton b1c695d5a2
fix(influxql): fill count aggregates with 0 by default (#8284)
* chore: update expected output for `COUNT` aggregates with `FILL(null)`

See #8232

* fix(influxql): fill count aggregates with 0 by default

When gap-filling a COUNT aggregate, any missing rows should be filled
with 0, unless otherwise directed by a FILL clause. To do this, the
projection on the aggregate plan is modified to coalesce any COUNT
fields with 0 unless a FILL value has been specified in the query.

* chore: add more tests

* chore: add explanation of COUNT gap filling with multiple measurements

* fix: update test introduced with merge

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 16:31:10 +00:00
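
The coalescing behaviour described in commit b1c695d5a2 can be sketched outside the query planner. This is a plain-Rust illustration only: the `fill_counts` function and its `fill` parameter are hypothetical stand-ins for the plan projection and a numeric `FILL(<value>)` clause, and other FILL modes are not modelled.

```rust
/// Fill gaps in a gap-filled COUNT series.
///
/// `counts` holds one entry per time bucket; `None` means the bucket had
/// no rows. `fill` is a hypothetical stand-in for an explicit numeric
/// FILL value; when it is absent, missing buckets default to 0.
fn fill_counts(counts: &[Option<i64>], fill: Option<i64>) -> Vec<i64> {
    let default = fill.unwrap_or(0);
    counts.iter().map(|c| c.unwrap_or(default)).collect()
}
```

For example, `fill_counts(&[Some(2), None, Some(1)], None)` yields `[2, 0, 1]`, while passing `Some(9)` as the fill value yields `[2, 9, 1]`.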
Martin Hilton 5731e012bf
fix(influxql): advanced syntax window functions with selector aggregates (#8303)
Ensure that advanced syntax window functions that contain a selector,
rather than an aggregate, function are considered valid and generate
a correct plan.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 14:48:15 +00:00
Andrew Lamb 3eb48ef210
chore: Update datafusion again (#8247)
* chore: Update datafusion to get new grouping

* chore: Update for new API

* chore: update tests

* fix: new API

* fix: state type

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 11:20:36 +00:00
Andrew Lamb ac9d1946e9
fix: add retry loop to avoid CI flake in build-catalog test (#8271)
* fix: add retry loop to avoid CI flake in build-catalog test

* fix: Update influxdb_iox/tests/end_to_end_cases/debug.rs

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 10:14:06 +00:00
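
A retry loop of the kind commit ac9d1946e9 mentions might look like the sketch below. It is a generic illustration, not the code in `influxdb_iox/tests/end_to_end_cases/debug.rs`; the `retry` helper, its parameters, and the use of `Option` to signal "not ready yet" are assumptions.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Call `attempt` until it returns `Some`, at most `max_attempts` times,
/// sleeping for `delay` between unsuccessful tries.
/// Hypothetical helper for illustration only.
fn retry<T>(
    max_attempts: usize,
    delay: Duration,
    mut attempt: impl FnMut() -> Option<T>,
) -> Option<T> {
    for i in 0..max_attempts {
        if let Some(value) = attempt() {
            return Some(value);
        }
        // Do not sleep after the final failed attempt.
        if i + 1 < max_attempts {
            sleep(delay);
        }
    }
    None
}
```

A test would wrap the flaky query in the closure and assert only on the value returned once the closure reports a non-empty result.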
Joe-Blount 1bed99567c
chore: add DF metrics to compaction spans (#8270)
* chore: add DF metrics to compaction spans

* chore: update string for test verification

* chore: update comment

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-20 15:00:22 +00:00
Christopher M. Wolff 668a1c3d8e
fix: aggregate fns called on tags should return null (#8274)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-20 14:55:16 +00:00
Marco Neumann 6ae9143742
fix: `end_to_end_cases::cli::query_ingester` flakiness (#8281)
While I cannot reproduce the CI flakiness locally (probably because the
local system is fast enough), looking at the test convinced me that the
ingester should not persist.

Closes #8245.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-20 11:48:59 +00:00
Martin Hilton d1640bb926
feat(influxql): CUMULATIVE_SUM window function (#8248)
* feat(influxql): CUMULATIVE_SUM window function

Implement the InfluxQL CUMULATIVE_SUM window function. This is
implemented as described in
https://docs.influxdata.com/influxdb/v1.8/query_language/functions/#cumulative_sum.

* chore: Add a test demonstrating NULL handling of CUMULATIVE_SUM

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-18 06:13:58 +00:00
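
As a rough illustration of what commit d1640bb926 implements, a cumulative sum emits the running total at each input row. The sketch below works on a plain slice and assumes every value is present; the NULL handling that the commit tests is not modelled, and `cumulative_sum` is a hypothetical free function rather than the IOx window function.

```rust
/// Return the running total of `values`, one output per input row.
/// Hypothetical helper; NULL handling is deliberately left out.
fn cumulative_sum(values: &[f64]) -> Vec<f64> {
    let mut total = 0.0;
    values
        .iter()
        .map(|v| {
            total += v;
            total
        })
        .collect()
}
```

For the input `[1.0, 2.0, 3.5]` this returns `[1.0, 3.0, 6.5]`.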
Christopher M. Wolff 33e41fc5cb
fix: improve error for malformed gap fill query (#8252)
* fix: improve error for malformed gap fill query

* fix: code review feedback
2023-07-17 21:20:34 +00:00
Christopher M. Wolff b916a89159
fix: recurse through SubqueryAlias when finding gap fill time range (#8249) 2023-07-17 19:39:30 +00:00
Joe-Blount 85a9e13262
Merge branch 'main' into jrb_63_compactor_spans 2023-07-17 09:52:27 -05:00
Christopher M. Wolff 85f03acbdf
fix: correctly catch field/tag discrepancy (#8234)
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-14 18:21:56 +00:00
Joe-Blount 803122e3b4 Merge remote-tracking branch 'origin/main' into jrb_63_compactor_spans
# Conflicts:
#	compactor/src/driver.rs
2023-07-13 08:54:22 -05:00
Andrew Lamb 9bfec2f77c
fix: ignore flaky test while it is debugged (#8227)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-13 10:27:13 +00:00
Andrew Lamb 48a5c3e966
chore: Add longer sleep in `end_to_end_cases::debug::build_catalog` and extra logging (#8224)
* fix: Add longer sleep in end_to_end_cases::debug::build_catalog

* chore: add debug logging when test fails

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-13 10:14:59 +00:00
Joe-Blount c5a4912399 chore: add compactor tracing test case 2023-07-11 10:43:09 -05:00
Martin Hilton 9111cd517f
feat(influxql): PERCENTILE function (#8187)
* feat(influxql): support TOP and BOTTOM functions

Add support for the TOP and BOTTOM functions which return the first
n rows in some ordered data set.

* fix: clippy

* refactor(influxql): use window aggregates for selectors

Change the implementation of ProjectionType::Selector to use a window
aggregate, rather than an aggregate with a custom selector function.
This is in preparation for implementing PERCENTILE.

* feat(influxql): PERCENTILE selector

Add a selector for the row containing the nth percentile of a
partition. This is the behaviour used when a single selector function
is used in an influxql query.

* feat(influxql): PERCENTILE aggregator

Add the PERCENTILE aggregation function for when the PERCENTILE
function is used in an aggregating projection. This implementation
buffers all non-null field values in memory in order to perform the
operation and therefore could be an expensive operation. This is
necessary for compatibility with earlier influxdb versions.

* refactor(influxql): move PERCENTILE implementation out of plan

The plan module is getting rather full of user-defined function
implementations. This breaks the new functions used to implement
percentile into some new top-level modules for aggregate and window
UDFs.

* fix: doc-lint

* chore: refactor `find_enumerated`

* chore: use `s` in format string

* chore: include the unexpected selector function in the error

* chore(influxql): review suggestions

Added some additional comments to aid understanding.

Changed the handling of selector functions such that FIRST, LAST,
MAX & MIN behave the same as they did before PERCENTILE was added.

* chore(influxql): make percent_row_number a window UDF

Now that user-defined window functions are available make the
percent_row_number function be one of those. This allows the values
to be calculated for the entire window partition in one go.

For some reason the user-defined window function cannot return NULL
values. This function uses 0 where it would otherwise use NULL, as
row numbering starts at 1.

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-11 05:33:16 +00:00
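
The aggregating form of PERCENTILE in commit 9111cd517f is described as buffering all non-null field values in memory. A minimal sketch of that idea follows. The rank formula used here (round `len * n / 100` to the nearest integer, then convert to a zero-based index) is an assumption chosen for illustration and may not match the IOx implementation; `percentile` is a hypothetical free function.

```rust
/// Return the value at the `n`th percentile of the non-null inputs,
/// or `None` if the computed rank falls outside the buffered values.
/// The rank formula is an assumption, not taken from the commit.
fn percentile(values: &[Option<f64>], n: f64) -> Option<f64> {
    // Buffer every non-null value, as the commit message describes.
    let mut buffered: Vec<f64> = values.iter().flatten().copied().collect();
    buffered.sort_by(|a, b| a.total_cmp(b));

    // Assumed ranking: nearest rank, converted to a zero-based index.
    let rank = (buffered.len() as f64 * n / 100.0 + 0.5).floor() as isize - 1;
    if rank < 0 || rank as usize >= buffered.len() {
        return None;
    }
    Some(buffered[rank as usize])
}
```

Because every non-null value is held and sorted, memory use grows with the number of rows in the group, which is the cost the commit message warns about.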
Fraser Savage dec0244bff
refactor(e2e): Wait 100ms between queries in debug::build_catalog test 2023-07-10 15:27:30 +01:00
Fraser Savage 0978aa0551
fix(e2e): Add small busy-loop to debug::build_catalog test to assert only on non-empty results 2023-07-10 15:13:37 +01:00
Andrew Lamb 3ce11d8d66
chore: Update DataFusion (#8190)
* chore: Update DataFusion

* chore: Run cargo hakari tasks

* fix: Update for API changes

* fix: use display format

* chore: Update explain plan output

* fix: update plans

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-10 09:54:50 +00:00
Andrew Lamb 048fc32bd5
feat: add `influxdb_iox debug build-catalog` command (#8067)
* feat: add `influxdb_iox debug build-catalog` command

* fix: tests

* fix: Use info! logs instead of println for status

* fix: Set partition_hash_id as well

* fix: remove leftover code

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-07 18:32:27 +00:00
Stuart Carnie 1ca547b313
fix: Teach planner to rewrite binary expressions for div operator
Specifically when the operands are integers, to match InfluxQL OG
2023-07-07 11:22:03 +10:00
Martin Hilton dfffdc1d90
feat(influxql): support TOP and BOTTOM functions (#8143)
* refactor(iox_query_influxql): expand select projection

Change the SELECT projection in the planner to make it clearer how
each projection type works.

* feat(influxql): support TOP and BOTTOM functions

Add support for the TOP and BOTTOM functions which return the first
n rows in some ordered data set.

* fix: clippy

* chore: Use array / slice destructuring

* chore: review suggestion in iox_query_influxql/src/plan/planner.rs

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-06 07:08:45 +00:00
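
Commit dfffdc1d90 describes TOP and BOTTOM as returning the first n rows of an ordered data set. The sketch below shows that idea on a plain slice of field values; `top` and `bottom` are hypothetical helpers, and the real functions operate on query plans and also carry timestamps and tags, which are omitted here.

```rust
/// Return the `n` largest values, largest first.
/// Hypothetical helper for illustration only.
fn top(values: &[f64], n: usize) -> Vec<f64> {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| b.total_cmp(a)); // descending order
    sorted.truncate(n);
    sorted
}

/// Return the `n` smallest values, smallest first.
/// Hypothetical helper for illustration only.
fn bottom(values: &[f64], n: usize) -> Vec<f64> {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b)); // ascending order
    sorted.truncate(n);
    sorted
}
```

For the input `[3.0, 1.0, 4.0, 1.5]`, `top(.., 2)` returns `[4.0, 3.0]` and `bottom(.., 2)` returns `[1.0, 1.5]`.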
Marco Neumann 70b44f78ee
test: correctly decode ingester responses in end2end tests 2023-07-03 17:25:01 +02:00
Marco Neumann b1a4e3955e
test: `ingester_partition_pruning` must perform type coercion 2023-07-03 17:25:00 +02:00