Commit Graph

476 Commits (17890a9906e61de26e121e04ac165df650912e64)

Author SHA1 Message Date
Carol (Nichols || Goulding) a2454b542d
fix: Small cleanups in Cargo.tomls (#3160)
* fix: Add tokio rt-multi-thread feature so cargo test -p client_util compiles

* fix: Alphabetize dependencies

* fix: Add the data_types_conversions feature to get tests passing

* fix: Remove dev dependencies already listed under normal dependencies

* fix: Make sure the workspace is using the new resolver
2021-11-18 22:26:33 +00:00
Raphael Taylor-Davies 8155747735
feat: add write buffer delete encoding (#2731) (#3127)
* feat: add write buffer delete encoding (#2731)

* chore: fix doc

* chore: review feedback

* chore: review feedback

* chore: fmt

* chore: review feedback
2021-11-17 16:12:19 +00:00
Raphael Taylor-Davies 3d091208af
refactor: move delete predicate into data_types (#2731) (#3094)
* refactor: move delete predicate into dml (#2731)

* refactor: move DeletePredicate to data_types

* chore: fix doc

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-15 10:28:58 +00:00
Nga Tran 71731524c4 test: add group by tests 2021-11-08 10:46:40 -05:00
Nga Tran abbfafcabd chore: merge main to branch 2021-11-08 09:28:29 -05:00
Nga Tran 97206b13cb fix: statistics for max/min(time) should have data type timestamp 2021-11-05 18:11:54 -04:00
Raphael Taylor-Davies 60f0deaf1e
feat: remove flatbuffer entry (#3045) 2021-11-05 20:19:24 +00:00
Andrew Lamb a252b81baa
test: add negative test for CREATE EXTERNAL TABLE (#3051)
* test: add negative test for CREATE EXTERNAL TABLE

* fix: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-05 17:43:08 +00:00
Nga Tran 89699cf0de chore: use latest DataFusion to fix the min/max(dictionary string) bug 2021-11-04 11:04:03 -04:00
Andrew Lamb 5e86989990
fix: Only use timestamps in first/last when there is a corresponding … (#2988)
* fix: Only use timestamps in first/last when there is a corresponding value

* docs: Fix broken English in comment

* docs: clarify expectations in test
2021-11-01 18:54:43 +00:00
dependabot[bot] c540b40f05
chore(deps): bump tokio from 1.12.0 to 1.13.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.12.0 to 1.13.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.12.0...tokio-1.13.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-01 11:21:59 +00:00
Andrew Lamb d2c2143277
fix: wrong results in read_filter with "complex" OR predicate and IS_NULL (#2977)
* fix: Allow evaluating general purpose predicates in gRPC by rewriting missing columns to null

* fix: update commands and add some additional tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-29 14:30:28 +00:00
kodiakhq[bot] 6760212e86
Merge branch 'main' into ntran/group 2021-10-26 21:25:57 +00:00
Nga Tran 7b3b92b161 chore: make comments clearer 2021-10-26 16:52:45 -04:00
Nga Tran 682499d4a7 chore: make comments clearer 2021-10-26 16:50:27 -04:00
Nga Tran 8c43b24745 chore: cleanup 2021-10-26 16:46:50 -04:00
Nga Tran 46cb4df14d fix: now no rows will be returned for read_group if the table has no data, all its data is soft deleted, or the table does not exist 2021-10-26 16:40:36 -04:00
Marco Neumann 5451a49620
fix: flaky `test_query_cancellation_slow_store` (#2966) 2021-10-26 15:58:28 +00:00
Andrew Lamb 7cd56cbc56 Merge remote-tracking branch 'origin/main' into er/fix/flux/2691 2021-10-25 13:41:08 -04:00
Marco Neumann bc7244c48e chore: use Rust edition 2021 2021-10-25 10:58:20 +02:00
Andrew Lamb 52cf1a85b9
fix(metadata): Do not report table_names for tables that have no non-null values that match predicate (#2947)
* fix(metadata): Do not report table_names for tables that have no non-null values that match predicate

* fix: make apply_predicate_to_metadata precise

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-24 11:03:30 +00:00
Nga Tran 3c87cc3747 fix: make all test scenarios have data deleted after chunks are moved 2021-10-23 07:56:15 -04:00
Andrew Lamb 28962038c1
fix: properly handle null values in table_name plan (#2946)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-23 10:35:02 +00:00
Nga Tran a6fb7bf2b5 test: fix read_group test case to make it work with all scenarios 2021-10-22 17:22:30 -04:00
Edd Robinson eb71102d82
Merge branch 'main' into er/fix/flux/2691 2021-10-22 20:21:22 +01:00
Nga Tran e733562da8 chore: fix comment 2021-10-22 12:39:56 -04:00
Nga Tran f9dedd78da test: Make test outputs consistent with the ones we send back to client 2021-10-22 12:19:30 -04:00
Edd Robinson 44e0ef757f
Merge branch 'main' into er/fix/flux/2691 2021-10-22 13:56:27 +01:00
Edd Robinson b3242277eb test: test_field_name_plan_with_delete is fixed 2021-10-22 13:07:36 +01:00
Edd Robinson fc7ce3535d refactor: fix _value filtering for non-selector aggregates 2021-10-21 22:28:41 +01:00
Edd Robinson f8489fc774 refactor: ensure _value filtering works read_filter 2021-10-21 22:28:41 +01:00
Edd Robinson 3e687377e0 test: add test for defect #2691 2021-10-21 22:24:03 +01:00
Marco Neumann f7ca80e29f
test: ensure query cancellation (somewhat) works (#2931)
* feat: enable reconfiguration of in-use throttled store

This is handy for tests for which a part should run "normal" and another
one should be throttled/blocked.

* feat: keep track of the number of tasks within a `DedicatedExecutor`

* test: ensure query cancellation (somewhat) works

We cannot really test that query cancellation finishes all subtasks
because _tokio_ doesn't provide sufficient stats / inspection, at least
as long as we don't want to rely heavily on _tokio_ tracing. So let's at
least check that tasks from the dedicated executors are pruned properly.

For all other regressions we need to add unit tests to the affected
components. See for example:

- https://github.com/apache/arrow-datafusion/issues/1103
- https://github.com/apache/arrow-datafusion/pull/1105
- https://github.com/apache/arrow-datafusion/pull/1112
- https://github.com/apache/arrow-datafusion/pull/1121

Closes #2027.
2021-10-21 19:10:58 +00:00
Nga Tran aecbdc0468 chore: merge main to branch 2021-10-20 17:09:02 -04:00
Nga Tran 97d5760347 feat: support delete for table_names 2021-10-20 15:58:25 -04:00
Andrew Lamb ee2ca8fc32
fix(read_group): Support grouping on `_start` and `_stop` (#2918)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-20 15:16:47 +00:00
Edd Robinson 5059d9cf8a test: add _field filter coverage to window_aggregate 2021-10-20 14:59:27 +01:00
Edd Robinson 3db48d1460 test: add _field filter coverage to read_group 2021-10-20 14:59:27 +01:00
Edd Robinson f397f0b4ac test: add test for defect 2021-10-20 14:59:27 +01:00
Andrew Lamb 9974a5364c
chore(security): Replace prettytable with comfy-table (#2905)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-20 10:44:36 +00:00
Raphael Taylor-Davies ce0127a1f7
feat: MutableBatch write API (#2090) (#2724) (#2882)
* feat: MutableBatch write API (#2090) (#2724)

* chore: fix lint

* fix: handle dictionaries with unused mappings

* chore: review feedback

* chore: further review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-20 08:44:14 +00:00
Andrew Lamb b55ca06fe3
fix: read_window_aggregate overflow (#2908) 2021-10-19 19:20:39 +00:00
Nga Tran e664799593 feat: support delete for tag_values 2021-10-19 14:18:01 -04:00
kodiakhq[bot] 254a842eed
Merge branch 'main' into ntran/table_names 2021-10-19 17:53:25 +00:00
Andrew Lamb d5cffb5f54
fix(read_window_aggregates): return aggregates as integer rather than unsigned (#2906)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-19 17:47:15 +00:00
Nga Tran cabb007956 chore: Merge branch 'main' into ntran/table_names 2021-10-19 13:22:28 -04:00
Andrew Lamb a82dc6f5f0
chore: Update datafusion + arrow (#2903)
* chore: Update datafusion to latest, arrow to 6.0.0

* fix: Update tests

* fix: bubble internal error

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-19 17:14:08 +00:00
Nga Tran 716515fc5a fix: turn ignore back on for table_name tests as it is not ready yet 2021-10-19 10:38:25 -04:00
Nga Tran ea85d6478e chore: remove code changes for table_names because normal plan needs to be implemented first 2021-10-19 10:36:10 -04:00
Marco Neumann a364bdfb5f chore: remove unused `query_tests` => `chrono` dep 2021-10-19 14:45:56 +02:00
Nga Tran afa6e50c9c feat: make tag_keys work with delete 2021-10-18 15:36:19 -04:00
Andrew Lamb 30c3d1e001
test: Add read_group test for missing fields (#2881)
* test: Add read_group test

* fix: Update to use o2 table
2021-10-18 18:43:52 +00:00
Andrew Lamb f5a84122e3
feat: Support grouping by _field and _measurement (#2874)
* feat: Support grouping by _field and _measurement

* fix: clippy

* fix: doclink

* refactor: rename SeriesOrGroup --> Either

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-18 15:32:24 +00:00
Nga Tran 9244e9fc4e test: Delete tests for Influxrpc queries 2021-10-15 17:26:36 -04:00
Andrew Lamb beaf77cecf
refactor: move Series translation logic into query crate, update gRPC tests (#2852)
* refactor: move Series translation logic into query crate

* refactor: update grpc_tests to use new display

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-15 11:06:40 +00:00
Nga Tran b2d265dc51 chore: run format after accepting reviewer's suggestions 2021-10-14 17:31:42 -04:00
Nga Tran 39a556c5eb
chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-10-14 17:21:11 -04:00
Nga Tran 6a5aa00e2c chore: make the comments and names clearer 2021-10-14 17:11:37 -04:00
kodiakhq[bot] 993c6173d1
Merge branch 'main' into ntran/grpc_storage 2021-10-14 15:28:05 +00:00
Nga Tran 69d1253240 chore: Merge branch 'ntran/grpc_storage' of https://github.com/influxdata/influxdb_iox into ntran/grpc_storage 2021-10-14 11:23:29 -04:00
Nga Tran faf65f38cc refactor: address review comments 2021-10-14 11:23:20 -04:00
Nga Tran 08f1831aef
refactor: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-10-14 09:25:44 -04:00
Edd Robinson 693c878781 refactor: update test 2021-10-14 09:23:29 +01:00
Edd Robinson 75caa4aad8 test: add test duplicating defect input 2021-10-14 09:23:29 +01:00
Nga Tran 8dd9dcce01 test: verify if all scenarios are created correctly and add a few delete tests for read_filter 2021-10-13 17:21:03 -04:00
Nga Tran f85d6d5da8 chore: cleanup 2021-10-12 17:23:01 -04:00
Nga Tran 401124548c chore: ignore failed newly added test due to bug 2021-10-12 17:20:17 -04:00
Nga Tran 144ce77e39 chore: merge main to branch 2021-10-12 15:59:57 -04:00
Nga Tran 459dd46ae9 refactor: move delete tests to .sql 2021-10-12 15:49:23 -04:00
Andrew Lamb 035654b4f9
refactor: do not rebuild query_test when .sql or .expected files change (#2816)
* feat: Do not rebuild query_tests if .sql or .expected change

* feat: Add CI check

* refactor: move some sql tests to .sql files

* tests: port tests / expected results to data files

* fix: restore old name check-flatbuffers
2021-10-12 19:34:54 +00:00
Nga Tran 9b6726b99c refactor: rename to a more general name function 2021-10-12 10:34:55 -04:00
Nga Tran 0b4ae95ca4 refactor: exhaust scenarios for one-chunk test 2021-10-11 17:47:41 -04:00
Raphael Taylor-Davies b39e01f7ba
feat: migrate PersistenceWindows to TimeProvider (#2722) (#2798) 2021-10-11 20:40:00 +00:00
Nga Tran fbf5539336 chore: merge main to branch 2021-10-11 10:47:10 -04:00
Raphael Taylor-Davies afe34751e7
refactor: split out schema crate (#2781)
* refactor: split out schema crate

* chore: fix doc
2021-10-11 09:45:08 +00:00
Nga Tran f7475322a6 chore: merge main to branch, resolve conflicts, and discover an inconsistent bug 2021-10-08 15:50:46 -04:00
Nga Tran f2cdb9531f chore: cleanup 2021-10-08 14:52:15 -04:00
Nga Tran adbcd85c26 fix: fully fix 2745 2021-10-08 14:37:34 -04:00
Andrew Lamb 2072b4066e
feat: Implement support for `_measurement` predicate in gRPC plans (#2772)
* feat: Implement filtering for _measurement in general purpose gRPC plans

* docs: fixup docstrings

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-08 17:25:33 +00:00
Nga Tran 22d6f11bea fix: add cols of delete predicates into the schema of scanning columns 2021-10-07 17:37:34 -04:00
Marco Neumann 63d74be490 refactor: make `ChunkId` a UUID 2021-10-07 10:23:27 +02:00
Nga Tran de148337e8 fix: partially fix the bug by including the schema of columns in delete predicates in the schema of the IOx scan, to avoid missing columns when reading 2021-10-06 17:43:48 -04:00
Andrew Lamb efa2316626
fix: do not sort the output of read_group with no group keys (#2755) 2021-10-06 18:59:58 +00:00
Nga Tran bd93b411c7 chore: cleanup 2021-10-06 10:57:51 -04:00
kodiakhq[bot] 1aee2a49eb
Merge branch 'main' into ntran/no_use_stats 2021-10-06 14:06:01 +00:00
Nga Tran 65a02f7085
refactor: Apply suggestions from code review
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-10-06 10:04:28 -04:00
Raphael Taylor-Davies ce5b24e65d
refactor: use DateTime<Utc> in PersistenceWindows (#2722) (#2743)
* refactor: use DateTime<Utc> in PersistenceWindows (#2722)

* chore: fix benchmark

* chore: fmt

* chore: review feedback
2021-10-06 09:39:32 +00:00
Nga Tran 055e69439d test: fix auto created tests 2021-10-05 18:11:27 -04:00
Nga Tran aa64daca86 feat: disable using statistics to query data if there are soft deleted rows 2021-10-05 17:52:32 -04:00
Raphael Taylor-Davies d0929e3a34
feat: persist no chunks (#2712) (#2718)
* feat: persist no chunks (#2712)

* fix: persist partition

* fix: chunk ordering test

* chore: fix logical conflict

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-05 15:18:35 +00:00
Marco Neumann 10c1a72402 refactor: remove unused fields from `DeletePredicate` 2021-10-05 09:29:24 +02:00
Nga Tran 1856d7184c fix: make stop time inclusive 2021-10-04 14:14:16 -04:00
Nga Tran 64cd0cb3ce chore: cleanup 2021-10-04 12:43:35 -04:00
Nga Tran ba96b20b1f test: turn on tests of delete all 2021-10-04 12:28:49 -04:00
Marco Neumann 75ac6e8646 refactor: make `DeletePredicate::range` non-optional 2021-10-04 16:36:20 +02:00
Marco Neumann 5a5a929b9e refactor: introduce `DeletePredicate`
`DeletePredicate` is a simpler version of `Predicate` that is based on
IOx `DeleteExpr` instead of the full-blown DataFusion `Expr`. This will
allow us to do a couple of things (in follow-up changes):

- Order and de-duplicate delete predicates
- Normalize predicates
- Infallible serialization
- Smaller memory footprint

Note that this change only affects delete expressions. Query expressions
that are supported via the API are not changed. The query subsystem also
still uses the full-featured expressions/predicates (delete
expressions/predicates are converted to the more powerful DataFusion
version on-the-fly).
2021-10-04 16:36:20 +02:00
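The ordering and de-duplication benefit listed in commit 5a5a929b9e can be illustrated with a minimal sketch. Everything here is hypothetical: `SimpleDeletePredicate`, `TimeRange`, and `dedup_sorted` are illustrative names, not the IOx types, and the sketch only assumes that a restricted predicate can derive `Eq` and `Ord` (which a general expression tree containing floats typically cannot).

```rust
use std::collections::BTreeSet;

/// Hypothetical time range covered by a delete (not the IOx type).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct TimeRange {
    start: i64,
    end: i64,
}

/// Hypothetical restricted predicate: a time range plus `column = value`
/// pairs. Because every field is totally ordered, the derives below work.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct SimpleDeletePredicate {
    range: TimeRange,
    exprs: Vec<(String, String)>,
}

/// Order predicates and drop exact duplicates by collecting into a BTreeSet.
fn dedup_sorted(preds: Vec<SimpleDeletePredicate>) -> Vec<SimpleDeletePredicate> {
    preds
        .into_iter()
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}
```

The derived ordering compares `range` first and `exprs` second; that choice is arbitrary in this sketch and says nothing about how IOx orders its predicates.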
Edd Robinson 7ab10daa19
Merge branch 'main' into dependabot/cargo/arrow-5.5.0 2021-10-04 12:58:29 +01:00
Raphael Taylor-Davies e8eab2cc97
feat: allow compaction and persistence to return no chunk (#2664) (#2700)
* feat: allow compaction and persistence to return no chunk (#2664)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-04 10:54:47 +00:00
dependabot[bot] d1f5209869
chore(deps): bump arrow from 5.4.0 to 5.5.0
Bumps [arrow](https://github.com/apache/arrow-rs) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/5.5.0/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/5.4.0...5.5.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-04 08:55:38 +00:00
Raphael Taylor-Davies b402423e9e
feat: remove move lifecycle action (#2674)
* feat: remove move_chunk lifecycle action

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-30 16:58:05 +00:00
Nga Tran 105b63b2af
test: use integer in predicates (#2673)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-30 10:46:03 +00:00
Edd Robinson 003f72ba00
Merge branch 'main' into er/fix/read_buffer/pred_validate 2021-09-29 14:50:12 +01:00
Edd Robinson a52b86e070 fix: fallback to no predicate if it can't be validated
Closes: #1603

If a predicate cannot be executed against a read buffer chunk because of schema conflicts then fall back to applying no predicate and let the query engine apply predicates in the Filter step of the plan.
2021-09-29 14:42:56 +01:00
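The fallback described in commit a52b86e070 can be sketched roughly as follows. This is a sketch under stated assumptions, not the read buffer's code: `ColumnType`, `Predicate`, `validate`, and `predicate_for_chunk` are hypothetical names, and treating an unknown column as a validation failure is an assumption made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical column types; not the IOx or DataFusion type system.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnType {
    Integer,
    Text,
}

/// Hypothetical predicate: compare one column against a typed literal.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Predicate {
    column: String,
    literal_type: ColumnType,
}

/// Check that the predicate's literal type agrees with the chunk's schema.
fn validate(schema: &HashMap<String, ColumnType>, pred: &Predicate) -> Result<(), String> {
    match schema.get(&pred.column) {
        Some(t) if *t == pred.literal_type => Ok(()),
        Some(t) => Err(format!(
            "column {} is {:?}, predicate expects {:?}",
            pred.column, t, pred.literal_type
        )),
        // Assumption: an unknown column also counts as "cannot be validated".
        None => Err(format!("unknown column {}", pred.column)),
    }
}

/// Return the predicate to push down into the chunk. `None` means "apply no
/// predicate here" so that a later filter step can evaluate it instead.
fn predicate_for_chunk(
    schema: &HashMap<String, ColumnType>,
    pred: Predicate,
) -> Option<Predicate> {
    validate(schema, &pred).ok().map(|_| pred)
}
```

Returning `None` rather than an error keeps the query correct as long as something downstream still applies the predicate; the chunk may simply return more rows than strictly needed.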
Nga Tran 3cfea6e83f refactor: add comment 2021-09-28 17:44:49 -04:00
Nga Tran 2837aae479 test: more tests that include the repro of 2546 2021-09-28 17:42:23 -04:00
Nga Tran d7af1b8290 refactor: exhaust delete test scenarios 2021-09-28 12:31:46 -04:00
Nga Tran 4237d6dcc6 refactor: address review comments and refactor some more obvious ones 2021-09-27 21:38:00 -04:00
Nga Tran cbfa3e85af chore: Merge branch 'main' into ntran/refactor_delete_tests 2021-09-27 14:52:38 -04:00
Nga Tran ff77c53e11 chore: cleanup 2021-09-27 14:43:08 -04:00
Nga Tran ec3e1fda06 refactor: refactor all delete helper functions 2021-09-27 14:39:01 -04:00
Andrew Lamb 8e89dde85c
test: enable stack overflow test (#2618)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 14:27:08 +00:00
Andrew Lamb a55a21c644
chore: Update datafusion (#2635)
* chore: Update datafusion and sqlparser

* fix: remove STACK_SIZE workaround

* chore: update datafusion_util

* chore: update predicate

* chore: update query_tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 14:13:19 +00:00
Nga Tran c5e40b50cc refactor: delete tests to make it easier to add more tests 2021-09-24 16:35:55 -04:00
Nga Tran 0799b5ab91 test: query with selection predicate on deleted data 2021-09-23 16:54:40 -04:00
Nga Tran c35ecf257d refactor: address review comments 2021-09-23 11:14:20 -04:00
Nga Tran 713c7f4820 chore: remove empty line 2021-09-22 17:11:09 -04:00
Nga Tran 332649a8ee refactor: another cleanup due to macro -> function and add a test that exposes bug 2021-09-22 17:09:20 -04:00
Nga Tran 10097766d3 refactor: changes to use function instead of macro after merging main 2021-09-22 16:53:51 -04:00
Nga Tran 2399a932fb chore: Merge branch 'main' into ntran/more_delete_tests 2021-09-22 16:47:15 -04:00
Nga Tran 3a00d9c70a refactor: cleanup 2021-09-22 16:42:46 -04:00
Nga Tran 400ec93498 test: more delete tests 2021-09-22 16:38:27 -04:00
Marco Neumann 9a1a3f4de5 refactor: convert `run_tag_keys_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 8cc08d6bd9 refactor: convert `run_table_names_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 2408dea394 refactor: convert `run_read_window_aggregate_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann cdfe49f041 refactor: convert `run_read_group_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 1db13ec437 refactor: convert `run_tag_values_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 20655d10b6 refactor: convert `run_read_window_aggregate_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann b24be8d17d refactor: convert `run_field_columns_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 6ef8c11036 refactor: convert `run_table_schema_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 70a754a022 refactor: convert `run_sql_test_case` macro to a function
This generated less code and speeds up compilation a bit.
2021-09-22 13:46:53 +02:00
Andrew Lamb d38648952c
chore: Update datafusion (#2602)
* chore: Update datafusion + other deps

* refactor: update query crate for new async interfaces

* refactor: update server crate for new async interface

* refactor: update query_tests crate for new async interfaces

* refactor: update ioxd and server to use new async interface

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-22 10:33:25 +00:00
Nga Tran ae40d93af4 test: more delete tests 2021-09-21 18:10:12 -04:00
Nga Tran b4b33c378e test: turn all delete tests on 2021-09-21 15:23:41 -04:00
Nga Tran 2f371b6f79 refactor: address review comments 2021-09-21 14:46:24 -04:00
Nga Tran 93551bdd1e fix: all chunks now are applied delete predicates during scan 2021-09-21 12:17:59 -04:00
Nga Tran 85989cc8a3 test: add more delete tests and test scenarios 2021-09-20 18:18:08 -04:00
kodiakhq[bot] 77d84ca5ab
Merge branch 'main' into crepererum/chunk_id 2021-09-20 13:39:05 +00:00
Marco Neumann cef5aeee52 refactor: introduce `ChunkId` type 2021-09-20 13:10:41 +02:00
Nga Tran 364d245eae feat: apply negated delete predicates during scan 2021-09-17 16:20:42 -04:00
Nga Tran 243cc1f88c fix: compile error after merge from main 2021-09-16 17:56:33 -04:00
Nga Tran 6cfeeb352b refactor: address review comments 2021-09-16 17:21:06 -04:00
Nga Tran 472e8a9e49 fix: fix compile error 2021-09-16 15:02:18 -04:00
Nga Tran 2bae14df60 test: delete tests 2021-09-16 14:51:26 -04:00
Raphael Taylor-Davies c66095cad1
feat: remove metrics crate (#2552) 2021-09-15 19:43:33 +00:00
Marco Neumann bfaba78dc3 refactor: move `predicate` into its own crate
Two reasons:

1. I wanna decouple `parquet_file` from `query` (nearly done, needs a
   small follow-up PR).
2. `predicate` will have more and more features (like serialization)
   which justifies a new home
2021-09-14 17:13:02 +02:00
Marco Neumann 2591bcac13 test: ensure chunk IDs are as documented 2021-09-14 13:00:55 +02:00
Marco Neumann 9a60af7fa3 docs: explain `ChunkOrder` query test scenario 2021-09-14 13:00:55 +02:00
Marco Neumann 1b788732da fix: order chunks correctly during query processing
The query processing was implicitly relying on the order provided by the
catalog. This had two issues:

- this ordering was not defined in the API contract (neither via docs
  nor via typing)
- the order was based on chunk IDs which is not adequate in some cases
  (e.g. when chunks are created while a persistence operation is in
  progress)

Now we explicitly sort chunks by `(order, ID)`.

Fixes #1963.
2021-09-14 13:00:55 +02:00
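The explicit `(order, ID)` sort from commit 1b788732da can be sketched in a few lines. The `Chunk` struct and `sort_chunks` function are hypothetical stand-ins with illustrative field names and integer types, not the IOx catalog types.

```rust
/// Hypothetical stand-in for a chunk; field names and types are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Chunk {
    order: u32,
    id: u32,
}

/// Sort chunks by `(order, id)` so the result no longer depends on whatever
/// order the caller (for example a catalog listing) happened to return.
fn sort_chunks(mut chunks: Vec<Chunk>) -> Vec<Chunk> {
    // Tuples compare lexicographically: `order` first, `id` as tie-breaker.
    chunks.sort_by_key(|c| (c.order, c.id));
    chunks
}
```

Using a tuple key makes the tie-breaking rule visible at the call site, which is one way to put the ordering into the code rather than leaving it implicit.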
Andrew Lamb 5eef76c868
chore: Update dependencies (including datafusion) (#2521)
* chore: Update datafusion deps to pre-release

* refactor: Update IOx to use new datafusion Statistics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 21:30:44 +00:00
Raphael Taylor-Davies 20143e4f4e
feat: migrate chunk pruning metrics (#2516) 2021-09-13 13:13:47 +00:00
Andrew Lamb eb72799f0d
chore: Update datafusion (and arrow et al) dependencies (#2509)
* chore: update datafusion and other deps

* fix: Update InfluxRPC frontend with new op types

* fix: Update test output for new column names

* fix: typos and unintended changes

* fix: Update query_tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-10 17:57:25 +00:00
dependabot[bot] b67610d9b9
chore(deps): bump tokio from 1.10.1 to 1.11.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.10.1 to 1.11.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.10.1...tokio-1.11.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 09:11:38 +00:00
Marco Neumann 3c968ac092 feat: correctly account MUB sizes
Fixes #1565.
2021-09-03 09:15:49 +02:00
Marco Neumann 79ad48ac3a chore: rename "labels" to "attributes" 2021-08-31 11:31:15 +02:00
Raphael Taylor-Davies e3e801d29a
feat: propagate span context into storage RPC queries (#2407)
* feat: propagate span context into storage RPC queries

* refactor: create ExecutionContextProvider trait

* chore: cleanup imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
Andrew Lamb f975baba6b
chore: Update datafusion + other deps again (get baseline metrics) (#2422)
* chore: Update datafusion reference

* chore: cargo update

* fix: update explain tests to show Union

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 13:13:00 +00:00
kodiakhq[bot] b1ecf1bfed
Merge branch 'main' into crepererum/job_start_time_in_system_table 2021-08-26 08:04:10 +00:00
Andrew Lamb ddf6c6362e
chore: update DataFusion again (#2411)
* chore: update datafusion ref

* chore: run cargo update

* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
Marco Neumann 558aa54aa3 feat: add start time to `operations` system table 2021-08-26 10:00:29 +02:00
Edd Robinson 11e88877f4 fix: correct size estimation of RLE encoding 2021-08-25 12:03:04 +01:00
Marco Neumann 2ad9843e5f feat: make `RLE` a bit smaller by capacity-based allocation
For some demo data this reduced the overall chunk size from

195049367 bytes
to
191088095 bytes
2021-08-25 10:22:43 +02:00
Raphael Taylor-Davies f7792aafe6
feat: query tracing (#2273) (#2391)
* feat: query tracing (#2273)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 17:35:59 +00:00
Raphael Taylor-Davies a6c9cc2bf2
refactor: rework exec module (#2384)
* refactor: rework exec module

* chore: update docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 08:39:54 +00:00
Raphael Taylor-Davies 0946ffe916
refactor: reuse IOxExecutionContext (#2373)
* refactor: reuse IOxExecutionContext

* fix: orphaned comment

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 15:47:15 +00:00
Edd Robinson b9f09fce49 feat: improve bitset size estimation 2021-08-17 22:54:22 +01:00
Edd Robinson 1daa30cc7d fix: include enum in sizing 2021-08-17 22:54:22 +01:00
Edd Robinson 311d36d776 refactor: include capacity in Read Buffer chunk size 2021-08-13 11:57:46 +01:00
Edd Robinson fa8da19c45 refactor: expose enc size API into column 2021-08-13 11:57:46 +01:00
Edd Robinson c68bbb6309 test: update test 2021-08-12 15:05:47 +01:00
Andrew Lamb bb8021d9fd
fix: "Can not convert index to usize in dictionary of type creating group by value Int32" (#2151)
* test: add reproducer for index error

* chore: update datafusion
2021-08-02 12:20:41 +00:00
Carol (Nichols || Goulding) 9d15798288 fix: Address or allow Clippy warnings new with Rust 1.54 2021-07-30 09:59:59 -04:00
Andrew Lamb e6cbd4d217
feat: Use statistics for count(*) queries (#2038)
* feat: Use statistics for count(*) queries

* docs: fix mangled comment

* refactor: rewrite to use fold

* refactor: use sort_by_cached_key

* fix: set null count properly

* fix: fmt + clippy
2021-07-28 19:39:41 +00:00
Andrew Lamb 3ea84c6be4
feat: expose null_counts in system.chunk_columns (#2105)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-27 11:05:23 +00:00
Andrew Lamb 5fb3e00f2a
fix: Properly record total_count and null_count in statistics (#2103)
* fix: Properly record total_count and null_count in statistics

* fix: fix statistics calculation in mutable_buffer

* refactor: expose null counts in read_buffer

* refactor: expose null_count in parquet_file

* fix: update server crate tests

* fix: update query_tests tests

* docs: tweak comments

* refactor: Use storage_stats rather than adding `null_count`

* refactor: rename test data field for clarity

* fix: fixup merge conflicts

* refactor: rename initial_non_null_count to initial_total_count

* refactor: calculate null_count as row_count - to_add
2021-07-26 18:13:36 +00:00
Marco Neumann 6ef3680554 feat: collect replay plan during catalog loading 2021-07-23 09:23:06 +02:00
Andrew Lamb 38261cc7ac
test: add tests using `to_timestamp()` as predicates in SQL (#2099)
* test: add tests using `to_timestamp()` as predicates in SQL

* fix: cleanup redundancy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 21:06:52 +00:00
Andrew Lamb 01c79f1a1a
fix: Print all timestamps using RFC3339 format (#2098)
* fix: Use IOx pretty printer rather than arrow pretty printer

* chore: update tests in the query crate

* chore: update influxdb_iox tests

* chore: Update end to end tests

* chore: update query_tests

* chore: update mutable_buffer tests

* refactor: update parquet_file tests

* refactor: update db tests

* chore: update kafka integration test output

* fix: merge conflict
2021-07-22 19:04:52 +00:00
Raphael Taylor-Davies 20d06e3225
feat: include more information in system.operations table (#2097)
* feat: include more information in system.operations table

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 17:16:09 +00:00
Andrew Lamb 387667330a
chore: Update datafusion deps (#2073)
* chore: Update datafusion deps

* fix: update tests
2021-07-21 08:27:03 +00:00
Raphael Taylor-Davies 091837420f
feat: add PersistenceWindows system table (#2030) (#2062)
* feat: add PersistenceWindows system table (#2030)

* chore: update log

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 13:10:57 +00:00
Andrew Lamb 1c16988a51
chore: Update datafusion references (#2056) 2021-07-19 18:09:06 +00:00
Andrew Lamb 4da8a16c18
chore: update to arrow 5.0 and master datafusion (#2049)
* chore: update to arrow 5.0 and master datafusion

* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Marco Neumann 2263189e09 test: make TestDb lifecycle better for testing
This is a leftover from #1972.
2021-07-19 09:50:44 +02:00
Marco Neumann 1ef2bc1887 refactor: `Db::{write_chunk_to_object_store => Db::persist_partition}`
The previous method allowed to persist any chunk -- even ones that
should not be persisted yet and w/o any order of persistence. That will
break our persistence windows. So instead offer a sane higher-level
interface that can trigger persistence of a partition within the
boundaries of the lifecycle rules. This needs some adjustments for our
test suite.
2021-07-16 12:07:58 +02:00
Andrew Lamb 3fd6430fb6
fix: rename `estimated_bytes` to `memory_bytes` and expose `object_store_bytes` in ChunkSummary and system.chunks (#2017)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 16:00:24 +00:00
Andrew Lamb 3bb32594ba
refactor: rename end-to-end.rs to end_to_end.rs (#2015)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 13:50:32 +00:00
Marco Neumann d89fca00be feat: persist "drop chunk" 2021-07-15 12:07:56 +02:00
Nga Tran 0b1f2b1fd0 chore: merge main to branch 2021-07-14 16:17:14 -04:00
Nga Tran 552e3fb691 fix: Padd stats compute deterministic order of sort key and update tests that got changed by the use of sort key 2021-07-14 14:06:41 -04:00
Andrew Lamb 4800b36949 chore: Update IOx to a pre-release version of arrow and datafusion to test out performance improvement 2021-07-13 15:44:57 -04:00
Andrew Lamb 0164cabbf3
refactor: do not use DataFrame DataFusion API / stop optimizing twice (#1982)
* refactor: do not use DataFrame DataFusion API

* fix: update output to reflect not running optimizer twice

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 16:29:43 +00:00
kodiakhq[bot] f26f844ed2
Merge branch 'main' into ntran/use_sortkey 2021-07-12 18:12:47 +00:00
Nga Tran 7b7a60993d feat: consider time as a special key 2021-07-09 18:54:22 -04:00
Marco Neumann 676034b4ae docs: explain why the path placeholder is there 2021-07-09 09:45:13 +02:00
Marco Neumann 09e611deb7 refactor: lift query schema generation up to caller
No longer scan chunks during query planning to determine the schema
(except for the lifetime jobs where we have a good reason to do so).
Instead pass the schema down from whoever is triggering the query.
For real SQL queries, we then just use the table-wide schemas
introduced in #1913.

Apart from avoiding schema merges, we also no longer crash
when no chunks are left in the table (aka columns are present but all
rows are gone).

Fixes #1768.
Fixes #1884.
2021-07-09 09:24:21 +02:00
Marco Neumann 6ac1420335 test: fix out dir for query tests 2021-07-09 09:16:28 +02:00
kodiakhq[bot] c8126784a8
Merge branch 'main' into ntran/avoid_sort_in_scan 2021-07-08 20:22:18 +00:00
Nga Tran da6249a4df fix: address reviewers' comments and also fix a bug they discovered 2021-07-08 15:54:54 -04:00
Andrew Lamb dd3eff7748
refactor: Always use `row_count` for count of rows in system.* tables (#1937) 2021-07-08 19:28:11 +00:00
Andrew Lamb f670224ea1
chore: Reduce output spew during query tests (#1926)
* chore: Reduce output spew during query tests

* docs: Update query_tests/src/runner.rs

Co-authored-by: Edd Robinson <me@edd.io>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 11:06:24 +00:00
Andrew Lamb 7602bde850
chore: Update datafusion deps (#1799)
* chore: Update datafusion deps + rework code

* refactor: remove workaround as it has been contributed upstream

* fix: Update query/src/exec/split.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 10:58:32 +00:00
Nga Tran 5c722af0fa fix: remove comments 2021-07-07 16:50:53 -04:00
Nga Tran d3c4f8c249 fix: store sort key correctly in the schema. Update tests to reflect it 2021-07-07 15:55:23 -04:00
Edd Robinson 2ec9151b32
Merge branch 'main' into er/fix/read_buffer/predicate 2021-07-06 13:35:04 +01:00
Marco Neumann 8387eaed27 test: do not recompile `query_tests` when test content changes
There is no need to recompile the entire `query_tests` crate when the
CONTENT (not the SET) of the test cases changes, e.g. due to new
optimizations, datafusion upgrades, query additions, etc. We now check
if `cases.rs` really changed before touching it, so that Cargo can rely
on the file's mtime.
2021-07-05 15:30:10 +02:00
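The "check before touching" idea in commit 8387eaed27 above can be sketched roughly as follows. This is a minimal illustration, not the code from that commit: the helper name `write_if_changed` and its return value are assumptions made for the example. The point is that an unchanged file is never rewritten, so its mtime stays stable and Cargo's fingerprinting does not see it as stale.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: writes `contents` to `path` only when the file is
/// missing, unreadable, or currently holds different contents.
///
/// Returns `Ok(true)` if the file was written and `Ok(false)` if it was left
/// untouched (so its mtime is preserved).
fn write_if_changed(path: &Path, contents: &str) -> io::Result<bool> {
    match fs::read_to_string(path) {
        // Identical contents: do nothing, keep the old mtime.
        Ok(existing) if existing == contents => Ok(false),
        // Missing file, read error, or different contents: (re)write it.
        _ => {
            fs::write(path, contents)?;
            Ok(true)
        }
    }
}
```

A build script could call such a helper with the freshly generated test-case source instead of writing the file unconditionally.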
Marco Neumann d6cff911b6 test: ensure that query tests don't rebuild all the time
Beforehand:

```text
❯ env CARGO_LOG=cargo::core::compiler::fingerprint=info cargo test -p query_tests
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint] stale: changed "/home/mneumann/src/influxdb_iox/query_tests/cases"
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]           (vs) "/home/mneumann/src/influxdb_iox/target/debug/build/query_tests-0e8f741dfb84437f/output"
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]                FileTime { seconds: 1625474716, nanos: 436081357 } != FileTime { seconds: 1625474752, nanos: 52625167 }
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/Test/TargetInner { ..: lib_target("query_tests", ["lib"], "/home/mneumann/src/influxdb_iox/query_tests/src/lib.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]     err: current filesystem status shows we're outdated
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/RunCustomBuild/TargetInner { ..: custom_build_target("build-script-build", "/home/mneumann/src/influxdb_iox/query_tests/build.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]     err: current filesystem status shows we're outdated
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/Build/TargetInner { ..: lib_target("query_tests", ["lib"], "/home/mneumann/src/influxdb_iox/query_tests/src/lib.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO  cargo::core::compiler::fingerprint]     err: current filesystem status shows we're outdated
   Compiling query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)
```

The issue is that both the input and the test output files are located
under `cases/`. `build.rs` used `cargo:rerun-if-changed=cases`, which per
the Cargo docs will scan ALL files in that directory. Note that the normal
`exclude` directive in `Cargo.toml` does NOT work, see
https://github.com/rust-lang/cargo/issues/4587 .

So we need to split input and output files into separate directories
(`cases/{in,out}`).
2021-07-05 15:30:10 +02:00
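The directory split described in commit d6cff911b6 above means a build script can point Cargo at the input directory only. The snippet below is a hedged sketch of that idea, not the repository's actual `build.rs`: the function name `rerun_directive` is an assumption for illustration. `cargo:rerun-if-changed=<path>` itself is a standard Cargo build-script directive, and when the path is a directory Cargo scans everything beneath it, which is why the outputs must live elsewhere.

```rust
use std::path::Path;

/// Hypothetical helper: builds the directive that asks Cargo to rerun the
/// build script only when something under `<cases_dir>/in` changes.
/// Files under `<cases_dir>/out` are deliberately not mentioned, so writing
/// test output there does not mark the crate as stale.
fn rerun_directive(cases_dir: &Path) -> String {
    format!(
        "cargo:rerun-if-changed={}",
        cases_dir.join("in").display()
    )
}
```

In a build script the returned string would be printed to stdout with `println!`, which is how Cargo picks up such directives.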
Raphael Taylor-Davies 5b00bc69e6
refactor: use Arc<Db> in lifecycle actions (#1873)
* refactor: use Arc<Db> in lifecycle actions

* chore: review feedback
2021-07-01 19:56:33 +00:00
Andrew Lamb 56c8c8d428
feat: Use separate executor for queries and compactions/moves (#1870)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:47:50 +00:00
Andrew Lamb 07826306ed
fix: Always deduplicate data prior to insertion into the ReadBuffer (#1863)
* fix: mark ReadBuffer as always deduplicated

* fix: Use compact plans during merge

* docs: Update server/src/db/chunk.rs

Co-authored-by: Nga Tran <ntran@influxdata.com>

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:23:37 +00:00
Edd Robinson 8fc07cf4f0 fix: correctly evaluate exprs matching disjoint rows 2021-07-01 16:05:09 +01:00
Andrew Lamb cfa06e1497
chore: Add query tests for compacted chunks (#1861)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-30 20:59:29 +00:00
Andrew Lamb 9e1723620c
refactor: rename load_chunk_to_read_buffer to move_chunk_to_read_buffer (#1857)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-30 16:53:18 +00:00
Andrew Lamb fef160e24f
feat: Implement data driven query_tests and port explain tests (#1814)
* feat: Implement data driven query testing and port explain tests

* fix: do not fmt the auto generated cases

* refactor: split setup and parser into separate modules

* refactor: Add log to runner, add end to end tests

* docs: fix comments
2021-06-29 16:09:51 +00:00
Carol (Nichols || Goulding) 93881da016 feat: Make Write Buffer store_entry async
In preparation for the Kafka write buffer implementation needing to call
async functions.
2021-06-23 10:48:18 -04:00
Andrew Lamb 5362c7c924
feat: enable query deduplication (#1762) 2021-06-21 18:49:04 +00:00
Raphael Taylor-Davies ea04ce40dc
feat: transactional lifecycle API (#1753)
* feat: transactional lifecycle API

* chore: remove redundant upgrade

* feat: lifecycle error propagation

* chore: add usage doctest

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-21 13:09:53 +00:00
Andrew Lamb ab052c0501
fix: fix flaky test by updating datafusion dep (#1758)
* chore: update DataFusion dependencies

* fix: Re-enable previously flaking tests

This reverts commit c63ad0ea31.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-18 20:17:18 +00:00
Andrew Lamb de67bd3efe
refactor: Remove PartitionChunk::table_schema (#1756)
* refactor: Remove PartitionChunk::table_schema

* docs: update comments
2021-06-18 16:13:16 +00:00
Andrew Lamb 1c13d676b4
refactor: Rename query::PartitionChunk --> query::QueryChunk (#1754) 2021-06-18 13:24:09 +00:00
Marko Mikulicic c63ad0ea31
chore: Ignore flaky test (#1749) 2021-06-17 14:51:19 +00:00
Andrew Lamb 856751deec
feat: Lifecycle manager unloads, rather than drops, chunks when soft limit is hit (#1701)
* feat: unload chunks from memory rather than dropping them

* docs: Update server/src/db/lifecycle.rs

Co-authored-by: Marco Neumann <marco@crepererum.net>

* docs: Update comment wording

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-14 13:14:39 +00:00
Nga Tran 11729b9aa7
test: select non-key from 2 chunks with different key/tag sets (#1703)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-11 18:52:36 +00:00
Raphael Taylor-Davies 11b25b3aaf
refactor: swap order of partition and table in in-memory catalog (#1678)
* refactor: swap order of partition and table in in-memory catalog

* chore: review feedback

* chore: validate panic message

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-10 16:40:30 +00:00
Edd Robinson 5336acc36f test: add test case for != predicate 2021-06-08 15:40:51 +01:00
Andrew Lamb e9834a907c
feat: Prune on boolean column predicates too (#1629)
* chore: update deps to get latest DataFusion

* fix: enable boolean pruning tests

* fix: update explain plan tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-07 16:51:30 +00:00
Andrew Lamb 42f26b609b
refactor: Move `query_tests` and `server_benchmarks` into their own crate --> smaller `server` (#1628)
* refactor: Separate query_tests into its own crate

* fix: references

* refactor: break out server benchmarks

* fix: Update query_tests/src/lib.rs

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

2021-06-04 17:31:19 +00:00