Commit Graph

215 Commits (f6e2472168e7f5f00a046184ce84f49f0db3381f)

Author SHA1 Message Date
Nga Tran 08f1831aef
refactor: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-10-14 09:25:44 -04:00
Edd Robinson 693c878781 refactor: update test 2021-10-14 09:23:29 +01:00
Edd Robinson 75caa4aad8 test: add test duplicating defect input 2021-10-14 09:23:29 +01:00
Nga Tran 8dd9dcce01 test: verify if all scenarios are created correctly and add a few delete tests for read_filter 2021-10-13 17:21:03 -04:00
Nga Tran f85d6d5da8 chore: cleanup 2021-10-12 17:23:01 -04:00
Nga Tran 401124548c chore: ignore failed newly added test due to bug 2021-10-12 17:20:17 -04:00
Nga Tran 144ce77e39 chore: merge main to branch 2021-10-12 15:59:57 -04:00
Nga Tran 459dd46ae9 refactor: move delete tests to .sql 2021-10-12 15:49:23 -04:00
Andrew Lamb 035654b4f9
refactor: do not rebuild query_test when .sql or .expected files change (#2816)
* feat: Do not rebuild query_tests if .sql or .expected change

* feat: Add CI check

* refactor: move some sql tests to .sql files

* tests: port tests / expected results to data files

* fix: restore old name check-flatbuffers
2021-10-12 19:34:54 +00:00
Nga Tran 9b6726b99c refactor: rename to a more general name function 2021-10-12 10:34:55 -04:00
Nga Tran 0b4ae95ca4 refactor: exhaust scenarios for one-chunk test 2021-10-11 17:47:41 -04:00
Raphael Taylor-Davies b39e01f7ba
feat: migrate PersistenceWindows to TimeProvider (#2722) (#2798) 2021-10-11 20:40:00 +00:00
Nga Tran fbf5539336 chore: merge main to branch 2021-10-11 10:47:10 -04:00
Raphael Taylor-Davies afe34751e7
refactor: split out schema crate (#2781)
* refactor: split out schema crate

* chore: fix doc
2021-10-11 09:45:08 +00:00
Nga Tran f7475322a6 chore: merge main to branch, resolve conflicts, and discover an inconsitent bug 2021-10-08 15:50:46 -04:00
Nga Tran f2cdb9531f chore: cleanup 2021-10-08 14:52:15 -04:00
Nga Tran adbcd85c26 fix: fully fix 2745 2021-10-08 14:37:34 -04:00
Andrew Lamb 2072b4066e
feat: Implement support for `_measurement` predicate in gRPC plans (#2772)
* feat: Implement filtering for _measurement in general purpose gRPC plans

* docs: fixup docstrings

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-08 17:25:33 +00:00
Nga Tran 22d6f11bea fix: add cols of delete predicates into the schema of scanning columns 2021-10-07 17:37:34 -04:00
Marco Neumann 63d74be490 refactor: make `ChunkId` a UUID 2021-10-07 10:23:27 +02:00
Nga Tran de148337e8 fix: half way fix the bug to inlcude schema of column in delete predicate into the schema of IOx scan to avoid missing reading columns 2021-10-06 17:43:48 -04:00
Andrew Lamb efa2316626
fix: do not sort the output of read_group with no group keys (#2755) 2021-10-06 18:59:58 +00:00
Nga Tran bd93b411c7 chore: cleanup 2021-10-06 10:57:51 -04:00
kodiakhq[bot] 1aee2a49eb
Merge branch 'main' into ntran/no_use_stats 2021-10-06 14:06:01 +00:00
Nga Tran 65a02f7085
refactor: Apply suggestions from code review
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-10-06 10:04:28 -04:00
Raphael Taylor-Davies ce5b24e65d
refactor: use DateTime<Utc> in PersistenceWindows (#2722) (#2743)
* refactor: use DateTime<Utc> in PersistenceWindows (#2722)

* chore: fix benchmark

* chore: fmt

* chore: review feedback
2021-10-06 09:39:32 +00:00
Nga Tran 055e69439d test: fix auto created tests 2021-10-05 18:11:27 -04:00
Nga Tran aa64daca86 feat: dDisable using statistics to query data if there are soft deleted rows 2021-10-05 17:52:32 -04:00
Raphael Taylor-Davies d0929e3a34
feat: persist no chunks (#2712) (#2718)
* feat: persist no chunks (#2712)

* fix: persist partition

* fix: chunk ordering test

* chore: fix logical conflict

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-05 15:18:35 +00:00
Marco Neumann 10c1a72402 refactor: remove unused fields from `DeletePredicate` 2021-10-05 09:29:24 +02:00
Nga Tran 1856d7184c fix: make stop time inclusive 2021-10-04 14:14:16 -04:00
Nga Tran 64cd0cb3ce chore: cleanup 2021-10-04 12:43:35 -04:00
Nga Tran ba96b20b1f test: turn on tests of delete all 2021-10-04 12:28:49 -04:00
Marco Neumann 75ac6e8646 refactor: make `DeletePredicate::range` non-optional 2021-10-04 16:36:20 +02:00
Marco Neumann 5a5a929b9e refactor: introduce `DeletePredicate`
`DeletePredicate` is a simpler version of `Predicate` that is based on
IOx `DeleteExpr` instead of the full-blown DataFusion `Expr`. This will
allow us to do a couple of things (in follow-up changes):

- Order and de-duplicate delete predicates
- Normalize predicates
- Infallible serialization
- Smaller memory footprint

Note that this change only affects delete expressions. Query expressions
that are supported via the API are not changed. The query subsystem also
still uses the full-featured expressions/predicates (delete
expressions/predicates are converted to the more powerful DataFusion
version on-the-fly).
2021-10-04 16:36:20 +02:00
Edd Robinson 7ab10daa19
Merge branch 'main' into dependabot/cargo/arrow-5.5.0 2021-10-04 12:58:29 +01:00
Raphael Taylor-Davies e8eab2cc97
feat: allow compaction and persistence to retun no chunk (#2664) (#2700)
* feat: allow compaction and persistence to retun no chunk (#2664)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-04 10:54:47 +00:00
dependabot[bot] d1f5209869
chore(deps): bump arrow from 5.4.0 to 5.5.0
Bumps [arrow](https://github.com/apache/arrow-rs) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/5.5.0/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/5.4.0...5.5.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-04 08:55:38 +00:00
Raphael Taylor-Davies b402423e9e
feat: remove move lifecycle action (#2674)
* feat: remove move_chunk lifecycle action

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-30 16:58:05 +00:00
Nga Tran 105b63b2af
test: use integer in predicates (#2673)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-30 10:46:03 +00:00
Edd Robinson 003f72ba00
Merge branch 'main' into er/fix/read_buffer/pred_validate 2021-09-29 14:50:12 +01:00
Edd Robinson a52b86e070 fix: fallback to no predicate if it can't be validated
Closes: #1603

If a predicate cannot be executed against a read buffer chunk because of schema conflicts then fall back to applying no predicate and let the query engine apply predicates in the Filter step of the plan.
2021-09-29 14:42:56 +01:00
Nga Tran 3cfea6e83f refactor: add comment 2021-09-28 17:44:49 -04:00
Nga Tran 2837aae479 test: more tests that inlcude the repro of 2546 2021-09-28 17:42:23 -04:00
Nga Tran d7af1b8290 refactor: exhaust delete test scenarios 2021-09-28 12:31:46 -04:00
Nga Tran 4237d6dcc6 refactor: address review comments and refactor some more obvios ones 2021-09-27 21:38:00 -04:00
Nga Tran cbfa3e85af chore: Merge branch 'main' into ntran/refactor_delete_tests 2021-09-27 14:52:38 -04:00
Nga Tran ff77c53e11 chore: cleanup 2021-09-27 14:43:08 -04:00
Nga Tran ec3e1fda06 refactor: refactor all delete helper functions 2021-09-27 14:39:01 -04:00
Andrew Lamb 8e89dde85c
test: enable stack overflow test (#2618)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 14:27:08 +00:00
Andrew Lamb a55a21c644
chore: Update datafusion (#2635)
* chore: Update datafusion and sqlparser

* fix: remove STACK_SIZE workaround

* chore: update datafusion_util

* chore: update predicate

* chore: update query_tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 14:13:19 +00:00
Nga Tran c5e40b50cc refactor: delete tests to make it easier to add more tests 2021-09-24 16:35:55 -04:00
Nga Tran 0799b5ab91 test: query with selection predicate on deleted data 2021-09-23 16:54:40 -04:00
Nga Tran c35ecf257d refactor: address review comments 2021-09-23 11:14:20 -04:00
Nga Tran 713c7f4820 chore: remove empty line 2021-09-22 17:11:09 -04:00
Nga Tran 332649a8ee refactor: another cleanup due to marco -> function and add a test that exposes bug 2021-09-22 17:09:20 -04:00
Nga Tran 10097766d3 refactor: changes to use function instead of marco after merging main 2021-09-22 16:53:51 -04:00
Nga Tran 2399a932fb chore: Merge branch 'main' into ntran/more_delete_tests 2021-09-22 16:47:15 -04:00
Nga Tran 3a00d9c70a refactor: cleanup 2021-09-22 16:42:46 -04:00
Nga Tran 400ec93498 test: more delete tests 2021-09-22 16:38:27 -04:00
Marco Neumann 9a1a3f4de5 refactor: convert `run_tag_keys_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 8cc08d6bd9 refactor: convert `run_table_names_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 2408dea394 refactor: convert `run_read_window_aggregate_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann cdfe49f041 refactor: convert `run_read_group_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 1db13ec437 refactor: convert `run_tag_values_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 20655d10b6 refactor: convert `run_read_window_aggregate_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann b24be8d17d refactor: convert `run_field_columns_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 6ef8c11036 refactor: convert `run_table_schema_test_case` to a function 2021-09-22 13:46:54 +02:00
Marco Neumann 70a754a022 refactor: convert `run_sql_test_case` macro to a function
This generated less code and speeds up compilation a bit.
2021-09-22 13:46:53 +02:00
Andrew Lamb d38648952c
chore: Update datafusion (#2602)
* chore: Update datafusion + other deps

* refactor: update query crate for new async interfaces

* refactor: update server crate for new async interface

* refactor: update query_tests crate for new async interfaces

* refactor: update ioxd and server to use new async interface

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-22 10:33:25 +00:00
Nga Tran ae40d93af4 test: more delete tests 2021-09-21 18:10:12 -04:00
Nga Tran b4b33c378e test: turn all delete tests on 2021-09-21 15:23:41 -04:00
Nga Tran 2f371b6f79 refactor: address review comments 2021-09-21 14:46:24 -04:00
Nga Tran 93551bdd1e fix: all chunks now are applied delete predicates during scan 2021-09-21 12:17:59 -04:00
Nga Tran 85989cc8a3 test: add more delete tests and test scenarios 2021-09-20 18:18:08 -04:00
kodiakhq[bot] 77d84ca5ab
Merge branch 'main' into crepererum/chunk_id 2021-09-20 13:39:05 +00:00
Marco Neumann cef5aeee52 refactor: introduce `ChunkId` type 2021-09-20 13:10:41 +02:00
Nga Tran 364d245eae feat: apply negated delete predicates during scan 2021-09-17 16:20:42 -04:00
Nga Tran 243cc1f88c fix: compile error after merge from main 2021-09-16 17:56:33 -04:00
Nga Tran 6cfeeb352b refactor: address review comments 2021-09-16 17:21:06 -04:00
Nga Tran 472e8a9e49 fix: fix compile error 2021-09-16 15:02:18 -04:00
Nga Tran 2bae14df60 test: delete tests 2021-09-16 14:51:26 -04:00
Raphael Taylor-Davies c66095cad1
feat: remove metrics crate (#2552) 2021-09-15 19:43:33 +00:00
Marco Neumann bfaba78dc3 refactor: move `predicate` into its own crate
Two reasons:

1. I wanna decouple `parquet_file` from `query` (nearly done, needs a
   small follow-up PR).
2. `predicate` will have more and more features (like serialization)
   which justifies a new home
2021-09-14 17:13:02 +02:00
Marco Neumann 2591bcac13 test: ensure chunk IDs are as documented 2021-09-14 13:00:55 +02:00
Marco Neumann 9a60af7fa3 docs: explain `ChunkOrder` query test scenario 2021-09-14 13:00:55 +02:00
Marco Neumann 1b788732da fix: order chunks correctly during query processing
The query processing was implicitly relying on the order provided by the
catalog. This had two issues:

- this ordering was not defined in the API contract (neither via docs
  nor via typing)
- the order was based on chunk IDs which is not adequate in some cases
  (e.g. when chunks are created while a persistence operations is in
  progress)

Now we explicitly sort chunks by `(order, ID)`.

Fixes #1963.
2021-09-14 13:00:55 +02:00
Andrew Lamb 5eef76c868
chore: Update dependencies (including datafusion) (#2521)
* chore: Update datafusion deps to pre-release

* refactor: Update IOx to use new datafusion Statistics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 21:30:44 +00:00
Raphael Taylor-Davies 20143e4f4e
feat: migrate chunk pruning metrics (#2516) 2021-09-13 13:13:47 +00:00
Andrew Lamb eb72799f0d
chore: Update datafusion (and arrow et al) dependencies (#2509)
* chore: update datafusion and other deps

* fix: Update InfluxRPC frontend with new op types

* fix: Update test output for new column names

* fix: typos and unintended changes

* fix: Update query_tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-10 17:57:25 +00:00
dependabot[bot] b67610d9b9
chore(deps): bump tokio from 1.10.1 to 1.11.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.10.1 to 1.11.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.10.1...tokio-1.11.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 09:11:38 +00:00
Marco Neumann 3c968ac092 feat: correctly account MUB sizes
Fixes #1565.
2021-09-03 09:15:49 +02:00
Marco Neumann 79ad48ac3a chore: rename "labels" to "attributes" 2021-08-31 11:31:15 +02:00
Raphael Taylor-Davies e3e801d29a
feat: propagate span context into storage RPC queries (#2407)
* feat: propagate span context into storage RPC queries

* refactor: create ExecutionContextProvider trait

* chore: cleanup imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
Andrew Lamb f975baba6b
chore: Update datafusion + other deps again (get baseline metrics) (#2422)
* chore: Update datafusion reference

* chore: cargo update

* fix: update explain tests to show Union

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 13:13:00 +00:00
kodiakhq[bot] b1ecf1bfed
Merge branch 'main' into crepererum/job_start_time_in_system_table 2021-08-26 08:04:10 +00:00
Andrew Lamb ddf6c6362e
chore: update DataFusion again (#2411)
* chore: update datafusion ref

* chore: run cargo update

* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
Marco Neumann 558aa54aa3 feat: add start time to `operations` system table 2021-08-26 10:00:29 +02:00
Edd Robinson 11e88877f4 fix: correct size estimation of RLE encoding 2021-08-25 12:03:04 +01:00
Marco Neumann 2ad9843e5f feat: make `RLE` a bit smaller by capacity-based allocation
For some demo data this reduced the overall chunk size from

195049367 bytes
to
191088095 bytes
2021-08-25 10:22:43 +02:00