Nga Tran
fb639ee54f
feat: add UnionExec on top of the scan activities
2021-06-11 07:06:08 -04:00
Andrew Lamb
13dd4b23fd
fix: make pruning debug log less confusing ( #1684 )
2021-06-10 18:35:04 +00:00
kodiakhq[bot]
16b268402e
Merge branch 'main' into ntran/dedup_merge_exec
2021-06-10 17:13:49 +00:00
Nga Tran
46d4ab1f2a
refactor: address review comments
2021-06-10 13:13:02 -04:00
Marco Neumann
7b1106ff64
chore: enforce `clippy::future_not_send` for `query`
2021-06-10 09:48:35 +02:00
Nga Tran
4cf05df35b
feat: hook SortPreservingMergeExec into deduplication framework
2021-06-09 23:29:44 -04:00
Nga Tran
4478d900ee
refactor: capture test output
2021-06-09 15:09:13 -04:00
Nga Tran
8cc99e3420
Merge branch 'ntran/dedup_within_chunk' of https://github.com/influxdata/influxdb_iox into ntran/dedup_within_chunk
2021-06-09 14:40:29 -04:00
Nga Tran
b3c94b9d65
refactor: change order of fields to pass circle CI tests
2021-06-09 14:40:10 -04:00
kodiakhq[bot]
eed73a30c5
Merge branch 'main' into ntran/dedup_within_chunk
2021-06-09 18:19:17 +00:00
Nga Tran
c1c58018fc
refactor: address review comments
2021-06-09 14:17:47 -04:00
Andrew Lamb
89fcc457f4
fix: Fix bug in chunk overlap calculation due to nulls ( #1669 )
...
* fix: Fix bug in chunk overlap calculation due to nulls
* docs: add note about algorithmic complexity
* fix: avoid recursion in normal case
2021-06-09 17:46:39 +00:00
Raphael Taylor-Davies
07c4277ca7
refactor: schema merge to give more control over field merging ( #1653 )
...
* refactor: schema merge to give more control over field merging
* chore: review feedback
2021-06-09 06:30:45 +00:00
Nga Tran
3d50ff7a60
refactor: remove comments
2021-06-08 21:48:57 -04:00
Nga Tran
ab7d3384b7
refactor: remove unused comments
2021-06-08 21:43:02 -04:00
Nga Tran
3e10351538
test: add tests for the sort plan
2021-06-08 21:40:46 -04:00
Andrew Lamb
cba7f270b4
docs: Improve comments + whitespace ( #1663 )
2021-06-08 21:13:35 +00:00
Nga Tran
68e3a2121f
feat: add SortExec
2021-06-08 15:04:31 -04:00
Andrew Lamb
666204d4a8
fix: remove whitespace changes
2021-06-08 14:46:55 -04:00
Andrew Lamb
b23c4e5210
fix: clippy
2021-06-08 14:44:48 -04:00
Andrew Lamb
fd8a87484e
feat: Hook up chunk grouping into provider
2021-06-08 14:42:37 -04:00
Nga Tran
edbf1b7d5e
Merge branch 'main' into ntran/dedup_within_chunk
2021-06-08 13:18:40 -04:00
Nga Tran
40cb4f741f
feat: initial implementation
2021-06-08 13:17:36 -04:00
Andrew Lamb
62e8675737
refactor: move primary_key calculation to TableSummary ( #1659 )
2021-06-08 17:06:37 +00:00
Andrew Lamb
34ba268cf1
feat: Group chunks by potential overlap ( #1654 )
...
* feat: Group chunks by potential overlap
* docs: clarify in what way the calculation is conservative
* fix: Add test for mixed nulls
2021-06-08 16:55:29 +00:00
Edd Robinson
b88f277477
feat: enable not eq operator
2021-06-08 15:57:07 +01:00
Andrew Lamb
e9834a907c
feat: Prune on boolean column predicates too ( #1629 )
...
* chore: update deps to get latest DataFusion
* fix: enable boolean pruning tests
* fix: update explain plan tests
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-07 16:51:30 +00:00
Nga Tran
ff641e5638
refactor: address Andrew's comments
2021-06-06 22:36:44 -04:00
Nga Tran
2f82a9d670
feat: full foundation for deduplicate with todo functions to finish
2021-06-06 22:09:01 -04:00
Andrew Lamb
ff3215e6a9
feat: Implement Chunk Pruning ( #1567 )
2021-06-04 13:05:22 +00:00
Andrew Lamb
c986ce2c19
feat: Add pruning module to query crate ( #1611 )
...
* feat: Add pruning module
* fix: clippy
* fix: Apply suggestions from code review
* fix: remove erroneous claims of DF bugs
* fix: update comments with DF bug reference
2021-06-03 11:07:26 +00:00
Nga Tran
e7a97f3ac1
test: merge main and add more tests for deduplicate work
2021-06-02 12:00:40 -04:00
Nga Tran
60ad929721
refactor: add macro to compare output of explains
2021-06-01 16:39:14 -04:00
Nga Tran
aa867601e5
chore: merge main with DF plan display fix
2021-06-01 16:17:41 -04:00
Andrew Lamb
d8fbb7b410
refactor: Remove last vestiges of multi-table chunks from PartitionChunk API ( #1588 )
...
* refactor: Remove last vestiges of multi-table chunks from PartitionChunk API
* fix: remove test that can no longer fail
* fix: update tests + code review comments
* fix: clippy
* fix: clippy
* fix: restore test_measurement_fields_error test
2021-06-01 16:12:33 +00:00
Andrew Lamb
d3711a5591
refactor: Use ParquetExec from DataFusion to read parquet files ( #1580 )
...
* refactor: use ParquetExec to read parquet files
* fix: test
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-01 14:44:07 +00:00
Andrew Lamb
162a808a8d
refactor: Remove `table_name` from PartitionChunk API ( #1584 )
...
* refactor: Remove `table_name` from PartitionChunk API
* fix: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-31 12:05:09 +00:00
Andrew Lamb
d50c7c8919
chore: remove unused dependency ( #1581 )
2021-05-31 09:58:10 +00:00
Nga Tran
62147ff0d4
feat: add more explain tests
2021-05-27 12:19:41 -04:00
Raphael Taylor-Davies
5d342d7779
feat: associate tracker with lifecycle action ( #1099 ) ( #1556 )
...
* feat: associate tracker with lifecycle action (#1099 )
* chore: docs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-27 10:47:35 +00:00
Raphael Taylor-Davies
4fcc04e6c9
chore: enable arrow prettyprint feature ( #1566 )
2021-05-27 10:28:14 +00:00
Raphael Taylor-Davies
c2fd85209c
feat: wait for task shutdown on DedicatedExecutor ( #1537 )
2021-05-25 11:33:55 +00:00
Andrew Lamb
14ba25f86d
chore: Update datafusion and use released version of arrow crates ( #1546 )
...
* chore: Update datafusion and use released version of arrow crate
* fix: Update for change in API
2021-05-24 15:37:22 +00:00
Nga Tran
0563005aac
chore: remove leftover comments
2021-05-21 17:01:49 -04:00
Nga Tran
f113abacb5
feat: more unit & e2e tests plus cleanup and addressing review comments of Andrew and Edd
2021-05-21 16:48:43 -04:00
Nga Tran
e44a3a87db
feat: predicate is now actually pushed down to RUB, but there are bugs and it is not working yet
2021-05-20 16:56:15 -04:00
Nga Tran
51de37e752
chore: run fmt
2021-05-19 15:28:44 -04:00
Nga Tran
11561111d5
chore: merge main to branch
2021-05-19 15:11:15 -04:00
Nga Tran
1f13842550
chore: modify comments
2021-05-19 14:49:48 -04:00
Nga Tran
087d61f229
feat: Part 1 of predicate push down - Send predicates to MUB, RUB, and Parquet File. Note that MUB has not handled predicates yet
2021-05-19 13:59:51 -04:00
Andrew Lamb
7e223780f3
feat: Implement Display for query::predicate to improve debug printing of plans ( #1519 )
...
* feat: Implement Display for query::predicate to improve debug printing of plans
* fix: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-19 12:38:34 +00:00
Andrew Lamb
0680a5167f
chore: Improve DataFusion plan logging ( #1508 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-18 11:08:06 +00:00
Andrew Lamb
07db4932ee
refactor: rename data_types/src/chunk.rs -> data_types/src/chunk_metadata.rs ( #1500 )
2021-05-15 10:18:01 +00:00
Edd Robinson
8ccc359cab
refactor: address PR feedback
2021-05-07 13:48:44 +01:00
Edd Robinson
4c4bd2f164
refactor: update query/src/func/regex.rs
2021-05-07 13:44:51 +01:00
Edd Robinson
4cc7a99854
refactor: include not match in support check
2021-05-07 13:44:51 +01:00
Edd Robinson
beee3115f4
feat: expose regex =~ and to gRPC API
2021-05-07 13:44:51 +01:00
Edd Robinson
eae3fec571
feat: wire up regex UDF as predicate filter expr
2021-05-07 13:44:51 +01:00
Edd Robinson
3fc2c9fc04
feat: add DataFusion regex match operator
...
This commit adds a new custom UDF to IOx that provides a regex operator to DataFusion plans.
Effectively it allows predicates to contain regex operators that are applied as filters, only allowing rows that satisfy the regex to be returned.
I did not use the Arrow regex kernel for this work because that does not return a boolean array indicating which rows matched a regex, but instead returns a new string array of results. This doesn't work well with DF's approach to filtering.
2021-05-07 13:44:51 +01:00
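The regex-operator commit above (3fc2c9fc04) turns on the difference between a kernel that returns a per-row boolean array and one that returns a new string array. A minimal sketch of why a filter wants the boolean form, using only the standard library: `matches_mask` and `filter_rows` are hypothetical names, a plain substring check stands in for the regex, and none of this is the Arrow or DataFusion API.

```rust
/// Hypothetical stand-in for a regex match: returns one boolean per input row.
/// (A substring test is used here instead of a real regex.)
fn matches_mask(rows: &[&str], needle: &str) -> Vec<bool> {
    rows.iter().map(|row| row.contains(needle)).collect()
}

/// Keeps only the rows whose mask entry is `true`.
fn filter_rows<'a>(rows: &[&'a str], mask: &[bool]) -> Vec<&'a str> {
    rows.iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(row, _)| *row)
        .collect()
}

fn main() {
    let rows = ["cpu,host=a", "mem,host=b", "cpu,host=c"];
    let mask = matches_mask(&rows, "cpu");
    let kept = filter_rows(&rows, &mask);
    println!("{:?} -> {:?}", mask, kept);
}
```

Because the mask has exactly one entry per input row, the filtering step needs no knowledge of how the match was computed.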
Carol (Nichols || Goulding)
febc1538ff
chore: Update Rust version ( #1445 )
...
* chore: Update Rust version
* refactor: Make struct constructor field orderings consistent
Sometimes I changed the struct definition, sometimes changed the struct
construction instance, depending on consistency with code around each
(other similar structs, function argument orders, etc)
More info: https://rust-lang.github.io/rust-clippy/master/index.html#inconsistent_struct_constructor
* refactor: Use flatten where appropriate
One instance is a false positive with a clippy bug.
More info:
- https://rust-lang.github.io/rust-clippy/master/index.html#filter_map_identity
- https://rust-lang.github.io/rust-clippy/master/index.html#manual_flatten
* refactor: Use Option map instead of match
More info: https://rust-lang.github.io/rust-clippy/master/index.html#manual_map
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-06 22:07:10 +00:00
Raphael Taylor-Davies
44de42906f
refactor: use Arc<str> instead of Arc<String> ( #1442 )
2021-05-06 17:05:08 +00:00
Raphael Taylor-Davies
411cf134e9
refactor: explode arrow_deps ( #1425 )
...
* refactor: explode arrow_deps
* chore: workaround doctest bug
2021-05-05 16:59:12 +00:00
Edd Robinson
2f789485e6
refactor: fix spelling
2021-05-05 11:06:04 +01:00
Andrew Lamb
3b7c5ac350
fix(storage rpc): do not send back tags with empty values ( #1403 )
2021-05-04 10:35:24 +00:00
Andrew Lamb
40b9b09cdc
refactor: rename assert_table_eq to assert_batches_eq ( #1368 )
2021-04-30 10:51:08 +00:00
Andrew Lamb
eb8d91cf1c
refactor: remove additional uses of RecordBatch::try_new ( #1378 )
...
* refactor: remove additional uses of RecordBatch::try_new
* fix: fix accidental change
* fix: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-30 10:24:47 +00:00
Edd Robinson
13fbf2e68d
refactor: plumb registry to gRPC server
2021-04-29 14:00:05 +01:00
Edd Robinson
4acbdcf1c9
refactor: address PR feedback
2021-04-28 16:11:57 +00:00
Edd Robinson
a9ef604ef6
perf: avoid using channels for query execution
...
Pre-sized channels fill up when the results sent over them exceed their capacity. This causes significant runtime overhead and slows down query performance.
This commit removes the intermediate channels. The potential downside of this approach is that there may be more buffering, which could increase memory usage during a query and also block a thread for longer periods of time.
2021-04-28 16:11:57 +00:00
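The trade-off described in the channel-removal commit above (a9ef604ef6) can be illustrated with a small sketch. It assumes the standard library's `sync_channel` as a stand-in for whatever channel type the query path actually used; `sends_before_full` and `collect_directly` are hypothetical helpers, not code from the repository.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Tries to push `n` results through a channel of fixed `capacity` while no
/// consumer drains it, and returns how many sends succeeded without blocking.
fn sends_before_full(capacity: usize, n: usize) -> usize {
    // `_rx` is kept alive so the channel stays connected.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut sent = 0;
    for i in 0..n {
        match tx.try_send(i) {
            Ok(()) => sent += 1,
            // A blocking `send` would stall the producer at this point.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    sent
}

/// The channel-free alternative: buffer every result in memory at once.
fn collect_directly(n: usize) -> Vec<usize> {
    (0..n).collect()
}

fn main() {
    println!("sent before full: {}", sends_before_full(2, 5));
    println!("collected directly: {}", collect_directly(5).len());
}
```

The first helper shows where a producer would start waiting once the capacity is used up; the second shows the cost of the alternative, which is that all results are held in memory together.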
Raphael Taylor-Davies
7ca1da3fcd
feat: pushdown table and partition key predicates to catalog ( #736 ) ( #1327 )
...
* feat: catalog predicate pushdown (#736 )
* chore: fix lints
* chore: review comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 15:31:47 +00:00
Marco Neumann
91bccdfca3
ci: pass `--document-private-items` to `cargo doc`
2021-04-27 15:42:07 +02:00
Marco Neumann
eddc9319ff
docs: deny broken intradoc links
2021-04-27 13:22:28 +02:00
Raphael Taylor-Davies
20117de078
feat: string dictionary encoding ( #1220 ) ( #1262 )
...
* feat: string dictionary encoding (#1220 )
* chore: review comments
* chore: fix lint
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 09:36:58 +00:00
Edd Robinson
a322d05838
refactor: rust fmt
2021-04-20 17:30:50 +00:00
Edd Robinson
554b3b4662
refactor: satisfy new clippy lints
2021-04-20 17:30:50 +00:00
Carol (Nichols || Goulding)
51041ba2d9
fix: Prefer implementing From over Into
2021-04-19 08:48:11 -04:00
Carol (Nichols || Goulding)
757933afc4
fix: use Self when possible
2021-04-19 08:48:11 -04:00
Carol (Nichols || Goulding)
f136931225
fix: Inconsistent ordering lints
2021-04-19 08:48:11 -04:00
Carol (Nichols || Goulding)
3e87ce5232
fix: Make this trait and methods more idiomatically named
...
"into" usually takes ownership and does a conversion; "as" takes
references and provides a different view.
2021-04-19 08:45:34 -04:00
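The naming convention cited in the commit above (3e87ce5232) can be sketched as follows. `Name` is a hypothetical type chosen for illustration and is not the trait that commit renamed.

```rust
/// Hypothetical wrapper type used only to illustrate the naming convention.
struct Name(String);

impl Name {
    /// `as_*`: takes a reference and provides a different view of the data.
    fn as_str(&self) -> &str {
        &self.0
    }

    /// `into_*`: takes ownership and converts into another type.
    fn into_string(self) -> String {
        self.0
    }
}

fn main() {
    let name = Name("cpu".to_string());
    // Borrowing leaves `name` usable afterwards.
    println!("{}", name.as_str());
    // Converting consumes `name`; it cannot be used after this call.
    let owned: String = name.into_string();
    println!("{}", owned);
}
```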
Andrew Lamb
529c99c93f
fix: don't clone arrays to make TimestampNanosecondArrays ( #1241 )
...
* fix: avoid clone
* fix: remove another clone
2021-04-16 18:40:22 +00:00
Andrew Lamb
e226b5a820
feat: Use TimestampNanosecondArray for timestamps in IOx ( #1230 )
...
* refactor: Create Arrow arrays using iterators
* feat: use Timestamp64(TimeUnit::Nanosecond) for timestamps
* feat: add support for timestamp array
* fix: update more tests
* fix: remove unnecessary code
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-16 15:55:33 +00:00
Andrew Lamb
f092294da3
fix: Use MAX (window end) for timestamps in read group ( #1228 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-16 10:51:38 +00:00
Andrew Lamb
5aeeccb97c
feat: Run query plans on the database wide executor as well ( #1210 )
...
* feat: route all query planning through executor
* fix: Rename JoinError -> TaskJoinError and make message clearer
* fix: remove dangling comment
* fix: remove confusing comments
2021-04-15 11:57:20 +00:00
Andrew Lamb
59ca090aef
feat: Use single db-wide executor for running queries ( #1198 )
...
* refactor: plumb executor into all Db instances
* refactor: Route all query executions through worker pool
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-14 16:46:02 +00:00
Andrew Lamb
8f1bf8a960
fix: Remove mutex acquisition in impl `std::fmt::Debug` for DedicatedExecutor ( #1205 )
2021-04-14 12:09:40 +00:00
Andrew Lamb
f5f768d750
feat: Add a dedicated threadpool for running queries ( #1191 )
...
* feat: use a dedicated tokio threadpool for running queries
* feat: plumb number of executor threads through to command line
thread through command line
* fix: Logical merge conflict
* fix: another logical conflict
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-14 10:48:09 +00:00
Andrew Lamb
150ed4e1d9
refactor: Remove async from `InfluxRPCPlanner` ( #1200 )
...
* refactor: Remove async from InfluxRPCPlanner
* fix: make it compile
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-13 22:17:19 +00:00
Paul Dix
7e28f8ef66
feat: Implement Entry writing to Db
...
This removes the old ReplicatedWrite structure and implements the writing of an Entry to the Db. I also call out in `server/lib.rs` and in the `Db` where sharding and replication might happen.
I've also added helpers in various places to write line protocol to chunks, tables, and databases. That enabled removing a good amount of code from the test helpers crate.
2021-04-13 12:52:14 +00:00
Raphael Taylor-Davies
1997324344
feat: mutable buffer snapshotting ( #1179 )
...
* feat: mutable buffer snapshotting
* chore: review feedback
2021-04-13 12:14:54 +00:00
Raphael Taylor-Davies
078c0f3fda
refactor: lift chunk and table summaries out of DBChunk ( #1162 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-09 12:00:47 +00:00
Nga Tran
be6e1e48e4
feat: add writer_id and object_store in Db
2021-04-07 18:36:07 -04:00
Carol (Nichols || Goulding)
82588d5c72
fix: Don't return Result from test functions
2021-04-07 12:40:00 -04:00
Raphael Taylor-Davies
5cd1d6691d
refactor: use DatabaseName in DatabaseRules ( #1127 )
2021-04-06 13:26:30 +00:00
Jacob Marble
80d55d0829
chore: rename tracing_deps to observability_deps
...
OpenTelemetry makes this necessary.
2021-04-02 13:14:30 -07:00
Carol (Nichols || Goulding)
0b880d3534
chore: Group all tracing-related crates under one crate for easier upgrade management
2021-04-02 09:54:39 -04:00
Andrew Lamb
569f90d937
feat: Add ability to get PartitionSummary statistics from a Db ( #1090 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-31 14:18:53 +00:00
Andrew Lamb
f0b411cd43
feat: enable information_schema
2021-03-30 09:01:43 -04:00
Andrew Lamb
6a48001d13
refactor: Manage storage directly in the Catalog ( #1057 )
...
* refactor: Manage mutable buffer chunks directly
* fix: do not use mutable_buffer for listing table names
2021-03-29 17:55:07 +00:00
Andrew Lamb
eb0122655d
refactor: Remove async from PartitionChunk ( #1062 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-29 13:00:36 +00:00
Andrew Lamb
02ae743e8e
refactor: Remove async from Database ( #1063 )
2021-03-29 12:48:12 +00:00