Nga Tran
23895e6673
feat: Using sort_key to avoid resorts
2021-07-12 18:08:45 -04:00
kodiakhq[bot]
f26f844ed2
Merge branch 'main' into ntran/use_sortkey
2021-07-12 18:12:47 +00:00
Carol (Nichols || Goulding)
c681da1031
refactor: Define the TestChunk methods with macros
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
4e53a32928
refactor: Completely replace query::provider::overlap::TestChunk with query::test::TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
1698edcc39
refactor: Implement query::provider::overlap::TestChunk in terms of query::test::TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
dc0b97e121
refactor: Completely replace TestChunkMeta with TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
96f9485792
refactor: Move a with_no_stats method to be entirely defined on TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
b4c5a87088
refactor: Rename int field to i64 field to be more consistent
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
54f7ee8b8d
refactor: Implement TestChunkMeta in terms of TestChunk
...
This is a temporary step to make sure TestChunk does everything
TestChunkMeta needs
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
ee545ce90e
test: Make _with_stats methods able to optionally take max/min
...
Not used yet, but will be when this is unified with query/src/pruning.rs
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
b26aae1cb4
test: Add an arg to control whether to add a column summary at all
...
Always true for now, but there are some cases in query/src/pruning.rs
that don't add any column summaries that will use this with `false`.
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
6cd75bc688
test: Optionally take stats in add_schema_to_table
...
This gets rid of a lookup and construction of default stats that aren't
necessary
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
e05ca7f98b
fix: Change a method name that says null to not say null
...
The comment and implementation seem to indicate this is creating
non-null data.
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
4406d8a219
test: Always initialize a TableSummary on TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
22d4040c81
test: Always initialize a Schema for TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
92cb5986f1
test: Initialize a schema on TestChunk to always exist
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
78f1c4fc80
test: Chunks can only have one table; no need to specify repeatedly
...
This lets us make the name required and always present on TestChunks,
and make the ID optional.
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
15aac65c2c
fix: Arrange use statements so rustfmt can manage their order
2021-07-12 09:59:11 -04:00
Nga Tran
7b7a60993d
feat: consider time as a special key
2021-07-09 18:54:22 -04:00
Nga Tran
8f4463664c
feat: add super_key function
2021-07-09 15:37:04 -04:00
Marco Neumann
bc958e2ff0
refactor: use Arcs to pass schemas around
2021-07-09 09:45:12 +02:00
Marco Neumann
09e611deb7
refactor: lift query schema generation up to caller
...
No longer scan chunks during query planning to determine the schema
(except for the lifetime jobs where we have a good reason to do so).
Instead pass the schema down from whoever is triggering the query.
For real SQL queries, we then just use the table-wide schemas
introduced in #1913 .
Apart from avoiding schema merges we now also don't crash any longer
when no chunks are left in the table (aka columns are present but all
rows are gone).
Fixes #1768 .
Fixes #1884 .
2021-07-09 09:24:21 +02:00
kodiakhq[bot]
c8126784a8
Merge branch 'main' into ntran/avoid_sort_in_scan
2021-07-08 20:22:18 +00:00
Nga Tran
680394b50b
refactor: run fmt
2021-07-08 16:21:42 -04:00
Nga Tran
c5733ab4a7
refactor: remove redundant code
2021-07-08 16:11:42 -04:00
Nga Tran
6738cb272f
refactor: remove duplicate test
2021-07-08 15:59:25 -04:00
Nga Tran
da6249a4df
fix: address reviewers' comments and also fix a bug they discovered
2021-07-08 15:54:54 -04:00
Andrew Lamb
33bc85ad18
feat: Infrastructure for persistence ( #1925 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 11:14:38 +00:00
Andrew Lamb
7602bde850
chore: Update datafusion deps ( #1799 )
...
* chore: Update datafusion deps + rework code
* refactor: remove workaround as it has been contributed upstream
* fix: Update query/src/exec/split.rs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 10:58:32 +00:00
Nga Tran
d3c4f8c249
fix: store sort key correctly in the schema. Update tests to reflect it
2021-07-07 15:55:23 -04:00
Andrew Lamb
e6d995cbd8
chore: Update to Rust 1.53.0 ( #1922 )
...
* chore: Update to Rust 1.53.0
* fix: Update to latest clippy standards
* fix: bad refactor
* fix: Update escaping
* test: update test output
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 18:02:03 +00:00
Nga Tran
76789e5902
feat: store sort key into the chunk schema of RUB
2021-07-06 17:00:35 -04:00
Marco Neumann
b6185982f7
refactor: make `ProviderBuilder` a build-time-checked builder
...
It's safer and also avoids cloning / copying state around.
2021-07-06 18:20:05 +02:00
Marco Neumann
4172d7946c
refactor: make `SchemaMerger` self-consuming
...
The error handling in `merge` was incomplete, aka it could leave the
merger in a half-modified state in case of an error. That's generally a
bad idea and can lead to ugly bugs. Also the "builder" pattern that is
used here usually consumes itself (and provides a clone impl), so it is
easier to reason about modifications. So this commit just changes it to
a self-consuming builder.
A nice side effect of the new pattern is also that it is build-time
checked and does not contain a runtime assert any longer.
2021-07-06 18:20:05 +02:00
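Note: the self-consuming pattern described in the commit above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `SchemaMerger` code: the `Merger` type, the `TypeConflict` error, and the string-based type names are all hypothetical. Because `merge` takes `self` by value, a failed merge drops the partially updated value instead of leaving it reachable.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a schema merger: maps column name -> type name.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Merger {
    fields: BTreeMap<String, String>,
}

/// Hypothetical error returned when two inputs disagree on a column type.
#[derive(Debug, PartialEq)]
pub struct TypeConflict {
    pub column: String,
}

impl Merger {
    pub fn new() -> Self {
        Self::default()
    }

    /// Takes `self` by value. On error the partially updated merger is
    /// dropped, so the caller can never observe a half-merged state.
    pub fn merge(mut self, other: &[(&str, &str)]) -> Result<Self, TypeConflict> {
        for (name, ty) in other {
            let conflict = self
                .fields
                .get(*name)
                .map_or(false, |existing| existing.as_str() != *ty);
            if conflict {
                return Err(TypeConflict {
                    column: name.to_string(),
                });
            }
            self.fields.insert(name.to_string(), ty.to_string());
        }
        Ok(self)
    }

    /// Consumes the merger and returns the merged fields.
    pub fn build(self) -> BTreeMap<String, String> {
        self.fields
    }
}
```

A caller that wants to keep the pre-merge state around has to `clone()` it explicitly before calling `merge`, which makes the cost of doing so visible at the call site.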
Andrew Lamb
56c8c8d428
feat: Use separate executor for queries and compactions/moves ( #1870 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:47:50 +00:00
Jacob Marble
0779b0d9bd
feat: add gRPC listener for new write protocol ( #1842 )
...
* feat: add gRPC listener for new write protocol
* chore: clippy happy
* chore: lint
* chore: cargo fmt --all
* chore: cargo clippy
* chore: protobuf-lint
* chore: more formatting
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:15:12 +00:00
kodiakhq[bot]
e03a1a1def
Merge branch 'main' into ntran/dedup_less_concat
2021-07-01 15:59:22 +00:00
Nga Tran
d0afc7a176
refactor: clean up and add a missing else case
2021-07-01 11:00:30 -04:00
Nga Tran
5cf623201d
fix: deduplicate the last batch before sending it downstream
2021-07-01 10:45:23 -04:00
Andrew Lamb
7235c7b965
refactor: Remove vestigial execution counters ( #1865 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 14:08:06 +00:00
Nga Tran
ba919726b6
test: unit tests
2021-06-30 15:01:31 -04:00
Nga Tran
2a06b93b00
chore: Merge branch 'main' into ntran/dedup_less_concat
2021-06-30 11:37:15 -04:00
Nga Tran
1dbdabd66e
fix: 2 values are also considered to be the same if at least one of them is invalid
2021-06-30 10:52:21 -04:00
Raphael Taylor-Davies
62d3305923
feat: optimize the dictionaries in the output of deduplicate node ( #1827 ) ( #1832 )
...
* feat: optimize dedup dictionaries (#1827 )
* fix: handle sliced null bitmasks
* chore: review feedback
2021-06-30 09:30:16 +00:00
Nga Tran
e6a4e0d709
refactor: make the code clearer for schema even though they are the same
2021-06-29 17:46:30 -04:00
Nga Tran
a249b90952
refactor: refactor and add temp info for debugging
2021-06-29 16:35:50 -04:00
Nga Tran
4611e5d584
chore: merge main to branch
2021-06-29 15:39:23 -04:00
Nga Tran
388e7b7650
fix: reset last_batch
2021-06-29 15:15:09 -04:00
Nga Tran
8f309eb569
feat: improve deduplicate to avoid as many concat_batches as possible
2021-06-29 14:41:54 -04:00
Edd Robinson
12ae9b012a
refactor: clarify intent of
2021-06-28 17:39:48 +01:00
Andrew Lamb
2e5f10f6b1
feat: Sort the output of split_plans as well ( #1800 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-25 13:02:30 +00:00
Andrew Lamb
4e7cf39b23
chore: Reduce debug logging in query crate ( #1802 )
2021-06-24 21:01:11 +00:00
Andrew Lamb
79446d45be
feat: Implement split_plans ( #1794 )
...
* feat: implement split plan / planner
* fix: Apply suggestions from code review
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
* fix: resolve merge conflicts
* fix: add values to panic
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-06-24 18:38:00 +00:00
Raphael Taylor-Davies
297fc12db8
feat: compact chunks ( #1776 )
...
* feat: compact chunks
* chore: review feedback
* chore: clippy lints
* chore: document sort key algorithm
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-24 16:49:10 +00:00
Andrew Lamb
0a03605bbc
refactor: pull Channel --> Stream adapter into its own module ( #1793 )
...
* refactor: pull Channel --> Stream adapter into its own module
* docs: Update query/src/exec/stream.rs
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
2021-06-24 10:35:45 +00:00
Andrew Lamb
60eb89cad1
feat: Reorg Planner for merge plans ( #1780 )
...
* feat: Reorg Planner
* docs: add example for split
* fix: clippy
* docs: Specify <= rather than < for split
2021-06-23 10:50:44 +00:00
Andrew Lamb
4c5007f961
fix: Select the correct timestamp for min/max selectors ( #1771 )
...
* test: Reproducer showing that the min/max selectors are order dependent
* fix: pick correct timestamp for first/last selectors
* refactor: remove println
* docs: Fixup comments and add to link to arrow-datafusion/issues/600
* fix: Add debug if timestamp is null
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-22 17:53:54 +00:00
Andrew Lamb
763ade390c
refactor: rename deduplicate --> overlap ( #1779 )
2021-06-22 17:07:53 +00:00
Andrew Lamb
5362c7c924
feat: enable query deduplication ( #1762 )
2021-06-21 18:49:04 +00:00
Andrew Lamb
bed6ec8c31
feat: Handle merging chunks that have different schemas ( #1761 )
...
* feat: Handle merging chunks that have different schemas
* test: print out original (non deduplicated) data in tests
2021-06-21 15:52:13 +00:00
Andrew Lamb
6559a9e997
refactor: use Schema to compute InfluxDB primary keys ( #1757 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-18 21:15:31 +00:00
Andrew Lamb
de67bd3efe
refactor: Remove PartitionChunk::table_schema ( #1756 )
...
* refactor: Remove PartitionChunk::table_schema
* docs: update comments
2021-06-18 16:13:16 +00:00
Andrew Lamb
9beeca3e7c
refactor: Unify schema handling in query crate ( #1755 )
...
* refactor: Unify schema handling in query crate
* fix: doclink
2021-06-18 14:10:57 +00:00
Andrew Lamb
1c13d676b4
refactor: Rename query::PartitionChunk --> query::QueryChunk ( #1754 )
2021-06-18 13:24:09 +00:00
Andrew Lamb
c5eea9af6a
feat: Implement DeduplicateExec ( #1733 )
...
* feat: Implement DeduplicateExec
* fix: Doc comments
* fix: fix comment
* fix: Update with arrow ticket references and use datafusion coalesce batches impl
* refactor: rename inner.rs to algo.rs
* docs: Add additional documentation on rationale for last field value
* docs: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* docs: Update query/src/provider/deduplicate/algo.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* docs: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: do not use pub(crate)
* docs: fix test comments
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-06-17 14:17:52 +00:00
Andrew Lamb
b42218a197
chore: Add proper format for SchemaPivotNode ( #1744 )
2021-06-17 11:32:48 +00:00
Raphael Taylor-Davies
38d17a3093
chore: remove unused query dependency ( #1731 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-15 22:06:13 +00:00
Edd Robinson
e2315f0016
refactor: revert read_filter debugging
2021-06-14 17:54:21 +01:00
Edd Robinson
6657e6f596
refactor: update query/src/exec/seriesset.rs
2021-06-14 16:09:02 +01:00
Edd Robinson
58f4073a7d
Merge branch 'main' into er/fix/dictionary_dupe_keys
2021-06-14 15:59:58 +01:00
Edd Robinson
ec52bca309
fix: ensure values are different
2021-06-14 15:28:35 +01:00
kodiakhq[bot]
cf6b658ee3
Merge branch 'main' into er/duplicate_keys
2021-06-14 11:10:45 +00:00
Andrew Lamb
0d8d32fd8f
chore: Update deps to get latest arrow ( #1708 )
...
* chore: Update deps to get latest arrow
* fix: Update to rust 1.52
* fix: clippy
2021-06-14 11:08:09 +00:00
Edd Robinson
1612ebcbdb
refactor: more debug logging
2021-06-14 12:07:51 +01:00
Edd Robinson
927d6f890f
Merge branch 'main' into er/duplicate_keys
2021-06-14 10:29:46 +01:00
Edd Robinson
96fb595cc0
refactor: read_filter debugging
2021-06-14 10:22:05 +01:00
Nga Tran
11729b9aa7
test: select non-key from 2 chunks with different key/tag sets ( #1703 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-11 18:52:36 +00:00
Nga Tran
736cf1ff6f
Merge branch 'main' into ntran/dedupe_final_union
2021-06-11 09:45:54 -04:00
Nga Tran
7dd0416960
refactor: address review comments
2021-06-11 09:43:39 -04:00
Nga Tran
e34d157f28
fix: comments
2021-06-11 07:30:49 -04:00
Nga Tran
ea9edef716
fix: testing option
2021-06-11 07:18:33 -04:00
Nga Tran
fb639ee54f
feat: add UnionExec on top of the scan activities
2021-06-11 07:06:08 -04:00
Andrew Lamb
13dd4b23fd
fix: make pruning debug log less confusing ( #1684 )
2021-06-10 18:35:04 +00:00
kodiakhq[bot]
16b268402e
Merge branch 'main' into ntran/dedup_merge_exec
2021-06-10 17:13:49 +00:00
Nga Tran
46d4ab1f2a
refactor: address review comments
2021-06-10 13:13:02 -04:00
Marco Neumann
7b1106ff64
chore: enforce `clippy::future_not_send` for `query`
2021-06-10 09:48:35 +02:00
Nga Tran
4cf05df35b
feat: hook SortPreservingMergeExec into deduplication framework
2021-06-09 23:29:44 -04:00
Nga Tran
4478d900ee
refactor: capture test output
2021-06-09 15:09:13 -04:00
Nga Tran
8cc99e3420
Merge branch 'ntran/dedup_within_chunk' of https://github.com/influxdata/influxdb_iox into ntran/dedup_within_chunk
2021-06-09 14:40:29 -04:00
Nga Tran
b3c94b9d65
refactor: change order of fields to pass circle CI tests
2021-06-09 14:40:10 -04:00
kodiakhq[bot]
eed73a30c5
Merge branch 'main' into ntran/dedup_within_chunk
2021-06-09 18:19:17 +00:00
Nga Tran
c1c58018fc
refactor: address review comments
2021-06-09 14:17:47 -04:00
Andrew Lamb
89fcc457f4
fix: Fix bug in chunk overlap calculation due to nulls ( #1669 )
...
* fix: Fix bug in chunk overlap calculation due to nulls
* docs: add note about algorithmic complexity
* fix: avoid recursion in normal case
2021-06-09 17:46:39 +00:00
Raphael Taylor-Davies
07c4277ca7
refactor: schema merge to give more control over field merging ( #1653 )
...
* refactor: schema merge to give more control over field merging
* chore: review feedback
2021-06-09 06:30:45 +00:00
Nga Tran
3d50ff7a60
refactor: remove comments
2021-06-08 21:48:57 -04:00
Nga Tran
ab7d3384b7
refactor: remove unused comments
2021-06-08 21:43:02 -04:00
Nga Tran
3e10351538
test: add tests for the sort plan
2021-06-08 21:40:46 -04:00
Andrew Lamb
cba7f270b4
docs: Improve comments + whitespace ( #1663 )
2021-06-08 21:13:35 +00:00
Nga Tran
68e3a2121f
feat: add SortExec
2021-06-08 15:04:31 -04:00
Andrew Lamb
666204d4a8
fix: remove whitespace changes
2021-06-08 14:46:55 -04:00
Andrew Lamb
b23c4e5210
fix: clippy
2021-06-08 14:44:48 -04:00
Andrew Lamb
fd8a87484e
feat: Hook up chunk grouping into provider
2021-06-08 14:42:37 -04:00
Nga Tran
edbf1b7d5e
Merge branch 'main' into ntran/dedup_within_chunk
2021-06-08 13:18:40 -04:00
Nga Tran
40cb4f741f
feat: initial implementation
2021-06-08 13:17:36 -04:00
Andrew Lamb
62e8675737
refactor: move primary_key calculation to TableSummary ( #1659 )
2021-06-08 17:06:37 +00:00
Andrew Lamb
34ba268cf1
feat: Group chunks by potential overlap ( #1654 )
...
* feat: Group chunks by potential overlap
* docs: clarify in what way the calculation is conservative
* fix: Add test for mixed nulls
2021-06-08 16:55:29 +00:00
Edd Robinson
b88f277477
feat: enable not eq operator
2021-06-08 15:57:07 +01:00
Andrew Lamb
e9834a907c
feat: Prune on boolean column predicates too ( #1629 )
...
* chore: update deps to get latest DataFusion
* fix: enable boolean pruning tests
* fix: update explain plan tests
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-07 16:51:30 +00:00
Nga Tran
ff641e5638
refactor: address Andrew's comments
2021-06-06 22:36:44 -04:00
Nga Tran
2f82a9d670
feat: full foundation for deduplicate with todo functions to finish
2021-06-06 22:09:01 -04:00
Andrew Lamb
ff3215e6a9
feat: Implement Chunk Pruning ( #1567 )
2021-06-04 13:05:22 +00:00
Andrew Lamb
c986ce2c19
feat: Add pruning module to query crate ( #1611 )
...
* feat: Add pruning module
* fix: clippy
* fix: Apply suggestions from code review
* fix: remove erroneous claims of DF bugs
* fix: update comments with DF bug reference
2021-06-03 11:07:26 +00:00
Nga Tran
e7a97f3ac1
test: merge main and add more tests for deduplicate work
2021-06-02 12:00:40 -04:00
Nga Tran
60ad929721
refactor: add macro to compare output of explains
2021-06-01 16:39:14 -04:00
Nga Tran
aa867601e5
chore: merge main with DF plan display fix
2021-06-01 16:17:41 -04:00
Andrew Lamb
d8fbb7b410
refactor: Remove last vestiges of multi-table chunks from PartitionChunk API ( #1588 )
...
* refactor: Remove last vestiges of multi-table chunks from PartitionChunk API
* fix: remove test that can no longer fail
* fix: update tests + code review comments
* fix: clippy
* fix: clippy
* fix: restore test_measurement_fields_error test
2021-06-01 16:12:33 +00:00
Andrew Lamb
d3711a5591
refactor: Use ParquetExec from DataFusion to read parquet files ( #1580 )
...
* refactor: use ParquetExec to read parquet files
* fix: test
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-01 14:44:07 +00:00
Andrew Lamb
162a808a8d
refactor: Remove `table_name` from PartitionChunk API ( #1584 )
...
* refactor: Remove `table_name` from PartitionChunk API
* fix: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-31 12:05:09 +00:00
Andrew Lamb
d50c7c8919
chore: remove unused dependency ( #1581 )
2021-05-31 09:58:10 +00:00
Nga Tran
62147ff0d4
feat: add more explain tests
2021-05-27 12:19:41 -04:00
Raphael Taylor-Davies
5d342d7779
feat: associate tracker with lifecycle action ( #1099 ) ( #1556 )
...
* feat: associate tracker with lifecycle action (#1099 )
* chore: docs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-27 10:47:35 +00:00
Raphael Taylor-Davies
4fcc04e6c9
chore: enable arrow prettyprint feature ( #1566 )
2021-05-27 10:28:14 +00:00
Raphael Taylor-Davies
c2fd85209c
feat: wait for task shutdown on DedicatedExecutor ( #1537 )
2021-05-25 11:33:55 +00:00
Andrew Lamb
14ba25f86d
chore: Update datafusion and use released version of arrow crates ( #1546 )
...
* chore: Update datafusion and use released version of arrow crate
* fix: Update for change in API
2021-05-24 15:37:22 +00:00
Nga Tran
0563005aac
chore: remove leftover comments
2021-05-21 17:01:49 -04:00
Nga Tran
f113abacb5
feat: more unit & e2e tests plus cleanup and addressing review comments of Andrew and Edd
2021-05-21 16:48:43 -04:00
Nga Tran
e44a3a87db
feat: predicate is now actually pushed down to RUB, but there are bugs and it is not working yet
2021-05-20 16:56:15 -04:00
Nga Tran
51de37e752
chore: run fmt
2021-05-19 15:28:44 -04:00
Nga Tran
11561111d5
chore: merge main to branch
2021-05-19 15:11:15 -04:00
Nga Tran
1f13842550
chore: modify comments
2021-05-19 14:49:48 -04:00
Nga Tran
087d61f229
feat: Part 1 of predicate push down - Send predicates to MUB, RUB, and Parquet File. Note that MUB has not handled predicates yet
2021-05-19 13:59:51 -04:00
Andrew Lamb
7e223780f3
feat: Implement Display for query::predicate to improve debug printing of plans ( #1519 )
...
* feat: Implement Display for query::predicate to improve debug printing of plans
* fix: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-19 12:38:34 +00:00
Andrew Lamb
0680a5167f
chore: Improve DataFusion plan logging ( #1508 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-18 11:08:06 +00:00
Andrew Lamb
07db4932ee
refactor: rename data_types/src/chunk.rs -> data_types/src/chunk_metadata.rs ( #1500 )
2021-05-15 10:18:01 +00:00
Edd Robinson
8ccc359cab
refactor: address PR feedback
2021-05-07 13:48:44 +01:00
Edd Robinson
4c4bd2f164
refactor: update query/src/func/regex.rs
2021-05-07 13:44:51 +01:00
Edd Robinson
4cc7a99854
refactor: include not match in support check
2021-05-07 13:44:51 +01:00
Edd Robinson
beee3115f4
feat: expose regex =~ and !~ to gRPC API
2021-05-07 13:44:51 +01:00
Edd Robinson
eae3fec571
feat: wire up regex UDF as predicate filter expr
2021-05-07 13:44:51 +01:00
Edd Robinson
3fc2c9fc04
feat: add DataFusion regex match operator
...
This commit adds a new custom UDF to IOx that provides a regex operator to DataFusion plans.
Effectively it allows predicates to contain regex operators that are applied as filters, only allowing rows that satisfy the regex to be returned.
I did not use the Arrow regex kernel for this work because that does not return a boolean array indicating which rows matched a regex, but instead returns a new string array of results. This doesn't work well with DF's approach to filtering.
2021-05-07 13:44:51 +01:00
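Note: the commit above turns on the difference between a kernel that returns transformed strings and one that returns a boolean array usable as a filter. A rough sketch of the boolean-mask approach follows. It is an illustration only: `match_mask` and `filter_rows` are hypothetical names, plain vectors stand in for Arrow arrays, a substring test stands in for a real regex engine, and treating null rows as non-matching is an assumption.

```rust
/// Build a boolean mask with one entry per row: `true` where the row matches.
/// A plain substring test stands in for a regex engine here (assumption).
/// Null rows (`None`) are treated as non-matching (assumption).
pub fn match_mask(rows: &[Option<&str>], pattern: &str) -> Vec<bool> {
    rows.iter()
        .map(|row| row.map_or(false, |s| s.contains(pattern)))
        .collect()
}

/// Keep only the rows whose mask entry is `true`.
pub fn filter_rows<'a>(rows: &[Option<&'a str>], mask: &[bool]) -> Vec<Option<&'a str>> {
    rows.iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(row, _)| *row)
        .collect()
}
```

Keeping the mask separate from the filtering step is what lets one mask be applied to every column of a batch, not just the column that was matched.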
Carol (Nichols || Goulding)
febc1538ff
chore: Update Rust version ( #1445 )
...
* chore: Update Rust version
* refactor: Make struct constructor field orderings consistent
Sometimes I changed the struct definition, sometimes changed the struct
construction instance, depending on consistency with code around each
(other similar structs, function argument orders, etc)
More info: https://rust-lang.github.io/rust-clippy/master/index.html#inconsistent_struct_constructor
* refactor: Use flatten where appropriate
One instance is a false positive with a clippy bug.
More info:
- https://rust-lang.github.io/rust-clippy/master/index.html#filter_map_identity
- https://rust-lang.github.io/rust-clippy/master/index.html#manual_flatten
* refactor: Use Option map instead of match
More info: https://rust-lang.github.io/rust-clippy/master/index.html#manual_map
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-06 22:07:10 +00:00
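Note: two of the clippy-driven refactors listed in the commit above (`flatten` and `Option::map`) can be illustrated with a small sketch. The function names below are hypothetical and are not taken from the IOx code base.

```rust
/// Keep only the `Some` values. `flatten` replaces an identity
/// `filter_map(|x| x)` or a loop with a manual `if let Some(..)`.
pub fn present_values(values: Vec<Option<i64>>) -> Vec<i64> {
    values.into_iter().flatten().collect()
}

/// `Option::map` replaces a `match` that only transforms the `Some`
/// case and passes `None` through unchanged.
pub fn doubled(value: Option<i64>) -> Option<i64> {
    value.map(|v| v * 2)
}
```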
Raphael Taylor-Davies
44de42906f
refactor: use Arc<str> instead of Arc<String> ( #1442 )
2021-05-06 17:05:08 +00:00
Raphael Taylor-Davies
411cf134e9
refactor: explode arrow_deps ( #1425 )
...
* refactor: explode arrow_deps
* chore: workaround doctest bug
2021-05-05 16:59:12 +00:00
Edd Robinson
2f789485e6
refactor: fix spelling
2021-05-05 11:06:04 +01:00
Andrew Lamb
3b7c5ac350
fix(storage rpc): do not send back tags with empty values ( #1403 )
2021-05-04 10:35:24 +00:00
Andrew Lamb
40b9b09cdc
refactor: rename assert_table_eq to assert_batches_eq ( #1368 )
2021-04-30 10:51:08 +00:00
Andrew Lamb
eb8d91cf1c
refactor: remove additional uses of RecordBatch::try_new ( #1378 )
...
* refactor: remove additional uses of RecordBatch::try_new
* fix: fix accidental change
* fix: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-30 10:24:47 +00:00
Edd Robinson
13fbf2e68d
refactor: plumb registry to gRPC server
2021-04-29 14:00:05 +01:00
Edd Robinson
4acbdcf1c9
refactor: address PR feedback
2021-04-28 16:11:57 +00:00
Edd Robinson
a9ef604ef6
perf: avoid using channels for query execution
...
Pre-sized channels get full when the results to send over them are larger than the capacities. This causes significant runtime overhead and slows down query performance.
This commit removes the intermediate channels. The potential downside to this approach is there may be more buffering which could increase memory usage during query and also block a thread for longer periods of time.
2021-04-28 16:11:57 +00:00
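Note: the "pre-sized channels get full" behaviour described in the commit above can be reproduced with a bounded channel from the standard library. This is only a sketch: the commit does not say which channel implementation was involved, and `accepted_before_full` is a hypothetical helper.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Try to push `n` items into a channel with room for `capacity` items
/// while nobody is receiving, and return how many were accepted before
/// the buffer filled up.
pub fn accepted_before_full(capacity: usize, n: usize) -> usize {
    // `_rx` is kept alive so the channel is not reported as disconnected.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    for i in 0..n {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            // A blocking `send` would stall the producer at this point.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```

Once the result set is larger than the capacity, every further item has to wait for the consumer, which is the overhead the commit message refers to.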
Raphael Taylor-Davies
7ca1da3fcd
feat: pushdown table and partition key predicates to catalog ( #736 ) ( #1327 )
...
* feat: catalog predicate pushdown (#736 )
* chore: fix lints
* chore: review comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 15:31:47 +00:00
Marco Neumann
91bccdfca3
ci: pass `--document-private-items` to `cargo doc`
2021-04-27 15:42:07 +02:00
Marco Neumann
eddc9319ff
docs: deny broken intradoc links
2021-04-27 13:22:28 +02:00
Raphael Taylor-Davies
20117de078
feat: string dictionary encoding ( #1220 ) ( #1262 )
...
* feat: string dictionary encoding (#1220 )
* chore: review comments
* chore: fix lint
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 09:36:58 +00:00
Edd Robinson
a322d05838
refactor: rust fmt
2021-04-20 17:30:50 +00:00
Edd Robinson
554b3b4662
refactor: satisfy new clippy lints
2021-04-20 17:30:50 +00:00
Carol (Nichols || Goulding)
51041ba2d9
fix: Prefer implementing From over Into
2021-04-19 08:48:11 -04:00
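Note: the reason to prefer `From` is that the standard library provides a blanket `Into` implementation for every `From` implementation, so one impl gives both directions of call syntax. A minimal sketch, with a hypothetical `ChunkId` type used only for illustration:

```rust
/// Hypothetical newtype used only for illustration.
#[derive(Debug, PartialEq)]
pub struct ChunkId(pub u32);

/// Implementing `From<u32>` also makes `u32: Into<ChunkId>` available
/// through the standard library's blanket impl.
impl From<u32> for ChunkId {
    fn from(value: u32) -> Self {
        Self(value)
    }
}
```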
Carol (Nichols || Goulding)
757933afc4
fix: use Self when possible
2021-04-19 08:48:11 -04:00
Carol (Nichols || Goulding)
f136931225
fix: Inconsistent ordering lints
2021-04-19 08:48:11 -04:00
Carol (Nichols || Goulding)
3e87ce5232
fix: Make this trait and methods more idiomatically named
...
"into" usually takes ownership and does a conversion; "as" takes
references and provides a different view.
2021-04-19 08:45:34 -04:00
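Note: the naming convention cited in the commit above can be sketched as follows. `ColumnName` is a hypothetical type, not the trait the commit renamed.

```rust
/// Hypothetical wrapper type used only for illustration.
pub struct ColumnName(String);

impl ColumnName {
    /// `as_*`: borrows `self` and hands out a cheap view; the value
    /// remains usable afterwards.
    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// `into_*`: consumes `self` and converts it; the original value
    /// cannot be used afterwards.
    pub fn into_string(self) -> String {
        self.0
    }
}
```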
Andrew Lamb
529c99c93f
fix: don't clone arrays to make TimestampNanosecondArrays ( #1241 )
...
* fix: avoid clone
* fix: remove another clone
2021-04-16 18:40:22 +00:00
Andrew Lamb
e226b5a820
feat: Use TimestampNanosecondArray for timestamps in IOx ( #1230 )
...
* refactor: Create Arrow arrays using iterators
* feat: use Timestamp64(TimeUnit::Nanosecond) for timestamps
* feat: add support for timestamp array
* fix: update more tests
* fix: remove unnecessary code
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-16 15:55:33 +00:00
Andrew Lamb
f092294da3
fix: Use MAX (window end) for timestamps in read group ( #1228 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-16 10:51:38 +00:00
Andrew Lamb
5aeeccb97c
feat: Run query plans on the database wide executor as well ( #1210 )
...
* feat: route all query planning through executor
* fix: Rename JoinError -> TaskJoinError and make message clearer
* fix: remove dangling comment
* fix: remove confusing comments
2021-04-15 11:57:20 +00:00
Andrew Lamb
59ca090aef
feat: Use single db-wide executor for running queries ( #1198 )
...
* refactor: plumb executor into all Db instances
* refactor: Route all query executions through worker pool
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-14 16:46:02 +00:00
Andrew Lamb
8f1bf8a960
fix: Remove mutex acquisition in impl `std::fmt::Debug` for DedicatedExecutor ( #1205 )
2021-04-14 12:09:40 +00:00
Andrew Lamb
f5f768d750
feat: Add a dedicated threadpool for running queries ( #1191 )
...
* feat: use a dedicated tokio threadpool for running queries
* feat: plumb number of executor threads through to command line
thread through command line
* fix: Logical merge conflict
* fix: another logical conflict
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-14 10:48:09 +00:00
Andrew Lamb
150ed4e1d9
refactor: Remove async from `InfluxRPCPlanner` ( #1200 )
...
* refactor: Remove async from InfluxRPCPlanner
* fix: make it compile
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-13 22:17:19 +00:00
Paul Dix
7e28f8ef66
feat: Implement Entry writing to Db
...
This removes the old ReplicatedWrite structure and implements the writing of an Entry to the Db. I also call out in `server/lib.rs` and in the `Db` where sharding and replication might happen.
I've also added helpers in various places to write line protocol to chunks, tables, and databases. That enabled removing a good amount of code from the test helpers crate.
2021-04-13 12:52:14 +00:00
Raphael Taylor-Davies
1997324344
feat: mutable buffer snapshotting ( #1179 )
...
* feat: mutable buffer snapshotting
* chore: review feedback
2021-04-13 12:14:54 +00:00
Raphael Taylor-Davies
078c0f3fda
refactor: lift chunk and table summaries out of DBChunk ( #1162 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-09 12:00:47 +00:00
Nga Tran
be6e1e48e4
feat: add writer_id and object_store in Db
2021-04-07 18:36:07 -04:00
Carol (Nichols || Goulding)
82588d5c72
fix: Don't return Result from test functions
2021-04-07 12:40:00 -04:00
Raphael Taylor-Davies
5cd1d6691d
refactor: use DatabaseName in DatabaseRules ( #1127 )
2021-04-06 13:26:30 +00:00
Jacob Marble
80d55d0829
chore: rename tracing_deps to observability_deps
...
OpenTelemetry makes this necessary.
2021-04-02 13:14:30 -07:00
Carol (Nichols || Goulding)
0b880d3534
chore: Group all tracing-related crates under one crate for easier upgrade management
2021-04-02 09:54:39 -04:00
Andrew Lamb
569f90d937
feat: Add ability to get PartitionSummary statistics from a Db ( #1090 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-31 14:18:53 +00:00
Andrew Lamb
f0b411cd43
feat: enable information_schema
2021-03-30 09:01:43 -04:00
Andrew Lamb
6a48001d13
refactor: Manage storage directly in the Catalog ( #1057 )
...
* refactor: Manage mutable buffer chunks directly
* fix: do not use mutable_buffer for listing table names
2021-03-29 17:55:07 +00:00
Andrew Lamb
eb0122655d
refactor: Remove async from PartitionChunk ( #1062 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-29 13:00:36 +00:00
Andrew Lamb
02ae743e8e
refactor: Remove async from Database ( #1063 )
2021-03-29 12:48:12 +00:00
Raphael Taylor-Davies
fb130ea99d
feat: use CatalogProvider and SchemaProvider ( #1058 )
...
* feat: use CatalogProvider and SchemaProvider
* refactor: review comments
2021-03-29 11:08:46 +00:00
Andrew Lamb
0ca9ad7285
refactor: Remove async from `PartitionChunk::table_schema` ( #1060 )
2021-03-27 18:08:12 +00:00
Andrew Lamb
663d4fb6f7
docs: Use Scan rather than InMemoryScan for clarity ( #1049 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-26 14:22:49 +00:00
Andrew Lamb
895e808754
chore: Upgrade arrow deps ( #1046 )
...
* chore: Upgrade dependencies
* chore: upgrade query for new interfaces
* chore: update read_buffer
2021-03-25 13:35:08 +00:00
Andrew Lamb
6e1795fda0
refactor: Move some types (not yet exposed to clients) into internal_types ( #1015 )
...
* refactor: Move some types (not yet exposed to clients) into internal_types
* docs: Add README.md explaining the rationale
* refactor: remove some stragglers
* fix: fix benches
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: add clippy lints
* fix: fmt
* docs: Apply suggestions from code review
fix typos
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-19 16:27:57 +00:00
Andrew Lamb
72eff5eed5
chore: update deps (including arrow)
2021-03-16 18:15:44 -04:00
Raphael Taylor-Davies
65f7a1ac5b
fix: use consistent crate versions ( #989 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-15 15:42:19 +00:00
Andrew Lamb
6ac7e2c1a7
feat: Add management API and CLI to list chunks ( #968 )
...
* feat: Add management API and CLI to list chunks
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: add comment to protobuf
* fix: fix comment
* fix: fmt, fixup merge errors
* fix: fascinating type dance with prost generated types
* fix: clippy
* fix: move command to influxdb_iox database chunk list
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-03-12 13:56:14 +00:00
Raphael Taylor-Davies
0ff527285c
refactor: remove unnecessary async from DatabaseStore trait ( #965 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-11 11:33:53 +00:00
Andrew Lamb
746373a687
refactor: Remove mutable_buffer crate dependency on query crate ( #927 )
2021-03-05 11:34:27 +00:00
Andrew Lamb
8b1f100df3
feat: make read_group and read_window_aggregate work across chunks ( #905 )
...
* feat: make read_group and read_window_aggregate work across chunks
* refactor: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: Update query/src/frontend/influxrpc.rs
Improve logic and use strings directly
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: fmt
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-04 17:06:31 +00:00
Nga Tran
957e05ef25
chore: use newly added Arrow's Expr::is_not_null function
2021-03-03 11:46:49 -05:00
Andrew Lamb
94bd200e60
refactor: Add Predicate::is_empty() and EMPTY_PREDICATE to avoid unnecessary construction ( #891 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-01 21:03:05 +00:00
Andrew Lamb
7d8d00781c
feat: Make read_filter work for mutable buffer and read buffer ( #882 )
...
* feat: port read_filter to InfluxRPCPlanner
* fix: remove commented out vestigial test
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: fmt
* fix: Update arrow_deps/src/util.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-03-01 16:50:29 +00:00
Nga Tran
6ad8e1aa33
feat: use newly implemented tags_iter to get Tag columns
2021-02-26 15:54:20 -05:00
Nga Tran
18de3bdcab
chore: merge main into branch
...
Merge branch 'main' into ntran/optimize_column_selection
2021-02-26 15:29:43 -05:00
Nga Tran
f37e5846aa
feat: fmt auto fix
2021-02-26 14:56:10 -05:00
NGA TRAN
eb81975151
feat: Optimize Column Selection
2021-02-26 14:28:46 -05:00
Andrew Lamb
12deacd8a0
refactor: move SeriesSetPlans into its own module ( #878 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-25 23:12:39 +00:00