570 Commits (77d8967c8e04be7357abca6bd8b264bb8a25dd51)

Author SHA1 Message Date
Edd Robinson ea26a77217
Merge branch 'main' into ntran/no_use_stats 2021-10-06 09:30:18 +01:00
Nga Tran aa64daca86 feat: Disable using statistics to query data if there are soft deleted rows 2021-10-05 17:52:32 -04:00
Andrew Lamb 785a62c114
fix: include all group tags, not just group_keys in GroupFrame response (#2741)
* fix: include all group tags, not just group_keys in GroupFrame response

* docs: fix test comments, add doc strings for group_description_to_frames

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-05 19:08:04 +00:00
Marco Neumann 97881079e8 refactor: make `ChunkOrder` non-zero
This will make it easier to handle missing values.

Helps with #2633.
2021-10-04 17:49:12 +02:00
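To illustrate why a non-zero `ChunkOrder` makes missing values easier to handle, here is a minimal sketch built on Rust's standard `NonZeroU32`. The newtype, its `u32` backing, and the `new`/`get` methods are assumptions for illustration; the actual IOx definition may differ.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical stand-in for IOx's `ChunkOrder`; the real type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ChunkOrder(NonZeroU32);

impl ChunkOrder {
    /// Returns `None` for zero, so zero stays free to mean "missing".
    pub fn new(order: u32) -> Option<Self> {
        NonZeroU32::new(order).map(Self)
    }

    /// Returns the underlying (always non-zero) value.
    pub fn get(self) -> u32 {
        self.0.get()
    }
}

fn main() {
    assert!(ChunkOrder::new(0).is_none());
    assert_eq!(ChunkOrder::new(7).map(ChunkOrder::get), Some(7));
    // In practice the compiler stores `None` in the unused zero value,
    // so the optional order takes no more space than a plain `u32`.
    assert_eq!(size_of::<Option<ChunkOrder>>(), size_of::<u32>());
}
```

With this shape, an absent order can be written as `Option<ChunkOrder>` without a separate sentinel value or extra storage.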
Marco Neumann 5a5a929b9e refactor: introduce `DeletePredicate`
`DeletePredicate` is a simpler version of `Predicate` that is based on
IOx `DeleteExpr` instead of the full-blown DataFusion `Expr`. This will
allow us to do a couple of things (in follow-up changes):

- Order and de-duplicate delete predicates
- Normalize predicates
- Infallible serialization
- Smaller memory footprint

Note that this change only affects delete expressions. Query expressions
that are supported via the API are not changed. The query subsystem also
still uses the full-featured expressions/predicates (delete
expressions/predicates are converted to the more powerful DataFusion
version on-the-fly).
2021-10-04 16:36:20 +02:00
dependabot[bot] d1f5209869
chore(deps): bump arrow from 5.4.0 to 5.5.0
Bumps [arrow](https://github.com/apache/arrow-rs) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/5.5.0/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/5.4.0...5.5.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-04 08:55:38 +00:00
Andrew Lamb 134fb96b26
feat: add UInt64 support for gRPC query results (#2701) 2021-10-01 17:18:32 +00:00
Andrew Lamb 2db56a0332
chore: Make query logging a bit less verbose (#2655)
* chore: Make query logging a bit less verbose

* fix: remove unused use
2021-09-28 20:58:37 +00:00
Andrew Lamb a55a21c644
chore: Update datafusion (#2635)
* chore: Update datafusion and sqlparser

* fix: remove STACK_SIZE workaround

* chore: update datafusion_util

* chore: update predicate

* chore: update query_tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 14:13:19 +00:00
Andrew Lamb d38648952c
chore: Update datafusion (#2602)
* chore: Update datafusion + other deps

* refactor: update query crate for new async interfaces

* refactor: update server crate for new async interface

* refactor: update query_tests crate for new async interfaces

* refactor: update ioxd and server to use new async interface

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-22 10:33:25 +00:00
Nga Tran 0fc2567161 refactor: address review comment 2021-09-21 16:50:29 -04:00
Nga Tran a06afb3932 feat: optimize scan for chunks without delete predicates and without the need to sort output 2021-09-21 15:56:19 -04:00
Nga Tran 61ed67a5d9 refactor: cleanup 2021-09-21 12:28:18 -04:00
Nga Tran 93551bdd1e fix: all chunks now are applied delete predicates during scan 2021-09-21 12:17:59 -04:00
kodiakhq[bot] 77d84ca5ab
Merge branch 'main' into crepererum/chunk_id 2021-09-20 13:39:05 +00:00
Marco Neumann cef5aeee52 refactor: introduce `ChunkId` type 2021-09-20 13:10:41 +02:00
Nga Tran 364d245eae feat: apply negated delete predicates during scan 2021-09-17 16:20:42 -04:00
Marco Neumann ec943081c7 refactor: `Arc<Vec<...>>` => `Vec<Arc<...>>` for del predicates
The motivations are:

1. The API uses a SINGLE predicate and adds that to many chunks. With
   `Arc<Vec<...>>` you gain nothing, with `Vec<Arc<...>>` the predicate
   is only stored once (in many vectors)
2. While we currently add predicates blindly to all chunks, we can be way
   smarter in the future and prune out tables, partitions or even single
   chunks (based on statistics). With that, it will be rare that many
   chunks share the exact same set of predicates.
3. It would be nice if we could de-duplicate predicates when writing them
   to the preserved catalog without needing to repeat the pruning
   discussed in point 2. This is way easier to implement when chunks
   exist in `Arc`s.
4. As a side-note: the `Arc<Vec<...>>` wasn't really cloned around but
   instead was created many times. So the new version should be more
   memory efficient out of the box.
2021-09-16 17:16:09 +02:00
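The first motivation in the commit above (one predicate shared by many chunks) can be sketched roughly as follows. The `DeletePredicate` and `Chunk` structs and the `add_to_all` helper are hypothetical stand-ins, not the IOx types; the point is only that `Vec<Arc<...>>` stores the predicate once while each chunk keeps its own vector of pointers.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a delete predicate.
#[derive(Debug)]
pub struct DeletePredicate {
    pub expr: String,
}

/// Hypothetical chunk that keeps its own list of shared predicates.
#[derive(Debug, Default)]
pub struct Chunk {
    pub delete_predicates: Vec<Arc<DeletePredicate>>,
}

/// Adds one predicate to every chunk; only the `Arc` pointer is cloned.
pub fn add_to_all(chunks: &mut [Chunk], predicate: Arc<DeletePredicate>) {
    for chunk in chunks.iter_mut() {
        chunk.delete_predicates.push(Arc::clone(&predicate));
    }
}

fn main() {
    let mut chunks = vec![Chunk::default(), Chunk::default(), Chunk::default()];
    let predicate = Arc::new(DeletePredicate {
        expr: "host = 'a'".to_string(),
    });
    add_to_all(&mut chunks, Arc::clone(&predicate));

    // One allocation, referenced from `main` plus each of the three chunks.
    assert_eq!(Arc::strong_count(&predicate), 4);
    assert!(Arc::ptr_eq(
        &chunks[0].delete_predicates[0],
        &chunks[2].delete_predicates[0]
    ));
}
```

Because each chunk owns its vector, later pruning could give different chunks different predicate sets without copying the predicates themselves.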
Andrew Lamb ce224bd37f
fix: Capture query execution traces for storage gRPC queries as well (#2553)
* fix: Capture query execution traces for storage gRPC queries as well

* refactor: remove debugging droppings

* refactor: do not Box::pin within TracedStream

* refactor: Use Futures::TryStreamExt rather than custom collect function

* fix: remove wild println

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-16 14:45:20 +00:00
kodiakhq[bot] 33cd1cffad
Merge branch 'main' into ntran/delete_read 2021-09-16 13:22:50 +00:00
Andrew Lamb a478138756
refactor: Add SpanContext::new() to make a new span (#2551)
* refactor: Add SpanContext::new() and remove make_span

* fix: generate random trace_id and span_ids

* docs: Update trace/src/ctx.rs

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-16 11:02:28 +00:00
Nga Tran 7175488133 chore: add some comments 2021-09-15 14:45:04 -04:00
Nga Tran 3486cc8b38 fix: should not send an empty delete predicate, which means delete everything (no time range) 2021-09-15 14:14:26 -04:00
Andrew Lamb 74d3c2e6d2
feat: Translate DataFusion execution metrics to IOx Spans (#2529)
* feat: Translate DataFusion execution metrics to IOx Spans

* fix: add end to end test to ensure plumbing is hooked up

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-15 18:14:23 +00:00
kodiakhq[bot] de732b4273
Merge branch 'main' into crepererum/parquet_file_wo_query 2021-09-15 07:15:19 +00:00
Nga Tran 63cc7b3fb0 test: more tests to discover what still needs to be done 2021-09-14 17:57:30 -04:00
Nga Tran f4f140d3b7 chore: merge main to branch 2021-09-14 13:25:32 -04:00
Marco Neumann 509c07330d refactor: decouple `parquet_file` from `query` 2021-09-14 18:26:16 +02:00
kodiakhq[bot] d60aa5940b
Merge branch 'main' into crepererum/chunk_order_type 2021-09-14 16:25:17 +00:00
Marco Neumann bfaba78dc3 refactor: move `predicate` into its own crate
Two reasons:

1. I wanna decouple `parquet_file` from `query` (nearly done, needs a
   small follow-up PR).
2. `predicate` will have more and more features (like serialization)
   which justifies a new home
2021-09-14 17:13:02 +02:00
Marco Neumann becef1c75f refactor: introduce `ChunkOrder` type 2021-09-14 17:10:23 +02:00
kodiakhq[bot] 9ea61cd434
Merge branch 'main' into crepererum/issue1963 2021-09-14 11:38:59 +00:00
Marco Neumann 4795bd5c9c refactor: stricter delete predicate TS parsing
As a nice side effect, the parser no longer depends on the line
protocol parser.
2021-09-14 13:14:19 +02:00
Marco Neumann 1b788732da fix: order chunks correctly during query processing
The query processing was implicitly relying on the order provided by the
catalog. This had two issues:

- this ordering was not defined in the API contract (neither via docs
  nor via typing)
- the order was based on chunk IDs which is not adequate in some cases
  (e.g. when chunks are created while a persistence operation is in
  progress)

Now we explicitly sort chunks by `(order, ID)`.

Fixes #1963.
2021-09-14 13:00:55 +02:00
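The explicit `(order, ID)` sort described in the commit above can be sketched with a tuple sort key. `ChunkMeta`, its field names, and `sort_chunks` are illustrative assumptions here, not the IOx API.

```rust
/// Hypothetical chunk metadata; field names are illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChunkMeta {
    pub id: u32,
    pub order: u32,
}

/// Sorts chunks by `(order, id)` so the result does not depend on the
/// order in which the catalog happened to return them.
pub fn sort_chunks(chunks: &mut [ChunkMeta]) {
    chunks.sort_by_key(|c| (c.order, c.id));
}

fn main() {
    let mut chunks = vec![
        ChunkMeta { id: 3, order: 1 },
        ChunkMeta { id: 1, order: 2 },
        ChunkMeta { id: 2, order: 1 },
    ];
    sort_chunks(&mut chunks);

    // Order is compared first; the ID only breaks ties.
    let ids: Vec<u32> = chunks.iter().map(|c| c.id).collect();
    assert_eq!(ids, vec![2, 3, 1]);
}
```

Note that the chunk with the lowest ID ends up last, which is the case the commit describes: ID order alone is not adequate.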
Nga Tran 042a78e5a7 feat: apply delete predicate during query to eliminate deleted data 2021-09-13 18:02:55 -04:00
Andrew Lamb 5eef76c868
chore: Update dependencies (including datafusion) (#2521)
* chore: Update datafusion deps to pre-release

* refactor: Update IOx to use new datafusion Statistics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 21:30:44 +00:00
Raphael Taylor-Davies 20143e4f4e
feat: migrate chunk pruning metrics (#2516) 2021-09-13 13:13:47 +00:00
Andrew Lamb eb72799f0d
chore: Update datafusion (and arrow et al) dependencies (#2509)
* chore: update datafusion and other deps

* fix: Update InfluxRPC frontend with new op types

* fix: Update test output for new column names

* fix: typos and unintended changes

* fix: Update query_tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-10 17:57:25 +00:00
Marco Neumann 368f0369ee chore: Rust 1.55 2021-09-10 12:36:49 +02:00
Nga Tran 5d2b4a87c3 refactor: use snafu 2021-09-09 16:34:20 -04:00
Nga Tran 1effb11ad9 refactor: move delete parsing work out of influxdb_line_protocol crate 2021-09-09 15:27:17 -04:00
Nga Tran b4f8fad400 refactor: address review comments and remove no longer needed dependencies 2021-09-09 14:24:58 -04:00
Nga Tran 00df7b064c feat: finally have the delete predicate parsed 2021-09-08 17:30:10 -04:00
Nga Tran dbe4bcff22 chore: merge main to branch 2021-09-07 10:54:59 -04:00
Nga Tran 9ee1bdeeb9 refactor: address review comments 2021-09-07 10:24:38 -04:00
dependabot[bot] b67610d9b9
chore(deps): bump tokio from 1.10.1 to 1.11.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.10.1 to 1.11.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.10.1...tokio-1.11.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 09:11:38 +00:00
dependabot[bot] b1bb390893
chore(deps): bump parking_lot from 0.11.1 to 0.11.2
Bumps [parking_lot](https://github.com/Amanieu/parking_lot) from 0.11.1 to 0.11.2.
- [Release notes](https://github.com/Amanieu/parking_lot/releases)
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Amanieu/parking_lot/compare/0.11.1...0.11.2)

---
updated-dependencies:
- dependency-name: parking_lot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 01:18:24 +00:00
Nga Tran d8b3208896 refactor: start building parser on the client side 2021-09-03 17:56:45 -04:00
Nga Tran a4183de411 feat: more progress on the delete flow from grpc API to catalog chunks 2021-08-31 17:42:07 -04:00
Nga Tran b42784d8a0 refactor: address review comments 2021-08-30 16:31:59 -04:00
Nga Tran 7edb3fd270 fix: time column is always last even if the column is sorted lexicographically 2021-08-30 14:48:18 -04:00
Andrew Lamb f42f0349ed
feat: Implement basic metrics for `DeduplicateExec`, `IOxReadFilterNode`, `SchemaPivotExec` and `StreamSplitExec` (#2387)
* feat: Add baseline metrics to DeduplicateExec

* feat: Add metrics to `IOxReadFilterNode`

* feat: Add metrics for SchemaPivotExec

* feat: Add metrics to StreamSplitExec

* fix: Update for new API, cleanups

* test: Add tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 20:31:28 +00:00
Raphael Taylor-Davies e3e801d29a
feat: propagate span context into storage RPC queries (#2407)
* feat: propagate span context into storage RPC queries

* refactor: create ExecutionContextProvider trait

* chore: cleanup imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
Andrew Lamb ddf6c6362e
chore: update DataFusion again (#2411)
* chore: update datafusion ref

* chore: run cargo update

* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
kodiakhq[bot] 5d97acb2f3
Merge branch 'main' into crepererum/issue2372 2021-08-25 07:08:15 +00:00
Raphael Taylor-Davies f7792aafe6
feat: query tracing (#2273) (#2391)
* feat: query tracing (#2273)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 17:35:59 +00:00
Andrew Lamb 745eaa9248
chore: Update DataFusion + other deps (to get new Metric API) (#2385)
* chore: Update deps

* refactor: Update IOx to use new DataFusion Metric API

* fix: update Modulus --> Modulo
2021-08-24 16:07:23 +00:00
Marco Neumann 4f23d3b60b feat: shut down executor when `Executor` is dropped 2021-08-24 14:38:00 +02:00
Raphael Taylor-Davies a6c9cc2bf2
refactor: rework exec module (#2384)
* refactor: rework exec module

* chore: update docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 08:39:54 +00:00
Raphael Taylor-Davies 0946ffe916
refactor: reuse IOxExecutionContext (#2373)
* refactor: reuse IOxExecutionContext

* fix: orphaned comment

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 15:47:15 +00:00
Dom 3de6b44e23
build: use new rustdoc lint name (#2261)
* fix: nocache feature code rot

The MBChunk::snapshot code when using the "nocache" option no longer
compiles - this commit updates it to match the not(nocache) code.

* build: use updated broken_intra_doc_links name

The broken_intra_doc_links lint was renamed
rustdoc::broken_intra_doc_links

https://doc.rust-lang.org/rustdoc/lints.html
2021-08-11 19:48:51 +00:00
Andrew Lamb 559db4529d
refactor: Move DatabaseStore out of query crate (#2219)
* refactor: Move DatabaseStore out of query crate

* fix: doc links
2021-08-09 12:06:25 +00:00
Carol (Nichols || Goulding) 9d15798288 fix: Address or allow Clippy warnings new with Rust 1.54 2021-07-30 09:59:59 -04:00
Nga Tran e8828c22e4 refactor: address review comments 2021-07-29 13:38:42 -04:00
Nga Tran 0d05ac3961 feat: add sort option while building scan plan to avoid extra sort during compaction 2021-07-28 17:32:01 -04:00
Andrew Lamb e6cbd4d217
feat: Use statistics for count(*) queries (#2038)
* feat: Use statistics for count(*) queries

* docs: fix mangled comment

* refactor: rewrite to use fold

* refactor: use sort_by_cached_key

* fix: set null count properly

* fix: fmt + clippy
2021-07-28 19:39:41 +00:00
Andrew Lamb 5fb3e00f2a
fix: Properly record total_count and null_count in statistics (#2103)
* fix: Properly record total_count and null_count in statistics

* fix: fix statistics calculation in mutable_buffer

* refactor: expose null counts in read_buffer

* refactor: expose null_count in parquet_file

* fix: update server crate tests

* fix: update query_tests tests

* docs: tweak comments

* refactor: Use storage_stats rather than adding `null_count`

* refactor: rename test data field for clarity

* fix: fixup merge conflicts

* refactor: rename initial_non_null_count to initial_total_count

* refactor: calculate null_count as row_count - to_add
2021-07-26 18:13:36 +00:00
Andrew Lamb 01c79f1a1a
fix: Print all timestamps using RFC3339 format (#2098)
* fix: Use IOx pretty printer rather than arrow pretty printer

* chore: update tests in the query crate

* chore: update influxdb_iox tests

* chore: Update end to end tests

* chore: update query_tests

* chore: update mutable_buffer tests

* refactor: update parquet_file tests

* refactor: update db tests

* chore: update kafka integration test output

* fix: merge conflict
2021-07-22 19:04:52 +00:00
Nga Tran 11ba4b5f6a fix: fix unit_test setting to have the desired results 2021-07-22 14:22:08 -04:00
Nga Tran b2063fb29f test: fix the stats and discover a bug in compaction/split/deduplication 2021-07-21 17:40:48 -04:00
kodiakhq[bot] 18dd108ba6
Merge branch 'main' into ntran/dedup_compare_cols_order 2021-07-21 15:42:30 +00:00
Nga Tran 86add39175 refactor: address review comments 2021-07-21 11:41:21 -04:00
Nga Tran d547c22e97 refactor: comments 2021-07-20 15:27:41 -04:00
Nga Tran 150e166813 refactor: fix comments 2021-07-20 15:16:24 -04:00
Nga Tran fa6d216a85 refactor: cleanup 2021-07-20 15:11:02 -04:00
Nga Tran b98888e8d6 feat: implement key_ranges function that uses new range identify algo 2021-07-20 14:58:54 -04:00
Andrew Lamb 2c20528c69
chore: use upstream versions of some workarounds (#2057)
* chore: use upstream versions of some workarounds

* docs: update docstring

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 08:53:46 +00:00
Nga Tran 1668420ded feat: new algorithm to compute key ranges for deduplicating data 2021-07-19 18:04:25 -04:00
Andrew Lamb 1c16988a51
chore: Update datafusion references (#2056) 2021-07-19 18:09:06 +00:00
Andrew Lamb 4da8a16c18
chore: update to arrow 5.0 and master datafusion (#2049)
* chore: update to arrow 5.0 and master datafusion

* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Raphael Taylor-Davies 5fc98c7c56
feat: add failure reporting to TaskTracker (#2031)
* feat: add failure reporting to TaskTracker

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-19 09:17:20 +00:00
Andrew Lamb d00d56027b
docs: add comment to trigger build (#2039) 2021-07-16 17:53:55 +00:00
Nga Tran cfe0bfa88b refactor: address review comments and add useful log info to catch resort 2021-07-15 15:39:12 -04:00
Nga Tran 0b1f2b1fd0 chore: merge main to branch 2021-07-14 16:17:14 -04:00
Nga Tran ef271d1e1c test: make the tests clearer 2021-07-14 15:42:30 -04:00
Nga Tran b4d86dcb7d fix: make the order of sort key deterministic 2021-07-14 14:50:19 -04:00
Nga Tran 9ffaf863fa refactor: cleanup 2021-07-14 14:30:04 -04:00
Nga Tran 552e3fb691 fix: Padd stats compute deterministic order of sort key and update tests that got changed by the use of sort key 2021-07-14 14:06:41 -04:00
Edd Robinson 46ac15a77e refactor: increase compaction batch size 2021-07-14 17:19:11 +01:00
Nga Tran 8fd0df04f2 feat: continue building and using sort_key if available 2021-07-13 16:25:58 -04:00
Andrew Lamb 4800b36949 chore: Update IOx to a pre-release version of arrow and datafusion to test out performance improvement 2021-07-13 15:44:57 -04:00
Andrew Lamb 0164cabbf3
refactor: do not use DataFrame DataFusion API / stop optimizing twice (#1982)
* refactor: do not use DataFrame DataFusion API

* fix: update output to reflect not running optimizer twice

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 16:29:43 +00:00
Marco Neumann 2e391deb34 chore: update croaring to 0.5.0
Upstream changelog:

- CRoaring updated to 0.3.1
- `-march=native` is not a default for croaring-sys anymore
- Impl Default for `Bitmap` and `Treemap`
2021-07-13 15:15:41 +02:00
Andrew Lamb d35b74c226
fix: Fix doc build warnings (#1945)
* fix: Fix doc build warnings

* refactor: add deny bare_urls to crates

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 08:03:42 +00:00
Nga Tran 5418a1fe6b refactor: remove unused comments 2021-07-12 18:14:38 -04:00
Nga Tran 23895e6673 feat: Using sort_key to avoid resorts 2021-07-12 18:08:45 -04:00
kodiakhq[bot] f26f844ed2
Merge branch 'main' into ntran/use_sortkey 2021-07-12 18:12:47 +00:00
Carol (Nichols || Goulding) c681da1031 refactor: Define the TestChunk methods with macros 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 4e53a32928 refactor: Completely replace query::provider::overlap::TestChunk with query::test::TestChunk 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 1698edcc39 refactor: Implement query::provider::overlap::TestChunk in terms of query::test::TestChunk 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) dc0b97e121 refactor: Completely replace TestChunkMeta with TestChunk 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 96f9485792 refactor: Move a with_no_stats method to be entirely defined on TestChunk 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) b4c5a87088 refactor: Rename int field to i64 field to be more consistent 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 54f7ee8b8d refactor: Implement TestChunkMeta in terms of TestChunk
This is a temporary step to make sure TestChunk does everything
TestChunkMeta needs
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) ee545ce90e test: Make _with_stats methods able to optionally take max/min
Not used yet, but will be when this is unified with query/src/pruning.rs
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) b26aae1cb4 test: Add an arg to control whether to add a column summary at all
Always true for now, but there are some cases in query/src/pruning.rs
that don't add any column summaries that will use this with `false`.
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 6cd75bc688 test: Optionally take stats in add_schema_to_table
This gets rid of a lookup and construction of default stats that aren't
necessary
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) e05ca7f98b fix: Change a method name that says null to not say null
The comment and implementation seem to indicate this is creating
non-null data.
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 4406d8a219 test: Always initialize a TableSummary on TestChunk 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 22d4040c81 test: Always initialize a Schema for TestChunk 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 92cb5986f1 test: Initialize a schema on TestChunk to always exist 2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 78f1c4fc80 test: Chunks can only have one table; no need to specify repeatedly
This lets us make the name required and always present on TestChunks,
and make the ID optional.
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding) 15aac65c2c fix: Arrange use statements so rustfmt can manage their order 2021-07-12 09:59:11 -04:00
Nga Tran 7b7a60993d feat: consider time as a special key 2021-07-09 18:54:22 -04:00
Nga Tran 8f4463664c feat: add super_key function 2021-07-09 15:37:04 -04:00
Marco Neumann bc958e2ff0 refactor: use Arcs to pass schemas around 2021-07-09 09:45:12 +02:00
Marco Neumann 09e611deb7 refactor: lift query schema generation up to caller
No longer scan chunks during query planning to determine the schema
(except for the lifetime jobs where we have a good reason to do so).
Instead pass the schema down from whoever is triggering the query.
For real SQL queries, we then just use the table-wide schemas
introduced in #1913.

Apart from avoiding schema merges we now also don't crash any longer
when no chunks are left in the table (aka columns are present but all
rows are gone).

Fixes #1768.
Fixes #1884.
2021-07-09 09:24:21 +02:00
kodiakhq[bot] c8126784a8
Merge branch 'main' into ntran/avoid_sort_in_scan 2021-07-08 20:22:18 +00:00
Nga Tran 680394b50b refactor: run fmt 2021-07-08 16:21:42 -04:00
Nga Tran c5733ab4a7 refactor: remove redundant code 2021-07-08 16:11:42 -04:00
Nga Tran 6738cb272f refactor: remove duplicate test 2021-07-08 15:59:25 -04:00
Nga Tran da6249a4df fix: address reviewers' comments and also fix a bug they discovered 2021-07-08 15:54:54 -04:00
Andrew Lamb 33bc85ad18
feat: Infrastructure for persistence (#1925)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 11:14:38 +00:00
Andrew Lamb 7602bde850
chore: Update datafusion deps (#1799)
* chore: Update datafusion deps + rework code

* refactor: remove workaround as it has been contributed upstream

* fix: Update query/src/exec/split.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 10:58:32 +00:00
Nga Tran d3c4f8c249 fix: store sort key correctly in the schema. Update tests to reflect it 2021-07-07 15:55:23 -04:00
Andrew Lamb e6d995cbd8
chore: Update to Rust 1.53.0 (#1922)
* chore: Update to Rust 1.53.0

* fix: Update to latest clippy standards

* fix: bad refactor

* fix: Update escaping

* test: update test output

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 18:02:03 +00:00
Nga Tran 76789e5902 feat: store sort key into the chunk schema of RUB 2021-07-06 17:00:35 -04:00
Marco Neumann b6185982f7 refactor: make `ProviderBuilder` a build-time-checked builder
It's safer and also avoids cloning / copying state around.
2021-07-06 18:20:05 +02:00
Marco Neumann 4172d7946c refactor: make `SchemaMerger` self-consuming
The error handling in `merge` was incomplete, aka it could leave the
merger in a half-modified state in case of an error. That's generally a
bad idea and can lead to ugly bugs. Also the "builder" pattern that is
used here usually consumes itself (and provides a clone impl), so it is
easier to reason about modifications. So this commit just changes it to
self-consuming builder.

A nice side effect of the new pattern is also that it is build-time
checked and does not contain a runtime assert any longer.
2021-07-06 18:20:05 +02:00
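The self-consuming builder pattern described in the commit above can be sketched as follows. This `SchemaMerger` is a deliberately simplified, hypothetical version (a map from column name to a type name string), not the IOx implementation. Because `merge` takes `self` by value, a failed merge drops the merger, and any attempt to keep using a moved merger is rejected at compile time.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified merger: maps column names to type names.
#[derive(Debug, Clone, Default)]
pub struct SchemaMerger {
    columns: BTreeMap<String, String>,
}

impl SchemaMerger {
    pub fn new() -> Self {
        Self::default()
    }

    /// Consumes `self`; on error the merger is dropped, so no caller can
    /// observe a half-modified state.
    pub fn merge(mut self, columns: &[(&str, &str)]) -> Result<Self, String> {
        for (name, ty) in columns {
            if let Some(existing) = self.columns.get(*name) {
                if existing != ty {
                    return Err(format!(
                        "column {} has conflicting types: {} vs {}",
                        name, existing, ty
                    ));
                }
            }
            self.columns.insert(name.to_string(), ty.to_string());
        }
        Ok(self)
    }

    /// Finishes the builder and returns the merged columns.
    pub fn build(self) -> BTreeMap<String, String> {
        self.columns
    }
}

fn main() {
    let merged = SchemaMerger::new()
        .merge(&[("time", "i64"), ("host", "tag")])
        .and_then(|m| m.merge(&[("host", "tag"), ("usage", "f64")]))
        .map(SchemaMerger::build)
        .expect("schemas are compatible");
    assert_eq!(merged.len(), 3);

    let conflict = SchemaMerger::new()
        .merge(&[("usage", "f64")])
        .and_then(|m| m.merge(&[("usage", "i64")]));
    assert!(conflict.is_err());
}
```

A caller that wants to keep the pre-merge state can call `clone()` before `merge`, which matches the commit's remark that such builders usually provide a clone impl.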
Andrew Lamb 56c8c8d428
feat: Use separate executor for queries and compactions/moves (#1870)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:47:50 +00:00
Jacob Marble 0779b0d9bd
feat: add gRPC listener for new write protocol (#1842)
* feat: add gRPC listener for new write protocol

* chore: clippy happy

* chore: lint

* chore: cargo fmt --all

* chore: cargo clippy

* chore: protobuf-lint

* chore: more formatting

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:15:12 +00:00
kodiakhq[bot] e03a1a1def
Merge branch 'main' into ntran/dedup_less_concat 2021-07-01 15:59:22 +00:00
Nga Tran d0afc7a176 refactor: clean up and add a missing else case 2021-07-01 11:00:30 -04:00
Nga Tran 5cf623201d fix: deduplicate the last batch before sending it downstream 2021-07-01 10:45:23 -04:00
Andrew Lamb 7235c7b965
refactor: Remove vestigial execution counters (#1865)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 14:08:06 +00:00
Nga Tran ba919726b6 test: unit tests 2021-06-30 15:01:31 -04:00
Nga Tran 2a06b93b00 chore: Merge branch 'main' into ntran/dedup_less_concat 2021-06-30 11:37:15 -04:00
Nga Tran 1dbdabd66e fix: 2 values are also considered to be the same if at least one of them is invalid 2021-06-30 10:52:21 -04:00
Raphael Taylor-Davies 62d3305923
feat: optimize the dictionaries in the output of deduplicate node (#1827) (#1832)
* feat: optimize dedup dictionaries (#1827)

* fix: handle sliced null bitmasks

* chore: review feedback
2021-06-30 09:30:16 +00:00
Nga Tran e6a4e0d709 refactor: make the code clearer for schema even though they are the same 2021-06-29 17:46:30 -04:00
Nga Tran a249b90952 refactor: refactor and add temp info for debugging 2021-06-29 16:35:50 -04:00
Nga Tran 4611e5d584 chore: merge main to branch 2021-06-29 15:39:23 -04:00
Nga Tran 388e7b7650 fix: reset last_batch 2021-06-29 15:15:09 -04:00
Nga Tran 8f309eb569 feat: improve deduplicate to avoid as many concat_batches as possible 2021-06-29 14:41:54 -04:00
Edd Robinson 12ae9b012a refactor: clarify intent of 2021-06-28 17:39:48 +01:00
Andrew Lamb 2e5f10f6b1
feat: Sort the output of split_plans as well (#1800)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-25 13:02:30 +00:00
Andrew Lamb 4e7cf39b23
chore: Reduce debug logging in query crate (#1802) 2021-06-24 21:01:11 +00:00
Andrew Lamb 79446d45be
feat: Implement split_plans (#1794)
* feat: implement split plan / planner

* fix: Apply suggestions from code review

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* fix: resolve merge conflicts

* fix: add values to panic

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-06-24 18:38:00 +00:00
Raphael Taylor-Davies 297fc12db8
feat: compact chunks (#1776)
* feat: compact chunks

* chore: review feedback

* chore: clippy lints

* chore: document sort key algorithm

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-24 16:49:10 +00:00
Andrew Lamb 0a03605bbc
refactor: pull Channel --> Stream adapter into its own module (#1793)
* refactor: pull Channel --> Stream adapter into its own module

* docs: Update query/src/exec/stream.rs

Co-authored-by: Marko Mikulicic <mkm@influxdata.com>

Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
2021-06-24 10:35:45 +00:00
Andrew Lamb 60eb89cad1
feat: Reorg Planner for merge plans (#1780)
* feat: Reorg Planner

* docs: add example for split

* fix: clippy

* docs: Specify <= rather than < for split
2021-06-23 10:50:44 +00:00
Andrew Lamb 4c5007f961
fix: Select the correct timestamp for min/max selectors (#1771)
* test: Reproducer showing that the min/max selectors are order dependent

* fix: pick correct timestamp for first/last selectors

* refactor: remove println

* docs: Fixup comments and add to link to arrow-datafusion/issues/600

* fix: Add debug if timestamp is null

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-22 17:53:54 +00:00
Andrew Lamb 763ade390c
refactor: rename deduplicate --> overlap (#1779) 2021-06-22 17:07:53 +00:00
Andrew Lamb 5362c7c924
feat: enable query deduplication (#1762) 2021-06-21 18:49:04 +00:00
Andrew Lamb bed6ec8c31
feat: Handle merging chunks that have different schemas (#1761)
* feat: Handle merging chunks that have different schemas

* test: print out original (non deduplicated) data in tests
2021-06-21 15:52:13 +00:00
Andrew Lamb 6559a9e997
refactor: use Schema to compute InfluxDB primary keys (#1757)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-18 21:15:31 +00:00
Andrew Lamb de67bd3efe
refactor: Remove PartitionChunk::table_schema (#1756)
* refactor: Remove PartitionChunk::table_schema

* docs: update comments
2021-06-18 16:13:16 +00:00
Andrew Lamb 9beeca3e7c
refactor: Unify schema handling in query crate (#1755)
* refactor: Unify schema handling in query crate

* fix: doclink
2021-06-18 14:10:57 +00:00
Andrew Lamb 1c13d676b4
refactor: Rename query::PartitionChunk --> query::QueryChunk (#1754) 2021-06-18 13:24:09 +00:00
Andrew Lamb c5eea9af6a
feat: Implement DeduplicateExec (#1733)
* feat: Implement DeduplicateExec

* fix: Doc comments

* fix: fix comment

* fix: Update with arrow ticket references and use datafusion coalesce batches impl

* refactor: rename inner.rs to algo.rs

* docs: Add additional documentation on rationale for last field value

* docs: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* docs: Update query/src/provider/deduplicate/algo.rs

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* docs: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: do not use pub(crate)

* docs: fix test comments

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-06-17 14:17:52 +00:00
Andrew Lamb b42218a197
chore: Add proper format for SchemaPivotNode (#1744) 2021-06-17 11:32:48 +00:00
Raphael Taylor-Davies 38d17a3093
chore: remove unused query dependency (#1731)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-15 22:06:13 +00:00
Edd Robinson e2315f0016 refactor: revert read_filter debugging 2021-06-14 17:54:21 +01:00
Edd Robinson 6657e6f596 refactor: update query/src/exec/seriesset.rs 2021-06-14 16:09:02 +01:00
Edd Robinson 58f4073a7d
Merge branch 'main' into er/fix/dictionary_dupe_keys 2021-06-14 15:59:58 +01:00
Edd Robinson ec52bca309 fix: ensure values are different 2021-06-14 15:28:35 +01:00
kodiakhq[bot] cf6b658ee3
Merge branch 'main' into er/duplicate_keys 2021-06-14 11:10:45 +00:00
Andrew Lamb 0d8d32fd8f
chore: Update deps to get latest arrow (#1708)
* chore: Update deps to get latest arrow

* fix: Update to rust 1.52

* fix: clippy
2021-06-14 11:08:09 +00:00
Edd Robinson 1612ebcbdb refactor: more debug logging 2021-06-14 12:07:51 +01:00
Edd Robinson 927d6f890f
Merge branch 'main' into er/duplicate_keys 2021-06-14 10:29:46 +01:00
Edd Robinson 96fb595cc0 refactor: read_filter debugging 2021-06-14 10:22:05 +01:00
Nga Tran 11729b9aa7
test: select non-key from 2 chunks with different key/tag sets (#1703)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-11 18:52:36 +00:00
Nga Tran 736cf1ff6f
Merge branch 'main' into ntran/dedupe_final_union 2021-06-11 09:45:54 -04:00
Nga Tran 7dd0416960 refactor: address review comments 2021-06-11 09:43:39 -04:00
Nga Tran e34d157f28 fix: comments 2021-06-11 07:30:49 -04:00
Nga Tran ea9edef716 fix: testing option 2021-06-11 07:18:33 -04:00
Nga Tran fb639ee54f feat: add UnionExec on top of the scan activities 2021-06-11 07:06:08 -04:00
Andrew Lamb 13dd4b23fd
fix: make pruning debug log less confusing (#1684) 2021-06-10 18:35:04 +00:00
kodiakhq[bot] 16b268402e
Merge branch 'main' into ntran/dedup_merge_exec 2021-06-10 17:13:49 +00:00
Nga Tran 46d4ab1f2a refactor: address review comments 2021-06-10 13:13:02 -04:00
Marco Neumann 7b1106ff64 chore: enforce `clippy::future_not_send` for `query` 2021-06-10 09:48:35 +02:00
Nga Tran 4cf05df35b feat: hook SortPreservingMergeExec into deduplication framework 2021-06-09 23:29:44 -04:00
Nga Tran 4478d900ee refactor: capture test output 2021-06-09 15:09:13 -04:00
Nga Tran 8cc99e3420 Merge branch 'ntran/dedup_within_chunk' of https://github.com/influxdata/influxdb_iox into ntran/dedup_within_chunk 2021-06-09 14:40:29 -04:00
Nga Tran b3c94b9d65 refactor: change order of fields to pass circle CI tests 2021-06-09 14:40:10 -04:00
kodiakhq[bot] eed73a30c5
Merge branch 'main' into ntran/dedup_within_chunk 2021-06-09 18:19:17 +00:00
Nga Tran c1c58018fc refactor: address review comments 2021-06-09 14:17:47 -04:00
Andrew Lamb 89fcc457f4
fix: Fix bug in chunk overlap calculation due to nulls (#1669)
* fix: Fix bug in chunk overlap calculation due to nulls

* docs: add note about algorithmic complexity

* fix: avoid recursion in normal case
2021-06-09 17:46:39 +00:00
Raphael Taylor-Davies 07c4277ca7
refactor: schema merge to give more control over field merging (#1653)
* refactor: schema merge to give more control over field merging

* chore: review feedback
2021-06-09 06:30:45 +00:00
Nga Tran 3d50ff7a60 refactor: remove comments 2021-06-08 21:48:57 -04:00
Nga Tran ab7d3384b7 refactor: remove unused comments 2021-06-08 21:43:02 -04:00
Nga Tran 3e10351538 test: add tests for the sort plan 2021-06-08 21:40:46 -04:00
Andrew Lamb cba7f270b4
docs: Improve comments + whitespace (#1663) 2021-06-08 21:13:35 +00:00
Nga Tran 68e3a2121f feat: add SortExec 2021-06-08 15:04:31 -04:00
Andrew Lamb 666204d4a8 fix: remove whitespace changes 2021-06-08 14:46:55 -04:00
Andrew Lamb b23c4e5210 fix: clippy 2021-06-08 14:44:48 -04:00
Andrew Lamb fd8a87484e feat: Hook up chunk grouping into provider 2021-06-08 14:42:37 -04:00
Nga Tran edbf1b7d5e Merge branch 'main' into ntran/dedup_within_chunk 2021-06-08 13:18:40 -04:00
Nga Tran 40cb4f741f feat: initial implementation 2021-06-08 13:17:36 -04:00
Andrew Lamb 62e8675737
refactor: move primary_key calculation to TableSummary (#1659) 2021-06-08 17:06:37 +00:00
Andrew Lamb 34ba268cf1
feat: Group chunks by potential overlap (#1654)
* feat: Group chunks by potential overlap

* docs: clarify in what way the calculation is conservative

* fix: Add test for mixed nulls
2021-06-08 16:55:29 +00:00
Edd Robinson b88f277477 feat: enable not eq operator 2021-06-08 15:57:07 +01:00
Andrew Lamb e9834a907c
feat: Prune on boolean column predicates too (#1629)
* chore: update deps to get latest DataFusion

* fix: enable boolean pruning tests

* fix: update explain plan tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-07 16:51:30 +00:00
Nga Tran ff641e5638 refactor: address Andrew's comments 2021-06-06 22:36:44 -04:00
Nga Tran 2f82a9d670 feat: full foundation for deduplicate with todo functions to finish 2021-06-06 22:09:01 -04:00
Andrew Lamb ff3215e6a9
feat: Implement Chunk Pruning (#1567) 2021-06-04 13:05:22 +00:00
Andrew Lamb c986ce2c19
feat: Add pruning module to query crate (#1611)
* feat: Add pruning module

* fix: clippy

* fix: Apply suggestions from code review

* fix: remove erroneous claims of DF bugs

* fix: update comments with DF bug reference
2021-06-03 11:07:26 +00:00
Nga Tran e7a97f3ac1 test: merge main and add more tests for deduplicate work 2021-06-02 12:00:40 -04:00
Nga Tran 60ad929721 refactor: add macro to compare output of explains 2021-06-01 16:39:14 -04:00
Nga Tran aa867601e5 chore: merge main with DF plan display fix 2021-06-01 16:17:41 -04:00
Andrew Lamb d8fbb7b410
refactor: Remove last vestiges of multi-table chunks from PartitionChunk API (#1588)
* refactor: Remove last vestiges of multi-table chunks from PartitionChunk API

* fix: remove test that can no longer fail

* fix: update tests + code review comments

* fix: clippy

* fix: clippy

* fix: restore test_measurement_fields_error test
2021-06-01 16:12:33 +00:00
Andrew Lamb d3711a5591
refactor: Use ParquetExec from DataFusion to read parquet files (#1580)
* refactor: use ParquetExec to read parquet files

* fix: test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-01 14:44:07 +00:00
Andrew Lamb 162a808a8d
refactor: Remove `table_name` from PartitionChunk API (#1584)
* refactor: Remove `table_name` from PartitionChunk API

* fix: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-31 12:05:09 +00:00
Andrew Lamb d50c7c8919
chore: remove unused dependency (#1581) 2021-05-31 09:58:10 +00:00
Nga Tran 62147ff0d4 feat: add more explain tests 2021-05-27 12:19:41 -04:00
Raphael Taylor-Davies 5d342d7779
feat: associate tracker with lifecycle action (#1099) (#1556)
* feat: associate tracker with lifecycle action (#1099)

* chore: docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-27 10:47:35 +00:00
Raphael Taylor-Davies 4fcc04e6c9
chore: enable arrow prettyprint feature (#1566) 2021-05-27 10:28:14 +00:00
Raphael Taylor-Davies c2fd85209c
feat: wait for task shutdown on DedicatedExecutor (#1537) 2021-05-25 11:33:55 +00:00
Andrew Lamb 14ba25f86d
chore: Update datafusion and use released version of arrow crates (#1546)
* chore: Update datafusion and use released version of arrow crate

* fix: Update for change in API
2021-05-24 15:37:22 +00:00
Nga Tran 0563005aac chore: remove leftover comments 2021-05-21 17:01:49 -04:00
Nga Tran f113abacb5 feat: more unit & e2e tests plus cleanup and addressing review comments of Andrew and Edd 2021-05-21 16:48:43 -04:00
Nga Tran e44a3a87db feat: predicate is now actually pushed down to RUB, but there are bugs and it is not working yet 2021-05-20 16:56:15 -04:00
Nga Tran 51de37e752 chore: run fmt 2021-05-19 15:28:44 -04:00
Nga Tran 11561111d5 chore: merge main to branch 2021-05-19 15:11:15 -04:00
Nga Tran 1f13842550 chore: modify comments 2021-05-19 14:49:48 -04:00
Nga Tran 087d61f229 feat: Part 1 of predicate push down - Send predicates to MUB, RUB, and Parquet File. Note that MUB has not handled predicates yet 2021-05-19 13:59:51 -04:00
Andrew Lamb 7e223780f3
feat: Implement Display for query::predicate to improve debug printing of plans (#1519)
* feat: Implement Display for query::predicate to improve debug printing of plans

* fix: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-19 12:38:34 +00:00
Andrew Lamb 0680a5167f
chore: Improve DataFusion plan logging (#1508)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-18 11:08:06 +00:00
Andrew Lamb 07db4932ee
refactor: rename data_types/src/chunk.rs -> data_types/src/chunk_metadata.rs (#1500) 2021-05-15 10:18:01 +00:00
Edd Robinson 8ccc359cab refactor: address PR feedback 2021-05-07 13:48:44 +01:00
Edd Robinson 4c4bd2f164 refactor: update query/src/func/regex.rs 2021-05-07 13:44:51 +01:00
Edd Robinson 4cc7a99854 refactor: include not match in support check 2021-05-07 13:44:51 +01:00
Edd Robinson beee3115f4 feat: expose regex =~ and to gRPC API 2021-05-07 13:44:51 +01:00
Edd Robinson eae3fec571 feat: wire up regex UDF as predicate filter expr 2021-05-07 13:44:51 +01:00
Edd Robinson 3fc2c9fc04 feat: add DataFusion regex match operator
This commit adds a new custom UDF to IOx that provides a regex operator to DataFusion plans.
Effectively it allows predicates to contain regex operators that are applied as filters, only allowing rows that satisfy the regex to be returned.

I did not use the Arrow regex kernel for this work because that does not return a boolean array indicating which rows matched a regex, but instead returns a new string array of results. This doesn't work well with DF's approach to filtering.
2021-05-07 13:44:51 +01:00
Carol (Nichols || Goulding) febc1538ff
chore: Update Rust version (#1445)
* chore: Update Rust version

* refactor: Make struct constructor field orderings consistent

Sometimes I changed the struct definition, sometimes changed the struct
construction instance, depending on consistency with code around each
(other similar structs, function argument orders, etc)

More info: https://rust-lang.github.io/rust-clippy/master/index.html#inconsistent_struct_constructor

* refactor: Use flatten where appropriate

One instance is a false positive with a clippy bug.

More info:

- https://rust-lang.github.io/rust-clippy/master/index.html#filter_map_identity
- https://rust-lang.github.io/rust-clippy/master/index.html#manual_flatten

* refactor: Use Option map instead of match

More info: https://rust-lang.github.io/rust-clippy/master/index.html#manual_map

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-06 22:07:10 +00:00
Raphael Taylor-Davies 44de42906f
refactor: use Arc<str> instead of Arc<String> (#1442) 2021-05-06 17:05:08 +00:00
Raphael Taylor-Davies 411cf134e9
refactor: explode arrow_deps (#1425)
* refactor: explode arrow_deps

* chore: workaround doctest bug
2021-05-05 16:59:12 +00:00
Edd Robinson 2f789485e6 refactor: fix spelling 2021-05-05 11:06:04 +01:00
Andrew Lamb 3b7c5ac350
fix(storage rpc): do not send back tags with empty values (#1403) 2021-05-04 10:35:24 +00:00
Andrew Lamb 40b9b09cdc
refactor: rename assert_table_eq to assert_batches_eq (#1368) 2021-04-30 10:51:08 +00:00
Andrew Lamb eb8d91cf1c
refactor: remove additional uses of RecordBatch::try_new (#1378)
* refactor: remove additional uses of RecordBatch::try_new

* fix: fix accidental change

* fix: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-30 10:24:47 +00:00
Edd Robinson 13fbf2e68d refactor: plumb registry to gRPC server 2021-04-29 14:00:05 +01:00
Edd Robinson 4acbdcf1c9 refactor: address PR feedback 2021-04-28 16:11:57 +00:00
Edd Robinson a9ef604ef6 perf: avoid using channels for query execution
Pre-sized channels get full when the results to send over them are larger than the capacities. This causes significant runtime overhead and slows down query performance.

This commit removes the intermediate channels. The potential downside of this approach is that there may be more buffering, which could increase memory usage during query execution and block a thread for longer periods of time.
2021-04-28 16:11:57 +00:00
Raphael Taylor-Davies 7ca1da3fcd
feat: pushdown table and partition key predicates to catalog (#736) (#1327)
* feat: catalog predicate pushdown (#736)

* chore: fix lints

* chore: review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 15:31:47 +00:00
Marco Neumann 91bccdfca3 ci: pass `--document-private-items` to `cargo doc` 2021-04-27 15:42:07 +02:00
Marco Neumann eddc9319ff docs: deny broken intradoc links 2021-04-27 13:22:28 +02:00
Raphael Taylor-Davies 20117de078
feat: string dictionary encoding (#1220) (#1262)
* feat: string dictionary encoding (#1220)

* chore: review comments

* chore: fix lint

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 09:36:58 +00:00
Edd Robinson a322d05838 refactor: rust fmt 2021-04-20 17:30:50 +00:00