Nga Tran
c755678aa1
refactor: address review comments
2021-10-19 13:41:41 -04:00
Nga Tran
cabb007956
chore: Merge branch 'main' into ntran/table_names
2021-10-19 13:22:28 -04:00
Andrew Lamb
a82dc6f5f0
chore: Update datafusion + arrow ( #2903 )
...
* chore: Update datafusion to latest, arrow to 6.0.0
* fix: Update tests
* fix: bubble internal error
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-19 17:14:08 +00:00
Nga Tran
ea85d6478e
chore: remove code changes for table_names because the normal plan needs to be implemented first
2021-10-19 10:36:10 -04:00
Nga Tran
afa6e50c9c
feat: make tag_keys work with delete
2021-10-18 15:36:19 -04:00
Andrew Lamb
f5a84122e3
feat: Support grouping by _field and _measurement ( #2874 )
...
* feat: Support grouping by _field and _measurement
* fix: clippy
* fix: doclink
* refactor: rename SeriesOrGroup --> Either
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-18 15:32:24 +00:00
Andrew Lamb
51276119df
docs: Add better SeriesSet explanation ( #2857 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-15 21:53:51 +00:00
Andrew Lamb
beaf77cecf
refactor: move Series translation logic into query crate, update gRPC tests ( #2852 )
...
* refactor: move Series translation logic into query crate
* refactor: update grpc_tests to use new display
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-15 11:06:40 +00:00
Andrew Lamb
08e6a01e69
refactor: Move series set converter into its own module ( #2847 )
...
* refactor: Move series set converter into its own module
* fix: add file
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-14 14:16:32 +00:00
Edd Robinson
8342e138b0
refactor: PR feedback
...
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-10-14 09:23:29 +01:00
Edd Robinson
fcf31eebb2
fix: use correct timestamps for selectors
2021-10-14 09:23:29 +01:00
Andrew Lamb
e0929f20ae
refactor: Pull out read_group order creation ( #2832 )
2021-10-13 19:05:15 +00:00
Andrew Lamb
d2cf6fa9f7
docs: improve docstring for InfluxRPCPlanner::read_group ( #2827 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-13 14:48:32 +00:00
Nga Tran
04468f3208
chore: more cleanup
2021-10-12 17:27:04 -04:00
Nga Tran
459dd46ae9
refactor: move delete tests to .sql
2021-10-12 15:49:23 -04:00
Nga Tran
0b4ae95ca4
refactor: exhaust scenarios for one-chunk test
2021-10-11 17:47:41 -04:00
kodiakhq[bot]
6b42f1cdbb
Merge branch 'main' into ntran/count_star
2021-10-11 15:01:00 +00:00
Nga Tran
1718b55283
chore: Merge branch 'ntran/count_star' of https://github.com/influxdata/influxdb_iox into ntran/count_star
2021-10-11 10:47:38 -04:00
Nga Tran
fbf5539336
chore: merge main to branch
2021-10-11 10:47:10 -04:00
Marco Neumann
24ae269b3a
refactor: cancel executor jobs on drop
...
Our executor is not meant as a fire-and-forget system. Instead the
submitter should always poll the result. Dropping the receiver side (aka
the job handle) should cancel the job.
2021-10-11 16:13:04 +02:00
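The cancel-on-drop behaviour described in the commit above can be illustrated with a small sketch. This is not the IOx executor: it swaps in a plain `std::thread` worker and an atomic flag, and the names `JobHandle`, `spawn`, `join`, and `cancel_flag` are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical job handle: dropping it asks the worker to stop.
pub struct JobHandle {
    cancelled: Arc<AtomicBool>,
    worker: Option<thread::JoinHandle<u64>>,
}

impl JobHandle {
    /// Spawn a worker that counts steps until it reaches `limit` or is cancelled.
    pub fn spawn(limit: u64) -> Self {
        let cancelled = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&cancelled);
        let worker = thread::spawn(move || {
            let mut steps = 0;
            while steps < limit && !flag.load(Ordering::SeqCst) {
                steps += 1;
                thread::sleep(Duration::from_millis(1));
            }
            steps
        });
        Self { cancelled, worker: Some(worker) }
    }

    /// The intended path: the submitter waits for the result.
    pub fn join(mut self) -> u64 {
        let worker = self.worker.take().expect("worker is present until join");
        worker.join().expect("worker panicked")
    }

    /// Exposed only so the sketch can observe the cancellation signal.
    pub fn cancel_flag(&self) -> Arc<AtomicBool> {
        Arc::clone(&self.cancelled)
    }
}

impl Drop for JobHandle {
    fn drop(&mut self) {
        // Dropping the handle signals the worker to stop at its next check.
        self.cancelled.store(true, Ordering::SeqCst);
    }
}
```

A caller that keeps the handle and calls `join` gets the full result; a caller that drops the handle leaves a worker that exits at its next flag check instead of running to completion.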
Nga Tran
d13e61c201
fix: Apply suggestions from code review
...
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-10-11 09:33:23 -04:00
Raphael Taylor-Davies
afe34751e7
refactor: split out schema crate ( #2781 )
...
* refactor: split out schema crate
* chore: fix doc
2021-10-11 09:45:08 +00:00
Nga Tran
d0a17ca79d
refactor: address Edd's review comments
2021-10-08 16:47:37 -04:00
Nga Tran
f7475322a6
chore: merge main to branch, resolve conflicts, and discover an inconsistent bug
2021-10-08 15:50:46 -04:00
Nga Tran
bea310db76
chore: remove comments
2021-10-08 14:58:06 -04:00
Nga Tran
2556639bb5
chore: more cleanup
2021-10-08 14:57:22 -04:00
Nga Tran
f2cdb9531f
chore: cleanup
2021-10-08 14:52:15 -04:00
Nga Tran
adbcd85c26
fix: fully fix 2745
2021-10-08 14:37:34 -04:00
Andrew Lamb
2072b4066e
feat: Implement support for `_measurement` predicate in gRPC plans ( #2772 )
...
* feat: Implement filtering for _measurement in general purpose gRPC plans
* docs: fixup docstrings
* fix: fmt
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-08 17:25:33 +00:00
Nga Tran
22d6f11bea
fix: add cols of delete predicates into the schema of scanning columns
2021-10-07 17:37:34 -04:00
Andrew Lamb
e590ac4da2
fix: remove outdated "supported predicate" check in gRPC planner ( #2763 )
2021-10-07 20:34:05 +00:00
Andrew Lamb
c7727f1b5b
chore: Update datafusion + other deps ( #2760 )
...
* chore: Update datafusion and other deps
* fix: fmt
* fix: cleanup workaround
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-07 16:48:11 +00:00
Raphael Taylor-Davies
d4bc33b921
refactor: use RWLock instead of TaskTracker for query exec shutdown ( #2761 )
2021-10-07 14:18:40 +00:00
Andrew Lamb
0b4fd01d04
fix: Cast count aggregates correctly ( #2756 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-07 10:49:15 +00:00
Marco Neumann
63d74be490
refactor: make `ChunkId` a UUID
2021-10-07 10:23:27 +02:00
Nga Tran
de148337e8
fix: partially fix the bug by including the schema of columns in delete predicates in the schema of the IOx scan, to avoid missing columns when reading
2021-10-06 17:43:48 -04:00
Andrew Lamb
efa2316626
fix: do not sort the output of read_group with no group keys ( #2755 )
2021-10-06 18:59:58 +00:00
Nga Tran
65a02f7085
refactor: Apply suggestions from code review
...
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-10-06 10:04:28 -04:00
Edd Robinson
ea26a77217
Merge branch 'main' into ntran/no_use_stats
2021-10-06 09:30:18 +01:00
Nga Tran
aa64daca86
feat: disable using statistics to query data if there are soft deleted rows
2021-10-05 17:52:32 -04:00
Andrew Lamb
785a62c114
fix: include all group tags, not just group_keys in GroupFrame response ( #2741 )
...
* fix: include all group tags, not just group_keys in GroupFrame response
* docs: fix test comments, add doc strings for group_description_to_frames
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-05 19:08:04 +00:00
Marco Neumann
97881079e8
refactor: make `ChunkOrder` non-zero
...
This will make it easier to handle missing values.
Helps with #2633 .
2021-10-04 17:49:12 +02:00
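One plausible reading of "easier to handle missing values" in the commit above is that a non-zero type lets 0 stand for "no order" without a separate flag. The sketch below assumes a 32-bit order and uses hypothetical helpers `encode` and `decode`; the real `ChunkOrder` may differ.

```rust
use std::num::NonZeroU32;

/// Hypothetical stand-in for a non-zero chunk order (32-bit width is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ChunkOrder(NonZeroU32);

impl ChunkOrder {
    /// Returns `None` for 0, which is reserved to mean "missing".
    pub fn new(order: u32) -> Option<Self> {
        NonZeroU32::new(order).map(Self)
    }

    /// The underlying value, guaranteed to be non-zero.
    pub fn get(self) -> u32 {
        self.0.get()
    }
}

/// Decode a raw value in which 0 encodes "no order".
pub fn decode(raw: u32) -> Option<ChunkOrder> {
    ChunkOrder::new(raw)
}

/// Encode an optional order, mapping "missing" back to 0.
pub fn encode(order: Option<ChunkOrder>) -> u32 {
    order.map_or(0, ChunkOrder::get)
}
```

Because `Option<NonZeroU32>` is guaranteed to have the same size as `u32`, representing the missing case this way costs no extra space.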
Marco Neumann
5a5a929b9e
refactor: introduce `DeletePredicate`
...
`DeletePredicate` is a simpler version of `Predicate` that is based on
IOx `DeleteExpr` instead of the full-blown DataFusion `Expr`. This will
allow us to do a couple of things (in follow-up changes):
- Order and de-duplicate delete predicates
- Normalize predicates
- Infallible serialization
- Smaller memory footprint
Note that this change only affects delete expressions. Query expressions
that are supported via the API are not changed. The query subsystem also
still uses the full-featured expressions/predicates (delete
expressions/predicates are converted to the more powerful DataFusion
version on-the-fly).
2021-10-04 16:36:20 +02:00
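To illustrate why a simpler predicate type makes ordering and de-duplication straightforward, here is a minimal sketch. The field names (`start`, `stop`, `exprs`, `column`, `op`, `value`) and the `normalize` helper are assumptions for illustration, not the actual IOx definitions.

```rust
/// Hypothetical comparison operators for a delete expression.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum Op {
    Eq,
    Ne,
}

/// Hypothetical simple expression: `column op value`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct DeleteExpr {
    pub column: String,
    pub op: Op,
    pub value: String,
}

impl DeleteExpr {
    pub fn new(column: &str, op: Op, value: &str) -> Self {
        Self { column: column.to_string(), op, value: value.to_string() }
    }
}

/// Hypothetical delete predicate: a time range plus simple expressions.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct DeletePredicate {
    pub start: i64,
    pub stop: i64,
    pub exprs: Vec<DeleteExpr>,
}

/// Normalize each predicate (sorted, unique expressions), then order and
/// de-duplicate the predicates themselves.
pub fn normalize(mut predicates: Vec<DeletePredicate>) -> Vec<DeletePredicate> {
    for predicate in &mut predicates {
        predicate.exprs.sort();
        predicate.exprs.dedup();
    }
    predicates.sort();
    predicates.dedup();
    predicates
}
```

Since every field is plain data, the comparison and hashing traits can be derived, which is harder to arrange for a general expression tree.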
dependabot[bot]
d1f5209869
chore(deps): bump arrow from 5.4.0 to 5.5.0
...
Bumps [arrow](https://github.com/apache/arrow-rs ) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/apache/arrow-rs/releases )
- [Changelog](https://github.com/apache/arrow-rs/blob/5.5.0/CHANGELOG.md )
- [Commits](https://github.com/apache/arrow-rs/compare/5.4.0...5.5.0 )
---
updated-dependencies:
- dependency-name: arrow
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2021-10-04 08:55:38 +00:00
Andrew Lamb
134fb96b26
feat: add UInt64 support for gRPC query results ( #2701 )
2021-10-01 17:18:32 +00:00
Andrew Lamb
2db56a0332
chore: Make query logging a bit less verbose ( #2655 )
...
* chore: Make query logging a bit less verbose
* fix: remove unused use
2021-09-28 20:58:37 +00:00
Andrew Lamb
a55a21c644
chore: Update datafusion ( #2635 )
...
* chore: Update datafusion and sqlparser
* fix: remove STACK_SIZE workaround
* chore: update datafusion_util
* chore: update predicate
* chore: update query_tests
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 14:13:19 +00:00
Andrew Lamb
d38648952c
chore: Update datafusion ( #2602 )
...
* chore: Update datafusion + other deps
* refactor: update query crate for new async interfaces
* refactor: update server crate for new async interface
* refactor: update query_tests crate for new async interfaces
* refactor: update ioxd and server to use new async interface
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-22 10:33:25 +00:00
Nga Tran
0fc2567161
refactor: address review comment
2021-09-21 16:50:29 -04:00
Nga Tran
a06afb3932
feat: optimize scan for chunks without delete predicates and without the need to sort output
2021-09-21 15:56:19 -04:00
Nga Tran
61ed67a5d9
refactor: cleanup
2021-09-21 12:28:18 -04:00
Nga Tran
93551bdd1e
fix: all chunks now are applied delete predicates during scan
2021-09-21 12:17:59 -04:00
kodiakhq[bot]
77d84ca5ab
Merge branch 'main' into crepererum/chunk_id
2021-09-20 13:39:05 +00:00
Marco Neumann
cef5aeee52
refactor: introduce `ChunkId` type
2021-09-20 13:10:41 +02:00
Nga Tran
364d245eae
feat: apply negated delete predicates during scan
2021-09-17 16:20:42 -04:00
Marco Neumann
ec943081c7
refactor: `Arc<Vec<...>>` => `Vec<Arc<...>>` for del predicates
...
The motivations are:
1. The API uses a SINGLE predicate and adds that to many chunks. With
`Arc<Vec<...>>` you gain nothing, with `Vec<Arc<...>>` the predicate
is only stored once (in many vectors)
2. While we currently add predicates blindly to all chunks, we can be way
smarter in the future and prune out tables, partitions or even single
chunks (based on statistics). With that, it will be rare that many
chunks share the exact same set of predicates.
3. It would be nice if we could de-duplicate predicates when writing them
to the preserved catalog without needing to repeat the pruning
discussed in point 2. This is way easier to implement when chunks
exist in `Arc`s.
4. As a side-note: the `Arc<Vec<...>>` wasn't really cloned around but
instead was created many times. So the new version should be more
memory efficient out of the box.
2021-09-16 17:16:09 +02:00
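Point 1 of the commit above can be sketched as follows: with `Vec<Arc<...>>`, one predicate allocation is shared by every chunk's vector. The `Predicate` and `Chunk` types and the `add_predicate` function are simplified, hypothetical stand-ins.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a delete predicate.
#[derive(Debug)]
pub struct Predicate(pub String);

/// Hypothetical chunk holding shared references to its delete predicates.
#[derive(Debug, Default)]
pub struct Chunk {
    pub delete_predicates: Vec<Arc<Predicate>>,
}

/// Add a single predicate to many chunks. The predicate is allocated once;
/// each chunk's vector stores only a reference-counted pointer to it.
pub fn add_predicate(chunks: &mut [Chunk], predicate: Predicate) -> Arc<Predicate> {
    let shared = Arc::new(predicate);
    for chunk in chunks.iter_mut() {
        chunk.delete_predicates.push(Arc::clone(&shared));
    }
    shared
}
```

Each chunk still owns its own `Vec`, so individual chunks can later hold different predicate sets (point 2) while identical predicates stay shared.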
Andrew Lamb
ce224bd37f
fix: Capture query execution traces for storage gRPC queries as well ( #2553 )
...
* fix: Capture query execution traces for storage gRPC queries as well
* refactor: remove debugging droppings
* refactor: do not Box::pin within TracedStream
* refactor: Use Futures::TryStreamExt rather than custom collect function
* fix: remove wild println
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-16 14:45:20 +00:00
kodiakhq[bot]
33cd1cffad
Merge branch 'main' into ntran/delete_read
2021-09-16 13:22:50 +00:00
Andrew Lamb
a478138756
refactor: Add SpanContext::new() to make a new span ( #2551 )
...
* refactor: Add SpanContext::new() and remove make_span
* fix: generate random trace_id and span_ids
* docs: Update trace/src/ctx.rs
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-16 11:02:28 +00:00
Nga Tran
7175488133
chore: add some comments
2021-09-15 14:45:04 -04:00
Nga Tran
3486cc8b38
fix: should not send an empty delete predicate, which means delete everything (no time range)
2021-09-15 14:14:26 -04:00
Andrew Lamb
74d3c2e6d2
feat: Translate DataFusion execution metrics to IOx Spans ( #2529 )
...
* feat: Translate DataFusion execution metrics to IOx Spans
* fix: add end to end test to ensure plumbing is hooked up
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-15 18:14:23 +00:00
kodiakhq[bot]
de732b4273
Merge branch 'main' into crepererum/parquet_file_wo_query
2021-09-15 07:15:19 +00:00
Nga Tran
63cc7b3fb0
test: more tests to discover what still needs to be done
2021-09-14 17:57:30 -04:00
Nga Tran
f4f140d3b7
chore: merge main to branch
2021-09-14 13:25:32 -04:00
Marco Neumann
509c07330d
refactor: decouple `parquet_file` from `query`
2021-09-14 18:26:16 +02:00
kodiakhq[bot]
d60aa5940b
Merge branch 'main' into crepererum/chunk_order_type
2021-09-14 16:25:17 +00:00
Marco Neumann
bfaba78dc3
refactor: move `predicate` into its own crate
...
Two reasons:
1. I wanna decouple `parquet_file` from `query` (nearly done, needs a
small follow-up PR).
2. `predicate` will have more and more features (like serialization)
which justifies a new home
2021-09-14 17:13:02 +02:00
Marco Neumann
becef1c75f
refactor: introduce `ChunkOrder` type
2021-09-14 17:10:23 +02:00
kodiakhq[bot]
9ea61cd434
Merge branch 'main' into crepererum/issue1963
2021-09-14 11:38:59 +00:00
Marco Neumann
4795bd5c9c
refactor: stricter delete predicate TS parsing
...
As a nice side effect, the parser no longer depends on the line
protocol parser.
2021-09-14 13:14:19 +02:00
Marco Neumann
1b788732da
fix: order chunks correctly during query processing
...
The query processing was implicitly relying on the order provided by the
catalog. This had two issues:
- this ordering was not defined in the API contract (neither via docs
nor via typing)
- the order was based on chunk IDs which is not adequate in some cases
(e.g. when chunks are created while a persistence operation is in
progress)
Now we explicitly sort chunks by `(order, ID)`.
Fixes #1963 .
2021-09-14 13:00:55 +02:00
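The explicit `(order, ID)` sort from the commit above can be sketched like this. The `Chunk` struct with plain `u32` fields is a simplified assumption; the real types are richer.

```rust
/// Hypothetical, simplified chunk metadata.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Chunk {
    pub id: u32,
    pub order: u32,
}

/// Sort chunks explicitly by `(order, id)` instead of relying on whatever
/// order the catalog happens to return.
pub fn sort_chunks(chunks: &mut [Chunk]) {
    chunks.sort_by_key(|chunk| (chunk.order, chunk.id));
}
```

A chunk created late (high ID) can still carry a low order, for example if it was created while a persistence operation was in progress, so sorting by ID alone would place it incorrectly.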
Nga Tran
042a78e5a7
feat: apply delete predicate during query to eliminate deleted data
2021-09-13 18:02:55 -04:00
Andrew Lamb
5eef76c868
chore: Update dependencies (including datafusion) ( #2521 )
...
* chore: Update datafusion deps to pre-release
* refactor: Update IOx to use new datafusion Statistics
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 21:30:44 +00:00
Raphael Taylor-Davies
20143e4f4e
feat: migrate chunk pruning metrics ( #2516 )
2021-09-13 13:13:47 +00:00
Andrew Lamb
eb72799f0d
chore: Update datafusion (and arrow et al) dependencies ( #2509 )
...
* chore: update datafusion and other deps
* fix: Update InfluxRPC frontend with new op types
* fix: Update test output for new column names
* fix: typos and unintended changes
* fix: Update query_tests
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-10 17:57:25 +00:00
Marco Neumann
368f0369ee
chore: Rust 1.55
2021-09-10 12:36:49 +02:00
Nga Tran
5d2b4a87c3
refactor: use snafu
2021-09-09 16:34:20 -04:00
Nga Tran
1effb11ad9
refactor: move delete parsing work out of influxdb_line_protocol crate
2021-09-09 15:27:17 -04:00
Nga Tran
b4f8fad400
refactor: address review comments and remove no longer needed dependencies
2021-09-09 14:24:58 -04:00
Nga Tran
00df7b064c
feat: finally have the delete predicate parsed
2021-09-08 17:30:10 -04:00
Nga Tran
dbe4bcff22
chore: merge main to branch
2021-09-07 10:54:59 -04:00
Nga Tran
9ee1bdeeb9
refactor: address review comments
2021-09-07 10:24:38 -04:00
dependabot[bot]
b67610d9b9
chore(deps): bump tokio from 1.10.1 to 1.11.0
...
Bumps [tokio](https://github.com/tokio-rs/tokio ) from 1.10.1 to 1.11.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases )
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.10.1...tokio-1.11.0 )
---
updated-dependencies:
- dependency-name: tokio
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 09:11:38 +00:00
dependabot[bot]
b1bb390893
chore(deps): bump parking_lot from 0.11.1 to 0.11.2
...
Bumps [parking_lot](https://github.com/Amanieu/parking_lot ) from 0.11.1 to 0.11.2.
- [Release notes](https://github.com/Amanieu/parking_lot/releases )
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md )
- [Commits](https://github.com/Amanieu/parking_lot/compare/0.11.1...0.11.2 )
---
updated-dependencies:
- dependency-name: parking_lot
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 01:18:24 +00:00
Nga Tran
d8b3208896
refactor: start building parser in the client side
2021-09-03 17:56:45 -04:00
Nga Tran
a4183de411
feat: more progress on the delete flow from grpc API to catalog chunks
2021-08-31 17:42:07 -04:00
Nga Tran
b42784d8a0
refactor: address review comments
2021-08-30 16:31:59 -04:00
Nga Tran
7edb3fd270
fix: time column is always last even if the column is sorted lexicographically
2021-08-30 14:48:18 -04:00
Andrew Lamb
f42f0349ed
feat: Implement basic metrics for `DeduplicateExec`, `IOxReadFilterNode`, `SchemaPivotExec` and `StreamSplitExec` ( #2387 )
...
* feat: Add baseline metrics to DeduplicateExec
* feat: Add metrics to `IOxReadFilterNode`
* feat: Add metrics for SchemaPivotExec
* feat: Add metrics to StreamSplitExec
* fix: Update for new API, cleanups
* test: Add tests
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 20:31:28 +00:00
Raphael Taylor-Davies
e3e801d29a
feat: propagate span context into storage RPC queries ( #2407 )
...
* feat: propagate span context into storage RPC queries
* refactor: create ExecutionContextProvider trait
* chore: cleanup imports
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
Andrew Lamb
ddf6c6362e
chore: update DataFusion again ( #2411 )
...
* chore: update datafusion ref
* chore: run cargo update
* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
kodiakhq[bot]
5d97acb2f3
Merge branch 'main' into crepererum/issue2372
2021-08-25 07:08:15 +00:00
Raphael Taylor-Davies
f7792aafe6
feat: query tracing ( #2273 ) ( #2391 )
...
* feat: query tracing (#2273 )
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 17:35:59 +00:00
Andrew Lamb
745eaa9248
chore: Update DataFusion + other deps (to get new Metric API) ( #2385 )
...
* chore: Update deps
* refactor: Update IOx to use new DataFusion Metric API
* fix: update Modulus --> Modulo
2021-08-24 16:07:23 +00:00
Marco Neumann
4f23d3b60b
feat: shut down executor when `Executor` is dropped
2021-08-24 14:38:00 +02:00
Raphael Taylor-Davies
a6c9cc2bf2
refactor: rework exec module ( #2384 )
...
* refactor: rework exec module
* chore: update docs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 08:39:54 +00:00
Raphael Taylor-Davies
0946ffe916
refactor: reuse IOxExecutionContext ( #2373 )
...
* refactor: reuse IOxExecutionContext
* fix: orphaned comment
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 15:47:15 +00:00
Dom
3de6b44e23
build: use new rustdoc lint name ( #2261 )
...
* fix: nocache feature code rot
The MBChunk::snapshot code when using the "nocache" option no longer
compiles - this commit updates it to match the not(nocache) code.
* build: use updated broken_intra_doc_links name
The broken_intra_doc_links lint was renamed
rustdoc::broken_intra_doc_links
https://doc.rust-lang.org/rustdoc/lints.html
2021-08-11 19:48:51 +00:00
Andrew Lamb
559db4529d
refactor: Move DatabaseStore out of query crate ( #2219 )
...
* refactor: Move DatabaseStore out of query crate
* fix: doc links
2021-08-09 12:06:25 +00:00
Carol (Nichols || Goulding)
9d15798288
fix: Address or allow Clippy warnings new with Rust 1.54
2021-07-30 09:59:59 -04:00
Nga Tran
e8828c22e4
refactor: address review comments
2021-07-29 13:38:42 -04:00
Nga Tran
0d05ac3961
feat: add sort option while building scan plan to avoid extra sort during compaction
2021-07-28 17:32:01 -04:00
Andrew Lamb
e6cbd4d217
feat: Use statistics for count(*) queries ( #2038 )
...
* feat: Use statistics for count(*) queries
* docs: fix mangled comment
* refactor: rewrite to use fold
* refactor: use sort_by_cached_key
* fix: set null count properly
* fix: fmt + clippy
2021-07-28 19:39:41 +00:00
Andrew Lamb
5fb3e00f2a
fix: Properly record total_count and null_count in statistics ( #2103 )
...
* fix: Properly record total_count and null_count in statistics
* fix: fix statistics calculation in mutable_buffer
* refactor: expose null counts in read_buffer
* refactor: expose null_count in parquet_file
* fix: update server crate tests
* fix: update query_tests tests
* docs: tweak comments
* refactor: Use storage_stats rather than adding `null_count`
* refactor: rename test data field for clarity
* fix: fixup merge conflicts
* refactor: rename initial_non_null_count to initial_total_count
* refactor: calculate null_count as row_count - to_add
2021-07-26 18:13:36 +00:00
Andrew Lamb
01c79f1a1a
fix: Print all timestamps using RFC3339 format ( #2098 )
...
* fix: Use IOx pretty printer rather than arrow pretty printer
* chore: update tests in the query crate
* chore: update influxdb_iox tests
* chore: Update end to end tests
* chore: update query_tests
* chore: update mutable_buffer tests
* refactor: update parquet_file tests
* refactor: update db tests
* chore: update kafka integration test output
* fix: merge conflict
2021-07-22 19:04:52 +00:00
Nga Tran
11ba4b5f6a
fix: fix unit_test setting to have the desired results
2021-07-22 14:22:08 -04:00
Nga Tran
b2063fb29f
test: fix the stats and discover a bug in compaction/split/deduplication
2021-07-21 17:40:48 -04:00
kodiakhq[bot]
18dd108ba6
Merge branch 'main' into ntran/dedup_compare_cols_order
2021-07-21 15:42:30 +00:00
Nga Tran
86add39175
refactor: address review comments
2021-07-21 11:41:21 -04:00
Nga Tran
d547c22e97
refactor: comments
2021-07-20 15:27:41 -04:00
Nga Tran
150e166813
refactor: fix comments
2021-07-20 15:16:24 -04:00
Nga Tran
fa6d216a85
refactor: cleanup
2021-07-20 15:11:02 -04:00
Nga Tran
b98888e8d6
feat: implement key_ranges function that uses new range identify algo
2021-07-20 14:58:54 -04:00
Andrew Lamb
2c20528c69
chore: use upstream versions of some workarounds ( #2057 )
...
* chore: use upstream versions of some workarounds
* docs: update docstring
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 08:53:46 +00:00
Nga Tran
1668420ded
feat: new algorithm to compute key ranges for deduplicating data
2021-07-19 18:04:25 -04:00
Andrew Lamb
1c16988a51
chore: Update datafusion references ( #2056 )
2021-07-19 18:09:06 +00:00
Andrew Lamb
4da8a16c18
chore: update to arrow 5.0 and master datafusion ( #2049 )
...
* chore: update to arrow 5.0 and master datafusion
* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Raphael Taylor-Davies
5fc98c7c56
feat: add failure reporting to TaskTracker ( #2031 )
...
* feat: add failure reporting to TaskTracker
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-19 09:17:20 +00:00
Andrew Lamb
d00d56027b
docs: add comment to trigger build ( #2039 )
2021-07-16 17:53:55 +00:00
Nga Tran
cfe0bfa88b
refactor: address review comments and add useful log info to catch resort
2021-07-15 15:39:12 -04:00
Nga Tran
0b1f2b1fd0
chore: merge main to branch
2021-07-14 16:17:14 -04:00
Nga Tran
ef271d1e1c
test: make the tests clearer
2021-07-14 15:42:30 -04:00
Nga Tran
b4d86dcb7d
fix: make the order of sort key deterministic
2021-07-14 14:50:19 -04:00
Nga Tran
9ffaf863fa
refactor: cleanup
2021-07-14 14:30:04 -04:00
Nga Tran
552e3fb691
fix: Padd stats compute deterministic order of sort key and update tests that got changed by the use of sort key
2021-07-14 14:06:41 -04:00
Edd Robinson
46ac15a77e
refactor: increase compaction batch size
2021-07-14 17:19:11 +01:00
Nga Tran
8fd0df04f2
feat: continue building and using sort_key if available
2021-07-13 16:25:58 -04:00
Andrew Lamb
4800b36949
chore: Update IOx to a pre-release version of arrow and datafusion to test out performance improvement
2021-07-13 15:44:57 -04:00
Andrew Lamb
0164cabbf3
refactor: do not use DataFrame DataFusion API / stop optimizing twice ( #1982 )
...
* refactor: do not use DataFrame DataFusion API
* fix: update output to reflect not running optimizer twice
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 16:29:43 +00:00
Marco Neumann
2e391deb34
chore: update croaring to 0.5.0
...
Upstream changelog:
- CRoaring updated to 0.3.1
- `-march=native` is not a default for croaring-sys anymore
- Impl Default for `Bitmap` and `Treemap`
2021-07-13 15:15:41 +02:00
Andrew Lamb
d35b74c226
fix: Fix doc build warnings ( #1945 )
...
* fix: Fix doc build warnings
* refactor: add deny bare_urls to crates
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 08:03:42 +00:00
Nga Tran
5418a1fe6b
refactor: remove unused comments
2021-07-12 18:14:38 -04:00
Nga Tran
23895e6673
feat: Using sort_key to avoid resorts
2021-07-12 18:08:45 -04:00
kodiakhq[bot]
f26f844ed2
Merge branch 'main' into ntran/use_sortkey
2021-07-12 18:12:47 +00:00
Carol (Nichols || Goulding)
c681da1031
refactor: Define the TestChunk methods with macros
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
4e53a32928
refactor: Completely replace query::provider::overlap::TestChunk with query::test::TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
1698edcc39
refactor: Implement query::provider::overlap::TestChunk in terms of query::test::TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
dc0b97e121
refactor: Completely replace TestChunkMeta with TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
96f9485792
refactor: Move a with_no_stats method to be entirely defined on TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
b4c5a87088
refactor: Rename int field to i64 field to be more consistent
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
54f7ee8b8d
refactor: Implement TestChunkMeta in terms of TestChunk
...
This is a temporary step to make sure TestChunk does everything
TestChunkMeta needs
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
ee545ce90e
test: Make _with_stats methods able to optionally take max/min
...
Not used yet, but will be when this is unified with query/src/pruning.rs
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
b26aae1cb4
test: Add an arg to control whether to add a column summary at all
...
Always true for now, but there are some cases in query/src/pruning.rs
that don't add any column summaries that will use this with `false`.
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
6cd75bc688
test: Optionally take stats in add_schema_to_table
...
This gets rid of a lookup and construction of default stats that aren't
necessary
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
e05ca7f98b
fix: Change a method name that says null to not say null
...
The comment and implementation seem to indicate this is creating
non-null data.
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
4406d8a219
test: Always initialize a TableSummary on TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
22d4040c81
test: Always initialize a Schema for TestChunk
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
92cb5986f1
test: Initialize a schema on TestChunk to always exist
2021-07-12 09:59:12 -04:00
Carol (Nichols || Goulding)
78f1c4fc80
test: Chunks can only have one table; no need to specify repeatedly
...
This lets us make the name required and always present on TestChunks,
and make the ID optional.
2021-07-12 09:59:12 -04:00