Marco Neumann
3c968ac092
feat: correctly account MUB sizes
...
Fixes #1565.
2021-09-03 09:15:49 +02:00
Marco Neumann
79ad48ac3a
chore: rename "labels" to "attributes"
2021-08-31 11:31:15 +02:00
Raphael Taylor-Davies
e3e801d29a
feat: propagate span context into storage RPC queries ( #2407 )
...
* feat: propagate span context into storage RPC queries
* refactor: create ExecutionContextProvider trait
* chore: cleanup imports
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
Andrew Lamb
f975baba6b
chore: Update datafusion + other deps again (get baseline metrics) ( #2422 )
...
* chore: Update datafusion reference
* chore: cargo update
* fix: update explain tests to show Union
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 13:13:00 +00:00
kodiakhq[bot]
b1ecf1bfed
Merge branch 'main' into crepererum/job_start_time_in_system_table
2021-08-26 08:04:10 +00:00
Andrew Lamb
ddf6c6362e
chore: update DataFusion again ( #2411 )
...
* chore: update datafusion ref
* chore: run cargo update
* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
Marco Neumann
558aa54aa3
feat: add start time to `operations` system table
2021-08-26 10:00:29 +02:00
Edd Robinson
11e88877f4
fix: correct size estimation of RLE encoding
2021-08-25 12:03:04 +01:00
Marco Neumann
2ad9843e5f
feat: make `RLE` a bit smaller by capacity-based allocation
...
For some demo data this reduced the overall chunk size from
195049367 bytes
to
191088095 bytes
2021-08-25 10:22:43 +02:00
Raphael Taylor-Davies
f7792aafe6
feat: query tracing ( #2273 ) ( #2391 )
...
* feat: query tracing (#2273)
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 17:35:59 +00:00
Raphael Taylor-Davies
a6c9cc2bf2
refactor: rework exec module ( #2384 )
...
* refactor: rework exec module
* chore: update docs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 08:39:54 +00:00
Raphael Taylor-Davies
0946ffe916
refactor: reuse IOxExecutionContext ( #2373 )
...
* refactor: reuse IOxExecutionContext
* fix: orphaned comment
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 15:47:15 +00:00
Edd Robinson
b9f09fce49
feat: improve bitset size estimation
2021-08-17 22:54:22 +01:00
Edd Robinson
1daa30cc7d
fix: include enum in sizing
2021-08-17 22:54:22 +01:00
Edd Robinson
311d36d776
refactor: include capacity in Read Buffer chunk size
2021-08-13 11:57:46 +01:00
Edd Robinson
fa8da19c45
refactor: expose enc size API into column
2021-08-13 11:57:46 +01:00
Edd Robinson
c68bbb6309
test: update test
2021-08-12 15:05:47 +01:00
Andrew Lamb
bb8021d9fd
fix: "Can not convert index to usize in dictionary of type creating group by value Int32" ( #2151 )
...
* test: add reproducer for index error
* chore: update datafusion
2021-08-02 12:20:41 +00:00
Carol (Nichols || Goulding)
9d15798288
fix: Address or allow Clippy warnings new with Rust 1.54
2021-07-30 09:59:59 -04:00
Andrew Lamb
e6cbd4d217
feat: Use statistics for count(*) queries ( #2038 )
...
* feat: Use statistics for count(*) queries
* docs: fix mangled comment
* refactor: rewrite to use fold
* refactor: use sort_by_cached_key
* fix: set null count properly
* fix: fmt + clippy
2021-07-28 19:39:41 +00:00
Andrew Lamb
3ea84c6be4
feat: expose null_counts in system.chunk_columns ( #2105 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-27 11:05:23 +00:00
Andrew Lamb
5fb3e00f2a
fix: Properly record total_count and null_count in statistics ( #2103 )
...
* fix: Properly record total_count and null_count in statistics
* fix: fix statistics calculation in mutable_buffer
* refactor: expose null counts in read_buffer
* refactor: expose null_count in parquet_file
* fix: update server crate tests
* fix: update query_tests tests
* docs: tweak comments
* refactor: Use storage_stats rather than adding `null_count`
* refactor: rename test data field for clarity
* fix: fixup merge conflicts
* refactor: rename initial_non_null_count to initial_total_count
* refactor: calculate null_count as row_count - to_add
2021-07-26 18:13:36 +00:00
Marco Neumann
6ef3680554
feat: collect replay plan during catalog loading
2021-07-23 09:23:06 +02:00
Andrew Lamb
38261cc7ac
test: add tests using `to_timestamp()` as predicates in SQL ( #2099 )
...
* test: add tests using `to_timestamp()` as predicates in SQL
* fix: cleanup redundancy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 21:06:52 +00:00
Andrew Lamb
01c79f1a1a
fix: Print all timestamps using RFC3339 format ( #2098 )
...
* fix: Use IOx pretty printer rather than arrow pretty printer
* chore: update tests in the query crate
* chore: update influxdb_iox tests
* chore: Update end to end tests
* chore: update query_tests
* chore: update mutable_buffer tests
* refactor: update parquet_file tests
* refactor: update db tests
* chore: update kafka integration test output
* fix: merge conflict
2021-07-22 19:04:52 +00:00
Raphael Taylor-Davies
20d06e3225
feat: include more information in system.operations table ( #2097 )
...
* feat: include more information in system.operations table
* chore: review feedback
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 17:16:09 +00:00
Andrew Lamb
387667330a
chore: Update datafusion deps ( #2073 )
...
* chore: Update datafusion deps
* fix: update tests
2021-07-21 08:27:03 +00:00
Raphael Taylor-Davies
091837420f
feat: add PersistenceWindows system table ( #2030 ) ( #2062 )
...
* feat: add PersistenceWindows system table (#2030)
* chore: update log
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 13:10:57 +00:00
Andrew Lamb
1c16988a51
chore: Update datafusion references ( #2056 )
2021-07-19 18:09:06 +00:00
Andrew Lamb
4da8a16c18
chore: update to arrow 5.0 and master datafusion ( #2049 )
...
* chore: update to arrow 5.0 and master datafusion
* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Marco Neumann
2263189e09
test: make TestDb lifecycle better for testing
...
This is a leftover from #1972.
2021-07-19 09:50:44 +02:00
Marco Neumann
1ef2bc1887
refactor: `Db::{write_chunk_to_object_store => Db::persist_partition}`
...
The previous method allowed persisting any chunk -- even ones that
should not be persisted yet, and without any order of persistence. That
would break our persistence windows. So instead offer a sane higher-level
interface that can trigger persistence of a partition within the
boundaries of the lifecycle rules. This needs some adjustments to our
test suite.
2021-07-16 12:07:58 +02:00
Andrew Lamb
3fd6430fb6
fix: rename `estimated_bytes` to `memory_bytes` and expose `object_store_bytes` in ChunkSummary and system.chunks ( #2017 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 16:00:24 +00:00
Andrew Lamb
3bb32594ba
refactor: rename end-to-end.rs to end_to_end.rs ( #2015 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 13:50:32 +00:00
Marco Neumann
d89fca00be
feat: persist "drop chunk"
2021-07-15 12:07:56 +02:00
Nga Tran
0b1f2b1fd0
chore: merge main to branch
2021-07-14 16:17:14 -04:00
Nga Tran
552e3fb691
fix: Padd stats compute deterministic order of sort key and update tests that got changed by the use of sort key
2021-07-14 14:06:41 -04:00
Andrew Lamb
4800b36949
chore: Update IOx to a pre-release version of arrow and datafusion to test out performance improvement
2021-07-13 15:44:57 -04:00
Andrew Lamb
0164cabbf3
refactor: do not use DataFrame DataFusion API / stop optimizing twice ( #1982 )
...
* refactor: do not use DataFrame DataFusion API
* fix: update output to reflect not running optimizer twice
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 16:29:43 +00:00
kodiakhq[bot]
f26f844ed2
Merge branch 'main' into ntran/use_sortkey
2021-07-12 18:12:47 +00:00
Nga Tran
7b7a60993d
feat: consider time as a special key
2021-07-09 18:54:22 -04:00
Marco Neumann
676034b4ae
docs: explain why the path placeholder is there
2021-07-09 09:45:13 +02:00
Marco Neumann
09e611deb7
refactor: lift query schema generation up to caller
...
No longer scan chunks during query planning to determine the schema
(except for the lifetime jobs where we have a good reason to do so).
Instead pass the schema down from whoever is triggering the query.
For real SQL queries, we then just use the table-wide schemas
introduced in #1913.
Apart from avoiding schema merges, we now also no longer crash
when no chunks are left in the table (i.e. columns are present but all
rows are gone).
Fixes #1768.
Fixes #1884.
2021-07-09 09:24:21 +02:00
Marco Neumann
6ac1420335
test: fix out dir for query tests
2021-07-09 09:16:28 +02:00
kodiakhq[bot]
c8126784a8
Merge branch 'main' into ntran/avoid_sort_in_scan
2021-07-08 20:22:18 +00:00
Nga Tran
da6249a4df
fix: address reviewers' comments and also fix a bug they discovered
2021-07-08 15:54:54 -04:00
Andrew Lamb
dd3eff7748
refactor: Always use `row_count` for count of rows in system.* tables ( #1937 )
2021-07-08 19:28:11 +00:00
Andrew Lamb
f670224ea1
chore: Reduce output spew during query tests ( #1926 )
...
* chore: Reduce output spew during query tests
* docs: Update query_tests/src/runner.rs
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 11:06:24 +00:00
Andrew Lamb
7602bde850
chore: Update datafusion deps ( #1799 )
...
* chore: Update datafusion deps + rework code
* refactor: remove workaround as it has been contributed upstream
* fix: Update query/src/exec/split.rs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 10:58:32 +00:00
Nga Tran
5c722af0fa
fix: remove comments
2021-07-07 16:50:53 -04:00