Marco Neumann
91df8a30e7
feat: limit number of files during storage cleanup
...
Since the number of parquet files can potentially be unbounded (aka very
very large) and we do not want to hold the transaction lock for too
long and also want to limit memory consumption of the cleanup routine,
let's limit the number of files that we collect for cleanup.
2021-06-03 17:43:11 +02:00
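A minimal sketch of the idea in the commit above, not the actual IOx code: stop collecting cleanup candidates once a cap is reached, so the work done per run (and the time a lock would be held) stays bounded. The function name `collect_unreferenced`, the `max_files` parameter, and the use of plain `String` paths are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical helper: walk a (possibly huge) listing of stored files and
/// return at most `max_files` paths that the catalog does not reference.
///
/// Because the iterator is lazy, the listing is not consumed any further
/// once `max_files` candidates have been found.
pub fn collect_unreferenced<I>(
    all_files: I,
    referenced: &HashSet<String>,
    max_files: usize,
) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    all_files
        .into_iter()
        // keep only files the catalog does not know about
        .filter(|path| !referenced.contains(path))
        // cap the number of candidates handled in one cleanup run
        .take(max_files)
        .collect()
}
```

Files beyond the cap are simply left for a later cleanup run, which trades a slower cleanup for bounded memory use per run.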
Edd Robinson
e583e1fbda
Merge branch 'main' into er/feat/read_buffer/float_int
2021-06-03 14:48:36 +01:00
Andrew Lamb
eaa5b75437
refactor: Make it clear only partition_key and table name pruning happens in catalog ( #1608 )
...
* refactor: Make it clear only partition_key and table name pruning is happening in catalog
* fix: clippy
* fix: Update server/src/db/catalog.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: use TableNameFilter enum rather than Option
* docs: Add docstring to the `From` implementation
* fix: Update server/src/db/catalog/partition.rs
Co-authored-by: Edd Robinson <me@edd.io>
2021-06-03 13:09:09 +00:00
Edd Robinson
65bfa4dd10
test: fix tests
2021-06-03 12:32:40 +01:00
Marco Neumann
27b9477aa4
test: fix flaky test
2021-06-03 11:23:29 +02:00
Marco Neumann
7b2663a38a
test: make tests faster
2021-06-03 11:23:29 +02:00
Marco Neumann
3c9fd81697
refactor: split overlong line
2021-06-03 11:23:29 +02:00
Marco Neumann
bbd73e59be
feat: jitter background clean-up job + wait on first job
2021-06-03 11:23:29 +02:00
Marco Neumann
ce412dbce2
fix: use structured error for background cleanup task reporting
2021-06-03 11:23:29 +02:00
kodiakhq[bot]
1c764c47a2
Merge branch 'main' into ntran/deduplicate
2021-06-02 17:42:36 +00:00
Nga Tran
40bd932fff
refactor: address Andrew's comment
2021-06-02 13:41:46 -04:00
Andrew Lamb
32c6ed1f34
refactor: More cleanup related to multi-table chunks ( #1604 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-02 17:00:23 +00:00
Nga Tran
e7a97f3ac1
test: merge main and add more tests for deduplicate work
2021-06-02 12:00:40 -04:00
Marco Neumann
80f4d84ce8
refactor: isolate DB loading and streamline error handling
...
There are no functional changes here (except that errors look slightly
different) but it should allow for an easier move of the DB loading into
a delayed task.
2021-06-02 13:42:24 +02:00
kodiakhq[bot]
0e09b20ca8
Merge branch 'main' into crepererum/issue1513-b
2021-06-02 07:08:29 +00:00
Nga Tran
40df7def0e
test: tests for the deduplicate work
2021-06-01 18:06:35 -04:00
Nga Tran
60ad929721
refactor: add macro to compare output of explains
2021-06-01 16:39:14 -04:00
Nga Tran
aa867601e5
chore: merge main with DF plan display fix
2021-06-01 16:17:41 -04:00
Nga Tran
0ad258bab3
refactor: remove comments since the time function predicates are pushed down after the recent constant folding fix in DF
2021-06-01 16:00:09 -04:00
Andrew Lamb
d8fbb7b410
refactor: Remove last vestiges of multi-table chunks from PartitionChunk API ( #1588 )
...
* refactor: Remove last vestiges of multi-table chunks from PartitionChunk API
* fix: remove test that can no longer fail
* fix: update tests + code review comments
* fix: clippy
* fix: clippy
* fix: restore test_measurement_fields_error test
2021-06-01 16:12:33 +00:00
Marco Neumann
714a082f3a
refactor: remove chunk state struct nesting
...
Inline structs that are only used for enum variants.
2021-06-01 18:00:16 +02:00
Marco Neumann
5a4562f1c9
test: test `Chunk::new_open`
2021-06-01 18:00:16 +02:00
Marco Neumann
f45e61f9ef
test: test chunk lifecycle action handling
2021-06-01 18:00:16 +02:00
Marco Neumann
50636ca011
refactor: rename `Chunk::{set_closed => freeze}` and add tests
...
This makes it clearer what is actually happening. Furthermore, freezing
frozen chunks is now a no-op.
2021-06-01 18:00:16 +02:00
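A rough sketch of what "freezing frozen chunks is a no-op" could look like, assuming a much simplified two-state chunk. `SketchChunk`, `SketchState`, and their fields are hypothetical stand-ins and do not mirror the real `Chunk` type or its states.

```rust
/// Hypothetical, simplified chunk states (the real chunk has more).
#[derive(Debug, PartialEq)]
pub enum SketchState {
    Open { rows: Vec<u64> },
    Frozen { rows: Vec<u64> },
}

/// Hypothetical stand-in for a chunk that can be frozen.
pub struct SketchChunk {
    state: SketchState,
}

impl SketchChunk {
    /// Create a chunk in the open (writable) state.
    pub fn open(rows: Vec<u64>) -> Self {
        Self { state: SketchState::Open { rows } }
    }

    /// Move an open chunk into the frozen state.
    /// Calling this on an already frozen chunk does nothing.
    pub fn freeze(&mut self) {
        if let SketchState::Open { rows } = &mut self.state {
            // Take the data out of the old state without cloning it.
            let rows = std::mem::take(rows);
            self.state = SketchState::Frozen { rows };
        }
    }

    pub fn is_frozen(&self) -> bool {
        matches!(self.state, SketchState::Frozen { .. })
    }

    pub fn rows(&self) -> &[u64] {
        match &self.state {
            SketchState::Open { rows } | SketchState::Frozen { rows } => rows,
        }
    }
}
```

Making the transition idempotent means callers do not need to check the current state before calling `freeze`.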
kodiakhq[bot]
aafc8c4746
Merge branch 'main' into crepererum/fix_catalog_replay_logging
2021-06-01 15:59:42 +00:00
Marco Neumann
98c2963c28
fix: fix confusing log message during catalog replay
2021-06-01 17:58:38 +02:00
Andrew Lamb
d3711a5591
refactor: Use ParquetExec from DataFusion to read parquet files ( #1580 )
...
* refactor: use ParquetExec to read parquet files
* fix: test
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-01 14:44:07 +00:00
Andrew Lamb
64328dcf1c
feat: cache schema on catalog chunks too ( #1575 )
2021-06-01 12:42:46 +00:00
kodiakhq[bot]
4e7b754098
Merge branch 'main' into crepererum/issue1513-a
2021-06-01 08:23:01 +00:00
Raphael Taylor-Davies
6e07a735bd
feat: don't recompute chunk size on every iteration ( #1586 )
2021-05-31 16:19:11 +00:00
Andrew Lamb
73cedd2f88
chore: remove unused dependency ( #1587 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-31 14:22:11 +00:00
Marco Neumann
991314ebe8
docs: fix `set_writing_to_object_store` docstring
2021-05-31 15:44:29 +02:00
Marco Neumann
996ce833f1
chore: fix formatting
2021-05-31 15:42:13 +02:00
Andrew Lamb
162a808a8d
refactor: Remove `table_name` from PartitionChunk API ( #1584 )
...
* refactor: Remove `table_name` from PartitionChunk API
* fix: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-31 12:05:09 +00:00
Marco Neumann
c658a627ed
refactor: change state structure for chunks
...
This is the first step towards #1513 . However, it leaves all consumers
basically unchanged and also does NOT touch state transitions. These
changes will follow in upcoming PRs.
2021-05-31 11:19:01 +02:00
Raphael Taylor-Davies
db432de137
feat: add distinct count to StatValues ( #1568 )
2021-05-28 17:41:34 +00:00
Raphael Taylor-Davies
d8f19348bf
feat: per-column dictionaries in MUB ( #1570 )
...
* feat: per-column dictionaries in MUB
* chore: fmt
* refactor: remove chunk-level dictionary
* chore: remove redundant sort
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-28 13:51:56 +00:00
kodiakhq[bot]
d70d7a63a2
Merge branch 'main' into crepererum/remove_invalid_chunk_state
2021-05-28 10:20:05 +00:00
Andrew Lamb
c6f42cf304
refactor: Remove unnecessary code ( #1573 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-28 10:12:47 +00:00
Marco Neumann
5cfede51f2
refactor: remove `ChunkState::Invalid`
...
This seems to only exist to fight the borrow checker and we can actually
live without it.
2021-05-28 11:16:06 +02:00
Andrew Lamb
3ae44a0375
refactor: Chunks can have at most one object store path ( #1574 )
...
* refactor: Chunk can have at most one path
* fix: update tests
2021-05-27 19:52:09 +00:00
Nga Tran
62147ff0d4
feat: add more explain tests
2021-05-27 12:19:41 -04:00
Andrew Lamb
f3bec93ef1
feat: Cache TableSummary in Catalog rather than computing it on demand ( #1569 )
...
* feat: Cache `TableSummary` in catalog Chunks
* refactor: use consistent table summary
2021-05-27 16:03:05 +00:00
Raphael Taylor-Davies
5d342d7779
feat: associate tracker with lifecycle action ( #1099 ) ( #1556 )
...
* feat: associate tracker with lifecycle action (#1099 )
* chore: docs
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-27 10:47:35 +00:00
Raphael Taylor-Davies
792bff07d1
feat: only store ChunkSnapshot in Closed state ( #1560 )
...
* feat: only store ChunkSnapshot in Closed state
* chore: review feedback
* feat: record MUB size as closed size
* chore: document column ordering assumption
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-27 10:36:47 +00:00
Raphael Taylor-Davies
4fcc04e6c9
chore: enable arrow prettyprint feature ( #1566 )
2021-05-27 10:28:14 +00:00
kodiakhq[bot]
efe077da8f
Merge branch 'main' into crepererum/issue1313
2021-05-26 14:46:18 +00:00
Marco Neumann
24ec1a472e
fix: do NOT delete parquet files that are reachable by time travel
2021-05-26 12:38:54 +02:00
Raphael Taylor-Davies
c03b8a3963
refactor: remove tables from ChunkSnapshot ( #1295 ) ( #1558 )
2021-05-26 10:37:40 +00:00
Marco Neumann
1fb6af2364
refactor: split DB background loop into lifecycle and cleanup
...
This should prevent one from blocking / stalling the other.
2021-05-26 11:09:30 +02:00