Commit Graph

941 Commits (e3e801d29aa31b019b8e3ebaff6875617b9a01a6)

Author SHA1 Message Date
Raphael Taylor-Davies e3e801d29a
feat: propagate span context into storage RPC queries (#2407)
* feat: propagate span context into storage RPC queries

* refactor: create ExecutionContextProvider trait

* chore: cleanup imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
Carol (Nichols || Goulding) 7cf7fb02ed refactor: Rename database ObjectStore state types to DatabaseObjectStore 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 6d0959fbc3 fix: Move IOx object store creation logic into Database state machine 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 199d212b18 refactor: Move find-or-create IoxObjectStore logic into tests
This is the only place this logic is used; it's not appropriate for
production usage, as in real life we only ever want to either find (and
error if missing) or create (and error if already present).
2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) c7eceac8a3 refactor: Have server determine database generation from object store 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 5e1b57de9a refactor: Borrow arcs instead of as_ref 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) cee2f21d47 feat: Add a find_or_create object store function for tests 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) 18ba3b5c59 feat: Create database directories with a generation ID 2021-08-26 09:14:22 -04:00
Marco Neumann 026202a05c fix: correctly account for parquet metadata size
We need to hold the parquet metadata in memory so that we're able to
create catalog checkpoints. We used to do that by holding the decoded
structure (provided by the upstream `parquet` crate) in memory and
serializing that data on demand to Apache Thrift.

There are two drawbacks:

1. We did not account for the memory usage of the decoded structures (or
   at least not fully).
2. We actually don't need the decoded data in memory, since for the
   checkpoint creation we only need to write the serialized data.

So this PR changes our wrapper so it holds the serialized data which is
then only decoded when it's really necessary. Since the serialized data
is a simple byte vector, we can also easily account for the size.

Note that this makes the accounted size of parquet chunks larger.
However, this data was always there; we just ignored it up until now. If
the size of the parquet metadata really becomes an issue, we could trade
some CPU time for memory by compressing it.
2021-08-26 13:24:32 +02:00
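
The idea described in commit 026202a05c above (keep only the serialized bytes, decode on demand, and account for the byte vector's size) could be sketched roughly as follows. This is a minimal illustration, not the actual IOx code: `SerializedMetadata`, `DecodedMetadata`, and the toy 8-byte encoding are all hypothetical stand-ins for the real Thrift-encoded parquet metadata.

```rust
use std::mem;

/// Hypothetical stand-in for the decoded metadata structure
/// (the real one comes from the upstream `parquet` crate).
#[derive(Debug, PartialEq)]
struct DecodedMetadata {
    num_rows: u64,
}

/// Hypothetical wrapper that holds only the serialized form in memory.
struct SerializedMetadata {
    data: Vec<u8>,
}

impl SerializedMetadata {
    fn new(data: Vec<u8>) -> Self {
        Self { data }
    }

    /// Memory accounting is straightforward: the struct itself plus
    /// the heap allocation backing the byte vector.
    fn size(&self) -> usize {
        mem::size_of::<Self>() + self.data.capacity()
    }

    /// Checkpoint creation only needs the raw bytes, so no decoding happens here.
    fn as_bytes(&self) -> &[u8] {
        &self.data
    }

    /// Decode only when the structured data is really needed.
    /// Assumption: a toy format of 8 little-endian bytes, used purely for illustration.
    fn decode(&self) -> Option<DecodedMetadata> {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(self.data.get(..8)?);
        Some(DecodedMetadata {
            num_rows: u64::from_le_bytes(buf),
        })
    }
}
```

The trade-off in this sketch is the one the commit message names: every call to `decode` costs CPU time, in exchange for a memory footprint that is easy to measure.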
kodiakhq[bot] b1ecf1bfed
Merge branch 'main' into crepererum/job_start_time_in_system_table 2021-08-26 08:04:10 +00:00
Andrew Lamb ddf6c6362e
chore: update DataFusion again (#2411)
* chore: update datafusion ref

* chore: run cargo update

* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
Marco Neumann 558aa54aa3 feat: add start time to `operations` system table 2021-08-26 10:00:29 +02:00
Edd Robinson 69329b0b38
Merge branch 'main' into er/refactor/read_buffer/rle_entries 2021-08-25 12:08:44 +01:00
Edd Robinson 11e88877f4 fix: correct size estimation of RLE encoding 2021-08-25 12:03:04 +01:00
Edd Robinson f3c57c47fa
Merge branch 'main' into er/refactor/read_buffer/table_arg 2021-08-25 10:30:12 +01:00
kodiakhq[bot] c98723e3b3
Merge branch 'main' into crepererum/rub_shrink_rle 2021-08-25 08:58:22 +00:00
Marco Neumann 2ad9843e5f feat: make `RLE` a bit smaller by capacity-based allocation
For some demo data this reduced the overall chunk size from
195049367 bytes to 191088095 bytes.
2021-08-25 10:22:43 +02:00
kodiakhq[bot] 5d97acb2f3
Merge branch 'main' into crepererum/issue2372 2021-08-25 07:08:15 +00:00
Edd Robinson 5648817285 refactor: remove redundant argument 2021-08-24 22:26:17 +01:00
Raphael Taylor-Davies f7792aafe6
feat: query tracing (#2273) (#2391)
* feat: query tracing (#2273)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 17:35:59 +00:00
Marco Neumann 363d202202 feat: stop application executor in one dedicated place 2021-08-24 14:46:36 +02:00
Raphael Taylor-Davies a6c9cc2bf2
refactor: rework exec module (#2384)
* refactor: rework exec module

* chore: update docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 08:39:54 +00:00
Andrew Lamb 35cf560c9f
fix: do not error if partition has no chunks (#2383)
* fix: do not error if partition has no chunks

* fix: do not panic

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 17:33:54 +00:00
Raphael Taylor-Davies 0946ffe916
refactor: reuse IOxExecutionContext (#2373)
* refactor: reuse IOxExecutionContext

* fix: orphaned comment

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 15:47:15 +00:00
kodiakhq[bot] ec0152714e
Merge branch 'main' into catalog-test-determinism 2021-08-19 17:53:04 +00:00
Raphael Taylor-Davies b0e8b75a8a fix: TestCatalogState unique chunk ID 2021-08-19 17:19:12 +01:00
kodiakhq[bot] 47431148d5
Merge branch 'main' into er/refactor/read_buffer/bitmap_size 2021-08-18 21:20:13 +00:00
Raphael Taylor-Davies e81b82c0a4
feat: split db worker loop (#2337)
* feat: split db worker loop

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-18 17:33:13 +00:00
Carol (Nichols || Goulding) 61263c8774 feat: Add a debugging-suitable way to get the object storage path of a database 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) fbf3ceb1e2 refactor: Extract listing of all databases into iox_object_store 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) f782e77dcc test: Use the iox_object_store when testing a database's object store files 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) ff89398132 fix: Remove DatabaseConfig store_path field
This is now managed by the iox_object_store crate.
2021-08-18 11:32:39 -04:00
Jake Goulding 63111d9d9a refactor: Move the database rules functionality to iox_object_store 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) 4447f1e22c test: Adjust parquet file sizes; only storing relative paths now 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) 6d5cb9c117 refactor: Extract a ParquetFilePath to handle paths to parquet files in a db's object store 2021-08-18 11:32:39 -04:00
Edd Robinson b9f09fce49 feat: improve bitset size estimation 2021-08-17 22:54:22 +01:00
Edd Robinson 1daa30cc7d fix: include enum in sizing 2021-08-17 22:54:22 +01:00
kodiakhq[bot] 006d4db0c1
Merge branch 'main' into er/feat/read_buffer/row_group_metrics 2021-08-17 21:44:01 +00:00
Andrew Lamb 6b2ac77b8b
docs: Add some doc comments about sortedness in catalog Partition chunks (#2323)
* docs: Note on iteration order in catalog::Partition

* test: add tests for chunk_id order
2021-08-17 15:17:12 +00:00
Edd Robinson 211d814c8c
Merge branch 'main' into er/feat/read_buffer/row_group_metrics 2021-08-17 13:00:44 +01:00
Edd Robinson c795fc7f9d feat: add metric to track total row groups 2021-08-17 12:55:11 +01:00
Marco Neumann 55e9a3beda docs: better explain locking 2021-08-17 10:14:20 +02:00
Marco Neumann e540798eed test: drop two chunks in `drop_partition` test 2021-08-17 10:07:26 +02:00
Marco Neumann 5b0c3728b6 fix: ensure that code invariants hold 2021-08-17 10:03:28 +02:00
Marco Neumann 32cf23100d docs: explain why `drop_partition` does not deadlock 2021-08-17 09:52:30 +02:00
Marco Neumann 4a5dfc895a docs: clarify that `Partition::chunks` returns an ordered iterator 2021-08-17 09:52:07 +02:00
Marco Neumann 177d5fbb35 docs: fix typo in `Step::Drop`
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-08-17 09:44:35 +02:00
Marco Neumann 9454e06d61 test: test interaction of dropping partitions and replay 2021-08-17 09:44:35 +02:00
Marco Neumann 77892a0998 feat: add API to drop entire partitions 2021-08-17 09:44:35 +02:00
Ning Sun c012e996ab
refactor: remove display methods, use fmt::Display instead. (#2272)
* refactor: remove display methods, use fmt::Display instead.

Signed-off-by: Ning Sun <sunng@protonmail.com>

* refactor: update a few calls from .display to .to_string()

* fix: consistently use `Path` rather than occasionally `DirsAndFileName`

* fix: fixup for merge conflicts

* fix: update test

* fix: Catch another case or two

* fix: fmt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-16 18:00:22 +00:00
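
The refactor in commit c012e996ab above replaces ad-hoc `display` methods with the standard `fmt::Display` trait, so callers switch from `.display()` to `.to_string()`. A minimal sketch of that pattern is shown here; `ObjectPath` and its `parts` field are hypothetical names chosen for illustration and are not taken from the IOx code base.

```rust
use std::fmt;

/// Hypothetical path type, used only to illustrate the pattern.
struct ObjectPath {
    parts: Vec<String>,
}

/// Implementing `fmt::Display` replaces a hand-written `display()` method.
/// `to_string()` then comes for free via the standard blanket `ToString` impl.
impl fmt::Display for ObjectPath {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.parts.join("/"))
    }
}
```

With this in place the type works with `format!`, `println!`, and `.to_string()` without any custom method.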