548 Commits (fdbf9e112e3b03701c7fb89f89399a8fb6e2e763)

Author SHA1 Message Date
Andrew Lamb 7b96a37165
chore: Update datafusion (#3586)
* chore: update DataFusion to f849968057ddddccc9aa19915ef3ea56bf14d80d

* fix: reduce overhead of creating physical expressions

* chore: use MemTrackingMetrics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-31 18:15:28 +00:00
Carol (Nichols || Goulding) 4006dc14b3
fix: Correct typo in function name 2022-01-31 10:48:30 -05:00
Carol (Nichols || Goulding) 749989a937
refactor: Simplify type, eliminating empty vec creation
If there aren't any record batches, there isn't any metadata, and vice
versa. Make this relationship clearer by putting the Option around both
the vec of recordbatches and the metadata.
2022-01-31 10:48:30 -05:00
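A minimal sketch of the type simplification described in commit 749989a937 above. The types `RecordBatch` and `IoxMetadata` here are hypothetical stand-ins with made-up fields, and `compact` is an illustrative helper, not the function changed in that commit. It only shows why a single Option around both values rules out the "batches without metadata" and "metadata without batches" states.

```rust
// Hypothetical stand-ins for illustration; the real types carry far more data.
#[derive(Debug, Clone, PartialEq)]
struct RecordBatch {
    rows: usize,
}

#[derive(Debug, Clone, PartialEq)]
struct IoxMetadata {
    row_count: usize,
}

// One Option around both values: they are either both present or both absent,
// so no caller has to handle an empty vec paired with leftover metadata.
type Compacted = Option<(Vec<RecordBatch>, IoxMetadata)>;

// Illustrative helper (an assumption, not the project's API): returns None
// when there is nothing to persist, otherwise the batches plus their metadata.
fn compact(batches: Vec<RecordBatch>) -> Compacted {
    if batches.is_empty() {
        return None;
    }
    let row_count = batches.iter().map(|b| b.rows).sum();
    Some((batches, IoxMetadata { row_count }))
}
```

Callers can then match on a single value instead of checking an empty vec and an optional metadata separately.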
Carol (Nichols || Goulding) 093d5acfd4
fix: Unify temporary multiple definitions of IoxMetadata 2022-01-31 10:48:29 -05:00
Carol (Nichols || Goulding) 8f81ce5501
refactor: Share parquet_file::storage code between new and old metadata 2022-01-31 10:36:33 -05:00
Carol (Nichols || Goulding) bf89162fa5
refactor: Move IoxMetadata to parquet_file 2022-01-31 10:36:33 -05:00
Carol (Nichols || Goulding) dd9620da0c
feat: Create a new proto definition for the new design's IoxMetadata 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) 81647f253c
feat: Use IoxMetadata and a list of RecordBatches 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) fef968f75c
fix: Remove catalog insertion; will be handled elsewhere 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) 8b47ad6885
test: Add more tests 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) d413157b99
feat: Extract a fn for creating the parquet file paths and test it 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) 5e0e0d8aa7
feat: Write parquet to object storage in a similar way as parquet_file::Storage 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) ea18c71e6d
feat: Create an object store path for a new parquet file 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) c633c9bc5c
feat: Wire object store into ingester persistence 2022-01-31 10:36:30 -05:00
Nga Tran ac247e4de5
feat: update catalog after persistence (#3581)
* feat: update catalog after persistence

* test: add a negative test for the update catalog

* chore: add IDs into the messages

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: address review comments

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-31 15:23:16 +00:00
Nga Tran 8735ede74f
feat: IoxMetadata for parquet file (#3547)
* feat: IoxMetadata for parquet file

* fix: typos

* refactor: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-28 14:41:59 +00:00
Nga Tran fb33a88dc8
test: Delete application during Ingester's compaction (#3542)
* test: Delete application during Ingester's compaction

* fix: typos

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: remove comments

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 16:53:17 +00:00
Andrew Lamb 5488c257d1
chore: Update datafusion, upgrade to arrow/parquet/arrow-flight 8.0.0 (#3517)
* chore: Update datafusion

* chore: update to arrow 8

* fix: update to use new DataFusion APIs

* fix: update case for sortedness

* fix: cargo hakari
2022-01-27 13:33:27 +00:00
Carol (Nichols || Goulding) bc44d33108
feat: Implement a snapshot method on DataBuffer (#3518)
* feat: Implement a snapshot method on DataBuffer

Fixes #3510.

* test: Add a test snapshotting batches with different but compatible schemas

* fix: Simplify min/max sequencer number collection

The first batch should always have the min sequencer number. The last
batch should always have the max sequencer number. The min should always
be less than (or equal to, in case there's only one batch) the max.
2022-01-26 15:22:51 +00:00
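The min/max reasoning in commit bc44d33108 above can be sketched as follows. This assumes batches are already ordered by sequencer number and that each batch carries a single number; the `Batch` type, its `sequencer_number` field, and `sequencer_range` are hypothetical names for illustration, not the project's definitions.

```rust
// Hypothetical batch type; only the sequencer number matters for this sketch.
struct Batch {
    sequencer_number: u64,
}

/// Returns (min, max) sequencer numbers for batches that are already in
/// ascending sequencer order, or None when there are no batches.
fn sequencer_range(batches: &[Batch]) -> Option<(u64, u64)> {
    // The first batch holds the min and the last batch holds the max,
    // so no scan over the whole slice is needed.
    let min = batches.first()?.sequencer_number;
    let max = batches.last()?.sequencer_number;
    // With a single batch, first and last are the same element, so min == max.
    debug_assert!(min <= max);
    Some((min, max))
}
```

Reading only the two ends relies entirely on the ordering assumption; if batches could arrive out of order, a full min/max scan would be needed instead.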
Nga Tran 52866fe6a9
fix: merge record batches into one batch (#3535)
* fix: merge record batches into one batch

refactor: address review comments

* chore: update test output
2022-01-25 23:29:16 +00:00
Nga Tran d559561fd7
refactor: make deduplication work without chunk statistics (#3519)
* refactor: make deduplication work without chunk statistics

* test: more tests for duplicate data across different combinations of record batches

* refactor: address review comments
2022-01-25 17:00:25 +00:00
NGA-TRAN c6a195b0e6 refactor: address review comments 2022-01-24 13:05:44 -05:00
NGA-TRAN 797ba459b9 chore: merge main to branch 2022-01-24 12:06:23 -05:00
NGA-TRAN 939ea536d4 feat: add but ignore a few compaction tests 2022-01-24 12:00:23 -05:00
NGA-TRAN ee0a468b4d feat: a few tests for compaction 2022-01-21 18:15:23 -05:00
Paul Dix bb893510a0 feat: Add scaffolding for ingester server
* Adds a new ingester command to start an ingester server
* Moves previous ingester server over to handler
* Skeleton for gRPC and HTTP handlers
2022-01-21 18:02:19 -05:00
NGA-TRAN fa41067e3d refactor: for paul 2022-01-21 16:50:49 -05:00
NGA-TRAN cd01b141f3 refactor: for paul 2022-01-21 16:49:02 -05:00
Paul Dix bfa54033bd refactor: Clean up the Catalog API
This updates the catalog API to make it easier to work with for consumers. I also found a bug in the MemCatalog implementation while refactoring the tests to work with the new API definition. Consumers will now be able to Arc wrap the catalog and use it across awaits.
2022-01-21 16:01:13 -05:00
NGA-TRAN 191adc9fc7 feat: initial implementation for ingester's compaction 2022-01-20 18:22:41 -05:00
NGA-TRAN 029f4bb41e fix: comment 2022-01-19 18:11:00 -05:00
NGA-TRAN dcf952bb27 chore: merge main to branch 2022-01-19 17:59:05 -05:00
NGA-TRAN 4ede10b3a0 refactor: add new fields and comments in ingest data buffer 2022-01-19 17:53:58 -05:00
Paul Dix 860e5a30ca refactor: update ingester to get sequencer record and not attempt to create 2022-01-19 17:15:10 -05:00
NGA-TRAN be3e523312 fix: use PersistingBatch 2022-01-19 13:25:03 -05:00
NGA-TRAN 9977f174b7 refactor: use wrapper ID 2022-01-19 12:51:04 -05:00
NGA-TRAN edb97f51cf refactor: add persisting struct 2022-01-19 12:36:18 -05:00
NGA-TRAN 8a17e1c132 refactor: address review comments 2022-01-19 11:20:20 -05:00
NGA-TRAN fe9a41ee9a chore: remove no-longer-needed dependency 2022-01-18 21:45:20 -05:00
NGA-TRAN b89c250ccc refactor: use RepoCollection instead of MemCatalog 2022-01-18 21:39:22 -05:00
NGA-TRAN b57f027e35 refactor: address review comments 2022-01-18 20:57:13 -05:00
NGA-TRAN 367a9fb812 fix: add workspace-hack 2022-01-18 18:10:42 -05:00
NGA-TRAN 1c970a2064 fix: format 2022-01-18 18:01:47 -05:00
NGA-TRAN 667ec5bfc5 fix: the code now compiles without warnings 2022-01-18 18:01:06 -05:00
NGA-TRAN b20d1757d0 feat: initialize ingester data 2022-01-18 17:43:03 -05:00
NGA-TRAN 125285ae9a feat: commit in order to pull and merge new commit from main 2022-01-18 16:11:25 -05:00
NGA-TRAN 23290fd2ff fix: new data structures suggested by reviewers 2022-01-18 14:04:07 -05:00
NGA-TRAN ef336b4659 feat: add ingester crate and a few basic data structures for its data lifecycle 2022-01-17 15:38:03 -05:00