This commit refactors the flatbuffers data types from the wal to a new crate where they can be used by storage, write buffer, and cluster. It also refactors cluster to move the configuration types out to the data types crate so they can be used across storage and elsewhere.
Finally, it adds a new method to store replicated writes on a database in the database trait and implements it.
* test: traits for database and tests for http handler
* refactor: Use generics and trait bounds instead of trait objects
* refactor: Replace trait objects with an associated type
* refactor: Extract an associated Error type on the Database traits
* refactor: Remove some explicit conversions to_string that Snafu takes care of
* docs: add comments
* refactor: move traits into storage module
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
This is the initial prototype of the WriteBuffer and WAL. This does the following:
* accepts a slice of ParsedLine into the DB
* writes those into an in memory structure with tags represented as u32 dictionaries and all field types supported
* persists those writes into the WAL as Flatbuffer blobs (one WAL entry per slice of lines written, or WriteBatch)
* has a method to return a table from the buffer as an Arrow RecordBatch
* recovers the WAL after the database is closed and opened back up again
* has a single test that covers the end-to-end from the DB side
* It doesn't include partitioning yet. Although the write_lines method does actually try to do partitions on time. That'll get changed to be something more general defined by a per database configuration.
* hooked up to the v2 HTTP write API
* hooked up to a read API which will execute a SQL query against the data in the buffer
This includes a refactor of the WAL:
Refactors the WAL to remove async and threading so that it can be moved higher up. This simplifies the API while keeping just about the same amount of code in ParitionStore to handle the asynchronous writes.
This also modifies the WAL to remove the SideFile implementation, which was causing significant performance problems and write amplification. The downside is that WAL writes are no longer guarranteed atomic.
Further, this modifies the WAL to keep the active segement file handle open. Appends now don't have to list the directory contents and look for the latest file and open the file handle to do appends, which should also improve performance and reduce iops.
Jake dug into why the end-to-end tests fail with delorean running in the
Docker image I built, and it appears to be a crash with an illegal
instruction from CRoaring.
We think it's this issue: https://github.com/saulius/croaring-rs/pull/62
which was merged and released, so let's try updating CRoaring.
* feat: benchmark for lp->parquet performance
* feat: improve parser performance by storing contiguous EscapedStr
* fix: remove all string copies during LP-Parquet conversion
* refactor: Implement from_str as From<&str> only
* refactor: implement Deref instead of as_str
* refactor: Remove ends_with because Deref now makes it work
* refactor: Eq can be derived
* refactor: Remove unused From implementation
* refactor: Replace single-character strings with chars as requested by clippy
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
* refactor: move all dstool code into delorean binary
* fix: Move code/mods to make it compile and run
* fix: warn if db dir does not exist
* refactor: Match argument subcommands w/ more idomatic rust
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
fix: restore hyper logging
fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: update expected code
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* feat: Add parquet writer, hook up conversion in dstool
* fix: use bigger executor for test
* fix: less cloning
* fix: make unsupported messages less pejorative
* fix: fmt
* fix: Rename writer and do not require std::File, add example
* fix: clippy and fmt
* fix: remove unnecessary module in end to end tests
* fix: remove strange use of tempfile
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: cleanup use
* fix: Use more specific error messages
* fix: comment tweak
* fix: touchup temp path creation
* fix: clippy!
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: rename the module containing generated types
The nested `delorean` was confusing anyway, and this will make more
sense when we extract a new crate.
* refactor: Move the generated types to their own crate
This allows us to have more lax warnings in that crate alone, keeping
the main crate more strict.
* style: Re-enable elided lifetimes lint in the main crate