This looks at the database rules and replicates writes to the replicate host groups specified. Later commits will add hashing based on partition key and handling replication errors from remote servers.
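As a rough illustration of the lookup described above, the sketch below resolves the host groups named in a database's rules to the hosts that should receive a replicated write. `DatabaseRules`, `HostGroup`, their fields, and `replication_targets` are hypothetical names chosen for this example, not the types in this commit, and skipping unknown groups silently is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the per-database rules: the ids of the host
/// groups that every write should be replicated to.
struct DatabaseRules {
    replication: Vec<String>,
}

/// Hypothetical stand-in for a host group: a set of host addresses.
struct HostGroup {
    hosts: Vec<String>,
}

/// Returns the hosts that should receive a replicated write, in the order
/// the groups appear in the rules. Group ids that are not configured are
/// skipped (an assumption; real code would likely report an error).
fn replication_targets<'a>(
    rules: &DatabaseRules,
    host_groups: &'a HashMap<String, HostGroup>,
) -> Vec<&'a str> {
    rules
        .replication
        .iter()
        .filter_map(|id| host_groups.get(id))
        .flat_map(|group| group.hosts.iter().map(|h| h.as_str()))
        .collect()
}
```

Partition-key hashing and remote error handling are left out, matching the note that later commits will add them.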
This commit refactors the flatbuffers data types from the wal to a new crate where they can be used by storage, write buffer, and cluster. It also refactors cluster to move the configuration types out to the data types crate so they can be used across storage and elsewhere.
Finally, it adds a new method to store replicated writes on a database in the database trait and implements it.
This commit implements partition templates as a struct that can be serialized and deserialized. It is composed of parts that can include the table name, a column name and its value, a formatted time, or a string column and regex captures of its value.
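A minimal, dependency-free sketch of the idea follows. It covers only the table-name and column parts; the formatted-time and regex-capture parts, as well as serialization, are omitted because they would need external crates. The names `TemplatePart`, `PartitionTemplate`, and `partition_key`, and the `_` and `-` separators, are assumptions made for this example rather than the commit's actual definitions.

```rust
use std::collections::BTreeMap;

/// One piece of a partition key (hypothetical, simplified).
#[derive(Debug, Clone, PartialEq)]
enum TemplatePart {
    /// Emits the table name.
    Table,
    /// Emits `<column>_<value>` for the named column, if the row has it.
    Column(String),
}

/// An ordered list of parts that together produce a partition key.
#[derive(Debug, Clone, PartialEq)]
struct PartitionTemplate {
    parts: Vec<TemplatePart>,
}

impl PartitionTemplate {
    /// Renders the key for one row. Parts that cannot be rendered
    /// (for example, a column the row does not contain) are skipped.
    fn partition_key(&self, table: &str, columns: &BTreeMap<String, String>) -> String {
        let rendered: Vec<String> = self
            .parts
            .iter()
            .filter_map(|part| match part {
                TemplatePart::Table => Some(table.to_string()),
                TemplatePart::Column(name) => {
                    columns.get(name).map(|value| format!("{}_{}", name, value))
                }
            })
            .collect();
        rendered.join("-")
    }
}
```

With parts `[Table, Column("region")]`, a row in table `cpu` with `region=west` would render as `cpu-region_west` under these assumed separators.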
This updates cluster so that the concepts of replication and of subscriptions for handling queries are separated. It also adds a Flatbuffer structure that can be used as a common format for replication.
* chore: Refactor write buffer WAL
This commit refactors the WAL to remove partition events and to collapse rows into a single write buffer entry.
This further simplifies the WAL by removing WriteBufferBatch.
Finally, this removes the concept of a partition generation as that is currently not used.
* refactor: WriteBuffer database and WAL Flatbuffers
This refactor updates the WriteBuffer write path significantly. At the public API it takes parsed lines, but then immediately converts them to a built Flatbuffer byte array, whose structure has also been significantly refactored.
The Flatbuffer structure has been updated so that a WriteBufferBatch contains a vec of WriteBufferEntry. Each of those entries corresponds to a collection of data that is bound for a single partition. The generated partition key is now kept as part of this entry.
Within the WriteBufferEntry you now have a vec of TableWriteBatch, each of which has the table name and a vec of Row. This pulls the table name out of the row, eliminating redundancy for writes that have multiple rows being written into the same table.
The database now has methods to accept the Flatbuffer WriteBufferEntry with updates down the line to Partition and Table.
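The nesting described above can be pictured with plain Rust structs. This is only a sketch: the real types are generated Flatbuffers, and the field names, the `ParsedLine` stand-in, and `build_batch` are hypothetical. It shows how grouping by partition key first and by table second stores each table name once per entry instead of once per row.

```rust
use std::collections::BTreeMap;

/// A single row: (column name, value) pairs, simplified to strings.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    values: Vec<(String, String)>,
}

/// All rows in one entry that are bound for the same table.
struct TableWriteBatch {
    name: String,
    rows: Vec<Row>,
}

/// All data bound for a single partition, with its generated key.
struct WriteBufferEntry {
    partition_key: String,
    table_batches: Vec<TableWriteBatch>,
}

/// The top-level container: one entry per partition touched by the write.
struct WriteBufferBatch {
    entries: Vec<WriteBufferEntry>,
}

/// Hypothetical stand-in for a parsed line with its partition key resolved.
struct ParsedLine {
    partition_key: String,
    table: String,
    row: Row,
}

/// Groups lines by partition key, then by table name.
fn build_batch(lines: Vec<ParsedLine>) -> WriteBufferBatch {
    let mut grouped: BTreeMap<String, BTreeMap<String, Vec<Row>>> = BTreeMap::new();
    for line in lines {
        grouped
            .entry(line.partition_key)
            .or_default()
            .entry(line.table)
            .or_default()
            .push(line.row);
    }
    let entries = grouped
        .into_iter()
        .map(|(partition_key, tables)| WriteBufferEntry {
            partition_key,
            table_batches: tables
                .into_iter()
                .map(|(name, rows)| TableWriteBatch { name, rows })
                .collect(),
        })
        .collect();
    WriteBufferBatch { entries }
}
```

A `BTreeMap` is used here only to make the output order deterministic; the actual write path builds a Flatbuffer byte array rather than owned structs.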
This also has a nice little performance bump for WAL restore:
wal-restoration/restore_single_entry_single_partition
    time:   [684.51 us 688.45 us 692.53 us]
    thrpt:  [1.4440 Melem/s 1.4525 Melem/s 1.4609 Melem/s]
  change:
    time:   [-55.913% -55.351% -54.800%] (p = 0.00 < 0.05)
    thrpt:  [+121.24% +123.97% +126.82%]
    Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

wal-restoration/restore_multiple_entry_multiple_partition
    time:   [8.7483 ms 8.8964 ms 9.0815 ms]
    thrpt:  [1.3214 Melem/s 1.3489 Melem/s 1.3717 Melem/s]
  change:
    time:   [-55.952% -55.166% -54.213%] (p = 0.00 < 0.05)
    thrpt:  [+118.40% +123.04% +127.02%]
    Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe
* fix: fmt
Co-authored-by: alamb <andrew@nerdnetworks.org>
* feat: write_database support for predicates
* fix: temporarily pull in arrow fork to pick up fix for ARROW-10136
* fix: Update mutex usage based on PR feedback
* fix: more mutex polish and use OptionExt
* fix: update comments
* fix: rust-fu the table lookup
* fix: update docs
* fix: more idiomatic rust types
* fix: better usage of reference types