influxdb/proto/delorean/wal.fbs

namespace wal;

table Entry {
  entry_type: EntryType;
}

table EntryType {
  write: Write;
  delete: Delete;
}

table Write {
  points: [Point];
}

table I64Value {
  value: int64;
}

table U64Value {
  value: uint64;
}

table F64Value {
  value: float64;
}

table BoolValue {
  value: bool;
}

table StringValue {
  value: string;
}

union PointValue {
  I64Value,
  U64Value,
  F64Value,
  BoolValue,
  StringValue
}

table Point {
  key: string;
  time: int64;
  value: PointValue;
}

table Delete {
  predicate: string;
  start_time: int64;
  stop_time: int64;
}
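FlatBuffers unions can only hold tables, not bare scalars, which is why each scalar type above gets a single-field wrapper table (I64Value, U64Value, and so on) that PointValue can then reference. A minimal sketch of what one such wrapper looks like on the wire, built with the Python flatbuffers runtime by hand (illustrative only; real code would use flatc-generated accessors):

```python
import flatbuffers
from flatbuffers import number_types as N

# Build `table I64Value { value: int64; }` by hand with the runtime
# Builder. flatc-generated helpers would wrap exactly these calls;
# this only illustrates the table/vtable layout.
b = flatbuffers.Builder(64)
b.StartObject(1)               # I64Value has a single field slot
b.PrependInt64Slot(0, 42, 0)   # slot 0 = `value`, default 0
root = b.EndObject()
b.Finish(root)
buf = b.Output()

# Read it back the way generated accessors do: resolve the root
# offset, then look up slot 0 (vtable offset 4 is the first field).
pos = flatbuffers.encode.Get(flatbuffers.packer.uoffset, buf, 0)
tab = flatbuffers.Table(buf, pos)
o = tab.Offset(4)
value = tab.Get(N.Int64Flags, o + tab.Pos) if o != 0 else 0
print(value)  # 42
```

Because 42 differs from the declared default of 0, the field is actually stored; a default-valued field would be elided and the reader would fall back to the default.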

table WriteBufferBatch {
  entries: [WriteBufferEntry];
}

table WriteBufferEntry {
  partition_key: string;
  table_batches: [TableWriteBatch];
  delete: WriteBufferDelete;
}

enum ColumnType : byte { I64, U64, F64, Tag, String, Bool }

table TableWriteBatch {
  name: string;
  rows: [Row];
}

table Row {
  values: [Value];
}

table TagValue {
  value: string;
}

union ColumnValue {
  TagValue,
  I64Value,
  U64Value,
  F64Value,
  BoolValue,
  StringValue
}

table Value {
  column: string;
  value: ColumnValue;
}
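A union field such as `value: ColumnValue` occupies two slots in the containing table: a hidden `value_type` byte naming the member (numbered in declaration order from 1, with 0 reserved for NONE, so TagValue is 1) plus the offset to the payload table. A sketch of encoding and decoding one `Value` with the Python flatbuffers runtime by hand (illustrative only; generated code hides these details, and the column/tag strings are made-up examples):

```python
import flatbuffers
from flatbuffers import number_types as N

# ColumnValue members in declaration order: TagValue = 1, I64Value = 2, ...
TAG_VALUE = 1

b = flatbuffers.Builder(128)
tag = b.CreateString("server-a")   # strings are created outside any object
col = b.CreateString("host")

b.StartObject(1)                   # TagValue { value: string; }
b.PrependUOffsetTRelativeSlot(0, tag, 0)
tag_value = b.EndObject()

# Value takes three slots: column, the hidden value_type byte,
# and the offset to the union payload table.
b.StartObject(3)
b.PrependUOffsetTRelativeSlot(0, col, 0)        # column
b.PrependUint8Slot(1, TAG_VALUE, 0)             # value_type
b.PrependUOffsetTRelativeSlot(2, tag_value, 0)  # value
root = b.EndObject()
b.Finish(root)
buf = b.Output()

# Read back: slots 0/1/2 live at vtable offsets 4/6/8.
pos = flatbuffers.encode.Get(flatbuffers.packer.uoffset, buf, 0)
tab = flatbuffers.Table(buf, pos)
column = tab.String(tab.Offset(4) + tab.Pos).decode()
vtype = tab.Get(N.Uint8Flags, tab.Offset(6) + tab.Pos)
payload = flatbuffers.Table(bytearray(), 0)
tab.Union(payload, tab.Offset(8))               # follow into TagValue
value = payload.String(payload.Offset(4) + payload.Pos).decode()
print(column, vtype, value)  # host 1 server-a
```

A reader must dispatch on `value_type` before interpreting the payload; reading the payload as the wrong member is undefined behavior in FlatBuffers.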

table WriteBufferDelete {
  table_name: string;
  predicate: string;
}
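Putting the tables together: one WAL write nests WriteBufferBatch -> WriteBufferEntry (one per partition, carrying the generated partition key) -> TableWriteBatch (one per table, so the table name is stored once rather than per row) -> Row -> Value. Sketched as plain Python data with made-up values, purely to show the shape (the real WAL stores this as a FlatBuffer, not dicts):

```python
# Hypothetical single-partition, single-table write under this schema.
batch = {                                # WriteBufferBatch
    "entries": [{                        # WriteBufferEntry
        "partition_key": "2020-10-02",   # made-up partition key
        "table_batches": [{              # TableWriteBatch
            "name": "cpu",
            "rows": [{                   # Row
                "values": [              # Value -> ColumnValue union
                    {"column": "host", "value": ("TagValue", "server-a")},
                    {"column": "usage_user", "value": ("F64Value", 0.64)},
                    {"column": "time", "value": ("I64Value", 1601660000000000000)},
                ],
            }],
        }],
    }],
}

row = batch["entries"][0]["table_batches"][0]["rows"][0]
print([v["column"] for v in row["values"]])  # ['host', 'usage_user', 'time']
```

Grouping rows under TableWriteBatch is what eliminates the per-row table name that the earlier Point-based layout carried in `key`.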