influxdb/wal/tests/total_size.rs

use std::fs;

use wal::{WalBuilder, WritePayload};

#[macro_use]
mod helpers;
use helpers::Result;
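
// This test checks that `Wal::total_size` tracks the WAL's on-disk footprint
// through file rollover, an out-of-band file deletion, and a restart.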
#[test]
#[allow(clippy::cognitive_complexity)]
fn total_size() -> Result {
    let dir = test_helpers::tmp_dir()?;

    // Set the file rollover size limit low to test how rollover interacts with
    // total size
    let builder = WalBuilder::new(dir.as_ref()).file_rollover_size(100);
    let mut wal = builder.clone().wal()?;

    // Should start without existing WAL files; this implies total file size on
    // disk is 0
    let wal_files = helpers::wal_file_names(&dir.as_ref());
    assert!(wal_files.is_empty());

    // Total size should be 0
    assert_eq!(wal.total_size(), 0);
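
    // `create_and_sync_batch!` comes from the local `helpers` module (not shown
    // here); presumably it wraps each string in a `WritePayload`, appends it to
    // the WAL, and syncs so the bytes are on disk before sizes are compared.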
create_and_sync_batch!(wal, ["some data within the file limit"]);
// Total size should be that of all the files
assert_eq!(wal.total_size(), helpers::total_size_on_disk(&dir.as_ref()));
// Write one WAL entry that ends up in the same WAL file
create_and_sync_batch!(wal, ["some more data that puts the file over the limit"]);
// Total size should be that of all the files
assert_eq!(wal.total_size(), helpers::total_size_on_disk(&dir.as_ref()));
// Write one WAL entry, and because the existing file is over the size limit,
// this entry should end up in a new WAL file
create_and_sync_batch!(
wal,
["some more data, this should now be rolled over into the next WAL file"]
);
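
    // (Hypothetical sanity check, assuming the rollover above produced a second
    // segment file; a direct verification of the file count would look like:)
    // assert_eq!(helpers::wal_file_names(&dir.as_ref()).len(), 2);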

    // Total size should be that of all the files
    assert_eq!(wal.total_size(), helpers::total_size_on_disk(&dir.as_ref()));

    let total_file_size_before_delete = helpers::total_size_on_disk(&dir.as_ref());

    // Some process deletes the first WAL file
    let path = dir.path().join(helpers::file_name_for_sequence_number(0));
    fs::remove_file(path)?;

    // Total size isn't aware of the out-of-band deletion
    assert_eq!(wal.total_size(), total_file_size_before_delete);
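
    // Presumably `total_size` reports an in-memory counter maintained on append
    // rather than re-reading the directory, which is why it still counts the
    // deleted file.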

    // Pretend the process restarts
    let wal = builder.wal()?;
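
    // (Presumably opening the WAL re-scans the directory, so the recovered
    // total reflects only the files that still exist.)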

    // Total size should be that of all the files, i.e., excluding the file
    // deleted out-of-band
    assert_eq!(wal.total_size(), helpers::total_size_on_disk(&dir.as_ref()));

    Ok(())
}