fix: Change WAL to Write Buffer in comments and documentation
parent c7e6a0a797
commit 80995afb70
@@ -189,10 +189,10 @@ cargo clippy --all-targets --workspace -- -D warnings

 ## Upgrading the `flatbuffers` crate

-IOx uses Flatbuffers for its write-ahead log. The structure is defined in
-[`generated_types/protos/wal.fbs`]. We have then used the `flatc` Flatbuffers compiler to generate
-the corresponding Rust code in [`generated_types/src/wal_generated.rs`], which is checked in to the
-repository.
+IOx uses Flatbuffers for its write buffer. The structure is defined in
+[`generated_types/protos/write_buffer.fbs`]. We have then used the `flatc` Flatbuffers compiler to
+generate the corresponding Rust code in [`generated_types/src/write_buffer_generated.rs`], which
+is checked in to the repository.

 The checked-in code is compatible with the `flatbuffers` crate version in the `Cargo.lock` file. If
 upgrading the version of the `flatbuffers` crate that IOx depends on, the generated code will need
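After regenerating, a quick way to confirm the checked-in code still matches the pinned crate is to compile a small smoke test against it. A hedged sketch, assuming the module path follows the checked-in file name and that `Segment`/`SegmentArgs` follow standard `flatc --rust` output:

```rust
// Hedged sketch: the `write_buffer_generated` module path and the generated
// Segment/SegmentArgs names are assumptions from standard flatc output,
// not the repository's exact re-exports.
use generated_types::write_buffer_generated::{Segment, SegmentArgs};

fn smoke_test_regenerated_code() -> Vec<u8> {
    let mut fbb = flatbuffers::FlatBufferBuilder::new();
    // Build an empty Segment table and serialize it with the pinned crate.
    let segment = Segment::create(&mut fbb, &SegmentArgs::default());
    fbb.finish(segment, None);
    fbb.finished_data().to_vec()
}
```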
@@ -15,7 +15,7 @@ pub enum Job {
         nanos: Vec<u64>,
     },

-    /// Persist a WAL segment to object store
+    /// Persist a Write Buffer segment to object store
     PersistSegment {
         writer_id: u32,
         segment_id: u64,
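For orientation, a minimal sketch of how a consumer might dispatch on the renamed variant (the `describe` helper is hypothetical, not repository code):

```rust
// Hypothetical helper: report what a Job will do. The `..` guards against
// any PersistSegment fields not shown in this hunk.
fn describe(job: &Job) -> String {
    match job {
        Job::PersistSegment { writer_id, segment_id, .. } => {
            format!("persist Write Buffer segment {} from writer {}", segment_id, writer_id)
        }
        _ => "other job".to_string(),
    }
}
```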
@@ -8,7 +8,7 @@ The data lifecycle of time series data usually involves some sort of real-time i

 The real-time aspect of time series data for monitoring, ETL, and visualization for recent data is what IOx is optimizing for. Because IOx uses object storage and Parquet as its persistence format, we can defer larger scale and more ad hoc processing to systems that are well suited for the task.

-IOx defines APIs and configuration to manage the movement of data as it arrives and periodically in the background. These can be used to send data to other IOx servers for processing, query or sending it to object storage in the form of write ahead log segments or Parquet files and their summaries.
+IOx defines APIs and configuration to manage the movement of data as it arrives and periodically in the background. These can be used to send data to other IOx servers for processing, query or sending it to object storage in the form of write buffer segments or Parquet files and their summaries.

 ## Vocabulary Definitions
 This section describes the key terms and vocabulary that is used in this document. Brief descriptions are given, but their meanings should become more clear as you read through the rest of this document. It’s impossible to talk about the data lifecycle without touching on each of these terms.
@@ -22,17 +22,17 @@ This section describes the key terms and vocabulary that is used in this documen

 *Partition*: a grouping of data within a database defined by the administrator of IOx. Each row is assigned to a partition based on a string identifier called its partition key. Partitions within a database are generally non-overlapping, but because the rules for how to generate partition keys can change over time, this isn’t a strict requirement. When querying data, a partition key is irrelevant and only the actual data contained in the partition is considered when determining the best plan for querying.

-*WAL Buffer*: a buffer for entries into a write ahead log. This buffer can exist in-memory only or as a file appended to on the local filesystem.
+*Write Buffer*: a buffer for entries into a write buffer. This buffer can exist in-memory only or as a file appended to on the local filesystem.

-*WAL Segment*: a historical part of the WAL buffer that has a monotonically increasing identifier. Segments are an ordered collection of individual WAL entries. Segments can exist in memory or as a read-only file on local disk or object storage. Its filename should be its identifier as a base 10 number with 10 digits with leading zero padding to make it sort correctly by filename.
+*Write Buffer Segment*: a historical part of the Write Buffer that has a monotonically increasing identifier. Segments are an ordered collection of individual Write Buffer entries. Segments can exist in memory or as a read-only file on local disk or object storage. Its filename should be its identifier as a base 10 number with 10 digits with leading zero padding to make it sort correctly by filename.

 *Mutable Buffer*: an in-memory collection of data that can be actively written to and queried from. It is optimized for incoming, real-time writes. Its purpose is to buffer data for a given partition while that partition is being actively written to so that it can later be converted to a read-optimized format for persistence and querying.

-*Read Buffer*: an in-memory read-optimized collection of data. Data within the read buffer is organized into large immutable chunks. Each chunk within the read buffer must be created all at once (rather than as data is written into the DB). This can be done from data buffered in the mutable buffer, or from Parquet files, or from WAL segments.
+*Read Buffer*: an in-memory read-optimized collection of data. Data within the read buffer is organized into large immutable chunks. Each chunk within the read buffer must be created all at once (rather than as data is written into the DB). This can be done from data buffered in the mutable buffer, or from Parquet files, or from Write Buffer segments.

 *Object Store*: a store that can get, put, or delete individual objects with a path like /foo/bar/asdf.txt. It also supports listing objects in the store with an optional prefix filter. Underlying implementations exist for AWS S3, Google Cloud Storage, Azure Blob Storage, and for single node deployments, the local file system, or in memory.

-*Chunk*: a chunk is a collection of data within a partition. A database can have many partitions, and each partition can have many chunks. Each chunk only contains data from a single partition. A chunk is a collection of tables and their data, the schema, and the summary statistics that describe the data within a chunk. Chunks, like WAL segments, have a monotonically increasing ID, which is reflected in their path in the durable store. Chunks can exist in the Mutable Buffer, Read Buffer, and Object Store.
+*Chunk*: a chunk is a collection of data within a partition. A database can have many partitions, and each partition can have many chunks. Each chunk only contains data from a single partition. A chunk is a collection of tables and their data, the schema, and the summary statistics that describe the data within a chunk. Chunks, like Write Buffer segments, have a monotonically increasing ID, which is reflected in their path in the durable store. Chunks can exist in the Mutable Buffer, Read Buffer, and Object Store.

 *Chunk Summary*: the summary information for what tables exist, what their column and data types are, and additional metadata such as what the min, max and count are for each column in each table. Chunk summaries can be rolled up to create a partition summary and partition summaries can be rolled up to create a database summary.
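The filename rule in the *Write Buffer Segment* definition above amounts to zero-padding the identifier to ten digits. A small sketch (the function name is illustrative, not from the codebase):

```rust
// Illustrative only: render a segment identifier as a base 10 number,
// zero-padded to 10 digits so filenames sort in segment order.
fn segment_file_name(segment_id: u64) -> String {
    format!("{:010}", segment_id)
}

fn main() {
    assert_eq!(segment_file_name(42), "0000000042");
    assert_eq!(segment_file_name(1_000_000), "0001000000");
}
```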
@@ -42,13 +42,13 @@ This section describes the key terms and vocabulary that is used in this documen
 1. An IOx server defines the following parts of the data lifecycle:
 2. Ingest (from InfluxDB Line Protocol, JSON, or Flatbuffers)
 3. Schema validation & partition key creation
-4. WAL buffering and shipping WAL segments to object storage
+4. Write buffering and shipping Write Buffer segments to object storage
 5. Writes to In-memory buffer that is queryable (named MutableBuffer)
 6. Synchronous Replication (inside the write request/response)
-7. Subscriptions (sent outside the write request/response from the WAL buffer)
+7. Subscriptions (sent outside the write request/response from the Write Buffer)
 8. Creation of immutable in-memory Chunks (in the ReadBuffer)
 9. Creation of immutable durable Chunks (in the ObjectStore)
-10. Tails (subscriptions, but from WAL Segments in object storage potentially from many writers)
+10. Tails (subscriptions, but from Write Buffer Segments in object storage potentially from many writers)
 11. Loading of Parquet files into the Read Buffer

 Each part of this lifecycle is optional. This can be run entirely on a single server or it can be split up across many servers to isolate workloads. Here’s a high level overview of the lifecycle flow:
@@ -84,7 +84,7 @@ From drawings/data_lifecycle.monopic
            ▼                         ▼                       ▼
 ┌─────────────────────┐  ┌───────────────────────┐  ┌────────────────────┐
 │                     │  │      Synchronous      │  │                    │
-│   Mutable Buffer    │  │      Replication      │  │     WAL Buffer     │
+│   Mutable Buffer    │  │      Replication      │  │    Write Buffer    │
 │                     │  │                       │  │                    │
 └─────────────────────┘  └───────────────────────┘  └────────────────────┘
            │                                                  │
@@ -93,13 +93,13 @@ From drawings/data_lifecycle.monopic
        │                  │                   │                     │                       │
        ▼                  ▼                   ▼                     ▼                       ▼
 ┌──────────────┐  ┌────────────────┐  ┌────────────────┐  ┌────────────────────┐  ┌────────────────────┐
-│   Queries    │  │    Chunk to    │  │    Chunk to    │  │   WAL Segment to   │  │   Subscriptions    │
+│   Queries    │  │    Chunk to    │  │    Chunk to    │  │   WB Segment to    │  │   Subscriptions    │
 │              │  │  Read Buffer   │  │  Object Store  │  │    Object Store    │  │                    │
 └──────────────┘  └────────────────┘  └────────────────┘  └────────────────────┘  └────────────────────┘
 ```


-As mentioned previously, any of these boxes is optional. Because data can come from WAL Segments or Chunks in object storage, even the ingest path is optional.
+As mentioned previously, any of these boxes is optional. Because data can come from Write Buffer Segments or Chunks in object storage, even the ingest path is optional.

 ## How data is logically organized
@@ -11,9 +11,10 @@
 #
 #
 # The identifier for the server. Used for writing to object storage and as
-# an identifier that is added to replicated writes, WAL segments and Chunks.
-# Must be unique in a group of connected or semi-connected IOx servers.
-# Must be a number that can be represented by a 32-bit unsigned integer.
+# an identifier that is added to replicated writes, Write Buffer segments
+# and Chunks. Must be unique in a group of connected or semi-connected IOx
+# servers. Must be a number that can be represented by a 32-bit unsigned
+# integer.
 # INFLUXDB_IOX_ID=1
 #
 # Which object store implementation to use (defaults to Memory if unset)
@@ -49,7 +49,7 @@ This can not always be done (e.g. with a library such as parquet writer which is

 There will, of course, always be a judgment call to be made of where "CPU bound work" starts and "work acceptable for I/O processing" ends. A reasonable rule of thumb is if a job will *always* be completed in less than 100ms then that is probably fine for an I/O thread). This number may be revised as we tune the system.

-The following is a specific example of why it is important to be cautious when receiving RPC work and ensuring only minimal CPU work is done before handing off to a pool of workers. In some versions of InfluxDB, the `/write` HTTP handlers read, parse, allocate and forward the points to the InfluxDB TSM engine for writing to the WAL. The WAL is the first place where back pressure appears. The problem with this approach is you can have 1,000s of HTTP requests in flight parsing, allocating, consuming huge amounts of memory and CPU resources, causing servers to fall over. This situation can be avoided by forwarding the work to a work queue (from parsing onwards).
+The following is a specific example of why it is important to be cautious when receiving RPC work and ensuring only minimal CPU work is done before handing off to a pool of workers. In some versions of InfluxDB, the `/write` HTTP handlers read, parse, allocate and forward the points to the InfluxDB TSM engine for writing to the Write Buffer. The Write Buffer is the first place where back pressure appears. The problem with this approach is you can have 1,000s of HTTP requests in flight parsing, allocating, consuming huge amounts of memory and CPU resources, causing servers to fall over. This situation can be avoided by forwarding the work to a work queue (from parsing onwards).


 ## Examples
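The hand-off pattern described in the hunk above can be sketched with a bounded queue. This assumes a tokio runtime, and every name below is illustrative rather than IOx code:

```rust
// Illustrative sketch: the request handler does minimal work, then pushes
// the raw body onto a bounded queue; workers do the parsing and writing.
use tokio::sync::mpsc;

async fn handle_write(body: Vec<u8>, queue: mpsc::Sender<Vec<u8>>) -> Result<(), &'static str> {
    // try_send fails fast when the queue is full, surfacing back pressure
    // to the client instead of letting thousands of requests pile up.
    queue.try_send(body).map_err(|_| "server busy, retry later")
}

async fn worker(mut queue: mpsc::Receiver<Vec<u8>>) {
    while let Some(body) = queue.recv().await {
        // Parsing and the actual write happen here, bounded by queue capacity.
        let _points = std::str::from_utf8(&body);
    }
}
```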
@@ -29,7 +29,7 @@ message Dummy {
   repeated uint64 nanos = 1;
 }

-// A job that persists a WAL segment to object store
+// A job that persists a Write Buffer segment to object store
 message PersistSegment {
   uint32 writer_id = 1;
   uint64 segment_id = 2;
@@ -53,7 +53,7 @@ table Delete {
   stop_time: int64;
 }

-// Segment is a collection of ReplicatedWrites. It is the payload of a WAL
+// Segment is a collection of ReplicatedWrites. It is the payload of a Write Buffer
 // segment file.
 table Segment {
   // the segment number
@@ -226,8 +226,8 @@ pub fn split_lines_into_write_entry_partitions(
             .push(line);
     }

-    // create a WALEntry for each batch of lines going to a partition (one WALEntry
-    // per partition)
+    // create a WriteBufferEntry for each batch of lines going to a partition (one
+    // WriteBufferEntry per partition)
     let entries = partition_writes
         .into_iter()
         .map(|(key, lines)| add_write_entry(&mut fbb, Some(&key), &lines))
@@ -20,7 +20,7 @@ pub fn make_db() -> Db {
         server_id,
         object_store,
         exec,
-        None, // wal buffer
+        None, // write buffer
         Arc::new(JobRegistry::new()),
     )
 }
@@ -32,7 +32,7 @@ pub fn make_database(server_id: NonZeroU32, object_store: Arc<ObjectStore>, db_n
         server_id,
         object_store,
         exec,
-        None, // wal buffer
+        None, // write buffer
         Arc::new(JobRegistry::new()),
     )
 }
@@ -398,7 +398,7 @@ mem,host=A,region=west used=45 1
         server_id,
         object_store,
         exec,
-        None, // wal buffer
+        None, // write buffer
         Arc::new(JobRegistry::new()),
     )
 }
@@ -191,7 +191,7 @@ pub async fn command(url: String, config: Config) -> Result<()> {
         }],
     }),

-    // Note no wal buffer config
+    // Note no write buffer config
     ..Default::default()
 };
@@ -248,9 +248,9 @@ pub struct Config {
     /// The identifier for the server.
     ///
     /// Used for writing to object storage and as an identifier that is added to
-    /// replicated writes, WAL segments and Chunks. Must be unique in a group of
-    /// connected or semi-connected IOx servers. Must be a number that can be
-    /// represented by a 32-bit unsigned integer.
+    /// replicated writes, write buffer segments, and Chunks. Must be unique in
+    /// a group of connected or semi-connected IOx servers. Must be a number
+    /// that can be represented by a 32-bit unsigned integer.
     #[structopt(long = "--writer-id", env = "INFLUXDB_IOX_ID")]
     pub writer_id: Option<NonZeroU32>,
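As a usage sketch for the flag above, a caller might resolve the optional id to a plain `u32` like so (the helper and error text are illustrative, not from the codebase):

```rust
// Illustrative helper: unwrap the optional writer id from the Config above,
// turning an unset id into an actionable error.
use std::num::NonZeroU32;

fn resolve_writer_id(writer_id: Option<NonZeroU32>) -> Result<u32, String> {
    writer_id
        .map(NonZeroU32::get)
        .ok_or_else(|| "writer id must be set via --writer-id or INFLUXDB_IOX_ID".to_string())
}
```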