diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md
index 744bda9d1..25f188d90 100644
--- a/content/v2.0/reference/internals/storage-engine.md
+++ b/content/v2.0/reference/internals/storage-engine.md
@@ -30,6 +30,7 @@ Major topics include:
 The storage engine handles data from the point an API request is received through writing it to the physical disk.
 Data is written to InfluxDB using [line protocol](/v2.0/reference/line-) sent via HTTP POST request to the `/write` endpoint.
 Batches of [points](/v2.0/reference/glossary/#point) are sent to InfluxDB, compressed, and written to a WAL for immediate durability.
+(A *point* is a series key, field value, and timestamp.)
 The points are also written to an in-memory cache and become immediately queryable.
 The cache is periodically written to disk in the form of [TSM](#time-structured-merge-tree-tsm) files.
 As TSM files accumulate, they are combined and compacted into higher level TSM files.
@@ -50,21 +51,11 @@ When a client sends a write request, the following occurs:
 4. Return success to caller.

 `fsync()` takes the file and pushes pending writes all the way through any buffers and caches to disk.
-As a system call, `fsync()` has a kernel context switch which is expensive _in terms of time_ but guarantees your data is safe on disk.
-
-{{% note%}}
-To `fsync()` less frequently, batch your points.
-{{% /note %}}
+As a system call, `fsync()` has a kernel context switch which is computationally expensive, but guarantees your data is safe on disk.

 When the storage engine restarts, the WAL file is read back into the in-memory database.
 InfluxDB then snswer requests to the `/read` endpoint.

-
-
-
-
-
-
 {{% note%}}
 Once you receive a response to a write request, your data is on disk!
 {{% /note %}}
@@ -78,22 +69,20 @@ Data is not compressed in the cache.
 The cache is recreated on restart by re-reading the WAL files on disk back into memory.
 The cache is queried at runtime and merged with the data stored in TSM files.

-
-
-
-
+When the storage engine restarts, WAL files are re-read into the in-memory cache.

 Queries to the storage engine will merge data from the cache with data from the TSM files.
 Queries execute on a copy of the data that is made from the cache at query processing time.
 This way writes that come in while a query is running do not affect the result.

-Deletes sent to the Cache will clear out the given key or the specific time range for the given key.
+Deletes sent to the cache will clear out the given key or the specific time range for the given key.

 ## Time-Structured Merge Tree (TSM)

 To efficiently compact and store data,
-the storage engine groups field values by [series](/v2.0/reference/key-concepts/data-elements/#series) key,
+the storage engine groups field values by series key,
 and then orders those field values by time.
+(A *series key* is defined by measurement, tag key and value, and field key.)

 The storage engine uses a **Time-Structured Merge Tree** (TSM) data format.
 TSM files store compressed series data in a columnar format.
@@ -101,13 +90,7 @@ To improve efficiency, the storage engine only stores differences (or *deltas*)
 Column-oriented storage means we can read by series key and ignore what it doesn't need.
 Storing data in columns lets the storage engine read by series key.

-
-
-
-
-
-
-After fields are stored safely in TSM files, WAL is truncated...
+After fields are stored safely in TSM files, the WAL is truncated and the cache is cleared.

 There’s a lot of logic and sophistication in the TSM compaction code.

@@ -116,7 +99,7 @@ organize values for a series together into long runs to best optimize compressio

 ## Time Series Index (TSI)

-As data cardinality (number of series) grows, queries read more series keys and become slower.
+As data cardinality (the number of series) grows, queries read more series keys and become slower.
 The **Time Series Index** ensures queries remain fast as data cardinality of data grows...
 To keep queries fast as we have more data, we use a **Time Series Index**.

@@ -127,28 +110,3 @@ The TSI stores series keys grouped by measurement, tag, and field.
 TSI answers two questions well:
 1) What measurements, tags, fields exist?
 2) Given a measurement, tags, and fields, what series keys exist?
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-