From 3c6164953ea05727fa21910b17c513ddd6e5d126 Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Mon, 21 Oct 2019 11:18:48 -0700 Subject: [PATCH 01/18] Add initial outline and notes for storage engine documentation Uses material based on v1 docs and: - https://www.influxdata.com/blog/influxdb-internals-101-part-one/ - https://www.youtube.com/watch?v=8dOZGnSwKmo --- content/v2.0/reference/storage-engine.md | 202 +++++++++++++++++++++++ 1 file changed, 202 insertions(+) create mode 100644 content/v2.0/reference/storage-engine.md diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md new file mode 100644 index 000000000..12dbb9e8f --- /dev/null +++ b/content/v2.0/reference/storage-engine.md @@ -0,0 +1,202 @@ +--- +title: InfluxDB storage engine +description: > + Concepts related to InfluxDB storage engine. +weight: 7 +menu: + v2_0_ref: + name: Storage engine +v2.0/tags: [storage engine, internals, platform] +--- + +## Introduction + +The InfluxDB 2.0 platform includes: + +- query engine +- internal API +- storage engine + +The query engine sends requests through the internal API to the storage engine. +The storage engine ensures the following three things: + +- Data is safely written to disk +- Queried data is returned complete and correct +- Data is accurate (first) and performant (second) + +This document details the internal workings of the storage engine. +This information is presented both as a reference and to aid those looking to maximize performance. + +{{% note %}} +##### At a glance: Changes in InfluxDB 2.0 +- The InfluxDB 2.0 storage engine no longer partitions data into shards by time. +- **Buckets** replace databases and retention policies. +- Only TSI is used. There is no more in-memory index. +- The configuration interface and options have changed. Configuration options are not currently exposed, but will be. 
+ +[Read about the v1 storage engine](https://docs.influxdata.com/influxdb/v1.7/concepts/storage_engine). +{{% /note %}} + +## Summary + +In summary, batches of points are POSTed to InfluxDB. +Those batches are snappy compressed and written to a WAL for immediate durability. +The points are also written to an in-memory cache so that newly written points are immediately queryable. +The cache is periodically flushed to TSM files. +As TSM files accumulate, they are combined and compacted into higher level TSM files. +TSM data is organized into shards. +The time range covered by a shard and the replication factor of a shard in a clustered deployment are configured by the retention policy. + + + + + + + + + + + + + + + + + + + +## /write endpoint + +Data is written to InfluxDB using [Line protocol](/) sent via HTTP POST request to the `/write` endpoint. +Points can be sent individually; however, for efficiency, most applications send points in batches. +A typical batch ranges in size from hundreds to thousands of points. +Points in a POST body can be from an arbitrary number of series. +Points in a batch do not have to be from the same measurement or tagset. + +## Durability: Write Ahead Log (WAL) + + + + +To ensure durability, we use a Write Ahead Log (WAL). + +WAL is a data structure and algorithm that is super simple and powerful. +It ensures that written data does not disappear when storage engine restarts. +When a client sends a /write request, the following occurs: + +1. Write request is appended to the end of the WAL file. +2. fsync() the data to the file. +3. Update the in-memory database. +4. Return success to caller. + +fsync() is a system call, so it has a kernel context switch which costs something. +fsync() takes the file and pushes pending writes all the way through any buffers and caches to disk. +fsync() is expensive _in terms of time_ but guarantees your data is safe on disk. 
+**Important** to batch your points (send in ~2000 points at a time), to fsync() less frequently. + + +When the storage engine restarts, open WAL file and read it back into the in-memory database. +Answer requests to the /read endpoint. + + + + + + + + + + + + + + + + + + + +{{% note%}} +Once you receive a response to a write request, your data is on disk! +{{% /note %}} + +## Cache + + +The Cache is an in-memory copy of all data points current stored in the WAL. +The points are organized by the key, which is the measurement, [tag set](/influxdb/v1.7/concepts/glossary/#tag-set), and unique [field](/influxdb/v1.7/concepts/glossary/#field). +Each field is kept as its own time-ordered range. +The Cache data is not compressed while in memory. + +Queries to the storage engine will merge data from the Cache with data from the TSM files. +Queries execute on a copy of the data that is made from the cache at query processing time. +This way writes that come in while a query is running won't affect the result. + +Deletes sent to the Cache will clear out the given key or the specific time range for the given key. + + + + + + + + + + + +The cache is recreated on restart by re-reading the WAL files on disk back into memory. + +## Time-Structured Merge Tree + +Now let's handle more Data! +Queries get slower as data grows +Service terminates if data size exceeds memory... + +**Time-Structured Merge Tree** (TSM) is our data format +We group field values grouped by series key, then order field values by time + + +series key = measurement, tag key+value, field key +point = series key, field value, timestamp + +Within a series, we store only differences between values, which is more efficient. + +Column-Oriented storage means we can read by series key and ignore what it doesn't need. + +Compression helps with performance. + +After fields are stored safely in TSM files, WAL is truncated... 
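
The point about storing only differences between values can be illustrated with a minimal delta encoding in Python. This is a simplified sketch of the idea only; the encodings TSM actually applies are more elaborate and type-specific, and the function names here are invented for illustration.

```python
def delta_encode(values):
    """Keep the first value, then only the difference from the previous value."""
    if not values:
        return []
    return [values[0]] + [cur - prev for prev, cur in zip(values, values[1:])]

def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total)
    return values

# Regularly spaced timestamps collapse into small, repetitive deltas,
# which later compression stages handle very well.
timestamps = [1571858400, 1571858410, 1571858420, 1571858430]
encoded = delta_encode(timestamps)  # [1571858400, 10, 10, 10]
```

A run of identical small deltas compresses far better than the raw timestamps, which is why ordering values by time within a series pays off.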
+ +(This stuff is configurable) + +There’s a lot of logic and sophistication in the TSM compaction code. +However, the high-level goal is quite simple: +organize values for a series together into long runs to best optimize compression and scanning queries. + +## TSI + +To keep queries fast as we have more data, we use a **Time Series Index**. +cardinality = quantity of series keys +With high cardinality, we have to search through all series keys. +So how to quickly find and match series keys? +We use Time Series Index (TSI), which stores series keys grouped by measurement,tag,field +TSI answers question what measurements, tags, fields exist? + + + + + + + + + + + + + + + + + + + From 61ce3d47ff839b6b9a7d5ff6e74f9181793968fd Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Thu, 24 Oct 2019 10:41:56 -0700 Subject: [PATCH 02/18] Re-structure storage engine doc --- content/v2.0/reference/storage-engine.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md index 12dbb9e8f..0cfaf5a0e 100644 --- a/content/v2.0/reference/storage-engine.md +++ b/content/v2.0/reference/storage-engine.md @@ -11,14 +11,7 @@ v2.0/tags: [storage engine, internals, platform] ## Introduction -The InfluxDB 2.0 platform includes: - -- query engine -- internal API -- storage engine - -The query engine sends requests through the internal API to the storage engine. -The storage engine ensures the following three things: +The InfluxDB storage engine ensures the following three things: - Data is safely written to disk - Queried data is returned complete and correct @@ -27,17 +20,24 @@ The storage engine ensures the following three things: This document details the internal workings of the storage engine. This information is presented both as a reference and to aid those looking to maximize performance. 
+Major topics include: + +* [Write Ahead Log (WAL)](#) +* [Time-Structed Merge Tree (TSM)](#) +* [Time Series Index (TSI)](#) + {{% note %}} -##### At a glance: Changes in InfluxDB 2.0 +##### At a glance: changes to the storage engine in InfluxDB 2.0 - The InfluxDB 2.0 storage engine no longer partitions data into shards by time. - **Buckets** replace databases and retention policies. - Only TSI is used. There is no more in-memory index. - The configuration interface and options have changed. Configuration options are not currently exposed, but will be. -[Read about the v1 storage engine](https://docs.influxdata.com/influxdb/v1.7/concepts/storage_engine). +Read about the [v1 storage engine](https://docs.influxdata.com/influxdb/v1.7/concepts/storage_engine). {{% /note %}} -## Summary + +## Writing data: from API to disk In summary, batches of points are POSTed to InfluxDB. Those batches are snappy compressed and written to a WAL for immediate durability. @@ -183,7 +183,7 @@ We use Time Series Index (TSI), which stores series keys grouped by measurement, TSI answers question what measurements, tags, fields exist? - + From 496673f6a44f2e04f4b2b9c347c8e7f96abcb547 Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Thu, 24 Oct 2019 11:13:35 -0700 Subject: [PATCH 03/18] Redistribute v1 storage engine information --- content/v2.0/reference/storage-engine.md | 60 ++++++++++++------------ 1 file changed, 31 insertions(+), 29 deletions(-) diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md index 0cfaf5a0e..aae389564 100644 --- a/content/v2.0/reference/storage-engine.md +++ b/content/v2.0/reference/storage-engine.md @@ -1,7 +1,7 @@ --- title: InfluxDB storage engine description: > - Concepts related to InfluxDB storage engine. + An overview of the InfluxDB storage engine architecture. 
weight: 7 menu: v2_0_ref: @@ -36,7 +36,6 @@ Major topics include: Read about the [v1 storage engine](https://docs.influxdata.com/influxdb/v1.7/concepts/storage_engine). {{% /note %}} - ## Writing data: from API to disk In summary, batches of points are POSTed to InfluxDB. @@ -44,28 +43,10 @@ Those batches are snappy compressed and written to a WAL for immediate durabilit The points are also written to an in-memory cache so that newly written points are immediately queryable. The cache is periodically flushed to TSM files. As TSM files accumulate, they are combined and compacted into higher level TSM files. -TSM data is organized into shards. -The time range covered by a shard and the replication factor of a shard in a clustered deployment are configured by the retention policy. + + - - - - - - - - - - - - - - - - - - -## /write endpoint + Data is written to InfluxDB using [Line protocol](/) sent via HTTP POST request to the `/write` endpoint. Points can be sent individually; however, for efficiency, most applications send points in batches. @@ -78,6 +59,8 @@ Points in a batch do not have to be from the same measurement or tagset. + + To ensure durability, we use a Write Ahead Log (WAL). WAL is a data structure and algorithm that is super simple and powerful. @@ -107,7 +90,7 @@ Answer requests to the /read endpoint. - + @@ -122,16 +105,19 @@ Once you receive a response to a write request, your data is on disk! ## Cache +Queries to the storage engine will merge data from the Cache with data from the TSM files. +Queries execute on a copy of the data that is made from the cache at query processing time. +This way writes that come in while a query is running won’t affect the result. + + + + The Cache is an in-memory copy of all data points current stored in the WAL. The points are organized by the key, which is the measurement, [tag set](/influxdb/v1.7/concepts/glossary/#tag-set), and unique [field](/influxdb/v1.7/concepts/glossary/#field). 
Each field is kept as its own time-ordered range. The Cache data is not compressed while in memory. -Queries to the storage engine will merge data from the Cache with data from the TSM files. -Queries execute on a copy of the data that is made from the cache at query processing time. -This way writes that come in while a query is running won't affect the result. - Deletes sent to the Cache will clear out the given key or the specific time range for the given key. @@ -148,6 +134,9 @@ The cache is recreated on restart by re-reading the WAL files on disk back into ## Time-Structured Merge Tree + + + Now let's handle more Data! Queries get slower as data grows Service terminates if data size exceeds memory... @@ -183,8 +172,8 @@ We use Time Series Index (TSI), which stores series keys grouped by measurement, TSI answers question what measurements, tags, fields exist? + - @@ -200,3 +189,16 @@ TSI answers question what measurements, tags, fields exist? + + + + + + + + + + + + + From d67081b98a8e0d50b5c85fb5f9b6b6854540ca7e Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Thu, 24 Oct 2019 12:58:32 -0700 Subject: [PATCH 04/18] Work on drafting storage engine doc --- content/v2.0/reference/storage-engine.md | 102 +++++++++-------------- 1 file changed, 41 insertions(+), 61 deletions(-) diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md index aae389564..2d03092b8 100644 --- a/content/v2.0/reference/storage-engine.md +++ b/content/v2.0/reference/storage-engine.md @@ -9,8 +9,6 @@ menu: v2.0/tags: [storage engine, internals, platform] --- -## Introduction - The InfluxDB storage engine ensures the following three things: - Data is safely written to disk @@ -22,12 +20,12 @@ This information is presented both as a reference and to aid those looking to ma Major topics include: -* [Write Ahead Log (WAL)](#) -* [Time-Structed Merge Tree (TSM)](#) -* [Time Series Index (TSI)](#) +* [Write Ahead Log 
(WAL)](#write-ahead-log-wal) +* [Time-Structed Merge Tree (TSM)](#time-structured-merge-tree-tsm) +* [Time Series Index (TSI)](#time-series-index-tsi) {{% note %}} -##### At a glance: changes to the storage engine in InfluxDB 2.0 +##### At a glance: changes to the InfluxDB storage engine from 1.x to 2.0 - The InfluxDB 2.0 storage engine no longer partitions data into shards by time. - **Buckets** replace databases and retention policies. - Only TSI is used. There is no more in-memory index. @@ -38,38 +36,32 @@ Read about the [v1 storage engine](https://docs.influxdata.com/influxdb/v1.7/con ## Writing data: from API to disk -In summary, batches of points are POSTed to InfluxDB. -Those batches are snappy compressed and written to a WAL for immediate durability. -The points are also written to an in-memory cache so that newly written points are immediately queryable. -The cache is periodically flushed to TSM files. -As TSM files accumulate, they are combined and compacted into higher level TSM files. - - - - - +The storage engine handles data from the point an API request is received through writing it to the physical disk. Data is written to InfluxDB using [Line protocol](/) sent via HTTP POST request to the `/write` endpoint. +Batches of points are sent to InfluxDB. +Those batches are compressed and written to a WAL for immediate durability. +The points are also written to an in-memory cache so that newly written points are immediately queryable. +The cache is periodically written to disk as TSM files. +As TSM files accumulate, they are combined and compacted into higher level TSM files. + Points can be sent individually; however, for efficiency, most applications send points in batches. A typical batch ranges in size from hundreds to thousands of points. Points in a POST body can be from an arbitrary number of series. Points in a batch do not have to be from the same measurement or tagset. 
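
To illustrate the batch shape, here is a small Python sketch that assembles points from two different measurements and tag sets into a single POST body. The `to_line_protocol` helper is hypothetical, written only for this example; it is not part of any InfluxDB client library.

```python
def to_line_protocol(measurement, tags, fields, timestamp):
    """Hypothetical helper: render one point as a line of line protocol."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {timestamp}"

# Points from different measurements and tag sets can share a single batch;
# the newline-joined string below is the body of one POST to /write.
batch = "\n".join([
    to_line_protocol("cpu", {"host": "a"}, {"usage": 0.64}, 1571858400000000000),
    to_line_protocol("mem", {"host": "b"}, {"free": 2048}, 1571858400000000000),
])
```

Sending the two points together means one HTTP request and, on the server side, one WAL append and one `fsync()` instead of two.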
-## Durability: Write Ahead Log (WAL) - - - - - +## Write Ahead Log (WAL) To ensure durability, we use a Write Ahead Log (WAL). + WAL is a data structure and algorithm that is super simple and powerful. It ensures that written data does not disappear when storage engine restarts. -When a client sends a /write request, the following occurs: +When a client sends a write request, the following occurs: 1. Write request is appended to the end of the WAL file. 2. fsync() the data to the file. 3. Update the in-memory database. + 4. Return success to caller. fsync() is a system call, so it has a kernel context switch which costs something. @@ -78,25 +70,16 @@ fsync() is expensive _in terms of time_ but guarantees your data is safe on disk **Important** to batch your points (send in ~2000 points at a time), to fsync() less frequently. -When the storage engine restarts, open WAL file and read it back into the in-memory database. +When the storage engine restarts, + +open WAL file and read it back into the in-memory database. Answer requests to the /read endpoint. - - - - - - - - + - - - - {{% note%}} @@ -109,34 +92,21 @@ Queries to the storage engine will merge data from the Cache with data from the Queries execute on a copy of the data that is made from the cache at query processing time. This way writes that come in while a query is running won’t affect the result. - - - - The Cache is an in-memory copy of all data points current stored in the WAL. The points are organized by the key, which is the measurement, [tag set](/influxdb/v1.7/concepts/glossary/#tag-set), and unique [field](/influxdb/v1.7/concepts/glossary/#field). Each field is kept as its own time-ordered range. The Cache data is not compressed while in memory. -Deletes sent to the Cache will clear out the given key or the specific time range for the given key. +It is queried at runtime and merged with the data stored in TSM files. 
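
The copy-and-merge behavior described above can be sketched as follows. The names and data layout are invented for illustration; this is not InfluxDB's internal structure.

```python
def query(series_key, cache, tsm_files):
    """Merge TSM data with a snapshot of the cache for one series key."""
    # Copy the cached points first, so writes that arrive while the
    # query runs cannot affect this query's result.
    snapshot = list(cache.get(series_key, []))
    merged = []
    for tsm in tsm_files:
        merged.extend(tsm.get(series_key, []))
    merged.extend(snapshot)
    return sorted(merged)  # (timestamp, value) pairs in time order

# Older points live in TSM files; the newest point is still only in the cache.
cache = {"cpu,host=a#usage": [(30, 0.7)]}
tsm_files = [{"cpu,host=a#usage": [(10, 0.5), (20, 0.6)]}]
result = query("cpu,host=a#usage", cache, tsm_files)
```

The query sees one continuous, time-ordered series even though the newest point has not yet been flushed to a TSM file.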
- - - - - - - - - +Deletes sent to the Cache will clear out the given key or the specific time range for the given key. The cache is recreated on restart by re-reading the WAL files on disk back into memory. -## Time-Structured Merge Tree +## Time-Structured Merge Tree (TSM) - Now let's handle more Data! Queries get slower as data grows Service terminates if data size exceeds memory... @@ -162,18 +132,20 @@ There’s a lot of logic and sophistication in the TSM compaction code. However, the high-level goal is quite simple: organize values for a series together into long runs to best optimize compression and scanning queries. -## TSI +## Time Series Index (TSI) + +TSI stores series keys grouped by measure, tag, field. To keep queries fast as we have more data, we use a **Time Series Index**. cardinality = quantity of series keys With high cardinality, we have to search through all series keys. So how to quickly find and match series keys? We use Time Series Index (TSI), which stores series keys grouped by measurement,tag,field -TSI answers question what measurements, tags, fields exist? - +TSI answers two questions well: +1) What measurements, tags, fields exist? +2) Given a measurement, tags, and fields, what series keys exist? - - + @@ -184,10 +156,9 @@ TSI answers question what measurements, tags, fields exist? - - - - + + + @@ -202,3 +173,12 @@ TSI answers question what measurements, tags, fields exist? 
+ + + + + + + + + From 239192638bbc1bd793d422f42f8d7d4cb495a488 Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Thu, 24 Oct 2019 13:20:20 -0700 Subject: [PATCH 05/18] Continued drafting storage engine doc --- content/v2.0/reference/storage-engine.md | 42 +++++++++--------------- 1 file changed, 16 insertions(+), 26 deletions(-) diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md index 2d03092b8..d4657a5e0 100644 --- a/content/v2.0/reference/storage-engine.md +++ b/content/v2.0/reference/storage-engine.md @@ -79,7 +79,7 @@ Answer requests to the /read endpoint. - + {{% note%}} @@ -93,40 +93,31 @@ Queries execute on a copy of the data that is made from the cache at query proce This way writes that come in while a query is running won’t affect the result. The Cache is an in-memory copy of all data points current stored in the WAL. -The points are organized by the key, which is the measurement, [tag set](/influxdb/v1.7/concepts/glossary/#tag-set), and unique [field](/influxdb/v1.7/concepts/glossary/#field). +The points are organized by the key, which is the measurement, tag set, and unique field. Each field is kept as its own time-ordered range. The Cache data is not compressed while in memory. - -It is queried at runtime and merged with the data stored in TSM files. - +The cache is recreated on restart by re-reading the WAL files on disk back into memory. Deletes sent to the Cache will clear out the given key or the specific time range for the given key. -The cache is recreated on restart by re-reading the WAL files on disk back into memory. + ## Time-Structured Merge Tree (TSM) - - -Now let's handle more Data! -Queries get slower as data grows -Service terminates if data size exceeds memory... 
- -**Time-Structured Merge Tree** (TSM) is our data format -We group field values grouped by series key, then order field values by time - - -series key = measurement, tag key+value, field key -point = series key, field value, timestamp - +In order to efficiently compact and store data, +We group field values grouped by series key, then order field values by time. +**Time-Structured Merge Tree** (TSM) is our data format. +TSM files store compressed series data in a columnar format. Within a series, we store only differences between values, which is more efficient. - Column-Oriented storage means we can read by series key and ignore what it doesn't need. -Compression helps with performance. + +Some terminology: + +- a *series key* is defined by measurement, tag key+value, and field key. +- a *point* is a series key, field value, and timestamp. After fields are stored safely in TSM files, WAL is truncated... -(This stuff is configurable) There’s a lot of logic and sophistication in the TSM compaction code. However, the high-level goal is quite simple: @@ -137,10 +128,9 @@ organize values for a series together into long runs to best optimize compressio TSI stores series keys grouped by measure, tag, field. To keep queries fast as we have more data, we use a **Time Series Index**. -cardinality = quantity of series keys -With high cardinality, we have to search through all series keys. -So how to quickly find and match series keys? -We use Time Series Index (TSI), which stores series keys grouped by measurement,tag,field +In data with high cardinality (a large quantity of series), it becomes slower to search through all series keys. + +We use Time Series Index (TSI), which stores series keys grouped by measurement, tag, and field. TSI answers two questions well: 1) What measurements, tags, fields exist? 2) Given a measurement, tags, and fields, what series keys exist? 
From 413a299bc040a7f715c0a42a61c41cdf0857339c Mon Sep 17 00:00:00 2001
From: pierwill <19642016+pierwill@users.noreply.github.com>
Date: Thu, 24 Oct 2019 15:05:36 -0700
Subject: [PATCH 06/18] Begin addressing PR feedback

---
 content/v2.0/reference/storage-engine.md | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md
index d4657a5e0..8cfc3d2b1 100644
--- a/content/v2.0/reference/storage-engine.md
+++ b/content/v2.0/reference/storage-engine.md
@@ -24,23 +24,12 @@ Major topics include:
 * [Time-Structured Merge Tree (TSM)](#time-structured-merge-tree-tsm)
 * [Time Series Index (TSI)](#time-series-index-tsi)
 
-{{% note %}}
-##### At a glance: changes to the InfluxDB storage engine from 1.x to 2.0
-- The InfluxDB 2.0 storage engine no longer partitions data into shards by time.
-- **Buckets** replace databases and retention policies.
-- Only TSI is used. There is no more in-memory index.
-- The configuration interface and options have changed. Configuration options are not currently exposed, but will be.
-
-Read about the [v1 storage engine](https://docs.influxdata.com/influxdb/v1.7/concepts/storage_engine).
-{{% /note %}}
-
 ## Writing data: from API to disk
 
 The storage engine handles data from the point an API request is received through writing it to the physical disk.
-Data is written to InfluxDB using [Line protocol](/) sent via HTTP POST request to the `/write` endpoint.
-Batches of points are sent to InfluxDB.
-Those batches are compressed and written to a WAL for immediate durability.
-The points are also written to an in-memory cache and become immediately queryable.
+Data is written to InfluxDB using [line protocol](/v2.0/reference/line-) sent via HTTP POST request to the `/write` endpoint.
+Batches of [points](/v2.0/reference/glossary/#point) are sent to InfluxDB, compressed, and written to a WAL for immediate durability.
+The points are also written to an in-memory cache and become immediately queryable.
The cache is periodically written to disk as TSM files. As TSM files accumulate, they are combined and compacted into higher level TSM files. From 2c16b89008b61062c5390a191231b12c12278047 Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Thu, 24 Oct 2019 15:59:24 -0700 Subject: [PATCH 07/18] Edit storage engine doc Addressing PR feedback --- content/v2.0/reference/storage-engine.md | 27 ++++++++++++------------ 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md index 8cfc3d2b1..8f3e5ea0f 100644 --- a/content/v2.0/reference/storage-engine.md +++ b/content/v2.0/reference/storage-engine.md @@ -35,16 +35,13 @@ As TSM files accumulate, they are combined and compacted into higher level TSM f Points can be sent individually; however, for efficiency, most applications send points in batches. A typical batch ranges in size from hundreds to thousands of points. -Points in a POST body can be from an arbitrary number of series. +Points in a POST body can be from an arbitrary number of series, measurements, and tag sets. Points in a batch do not have to be from the same measurement or tagset. ## Write Ahead Log (WAL) -To ensure durability, we use a Write Ahead Log (WAL). - - -WAL is a data structure and algorithm that is super simple and powerful. -It ensures that written data does not disappear when storage engine restarts. +The Write Ahead Log (WAL) ensures durability by retaining data when the storage engine restarts. +It ensures that written data does not disappear in an unexpected failure. When a client sends a write request, the following occurs: 1. Write request is appended to the end of the WAL file. @@ -53,16 +50,18 @@ When a client sends a write request, the following occurs: 4. Return success to caller. -fsync() is a system call, so it has a kernel context switch which costs something. 
fsync() takes the file and pushes pending writes all the way through any buffers and caches to disk. -fsync() is expensive _in terms of time_ but guarantees your data is safe on disk. -**Important** to batch your points (send in ~2000 points at a time), to fsync() less frequently. +As a system call, fsync() has a kernel context switch which is expensive _in terms of time_ but guarantees your data is safe on disk. + +{{% note%}} +To fsync() less frequently, batch your points (send in ~2000 points at a time). +{{% /note %}} When the storage engine restarts, open WAL file and read it back into the in-memory database. -Answer requests to the /read endpoint. +InfluxDB then snswer requests to the `/read` endpoint. @@ -82,9 +81,9 @@ Queries execute on a copy of the data that is made from the cache at query proce This way writes that come in while a query is running won’t affect the result. The Cache is an in-memory copy of all data points current stored in the WAL. -The points are organized by the key, which is the measurement, tag set, and unique field. -Each field is kept as its own time-ordered range. -The Cache data is not compressed while in memory. +Points are organized by the key, which is the measurement, tag set, and unique field. +Each field is stored in its own time-ordered range. +Data is not compressed in the cache. The cache is recreated on restart by re-reading the WAL files on disk back into memory. Deletes sent to the Cache will clear out the given key or the specific time range for the given key. @@ -114,7 +113,7 @@ organize values for a series together into long runs to best optimize compressio ## Time Series Index (TSI) -TSI stores series keys grouped by measure, tag, field. +TSI stores series keys grouped by measurement, tag, and field. To keep queries fast as we have more data, we use a **Time Series Index**. In data with high cardinality (a large quantity of series), it becomes slower to search through all series keys. 
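
The series-key lookup that TSI enables can be illustrated with a toy inverted index in Python. This is a sketch of the lookup idea only — the grouping of series keys by measurement and tag — not TSI's actual on-disk structure.

```python
from collections import defaultdict

class ToySeriesIndex:
    """Invented structure: maps measurements and tag pairs to series keys
    so matching series are found without scanning every key."""

    def __init__(self):
        self.by_measurement = defaultdict(set)
        self.by_tag = defaultdict(set)

    def add(self, measurement, tags, field):
        key = (measurement, tuple(sorted(tags.items())), field)
        self.by_measurement[measurement].add(key)
        for pair in tags.items():
            self.by_tag[pair].add(key)
        return key

    def series(self, measurement, tag=None):
        """Answer: given a measurement (and optionally a tag), what series keys exist?"""
        keys = self.by_measurement[measurement]
        if tag is not None:
            keys = keys & self.by_tag[tag]
        return keys

idx = ToySeriesIndex()
idx.add("cpu", {"host": "a"}, "usage")
idx.add("cpu", {"host": "b"}, "usage")
matches = idx.series("cpu", tag=("host", "a"))  # set intersection, not a scan
```

With high cardinality, the cost of a lookup in such an index depends on the size of the matching groups rather than on the total number of series keys.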
From cf1d2906fdfe3052602785b8a0a034484040c82e Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Fri, 25 Oct 2019 08:53:43 -0700 Subject: [PATCH 08/18] Continue editing storage enginge doc --- content/v2.0/reference/storage-engine.md | 41 +++++++++++------------- 1 file changed, 18 insertions(+), 23 deletions(-) diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md index 8f3e5ea0f..f1d215a69 100644 --- a/content/v2.0/reference/storage-engine.md +++ b/content/v2.0/reference/storage-engine.md @@ -27,10 +27,10 @@ Major topics include: ## Writing data: from API to disk The storage engine handles data from the point an API request is received through writing it to the physical disk. -Data is written to InfluxDB using [Line protocol](/) sent via HTTP POST request to the `/write` endpoint. +Data is written to InfluxDB using [line protocol](/v2.0/reference/line-) sent via HTTP POST request to the `/write` endpoint. Batches of [points](/v2.0/reference/glossary/#point) are sent to InfluxDB, compressed, and written to a WAL for immediate durability. The points are also written to an in-memory cache and become immediately queryable. -The cache is periodically written to disk as TSM files. +The cache is periodically written to disk in the form of [TSM](#time-structured-merge-tree-tsm) files. As TSM files accumulate, they are combined and compacted into higher level TSM files. Points can be sent individually; however, for efficiency, most applications send points in batches. @@ -45,22 +45,19 @@ It ensures that written data does not disappear in an unexpected failure. When a client sends a write request, the following occurs: 1. Write request is appended to the end of the WAL file. -2. fsync() the data to the file. +2. `fsync()` the data to the file. 3. Update the in-memory database. 4. Return success to caller. 
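
The four-step sequence above can be sketched in Python. This is a toy illustration of the append-then-fsync idea — `SimpleWAL` and its file layout are invented for this example, not InfluxDB's actual Go implementation.

```python
import os
import tempfile

class SimpleWAL:
    """Toy write-ahead log showing the append-then-fsync sequence."""

    def __init__(self, path):
        self.path = path
        self.cache = []  # stands in for the in-memory database

    def write(self, batch):
        # 1. Append the write request to the end of the WAL file.
        with open(self.path, "ab") as f:
            f.write(batch + b"\n")
            f.flush()
            # 2. fsync() pushes the data through OS buffers down to disk.
            os.fsync(f.fileno())
        # 3. Update the in-memory database.
        self.cache.append(batch)
        # 4. Return success to the caller.
        return True

wal = SimpleWAL(os.path.join(tempfile.mkdtemp(), "segment.wal"))
ok = wal.write(b"cpu,host=a usage=0.64 1571858400000000000")
```

Because `fsync()` runs once per write call, sending many points per call amortizes its cost — the same reason the docs recommend batching points.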
-fsync() takes the file and pushes pending writes all the way through any buffers and caches to disk. -As a system call, fsync() has a kernel context switch which is expensive _in terms of time_ but guarantees your data is safe on disk. +`fsync()` takes the file and pushes pending writes all the way through any buffers and caches to disk. +As a system call, `fsync()` has a kernel context switch which is expensive _in terms of time_ but guarantees your data is safe on disk. {{% note%}} -To fsync() less frequently, batch your points (send in ~2000 points at a time). +To `fsync()` less frequently, batch your points (send in ~2000 points at a time). {{% /note %}} - -When the storage engine restarts, - -open WAL file and read it back into the in-memory database. +When the storage engine restarts, the WAL file is read back into the in-memory database. InfluxDB then snswer requests to the `/read` endpoint. @@ -80,23 +77,26 @@ Queries to the storage engine will merge data from the Cache with data from the Queries execute on a copy of the data that is made from the cache at query processing time. This way writes that come in while a query is running won’t affect the result. -The Cache is an in-memory copy of all data points current stored in the WAL. +The cache is an in-memory copy of data points current stored in the WAL. Points are organized by the key, which is the measurement, tag set, and unique field. Each field is stored in its own time-ordered range. Data is not compressed in the cache. The cache is recreated on restart by re-reading the WAL files on disk back into memory. +The cache is queried at runtime and merged with the data stored in TSM files. Deletes sent to the Cache will clear out the given key or the specific time range for the given key. - - ## Time-Structured Merge Tree (TSM) -In order to efficiently compact and store data, -We group field values grouped by series key, then order field values by time. 
-**Time-Structured Merge Tree** (TSM) is our data format. +To efficiently compact and store data, +the storage engine groups field values by [series](/v2.0/reference/key-concepts/data-elements/#series) key, +and then orders those field values by time. + +The storage engine uses a **Time-Structured Merge Tree** (TSM) data format. TSM files store compressed series data in a columnar format. -Within a series, we store only differences between values, which is more efficient. +Within a series, we store only , which is more efficient. +To improve efficiency, the storage engine only stores differences between values in a series. Column-Oriented storage means we can read by series key and ignore what it doesn't need. +Storing data in columns lets the storage engine read by series key. Some terminology: @@ -113,25 +113,21 @@ organize values for a series together into long runs to best optimize compressio ## Time Series Index (TSI) -TSI stores series keys grouped by measurement, tag, and field. - To keep queries fast as we have more data, we use a **Time Series Index**. +TSI stores series keys grouped by measurement, tag, and field. In data with high cardinality (a large quantity of series), it becomes slower to search through all series keys. - We use Time Series Index (TSI), which stores series keys grouped by measurement, tag, and field. TSI answers two questions well: 1) What measurements, tags, fields exist? 2) Given a measurement, tags, and fields, what series keys exist? 
- - @@ -139,7 +135,6 @@ TSI answers two questions well: - From 024eb95096b287f1a500b24fb2dfe700984fba51 Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Mon, 28 Oct 2019 14:27:35 -0700 Subject: [PATCH 09/18] Work on storage engine docs --- content/v2.0/reference/storage-engine.md | 31 +++++++++++++++--------- 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/storage-engine.md index f1d215a69..ef259cebf 100644 --- a/content/v2.0/reference/storage-engine.md +++ b/content/v2.0/reference/storage-engine.md @@ -6,6 +6,7 @@ weight: 7 menu: v2_0_ref: name: Storage engine + parent: Internals v2.0/tags: [storage engine, internals, platform] --- @@ -34,13 +35,13 @@ The cache is periodically written to disk in the form of [TSM](#time-structured- As TSM files accumulate, they are combined and compacted into higher level TSM files. Points can be sent individually; however, for efficiency, most applications send points in batches. -A typical batch ranges in size from hundreds to thousands of points. + Points in a POST body can be from an arbitrary number of series, measurements, and tag sets. Points in a batch do not have to be from the same measurement or tagset. ## Write Ahead Log (WAL) -The Write Ahead Log (WAL) ensures durability by retaining data when the storage engine restarts. +The **Write Ahead Log** (WAL) ensures durability by retaining data when the storage engine restarts. It ensures that written data does not disappear in an unexpected failure. When a client sends a write request, the following occurs: @@ -54,7 +55,7 @@ When a client sends a write request, the following occurs: As a system call, `fsync()` has a kernel context switch which is expensive _in terms of time_ but guarantees your data is safe on disk. {{% note%}} -To `fsync()` less frequently, batch your points (send in ~2000 points at a time). 
+To `fsync()` less frequently, batch your points. {{% /note %}} When the storage engine restarts, the WAL file is read back into the in-memory database. @@ -73,16 +74,19 @@ Once you receive a response to a write request, your data is on disk! ## Cache -Queries to the storage engine will merge data from the Cache with data from the TSM files. -Queries execute on a copy of the data that is made from the cache at query processing time. -This way writes that come in while a query is running won’t affect the result. - -The cache is an in-memory copy of data points current stored in the WAL. +The **cache** is an in-memory copy of data points current stored in the WAL. Points are organized by the key, which is the measurement, tag set, and unique field. Each field is stored in its own time-ordered range. Data is not compressed in the cache. The cache is recreated on restart by re-reading the WAL files on disk back into memory. The cache is queried at runtime and merged with the data stored in TSM files. + + + +Queries to the storage engine will merge data from the cache with data from the TSM files. +Queries execute on a copy of the data that is made from the cache at query processing time. +This way writes that come in while a query is running won’t affect the result. + Deletes sent to the Cache will clear out the given key or the specific time range for the given key. ## Time-Structured Merge Tree (TSM) @@ -93,15 +97,14 @@ and then orders those field values by time. The storage engine uses a **Time-Structured Merge Tree** (TSM) data format. TSM files store compressed series data in a columnar format. -Within a series, we store only , which is more efficient. To improve efficiency, the storage engine only stores differences between values in a series. -Column-Oriented storage means we can read by series key and ignore what it doesn't need. +Column-oriented storage means we can read by series key and ignore what it doesn't need. 
Storing data in columns lets the storage engine read by series key. Some terminology: -- a *series key* is defined by measurement, tag key+value, and field key. +- a *series key* is defined by measurement, tag key and value, and field key. - a *point* is a series key, field value, and timestamp. After fields are stored safely in TSM files, WAL is truncated... @@ -113,10 +116,14 @@ organize values for a series together into long runs to best optimize compressio ## Time Series Index (TSI) +As data cardinality (number of series) grows, queries read more series keys and become slower. + +The **Time Series Index** ensures queries fast as data cardinality of data grows... To keep queries fast as we have more data, we use a **Time Series Index**. + TSI stores series keys grouped by measurement, tag, and field. In data with high cardinality (a large quantity of series), it becomes slower to search through all series keys. -We use Time Series Index (TSI), which stores series keys grouped by measurement, tag, and field. +The TSI stores series keys grouped by measurement, tag, and field. TSI answers two questions well: 1) What measurements, tags, fields exist? 2) Given a measurement, tags, and fields, what series keys exist? 
From 7e22fa8ac5de199e29949650671c928854c8e0aa Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Mon, 28 Oct 2019 14:28:13 -0700 Subject: [PATCH 10/18] Create "Internals" reference section Move storage engine docs Edit storage engine page tags --- content/v2.0/reference/{ => internals}/storage-engine.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename content/v2.0/reference/{ => internals}/storage-engine.md (99%) diff --git a/content/v2.0/reference/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md similarity index 99% rename from content/v2.0/reference/storage-engine.md rename to content/v2.0/reference/internals/storage-engine.md index ef259cebf..8bf604cff 100644 --- a/content/v2.0/reference/storage-engine.md +++ b/content/v2.0/reference/internals/storage-engine.md @@ -7,7 +7,7 @@ menu: v2_0_ref: name: Storage engine parent: Internals -v2.0/tags: [storage engine, internals, platform] +v2.0/tags: [storage, internals] --- The InfluxDB storage engine ensures the following three things: From 761220c90a22bb3168771a1f44e26e6b476f092e Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Wed, 30 Oct 2019 14:05:00 -0700 Subject: [PATCH 11/18] Add internals landing page --- content/v2.0/reference/internals/_index.md | 9 ++++++ .../reference/internals/storage-engine.md | 32 +++++++------------ 2 files changed, 20 insertions(+), 21 deletions(-) create mode 100644 content/v2.0/reference/internals/_index.md diff --git a/content/v2.0/reference/internals/_index.md b/content/v2.0/reference/internals/_index.md new file mode 100644 index 000000000..1f063cdf8 --- /dev/null +++ b/content/v2.0/reference/internals/_index.md @@ -0,0 +1,9 @@ +--- +title: InfluxDB Internals +menu: + v2_0_ref: + name: InfluxDB Internals + weight: 8 +--- + +{{< children >}} diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md index 
8bf604cff..744bda9d1 100644 --- a/content/v2.0/reference/internals/storage-engine.md +++ b/content/v2.0/reference/internals/storage-engine.md @@ -6,7 +6,7 @@ weight: 7 menu: v2_0_ref: name: Storage engine - parent: Internals + parent: InfluxDB Internals v2.0/tags: [storage, internals] --- @@ -35,7 +35,6 @@ The cache is periodically written to disk in the form of [TSM](#time-structured- As TSM files accumulate, they are combined and compacted into higher level TSM files. Points can be sent individually; however, for efficiency, most applications send points in batches. - Points in a POST body can be from an arbitrary number of series, measurements, and tag sets. Points in a batch do not have to be from the same measurement or tagset. @@ -47,8 +46,7 @@ When a client sends a write request, the following occurs: 1. Write request is appended to the end of the WAL file. 2. `fsync()` the data to the file. -3. Update the in-memory database. - +3. Update the in-memory cache. 4. Return success to caller. `fsync()` takes the file and pushes pending writes all the way through any buffers and caches to disk. @@ -61,7 +59,6 @@ To `fsync()` less frequently, batch your points. When the storage engine restarts, the WAL file is read back into the in-memory database. InfluxDB then snswer requests to the `/read` endpoint. - @@ -81,11 +78,14 @@ Data is not compressed in the cache. The cache is recreated on restart by re-reading the WAL files on disk back into memory. The cache is queried at runtime and merged with the data stored in TSM files. + + + Queries to the storage engine will merge data from the cache with data from the TSM files. Queries execute on a copy of the data that is made from the cache at query processing time. -This way writes that come in while a query is running won’t affect the result. +This way writes that come in while a query is running do not affect the result. Deletes sent to the Cache will clear out the given key or the specific time range for the given key. 
@@ -97,15 +97,15 @@ and then orders those field values by time. The storage engine uses a **Time-Structured Merge Tree** (TSM) data format. TSM files store compressed series data in a columnar format. -To improve efficiency, the storage engine only stores differences between values in a series. +To improve efficiency, the storage engine only stores differences (or *deltas*) between values in a series. Column-oriented storage means we can read by series key and ignore what it doesn't need. Storing data in columns lets the storage engine read by series key. -Some terminology: + -- a *series key* is defined by measurement, tag key and value, and field key. -- a *point* is a series key, field value, and timestamp. + + After fields are stored safely in TSM files, WAL is truncated... @@ -118,7 +118,7 @@ organize values for a series together into long runs to best optimize compressio As data cardinality (number of series) grows, queries read more series keys and become slower. -The **Time Series Index** ensures queries fast as data cardinality of data grows... +The **Time Series Index** ensures queries remain fast as data cardinality of data grows... To keep queries fast as we have more data, we use a **Time Series Index**. TSI stores series keys grouped by measurement, tag, and field. 
@@ -152,13 +152,3 @@ TSI answers two questions well: - - - - - - - - - - From fe88a9d6e424e57fca836060782db365e4d0e437 Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Tue, 12 Nov 2019 09:40:45 -0800 Subject: [PATCH 12/18] Continue work on storage engine doc --- .../reference/internals/storage-engine.md | 58 +++---------------- 1 file changed, 8 insertions(+), 50 deletions(-) diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md index 744bda9d1..25f188d90 100644 --- a/content/v2.0/reference/internals/storage-engine.md +++ b/content/v2.0/reference/internals/storage-engine.md @@ -30,6 +30,7 @@ Major topics include: The storage engine handles data from the point an API request is received through writing it to the physical disk. Data is written to InfluxDB using [line protocol](/v2.0/reference/line-) sent via HTTP POST request to the `/write` endpoint. Batches of [points](/v2.0/reference/glossary/#point) are sent to InfluxDB, compressed, and written to a WAL for immediate durability. +(A *point* is a series key, field value, and timestamp.) The points are also written to an in-memory cache and become immediately queryable. The cache is periodically written to disk in the form of [TSM](#time-structured-merge-tree-tsm) files. As TSM files accumulate, they are combined and compacted into higher level TSM files. @@ -50,21 +51,11 @@ When a client sends a write request, the following occurs: 4. Return success to caller. `fsync()` takes the file and pushes pending writes all the way through any buffers and caches to disk. -As a system call, `fsync()` has a kernel context switch which is expensive _in terms of time_ but guarantees your data is safe on disk. - -{{% note%}} -To `fsync()` less frequently, batch your points. -{{% /note %}} +As a system call, `fsync()` has a kernel context switch which is computationally expensive, but guarantees your data is safe on disk. 
When the storage engine restarts, the WAL file is read back into the in-memory database. InfluxDB then snswer requests to the `/read` endpoint. - - - - - - {{% note%}} Once you receive a response to a write request, your data is on disk! {{% /note %}} @@ -78,22 +69,20 @@ Data is not compressed in the cache. The cache is recreated on restart by re-reading the WAL files on disk back into memory. The cache is queried at runtime and merged with the data stored in TSM files. - - - - +When the storage engine restarts, WAL files are re-read into the in-memory cache. Queries to the storage engine will merge data from the cache with data from the TSM files. Queries execute on a copy of the data that is made from the cache at query processing time. This way writes that come in while a query is running do not affect the result. -Deletes sent to the Cache will clear out the given key or the specific time range for the given key. +Deletes sent to the cache will clear out the given key or the specific time range for the given key. ## Time-Structured Merge Tree (TSM) To efficiently compact and store data, -the storage engine groups field values by [series](/v2.0/reference/key-concepts/data-elements/#series) key, +the storage engine groups field values by series key, and then orders those field values by time. +(A *series key* is defined by measurement, tag key and value, and field key.) The storage engine uses a **Time-Structured Merge Tree** (TSM) data format. TSM files store compressed series data in a columnar format. @@ -101,13 +90,7 @@ To improve efficiency, the storage engine only stores differences (or *deltas*) Column-oriented storage means we can read by series key and ignore what it doesn't need. Storing data in columns lets the storage engine read by series key. - - - - - - -After fields are stored safely in TSM files, WAL is truncated... +After fields are stored safely in TSM files, the WAL is truncated and the cache is cleared. 
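The flush described above — fields are stored safely in TSM files before the WAL is truncated and the cache is cleared — can be sketched as follows. This is a toy model: real TSM files use a compressed, columnar binary format rather than JSON, and the `snapshot` name is invented for the example:

```python
import json
import os

def snapshot(cache, wal_path, tsm_path):
    """Sketch: persist the in-memory cache as a (toy) TSM file, then truncate the WAL."""
    # TSM data is time-ordered, so sort each series' (timestamp, value) pairs.
    ordered = {key: sorted(values) for key, values in cache.items()}
    with open(tsm_path, "w") as tsm:
        json.dump(ordered, tsm)
        tsm.flush()
        os.fsync(tsm.fileno())  # fields are now safely on disk
    # Only after the TSM file is durable: truncate the WAL and clear the cache.
    open(wal_path, "w").close()
    cache.clear()
```

Because the WAL is truncated only after the TSM write is durable, a crash at any point leaves the data recoverable from either the WAL or the new TSM file.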
There’s a lot of logic and sophistication in the TSM compaction code. @@ -116,7 +99,7 @@ organize values for a series together into long runs to best optimize compressio ## Time Series Index (TSI) -As data cardinality (number of series) grows, queries read more series keys and become slower. +As data cardinality (the number of series) grows, queries read more series keys and become slower. The **Time Series Index** ensures queries remain fast as data cardinality of data grows... To keep queries fast as we have more data, we use a **Time Series Index**. @@ -127,28 +110,3 @@ The TSI stores series keys grouped by measurement, tag, and field. TSI answers two questions well: 1) What measurements, tags, fields exist? 2) Given a measurement, tags, and fields, what series keys exist? - - - - - - - - - - - - - - - - - - - - - - - - - From be5238b5ce99dc8dab9abcd39127ee5035bd1d1d Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Mon, 9 Dec 2019 10:56:36 -0800 Subject: [PATCH 13/18] Work on storage engine doc --- .../reference/internals/storage-engine.md | 36 +++++++++---------- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md index 25f188d90..71a062ed1 100644 --- a/content/v2.0/reference/internals/storage-engine.md +++ b/content/v2.0/reference/internals/storage-engine.md @@ -10,28 +10,28 @@ menu: v2.0/tags: [storage, internals] --- -The InfluxDB storage engine ensures the following three things: +The InfluxDB storage engine ensures that - Data is safely written to disk - Queried data is returned complete and correct - Data is accurate (first) and performant (second) -This document details the internal workings of the storage engine. +This document outlines the internal workings of the storage engine. This information is presented both as a reference and to aid those looking to maximize performance. 
Major topics include: * [Write Ahead Log (WAL)](#write-ahead-log-wal) +* [Cache](#cache) * [Time-Structured Merge Tree (TSM)](#time-structured-merge-tree-tsm) * [Time Series Index (TSI)](#time-series-index-tsi) ## Writing data: from API to disk -The storage engine handles data from the point an API request is received through writing it to the physical disk. -Data is written to InfluxDB using [line protocol](/v2.0/reference/line-) sent via HTTP POST request to the `/write` endpoint. +The storage engine handles data from the point an API write request is received through writing data to the physical disk. +Data is written to InfluxDB using [line protocol](/v2.0/reference/line-protocol/) sent via HTTP POST request to the `/write` endpoint. Batches of [points](/v2.0/reference/glossary/#point) are sent to InfluxDB, compressed, and written to a WAL for immediate durability. -(A *point* is a series key, field value, and timestamp.) -The points are also written to an in-memory cache and become immediately queryable. +Points are also written to an in-memory cache and become immediately queryable. The cache is periodically written to disk in the form of [TSM](#time-structured-merge-tree-tsm) files. As TSM files accumulate, they are combined and compacted into higher level TSM files. @@ -43,27 +43,27 @@ Points in a batch do not have to be from the same measurement or tagset. The **Write Ahead Log** (WAL) ensures durability by retaining data when the storage engine restarts. It ensures that written data does not disappear in an unexpected failure. -When a client sends a write request, the following occurs: +When a client sends a write request, the following steps occur: -1. Write request is appended to the end of the WAL file. -2. `fsync()` the data to the file. +1. Append write request to the end of the WAL file. +2. Write data to disk using `fsync()`. 3. Update the in-memory cache. 4. Return success to caller.
-`fsync()` takes the file and pushes pending writes all the way through any buffers and caches to disk. -As a system call, `fsync()` has a kernel context switch which is computationally expensive, but guarantees your data is safe on disk. - -When the storage engine restarts, the WAL file is read back into the in-memory database. -InfluxDB then snswer requests to the `/read` endpoint. +`fsync()` takes the file and pushes pending writes all the way to the disk. +As a system call, `fsync()` has a kernel context switch which is computationally expensive, but guarantees that data is safe on disk. {{% note%}} Once you receive a response to a write request, your data is on disk! {{% /note %}} +When the storage engine restarts, the WAL file is read back into the in-memory database. +InfluxDB then answers requests to the `/read` endpoint. + ## Cache The **cache** is an in-memory copy of data points current stored in the WAL. -Points are organized by the key, which is the measurement, tag set, and unique field. +Points are organized by key, which is the measurement, tag set, and unique field. Each field is stored in its own time-ordered range. Data is not compressed in the cache. The cache is recreated on restart by re-reading the WAL files on disk back into memory. @@ -82,7 +82,7 @@ Deletes sent to the cache will clear out the given key or the specific time rang To efficiently compact and store data, the storage engine groups field values by series key, and then orders those field values by time. -(A *series key* is defined by measurement, tag key and value, and field key.) +(A [series key](/v2/) is defined by measurement, tag key and value, and field key.) The storage engine uses a **Time-Structured Merge Tree** (TSM) data format. TSM files store compressed series data in a columnar format. @@ -93,7 +93,7 @@ Storing data in columns lets the storage engine read by series key. After fields are stored safely in TSM files, the WAL is truncated and the cache is cleared. 
-There’s a lot of logic and sophistication in the TSM compaction code. +The TSM compaction code is quite complex. However, the high-level goal is quite simple: organize values for a series together into long runs to best optimize compression and scanning queries. @@ -101,7 +101,7 @@ organize values for a series together into long runs to best optimize compressio As data cardinality (the number of series) grows, queries read more series keys and become slower. -The **Time Series Index** ensures queries remain fast as data cardinality of data grows... +The **Time Series Index** ensures queries remain fast as data cardinality grows. To keep queries fast as we have more data, we use a **Time Series Index**. TSI stores series keys grouped by measurement, tag, and field. From 60f1bc7d9300d6f90f39e7f06d8f9cecf914063a Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Mon, 6 Jan 2020 10:08:29 -0800 Subject: [PATCH 14/18] Edit paragraphing in storage engine doc --- content/v2.0/reference/internals/storage-engine.md | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md index 71a062ed1..bca7a8a2b 100644 --- a/content/v2.0/reference/internals/storage-engine.md +++ b/content/v2.0/reference/internals/storage-engine.md @@ -62,26 +62,23 @@ InfluxDB then answers requests to the `/read` endpoint. ## Cache -The **cache** is an in-memory copy of data points current stored in the WAL. +The **cache** is an in-memory copy of data points currently stored in the WAL. Points are organized by key, which is the measurement, tag set, and unique field. Each field is stored in its own time-ordered range. Data is not compressed in the cache. The cache is recreated on restart by re-reading the WAL files on disk back into memory. The cache is queried at runtime and merged with the data stored in TSM files. 
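The runtime behavior described above — cache data merged with TSM data, with the query working on a copy so concurrent writes do not change results — might look like this sketch, where `tsm_files` is a hypothetical list of per-file mappings from series key to (timestamp, value) pairs:

```python
def query(series_key, cache, tsm_files):
    """Sketch: answer a query by merging the cache with persisted TSM data."""
    # Copy the cached values at query-processing time, so writes that arrive
    # while the query is running do not affect this result set.
    results = list(cache.get(series_key, []))
    # Merge in the values already persisted to TSM files.
    for tsm in tsm_files:
        results.extend(tsm.get(series_key, []))
    # Return the series time-ordered.
    return sorted(results)
```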
- When the storage engine restarts, WAL files are re-read into the in-memory cache. Queries to the storage engine will merge data from the cache with data from the TSM files. Queries execute on a copy of the data that is made from the cache at query processing time. This way writes that come in while a query is running do not affect the result. - Deletes sent to the cache will clear out the given key or the specific time range for the given key. ## Time-Structured Merge Tree (TSM) To efficiently compact and store data, -the storage engine groups field values by series key, -and then orders those field values by time. +the storage engine groups field values by series key, and then orders those field values by time. (A [series key](/v2/) is defined by measurement, tag key and value, and field key.) The storage engine uses a **Time-Structured Merge Tree** (TSM) data format. @@ -91,8 +88,6 @@ Column-oriented storage means we can read by series key and ignore what it doesn Storing data in columns lets the storage engine read by series key. After fields are stored safely in TSM files, the WAL is truncated and the cache is cleared. - - The TSM compaction code is quite complex. However, the high-level goal is quite simple: organize values for a series together into long runs to best optimize compression and scanning queries. @@ -100,7 +95,6 @@ organize values for a series together into long runs to best optimize compressio ## Time Series Index (TSI) As data cardinality (the number of series) grows, queries read more series keys and become slower. - The **Time Series Index** ensures queries remain fast as data cardinality grows. To keep queries fast as we have more data, we use a **Time Series Index**. 
From f19ecd70dca4755b4fddac2ad5433e49643b0b1f Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Mon, 6 Jan 2020 14:04:20 -0800 Subject: [PATCH 15/18] Storage engine doc: begin addressing PR feedback --- .../reference/internals/storage-engine.md | 36 +++++++++---------- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md index bca7a8a2b..e495c6d15 100644 --- a/content/v2.0/reference/internals/storage-engine.md +++ b/content/v2.0/reference/internals/storage-engine.md @@ -10,7 +10,7 @@ menu: v2.0/tags: [storage, internals] --- -The InfluxDB storage engine ensures that +The InfluxDB storage engine ensures that: - Data is safely written to disk - Queried data is returned complete and correct @@ -19,43 +19,42 @@ The InfluxDB storage engine ensures that This document outlines the internal workings of the storage engine. This information is presented both as a reference and to aid those looking to maximize performance. -Major topics include: +The storage engine includes the following components: * [Write Ahead Log (WAL)](#write-ahead-log-wal) * [Cache](#cache) * [Time-Structured Merge Tree (TSM)](#time-structured-merge-tree-tsm) * [Time Series Index (TSI)](#time-series-index-tsi) -## Writing data: from API to disk +## Writing data from API to disk The storage engine handles data from the point an API write request is received through writing data to the physical disk. Data is written to InfluxDB using [line protocol](/v2.0/reference/line-protocol/) sent via HTTP POST request to the `/write` endpoint. Batches of [points](/v2.0/reference/glossary/#point) are sent to InfluxDB, compressed, and written to a WAL for immediate durability. Points are also written to an in-memory cache and become immediately queryable. -The cache is periodically written to disk in the form of [TSM](#time-structured-merge-tree-tsm) files.
+The in-memory cache is periodically written to disk in the form of [TSM](#time-structured-merge-tree-tsm) files. As TSM files accumulate, they are combined and compacted into higher level TSM files. +{{% note %}} Points can be sent individually; however, for efficiency, most applications send points in batches. Points in a POST body can be from an arbitrary number of series, measurements, and tag sets. Points in a batch do not have to be from the same measurement or tagset. +{{% /note %}} ## Write Ahead Log (WAL) -The **Write Ahead Log** (WAL) ensures durability by retaining data when the storage engine restarts. -It ensures that written data does not disappear in an unexpected failure. -When a client sends a write request, the following steps occur: +The **Write Ahead Log** (WAL) retains InfluxDB data when the storage engine restarts. +The WAL ensures data is durable in case of an unexpected failure. -1. Append write request to the end of the WAL file. -2. Write data to disk using `fsync()`. +When the storage engine receives a write request, the following steps occur: + +1. The write request is appended to the end of the WAL file. +2. Data is written to disk using `fsync()`. +3. The in-memory cache is updated. +4. When data is successfully written to disk, a response confirms the write request was successful. `fsync()` takes the file and pushes pending writes all the way to the disk. -As a system call, `fsync()` has a kernel context switch which is computationally expensive, but guarantees that data is safe on disk. +As a system call, `fsync()` has a kernel context switch that's computationally expensive, but guarantees that data is safe on disk. -{{% note%}} -Once you receive a response to a write request, your data is on disk! -{{% /note %}} When the storage engine restarts, the WAL file is read back into the in-memory database. InfluxDB then answers requests to the `/read` endpoint.
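A batched write to the `/write` endpoint can be sketched by building a line protocol body, one point per line; points from different measurements and tag sets share a single POST body. The helper name `to_line_protocol` is invented for the example, and a real client would also compress the body before sending it over HTTP:

```python
def to_line_protocol(measurement, tags, fields, timestamp):
    """Render one point as line protocol: measurement,tags fields timestamp."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {timestamp}"

# One batch may mix measurements and tag sets; each point is one line.
batch = "\n".join([
    to_line_protocol("cpu", {"host": "server01"}, {"usage": 12.5}, 1577836800000000000),
    to_line_protocol("mem", {"host": "server01"}, {"free": 4096}, 1577836800000000001),
])
```

Sending the whole batch as one request means one append and one `fsync()` cover many points, which is why batching is more efficient than writing points individually.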
@@ -102,5 +101,6 @@ TSI stores series keys grouped by measurement, tag, and field. In data with high cardinality (a large quantity of series), it becomes slower to search through all series keys. The TSI stores series keys grouped by measurement, tag, and field. TSI answers two questions well: -1) What measurements, tags, fields exist? -2) Given a measurement, tags, and fields, what series keys exist? + +- What measurements, tags, fields exist? +- Given a measurement, tags, and fields, what series keys exist? From 46803fc3ee50ccf16b20d81bb247b2e1cbf8efd8 Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Tue, 7 Jan 2020 09:25:24 -0800 Subject: [PATCH 16/18] Storage engine doc: more review work --- content/v2.0/reference/internals/storage-engine.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md index e495c6d15..6e26fa11a 100644 --- a/content/v2.0/reference/internals/storage-engine.md +++ b/content/v2.0/reference/internals/storage-engine.md @@ -33,7 +33,7 @@ Data is written to InfluxDB using [line protocol](/v2.0/reference/line-protocol/ Batches of [points](/v2.0/reference/glossary/#point) are sent to InfluxDB, compressed, and written to a WAL for immediate durability. Points are also written to an in-memory cache and become immediately queryable. The in-memory cache is periodically written to disk in the form of [TSM](#time-structured-merge-tree-tsm) files. -As TSM files accumulate, they are combined and compacted into higher level TSM files. +As TSM files accumulate, the storage engine combines and compacts them into higher level TSM files.
@@ -69,7 +69,7 @@ The cache is recreated on restart by re-reading the WAL files on disk back into The cache is queried at runtime and merged with the data stored in TSM files. When the storage engine restarts, WAL files are re-read into the in-memory cache. -Queries to the storage engine will merge data from the cache with data from the TSM files. +Queries to the storage engine merge data from the cache with data from the TSM files. Queries execute on a copy of the data that is made from the cache at query processing time. This way writes that come in while a query is running do not affect the result. Deletes sent to the cache will clear out the given key or the specific time range for the given key. From 09e6cef6d3bd9532fa70a9baf48397d0dd0d2533 Mon Sep 17 00:00:00 2001 From: pierwill <19642016+pierwill@users.noreply.github.com> Date: Wed, 8 Jan 2020 10:41:29 -0800 Subject: [PATCH 17/18] Storage engine doc: continue PR review work --- .../reference/internals/storage-engine.md | 24 +++++++++---------- 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md index 6e26fa11a..047c7b1d2 100644 --- a/content/v2.0/reference/internals/storage-engine.md +++ b/content/v2.0/reference/internals/storage-engine.md @@ -36,7 +36,7 @@ The in-memory cache is periodically written to disk in the form of [TSM](#time-s As TSM files accumulate, the storage engine combines and compacts accumulated them into higher level TSM files. {{% note %}} -Points can be sent individually; however, for efficiency, most applications send points in batches. +While points can be sent individually, for efficiency, most applications send points in batches. Points in a POST body can be from an arbitrary number of series, measurements, and tag sets. Points in a batch do not have to be from the same measurement or tagset. 
{{% /note %}}
@@ -62,29 +62,29 @@ InfluxDB then answers requests to the `/read` endpoint.
## Cache

The **cache** is an in-memory copy of data points currently stored in the WAL.
-Points are organized by key, which is the measurement, tag set, and unique field.
-Each field is stored in its own time-ordered range.
-Data is not compressed in the cache.
-The cache is recreated on restart by re-reading the WAL files on disk back into memory.
-The cache is queried at runtime and merged with the data stored in TSM files.
-When the storage engine restarts, WAL files are re-read into the in-memory cache.
+The cache:
+
+- Organizes points by key (measurement, tag set, and unique field).
+  Each field is stored in its own time-ordered range.
+- Stores uncompressed data.
+- Is recreated from the WAL each time the storage engine restarts.
+  The cache is queried at runtime and merged with the data stored in TSM files.

Queries to the storage engine merge data from the cache with data from the TSM files.
Queries execute on a copy of the data that is made from the cache at query processing time.
This way writes that come in while a query is running do not affect the result.
-Deletes sent to the cache will clear out the given key or the specific time range for the given key.
+Deletes sent to the cache clear the specified key or the specified time range within that key.

## Time-Structured Merge Tree (TSM)

To efficiently compact and store data, the storage engine groups field values by series key, and then orders those field values by time.
-(A [series key](/v2/) is defined by measurement, tag key and value, and field key.)
+(A [series key](/v2.0/reference/glossary/#series-key) is defined by measurement, tag key and value, and field key.)
The storage engine uses a **Time-Structured Merge Tree** (TSM) data format.
TSM files store compressed series data in a columnar format.
To improve efficiency, the storage engine only stores differences (or *deltas*) between values in a series.
-Column-oriented storage means we can read by series key and ignore what it doesn't need.
-Storing data in columns lets the storage engine read by series key.
+Column-oriented storage lets the engine read by series key and omit extraneous data.
After fields are stored safely in TSM files, the WAL is truncated and the cache is cleared.

The TSM compaction code is quite complex.
@@ -98,7 +98,7 @@ The **Time Series Index** ensures queries remain fast as data cardinality grows.
To keep queries fast as we have more data, we use a **Time Series Index**.

TSI stores series keys grouped by measurement, tag, and field.
-In data with high cardinality (a large quantity of series), it becomes slower to search through all series keys.
+In data with high cardinality (a large quantity of series), queries become slower.
The TSI stores series keys grouped by measurement, tag, and field.
TSI answers two questions well:

From e5d273f71f67d59ff23289525f552f75b50b25e3 Mon Sep 17 00:00:00 2001
From: pierwill <19642016+pierwill@users.noreply.github.com>
Date: Wed, 8 Jan 2020 10:55:47 -0800
Subject: [PATCH 18/18] Storage engine: edit TSI section

---
 content/v2.0/reference/internals/storage-engine.md | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/content/v2.0/reference/internals/storage-engine.md b/content/v2.0/reference/internals/storage-engine.md
index 047c7b1d2..1aabe5995 100644
--- a/content/v2.0/reference/internals/storage-engine.md
+++ b/content/v2.0/reference/internals/storage-engine.md
@@ -95,12 +95,9 @@ organize values for a series together into long runs to best optimize compression
As data cardinality (the number of series) grows, queries read more series keys and become slower.
The **Time Series Index** ensures queries remain fast as data cardinality grows.
-To keep queries fast as we have more data, we use a **Time Series Index**.
-
-TSI stores series keys grouped by measurement, tag, and field.
-In data with high cardinality (a large quantity of series), queries become slower.
The TSI stores series keys grouped by measurement, tag, and field.
-TSI answers two questions well:
+This allows the database to answer two questions well:

- What measurements, tags, fields exist?
+  (This is what meta queries ask.)
- Given a measurement, tags, and fields, what series keys exist?
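The two questions in the final hunk can be illustrated with a toy inverted index over series keys. The class and method names below are invented for illustration; the real TSI is a compact on-disk format, not an in-memory dictionary:

```python
from collections import defaultdict

class ToySeriesIndex:
    """Toy index: series keys grouped by measurement and by tag pair."""

    def __init__(self):
        self.by_measurement = defaultdict(set)  # measurement -> series keys
        self.by_tag = defaultdict(set)          # (tag key, tag value) -> series keys

    def add(self, measurement, tags):
        # Build a series key from the measurement and sorted tag set.
        key = measurement + "," + ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
        self.by_measurement[measurement].add(key)
        for pair in tags.items():
            self.by_tag[pair].add(key)

    def measurements(self):
        # Question 1: what measurements exist?
        return sorted(self.by_measurement)

    def series_keys(self, measurement, tags):
        # Question 2: given a measurement and tags, what series keys exist?
        keys = set(self.by_measurement[measurement])
        for pair in tags.items():
            keys &= self.by_tag[pair]  # intersect the posting sets
        return sorted(keys)
```

Here `measurements()` stands in for a meta query, while `series_keys()` narrows a read to matching series without scanning every key — the property that keeps queries fast as cardinality grows.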