Merge branch 'master' into system-buckets

2020-12-09 11:13:43 -07:00 · 2020-12-09 11:13:43 -07:00 · 9d6957ed0c
parent 3f34884c11 ff0d860b1d
commit 9d6957ed0c
5 changed files with 123 additions and 67 deletions
--- a/content/influxdb/cloud/reference/flux/stdlib/profiler/_index.md
+++ b/content/influxdb/cloud/reference/flux/stdlib/profiler/_index.md
@@ -9,61 +9,9 @@ menu:
    name: Profiler
    parent: Flux standard library
 weight: 202
-influxdb/v2.0/tags: [functions, optimize, package]
+influxdb/cloud/tags: [functions, optimize, package]
 related:
  - /influxdb/cloud/query-data/optimize-queries/
 ---

-The Flux Profiler package provides performance profiling tools for Flux queries and operations.
-Import the `profiler` package:
-
-```js
-import "profiler"
-```
-
-## Options
-The Profiler package includes the following options:
-
-### enabledProfilers
-Enable Flux profilers.
-
-_**Data type:** Array of strings_
-
-```js
-import "profiler"
-
-option profiler.enabledProfilers = [""]
-```
-
-#### Available profilers
-
-##### query
-The `query` profiler provides statistics about the execution of an entire Flux script.
-When enabled, results returned by [`yield()`](/influxdb/cloud/reference/flux/stdlib/built-in/outputs/yield/)
-include a table with the following columns:
-
- **TotalDuration**: total query duration in nanoseconds.
- **CompileDuration**: number of nanoseconds spent compiling the query.
- **QueueDuration**: number of nanoseconds spent queueing.
- **RequeueDuration**: number fo nanoseconds spent requeueing.
- **PlanDuration**: number of nanoseconds spent planning the query.
- **ExecuteDuration**: number of nanoseconds spent executing the query.
- **Concurrency**: number of goroutines allocated to process the query.
- **MaxAllocated**: maximum number of bytes the query allocated.
- **TotalAllocated**: total number of bytes the query allocated (includes memory that was freed and then used again).
- **RuntimeErrors**: error messages returned during query execution.
- **flux/query-plan**: Flux query plan.
- **influxdb/scanned-values**: value scanned by InfluxDB.
- **influxdb/scanned-bytes**: number of bytes scanned by InfluxDB.
-
-#### Use the query profiler
-
-Use the query profiler to output statistics about query execution.
-
-```js
-import "profiler"
-
-option profiler.enabledProfilers = ["query"]
-
-// ... Query to profile
-```
+{{< duplicate-oss >}}
--- a/content/influxdb/cloud/reference/internals/_index.md
+++ b/content/influxdb/cloud/reference/internals/_index.md
@@ -0,0 +1,9 @@
+---
+title: InfluxDB Cloud internals
+menu:
+  influxdb_cloud_ref:
+    name: InfluxDB Cloud internals
+weight: 7
+---
+
+{{< children >}}
--- a/content/influxdb/cloud/reference/internals/durability.md
+++ b/content/influxdb/cloud/reference/internals/durability.md
@@ -0,0 +1,88 @@
+---
+title: InfluxDB Cloud data durability
+description: >
+  InfluxDB Cloud ensures the durability of all stored data by replicating data across
+  multiple availability zones in a cloud region, automatically creating backups,
+  and verifying that replicated data is consistent and backups are readable.
+weight: 101
+menu:
+  influxdb_cloud_ref:
+    name: Data durability
+    parent: InfluxDB Cloud internals
+influxdb/cloud/tags: [backups, internals]
+---
+
+InfluxDB Cloud replicates all data in the storage tier across two availability
+zones in a cloud region, automatically creates backups, and verifies that replicated
+data is consistent and that written data is readable.
+
+##### On this page
+
+- [Data replication](#data-replication)
+- [Backup processes](#backup-processes)
+- [Recovery](#recovery)
+- [Data verification](#data-verification)
+
+## Data replication
+InfluxDB Cloud replicates data in both the write tier and the storage tier.
+
+- **Write tier:** all data written to InfluxDB is processed by a durable message queue.
+  The message queue partitions each batch of points based on series keys and then
+  replicates each partition across other physical nodes in the message queue.
+- **Storage tier:** all data in the underlying storage tier is replicated across
+  two availability zones in a cloud region.
+
+## Backup processes
+InfluxDB Cloud backs up all data in the following ways:
+
+- [Backup on write](#backup-on-write)
+- [Backup after compaction](#backup-after-compaction)
+
+### Backup on write
+All inbound write requests to InfluxDB Cloud are added to a durable message queue.
+The message queue does the following:
+
+1. Caches the [line protocol](/influxdb/cloud/reference/glossary/#line-protocol)
+   of each write request.
+2. Writes data to the storage tier.
+3. Routinely persists cached line protocol to object storage as an out-of-band backup.
+
+Message queue backups provide raw line protocol that can be used to recover from
+catastrophic failure in the storage tier or an accidental deletion.
+The durability of the message queue is 96 hours, meaning InfluxDB Cloud can sustain
+a failure of its underlying storage tier or object storage services for up to 96 hours
+without any data loss.
+
+To reduce the risk of data loss from defects introduced in the InfluxDB Cloud service,
+we minimize the code that runs between the data ingest and backup processes.
+
+### Backup after compaction
+The InfluxDB storage engine compresses data over time in a process known as
+[compaction](/influxdb/cloud/reference/glossary/#compaction).
+When each compaction cycle completes, InfluxDB Cloud stores compressed
+[TSM](/influxdb/cloud/reference/glossary/#tsm-time-structured-merge-tree) files
+in object storage.
+
+## Recovery
+InfluxDB Cloud uses the following out-of-band backups stored in object storage to recover data:
+
+- **Message queue backup:** line protocol from inbound write requests within the last 96 hours
+- **Historic backup:** compressed TSM files
+
+The Recovery Point Objective (RPO) is any accepted write.
+The Recovery Time Objective (RTO) is harder to predict definitively because potential failure modes vary.
+While most common failure modes can be resolved within minutes or hours,
+critical failure modes may take longer.
+For example, if we need to rebuild all data from the message queue backup,
+it could take 24 hours or longer.
+
+## Data verification
+InfluxDB Cloud has two data verification services running at all times:
+
+- **Entropy detection:** ensures that replicated data is consistent
+- **Data verification:** verifies that data written to InfluxDB is readable
+
+## InfluxDB Cloud status
+InfluxDB Cloud regions and underlying services are monitored at all times.
+For information about the current status of InfluxDB Cloud, see the
+[InfluxDB Cloud status page](https://status.influxdata.com).
--- a/content/influxdb/v2.0/reference/flux/stdlib/profiler/_index.md
+++ b/content/influxdb/v2.0/reference/flux/stdlib/profiler/_index.md
@@ -32,12 +32,16 @@ _**Data type:** Array of strings_
 ```js
 import "profiler"

-option profiler.enabledProfilers = [""]
+option profiler.enabledProfilers = ["query", "operator"]
+
+// Query to profile
 ```

-#### Available profilers
+## Available profilers
+- [query](#query)
+- [operator](#operator)

-##### query
+### query
 The `query` profiler provides statistics about the execution of an entire Flux script.
 When enabled, results returned by [`yield()`](/influxdb/v2.0/reference/flux/stdlib/built-in/outputs/yield/)
 include a table with the following columns:
@@ -56,14 +60,17 @@ include a table with the following columns:
 - **influxdb/scanned-values**: value scanned by InfluxDB.
 - **influxdb/scanned-bytes**: number of bytes scanned by InfluxDB.

-#### Use the query profiler
+### operator
+The `operator` profiler outputs statistics about each operation in a query.
+[Operations executed in the storage tier](/influxdb/v2.0/query-data/optimize-queries/#start-queries-with-pushdown-functions)
+are returned as a single operation.
+When the `operator` profiler is enabled, results returned by [`yield()`](/influxdb/v2.0/reference/flux/stdlib/built-in/outputs/yield/)
+include a table with a row for each operation and the following columns:

-Use the query profiler to output statistics about query execution.
-
-```js
-import "profiler"
-
-option profiler.enabledProfilers = ["query"]
-
-// ... Query to profile
-```
+- **Type:** operation type
+- **Label:** operation name
+- **Count:** total number of times the operation executed
+- **MinDuration:** minimum duration of the operation in nanoseconds
+- **MaxDuration:** maximum duration of the operation in nanoseconds
+- **DurationSum:** total duration of all operation executions in nanoseconds
+- **MeanDuration:** average duration of all operation executions in nanoseconds
--- a/content/influxdb/v2.0/reference/glossary.md
+++ b/content/influxdb/v2.0/reference/glossary.md
@@ -188,6 +188,10 @@ Use comments with Flux statements to describe your functions.

 A standardized text file format used by the InfluxDB web server to create log entries when generating server log files.

+### compaction
+
+Compressing time series data to optimize disk usage.
+
 ### continuous query (CQ)

 Continuous queries are the predecessor to tasks in InfluxDB 2.0.