Merge branch 'master' into system-buckets

pull/1963/head
Scott Anderson 2020-12-09 11:13:43 -07:00
commit 9d6957ed0c
5 changed files with 123 additions and 67 deletions

@ -9,61 +9,9 @@ menu:
name: Profiler name: Profiler
parent: Flux standard library parent: Flux standard library
weight: 202 weight: 202
influxdb/v2.0/tags: [functions, optimize, package] influxdb/cloud/tags: [functions, optimize, package]
related: related:
- /influxdb/cloud/query-data/optimize-queries/ - /influxdb/cloud/query-data/optimize-queries/
--- ---
The Flux Profiler package provides performance profiling tools for Flux queries and operations. {{< duplicate-oss >}}
Import the `profiler` package:
```js
import "profiler"
```
## Options
The Profiler package includes the following options:
### enabledProfilers
Enable Flux profilers.
_**Data type:** Array of strings_
```js
import "profiler"
option profiler.enabledProfilers = [""]
```
#### Available profilers
##### query
The `query` profiler provides statistics about the execution of an entire Flux script.
When enabled, results returned by [`yield()`](/influxdb/cloud/reference/flux/stdlib/built-in/outputs/yield/)
include a table with the following columns:
- **TotalDuration**: total query duration in nanoseconds.
- **CompileDuration**: number of nanoseconds spent compiling the query.
- **QueueDuration**: number of nanoseconds spent queueing.
- **RequeueDuration**: number of nanoseconds spent requeueing.
- **PlanDuration**: number of nanoseconds spent planning the query.
- **ExecuteDuration**: number of nanoseconds spent executing the query.
- **Concurrency**: number of goroutines allocated to process the query.
- **MaxAllocated**: maximum number of bytes the query allocated.
- **TotalAllocated**: total number of bytes the query allocated (includes memory that was freed and then used again).
- **RuntimeErrors**: error messages returned during query execution.
- **flux/query-plan**: Flux query plan.
- **influxdb/scanned-values**: number of values scanned by InfluxDB.
- **influxdb/scanned-bytes**: number of bytes scanned by InfluxDB.
#### Use the query profiler
Use the query profiler to output statistics about query execution.
```js
import "profiler"
option profiler.enabledProfilers = ["query"]
// ... Query to profile
```

@ -0,0 +1,9 @@
---
title: InfluxDB Cloud internals
menu:
influxdb_cloud_ref:
name: InfluxDB Cloud internals
weight: 7
---
{{< children >}}

@ -0,0 +1,88 @@
---
title: InfluxDB Cloud data durability
description: >
InfluxDB Cloud ensures the durability of all stored data by replicating data across
multiple availability zones in a cloud region, automatically creating backups,
and verifying that replicated data is consistent and backups are readable.
weight: 101
menu:
influxdb_cloud_ref:
name: Data durability
parent: InfluxDB Cloud internals
influxdb/cloud/tags: [backups, internals]
---
InfluxDB Cloud replicates all data in the storage tier across two availability
zones in a cloud region, automatically creates backups, and verifies that replicated
data is consistent and that data is readable.
##### On this page
- [Data replication](#data-replication)
- [Backup processes](#backup-processes)
- [Recovery](#recovery)
- [Data verification](#data-verification)
## Data replication
InfluxDB Cloud replicates data in both the write tier and the storage tier.
- **Write tier:** all data written to InfluxDB is processed by a durable message queue.
The message queue partitions each batch of points based on series keys and then
replicates each partition across other physical nodes in the message queue.
- **Storage tier:** all data in the underlying storage tier is replicated across
two availability zones in a cloud region.
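The write-tier behavior described above can be sketched in a few lines of Python. This is only an illustration: the source does not say how InfluxDB Cloud derives partitions or places replicas, so the hash-modulo partitioning, the round-robin replica placement, the series key layout (measurement plus sorted tags), and the names `series_key`, `partition_batch`, and `replica_nodes` are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict


def series_key(point: dict) -> str:
    """Build a series key from the measurement and sorted tags (assumed layout)."""
    tags = ",".join(f"{k}={v}" for k, v in sorted(point["tags"].items()))
    return f'{point["measurement"]},{tags}' if tags else point["measurement"]


def partition_batch(points: list[dict], num_partitions: int) -> dict[int, list[dict]]:
    """Group points so every point of a given series lands in the same partition."""
    partitions: dict[int, list[dict]] = defaultdict(list)
    for point in points:
        # Hash-modulo partitioning is an assumption, not the documented scheme.
        digest = hashlib.sha256(series_key(point).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % num_partitions
        partitions[index].append(point)
    return dict(partitions)


def replica_nodes(partition: int, nodes: list[str], replicas: int) -> list[str]:
    """Pick distinct nodes for a partition (hypothetical round-robin placement)."""
    count = min(replicas, len(nodes))
    return [nodes[(partition + i) % len(nodes)] for i in range(count)]
```

Because the partition index depends only on the series key, two points that share a measurement and tag set always map to the same partition, regardless of the order in which their tags were written.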
## Backup processes
InfluxDB Cloud backs up all data in the following ways:
- [Backup on write](#backup-on-write)
- [Backup after compaction](#backup-after-compaction)
### Backup on write
All inbound write requests to InfluxDB Cloud are added to a durable message queue.
The message queue does the following:
1. Caches the [line protocol](/influxdb/cloud/reference/glossary/#line-protocol)
of each write request.
2. Writes data to the storage tier.
3. Routinely persists cached line protocol to object storage as an out-of-band backup.
Message queue backups provide raw line protocol that can be used to recover from
catastrophic failure in the storage tier or an accidental deletion.
The durability of the message queue is 96 hours, meaning InfluxDB Cloud can sustain
a failure of its underlying storage tier or object storage services for up to 96 hours
without any data loss.
To minimize potential data loss due to defects introduced in the InfluxDB Cloud service,
we minimize the code used between the data ingest and backup processes.
### Backup after compaction
The InfluxDB storage engine compresses data over time in a process known as
[compaction](/influxdb/cloud/reference/glossary/#compaction).
When each compaction cycle completes, InfluxDB Cloud stores compressed
[TSM](/influxdb/cloud/reference/glossary/#tsm-time-structured-merge-tree) files
in object storage.
## Recovery
InfluxDB Cloud uses the following out-of-band backups stored in object storage to recover data:
- **Message queue backup:** line protocol from inbound write requests within the last 96 hours
- **Historic backup:** compressed TSM files
The Recovery Point Objective (RPO) is any accepted write.
The Recovery Time Objective (RTO) is harder to predict definitively because potential failure modes vary.
While most common failure modes can be resolved within minutes or hours,
critical failure modes may take longer.
For example, if we need to rebuild all data from the message queue backup,
it could take 24 hours or longer.
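The 96-hour coverage of the message queue backup can be expressed as a simple time-window check. The sketch below is illustrative only: the function name `in_message_queue_window` and the constant `MESSAGE_QUEUE_DURABILITY` are hypothetical, and the code says nothing about how InfluxDB Cloud actually selects a backup during recovery.

```python
from datetime import datetime, timedelta, timezone

# Durability window of the message queue, as stated in this document.
MESSAGE_QUEUE_DURABILITY = timedelta(hours=96)


def in_message_queue_window(write_time: datetime, now: datetime) -> bool:
    """Return True if a write accepted at `write_time` is at most 96 hours old."""
    age = now - write_time
    # Writes timestamped in the future (negative age) are treated as out of window.
    return timedelta(0) <= age <= MESSAGE_QUEUE_DURABILITY


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(in_message_queue_window(now - timedelta(hours=12), now))
    print(in_message_queue_window(now - timedelta(days=7), now))
```

Writes older than the window would have to come from the historic backup of compressed TSM files.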
## Data verification
InfluxDB Cloud has two data verification services running at all times:
- **Entropy detection:** ensures that replicated data is consistent
- **Data verification:** verifies that data written to InfluxDB is readable
## InfluxDB Cloud status
InfluxDB Cloud regions and underlying services are monitored at all times.
For information about the current status of InfluxDB Cloud, see the
[InfluxDB Cloud status page](https://status.influxdata.com).

@ -32,12 +32,16 @@ _**Data type:** Array of strings_
```js ```js
import "profiler" import "profiler"
option profiler.enabledProfilers = [""] option profiler.enabledProfilers = ["query", "operator"]
// Query to profile
``` ```
#### Available profilers ## Available profilers
- [query](#query)
- [operator](#operator)
##### query ### query
The `query` profiler provides statistics about the execution of an entire Flux script. The `query` profiler provides statistics about the execution of an entire Flux script.
When enabled, results returned by [`yield()`](/influxdb/v2.0/reference/flux/stdlib/built-in/outputs/yield/) When enabled, results returned by [`yield()`](/influxdb/v2.0/reference/flux/stdlib/built-in/outputs/yield/)
include a table with the following columns: include a table with the following columns:
@ -56,14 +60,17 @@ include a table with the following columns:
- **influxdb/scanned-values**: number of values scanned by InfluxDB. - **influxdb/scanned-values**: number of values scanned by InfluxDB.
- **influxdb/scanned-bytes**: number of bytes scanned by InfluxDB. - **influxdb/scanned-bytes**: number of bytes scanned by InfluxDB.
#### Use the query profiler ### operator
The `operator` profiler outputs statistics about each operation in a query.
[Operations executed in the storage tier](/influxdb/v2.0/query-data/optimize-queries/#start-queries-with-pushdown-functions)
return as a single operation.
When the `operator` profiler is enabled, results returned by [`yield()`](/influxdb/v2.0/reference/flux/stdlib/built-in/outputs/yield/)
include a table with a row for each operation and the following columns:
Use the query profiler to output statistics about query execution. - **Type:** operation type
- **Label:** operation name
```js - **Count:** total number of times the operation executed
import "profiler" - **MinDuration:** minimum duration of the operation in nanoseconds
- **MaxDuration:** maximum duration of the operation in nanoseconds
option profiler.enabledProfilers = ["query"] - **DurationSum:** total duration of all operation executions in nanoseconds
- **MeanDuration:** average duration of all operation executions in nanoseconds
// ... Query to profile
```
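To make the relationship between the `operator` profiler columns concrete, the following Python sketch derives them from a list of per-execution durations. It is an illustration, not Flux's implementation: the function `summarize_operation` and its input format are hypothetical, and it assumes **MeanDuration** is **DurationSum** divided by **Count**, which follows from the column descriptions but is not spelled out in the source.

```python
from __future__ import annotations


def summarize_operation(durations_ns: list[int]) -> dict[str, float]:
    """Aggregate per-execution durations (nanoseconds) into profiler-style columns."""
    if not durations_ns:
        raise ValueError("at least one execution is required")
    total = sum(durations_ns)
    return {
        "Count": len(durations_ns),
        "MinDuration": min(durations_ns),
        "MaxDuration": max(durations_ns),
        "DurationSum": total,
        # Assumed relationship: mean = sum of durations / number of executions.
        "MeanDuration": total / len(durations_ns),
    }


if __name__ == "__main__":
    print(summarize_operation([100, 300, 200]))
```

A large gap between **MinDuration** and **MaxDuration** for an operation with a high **Count** suggests that a few executions dominate **DurationSum**.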

@ -188,6 +188,10 @@ Use comments with Flux statements to describe your functions.
A standardized text file format used by the InfluxDB web server to create log entries when generating server log files. A standardized text file format used by the InfluxDB web server to create log entries when generating server log files.
### compaction
Compressing time series data to optimize disk usage.
### continuous query (CQ) ### continuous query (CQ)
Continuous queries are the predecessor to tasks in InfluxDB 2.0. Continuous queries are the predecessor to tasks in InfluxDB 2.0.