finalized downsampling guide, added page descriptions to task docs

2019-01-18 17:29:28 -07:00 · 2019-01-18 17:29:28 -07:00 · e901980d5d
parent 6e13d5292f
commit e901980d5d
6 changed files with 82 additions and 28 deletions
--- a/content/v2.0/process-data/_index.md
+++ b/content/v2.0/process-data/_index.md
@@ -1,7 +1,9 @@
 ---
 title: Process Data with InfluxDB tasks
 seotitle: Process Data with InfluxDB tasks
-description: placeholder
+description: >
+  InfluxDB's task engine runs scheduled Flux tasks that process and analyze data.
+  This collection of articles provides information about creating and managing InfluxDB tasks.
 menu:
  v2_0:
    name: Process data
@@ -21,4 +23,7 @@ The following articles explain how to configure and build tasks using the Influx
 and via raw Flux scripts with the `influx` command line interface (CLI).
 They also provide examples of commonly used tasks.

-_Links for nested docs._
+[Write a task](/v2.0/process-data/write-a-task)  
+[Manage Tasks](/v2.0/process-data/manage-tasks)  
+[Common Tasks](/v2.0/process-data/common-tasks)  
+[Task Options](/v2.0/process-data/task-options)
--- a/content/v2.0/process-data/common-tasks/_index.md
+++ b/content/v2.0/process-data/common-tasks/_index.md
@@ -1,7 +1,9 @@
 ---
 title: Common data processing tasks
 seotitle: Common data processing tasks performed with InfluxDB
-description: placeholder
+description: >
+  InfluxDB Tasks process data on specified schedules.
+  This collection of articles walks through common use cases for InfluxDB tasks.
 menu:
  v2_0:
    name: Common tasks
@@ -11,10 +13,10 @@ menu:

 The following articles walk through common task use cases.

+[Downsample Data with InfluxDB](/v2.0/process-data/common-tasks/downsample-data)
+
 {{% note %}}
 This list will continue to grow.
 If you have suggestions, please [create an issue](https://github.com/influxdata/docs-v2/issues/new)
 on the InfluxData documentation repository on GitHub.
 {{% /note %}}
-
-[Downsample Data with InfluxDB](/v2.0/process-data/common-tasks/downsample-data)
--- a/content/v2.0/process-data/common-tasks/downsample-data.md
+++ b/content/v2.0/process-data/common-tasks/downsample-data.md
@@ -1,7 +1,9 @@
 ---
 title: Downsample data with InfluxDB
 seotitle: Downsample data in an InfluxDB task
-description: placeholder
+description: >
+  How to create a task that downsamples data much like continuous queries
+  in previous versions of InfluxDB.
 menu:
  v2_0:
    name: Downsample data
@@ -9,35 +11,74 @@ menu:
    weight: 4
 ---

- Talk about Continuous Queries
+One of the most common use cases for InfluxDB tasks is downsampling data to reduce
+the overall disk usage as data collects over time.
+In previous versions of InfluxDB, continuous queries filled this role.

-**Requirements:**
+This article walks through creating a continuous-query-like task that downsamples
+data by aggregating data within windows of time, then storing the aggregate value in a new bucket.

- A "source" bucket
- A "destination" bucket
- Some type of aggregation
- and a `to` statement
+### Requirements
+To perform a downsampling task, you need the following:

- You can't write data into the same bucket you're reading from
- A two buckets
- `to()` requires a bucket AND org
+##### A "source" bucket
+The bucket from which data is queried.

+##### A "destination" bucket
+A separate bucket where aggregated, downsampled data is stored.
+
+##### Some type of aggregation
+To downsample data, it must be aggregated in some way.
+The method of aggregation you use depends on your specific use case,
+but examples include mean, median, top, bottom, etc.
+View [Flux's aggregate functions](#) for more information and ideas.
+
+## Create a destination bucket
+By design, tasks cannot write to the same bucket from which they are reading.
+You need another bucket where the task can store the aggregated, downsampled data.
+
+_For information about creating buckets, see [Create a bucket](#)._
+
+## Example downsampling task script
+The example task script below is a very basic form of data downsampling that does the following:
+
+1. Defines a task named "cq-mem-data-1w" that runs once a week.
+2. Defines a `data` variable that represents all data from the last 2 weeks in the
+   `mem` measurement of the `system-data` bucket.
+3. Uses the [`aggregateWindow()` function](#) to window the data into 1-hour intervals
+   and calculate the average of each interval.
+4. Stores the aggregated data in the `system-data-downsampled` bucket under the
+   `my-org` organization.

 ```js
+// Task Options
 option task = {
-  name: "cqinterval15m",
+  name: "cq-mem-data-1w",
  every: 1w,
 }

-data = from(bucket: "telegraf")
+// Defines a data source
+data = from(bucket: "system-data")
  |> range(start: -task.every * 2)
-  |> filter(fn: (r) => r._measurement == "cpu")
+  |> filter(fn: (r) => r._measurement == "mem")

-downsampleHourly = (tables=<-) =>
-  tables
-    |> aggregateWindow(fn: mean, every: 1h)
-    |> set(key: "_measurement", value: "cpu_1h" )
-    |> to(bucket: "telegraf_downsampled", org: "my-org")
-
-downsampleHourly(data)
+data
+  // Windows and aggregates the data into 1h averages
+  |> aggregateWindow(fn: mean, every: 1h)
+  // Stores the aggregated data in a new bucket
+  |> to(bucket: "system-data-downsampled", org: "my-org")
 ```
+
+Again, this is a very basic example, but it should provide you with a foundation
+to build more complex downsampling tasks.
+
+## Add your task
+Once your task is ready, see [Create a task](#) for information about adding it to InfluxDB.
+
+## Things to consider
+- If there is a chance that data may arrive late, specify an `offset` in your
+  task options long enough to account for late data.
+- If running a task against a bucket with a finite retention policy, do not schedule
+  tasks to run too close to the end of the retention policy.
+  Always provide a "cushion" for downsampling tasks to complete before the data
+  is dropped by the retention policy.
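
The `offset` consideration above can be sketched as a variation of the example task. This is a minimal illustration and not part of the commit: the task name and the 10-minute offset are assumed values chosen for the example, and the explicit `-2w` range stands in for twice the task interval.

```js
// Task options
// The task name and the 10m offset are hypothetical, illustrative values.
// The offset delays each run so late-arriving data can land before it is queried.
option task = {
  name: "cq-mem-data-1w-offset",
  every: 1w,
  offset: 10m,
}

from(bucket: "system-data")
  // Queries the last two weeks, twice the task interval
  |> range(start: -2w)
  |> filter(fn: (r) => r._measurement == "mem")
  // Windows and aggregates the data into 1h averages
  |> aggregateWindow(fn: mean, every: 1h)
  // Stores the aggregated data in the destination bucket
  |> to(bucket: "system-data-downsampled", org: "my-org")
```

How long the offset should be depends on how late data typically arrives in the source bucket.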
--- a/content/v2.0/process-data/manage-tasks/_index.md
+++ b/content/v2.0/process-data/manage-tasks/_index.md
@@ -1,7 +1,9 @@
 ---
 title: Manage tasks in InfluxDB
 seotitle: Manage data processing tasks in InfluxDB
-description: placeholder
+description: >
+  InfluxDB provides options for managing the creation, reading, updating, and deletion
+  of tasks using both the 'influx' CLI and the InfluxDB UI.
 menu:
  v2_0:
    name: Manage tasks
--- a/content/v2.0/process-data/task-options.md
+++ b/content/v2.0/process-data/task-options.md
@@ -1,7 +1,9 @@
 ---
 title: Task configuration options
 seotitle: InfluxDB task configuration options
-description: placeholder
+description: >
+  Task options define specific information about a task such as its name,
+  the schedule on which it runs, execution delays, and others.
 menu:
  v2_0:
    name: Task options
--- a/content/v2.0/process-data/write-a-task.md
+++ b/content/v2.0/process-data/write-a-task.md
@@ -1,7 +1,9 @@
 ---
 title: Write an InfluxDB task
 seotitle: Write an InfluxDB task that processes data
-description: placeholder
+description: >
+  How to write an InfluxDB task that processes data in some way, then performs an action
+  such as storing the modified data in a new bucket or sending an alert.
 menu:
  v2_0:
    name: Write a task