Merge pull request #20 from influxdata/process-data

Process data
pull/24/head
Scott Anderson 2019-01-22 15:49:17 -07:00 committed by GitHub
commit 3355ba9b9e
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
12 changed files with 645 additions and 40 deletions

View File

@ -1,37 +0,0 @@
---
title: Using tasks
description: This is just an example post to show the format of new 2.0 posts
menu:
v2_0:
name: Using tasks
weight: 1
---
A task is a scheduled Flux query. Main use case is replacement for continuous queries, add info about CQs.
**To filter the list of tasks**:
1. Enable the **Show Inactive** option to include inactive tasks on the list.
2. Enter text in the **Filter tasks by name** field to search for tasks by name.
3. Select an organization from the **All Organizations** dropdown to filter the list by organization.
4. Click on the heading of any column to sort by that field.
**To import a task**:
1. Click the Tasks (calendar) icon in the left navigation menu.
2. Click **Import** in the upper right.
3. Drag and drop or select a file to upload.
4. !!!
**To create a task**:
1. Click **+ Create Task**.
2. In the left sidebar panel, enter the following details:
* **Name**: The name of your task.
* **Owner**: Select an organization from the drop-down menu.
* **Schedule Task**: Select **Interval** for !!!! or **Cron** to !!!. Also enter value below (interval window or Cron thing).
* **Offset**: Enter an offset time. If you schedule it to run at the hour but you have an offset of ten minutes, then it runs at an hour and ten minutes.
3. In the right panel, enter your task script.
4. Click **Save**.
**Disable tasks**

View File

@ -0,0 +1,29 @@
---
title: Process Data with InfluxDB tasks
seotitle: Process Data with InfluxDB tasks
description: >
InfluxDB's task engine runs scheduled Flux tasks that process and analyze data.
This collection of articles provides information about creating and managing InfluxDB tasks.
menu:
v2_0:
name: Process data
weight: 3
---
InfluxDB's _**task engine**_ is designed for processing and analyzing data.
A task is a scheduled Flux query that takes a stream of input data, modifies or
analyzes it in some way, then performs an action.
Examples include data downsampling, anomaly detection _(Coming)_, alerting _(Coming)_, etc.
{{% note %}}
Tasks are a replacement for InfluxDB v1.x's continuous queries.
{{% /note %}}
The following articles explain how to configure and build tasks using the InfluxDB user interface (UI)
and via raw Flux scripts with the `influx` command line interface (CLI).
They also provide examples of commonly used tasks.
[Write a task](/v2.0/process-data/write-a-task)
[Manage Tasks](/v2.0/process-data/manage-tasks)
[Common Tasks](/v2.0/process-data/common-tasks)
[Task Options](/v2.0/process-data/task-options)

View File

@ -0,0 +1,22 @@
---
title: Common data processing tasks
seotitle: Common data processing tasks performed with InfluxDB
description: >
InfluxDB Tasks process data on specified schedules.
This collection of articles walks through common use cases for InfluxDB tasks.
menu:
v2_0:
name: Common tasks
parent: Process data
weight: 4
---
The following articles walk through common task use cases.
[Downsample Data with InfluxDB](/v2.0/process-data/common-tasks/downsample-data)
{{% note %}}
This list will continue to grow.
If you have suggestions, please [create an issue](https://github.com/influxdata/docs-v2/issues/new)
on the InfluxData documentation repository on GitHub.
{{% /note %}}

View File

@ -0,0 +1,85 @@
---
title: Downsample data with InfluxDB
seotitle: Downsample data in an InfluxDB task
description: >
How to create a task that downsamples data much like continuous queries
in previous versions of InfluxDB.
menu:
v2_0:
name: Downsample data
parent: Common tasks
weight: 4
---
One of the most common use cases for InfluxDB tasks is downsampling data to reduce
overall disk usage as data accumulates over time.
In previous versions of InfluxDB, continuous queries filled this role.
This article walks through creating a continuous-query-like task that downsamples
data by aggregating data within windows of time, then storing the aggregate value in a new bucket.
### Requirements
To perform a downsampling task, you need the following:
##### A "source" bucket
The bucket from which data is queried.
##### A "destination" bucket
A separate bucket where aggregated, downsampled data is stored.
##### Some type of aggregation
To downsample data, it must be aggregated in some way.
The aggregation method you use depends on your use case,
but common examples include mean, median, top, and bottom.
View [Flux's aggregate functions](/v2.0/reference/flux/functions/transformations/aggregates/)
for more information and ideas.
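The aggregate method is typically just the `fn` parameter of `aggregateWindow()`. The sketch below swaps in `median`; the bucket and measurement names are placeholders:

```js
// Placeholder bucket and measurement names
from(bucket: "example-bucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "mem")
  // Swap fn for mean, top, bottom, etc. as needed
  |> aggregateWindow(every: 10m, fn: median)
```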
## Create a destination bucket
By design, tasks cannot write to the same bucket from which they are reading.
You need another bucket where the task can store the aggregated, downsampled data.
_For information about creating buckets, see [Create a bucket](#)._
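As a sketch, a destination bucket can also be created with the `influx` CLI; the bucket name, organization, and the `influx bucket create` flags shown here are assumptions based on the CLI's usual conventions and may differ in your version:

```sh
# Hypothetical example; flag names may vary by influx CLI version
influx bucket create --name system-data-downsampled --org my-org
```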
## Example downsampling task script
The example task script below is a very basic form of data downsampling that does the following:
1. Defines a task named "cq-mem-data-1w" that runs once a week.
2. Defines a `data` variable that represents all data from the last 2 weeks in the
`mem` measurement of the `system-data` bucket.
3. Uses the [`aggregateWindow()` function](/v2.0/reference/flux/functions/transformations/aggregates/aggregatewindow/)
to window the data into 1 hour intervals and calculate the average of each interval.
4. Stores the aggregated data in the `system-data-downsampled` bucket under the
`my-org` organization.
```js
// Task Options
option task = {
  name: "cq-mem-data-1w",
  every: 1w,
}

// Defines a data source
data = from(bucket: "system-data")
  |> range(start: -task.every * 2)
  |> filter(fn: (r) => r._measurement == "mem")

data
  // Windows and aggregates the data into 1h averages
  |> aggregateWindow(fn: mean, every: 1h)
  // Stores the aggregated data in a new bucket
  |> to(bucket: "system-data-downsampled", org: "my-org")
```
Again, this is a very basic example, but it should provide you with a foundation
to build more complex downsampling tasks.
## Add your task
Once your task is ready, see [Create a task](/v2.0/process-data/manage-tasks/create-task) for information about adding it to InfluxDB.
## Things to consider
- If there is a chance that data may arrive late, specify an `offset` in your
  task options long enough to account for late data.
- If running a task against a bucket with a finite retention policy, do not schedule
tasks to run too closely to the end of the retention policy.
Always provide a "cushion" for downsampling tasks to complete before the data
is dropped by the retention policy.
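These considerations map directly onto the task options. The sketch below pads the weekly task above with a hypothetical `30m` offset to catch late-arriving data:

```js
option task = {
  name: "cq-mem-data-1w",
  every: 1w,
  // Hypothetical cushion for late-arriving data
  offset: 30m,
}
```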

View File

@ -0,0 +1,21 @@
---
title: Manage tasks in InfluxDB
seotitle: Manage data processing tasks in InfluxDB
description: >
InfluxDB provides options for managing the creation, reading, updating, and deletion
of tasks using both the 'influx' CLI and the InfluxDB UI.
menu:
v2_0:
name: Manage tasks
parent: Process data
weight: 2
---
InfluxDB provides two options for managing the creation, reading, updating, and deletion (CRUD) of tasks:
the InfluxDB user interface (UI) and the `influx` command line interface (CLI).
Both tools can perform all task CRUD operations.
[Create a task](/v2.0/process-data/manage-tasks/create-task)
[View tasks](/v2.0/process-data/manage-tasks/view-tasks)
[Update a task](/v2.0/process-data/manage-tasks/update-task)
[Delete a task](/v2.0/process-data/manage-tasks/delete-task)

View File

@ -0,0 +1,85 @@
---
title: Create a task
seotitle: Create a task for processing data in InfluxDB
description: >
How to create a task that processes data in InfluxDB using the InfluxDB user
interface or the 'influx' command line interface.
menu:
v2_0:
name: Create a task
parent: Manage tasks
weight: 1
---
InfluxDB provides multiple ways to create tasks using both the InfluxDB user interface (UI)
and the `influx` command line interface (CLI).
_This article assumes you have already [written a task](/v2.0/process-data/write-a-task)._
## Create a task in the InfluxDB UI
The InfluxDB UI provides multiple ways to create a task:
- [Create a task from the Data Explorer](#create-a-task-from-the-data-explorer)
- [Create a task in the Task UI](#create-a-task-in-the-task-ui)
- [Import a task](#import-a-task)
### Create a task from the Data Explorer
1. Click on the **Data Explorer** icon in the left navigation menu.
{{< img-hd src="/img/data-explorer-icon.png" alt="Data Explorer Icon" />}}
2. Build a query and click **Save As** in the upper right.
3. Select the **Task** option.
4. Specify the task options. See [Task options](/v2.0/process-data/task-options)
for detailed information about each option.
5. Click **Save as Task**.
{{< img-hd src="/img/data-explorer-save-as-task.png" alt="Add a task from the Data Explorer"/>}}
### Create a task in the Task UI
1. Click on the **Tasks** icon in the left navigation menu.
{{< img-hd src="/img/tasks-icon.png" alt="Tasks Icon" />}}
2. Click **+ Create Task** in the upper right.
3. In the left panel, specify the task options.
See [Task options](/v2.0/process-data/task-options) for detailed information about each option.
4. In the right panel, enter your task script.
5. Click **Save** in the upper right.
{{< img-hd src="/img/tasks-create-edit.png" alt="Create a task" />}}
### Import a task
1. Click on the **Tasks** icon in the left navigation menu.
2. Click **Import** in the upper right.
3. Drag and drop or select a file to upload.
4. Click **Upload Task**.
{{< img-hd src="/img/tasks-import-task.png" alt="Import a task" />}}
## Create a task using the influx CLI
Use the `influx task create` command to create a new task.
It accepts either a file path or raw Flux.
###### Create a task using a file
```sh
# Pattern
influx task create --org <org-name> @</path/to/task-script>
# Example
influx task create --org my-org @/tasks/cq-mean-1h.flux
```
###### Create a task using raw Flux
```sh
influx task create --org my-org - # <return> to open stdin pipe
option task = {
  name: "task-name",
  every: 6h
}
# ... Task script ...
# <ctrl-d> to close the pipe and submit the command
```

View File

@ -0,0 +1,37 @@
---
title: Delete a task
seotitle: Delete a task for processing data in InfluxDB
description: >
How to delete a task in InfluxDB using the InfluxDB user interface or using
the 'influx' command line interface.
menu:
v2_0:
name: Delete a task
parent: Manage tasks
weight: 4
---
## Delete a task in the InfluxDB UI
1. Click the **Tasks** icon in the left navigation menu.
{{< img-hd src="/img/tasks-icon.png" alt="Tasks Icon" />}}
2. In the list of tasks, hover over the task you would like to delete.
3. Click **Delete** on the far right.
4. Click **Confirm**.
{{< img-hd src="/img/tasks-delete-task.png" alt="Delete a task" />}}
## Delete a task with the influx CLI
Use the `influx task delete` command to delete a task.
_This command requires a task ID, which is available in the output of `influx task find`._
```sh
# Pattern
influx task delete -i <task-id>
# Example
influx task delete -i 0343698431c35000
```

View File

@ -0,0 +1,63 @@
---
title: Update a task
seotitle: Update a task for processing data in InfluxDB
description: >
How to update a task that processes data in InfluxDB using the InfluxDB user
interface or the 'influx' command line interface.
menu:
v2_0:
name: Update a task
parent: Manage tasks
weight: 3
---
## Update a task in the InfluxDB UI
To view your tasks, click the **Tasks** icon in the left navigation menu.
{{< img-hd src="/img/tasks-icon.png" alt="Tasks Icon" />}}
#### Update a task's Flux script
1. In the list of tasks, click the **Name** of the task you would like to update.
2. In the left panel, modify the task options.
3. In the right panel, modify the task script.
4. Click **Save** in the upper right.
{{< img-hd src="/img/tasks-create-edit.png" alt="Update a task" />}}
#### Update the status of a task
In the list of tasks, click the toggle in the **Active** column of the task you
would like to activate or inactivate.
## Update a task with the influx CLI
Use the `influx task update` command to update or change the status of an existing task.
_This command requires a task ID, which is available in the output of `influx task find`._
#### Update a task's Flux script
Pass the file path of your updated Flux script to the `influx task update` command
with the ID of the task you would like to update.
Modified [task options](/v2.0/process-data/task-options) defined in the Flux
script are also updated.
```sh
# Pattern
influx task update -i <task-id> @/path/to/updated-task-script
# Example
influx task update -i 0343698431c35000 @/tasks/cq-mean-1h.flux
```
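The updated script itself is ordinary Flux; for example, changing the `every` option in the script changes the task's schedule when the update is applied. The values below are illustrative:

```js
// updated-task-script (illustrative values)
option task = {
  name: "task-name",
  every: 30m,
}

// ... rest of the task script ...
```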
#### Update the status of a task
Pass the ID of the task you would like to update to the `influx task update`
command with the `--status` flag.
_Possible arguments of the `--status` flag are `active` or `inactive`._
```sh
# Pattern
influx task update -i <task-id> --status < active | inactive >
# Example
influx task update -i 0343698431c35000 --status inactive
```

View File

@ -0,0 +1,39 @@
---
title: View tasks in InfluxDB
seotitle: View created tasks that process data in InfluxDB
description: >
How to view all created data processing tasks using the InfluxDB user interface
or the 'influx' command line interface.
menu:
v2_0:
name: View tasks
parent: Manage tasks
weight: 2
---
## View tasks in the InfluxDB UI
Click the **Tasks** icon in the left navigation to view the list of tasks.
{{< img-hd src="/img/tasks-icon.png" alt="Tasks Icon" />}}
### Filter the list of tasks
1. Enable the **Show Inactive** option to include inactive tasks in the list.
2. Enter text in the **Filter tasks by name** field to search for tasks by name.
3. Select an organization from the **All Organizations** dropdown to filter the list by organization.
4. Click on the heading of any column to sort by that field.
{{< img-hd src="/img/tasks-list.png" alt="View and filter tasks" />}}
## View tasks with the influx CLI
Use the `influx task find` command to return a list of created tasks.
```sh
influx task find
```
#### Filter tasks using the CLI
Other filtering options are available, such as filtering by organization or user
or limiting the number of tasks returned.
See the [`influx task find` documentation](/v2.0/reference/cli/influx/task/find)
for information about other available flags.

View File

@ -0,0 +1,109 @@
---
title: Task configuration options
seotitle: InfluxDB task configuration options
description: >
Task options define specific information about a task such as its name,
the schedule on which it runs, execution delays, and others.
menu:
v2_0:
name: Task options
parent: Process data
weight: 5
---
Task options define specific information about the task and are specified in your
Flux script or in the InfluxDB user interface (UI).
The following task options are available:
- [name](#name)
- [every](#every)
- [cron](#cron)
- [offset](#offset)
- [concurrency](#concurrency)
- [retry](#retry)
{{% note %}}
`every` and `cron` are mutually exclusive, but at least one is required.
{{% /note %}}
## name
The name of the task. _**Required**_.
_**Data type:** String_
```js
option task = {
  name: "taskName",
  // ...
}
```
## every
The interval at which the task runs.
_**Data type:** Duration_
_**Note:** In the InfluxDB UI, the **Interval** field sets this option_.
```js
option task = {
  // ...
  every: 1h,
}
```
## cron
The [cron expression](https://en.wikipedia.org/wiki/Cron#Overview) that
defines the schedule on which the task runs.
Cron scheduling is based on system time.
_**Data type:** String_
```js
option task = {
  // ...
  cron: "0 * * * *",
}
```
## offset
Delays the execution of the task but preserves the original time range.
For example, if a task is scheduled to run on the hour, a `10m` offset delays it to 10
minutes after the hour, but all time ranges defined in the task remain relative to
the scheduled execution time.
A common use case is offsetting execution to account for data that may arrive late.
_**Data type:** Duration_
```js
option task = {
  // ...
  offset: 10m,
}
```
## concurrency
The number of task executions that can run concurrently.
If the concurrency limit is reached, all subsequent executions are queued until
other running task executions complete.
_**Data type:** Integer_
```js
option task = {
  // ...
  concurrency: 2,
}
```
## retry
The number of times to retry the task before it is considered failed.
_**Data type:** Integer_
```js
option task = {
  // ...
  retry: 2,
}
```
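Putting these options together, a complete `task` option record might look like the following sketch. The task name is hypothetical, and only one of `every` or `cron` may appear:

```js
option task = {
  name: "example-hourly-task",
  cron: "0 * * * *",
  offset: 10m,
  concurrency: 1,
  retry: 2,
}
```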

View File

@ -0,0 +1,147 @@
---
title: Write an InfluxDB task
seotitle: Write an InfluxDB task that processes data
description: >
How to write an InfluxDB task that processes data in some way, then performs an action
such as storing the modified data in a new bucket or sending an alert.
menu:
v2_0:
name: Write a task
parent: Process data
weight: 1
---
InfluxDB tasks are scheduled Flux scripts that take a stream of input data, modify or analyze
it in some way, then store the modified data in a new bucket or perform other actions.
This article walks through writing a basic InfluxDB task that downsamples
data and stores it in a new bucket.
## Components of a task
Every InfluxDB task needs the following four components.
Their form and order can vary, but they are all essential parts of a task.
- [Task options](#define-task-options)
- [A data source](#define-a-data-source)
- [Data processing or transformation](#process-or-transform-your-data)
- [A destination](#define-a-destination)
_[Skip to the full example task script](#full-example-task-script)_
## Define task options
Task options define specific information about the task.
The example below illustrates how task options are defined in your Flux script:
```js
option task = {
  name: "cqinterval15m",
  every: 1h,
  offset: 0m,
  concurrency: 1,
  retry: 5
}
```
_See [Task configuration options](/v2.0/process-data/task-options) for detailed information
about each option._
{{% note %}}
If creating a task in the InfluxDB user interface (UI), task options are defined
in form fields when creating the task.
{{% /note %}}
## Define a data source
Define a data source using Flux's [`from()` function](/v2.0/reference/flux/functions/inputs/from/)
or any other [Flux input functions](/v2.0/reference/flux/functions/inputs/).
For convenience, consider creating a variable that includes the sourced data with
the required time range and any relevant filters.
```js
data = from(bucket: "telegraf/default")
  |> range(start: -task.every)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r.host == "myHost"
  )
```
{{% note %}}
#### Using task options in your Flux script
Task options are passed as part of a `task` object and can be referenced in your Flux script.
In the example above, the time range is defined as `-task.every`.
`task.every` is dot notation that references the `every` property of the `task` object.
`every` is defined as `1h`, therefore `-task.every` equates to `-1h`.
Using task options to define values in your Flux script can make reusing your task easier.
{{% /note %}}
## Process or transform your data
The purpose of tasks is to process or transform data in some way.
What exactly happens and what form the output data takes is up to you and your
specific use case.
The example below illustrates a task that downsamples data by calculating the average of set intervals.
It uses the `data` variable defined [above](#define-a-data-source) as the data source.
It then windows the data into 5 minute intervals and calculates the average of each
window using the [`aggregateWindow()` function](/v2.0/reference/flux/functions/transformations/aggregates/aggregatewindow/).
```js
data
  |> aggregateWindow(
    every: 5m,
    fn: mean
  )
```
_See [Common tasks](/v2.0/process-data/common-tasks) for examples of tasks commonly used with InfluxDB._
## Define a destination
In the vast majority of task use cases, once data is transformed, it needs to be sent and stored somewhere.
This could be a separate bucket with a different retention policy, another measurement, or even an alert endpoint _(Coming)_.
The example below uses Flux's [`to()` function](/v2.0/reference/flux/functions/outputs/to)
to send the transformed data to another bucket:
```js
// ...
|> to(bucket: "telegraf_downsampled", org: "my-org")
```
{{% note %}}
You cannot write to the same bucket you are reading from.
{{% /note %}}
## Full example task script
Below is the full example task script that combines all of the components described above:
```js
// Task options
option task = {
  name: "cqinterval15m",
  every: 1h,
  offset: 0m,
  concurrency: 1,
  retry: 5
}

// Data source
data = from(bucket: "telegraf/default")
  |> range(start: -task.every)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r.host == "myHost"
  )

data
  // Data transformation
  |> aggregateWindow(
    every: 5m,
    fn: mean
  )
  // Data destination
  |> to(bucket: "telegraf_downsampled", org: "my-org")
```

View File

@ -1,7 +1,12 @@
{{ .Inner }}
{{ $src := .Get "src" }}
{{ $alt := .Get "alt" }}
{{ with (imageConfig ( print "/static" $src )) }}
{{ $imageWidth := div .Width 2 }}
<img src='{{ $src }}' alt='{{ $alt }}' width='{{ $imageWidth }}' />
{{ if (fileExists ( print "/static" $src )) }}
{{ with (imageConfig ( print "/static" $src )) }}
{{ $imageWidth := div .Width 2 }}
<img src='{{ $src }}' alt='{{ $alt }}' width='{{ $imageWidth }}' />
{{ end }}
{{ else }}
<img src='{{ $src }}' alt='{{ $alt }}'/>
{{ end }}