finalized downsampling guide, added page descriptions to task docs

pull/20/head
Scott Anderson 2019-01-18 17:29:28 -07:00
parent 6e13d5292f
commit e901980d5d
6 changed files with 82 additions and 28 deletions

View File

@ -1,7 +1,9 @@
---
title: Process Data with InfluxDB tasks
seotitle: Process Data with InfluxDB tasks
description: placeholder
description: >
InfluxDB's task engine runs scheduled Flux tasks that process and analyze data.
This collection of articles provides information about creating and managing InfluxDB tasks.
menu:
v2_0:
name: Process data
@ -21,4 +23,7 @@ The following articles explain how to configure and build tasks using the Influx
and via raw Flux scripts with the `influx` command line interface (CLI).
They also provide examples of commonly used tasks.
_Links for nested docs._
[Write a task](/v2.0/process-data/write-a-task)
[Manage Tasks](/v2.0/process-data/manage-tasks)
[Common Tasks](/v2.0/process-data/common-tasks)
[Task Options](/v2.0/process-data/task-options)

View File

@ -1,7 +1,9 @@
---
title: Common data processing tasks
seotitle: Common data processing tasks performed with InfluxDB
description: placeholder
description: >
InfluxDB Tasks process data on specified schedules.
This collection of articles walks through common use cases for InfluxDB tasks.
menu:
v2_0:
name: Common tasks
@ -11,10 +13,10 @@ menu:
The following articles walk through common task use cases.
[Downsample Data with InfluxDB](/v2.0/process-data/common-tasks/downsample-data)
{{% note %}}
This list will continue to grow.
If you have suggestions, please [create an issue](https://github.com/influxdata/docs-v2/issues/new)
on the InfluxData documentation repository on GitHub.
{{% /note %}}
[Downsample Data with InfluxDB](/v2.0/process-data/common-tasks/downsample-data)

View File

@ -1,7 +1,9 @@
---
title: Downsample data with InfluxDB
seotitle: Downsample data in an InfluxDB task
description: placeholder
description: >
How to create a task that downsamples data much like continuous queries
in previous versions of InfluxDB.
menu:
v2_0:
name: Downsample data
@ -9,35 +11,74 @@ menu:
weight: 4
---
- Talk about Continuous Queries
One of the most common use cases for InfluxDB tasks is downsampling data to reduce
the overall disk usage as data collects over time.
In previous versions of InfluxDB, continuous queries filled this role.
**Requirements:**
This article walks through creating a continuous-query-like task that downsamples
data by aggregating data within windows of time, then storing the aggregate value in a new bucket.
- A "source" bucket
- A "destination" bucket
- Some type of aggregation
- and a `to` statement
### Requirements
To perform a downsampling task, you need the following:
- You can't write data into the same bucket you're reading from
- Two buckets
- `to()` requires a bucket AND org
##### A "source" bucket
The bucket from which data is queried.
##### A "destination" bucket
A separate bucket where aggregated, downsampled data is stored.
##### Some type of aggregation
Downsampling requires aggregating data in some way.
The aggregation method you use depends on your use case,
but examples include mean, median, top, and bottom.
View [Flux's aggregate functions](#) for more information and ideas.
## Create a destination bucket
By design, tasks cannot write to the same bucket from which they are reading.
You need another bucket where the task can store the aggregated, downsampled data.
_For information about creating buckets, see [Create a bucket](#)._
## Example downsampling task script
The example task script below is a very basic form of data downsampling that does the following:
1. Defines a task named "cq-mem-data-1w" that runs once a week.
2. Defines a `data` variable that represents all data from the last 2 weeks in the
`mem` measurement of the `system-data` bucket.
3. Uses the [`aggregateWindow()` function](#) to window the data into 1 hour intervals
and calculate the average of each interval.
4. Stores the aggregated data in the `system-data-downsampled` bucket under the
`my-org` organization.
```js
// Task Options
option task = {
name: "cqinterval15m",
name: "cq-mem-data-1w",
every: 1w,
}
data = from(bucket: "telegraf")
// Defines a data source
data = from(bucket: "system-data")
|> range(start: -task.every * 2)
|> filter(fn: (r) => r._measurement == "cpu")
|> filter(fn: (r) => r._measurement == "mem")
downsampleHourly = (tables=<-) =>
tables
data
// Windows and aggregates the data into 1h averages
|> aggregateWindow(fn: mean, every: 1h)
|> set(key: "_measurement", value: "cpu_1h" )
|> to(bucket: "telegraf_downsampled", org: "my-org")
downsampleHourly(data)
// Stores the aggregated data in a new bucket
|> to(bucket: "system-data-downsampled", org: "my-org")
```
Again, this is a very basic example, but it should provide you with a foundation
to build more complex downsampling tasks.
## Add your task
Once your task is ready, see [Create a task](#) for information about adding it to InfluxDB.
## Things to consider
- If there is a chance that data may arrive late, specify an `offset` in your
task options long enough to account for late data.
- If running a task against a bucket with a finite retention policy, do not schedule
tasks to run too close to the end of the retention policy.
Always provide a "cushion" for downsampling tasks to complete before the data
is dropped by the retention policy.
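As a rough sketch of the first consideration, the task options below add an `offset` to the example task.
The `10m` value is an assumption chosen only for illustration; pick a delay that matches how late your data can realistically arrive.

```js
// Task options with an execution delay (the 10m offset is an illustrative assumption)
option task = {
  name: "cq-mem-data-1w",
  every: 1w,
  // Delays each run so late-arriving data has time to land before the task queries it
  offset: 10m,
}
```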

View File

@ -1,7 +1,9 @@
---
title: Manage tasks in InfluxDB
seotitle: Manage data processing tasks in InfluxDB
description: placeholder
description: >
InfluxDB provides options for managing the creation, reading, updating, and deletion
of tasks using both the 'influx' CLI and the InfluxDB UI.
menu:
v2_0:
name: Manage tasks

View File

@ -1,7 +1,9 @@
---
title: Task configuration options
seotitle: InfluxDB task configuration options
description: placeholder
description: >
Task options define specific information about a task such as its name,
the schedule on which it runs, execution delays, and others.
menu:
v2_0:
name: Task options

View File

@ -1,7 +1,9 @@
---
title: Write an InfluxDB task
seotitle: Write an InfluxDB task that processes data
description: placeholder
description: >
How to write an InfluxDB task that processes data in some way, then performs an action
such as storing the modified data in a new bucket or sending an alert.
menu:
v2_0:
name: Write a task