---
title: Get started with InfluxDB tasks
list_title: Get started with tasks
description: >
Learn the basics of writing an InfluxDB task that processes data, and then performs an action,
such as storing the modified data in a new bucket or sending an alert.
aliases:
- /v2.0/process-data/write-a-task/
2019-02-05 18:39:00 +00:00
v2.0/tags: [tasks]
menu:
v2_0:
name: Get started with tasks
parent: Process data
weight: 101
---

An **InfluxDB task** is a scheduled Flux script that takes a stream of input data, modifies or analyzes
it in some way, then stores the modified data in a new bucket or performs other actions.

This article walks through writing a basic InfluxDB task that downsamples
data and stores it in a new bucket.

## Components of a task

Every InfluxDB task needs the following four components.
Their form and order can vary, but they are all essential parts of a task.

- [Task options](#define-task-options)
- [A data source](#define-a-data-source)
- [Data processing or transformation](#process-or-transform-your-data)
- [A destination](#define-a-destination)

_[Skip to the full example task script](#full-example-task-script)_

## Define task options
Task options define specific information about the task.
The example below illustrates how task options are defined in your Flux script:
```js
option task = {
  name: "cqinterval15m",
  every: 1h,
  offset: 0m,
  concurrency: 1,
  retry: 5
}
```

_See [Task configuration options](/v2.0/process-data/task-options) for detailed information
about each option._

{{% note %}}
When creating a task in the InfluxDB user interface (UI), task options are defined in form fields.
{{% /note %}}

## Define a data source

Define a data source using Flux's [`from()` function](/v2.0/reference/flux/stdlib/built-in/inputs/from/)
or any other [Flux input functions](/v2.0/reference/flux/stdlib/built-in/inputs/).

For convenience, consider creating a variable that includes the sourced data with
the required time range and any relevant filters.

```js
data = from(bucket: "telegraf/default")
  |> range(start: -task.every)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r.host == "myHost"
  )
```

{{% note %}}
#### Using task options in your Flux script
Task options are passed as part of a `task` object and can be referenced in your Flux script.
In the example above, the time range is defined as `-task.every`.
`task.every` is dot notation that references the `every` property of the `task` object.
`every` is defined as `1h`, therefore `-task.every` equates to `-1h`.

Using task options to define values in your Flux script can make reusing your task easier,
as the sketch after this note illustrates.
{{% /note %}}
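
For example, referencing `task.every` in more than one place means changing the schedule also
updates the query. The following is a minimal, hypothetical sketch (the task name, bucket name,
and filter are illustrative and not part of the example task built in this article):

```js
option task = {name: "example-reuse", every: 1h, offset: 0m}

// Reuse the scheduling interval for both the query range and the aggregation window
from(bucket: "example-bucket")
  |> range(start: -task.every)
  |> filter(fn: (r) => r._measurement == "mem")
  |> aggregateWindow(every: task.every, fn: mean)
```
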
## Process or transform your data
The purpose of tasks is to process or transform data in some way.
What exactly happens and what form the output data takes is up to you and your
specific use case.

The example below illustrates a task that downsamples data by calculating the average over set intervals.
It uses the `data` variable defined [above](#define-a-data-source) as the data source.
It then windows the data into 5-minute intervals and calculates the average of each
window using the [`aggregateWindow()` function](/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/).

```js
data
  |> aggregateWindow(
    every: 5m,
    fn: mean
  )
```

_See [Common tasks](/v2.0/process-data/common-tasks) for examples of tasks commonly used with InfluxDB._

## Define a destination
In the vast majority of task use cases, once data is transformed, it needs to be sent and stored somewhere.
This could be a separate bucket or another measurement.

The example below uses Flux's [`to()` function](/v2.0/reference/flux/stdlib/built-in/outputs/to)
to send the transformed data to another bucket:

```js
// ...
  |> to(bucket: "telegraf_downsampled", org: "my-org")
```

{{% note %}}
To write data to InfluxDB, your output must include `_time`, `_measurement`, `_field`, and `_value` columns.
If a transformation drops one of these columns, restore it before writing, as the sketch below illustrates.
{{% /note %}}
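
For example, aggregating with `window()` and `mean()` (instead of `aggregateWindow()`) drops the
`_time` column. The following is a minimal sketch, assuming the `data` variable defined earlier
and an illustrative measurement name, of restoring required columns before writing:

```js
data
  |> window(every: 5m)
  |> mean()
  // mean() drops _time, so copy each window's _stop value into a new _time column
  |> duplicate(column: "_stop", as: "_time")
  // Write the results under a new measurement name ("mem_downsampled" is illustrative)
  |> set(key: "_measurement", value: "mem_downsampled")
  |> window(every: inf)
  |> to(bucket: "telegraf_downsampled", org: "my-org")
```
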
## Full example task script

Below is a task script that combines all of the components described above:

```js
// Task options
option task = {
  name: "cqinterval15m",
  every: 1h,
  offset: 0m,
  concurrency: 1,
  retry: 5
}

// Data source
data = from(bucket: "telegraf/default")
  |> range(start: -task.every)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r.host == "myHost"
  )

data
  // Data transformation
  |> aggregateWindow(
    every: 5m,
    fn: mean
  )
  // Data destination
  |> to(bucket: "telegraf_downsampled")
```