---
title: Write an InfluxDB task
seotitle: Write an InfluxDB task that processes data
description: placeholder
menu:
---
InfluxDB tasks are user-defined Flux scripts that take a stream of input data, modify or analyze it in some way, then perform an action, all on a specified schedule. In their simplest form, tasks are essentially Flux scripts with a "destination." This destination could be another bucket, another measurement, an alert endpoint (Coming), etc.
This article walks through writing a basic InfluxDB task that downsamples data and stores it in a new bucket.
## Components of a Task
Every InfluxDB task needs the following four components. Their form and order can vary, but they are all essential parts of a task.
[Skip to the full example task script](#full-example-task-script)
## Define task options
Task options define specific information about the task. The example below illustrates how task options are defined in your Flux script:
```js
option task = {
    name: "cqinterval15m",
    every: 1h,
    offset: 0m,
    concurrency: 1,
    retry: 5
}
```
See Task configuration options for detailed information about each option.
{{% note %}} When creating a task in the InfluxDB user interface (UI), task options are not required in your Flux script. They are defined in the UI while creating the task. {{% /note %}}
## Define a data source
Define a data source using Flux's `from()` function or any other Flux input functions.
For convenience, consider creating a variable that includes the sourced data with
the required `range()` and any relevant filters.
```js
data = from(bucket: "telegraf/default")
    |> range(start: -task.every)
    |> filter(fn: (r) =>
        r._measurement == "mem" and
        r.host == "myHost"
    )
```
{{% note %}}
#### Using task options in your Flux script
Task options are passed as part of a `task` object and can be referenced in your Flux script.
In the example above, the time range is defined as `-task.every`.

`task.every` is dot notation that references the `every` property of the `task` object.
`every` is defined as `1h`, therefore `-task.every` equates to `-1h`.

Using task options to define values in your Flux script can make reusing your task easier.
{{% /note %}}
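Any property of the `task` object can be referenced the same way, not just `every`. As a hedged illustration, the sketch below tags each output record with the task's name using `set()`; the column name `task_name` is an example, not part of the task above:

```js
// Sketch only: task.name references the "name" option defined above,
// so every record in the output carries a "task_name" column.
data
    |> set(key: "task_name", value: task.name)
```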
## Process or transform your data
The purpose of tasks is to process or transform data in some way. What exactly happens and what form the output data takes is up to you and your specific use case.
The example below illustrates a task that downsamples data by calculating the average of set intervals.
It uses the `data` variable defined above as the data source.
It then windows the data into 5-minute intervals and calculates the average of each
window using the `aggregateWindow()` function.
```js
data
    |> aggregateWindow(
        every: 5m,
        fn: mean
    )
```
See Common tasks for examples of tasks commonly used with InfluxDB.
## Define a destination
In the vast majority of task use cases, once data is transformed, it needs to be sent and stored somewhere. This could be a separate bucket with a different retention policy, another measurement, or even an alert endpoint (Coming).
The example below uses Flux's `to()` function to send the transformed data to another bucket:
```js
// ...
    |> to(bucket: "telegraf_downsampled", org: "my-org")
```
{{% note %}} You cannot write to the same bucket you are reading from. {{% /note %}}
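To store results under a different measurement rather than (or in addition to) a different bucket, one common pattern is to rewrite the `_measurement` column with `set()` before calling `to()`. This is a sketch, not part of the task above; the measurement name `mem_5m` is illustrative:

```js
// Sketch only: rewrite the measurement name before writing out.
// "mem_5m" is a hypothetical name for the downsampled series.
data
    |> aggregateWindow(every: 5m, fn: mean)
    |> set(key: "_measurement", value: "mem_5m")
    |> to(bucket: "telegraf_downsampled", org: "my-org")
```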
## Full example task script
Below is the full example task script that combines all of the components described above:
```js
// Task options
option task = {
    name: "cqinterval15m",
    every: 1h,
    offset: 0m,
    concurrency: 1,
    retry: 5
}

// Data source
data = from(bucket: "telegraf/default")
    |> range(start: -task.every)
    |> filter(fn: (r) =>
        r._measurement == "mem" and
        r.host == "myHost"
    )

data
    // Data transformation
    |> aggregateWindow(
        every: 5m,
        fn: mean
    )
    // Data destination
    |> to(bucket: "telegraf_downsampled")
```