feat: add information and examples for push downs and performance implications (#3725)

pull/3770/head
Jonathan A. Sternberg 2022-02-09 16:35:51 -06:00 committed by GitHub
parent f3089ac619
commit df89ac80a2
1 changed files with 130 additions and 0 deletions


@ -90,11 +90,84 @@ If authentication is _disabled_, provide an empty string (`""`).
If authentication is _enabled_, provide your InfluxDB username and password
using the `<username>:<password>` syntax.
## Push down optimizations
Some transformations called after `from()` trigger performance optimizations called pushdowns.
These optimizations are "pushed down" from Flux into the InfluxDB storage layer, where storage-layer code applies the transformation.
Pushdowns happen automatically, but it is helpful to understand how these optimizations work so you can better optimize your Flux queries.
Pushdowns require an unbroken and exclusive chain between transformations.
If the output of `from()` is stored in a variable and that variable is then piped into
multiple pushdowns, none of the pushdowns are applied. For example:
```js
// Pushdowns are NOT applied
data = from(bucket: "example-bucket")
    |> range(start: -1h)

data |> filter(fn: (r) => r._measurement == "m0") |> yield(name: "m0")
data |> filter(fn: (r) => r._measurement == "m1") |> yield(name: "m1")
```
To reuse code and still apply pushdowns, invoke `from()` in a function and pipe-forward the output of the function into subsequent pushdowns:
```js
// Pushdowns ARE applied
data = () => from(bucket: "example-bucket")
    |> range(start: -1h)

data() |> filter(fn: (r) => r._measurement == "m0") |> yield(name: "m0")
data() |> filter(fn: (r) => r._measurement == "m1") |> yield(name: "m1")
```
### Filter
`filter()` transformations that compare `r._measurement`, `r._field`, `r._value`, or any tag value are pushed down to the storage layer.
Comparisons that call functions are not pushed down.
If the function produces a static value, evaluate the function outside of `filter()` and compare against the result.
For example:
```js
import "strings"

// filter() is NOT pushed down
data
    |> filter(fn: (r) => r.example == strings.joinStr(arr: ["foo", "bar"], v: ""))

// filter() is pushed down
exVar = strings.joinStr(arr: ["foo", "bar"], v: "")

data
    |> filter(fn: (r) => r.example == exVar)
```
Multiple consecutive `filter()` transformations that can be pushed down are merged together into a single filter that gets pushed down.
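As a hedged illustration of the merge described above, the two consecutive filters in the first query below select the same rows as the single combined predicate in the second. The bucket, measurement, and field names (`example-bucket`, `m0`, `f0`) are placeholders. The only assumption is that the two forms are logically equivalent; how the planner represents the merged filter internally is not shown here.

```js
// Two consecutive pushdown-eligible filters...
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> filter(fn: (r) => r._field == "f0")
    |> yield(name: "consecutive")

// ...select the same rows as one filter with a combined predicate.
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0" and r._field == "f0")
    |> yield(name: "combined")
```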
### Aggregates
The following aggregate transformations are pushed down:
- `min()`
- `max()`
- `sum()`
- `count()`
- `mean()` (except when used with `group()`)
Aggregates are also pushed down when they are preceded by a `group()`.
The only exception is `mean()`, which cannot be pushed down to the storage layer when used with `group()`.
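A minimal sketch of the `group()` rule above, using placeholder names (`example-bucket`, measurement `m0`, and a tag `t0`). The first query uses `count()` after `group()`, which is eligible for pushdown. The second swaps in `mean()`, which is the stated exception.

```js
// count() after group(): pushed down
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> group(columns: ["t0"])
    |> count()
    |> yield(name: "grouped-count")

// mean() after group(): NOT pushed down
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> group(columns: ["t0"])
    |> mean()
    |> yield(name: "grouped-mean")
```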
### Aggregate Window
Aggregates used with `aggregateWindow()` are pushed down.
Aggregates pushed down with `aggregateWindow()` are not compatible with `group()`.
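One way to read the rule above, sketched with placeholder names (`example-bucket`, `m0`, `t0`): a windowed aggregate on its own is pushed down, while the same windowed aggregate in a chain that also contains `group()` is not. That second part is an assumption drawn from the sentence above, not confirmed planner behavior.

```js
// Windowed aggregate without group(): pushed down
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> aggregateWindow(every: 5m, fn: min)
    |> yield(name: "windowed-min")

// Assumed: with group() in the chain, the windowed aggregate is not pushed down
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> group(columns: ["t0"])
    |> aggregateWindow(every: 5m, fn: min)
    |> yield(name: "grouped-windowed-min")
```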
## Examples
- [Query InfluxDB using the bucket name](#query-using-the-bucket-name)
- [Query InfluxDB using the bucket ID](#query-using-the-bucket-id)
- [Query a remote InfluxDB Cloud instance](#query-a-remote-influxdb-cloud-instance)
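- [Utilize pushdowns in multiple queries](#utilize-pushdowns-in-multiple-queries)
- [Query from the same bucket to multiple pushdowns](#query-from-the-same-bucket-to-multiple-pushdowns)
- [Query from the same bucket to multiple transformations](#query-from-the-same-bucket-to-multiple-transformations)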
#### Query using the bucket name
```js
@ -119,3 +192,60 @@ from(
token: token
)
```
#### Utilize pushdowns in multiple queries
```js
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> filter(fn: (r) => r._field == "f0")
    |> yield(name: "filter-only")

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> filter(fn: (r) => r._field == "f0")
    |> max()
    |> yield(name: "max")

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> filter(fn: (r) => r._field == "f0")
    |> group(columns: ["t0"])
    |> max()
    |> yield(name: "grouped-max")

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")
    |> filter(fn: (r) => r._field == "f0")
    |> aggregateWindow(every: 5m, fn: max)
    |> yield(name: "windowed-max")
```
#### Query from the same bucket to multiple pushdowns
```js
// Use a function. Storing the stream in a variable would stop
// Flux from pushing down the filters.
data = () => from(bucket: "example-bucket")
    |> range(start: -1h)

data() |> filter(fn: (r) => r._measurement == "m0") |> yield(name: "m0")
data() |> filter(fn: (r) => r._measurement == "m1") |> yield(name: "m1")
```
#### Query from the same bucket to multiple transformations
```js
// The pushdown chain (from |> range |> filter) is complete before the
// stream is reused, so the chain is not broken. In this case, it is more
// efficient to use a variable.
data = from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m0")

data |> derivative() |> yield(name: "derivative")
data |> movingAverage(n: 5) |> yield(name: "movingAverage")
```