feat: add information and examples for push downs and performance implications (#3725)

2022-02-09 16:35:51 -06:00 · 2022-02-09 16:35:51 -06:00 · df89ac80a2
parent f3089ac619
commit df89ac80a2
1 changed files with 130 additions and 0 deletions
--- a/content/flux/v0.x/stdlib/influxdata/influxdb/from.md
+++ b/content/flux/v0.x/stdlib/influxdata/influxdb/from.md
@ -90,11 +90,84 @@ If authentication is _disabled_, provide an empty string (`""`).
 If authentication is _enabled_, provide your InfluxDB username and password
 using the `<username>:<password>` syntax.

+## Push down optimizations
+
+Some transformations called after `from()` trigger performance optimizations called pushdowns.
+These optimizations are "pushed down" from Flux into the InfluxDB storage layer and where they utilize code in storage to apply the transformation.
+Pushdowns happen automatically, but it is helpful understand how these optimizations work so you can better optimize your Flux queries.
+
+Pushdowns require an unbroken and exclusive chain between transformations.
+A `from()` call stored in a variable that then goes to multiple pushdowns will
+cause none of the pushdowns to be applied. For example:
+
+```js
+// Pushdowns are NOT applied
+data = from(bucket: "example-bucket")
+    |> range(start: -1h)
+
+data |> filter(fn: (r) => r._measurement == "m0") |> yield(name: "m0")
+data |> filter(fn: (r) => r._measurement == "m1") |> yield(name: "m1")
+```
+
+To reuse code and still apply pushdowns, invoke `from()` in a function and pipe-forward the output of the function into subsequent pushdowns:
+
+```js
+// Pushdowns ARE applied
+data = () => from(bucket: "example-bucket")
+    |> range(start: -1h)
+
+data() |> filter(fn: (r) => r._measurement == "m0") |> yield(name: "m0")
+data() |> filter(fn: (r) => r._measurement == "m1") |> yield(name: "m1")
+```
+
+### Filter
+
+`filter()` transformations that compare `r._measurement`, `r._field`, `r._value` or any tag value are pushed down to the storage layer.
+Comparisons that use functions do not.
+If the function produces a static value, evaluate the function outside of `filter()`.
+For example:
+
+```js\
+import "strings"
+
+// filter() is NOT pushed down
+data
+    |> filter(fn: (r) => r.example == strings.joinStr(arr: ["foo", "bar"], v: ""))
+
+// filter() is pushed down
+exVar = strings.joinStr(arr: ["foo", "bar"], v: ""))
+
+data
+    |> filter(fn: (r) => r.example == exVar)
+```
+
+Multiple consecutive `filter()` transformations that can be pushed down are merged together into a single filter that gets pushed down.
+
+### Aggregates
+
+The following aggregate transformations are pushed down:
+
+- `min()`
+- `max()`
+- `sum()`
+- `count()`
+- `mean()` (except when used with `group()`)
+
+Aggregates will also be pushed down if they are preceded by a `group()`.
+The only exception is `mean()` which cannot be pushed down to the storage layer with `group()`.
+
+### Aggregate Window
+
+Aggregates used with `aggregateWindow()` are pushed down.
+Aggregates pushed down with `aggregateWindow()` are not compatible with `group()`.
+
 ## Examples

 - [Query InfluxDB using the bucket name](#query-using-the-bucket-name)
 - [Query InfluxDB using the bucket ID](#query-using-the-bucket-id)
 - [Query a remote InfluxDB Cloud instance](#query-a-remote-influxdb-cloud-instance)
+- [Query push downs](#query-push-downs)
+- [Query from the same bucket to multiple push downs](#query-multiple-push-downs)

 #### Query using the bucket name
 ```js
@ -119,3 +192,60 @@ from(
  token: token
 )
 ```
+
+### Utilize pushdowns in multiple queries
+
+```js
+from(bucket: "example-bucket")
+    |> range(start: -1h)
+    |> filter(fn: (r) => r._measurement == "m0")
+    |> filter(fn: (r) => r._field == "f0")
+    |> yield(name: "filter-only")
+
+from(bucket: "example-bucket")
+    |> range(start: -1h)
+    |> filter(fn: (r) => r._measurement == "m0")
+    |> filter(fn: (r) => r._field == "f0")
+    |> max()
+    |> yield(name: "max")
+
+from(bucket: "example-bucket")
+    |> range(start: -1h)
+    |> filter(fn: (r) => r._measurement == "m0")
+    |> filter(fn: (r) => r._field == "f0")
+    |> group(columns: ["t0"])
+    |> max()
+    |> yield(name: "grouped-max")
+
+from(bucket: "example-bucket")
+    |> range(start: -1h)
+    |> filter(fn: (r) => r._measurement == "m0")
+    |> filter(fn: (r) => r._field == "f0")
+    |> aggregateWindow(every: 5m, fn: max)
+    |> yield(name: "windowed-max")
+```
+
+### Query from the same bucket to multiple pushdowns
+
+```js
+// Use a function. If you use a variable, this will stop
+// Flux from pushing down the operation.
+data = () => from(bucket: "example-bucket")
+  |> range(start: -1h)
+
+data() |> filter(fn: (r) => r._measurement == "m0")
+data() |> filter(fn: (r) => r._measurement == "m1")
+```
+
+### Query from the same bucket to multiple transformations
+
+```js
+// The push down chain is not broken until after the push down
+// is complete. In this case, it is more efficient to use a variable.
+data = from(bucket: "example-bucket")
+  |> range(start: -1h)
+  |> filter(fn: (r) => r._measurement == "m0")
+
+data |> derivative() |> yield(name: "derivative")
+data |> movingAverage(n: 5) |> yield(name: "movingAverage")
+```