updated query optimizations guide to address PR feedback

pull/641/head
Scott Anderson 2019-11-27 14:16:56 -07:00
parent d9c0d409ed
commit bc756e1f61
1 changed files with 27 additions and 35 deletions


---
title: Optimize Flux queries
description: >
  Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
weight: 104
menu:
v2_0:
v2.0/tags: [query]
---
Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
- [Start queries with pushdown functions](#start-queries-with-pushdown-functions)
- [Avoid short window durations](#avoid-short-window-durations)
- [Use "heavy" functions sparingly](#use-heavy-functions-sparingly)
- [Balance time range and data precision](#balance-time-range-and-data-precision)
## Start queries with pushdown functions
Some Flux functions can push their data manipulation down to the underlying
data source rather than storing and manipulating data in memory.
These are known as "pushdown" functions and using them correctly can greatly
reduce the amount of memory necessary to run a query.
#### Pushdown functions
- [range()](/v2.0/reference/flux/stdlib/built-in/transformations/range/)
- [filter()](/v2.0/reference/flux/stdlib/built-in/transformations/filter/)
- [group()](/v2.0/reference/flux/stdlib/built-in/transformations/group/)
Use pushdown functions at the beginning of your query.
Once a non-pushdown function runs, Flux pulls data into memory and runs all
subsequent operations there.
##### Pushdown functions in use
```js
from(bucket: "example-bucket")
  |> range(start: -1h)                          //
  |> filter(fn: (r) => r.sensor_id == "abc123") // Pushed down to the data source
  |> group(columns: ["_field", "host"])         //
  |> aggregateWindow(every: 1m, fn: mean)       //
  |> filter(fn: (r) => r._value >= 90.0)        // Run in memory
  |> top(n: 10)                                 //
```
## Avoid short window durations
Windowing (grouping data based on time intervals) is commonly used to aggregate and downsample data.
Increase performance by avoiding short window durations.
More windows require more compute power to evaluate which window each row should be assigned to.
Reasonable window durations depend on the total time range queried.
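For example, over a 24-hour query, one-minute windows produce roughly 1,440 windows, while one-second windows produce roughly 86,400. The sketch below assumes a hypothetical `example-bucket` with a `mem` measurement:

```js
from(bucket: "example-bucket")
  |> range(start: -24h)
  |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
  // 24h / 1m ≈ 1,440 windows — reasonable for a 24-hour range
  |> aggregateWindow(every: 1m, fn: mean)
  // every: 1s over the same range would create ~86,400 windows
```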
## Use "heavy" functions sparingly
The following functions use more memory or CPU than others.
Consider their necessity in your data processing before using them:
- [map()](/v2.0/reference/flux/stdlib/built-in/transformations/map/)
- [reduce()](/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/reduce/)
- [pivot()](/v2.0/reference/flux/stdlib/built-in/transformations/pivot/)
{{% note %}}
We're continually optimizing Flux and this list may not represent its current state.
{{% /note %}}
## Balance time range and data precision
To ensure queries are performant, balance the time range and the precision of your data.
For example, if you query data stored every second and request six months' worth of data,
results will include a minimum of ≈15.5 million points
(6 × 30 days × 86,400 points per day ≈ 15.5 million).
Flux must store these points in memory to generate a response.
To query data over large periods of time, create a task to [downsample data](/v2.0/process-data/common-tasks/downsample-data/), and then query the downsampled data instead.
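A downsampling task might look like the following sketch, assuming hypothetical bucket names (`example-bucket`, `example-bucket-downsampled`) and a `mem` measurement:

```js
// Run the task once per hour
option task = {name: "downsample-mem", every: 1h}

from(bucket: "example-bucket")
  |> range(start: -task.every)
  |> filter(fn: (r) => r._measurement == "mem")
  // Aggregate second-precision data into one-minute averages
  |> aggregateWindow(every: 1m, fn: mean)
  // Write the lower-resolution data to a separate bucket
  |> to(bucket: "example-bucket-downsampled")
```

Queries over long time ranges can then read from `example-bucket-downsampled`, processing roughly 1/60th as many points.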