updated query optimizations guide to address PR feedback
parent d9c0d409ed
commit bc756e1f61
@@ -1,8 +1,7 @@
 ---
 title: Optimize Flux queries
 description: >
-  Optimize your Flux queries by changing function order, understanding how functions work,
-  and being conscious of the nature of the data queried.
+  Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
 weight: 104
 menu:
   v2_0:
@@ -11,25 +10,28 @@ menu:
 v2.0/tags: [query]
 ---
 
-Optimize your Flux queries with a few simple principles.
-Query optimizations center around reducing the memory and compute (CPU) requirements
-of a query by changing function order, understanding how functions work, and being
-conscious of the nature of the data queried.
+Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
 
+- [Start queries with pushdown functions](#start-queries-with-pushdown-functions)
+- [Avoid short window durations](#avoid-short-window-durations)
+- [Use "heavy" functions sparingly](#use-heavy-functions-sparingly)
+- [Balance time range and data precision](#balance-time-range-and-data-precision)
+
 ## Start queries with pushdown functions
-Certain Flux functions can push their data manipulation down to the underlying
-data source rather than pulling the data into memory and manipulating it there.
-Using "pushdown" functions reduces the amount of memory necessary to run a query.
-However, to benefit from these performance gains, you must **use pushdown functions
-at the beginning of your query**.
-When a non-pushdown function runs, Flux pulls the data into memory and manipulates the data there.
-All subsequent functions must operate in memory, including pushdown-capable functions.
+Some Flux functions can push their data manipulation down to the underlying
+data source rather than storing and manipulating data in memory.
+These are known as "pushdown" functions and using them correctly can greatly
+reduce the amount of memory necessary to run a query.
 
 #### Pushdown functions
 - [range()](/v2.0/reference/flux/stdlib/built-in/transformations/range/)
 - [filter()](/v2.0/reference/flux/stdlib/built-in/transformations/filter/)
 - [group()](/v2.0/reference/flux/stdlib/built-in/transformations/group/)
 
+Use pushdown functions at the beginning of your query.
+Once a non-pushdown function runs, Flux pulls data into memory and runs all
+subsequent operations there.
+
 ##### Pushdown functions in use
 ```js
 from(bucket: "example-bucket")
@@ -42,20 +44,15 @@ from(bucket: "example-bucket")
   |> top(n: 10)                              //
 ```
 
-## Don't over-window data
+## Avoid short window durations
 Windowing (grouping data based on time intervals) is commonly used to aggregate and downsample data.
-It's important to not "over-window" your data by using short window durations.
-The more windows Flux creates, the more compute power it needs to evaluate which
-window each row should be assigned to.
-
+Increase performance by avoiding short window durations.
+More windows require more compute power to evaluate which window each row should be assigned to.
 Reasonable window durations depend on the total time range queried.
 
 ## Use "heavy" functions sparingly
-Some Flux functions are known to use more memory or CPU than others.
-These provide vital functionality to Flux and your data processing workflow,
-but use them only when necessary.
-
-The following functions are known to be "heavy:"
+The following functions use more memory or CPU than others.
+Consider their necessity in your data processing before using them:
 
 - [map()](/v2.0/reference/flux/stdlib/built-in/transformations/map/)
 - [reduce()](/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/reduce/)
@@ -65,18 +62,13 @@ The following functions are known to be "heavy:"
 - [pivot()](/v2.0/reference/flux/stdlib/built-in/transformations/pivot/)
 
 {{% note %}}
-Flux engineers are in the process of optimizing functions.
-This list may not represent the current state of Flux and will be updated over time.
+We're continually optimizing Flux and this list may not represent its current state.
 {{% /note %}}
 
-## Balance time range vs data precision
-To ensure queries are performant, be sure to balance the time range of your query
-with the precision of your data.
-For example, if you query data with values stored every second and you request
-six months worth of data, results will include a minimum of ≈15.5 million points.
-Flux has to store that data in memory as it generates a response.
+## Balance time range and data precision
+To ensure queries are performant, balance the time range and the precision of your data.
+For example, if you query data stored every second and request six months worth of data,
+results will include a minimum of ≈15.5 million points.
+Flux must store these points in memory to generate a response.
 
-To query data over large periods of time, consider creating a task to
-[downsample high-resolution data](/v2.0/process-data/common-tasks/downsample-data/)
-into lower resolution data.
-Then query the low-resolution data using larger time ranges.
+To query data over large periods of time, create a task to [downsample data](/v2.0/process-data/common-tasks/downsample-data/), and then query the downsampled data instead.
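Review note (not part of the commit): the ≈15.5 million figure in the last hunk is consistent with one point per second over six months if a month is counted as 30 days. That month length is an assumption here, since the guide does not say how it counts. The Python sketch below reproduces the number and, with hypothetical helper names, estimates how many windows a given window duration creates over the same range, which is the cost the "Avoid short window durations" section warns about.

```python
# Sanity check for the figures in the guide. Assumptions (not from the guide):
# a month is 30 days, and the series has exactly one point per interval.
SECONDS_PER_DAY = 24 * 60 * 60


def points_in_range(days: int, interval_seconds: int = 1) -> int:
    """Points in one series written every `interval_seconds` over `days` days."""
    return days * SECONDS_PER_DAY // interval_seconds


def window_count(days: int, window_seconds: int) -> int:
    """Windows needed to cover `days` days, rounding a partial window up."""
    total_seconds = days * SECONDS_PER_DAY
    return -(-total_seconds // window_seconds)  # ceiling division


SIX_MONTHS_DAYS = 6 * 30

if __name__ == "__main__":
    print("points at 1s precision:", points_in_range(SIX_MONTHS_DAYS))
    print("10s windows:", window_count(SIX_MONTHS_DAYS, 10))
    print("1h windows:", window_count(SIX_MONTHS_DAYS, 60 * 60))
```

Under these assumptions, hourly windows over six months number in the thousands while 10-second windows number in the millions, which is why a reasonable window duration depends on the total time range queried.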