fix(sql): Apply suggestions from code review.

pull/5854/head
Jason Stirnaman 2025-02-24 12:44:13 -06:00
parent 6b0265b4cb
commit 01236c6791
1 changed files with 90 additions and 61 deletions

@ -1,10 +1,8 @@
A _window function_ performs an operation across a set of rows related to the Window functions let you calculate running totals, moving averages, or other aggregate-like results without collapsing rows into groups.
current row. This is similar to the type of operations They perform their calculations over a “window” of rows, which you can partition and order in various ways, and return a calculated value for each row in the set.
[aggregate functions](/influxdb3/cloud-dedicated/reference/sql/functions/aggregate/)
perform. However, window functions do not return a single output row per group Unlike non-window [aggregate functions](/influxdb3/version/reference/sql/functions/aggregate/) that combine each group into a single row, window functions preserve each row's identity and calculate an additional value for every row in the partition.
like non-window aggregate functions do. Instead, rows retain their separate
identities.
For example, the following query uses the {{< influxdb3/home-sample-link >}} For example, the following query uses the {{< influxdb3/home-sample-link >}}
and returns each temperature reading with the average temperature per room over and returns each temperature reading with the average temperature per room over
@ -65,9 +63,10 @@ ORDER BY
As window functions operate on a row, there is a set of rows in the row's As window functions operate on a row, there is a set of rows in the row's
partition that the window function uses to perform the operation. This set of partition that the window function uses to perform the operation. This set of
rows is called the _window frame_. Window frame boundaries can be defined using rows is called the _window frame_.
Window frame boundaries can be defined using
`RANGE`, `ROW`, or `GROUPS` frame units, each relative to the current row--for `RANGE`, `ROW`, or `GROUPS` frame units, each relative to the current row--for
exmaple: example:
{{< code-tabs-wrapper >}} {{< code-tabs-wrapper >}}
{{% code-tabs %}} {{% code-tabs %}}
@ -117,11 +116,9 @@ FROM home
_For more information about how window frames work, see the [frame clause](#frame-clause)._ _For more information about how window frames work, see the [frame clause](#frame-clause)._
If window frames are not defined, window functions use all rows in the current If you don't specify window frames, window functions use all rows in the current
partition to perform their operation. partition to perform their operation.
## Window function syntax
```sql ```sql
function([expr]) function([expr])
OVER( OVER(
@ -133,10 +130,10 @@ function([expr])
### OVER clause ### OVER clause
Window functions use an `OVER` clause directly following the window function's Window functions use an `OVER` clause that directly follows the window function's
name and arguments. The `OVER` clause syntactically distinguishes a window name and arguments.
function from a normal function or non-window aggregate function and determines The `OVER` clause syntactically distinguishes a window
how rows are split up for the window operation. function from a non-window or aggregate function and defines how to group and order rows for the window operation.
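As a minimal sketch of the `OVER` clause, the following query turns `avg()` into a window function. It assumes the home sample data referenced earlier, with `time`, `room`, and `temp` columns; the `temp` column name and the `room_avg_temp` alias are assumptions for illustration.

```sql
-- Assumes a `home` table with `time`, `room`, and `temp` columns.
SELECT
  time,
  room,
  temp,
  -- OVER makes avg() a window function: every input row is kept and
  -- receives the average of its own partition (room).
  avg(temp) OVER (PARTITION BY room) AS room_avg_temp
FROM home
ORDER BY room, time
```

Without the `OVER` clause, `avg(temp)` would need a `GROUP BY room` and would return one row per room instead of one row per reading.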
### PARTITION BY clause ### PARTITION BY clause
@ -154,13 +151,13 @@ may be explicit or implicit, limiting a window frame size in both directions
relative to the current row. relative to the current row.
> [!Note] > [!Note]
> The `ORDER BY` clause in an `OVER` clause is separate from the `ORDER BY` > The `ORDER BY` clause in an `OVER` clause determines the processing order for
> clause of the query and only determines the order that rows in each partition > rows in each partition and is separate from the `ORDER BY`
> are processed in. > clause of the query.
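
The difference between the two `ORDER BY` clauses might be sketched as follows, again assuming the home sample data with `time`, `room`, and `temp` columns (the `temp` column and the `temp_position` alias are assumptions):

```sql
-- Assumes a `home` table with `time`, `room`, and `temp` columns.
SELECT
  time,
  room,
  temp,
  -- ORDER BY inside OVER: numbers rows within each room from warmest to coolest.
  row_number() OVER (PARTITION BY room ORDER BY temp DESC) AS temp_position
FROM home
-- ORDER BY of the query: controls only how the final result is sorted.
ORDER BY room, time
```

The result is sorted by room and time, while `temp_position` still reflects the temperature ordering inside each partition.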
### Frame clause ### Frame clause
The frame clause defines window frame boundaries and can be one of the following: The _frame clause_ defines window frame boundaries and can be one of the following:
```sql ```sql
{ RANGE | ROWS | GROUPS } frame_start { RANGE | ROWS | GROUPS } frame_start
@ -196,7 +193,7 @@ the current row value.
> When using `RANGE` frame units, you must include an `ORDER BY` clause with > When using `RANGE` frame units, you must include an `ORDER BY` clause with
> _exactly one column_. > _exactly one column_.
The offset is the difference the between the current row value and surrounding The offset is the difference between the current row value and surrounding
row values. `RANGE` supports the following offset types: row values. `RANGE` supports the following offset types:
- Numeric _(non-negative)_ - Numeric _(non-negative)_
@ -207,7 +204,7 @@ row values. `RANGE` supports the following offset types:
{{% expand "See how `RANGE` frame units work with numeric offsets" %}} {{% expand "See how `RANGE` frame units work with numeric offsets" %}}
To use a numeric offset with the `RANGE` frame unit, you must sort partitions To use a numeric offset with the `RANGE` frame unit, you must sort partitions
by a numeric-typed column. by a numeric-type column.
```sql ```sql
... OVER ( ... OVER (
@ -226,7 +223,7 @@ The window frame includes rows with sort column values between 45 below and
{{% expand "See how `RANGE` frame units work with interval offsets" %}} {{% expand "See how `RANGE` frame units work with interval offsets" %}}
To use an interval offset with the `RANGE` frame unit, you must sort partitions To use an interval offset with the `RANGE` frame unit, you must sort partitions
by `time` or a timestamp-typed column. by `time` or a timestamp-type column.
```sql ```sql
... OVER ( ... OVER (
@ -251,7 +248,7 @@ one hour after the current row's timestamp:
##### ROWS ##### ROWS
Defines frame boundaries using row positions relative to the current row. Defines window frame boundaries using row positions relative to the current row.
The offset is the difference in row position from the current row. The offset is the difference in row position from the current row.
`ROWS` supports the following offset types: `ROWS` supports the following offset types:
@ -279,14 +276,14 @@ The window frame includes the two rows before and the one row after the current
##### GROUPS ##### GROUPS
Defines frame boundaries using row groups. Defines window frame boundaries using row groups.
Rows with the same values for the columns in the [`ORDER BY` clause](#order-by-clause) Rows with the same values for the columns in the [`ORDER BY` clause](#order-by-clause)
comprise a row group. comprise a row group.
> [!Important] > [!Important]
> When using `GROUPS` frame units, you must include an `ORDER BY` clause. > When using `GROUPS` frame units, include an `ORDER BY` clause.
The offset is the difference in row group position relative to the the current row group. The offset is the difference in row group position relative to the current row group.
`GROUPS` supports the following offset types: `GROUPS` supports the following offset types:
- Numeric _(non-negative)_ - Numeric _(non-negative)_
@ -319,8 +316,7 @@ You can then use group offsets to determine frame boundaries:
) )
``` ```
The window function uses all rows in the two row groups before the current The window function uses all rows in the current row group and the two preceding row groups to perform the operation:
row group and the current row group to perform the operation:
{{< sql/window-frame-units "groups with frame" >}} {{< sql/window-frame-units "groups with frame" >}}
@ -330,45 +326,75 @@ row group and the current row group to perform the operation:
#### Frame boundaries #### Frame boundaries
Frame boundaries (**frame_start** and **frame_end**) define the boundaries of Frame boundaries (**frame_start** and **frame_end**) define the boundaries of
each frame the window function operates on. Use the following to define each frame that the window function operates on.
frame boundaries:
```sql - [UNBOUNDED PRECEDING](#unbounded-preceding)
UNBOUNDED PRECEDING - [offset PRECEDING](#offset-preceding)
offset PRECEDING - [CURRENT ROW](#current-row)
CURRENT ROW - [offset FOLLOWING](#offset-following)
offset FOLLOWING - [UNBOUNDED FOLLOWING](#unbounded-following)
UNBOUNDED FOLLOWING
```
##### UNBOUNDED PRECEDING ##### UNBOUNDED PRECEDING
Use the beginning of the partition to the current row as the frame boundary. Starts at the first row of the partition and ends at the current row.
```sql
UNBOUNDED PRECEDING
```
##### offset PRECEDING ##### offset PRECEDING
Use a specified offset of [frame units](#frame-units) _before_ the current row Starts at `offset` [frame units](#frame-units) before the current row and ends at the current row.
as a frame boundary. For example, `3 PRECEDING` includes 3 rows before the current row.
```sql
<offset> PRECEDING
```
##### CURRENT ROW ##### CURRENT ROW
Use the current row as a frame boundary. Both starts and ends at the current row when used as a boundary.
```sql
CURRENT ROW
```
##### offset FOLLOWING
Starts at the current row and ends at `offset` [frame units](#frame-units) after the current row.
For example, `3 FOLLOWING` includes 3 rows after the current row.
```sql
<offset> FOLLOWING
```
##### UNBOUNDED FOLLOWING
Starts at the current row and ends at the last row of the partition.
##### offset FOLLOWING ##### offset FOLLOWING
Use a specified offset of [frame units](#frame-units) _after_ the current row Use a specified offset of [frame units](#frame-units) _after_ the current row
as a frame boundary. as a frame boundary.
```sql
offset FOLLOWING
```
##### UNBOUNDED FOLLOWING ##### UNBOUNDED FOLLOWING
Use the current row to the end of the current partition as the frame boundary. Use the current row to the end of the current partition as the frame boundary.
```sql
UNBOUNDED FOLLOWING
```
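To illustrate how a start and end boundary combine into a frame, here is a hedged sketch that uses `ROWS` frame units with the boundaries described above. It assumes the home sample data with `time`, `room`, and `temp` columns; the column name `temp` and the output aliases are assumptions.

```sql
-- Assumes a `home` table with `time`, `room`, and `temp` columns.
SELECT
  time,
  room,
  temp,
  -- Frame: the two rows before the current row through the current row.
  avg(temp) OVER (
    PARTITION BY room
    ORDER BY time
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS moving_avg,
  -- Frame: the first row of the partition through the current row.
  sum(temp) OVER (
    PARTITION BY room
    ORDER BY time
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS running_total
FROM home
ORDER BY room, time
```

Near the start of a partition, fewer than two preceding rows exist, so the moving average is computed over whatever rows the frame actually contains.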
### WINDOW clause ### WINDOW clause
When a query has multiple window functions that use the same window, rather than Use the `WINDOW` clause to define a reusable alias for a window specification.
writing each with a separate `OVER` clause (which is duplicative and error-prone), This is useful when multiple window functions in your query share the same window definition.
use the `WINDOW` clause to define the window and then reference the window alias
in each `OVER` clause--for example: Instead of repeating the same OVER clause for each function,
define the window once and reference it by alias--for example:
```sql ```sql
SELECT SELECT
@ -400,16 +426,15 @@ can be used as window functions.
Returns the cumulative distribution of a value within a group of values. Returns the cumulative distribution of a value within a group of values.
The returned value is greater than 0 and less than or equal to 1 and represents The returned value is greater than 0 and less than or equal to 1 and represents
the relative rank of the value in the set of values. the relative rank of the value in the set of values.
The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause is used
ranking order. to correctly calculate the cumulative distribution of the current row value.
```sql ```sql
cume_dist() cume_dist()
``` ```
> [!Important] > [!Important]
> `cume_dist` needs an [`ORDER BY` clause](#order-by-clause) in the `OVER` clause > When using `cume_dist`, include an [`ORDER BY` clause](#order-by-clause) in the `OVER` clause.
> to correctly calculate the cumulative distribution of the current row value.
{{< expand-wrapper >}} {{< expand-wrapper >}}
{{% expand "View `cume_dist` query example" %}} {{% expand "View `cume_dist` query example" %}}
@ -451,8 +476,9 @@ WHERE
### dense_rank ### dense_rank
Returns the rank of the current row without gaps. This function ranks rows in a Returns a rank for each row without gaps in the numbering.
dense manner, meaning consecutive ranks are assigned even for identical values. Unlike [rank()](#rank), this function assigns consecutive ranks even when values
are identical.
The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
ranking order. ranking order.
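One way to see the "no gaps" behavior is to place `rank()` and `dense_rank()` side by side. This is a sketch that assumes the home sample data with `time`, `room`, and `temp` columns; the `temp` column and both aliases are assumptions.

```sql
-- Assumes a `home` table with `time`, `room`, and `temp` columns.
SELECT
  time,
  room,
  temp,
  -- rank(): tied rows share a rank and the following ranks are skipped.
  rank() OVER (PARTITION BY room ORDER BY temp) AS temp_rank,
  -- dense_rank(): tied rows share a rank and the next rank is consecutive.
  dense_rank() OVER (PARTITION BY room ORDER BY temp) AS temp_dense_rank
FROM home
ORDER BY room, temp
```

When two rows tie for the lowest temperature in a room, both functions return 1 for those rows, but the next distinct temperature gets 3 from `rank()` and 2 from `dense_rank()`.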
@ -500,9 +526,9 @@ WHERE
### ntile ### ntile
Distributes the rows in an ordered partition into a specified number of groups. Distributes the rows in an ordered partition into the specified number of groups.
Each group is numbered, starting at one. For each row, `ntile` returns the Each group is numbered, starting at one.
group number to which the row belongs. For each row, `ntile` returns the group number to which the row belongs.
Group numbers range from 1 to the `expression` value, dividing the partition as Group numbers range from 1 to the `expression` value, dividing the partition as
equally as possible. equally as possible.
The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
@ -514,8 +540,7 @@ ntile(expression)
##### Arguments ##### Arguments
- **expression**: An integer describing the number groups to split the partition - **expression**: An integer. The number of groups to split the partition into.
into.
{{< expand-wrapper >}} {{< expand-wrapper >}}
{{% expand "View `ntile` query example" %}} {{% expand "View `ntile` query example" %}}
@ -556,10 +581,14 @@ WHERE
### percent_rank ### percent_rank
Returns the percentage rank of the current row within its partition. Returns the percentage rank of the current row within its partition.
The returned value is between `0` and `1` and is computed as The returned value is between `0` and `1`, computed as:
`(rank - 1) / (total_rows - 1)`.
```
(rank - 1) / (total_rows - 1)
```
The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
ranking order. the ranking order.
```sql ```sql
percent_rank() percent_rank()
@ -760,7 +789,7 @@ ORDER BY room, time
### lag ### lag
Returns the value from the row that is at the specified offset before the Returns the value from the row that is at the specified offset before the
current row in the partition. If the offset row is outside of the partition, current row in the partition. If the offset row is outside the partition,
the function returns the specified default. the function returns the specified default.
```sql ```sql
@ -876,7 +905,7 @@ ORDER BY room, time
### lead ### lead
Returns the value from the row that is at the specified offset after the Returns the value from the row that is at the specified offset after the
current row in the partition. If the offset row is outside of the partition, current row in the partition. If the offset row is outside the partition,
the function returns the specified default. the function returns the specified default.
```sql ```sql