From 5fc7e995753ab1f6e458acf0187a04dd7e4a0526 Mon Sep 17 00:00:00 2001 From: Scott Anderson Date: Wed, 19 Feb 2025 10:56:49 -0700 Subject: [PATCH 01/27] WIP SQL window functions --- .../reference/sql/functions/window.md | 342 ++++++++++++++++++ .../influxdb3/home-sample-link.html | 8 + 2 files changed, 350 insertions(+) create mode 100644 content/influxdb3/cloud-dedicated/reference/sql/functions/window.md create mode 100644 layouts/shortcodes/influxdb3/home-sample-link.html diff --git a/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md b/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md new file mode 100644 index 000000000..eb7b11580 --- /dev/null +++ b/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md @@ -0,0 +1,342 @@ +--- +title: SQL window functions +list_title: Window functions +description: > + .... +menu: + influxdb3_cloud_dedicated: + name: Window + parent: sql-functions +weight: 309 +related: + - /influxdb3/cloud-dedicated/query-data/sql/aggregate-select/ + +# source: /content/shared/sql-reference/functions/aggregate.md +--- + +A _window function_ performs an operation across a set of rows related to the +current row. This is similar to the type of operations +[aggregate functions](/influxdb3/cloud-dedicated/reference/sql/functions/aggregate/) +perform. However, window functions do not return a single output row per group +like non-window aggregate functions do. Instead, rows retain their separate +identities. 
+ +For example, the following query uses the {{< influxdb3/home-sample-link >}} +and returns each temperature reading with the average temperature per room over +the queried time range: + +```sql +SELECT + time, + room, + temp, + avg(temp) OVER (PARTITION BY room) AS avg_room_temp +FROM + home +WHERE + time >= '2022-01-01T08:00:00Z' + AND time <= '2022-01-01T12:00:00Z' +ORDER BY + room, + time +``` + +| time | room | temp | avg_room_temp | +| :------------------ | :---------- | ---: | ------------: | +| 2022-01-01T08:00:00 | Kitchen | 21.0 | 22.32 | +| 2022-01-01T09:00:00 | Kitchen | 23.0 | 22.32 | +| 2022-01-01T10:00:00 | Kitchen | 22.7 | 22.32 | +| 2022-01-01T11:00:00 | Kitchen | 22.4 | 22.32 | +| 2022-01-01T12:00:00 | Kitchen | 22.5 | 22.32 | +| 2022-01-01T08:00:00 | Living Room | 21.1 | 21.74 | +| 2022-01-01T09:00:00 | Living Room | 21.4 | 21.74 | +| 2022-01-01T10:00:00 | Living Room | 21.8 | 21.74 | +| 2022-01-01T11:00:00 | Living Room | 22.2 | 21.74 | +| 2022-01-01T12:00:00 | Living Room | 22.2 | 21.74 | + +## Window function syntax + +```sql +function([expr]) + OVER( + [PARTITION BY expr[, …]] + [ORDER BY expr [ ASC | DESC ][, …]] + [ frame_clause ] + ) +``` + +### OVER clause + +Window functions use an `OVER` clause directly following the window function's +name and arguments. The `OVER` clause syntactically distinguishes a window +function from a normal function or non-window aggregate function and determines +how rows are split up for the window operation. + +### PARTITION BY clause + +The `PARTITION BY` clause in the `OVER` clause divides the rows into groups, or +partitions, that share the same values of the `PARTITION BY` expressions. +The window function operates on all the rows in the same partition as the +current row. + +### ORDER BY clause + +The `ORDER BY` clause inside of the `OVER` clause controls the order that the +window function processors rows in each partition. 
+ +> [!Note] +> The `ORDER BY` clause in an `OVER` clause is separate from the `ORDER BY` +> clause of the query and only determines the order that rows in each partition +> are processed in. + +### Frame clause + +The frame clause can be one of the following: + +```sql +{ RANGE | ROWS | GROUPS } frame_start +{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end +``` + +and **frame_start** and **frame_end** can be one of + +```sql +UNBOUNDED PRECEDING +offset PRECEDING +CURRENT ROW +offset FOLLOWING +UNBOUNDED FOLLOWING +``` + +where **offset** is an non-negative integer. + +`RANGE` and `GROUPS` modes require an `ORDER BY` clause (with `RANGE` the `ORDER BY` must +specify exactly one column). + +#### Framing modes + +##### RANGE + +##### ROWS + +##### GROUPs + + + +```sql +SELECT depname, empno, salary, + rank() OVER (PARTITION BY depname ORDER BY salary DESC) +FROM empsalary; + ++-----------+-------+--------+--------+ +| depname | empno | salary | rank | ++-----------+-------+--------+--------+ +| personnel | 2 | 3900 | 1 | +| develop | 8 | 6000 | 1 | +| develop | 10 | 5200 | 2 | +| develop | 11 | 5200 | 2 | +| develop | 9 | 4500 | 4 | +| develop | 7 | 4200 | 5 | +| sales | 1 | 5000 | 1 | +| sales | 4 | 4800 | 2 | +| personnel | 5 | 3500 | 2 | +| sales | 3 | 4800 | 2 | ++-----------+-------+--------+--------+ +``` + +There is another important concept associated with window functions: for each +row, there is a set of rows within its partition called its window frame. Some +window functions act only on the rows of the window frame, rather than of the +whole partition. 
Here is an example of using window frames in queries: + +```sql +SELECT depname, empno, salary, + avg(salary) OVER(ORDER BY salary ASC ROWS BETWEEN 1 PRECEDING AND 1 +FOLLOWING) AS avg, + min(salary) OVER(ORDER BY empno ASC ROWS BETWEEN UNBOUNDED PRECEDING AND +CURRENT ROW) AS cum_min +FROM empsalary +ORDER BY empno ASC; + ++-----------+-------+--------+--------------------+---------+ +| depname | empno | salary | avg | cum_min | ++-----------+-------+--------+--------------------+---------+ +| sales | 1 | 5000 | 5000.0 | 5000 | +| personnel | 2 | 3900 | 3866.6666666666665 | 3900 | +| sales | 3 | 4800 | 4700.0 | 3900 | +| sales | 4 | 4800 | 4866.666666666667 | 3900 | +| personnel | 5 | 3500 | 3700.0 | 3500 | +| develop | 7 | 4200 | 4200.0 | 3500 | +| develop | 8 | 6000 | 5600.0 | 3500 | +| develop | 9 | 4500 | 4500.0 | 3500 | +| develop | 10 | 5200 | 5133.333333333333 | 3500 | +| develop | 11 | 5200 | 5466.666666666667 | 3500 | ++-----------+-------+--------+--------------------+---------+ +``` + +When a query involves multiple window functions, it is possible to write out +each one with a separate OVER clause, but this is duplicative and error-prone if +the same windowing behavior is wanted for several functions. Instead, each +windowing behavior can be named in a WINDOW clause and then referenced in OVER. +For example: + +```sql +SELECT sum(salary) OVER w, avg(salary) OVER w +FROM empsalary +WINDOW w AS (PARTITION BY depname ORDER BY salary DESC); +``` + +## Aggregate functions + +All [aggregate functions](/influxdb3/cloud-dedicated/reference/sql/functions/aggregate/) +can be used as window functions. + +## Ranking Functions + +- [cume_dist](#cume_dist) +- [dense_rank](#dense_rank) +- [ntile](#ntile) +- [percent_rank](#percent_rank) +- [rank](#rank) +- [row_number](#row_number) + +### `cume_dist` + +Relative rank of the current row: (number of rows preceding or peer with current +row) / (total rows). 

```sql
cume_dist()
```

### `dense_rank`

Returns the rank of the current row without gaps. This function ranks rows in a
dense manner, meaning consecutive ranks are assigned even for identical values.

```sql
dense_rank()
```

### `ntile`

Integer ranging from 1 to the argument value, dividing the partition as equally
as possible.

```sql
ntile(expression)
```

#### Arguments

- **expression**: An integer describing the number of groups the partition
  should be split into.

### `percent_rank`

Returns the percentage rank of the current row within its partition. The value
ranges from 0 to 1 and is computed as `(rank - 1) / (total_rows - 1)`.

```sql
percent_rank()
```

### `rank`

Returns the rank of the current row within its partition, allowing gaps between
ranks. This function provides a ranking similar to `row_number`, but skips ranks
for identical values.

```sql
rank()
```

### `row_number`

Number of the current row within its partition, counting from 1.

```sql
row_number()
```

## Analytical Functions

- [first_value](#first_value)
- [lag](#lag)
- [last_value](#last_value)
- [lead](#lead)
- [nth_value](#nth_value)

### `first_value`

Returns value evaluated at the row that is the first row of the window frame.

```sql
first_value(expression)
```

#### Arguments

- **expression**: Expression to operate on.

### `lag`

Returns value evaluated at the row that is offset rows before the current row
within the partition; if there is no such row, instead returns default (which
must be of the same type as value).

```sql
lag(expression, offset, default)
```

#### Arguments

- **expression**: Expression to operate on.
- **offset**: Integer. Specifies how many rows back the value of expression
  should be retrieved. Defaults to 1.
- **default**: The default value if the offset is not within the partition. Must
  be of the same type as expression.
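`lag` follows standard SQL window-function semantics, so its offset and default
arguments can be explored with any engine that implements them. The following
sketch uses Python's bundled SQLite (3.25 or later) purely as a stand-in
engine; the home-sensor rows are illustrative, not real sample data:

```python
import sqlite3

# In-memory table loosely mirroring the home-sensor example above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE home (time INTEGER, room TEXT, temp REAL)")
con.executemany(
    "INSERT INTO home VALUES (?, ?, ?)",
    [(8, "Kitchen", 21.0), (9, "Kitchen", 23.0), (10, "Kitchen", 22.7)],
)

# For each row, fetch the previous temperature in the same room.
# The first row has no predecessor, so the default (0.0) is returned.
rows = con.execute("""
    SELECT time, temp,
           lag(temp, 1, 0.0) OVER (PARTITION BY room ORDER BY time) AS prev_temp
    FROM home
    ORDER BY time
""").fetchall()

print(rows)
```

Each output row pairs a reading with the reading one row earlier in the same
partition, which is the usual building block for delta calculations.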

### `last_value`

Returns value evaluated at the row that is the last row of the window frame.

```sql
last_value(expression)
```

#### Arguments

- **expression**: Expression to operate on.

### `lead`

Returns value evaluated at the row that is offset rows after the current row
within the partition; if there is no such row, instead returns default (which
must be of the same type as value).

```sql
lead(expression, offset, default)
```

#### Arguments

- **expression**: Expression to operate on.
- **offset**: Integer. Specifies how many rows forward the value of expression
  should be retrieved. Defaults to 1.
- **default**: The default value if the offset is not within the partition. Must
  be of the same type as expression.

### `nth_value`

Returns value evaluated at the row that is the nth row of the window frame
(counting from 1); null if no such row.

```sql
nth_value(expression, n)
```

#### Arguments

- **expression**: The column from which to retrieve the nth value.
- **n**: Integer. Specifies the n in nth.
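The window frame matters for these functions: with an `ORDER BY` and no
explicit frame clause, the frame ends at the current row, so `nth_value`
returns null until the frame has grown to include the nth row, and
`last_value` needs an explicit frame to see the whole partition. A runnable
sketch of this behavior, using Python's bundled SQLite (3.25 or later) as a
stand-in engine and a tiny `empsalary` table modeled on the earlier examples:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE empsalary (depname TEXT, empno INTEGER, salary INTEGER)")
con.executemany(
    "INSERT INTO empsalary VALUES (?, ?, ?)",
    [("sales", 1, 5000), ("sales", 3, 4800), ("sales", 4, 4800)],
)

# Window `w` has an ORDER BY but no frame clause, so its implicit frame is
# RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. `last_value` is given an
# explicit frame so it sees the whole partition instead of only the current row.
rows = con.execute("""
    SELECT empno,
           lead(salary, 1, 0) OVER w   AS next_salary,
           nth_value(salary, 2) OVER w AS second_salary,
           last_value(salary) OVER (PARTITION BY depname ORDER BY empno
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_salary
    FROM empsalary
    WINDOW w AS (PARTITION BY depname ORDER BY empno)
    ORDER BY empno
""").fetchall()

print(rows)
```

In the first output row, `second_salary` is `None` because the implicit frame
contains only one row at that point; `lead` in the final row falls back to its
default of 0.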
diff --git a/layouts/shortcodes/influxdb3/home-sample-link.html b/layouts/shortcodes/influxdb3/home-sample-link.html new file mode 100644 index 000000000..32977b266 --- /dev/null +++ b/layouts/shortcodes/influxdb3/home-sample-link.html @@ -0,0 +1,8 @@ +{{- $productPathData := split .Page.RelPermalink "/" -}}
+{{- $product := index $productPathData 2 -}}
+{{- $isDistributed := in (slice "cloud-dedicated" "cloud-serverless" "clustered") $product -}}
+{{- if $isDistributed -}}
+Get started home sensor sample data
+{{- else -}}
+Home sensor sample data
+{{- end -}}
From d1e44a4044a3128145b091ca19292167137558ff Mon Sep 17 00:00:00 2001 From: Peter Barnett Date: Wed, 19 Feb 2025 21:23:23 -0500 Subject: [PATCH 02/27] update: fix some wording and hierarchy --- content/shared/v3-core-plugins/_index.md | 73 +++++++++++++----------- 1 file changed, 40 insertions(+), 33 deletions(-)

diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index db1e4456a..84d138050 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -1,11 +1,16 @@ Use the {{% product-name %}} Processing Engine to run code and perform tasks
for different database events.

{{% product-name %}} provides the InfluxDB 3 Processing Engine, an embedded Python VM that can dynamically load and trigger Python plugins
in response to events in your database.


## Key Concepts

### Plugins

A Processing Engine _plugin_ is Python code you provide to run tasks, such as
downsampling data, monitoring, creating alerts, or calling external services.

> [!Note]
> #### Contribute and use community plugins
>
> and contribute example plugins.
> You can reference plugins from the repository directly within a trigger configuration. -A Processing engine _plugin_ is Python code you provide to run tasks, such as -downsampling data, monitoring, creating alerts, or calling external services. - -## Triggers +### Triggers A _trigger_ is an InfluxDB 3 resource you create to associate a database event (for example, a WAL flush) with the plugin that should run. -When an event occurs, the trigger passes configuration, optional arguments, and event data to the plugin. +When an event occurs, the trigger passes configuration details, optional arguments, and event data to the plugin. -The Processing engine provides four types of plugins and triggers--each type corresponds to an event type with event-specific configuration to let you handle events with targeted logic. +The Processing Engine provides four types of triggers—each type corresponds to an event type with event-specific configuration to let you handle events with targeted logic. -- **WAL flush**: Triggered when the write-ahead log (WAL) is flushed to the object store (default is every second) -- **Parquet persistence (coming soon)**: Triggered when InfluxDB 3 persists data to object store Parquet files -- **Scheduled tasks**: Triggered on a schedule you specify using cron syntax -- **On Request**: Bound to the HTTP API `/api/v3/engine/` endpoint and triggered by a GET or POST request to the endpoint. +- **WAL Flush**: Triggered when the write-ahead log (WAL) is flushed to the object store (default is every second). +- **Scheduled Tasks**: Triggered on a schedule you specify using cron syntax. +- **On Request**: Triggered on a GET or POST request to the bound HTTP API endpoint at `/api/v3/engine/`. +- **Parquet Persistence (coming soon)**: Triggered when InfluxDB 3 persists data to object storage Parquet files. 
-## Activate the Processing engine +### Activate the Processing Engine -To enable the Processing engine, start the {{% product-name %}} server with the `--plugin-dir` option and a path to your plugins directory (it doesn't need to exist yet)--for example: +To enable the Processing Engine, start the {{% product-name %}} server with the `--plugin-dir` option and a path to your plugins directory. If the directory doesn’t exist, the server creates it. ```bash -influxdb3 serve --node-id node0 --plugin-dir /path/to/plugins +influxdb3 serve --node-id node0 --object-store [OBJECT STORE TYPE] --plugin-dir /path/to/plugins ``` -## Shared API + + +## The Shared API All plugin types provide the InfluxDB 3 _shared API_ for interacting with the database. The shared API provides access to the following: @@ -47,7 +51,7 @@ The shared API provides access to the following: - `query` to query data from any database - `info`, `warn`, and `error` to log messages to the database log, which is output in the server logs and captured in system tables queryable by SQL -### Line builder +### LineBuilder The `LineBuilder` is a simple API for building lines of Line Protocol to write into the database. Writes are buffered while the plugin runs and are flushed when the plugin completes. The `LineBuilder` API is available in all plugin types. @@ -198,12 +202,12 @@ influxdb3_local.query("SELECT * from foo where bar = $bar and time > now() - 'in ### Logging The shared API `info`, `warn`, and `error` functions log messages to the database log, which is output in the server logs and captured in system tables queryable by SQL. -The `info`, `warn`, and `error` functions are available in all plugin types. The functions take an arbitrary number of arguments, convert them to strings, and then join them into a single message separated by a space. +The `info`, `warn`, and `error` functions are available in all plugin types. 
Each function accepts multiple arguments, converts them to strings, and logs them as a single, space-separated message.

The following examples show how to use the `info`, `warn`, and `error` logging functions:

```python
influxdb3_local.info("This is an info message")
influxdb3_local.warn("This is a warning message")
influxdb3_local.error("This is an error message")

obj_to_log = {"hello": "world"}
influxdb3_local.info("This is an info message with an object", obj_to_log)
```

### Trigger Arguments

Every plugin type can receive arguments from the configuration of the trigger that runs it.
You can use this to provide runtime configuration and drive behavior of a plugin—for example:

- threshold values for monitoring
- connection properties for connecting to third-party services

```python
def process_writes(influxdb3_local, table_batches, args=None):
    if args and "threshold" in args:
        threshold = float(args["threshold"])
        influxdb3_local.info(f"Threshold is {threshold}")
    else:
        influxdb3_local.warn("No threshold provided")
```

The `args` parameter is optional. If a plugin doesn’t require arguments, you can omit it from the trigger definition.

## Import Plugin Dependencies

Use the `influxdb3 install` command to download and install Python packages that your plugin depends on.

```bash
influxdb3 install package
```

The result is an active Python virtual environment with the package installed in `/.venv`.
You can specify additional options to install dependencies from a `requirements.txt` file or a custom virtual environment path.
For more information, see the `influxdb3` CLI help: ```bash influxdb3 install package --help ``` -## WAL flush plugin +## Trigger Types and How They Work +Triggers define when and how plugins execute in response to database events. Each trigger type corresponds to a specific event, allowing precise control over automation within {{% product-name %}}. + +### WAL Flush Trigger When a WAL flush plugin is triggered, the plugin receives a list of `table_batches` filtered by the trigger configuration (either _all tables_ in the database or a specific table). @@ -302,7 +309,7 @@ def process_writes(influxdb3_local, table_batches, args=None): influxdb3_local.info("wal_plugin.py done") ``` -### WAL flush trigger Configuration +#### WAL flush trigger configuration When you create a trigger, you associate it with a database and provide configuration specific to the trigger type. @@ -330,7 +337,7 @@ For more information about trigger arguments, see the CLI help: influxdb3 create trigger help ``` -## Schedule Plugin +### Schedule Trigger Schedule plugins run on a schedule specified in cron syntax. The plugin will receive the local API, the time of the trigger, and any arguments passed in the trigger definition. Here's an example of a simple schedule plugin: @@ -347,7 +354,7 @@ def process_scheduled_call(influxdb3_local, time, args=None): influxdb3_local.error("No table_name provided for schedule plugin") ``` -### Schedule Trigger Configuration +#### Schedule trigger configuration Schedule plugins are set with a `trigger-spec` of `schedule:` or `every:`. The `args` parameter can be used to pass configuration to the plugin. 
For example, if we wanted to use the system-metrics example from the Github repo and have it collect every 10 seconds we could use the following trigger definition: @@ -358,7 +365,7 @@ influxdb3 create trigger \ --database mydb system-metrics ``` -## On Request Plugin +### On Request Trigger On Request plugins are triggered by a request to a specific endpoint under `/api/v3/engine`. The plugin will receive the local API, query parameters `Dict[str, str]`, request headers `Dict[str, str]`, request body (as bytes), and any arguments passed in the trigger definition. Here's an example of a simple On Request plugin: @@ -385,7 +392,7 @@ def process_request(influxdb3_local, query_parameters, request_headers, request_ return 200, {"Content-Type": "application/json"}, json.dumps({"status": "ok", "line": line_str}) ``` -### On Request Trigger Configuration +#### On request trigger configuration On Request plugins are set with a `trigger-spec` of `request:`. The `args` parameter can be used to pass configuration to the plugin. For example, if we wanted the above plugin to run on the endpoint `/api/v3/engine/my_plugin`, we would use `request:my_plugin` as the `trigger-spec`. 
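Before wiring a trigger, it can help to exercise a plugin function locally.
The sketch below is a hypothetical test harness, not part of the `influxdb3`
API: `FakeLocal` mimics only the `info` logging call of the shared API, and
this `process_request` body is a trivial stand-in for your own plugin code:

```python
import json

# Hypothetical stand-in for the `influxdb3_local` shared API; it implements
# only the `info` call used by the plugin below.
class FakeLocal:
    def __init__(self):
        self.logs = []

    def info(self, *args):
        # Mirror the documented behavior: join arguments into one message.
        self.logs.append(" ".join(str(a) for a in args))

# A trivial On Request plugin body (illustrative only).
def process_request(influxdb3_local, query_parameters, request_headers,
                    request_body, args=None):
    influxdb3_local.info("request received", query_parameters)
    data = json.loads(request_body) if request_body else {}
    return 200, {"Content-Type": "application/json"}, json.dumps(
        {"status": "ok", "echo": data}
    )

local = FakeLocal()
status, headers, body = process_request(local, {"q": "1"}, {}, b'{"hello": "world"}')
print(status, body)
```

Calling the function directly like this lets you check the returned status,
headers, and body before creating the trigger and hitting the real endpoint.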
@@ -394,6 +401,6 @@ Trigger specs must be unique across all configured plugins, regardless of which ```shell influxdb3 create trigger \ --trigger-spec "request:hello-world" \ - --plugin-filename "hellp/hello_world.py" \ + --plugin-filename "hello/hello_world.py" \ --database mydb hello-world ``` From 58b7b4b7829c985781debe6e6c9af19d41467f5f Mon Sep 17 00:00:00 2001 From: Scott Anderson Date: Thu, 20 Feb 2025 09:32:12 -0700 Subject: [PATCH 03/27] WIP sql window functions --- .../reference/sql/functions/window.md | 61 ++++++++++++++++--- 1 file changed, 52 insertions(+), 9 deletions(-) diff --git a/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md b/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md index eb7b11580..64023a552 100644 --- a/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md +++ b/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md @@ -82,7 +82,10 @@ current row. ### ORDER BY clause The `ORDER BY` clause inside of the `OVER` clause controls the order that the -window function processors rows in each partition. +window function processes rows in each partition. +When a window clause contains an `ORDER BY` clause, the window frame boundaries +may be explicit or implicit, limiting a window frame size in both directions +relative to the current row. > [!Note] > The `ORDER BY` clause in an `OVER` clause is separate from the `ORDER BY` @@ -91,14 +94,49 @@ window function processors rows in each partition. 

### Frame clause

The frame clause defines window frame boundaries relative to the current row and
can be one of the following:

```sql
{ RANGE | ROWS | GROUPS } frame_start
{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end
```

#### Frame units

##### RANGE

Defines frame boundaries using rows with distinct values for columns specified
in the [`ORDER BY` clause](#order-by-clause) within a value range relative to
the current row value.

> [!Important]
> When using `RANGE` frame units, you must include an `ORDER BY` clause with
> _exactly one column_.

The offset is the difference between the current row value and surrounding
row values. `RANGE` supports the following offset types:

- Numeric
- String
- Interval

##### ROWS

Defines frame boundaries using row positions relative to the current row.
The offset is the difference in row position from the current row.

##### GROUPS

Defines frame boundaries using row groups.
Rows with the same values for the columns in the [`ORDER BY` clause](#order-by-clause)
comprise a row group. The offset is the difference in row group position
relative to the current row group.
When using `GROUPS` frame units, you must include an `ORDER BY` clause.

#### Frame boundaries

**frame_start** and **frame_end** can be one of the following:

```sql
UNBOUNDED PRECEDING
offset PRECEDING
CURRENT ROW
offset FOLLOWING
UNBOUNDED FOLLOWING
```

##### UNBOUNDED PRECEDING


##### offset PRECEDING

where **offset** is a non-negative integer.


##### CURRENT ROW



##### offset FOLLOWING

where **offset** is a non-negative integer.
+ +##### UNBOUNDED FOLLOWING -##### GROUPs From 0e7d23dbac33dced16e0e36cb0b9d7c735865f36 Mon Sep 17 00:00:00 2001 From: Scott Anderson Date: Fri, 21 Feb 2025 14:11:32 -0700 Subject: [PATCH 04/27] add sql window functions and new shortcodes --- assets/styles/layouts/_article.scss | 1 + .../layouts/article/_html-diagrams.scss | 238 +++++ .../reference/sql/functions/window.md | 379 +------ .../reference/sql/functions/window.md | 18 + .../reference/sql/functions/window.md | 18 + .../core/reference/sql/functions/window.md | 18 + .../reference/sql/functions/window.md | 18 + .../shared/sql-reference/functions/window.md | 986 ++++++++++++++++++ .../shortcodes/sql/window-frame-units..html | 335 ++++++ 9 files changed, 1638 insertions(+), 373 deletions(-) create mode 100644 content/influxdb3/cloud-serverless/reference/sql/functions/window.md create mode 100644 content/influxdb3/clustered/reference/sql/functions/window.md create mode 100644 content/influxdb3/core/reference/sql/functions/window.md create mode 100644 content/influxdb3/enterprise/reference/sql/functions/window.md create mode 100644 content/shared/sql-reference/functions/window.md create mode 100644 layouts/shortcodes/sql/window-frame-units..html diff --git a/assets/styles/layouts/_article.scss b/assets/styles/layouts/_article.scss index 07945f489..1d5a68b3e 100644 --- a/assets/styles/layouts/_article.scss +++ b/assets/styles/layouts/_article.scss @@ -244,6 +244,7 @@ &.blue {color: $b-dodger;} &.green {color: $gr-viridian;} &.magenta {color: $p-comet;} + &.pink {color: $br-new-magenta;} } h2, diff --git a/assets/styles/layouts/article/_html-diagrams.scss b/assets/styles/layouts/article/_html-diagrams.scss index edd7240a0..2d60e71b7 100644 --- a/assets/styles/layouts/article/_html-diagrams.scss +++ b/assets/styles/layouts/article/_html-diagrams.scss @@ -904,6 +904,244 @@ table tr.point{ } } +//////////////////////// SQL WINDOW FRAME UNITS EXAMPLES /////////////////////// + +table.window-frame-units { + &.groups { 
+ .group { + position: relative; + outline-style: solid; + outline-width: 3px; + outline-offset: -5px; + border-radius: 10px; + + &::before { + content: "Row Group"; + display: block; + padding: .25rem .5rem; + position: absolute; + top: 3px; + left: 3px; + border-radius: 4px; + color: #fff; + font-size: .8rem; + font-weight: bold; + text-transform: uppercase; + letter-spacing: .02em; + box-shadow: 4px 4px 4px $article-bg; + } + + td:nth-child(2), td:nth-child(3) { + font-weight: bold; + text-decoration: underline; + text-decoration-thickness: 2px; + text-underline-offset: 5px; + } + + &:nth-of-type(1) { + &::before {background: $br-new-magenta;} + outline-color: $br-new-magenta; + td:nth-child(2), td:nth-child(3) { + text-decoration-color: $br-new-magenta; + } + } + &:nth-of-type(2) { + &::before {background: $br-new-purple;} + outline-color: $br-new-purple; + td:nth-child(2), td:nth-child(3) { + text-decoration-color: $br-new-purple; + } + } + &:nth-of-type(3) { + &::before {background: $b-dodger;} + outline-color: $b-dodger; + td:nth-child(2), td:nth-child(3) { + text-decoration-color: $b-dodger; + } + } + &:nth-of-type(4) { + &::before {background: $b-sapphire;} + outline-color: $b-sapphire; + td:nth-child(2), td:nth-child(3) { + text-decoration-color: $b-sapphire; + } + } + } + } + + &.groups-with-frame { + .frame, tr.current-row { + position: relative; + outline-style: solid; + outline-width: 3px; + outline-offset: -5px; + border-radius: 10px; + + &::after { + display: block; + padding: .25rem .5rem; + position: absolute; + top: 3px; + left: 3px; + border-radius: 4px; + color: #fff; + font-size: .8rem; + font-weight: bold; + text-transform: uppercase; + letter-spacing: .02em; + box-shadow: 4px 4px 4px $article-bg; + } + + tr:nth-child(n + 1):nth-child(-n + 3) { + td {text-decoration-color: $br-new-magenta;} + } + tr:nth-child(n + 4):nth-child(-n + 6) { + td {text-decoration-color: $br-magenta;} + } + tr:nth-child(n + 7):nth-child(-n + 8) { + td 
{text-decoration-color: $b-dodger;} + } + + td:nth-child(n + 2):nth-child(-n + 3) { + font-weight: bold; + text-decoration: underline; + text-decoration-thickness: 2px; + text-underline-offset: 5px; + } + } + tr.current-row { + outline-color: $br-new-magenta; + &::after { + content: "Current Row"; + background: $br-new-magenta; + } + td {text-decoration-color: $b-dodger !important;} + } + + .frame { + outline-color: $br-new-purple; + &::after { + content: "Frame"; + background: $br-new-purple; + } + } + .group { + position: relative; + outline-color: $b-sapphire; + td:nth-child(2), td:nth-child(3) { + font-weight: bold; + text-decoration: underline; + text-decoration-thickness: 2px; + text-underline-offset: 5px; + text-decoration-color: $b-sapphire; + } + } + } + + &.range-interval { + .frame, tr.current-row { + position: relative; + outline-style: solid; + outline-width: 3px; + outline-offset: -5px; + border-radius: 10px; + + td:first-child { + font-weight: bold; + text-decoration: underline; + text-decoration-thickness: 2px; + text-underline-offset: 5px; + text-decoration-color: $br-new-purple; + } + &::after { + display: block; + padding: .25rem .5rem; + position: absolute; + top: 3px; + right: 3px; + border-radius: 4px; + color: #fff; + font-size: .8rem; + font-weight: bold; + text-transform: uppercase; + letter-spacing: .02em; + box-shadow: -4px 4px 4px $article-bg; + } + } + tr.current-row { + outline-color: $br-new-magenta; + td:first-child {text-decoration-color: $br-new-magenta;} + &::after { + content: "Current Row"; + background: $br-new-magenta; + box-shadow: -4px 4px 4px $article-table-row-alt; + } + } + + .frame { + outline-color: $br-new-purple; + &::after { + content: "Frame"; + background: $br-new-purple; + } + } + } + + &.range-numeric, &.rows { + .frame, tr.current-row { + position: relative; + outline-style: solid; + outline-width: 3px; + outline-offset: -5px; + border-radius: 10px; + + &::after { + display: block; + padding: .25rem .5rem; + 
position: absolute; + top: 3px; + left: 3px; + border-radius: 4px; + color: #fff; + font-size: .8rem; + font-weight: bold; + text-transform: uppercase; + letter-spacing: .02em; + box-shadow: 4px 4px 4px $article-bg; + } + } + tr.current-row { + outline-color: $br-new-magenta; + &::after { + content: "Current Row"; + background: $br-new-magenta; + } + } + + .frame { + outline-color: $br-new-purple; + &::after { + content: "Frame"; + background: $br-new-purple; + } + } + } + &.range-numeric { + .frame { + td:nth-child(3) { + font-weight: bold; + text-decoration: underline; + text-decoration-thickness: 2px; + text-underline-offset: 5px; + text-decoration-color: $br-new-purple; + } + tr.current-row { + td:nth-child(3) {text-decoration-color: $br-new-magenta;} + } + } + } +} + //////////////////////////////////////////////////////////////////////////////// ///////////////////////////////// MEDIA QUERIES //////////////////////////////// //////////////////////////////////////////////////////////////////////////////// diff --git a/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md b/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md index 64023a552..b2f34e937 100644 --- a/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md +++ b/content/influxdb3/cloud-dedicated/reference/sql/functions/window.md @@ -2,384 +2,17 @@ title: SQL window functions list_title: Window functions description: > - .... + SQL window functions perform an operation across a set of rows related to the + current row. menu: influxdb3_cloud_dedicated: name: Window parent: sql-functions weight: 309 -related: - - /influxdb3/cloud-dedicated/query-data/sql/aggregate-select/ -# source: /content/shared/sql-reference/functions/aggregate.md +source: /shared/sql-reference/functions/window.md --- -A _window function_ performs an operation across a set of rows related to the -current row. 
This is similar to the type of operations -[aggregate functions](/influxdb3/cloud-dedicated/reference/sql/functions/aggregate/) -perform. However, window functions do not return a single output row per group -like non-window aggregate functions do. Instead, rows retain their separate -identities. - -For example, the following query uses the {{< influxdb3/home-sample-link >}} -and returns each temperature reading with the average temperature per room over -the queried time range: - -```sql -SELECT - time, - room, - temp, - avg(temp) OVER (PARTITION BY room) AS avg_room_temp -FROM - home -WHERE - time >= '2022-01-01T08:00:00Z' - AND time <= '2022-01-01T12:00:00Z' -ORDER BY - room, - time -``` - -| time | room | temp | avg_room_temp | -| :------------------ | :---------- | ---: | ------------: | -| 2022-01-01T08:00:00 | Kitchen | 21.0 | 22.32 | -| 2022-01-01T09:00:00 | Kitchen | 23.0 | 22.32 | -| 2022-01-01T10:00:00 | Kitchen | 22.7 | 22.32 | -| 2022-01-01T11:00:00 | Kitchen | 22.4 | 22.32 | -| 2022-01-01T12:00:00 | Kitchen | 22.5 | 22.32 | -| 2022-01-01T08:00:00 | Living Room | 21.1 | 21.74 | -| 2022-01-01T09:00:00 | Living Room | 21.4 | 21.74 | -| 2022-01-01T10:00:00 | Living Room | 21.8 | 21.74 | -| 2022-01-01T11:00:00 | Living Room | 22.2 | 21.74 | -| 2022-01-01T12:00:00 | Living Room | 22.2 | 21.74 | - -## Window function syntax - -```sql -function([expr]) - OVER( - [PARTITION BY expr[, …]] - [ORDER BY expr [ ASC | DESC ][, …]] - [ frame_clause ] - ) -``` - -### OVER clause - -Window functions use an `OVER` clause directly following the window function's -name and arguments. The `OVER` clause syntactically distinguishes a window -function from a normal function or non-window aggregate function and determines -how rows are split up for the window operation. - -### PARTITION BY clause - -The `PARTITION BY` clause in the `OVER` clause divides the rows into groups, or -partitions, that share the same values of the `PARTITION BY` expressions. 
-The window function operates on all the rows in the same partition as the -current row. - -### ORDER BY clause - -The `ORDER BY` clause inside of the `OVER` clause controls the order that the -window function processes rows in each partition. -When a window clause contains an `ORDER BY` clause, the window frame boundaries -may be explicit or implicit, limiting a window frame size in both directions -relative to the current row. - -> [!Note] -> The `ORDER BY` clause in an `OVER` clause is separate from the `ORDER BY` -> clause of the query and only determines the order that rows in each partition -> are processed in. - -### Frame clause - -The frame clause defines window frame boundaries relative to the current row and -can be one of the following: - -```sql -{ RANGE | ROWS | GROUPS } frame_start -{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end -``` - -#### Frame units - -##### RANGE - -Defines frame boundaries using rows with distinct values for columns specified -in the [`ORDER BY` clause](#order-by-clause) within a value range relative to -the current row value. - -> [!Important] -> When using `RANGE` frame units, you must include an `ORDER BY` clause with -> _exactly one column_. - -The offset is the difference the between the current row value and surrounding -row values. `RANGE` supports the following offset types: - -- Numeric -- String -- Interval - -##### ROWS - -Defines frame boundaries using row positions relative to the current row. -The offset is the difference in row position from the current row. - -##### GROUPS - -Defines frame boundaries using row groups. -Rows with the same values for the columns in the [`ORDER BY` clause](#order-by-clause) -comprise a row group. The offset is the difference in row group position -relative to the the current row group. -When using `GROUPS` frame units, you must include an `ORDER BY` clause. 
- -#### Frame boundaries - -**frame_start** and **frame_end** can be one of the following: - -```sql -UNBOUNDED PRECEDING -offset PRECEDING -CURRENT ROW -offset FOLLOWING -UNBOUNDED FOLLOWING -``` - -##### UNBOUNDED PRECEDING - - -##### offset PRECEDING - -where **offset** is an non-negative integer. - -##### CURRENT ROW - - - -##### offset FOLLOWING - -where **offset** is an non-negative integer. - -##### UNBOUNDED FOLLOWING - - - - -```sql -SELECT depname, empno, salary, - rank() OVER (PARTITION BY depname ORDER BY salary DESC) -FROM empsalary; - -+-----------+-------+--------+--------+ -| depname | empno | salary | rank | -+-----------+-------+--------+--------+ -| personnel | 2 | 3900 | 1 | -| develop | 8 | 6000 | 1 | -| develop | 10 | 5200 | 2 | -| develop | 11 | 5200 | 2 | -| develop | 9 | 4500 | 4 | -| develop | 7 | 4200 | 5 | -| sales | 1 | 5000 | 1 | -| sales | 4 | 4800 | 2 | -| personnel | 5 | 3500 | 2 | -| sales | 3 | 4800 | 2 | -+-----------+-------+--------+--------+ -``` - -There is another important concept associated with window functions: for each -row, there is a set of rows within its partition called its window frame. Some -window functions act only on the rows of the window frame, rather than of the -whole partition. 
Here is an example of using window frames in queries: - -```sql -SELECT depname, empno, salary, - avg(salary) OVER(ORDER BY salary ASC ROWS BETWEEN 1 PRECEDING AND 1 -FOLLOWING) AS avg, - min(salary) OVER(ORDER BY empno ASC ROWS BETWEEN UNBOUNDED PRECEDING AND -CURRENT ROW) AS cum_min -FROM empsalary -ORDER BY empno ASC; - -+-----------+-------+--------+--------------------+---------+ -| depname | empno | salary | avg | cum_min | -+-----------+-------+--------+--------------------+---------+ -| sales | 1 | 5000 | 5000.0 | 5000 | -| personnel | 2 | 3900 | 3866.6666666666665 | 3900 | -| sales | 3 | 4800 | 4700.0 | 3900 | -| sales | 4 | 4800 | 4866.666666666667 | 3900 | -| personnel | 5 | 3500 | 3700.0 | 3500 | -| develop | 7 | 4200 | 4200.0 | 3500 | -| develop | 8 | 6000 | 5600.0 | 3500 | -| develop | 9 | 4500 | 4500.0 | 3500 | -| develop | 10 | 5200 | 5133.333333333333 | 3500 | -| develop | 11 | 5200 | 5466.666666666667 | 3500 | -+-----------+-------+--------+--------------------+---------+ -``` - -When a query involves multiple window functions, it is possible to write out -each one with a separate OVER clause, but this is duplicative and error-prone if -the same windowing behavior is wanted for several functions. Instead, each -windowing behavior can be named in a WINDOW clause and then referenced in OVER. -For example: - -```sql -SELECT sum(salary) OVER w, avg(salary) OVER w -FROM empsalary -WINDOW w AS (PARTITION BY depname ORDER BY salary DESC); -``` - -## Aggregate functions - -All [aggregate functions](/influxdb3/cloud-dedicated/reference/sql/functions/aggregate/) -can be used as window functions. - -## Ranking Functions - -- [cume_dist](#cume_dist) -- [dense_rank](#dense_rank) -- [ntile](#ntile) -- [percent_rank](#percent_rank) -- [rank](#rank) -- [row_number](#row_number) - -### `cume_dist` - -Relative rank of the current row: (number of rows preceding or peer with current -row) / (total rows). 
- -```sql -cume_dist() -``` - -### `dense_rank` - -Returns the rank of the current row without gaps. This function ranks rows in a -dense manner, meaning consecutive ranks are assigned even for identical values. - -```sql -dense_rank() -``` - -### `ntile` - -Integer ranging from 1 to the argument value, dividing the partition as equally -as possible. - -```sql -ntile(expression) -``` - -#### Arguments - -- **expression**: An integer describing the number groups the partition should - be split into. - -### `percent_rank` - -Returns the percentage rank of the current row within its partition. The value -ranges from 0 to 1 and is computed as `(rank - 1) / (total_rows - 1)`. - -```sql -percent_rank() -``` - -### `rank` - -Returns the rank of the current row within its partition, allowing gaps between -ranks. This function provides a ranking similar to `row_number`, but skips ranks -for identical values. - -```sql -rank() -``` - -### `row_number` - -Number of the current row within its partition, counting from 1. - -```sql -row_number() -``` - -## Analytical Functions - -- [first_value](#first_value) -- [lag](#lag) -- [last_value](#last_value) -- [lead](#lead) -- [nth_value](#nth_value) - -### `first_value` - -Returns value evaluated at the row that is the first row of the window frame. - -```sql -first_value(expression) -``` - -#### Arguments - -- **expression**: Expression to operate on. - -### `lag` - -Returns value evaluated at the row that is offset rows before the current row -within the partition; if there is no such row, instead return default (which -must be of the same type as value). - -```sql -lag(expression, offset, default) -``` - -#### Arguments - -- **expression**: Expression to operate on. -- **offset**: Integer. Specifies how many rows back the value of expression - should be retrieved. Defaults to 1. -- **default**: The default value if the offset is not within the partition. Must - be of the same type as expression. 
- -### `last_value` - -Returns value evaluated at the row that is the last row of the window frame. - -```sql -last_value(expression) -``` - -#### Arguments - -- **expression**: Expression to operate on. - -### `lead` - -Returns value evaluated at the row that is offset rows after the current row -within the partition; if there is no such row, instead return default (which -must be of the same type as value). - -```sql -lead(expression, offset, default) -``` - -#### Arguments - -- **expression**: Expression to operate on. -- **offset**: Integer. Specifies how many rows forward the value of expression - should be retrieved. Defaults to 1. -- **default**: The default value if the offset is not within the partition. Must - be of the same type as expression. - -### `nth_value` - -Returns value evaluated at the row that is the nth row of the window frame -(counting from 1); null if no such row. - -```sql -nth_value(expression, n) -``` - -#### Arguments - -- **expression**: The name the column of which nth value to retrieve. -- **n**: Integer. Specifies the n in nth. + diff --git a/content/influxdb3/cloud-serverless/reference/sql/functions/window.md b/content/influxdb3/cloud-serverless/reference/sql/functions/window.md new file mode 100644 index 000000000..882722f57 --- /dev/null +++ b/content/influxdb3/cloud-serverless/reference/sql/functions/window.md @@ -0,0 +1,18 @@ +--- +title: SQL window functions +list_title: Window functions +description: > + SQL window functions perform an operation across a set of rows related to the + current row. 
+menu: + influxdb3_cloud_serverless: + name: Window + parent: sql-functions +weight: 309 + +source: /shared/sql-reference/functions/window.md +--- + + diff --git a/content/influxdb3/clustered/reference/sql/functions/window.md b/content/influxdb3/clustered/reference/sql/functions/window.md new file mode 100644 index 000000000..4b4c0052a --- /dev/null +++ b/content/influxdb3/clustered/reference/sql/functions/window.md @@ -0,0 +1,18 @@ +--- +title: SQL window functions +list_title: Window functions +description: > + SQL window functions perform an operation across a set of rows related to the + current row. +menu: + influxdb3_clustered: + name: Window + parent: sql-functions +weight: 309 + +source: /shared/sql-reference/functions/window.md +--- + + diff --git a/content/influxdb3/core/reference/sql/functions/window.md b/content/influxdb3/core/reference/sql/functions/window.md new file mode 100644 index 000000000..e964f30ed --- /dev/null +++ b/content/influxdb3/core/reference/sql/functions/window.md @@ -0,0 +1,18 @@ +--- +title: SQL window functions +list_title: Window functions +description: > + SQL window functions perform an operation across a set of rows related to the + current row. +menu: + influxdb3_core: + name: Window + parent: sql-functions +weight: 309 + +source: /shared/sql-reference/functions/window.md +--- + + diff --git a/content/influxdb3/enterprise/reference/sql/functions/window.md b/content/influxdb3/enterprise/reference/sql/functions/window.md new file mode 100644 index 000000000..934c647be --- /dev/null +++ b/content/influxdb3/enterprise/reference/sql/functions/window.md @@ -0,0 +1,18 @@ +--- +title: SQL window functions +list_title: Window functions +description: > + SQL window functions perform an operation across a set of rows related to the + current row. 
+menu: + influxdb3_enterprise: + name: Window + parent: sql-functions +weight: 309 + +source: /shared/sql-reference/functions/window.md +--- + + diff --git a/content/shared/sql-reference/functions/window.md b/content/shared/sql-reference/functions/window.md new file mode 100644 index 000000000..bd79f77b7 --- /dev/null +++ b/content/shared/sql-reference/functions/window.md @@ -0,0 +1,986 @@ + +A _window function_ performs an operation across a set of rows related to the +current row. This is similar to the type of operations +[aggregate functions](/influxdb3/cloud-dedicated/reference/sql/functions/aggregate/) +perform. However, window functions do not return a single output row per group +like non-window aggregate functions do. Instead, rows retain their separate +identities. + +For example, the following query uses the {{< influxdb3/home-sample-link >}} +and returns each temperature reading with the average temperature per room over +the queried time range: + +{{% influxdb/custom-timestamps %}} + +```sql +SELECT + time, + room, + temp, + avg(temp) OVER (PARTITION BY room) AS avg_room_temp +FROM + home +WHERE + time >= '2022-01-01T08:00:00Z' + AND time <= '2022-01-01T09:00:00Z' +ORDER BY + room, + time +``` + +| time | room | temp | avg_room_temp | +| :------------------ | :---------- | ---: | ------------: | +| 2022-01-01T08:00:00 | Kitchen | 21.0 | 22.0 | +| 2022-01-01T09:00:00 | Kitchen | 23.0 | 22.0 | +| 2022-01-01T08:00:00 | Living Room | 21.1 | 21.25 | +| 2022-01-01T09:00:00 | Living Room | 21.4 | 21.25 | + +{{% /influxdb/custom-timestamps %}} + +- [Window frames](#window-frames) +- [Window function syntax](#window-function-syntax) + - [OVER clause](#over-clause) + - [PARTITION BY clause](#partition-by-clause) + - [ORDER BY clause](#order-by-clause) + - [Frame clause](#frame-clause) + - [Frame units](#frame-units) + - [Frame boundaries](#frame-boundaries) + - [WINDOW clause](#window-clause) +- [Aggregate functions](#aggregate-functions) +- [Ranking 
Functions](#ranking-functions)
+  - [cume_dist](#cume_dist)
+  - [dense_rank](#dense_rank)
+  - [ntile](#ntile)
+  - [percent_rank](#percent_rank)
+  - [rank](#rank)
+  - [row_number](#row_number)
+- [Analytical Functions](#analytical-functions)
+  - [first_value](#first_value)
+  - [lag](#lag)
+  - [last_value](#last_value)
+  - [lead](#lead)
+  - [nth_value](#nth_value)
+
+## Window frames
+
+When a window function operates on a row, it uses a set of rows in that row's
+partition to perform the operation. This set of rows is called the
+_window frame_. Window frame boundaries can be defined using `RANGE`, `ROWS`,
+or `GROUPS` frame units, each relative to the current row--for example:
+
+{{< code-tabs-wrapper >}}
+{{% code-tabs %}}
+[RANGE](#)
+[ROWS](#)
+[GROUPS](#)
+{{% /code-tabs %}}
+{{% code-tab-content %}}
+```sql
+SELECT
+  time,
+  temp,
+  avg(temp) OVER (
+    ORDER BY time
+    RANGE INTERVAL '3 hours' PRECEDING
+  ) AS "3h_moving_avg"
+FROM home
+WHERE room = 'Kitchen'
+```
+{{% /code-tab-content %}}
+{{% code-tab-content %}}
+```sql
+SELECT
+  time,
+  temp,
+  avg(temp) OVER (
+    ROWS 3 PRECEDING
+  ) AS moving_avg
+FROM home
+WHERE room = 'Kitchen'
+```
+{{% /code-tab-content %}}
+{{% code-tab-content %}}
+```sql
+SELECT
+  time,
+  room,
+  temp,
+  avg(temp) OVER (
+    ORDER BY room
+    GROUPS 1 PRECEDING
+  ) AS moving_avg
+FROM home
+```
+{{% /code-tab-content %}}
+{{< /code-tabs-wrapper >}}
+
+_For more information about how window frames work, see the [frame clause](#frame-clause)._
+
+If window frames are not defined, window functions use all rows in the current
+partition to perform their operation.
+
+## Window function syntax
+
+```sql
+function([expr])
+  OVER(
+    [PARTITION BY expr[, …]]
+    [ORDER BY expr [ ASC | DESC ][, …]]
+    [ frame_clause ]
+  )
+```
+
+### OVER clause
+
+Window functions use an `OVER` clause directly following the window function's
+name and arguments.
The `OVER` clause syntactically distinguishes a window
+function from a normal function or non-window aggregate function and determines
+how rows are split up for the window operation.
+
+### PARTITION BY clause
+
+The `PARTITION BY` clause in the `OVER` clause divides the rows into groups, or
+partitions, that share the same values of the `PARTITION BY` expressions.
+The window function operates on all the rows in the same partition as the
+current row.
+
+### ORDER BY clause
+
+The `ORDER BY` clause inside of the `OVER` clause controls the order that the
+window function processes rows in each partition.
+When a window clause contains an `ORDER BY` clause, the window frame boundaries
+may be explicit or implicit, limiting a window frame size in both directions
+relative to the current row.
+
+> [!Note]
+> The `ORDER BY` clause in an `OVER` clause is separate from the `ORDER BY`
+> clause of the query and only determines the order that rows in each partition
+> are processed in.
+
+### Frame clause
+
+The frame clause defines window frame boundaries and can be one of the following:
+
+```sql
+{ RANGE | ROWS | GROUPS } frame_start
+{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end
+```
+
+- [Frame units](#frame-units)
+  - [RANGE](#range)
+  - [ROWS](#rows)
+  - [GROUPS](#groups)
+- [Frame boundaries](#frame-boundaries)
+  - [UNBOUNDED PRECEDING](#unbounded-preceding)
+  - [offset PRECEDING](#offset-preceding)
+  - [CURRENT ROW](#current-row)
+  - [offset FOLLOWING](#offset-following)
+  - [UNBOUNDED FOLLOWING](#unbounded-following)
+
+#### Frame units
+
+When defining window frames, you can use one of the following frame units:
+
+- [RANGE](#range)
+- [ROWS](#rows)
+- [GROUPS](#groups)
+
+##### RANGE
+
+Defines frame boundaries using a value range relative to the current row's
+value in the column specified in the [`ORDER BY` clause](#order-by-clause).
+
+> [!Important]
+> When using `RANGE` frame units, you must include an `ORDER BY` clause with
+> _exactly one column_.
+
+The offset is the difference between the current row value and surrounding
+row values. `RANGE` supports the following offset types:
+
+- Numeric _(non-negative)_
+- Numeric string _(non-negative)_
+- Interval
+
+{{< expand-wrapper >}}
+{{% expand "See how `RANGE` frame units work with numeric offsets" %}}
+
+To use a numeric offset with the `RANGE` frame unit, you must sort partitions
+by a numeric-typed column.
+
+```sql
+... OVER (
+  ORDER BY wind_direction
+  RANGE BETWEEN 45 PRECEDING AND 45 FOLLOWING
+)
+```
+
+The window frame includes rows with sort column values between 45 below and
+45 above the current row's value:
+
+{{< sql/window-frame-units "range numeric" >}}
+
+{{% /expand %}}
+
+{{% expand "See how `RANGE` frame units work with interval offsets" %}}
+
+To use an interval offset with the `RANGE` frame unit, you must sort partitions
+by `time` or a timestamp-typed column.
+
+```sql
+... OVER (
+  ORDER BY time
+  RANGE BETWEEN
+    INTERVAL '3 hours' PRECEDING
+    AND INTERVAL '1 hour' FOLLOWING
+)
+```
+
+The window frame includes rows with timestamps between three hours before and
+one hour after the current row's timestamp:
+
+{{% influxdb/custom-timestamps %}}
+
+{{< sql/window-frame-units "range interval" >}}
+
+{{% /influxdb/custom-timestamps %}}
+
+{{% /expand %}}
+{{< /expand-wrapper >}}
+
+##### ROWS
+
+Defines frame boundaries using row positions relative to the current row.
+The offset is the difference in row position from the current row.
+`ROWS` supports the following offset types:
+
+- Numeric _(non-negative)_
+- Numeric string _(non-negative)_
+
+{{< expand-wrapper >}}
+{{% expand "See how `ROWS` frame units work" %}}
+
+When using the `ROWS` frame unit, row positions relative to the current row
+determine frame boundaries--for example:
+
+```sql
+... 
OVER (
+  ROWS BETWEEN 2 PRECEDING AND 1 FOLLOWING
+)
+```
+
+The window frame includes the two rows before and the one row after the current row:
+
+{{< sql/window-frame-units "rows" >}}
+
+{{% /expand %}}
+{{< /expand-wrapper >}}
+
+##### GROUPS
+
+Defines frame boundaries using row groups.
+Rows with the same values for the columns in the [`ORDER BY` clause](#order-by-clause)
+comprise a row group.
+
+> [!Important]
+> When using `GROUPS` frame units, you must include an `ORDER BY` clause.
+
+The offset is the difference in row group position relative to the current row group.
+`GROUPS` supports the following offset types:
+
+- Numeric _(non-negative)_
+- Numeric string _(non-negative)_
+
+{{< expand-wrapper >}}
+{{% expand "See how `GROUPS` frame units work" %}}
+
+When using the `GROUPS` frame unit, unique combinations of column values
+specified in the `ORDER BY` clause determine each row group. For example, if
+you sort partitions by `country` and `city`:
+
+```sql
+... OVER (
+  ORDER BY country, city
+  GROUPS ...
+)
+```
+
+The query defines row groups in the following way:
+
+{{< sql/window-frame-units "groups" >}}
+
+You can then use group offsets to determine frame boundaries:
+
+```sql
+... OVER (
+  ORDER BY country, city
+  GROUPS 2 PRECEDING
+)
+```
+
+The window function uses all rows in the current row group and the two row
+groups before it to perform the operation:
+
+{{< sql/window-frame-units "groups with frame" >}}
+
+{{% /expand %}}
+{{< /expand-wrapper >}}
+
+#### Frame boundaries
+
+Frame boundaries (**frame_start** and **frame_end**) define the boundaries of
+each frame the window function operates on. Use the following to define
+frame boundaries:
+
+```sql
+UNBOUNDED PRECEDING
+offset PRECEDING
+CURRENT ROW
+offset FOLLOWING
+UNBOUNDED FOLLOWING
+```
+
+##### UNBOUNDED PRECEDING
+
+Use the beginning of the partition to the current row as the frame boundary.
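+
+For example, combining `UNBOUNDED PRECEDING` with `CURRENT ROW` computes a
+running aggregate from the start of the partition to the current row. The
+following minimal sketch assumes the same `home` sample data used in the
+examples above; the `running_min` alias is illustrative:
+
+```sql
+SELECT
+  time,
+  temp,
+  min(temp) OVER (
+    ORDER BY time
+    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+  ) AS running_min
+FROM home
+WHERE room = 'Kitchen'
+```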
+
+##### offset PRECEDING
+
+Use a specified offset of [frame units](#frame-units) _before_ the current row
+as a frame boundary.
+
+##### CURRENT ROW
+
+Use the current row as a frame boundary.
+
+##### offset FOLLOWING
+
+Use a specified offset of [frame units](#frame-units) _after_ the current row
+as a frame boundary.
+
+##### UNBOUNDED FOLLOWING
+
+Use the current row to the end of the current partition as the frame boundary.
+
+### WINDOW clause
+
+When a query has multiple window functions that use the same window, rather than
+writing each with a separate `OVER` clause (which is duplicative and error-prone),
+use the `WINDOW` clause to define the window and then reference the window alias
+in each `OVER` clause--for example:
+
+```sql
+SELECT
+  sum(net_gain) OVER w,
+  avg(net_gain) OVER w
+FROM
+  finance
+WINDOW w AS (PARTITION BY ticker ORDER BY time DESC);
+```
+
+---
+
+## Aggregate functions
+
+All [aggregate functions](/influxdb3/cloud-dedicated/reference/sql/functions/aggregate/)
+can be used as window functions.
+
+## Ranking Functions
+
+- [cume_dist](#cume_dist)
+- [dense_rank](#dense_rank)
+- [ntile](#ntile)
+- [percent_rank](#percent_rank)
+- [rank](#rank)
+- [row_number](#row_number)
+
+### cume_dist
+
+Returns the cumulative distribution of a value within a group of values.
+The returned value is greater than 0 and less than or equal to 1 and represents
+the relative rank of the value in the set of values.
+The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
+ranking order.
+
+```sql
+cume_dist()
+```
+
+> [!Important]
+> `cume_dist` needs an [`ORDER BY` clause](#order-by-clause) in the `OVER` clause
+> to correctly calculate the cumulative distribution of the current row value.
+
+{{< expand-wrapper >}}
+{{% expand "View `cume_dist` query example" %}}
+
+The following example uses the {{< influxdb3/home-sample-link >}}.
+
+{{% influxdb/custom-timestamps %}}
+
+```sql
+SELECT
+  time,
+  room,
+  temp,
+  cume_dist() OVER (
+    PARTITION BY room
+    ORDER BY temp
+  ) AS cume_dist
+FROM home
+WHERE
+  time >= '2022-01-01T08:00:00Z'
+  AND time < '2022-01-01T12:00:00Z'
+```
+
+| time                | room        | temp | cume_dist |
+| :------------------ | :---------- | ---: | --------: |
+| 2022-01-01T08:00:00 | Living Room | 21.1 |      0.25 |
+| 2022-01-01T09:00:00 | Living Room | 21.4 |       0.5 |
+| 2022-01-01T10:00:00 | Living Room | 21.8 |      0.75 |
+| 2022-01-01T11:00:00 | Living Room | 22.2 |       1.0 |
+| 2022-01-01T08:00:00 | Kitchen     | 21.0 |      0.25 |
+| 2022-01-01T11:00:00 | Kitchen     | 22.4 |       0.5 |
+| 2022-01-01T10:00:00 | Kitchen     | 22.7 |      0.75 |
+| 2022-01-01T09:00:00 | Kitchen     | 23.0 |       1.0 |
+
+{{% /influxdb/custom-timestamps %}}
+
+{{% /expand %}}
+{{< /expand-wrapper >}}
+
+### dense_rank
+
+Returns the rank of the current row without gaps. Identical values receive the
+same rank, and subsequent ranks are consecutive--no ranks are skipped after
+duplicate values.
+The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
+ranking order.
+
+```sql
+dense_rank()
+```
+
+{{< expand-wrapper >}}
+{{% expand "View `dense_rank` query example" %}}
+
+The following example uses the {{< influxdb3/home-sample-link >}}.
+
+{{% influxdb/custom-timestamps %}}
+
+```sql
+SELECT
+  time,
+  room,
+  temp,
+  dense_rank() OVER (
+    PARTITION BY room
+    ORDER BY temp
+  ) AS dense_rank
+FROM home
+WHERE
+  time >= '2022-01-01T08:00:00Z'
+  AND time < '2022-01-01T12:00:00Z'
+```
+
+| time                | room        | temp | dense_rank |
+| :------------------ | :---------- | ---: | ---------: |
+| 2022-01-01T08:00:00 | Kitchen     | 21.0 |          1 |
+| 2022-01-01T11:00:00 | Kitchen     | 22.4 |          2 |
+| 2022-01-01T10:00:00 | Kitchen     | 22.7 |          3 |
+| 2022-01-01T09:00:00 | Kitchen     | 23.0 |          4 |
+| 2022-01-01T08:00:00 | Living Room | 21.1 |          1 |
+| 2022-01-01T09:00:00 | Living Room | 21.4 |          2 |
+| 2022-01-01T10:00:00 | Living Room | 21.8 |          3 |
+| 2022-01-01T11:00:00 | Living Room | 22.2 |          4 |
+
+{{% /influxdb/custom-timestamps %}}
+
+{{% /expand %}}
+{{< /expand-wrapper >}}
+
+### ntile
+
+Distributes the rows in an ordered partition into a specified number of groups.
+Each group is numbered, starting at one. For each row, `ntile` returns the
+group number to which the row belongs.
+Group numbers range from 1 to the `expression` value, dividing the partition as
+equally as possible.
+The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
+ranking order.
+
+```sql
+ntile(expression)
+```
+
+##### Arguments
+
+- **expression**: An integer describing the number of groups to split the
+  partition into.
+
+{{< expand-wrapper >}}
+{{% expand "View `ntile` query example" %}}
+
+The following example uses the {{< influxdb3/home-sample-link >}}.
+ +{{% influxdb/custom-timestamps %}} + +```sql +SELECT + time, + temp, + ntile(4) OVER ( + ORDER BY time + ) AS ntile +FROM home +WHERE + room = 'Kitchen' + AND time >= '2022-01-01T08:00:00Z' + AND time < '2022-01-01T15:00:00Z' +``` + +| time | temp | ntile | +| :------------------ | ---: | ----: | +| 2022-01-01T08:00:00 | 21.0 | 1 | +| 2022-01-01T09:00:00 | 23.0 | 1 | +| 2022-01-01T10:00:00 | 22.7 | 2 | +| 2022-01-01T11:00:00 | 22.4 | 2 | +| 2022-01-01T12:00:00 | 22.5 | 3 | +| 2022-01-01T13:00:00 | 22.8 | 3 | +| 2022-01-01T14:00:00 | 22.8 | 4 | + +{{% /influxdb/custom-timestamps %}} + +{{% /expand %}} +{{< /expand-wrapper >}} + +### percent_rank + +Returns the percentage rank of the current row within its partition. +The returned value is between `0` and `1` and is computed as +`(rank - 1) / (total_rows - 1)`. +The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines +ranking order. + +```sql +percent_rank() +``` + +{{< expand-wrapper >}} +{{% expand "View `percent_rank` query example" %}} + +The following example uses the {{< influxdb3/home-sample-link >}}. + +{{% influxdb/custom-timestamps %}} + +```sql +SELECT + time, + room, + temp, + percent_rank() OVER ( + PARTITION BY room + ORDER BY temp + ) AS percent_rank +FROM home +WHERE + time >= '2022-01-01T08:00:00Z' + AND time < '2022-01-01T11:00:00Z' +``` + +| time | room | temp | percent_rank | +| :------------------ | :---------- | ---: | -----------: | +| 2022-01-01T08:00:00 | Kitchen | 21.0 | 0.0 | +| 2022-01-01T10:00:00 | Kitchen | 22.7 | 0.5 | +| 2022-01-01T09:00:00 | Kitchen | 23.0 | 1.0 | +| 2022-01-01T08:00:00 | Living Room | 21.1 | 0.0 | +| 2022-01-01T09:00:00 | Living Room | 21.4 | 0.5 | +| 2022-01-01T10:00:00 | Living Room | 21.8 | 1.0 | + +{{% /influxdb/custom-timestamps %}} + +{{% /expand %}} +{{< /expand-wrapper >}} + +### rank + +Returns the rank of the current row in its partition, allowing gaps between +ranks. 
This function provides a ranking similar to [`row_number`](#row_number), +but skips ranks for identical values. +The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines +ranking order. + +```sql +rank() +``` + +{{< expand-wrapper >}} +{{% expand "View `rank` query example" %}} + +The following example uses the {{< influxdb3/home-sample-link >}}. + +{{% influxdb/custom-timestamps %}} + +```sql +SELECT + time, + room, + temp, + rank() OVER ( + PARTITION BY room + ORDER BY temp + ) AS rank +FROM home +WHERE + time >= '2022-01-01T08:00:00Z' + AND time < '2022-01-01T11:00:00Z' +``` + +| time | room | temp | rank | +| :------------------ | :---------- | ---: | ---: | +| 2022-01-01T08:00:00 | Living Room | 21.1 | 1 | +| 2022-01-01T09:00:00 | Living Room | 21.4 | 2 | +| 2022-01-01T10:00:00 | Living Room | 21.8 | 3 | +| 2022-01-01T08:00:00 | Kitchen | 21.0 | 1 | +| 2022-01-01T10:00:00 | Kitchen | 22.7 | 2 | +| 2022-01-01T09:00:00 | Kitchen | 23.0 | 3 | + +{{% /influxdb/custom-timestamps %}} + +{{% /expand %}} +{{< /expand-wrapper >}} + +### row_number + +Returns the position of the current row in its partition, counting from 1. +The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines +row order. + +```sql +row_number() +``` + +{{< expand-wrapper >}} +{{% expand "View `row_number` query example" %}} + +The following example uses the {{< influxdb3/home-sample-link >}}. 
+ +{{% influxdb/custom-timestamps %}} + +```sql +SELECT + time, + room, + temp, + row_number() OVER ( + PARTITION BY room + ORDER BY temp + ) AS row_number +FROM home +WHERE + time >= '2022-01-01T08:00:00Z' + AND time < '2022-01-01T11:00:00Z' +``` + +| time | room | temp | row_number | +| :------------------ | :---------- | ---: | ---------: | +| 2022-01-01T08:00:00 | Living Room | 21.1 | 1 | +| 2022-01-01T09:00:00 | Living Room | 21.4 | 2 | +| 2022-01-01T10:00:00 | Living Room | 21.8 | 3 | +| 2022-01-01T08:00:00 | Kitchen | 21.0 | 1 | +| 2022-01-01T10:00:00 | Kitchen | 22.7 | 2 | +| 2022-01-01T09:00:00 | Kitchen | 23.0 | 3 | + +{{% /influxdb/custom-timestamps %}} + +{{% /expand %}} +{{< /expand-wrapper >}} + +## Analytical Functions + +- [first_value](#first_value) +- [lag](#lag) +- [last_value](#last_value) +- [lead](#lead) +- [nth_value](#nth_value) + +### first_value + +Returns the value from the first row of the window frame. + +```sql +first_value(expression) +``` + +##### Arguments + +- **expression**: Expression to operate on. Can be a constant, column, or + function, and any combination of arithmetic operators. + +##### Related functions + +[last_value](#last_value) + +{{< expand-wrapper >}} +{{% expand "View `first_value` query example" %}} + +The following example uses the {{< influxdb3/home-sample-link >}}. 
+
+{{% influxdb/custom-timestamps %}}
+
+```sql
+SELECT
+  time,
+  room,
+  temp,
+  first_value(temp) OVER (
+    PARTITION BY room
+    ORDER BY time
+  ) AS first_value
+FROM home
+WHERE
+  time >= '2022-01-01T08:00:00Z'
+  AND time < '2022-01-01T11:00:00Z'
+ORDER BY room, time
+```
+
+| time                | room        | temp | first_value |
+| :------------------ | :---------- | ---: | ----------: |
+| 2022-01-01T08:00:00 | Kitchen     | 21.0 |        21.0 |
+| 2022-01-01T09:00:00 | Kitchen     | 23.0 |        21.0 |
+| 2022-01-01T10:00:00 | Kitchen     | 22.7 |        21.0 |
+| 2022-01-01T08:00:00 | Living Room | 21.1 |        21.1 |
+| 2022-01-01T09:00:00 | Living Room | 21.4 |        21.1 |
+| 2022-01-01T10:00:00 | Living Room | 21.8 |        21.1 |
+
+{{% /influxdb/custom-timestamps %}}
+
+{{% /expand %}}
+{{< /expand-wrapper >}}
+
+### lag
+
+Returns the value from the row that is at the specified offset before the
+current row in the partition. If the offset row is outside of the partition,
+the function returns the specified default.
+
+```sql
+lag(expression, offset, default)
+```
+
+##### Arguments
+
+- **expression**: Expression to operate on.
+  Can be a constant, column, or function, and any combination of arithmetic or
+  string operators.
+- **offset**: How many rows _before_ the current row to retrieve the value of
+  _expression_ from. Default is `1`.
+- **default**: The default value to return if the offset row is not within the
+  partition. Must be of the same type as _expression_.
+
+##### Related functions
+
+[lead](#lead)
+
+{{< expand-wrapper >}}
+{{% expand "View `lag` query example" %}}
+
+The following example uses the {{< influxdb3/home-sample-link >}}.
+ +{{% influxdb/custom-timestamps %}} + +```sql +SELECT + time, + room, + temp, + lag(temp, 1, 0) OVER ( + PARTITION BY room + ORDER BY time + ) AS previous_value +FROM home +WHERE + time >= '2022-01-01T08:00:00Z' + AND time < '2022-01-01T11:00:00Z' +ORDER BY room, time +``` + +| time | room | temp | previous_value | +|:--------------------|:------------|-----:|---------------:| +| 2022-01-01T08:00:00 | Kitchen | 21.0 | 0.0 | +| 2022-01-01T09:00:00 | Kitchen | 23.0 | 21.0 | +| 2022-01-01T10:00:00 | Kitchen | 22.7 | 23.0 | +| 2022-01-01T08:00:00 | Living Room | 21.1 | 0.0 | +| 2022-01-01T09:00:00 | Living Room | 21.4 | 21.1 | +| 2022-01-01T10:00:00 | Living Room | 21.8 | 21.4 | + +{{% /influxdb/custom-timestamps %}} + +{{% /expand %}} +{{< /expand-wrapper >}} + +### last_value + +Returns the value from the last row of the window frame. + +```sql +last_value(expression) +``` + +##### Arguments + +- **expression**: Expression to operate on. Can be a constant, column, or + function, and any combination of arithmetic operators. + +##### Related functions + +[first_value](#first_value) + +{{< expand-wrapper >}} +{{% expand "View `last_value` query example" %}} + +The following example uses the {{< influxdb3/home-sample-link >}}. 
+

{{% influxdb/custom-timestamps %}}

```sql
SELECT
  time,
  room,
  temp,
  last_value(temp) OVER (
    PARTITION BY room
    ORDER BY time
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  ) AS last_value
FROM home
WHERE
  time >= '2022-01-01T08:00:00Z'
  AND time < '2022-01-01T11:00:00Z'
ORDER BY room, time
```

| time                | room        | temp | last_value |
| :------------------ | :---------- | ---: | ---------: |
| 2022-01-01T08:00:00 | Kitchen     | 21.0 |       22.7 |
| 2022-01-01T09:00:00 | Kitchen     | 23.0 |       22.7 |
| 2022-01-01T10:00:00 | Kitchen     | 22.7 |       22.7 |
| 2022-01-01T08:00:00 | Living Room | 21.1 |       21.8 |
| 2022-01-01T09:00:00 | Living Room | 21.4 |       21.8 |
| 2022-01-01T10:00:00 | Living Room | 21.8 |       21.8 |

{{% /influxdb/custom-timestamps %}}

{{% /expand %}}
{{< /expand-wrapper >}}

### lead

Returns the value from the row that is at the specified offset after the
current row in the partition. If the offset row is outside of the partition,
the function returns the specified default.

```sql
lead(expression, offset, default)
```

##### Arguments

- **expression**: Expression to operate on.
  Can be a constant, column, or function, and any combination of arithmetic or
  string operators.
- **offset**: How many rows _after_ the current row to retrieve the value of
  _expression_ from. Default is `1`.
- **default**: The default value to return if the offset row is outside of the
  partition. Must be of the same type as _expression_.

##### Related functions

[lag](#lag)

{{< expand-wrapper >}}
{{% expand "View `lead` query example" %}}

The following example uses the {{< influxdb3/home-sample-link >}}.
+

{{% influxdb/custom-timestamps %}}

```sql
SELECT
  time,
  room,
  temp,
  lead(temp, 1, 0) OVER (
    PARTITION BY room
    ORDER BY time
  ) AS next_value
FROM home
WHERE
  time >= '2022-01-01T08:00:00Z'
  AND time < '2022-01-01T11:00:00Z'
ORDER BY room, time
```

| time                | room        | temp | next_value |
| :------------------ | :---------- | ---: | ---------: |
| 2022-01-01T08:00:00 | Kitchen     | 21.0 |       23.0 |
| 2022-01-01T09:00:00 | Kitchen     | 23.0 |       22.7 |
| 2022-01-01T10:00:00 | Kitchen     | 22.7 |        0.0 |
| 2022-01-01T08:00:00 | Living Room | 21.1 |       21.4 |
| 2022-01-01T09:00:00 | Living Room | 21.4 |       21.8 |
| 2022-01-01T10:00:00 | Living Room | 21.8 |        0.0 |

{{% /influxdb/custom-timestamps %}}

{{% /expand %}}
{{< /expand-wrapper >}}

### nth_value

Returns the value from the row that is the nth row of the window frame
(counting from 1). If the nth row doesn't exist, the function returns _null_.

```sql
nth_value(expression, n)
```

##### Arguments

- **expression**: Expression to operate on.
  Can be a constant, column, or function, and any combination of arithmetic or
  string operators.
- **n**: Specifies the row number in the current frame and partition to reference.

{{< expand-wrapper >}}
{{% expand "View `nth_value` query example" %}}

The following example uses the {{< influxdb3/home-sample-link >}}.
+ +{{% influxdb/custom-timestamps %}} + +```sql +SELECT + time, + room, + temp, + nth_value(temp, 2) OVER ( + PARTITION BY room + ) AS "2nd_temp" +FROM home +WHERE + time >= '2025-02-10T08:00:00Z' + AND time < '2025-02-10T11:00:00Z' +``` + +| time | room | temp | 2nd_temp | +| :------------------ | :---------- | ---: | -------: | +| 2025-02-10T08:00:00 | Kitchen | 21.0 | 22.7 | +| 2025-02-10T10:00:00 | Kitchen | 22.7 | 22.7 | +| 2025-02-10T09:00:00 | Kitchen | 23.0 | 22.7 | +| 2025-02-10T08:00:00 | Living Room | 21.1 | 21.8 | +| 2025-02-10T10:00:00 | Living Room | 21.8 | 21.8 | +| 2025-02-10T09:00:00 | Living Room | 21.4 | 21.8 | + +{{% /influxdb/custom-timestamps %}} + +{{% /expand %}} +{{< /expand-wrapper >}} diff --git a/layouts/shortcodes/sql/window-frame-units..html b/layouts/shortcodes/sql/window-frame-units..html new file mode 100644 index 000000000..e74a0b6bb --- /dev/null +++ b/layouts/shortcodes/sql/window-frame-units..html @@ -0,0 +1,335 @@ +{{ $unit := .Get 0 | default "groups" }} + +{{ if eq $unit "groups" }} + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
timecountrycitywind_direction
2025-02-17T00:00:00FranceStrasbourg181
2025-02-17T01:00:00FranceStrasbourg228
2025-02-17T02:00:00FranceStrasbourg289
2025-02-17T00:00:00FranceToulouse24
2025-02-17T01:00:00FranceToulouse210
2025-02-17T02:00:00FranceToulouse206
2025-02-17T00:00:00ItalyBari2
2025-02-17T01:00:00ItalyBari57
2025-02-17T00:00:00ItalyBologna351
2025-02-17T01:00:00ItalyBologna232
2025-02-17T02:00:00ItalyBologna29
+ +{{ else if eq $unit "groups with frame" }} + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
timecountrycitywind_direction
2025-02-17T00:00:00FranceStrasbourg181
2025-02-17T01:00:00FranceStrasbourg228
2025-02-17T02:00:00FranceStrasbourg289
2025-02-17T00:00:00FranceToulouse24
2025-02-17T01:00:00FranceToulouse210
2025-02-17T02:00:00FranceToulouse206
2025-02-17T00:00:00ItalyBari2
2025-02-17T01:00:00ItalyBari57
2025-02-17T00:00:00ItalyBologna351
2025-02-17T01:00:00ItalyBologna232
2025-02-17T02:00:00ItalyBologna29
+ +{{ else if (eq $unit "range interval") }} + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
timeroomtemp
2022-01-01T08:00:00Kitchen21.0
2022-01-01T09:00:00Kitchen23.0
2022-01-01T10:00:00Kitchen22.7
2022-01-01T11:00:00Kitchen22.4
2022-01-01T12:00:00Kitchen22.5
2022-01-01T13:00:00Kitchen22.8
2022-01-01T14:00:00Kitchen22.8
2022-01-01T15:00:00Kitchen22.7
+ +{{ else if (eq $unit "range numeric") }} + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
timecitywind_direction
2025-02-17T13:00:00Rome33
2025-02-17T08:00:00Rome34
2025-02-17T23:00:00Rome49
2025-02-17T17:00:00Rome86
2025-02-17T11:00:00Rome93
2025-02-17T12:00:00Rome115
2025-02-17T10:00:00Rome156
+ +{{ else if (eq $unit "rows") }} + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
timecitywind_direction
2025-02-17T08:00:00Rome34
2025-02-17T10:00:00Rome156
2025-02-17T11:00:00Rome93
2025-02-17T12:00:00Rome115
2025-02-17T13:00:00Rome33
2025-02-17T17:00:00Rome86
2025-02-17T23:00:00Rome49
+ +{{ end }} \ No newline at end of file From 7afb3d9dc96b1ee996457404849538654138fced Mon Sep 17 00:00:00 2001 From: Scott Anderson Date: Fri, 21 Feb 2025 14:25:01 -0700 Subject: [PATCH 05/27] Apply suggestions from code review Co-authored-by: Jason Stirnaman --- content/shared/v3-core-plugins/_index.md | 30 ++++++++++++------------ 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 84d138050..6d4b28930 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -1,15 +1,15 @@ -Use the {{% product-name %}} ProcessingEngine to run code and perform tasks +Use the {{% product-name %}} Processing engine to run code and perform tasks for different database events. -{{% product-name %}} provides the InfluxDB 3 Processing Engine, an embedded Python VM that can dynamically load and trigger Python plugins +{{% product-name %}} provides the InfluxDB 3 Processing engine, an embedded Python VM that can dynamically load and trigger Python plugins in response to events in your database. ## Key Concepts ### Plugins -A Processing Engine _plugin_ is Python code you provide to run tasks, such as +A Processing engine _plugin_ is Python code you provide to run tasks, such as downsampling data, monitoring, creating alerts, or calling external services. > [!Note] @@ -25,16 +25,16 @@ A _trigger_ is an InfluxDB 3 resource you create to associate a database event (for example, a WAL flush) with the plugin that should run. When an event occurs, the trigger passes configuration details, optional arguments, and event data to the plugin. -The Processing Engine provides four types of triggers—each type corresponds to an event type with event-specific configuration to let you handle events with targeted logic. 
+The Processing engine provides four types of triggers—each type corresponds to an event type with event-specific configuration to let you handle events with targeted logic. - **WAL Flush**: Triggered when the write-ahead log (WAL) is flushed to the object store (default is every second). - **Scheduled Tasks**: Triggered on a schedule you specify using cron syntax. -- **On Request**: Triggered on a GET or POST request to the bound HTTP API endpoint at `/api/v3/engine/`. +- **On-request**: Triggered on a GET or POST request to the bound HTTP API endpoint at `/api/v3/engine/`. - **Parquet Persistence (coming soon)**: Triggered when InfluxDB 3 persists data to object storage Parquet files. -### Activate the Processing Engine +### Activate the Processing engine -To enable the Processing Engine, start the {{% product-name %}} server with the `--plugin-dir` option and a path to your plugins directory. If the directory doesn’t exist, the server creates it. +To enable the Processing engine, start the {{% product-name %}} server with the `--plugin-dir` option and a path to your plugins directory. If the directory doesn’t exist, the server creates it. ```bash influxdb3 serve --node-id node0 --object-store [OBJECT STORE TYPE] --plugin-dir /path/to/plugins @@ -216,7 +216,7 @@ obj_to_log = {"hello": "world"} influxdb3_local.info("This is an info message with an object", obj_to_log) ``` -### Trigger Arguments +### Trigger arguments Every plugin type can receive arguments from the configuration of the trigger that runs it. You can use this to provide runtime configuration and drive behavior of a plugin—for example: @@ -239,7 +239,7 @@ def process_writes(influxdb3_local, table_batches, args=None): The `args` parameter is optional. If a plugin doesn’t require arguments, you can omit it from the trigger definition. -## Import Plugin Dependencies +## Import plugin dependencies Use the `influxdb3 install` command to download and install Python packages that your plugin depends on. 
@@ -278,10 +278,10 @@ For more information, see the `influxdb3` CLI help: influxdb3 install package --help ``` -## Trigger Types and How They Work +## Configure plugin triggers Triggers define when and how plugins execute in response to database events. Each trigger type corresponds to a specific event, allowing precise control over automation within {{% product-name %}}. -### WAL Flush Trigger +### WAL flush trigger When a WAL flush plugin is triggered, the plugin receives a list of `table_batches` filtered by the trigger configuration (either _all tables_ in the database or a specific table). @@ -337,9 +337,9 @@ For more information about trigger arguments, see the CLI help: influxdb3 create trigger help ``` -### Schedule Trigger +### Schedule trigger -Schedule plugins run on a schedule specified in cron syntax. The plugin will receive the local API, the time of the trigger, and any arguments passed in the trigger definition. Here's an example of a simple schedule plugin: +Schedule plugins run on a schedule specified in cron syntax. The plugin receives the local API, the time of the trigger, and any arguments passed in the trigger definition. Here's an example of a simple schedule plugin: ```python # see if a table has been written to in the last 5 minutes @@ -392,9 +392,9 @@ def process_request(influxdb3_local, query_parameters, request_headers, request_ return 200, {"Content-Type": "application/json"}, json.dumps({"status": "ok", "line": line_str}) ``` -#### On request trigger configuration +#### On-request trigger configuration -On Request plugins are set with a `trigger-spec` of `request:`. The `args` parameter can be used to pass configuration to the plugin. For example, if we wanted the above plugin to run on the endpoint `/api/v3/engine/my_plugin`, we would use `request:my_plugin` as the `trigger-spec`. +On-request plugins are set with a `trigger-spec` of `request:`. The `args` parameter can be used to pass configuration to the plugin. 
For example, if you want the above plugin to run on the endpoint `/api/v3/engine/my_plugin`, use `request:my_plugin` as the `trigger-spec`.

Because all on-request triggers share the same endpoint path, trigger specs must be unique across all configured plugins, regardless of which database they are tied to. Here's an example that creates a request trigger tied to the "hello-world" path using a plugin in the plugin directory:

From a39644200e6724ca5350390ec048e8393a602a17 Mon Sep 17 00:00:00 2001
From: Scott Anderson 
Date: Fri, 21 Feb 2025 15:31:55 -0700
Subject: [PATCH 06/27] updated clustered upgrade, closes influxdata/DAR#479,
 closes influxdata/DAR#480

---
 content/influxdb3/clustered/admin/upgrade.md |  6 +++
 .../reference/release-notes/clustered.md     | 53 +++++++++----------
 2 files changed, 30 insertions(+), 29 deletions(-)

diff --git a/content/influxdb3/clustered/admin/upgrade.md b/content/influxdb3/clustered/admin/upgrade.md
index aecdc3ed2..5b79d9eea 100644
--- a/content/influxdb3/clustered/admin/upgrade.md
+++ b/content/influxdb3/clustered/admin/upgrade.md
@@ -15,10 +15,16 @@ related:
---

Use Kubernetes to upgrade your InfluxDB Clustered version.
The upgrade is carried out using in-place updates, ensuring minimal downtime.

InfluxDB Clustered versioning is defined in the `AppInstance`
`CustomResourceDefinition` (CRD) in your
[`myinfluxdb.yml`](/influxdb3/clustered/install/set-up-cluster/configure-cluster/).

> [!Important]
> InfluxDB Clustered does not support downgrading.
> If you encounter an issue after upgrading,
> [contact InfluxData support](mailto:support@influxdata.com).
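To make the versioning location concrete, the following is a minimal sketch of the relevant part of `myinfluxdb.yml`. The field layout, namespace, and image repository shown here are illustrative assumptions; check your actual `AppInstance` resource for the exact structure:

```yaml
# Hypothetical excerpt of myinfluxdb.yml.
# The package image tag (for example, 20241024-1354148) is what
# defines the InfluxDB Clustered version running in the cluster.
apiVersion: kubecfg.dev/v1alpha1
kind: AppInstance
metadata:
  name: influxdb
  namespace: influxdb
spec:
  package:
    # Upgrading the cluster means updating this image tag to a newer release.
    image: us-docker.pkg.dev/influxdb2-artifacts/clustered/influxdb:20241024-1354148
```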
+ - [Version format](#version-format) - [Upgrade your InfluxDB Clustered version](#upgrade-your-influxdb-clustered-version) diff --git a/content/influxdb3/clustered/reference/release-notes/clustered.md b/content/influxdb3/clustered/reference/release-notes/clustered.md index 70bf67d09..c33a73018 100644 --- a/content/influxdb3/clustered/reference/release-notes/clustered.md +++ b/content/influxdb3/clustered/reference/release-notes/clustered.md @@ -10,17 +10,16 @@ menu: weight: 201 --- -{{% note %}} -## Checkpoint releases {.checkpoint} - -Some InfluxDB Clustered releases are checkpoint releases that introduce a -breaking change to an InfluxDB component. -When [upgrading InfluxDB Clustered](/influxdb3/clustered/admin/upgrade/), -**always upgrade to each checkpoint release first, before proceeding to newer versions**. - -Checkpoint releases are only made when absolutely necessary and are clearly -identified below with the icon. -{{% /note %}} +> [!Note] +> ## Checkpoint releases {.checkpoint} +> +> Some InfluxDB Clustered releases are checkpoint releases that introduce a +> breaking change to an InfluxDB component. +> When [upgrading InfluxDB Clustered](/influxdb3/clustered/admin/upgrade/), +> **always upgrade to each checkpoint release first, before proceeding to newer versions**. +> +> Checkpoint releases are only made when absolutely necessary and are clearly +> identified below with the icon. {{< release-toc >}} @@ -181,11 +180,10 @@ For customers who experience this bug when attempting to upgrade to ## 20240925-1257864 {date="2024-09-25" .checkpoint} -{{% warn %}} -This release has a number of bugs in it which make it unsuitable for customer use. -If you are currently running this version, please upgrade to -[20241024-1354148](#20241024-1354148). -{{% /warn %}} +> [!Caution] +> This release has a number of bugs in it which make it unsuitable for customer use. +> If you are currently running this version, please upgrade to +> [20241024-1354148](#20241024-1354148). 
### Quickstart @@ -352,10 +350,9 @@ validation error when omitted. When the `admin` section is omitted, the `admin-token` `Secret` can be used instead to get started quickly. -{{% note %}} -We still highly recommend OAuth for production; however, this lets you run an -InfluxDB Cluster with out having to integrate with an identity provider.** -{{% /note %}} +> [!Note] +> We still highly recommend OAuth for production; however, this lets you run an +> InfluxDB Cluster with out having to integrate with an identity provider.** ### Upgrade notes @@ -680,11 +677,10 @@ Kubernetes scheduler's default behavior. For further details, please consult the - Fix gRPC reflection to only include services served by a particular listening port. - {{% note %}} - `arrow.flight.protocol.FlightService` is known to be missing in the - `iox-shared-querier`'s reflection service even though `iox-shared-querier` - does run that gRPC service. - {{% /note %}} + > [!Note] + > `arrow.flight.protocol.FlightService` is known to be missing in the + > `iox-shared-querier`'s reflection service even though `iox-shared-querier` + > does run that gRPC service. 
--- @@ -889,10 +885,9 @@ spec: ### Highlights -{{% warn %}} -**This release fixes a regression in the database engine that was introduced in -[20231115-746129](#20231115-746129).** -{{% /warn %}} +> ![Important] +> **This release fixes a regression in the database engine that was introduced in +> [20231115-746129](#20231115-746129).** ### Changes From ae32fa01f981623948f261c4ca57ea8e6b54de6d Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 09:16:58 -0600 Subject: [PATCH 07/27] Update content/influxdb3/clustered/reference/release-notes/clustered.md --- .../influxdb3/clustered/reference/release-notes/clustered.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/influxdb3/clustered/reference/release-notes/clustered.md b/content/influxdb3/clustered/reference/release-notes/clustered.md index c33a73018..200496b09 100644 --- a/content/influxdb3/clustered/reference/release-notes/clustered.md +++ b/content/influxdb3/clustered/reference/release-notes/clustered.md @@ -351,8 +351,8 @@ When the `admin` section is omitted, the `admin-token` `Secret` can be used instead to get started quickly. > [!Note] -> We still highly recommend OAuth for production; however, this lets you run an -> InfluxDB Cluster with out having to integrate with an identity provider.** +> We recommend OAuth for production; however, the `admin-token` lets you run an +> InfluxDB Cluster without having to integrate with an identity provider.** ### Upgrade notes From b527eb1f2d7db4747cbaca94b1e4e1d909764b62 Mon Sep 17 00:00:00 2001 From: WeblWabl Date: Mon, 24 Feb 2025 09:23:18 -0600 Subject: [PATCH 08/27] feat: Update config-data-nodes for query logging This PR updates the data config documentation to include information about `query-log-path`. 
--- .../configure/config-data-nodes.md | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md index f45460a4d..51379b6ae 100644 --- a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md +++ b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md @@ -303,6 +303,29 @@ Very useful for troubleshooting, but will log any sensitive data contained withi Environment variable: `INFLUXDB_DATA_QUERY_LOG_ENABLED` +#### query-log-path + +Default is `""`. + +Whether queries should be logged to a file at a given path. +If the value is set to `""` (default) queries are not logged to a file. +Please make sure you are using the absolute path when configuring this. +We support `SIGHUP` based log rotation. The following is an example of a `logrotate` configuration: + +``` +/var/log/influxdb/queries.log { + rotate 5 + daily + compress + missingok + notifempty + create 644 root root + postrotate + /bin/kill -HUP `pgrep -x influxd` + endscript +} +``` + #### wal-fsync-delay Default is `"0s"`. 
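Returning to the `query-log-path` option above: the following is a sketch of how it might look in a data node's configuration file. The `[data]` section placement follows the surrounding options in this document (such as `query-log-enabled`), but treat the exact layout as an assumption and verify it against your generated configuration file:

```toml
[data]
  # Enable query logging and write queries to an absolute file path.
  query-log-enabled = true
  query-log-path = "/var/log/influxdb/queries.log"
```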
From 15b2a76c36edc4c41794e7b29150015839ad4d5d Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 10:01:39 -0600 Subject: [PATCH 09/27] Update content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md --- .../v1/administration/configure/config-data-nodes.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md index 51379b6ae..41f3d67d8 100644 --- a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md +++ b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md @@ -305,10 +305,8 @@ Environment variable: `INFLUXDB_DATA_QUERY_LOG_ENABLED` #### query-log-path -Default is `""`. - -Whether queries should be logged to a file at a given path. -If the value is set to `""` (default) queries are not logged to a file. +An absolute path to the query log file. +The default is `""` (queries aren't logged to a file). Please make sure you are using the absolute path when configuring this. We support `SIGHUP` based log rotation. 
The following is an example of a `logrotate` configuration: From cbe67338e523c103ebf7d12c7563d0025f630218 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 10:01:46 -0600 Subject: [PATCH 10/27] Update content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md --- .../v1/administration/configure/config-data-nodes.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md index 41f3d67d8..b96b71e9d 100644 --- a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md +++ b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md @@ -307,7 +307,8 @@ Environment variable: `INFLUXDB_DATA_QUERY_LOG_ENABLED` An absolute path to the query log file. The default is `""` (queries aren't logged to a file). -Please make sure you are using the absolute path when configuring this. + + We support `SIGHUP` based log rotation. The following is an example of a `logrotate` configuration: ``` From da0cfc1871b3530daa4bd5251904935454e89fc2 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 10:01:53 -0600 Subject: [PATCH 11/27] Update content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md --- .../v1/administration/configure/config-data-nodes.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md index b96b71e9d..b21f63774 100644 --- a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md +++ b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md @@ -309,7 +309,8 @@ An absolute path to the query log file. The default is `""` (queries aren't logged to a file). 
-We support `SIGHUP` based log rotation. The following is an example of a `logrotate` configuration: +Query logging supports SIGHUP-based log rotation. +The following is an example of a `logrotate` configuration: ``` /var/log/influxdb/queries.log { From 3b346d927d22718bd0ab76dc7d00c68162ac42e7 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 10:03:03 -0600 Subject: [PATCH 12/27] Update content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md --- .../v1/administration/configure/config-data-nodes.md | 1 + 1 file changed, 1 insertion(+) diff --git a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md index b21f63774..6eb3b4ec1 100644 --- a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md +++ b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md @@ -310,6 +310,7 @@ The default is `""` (queries aren't logged to a file). Query logging supports SIGHUP-based log rotation. + The following is an example of a `logrotate` configuration: ``` From b7dee87b03542d3838889a71d32190cb9df854ab Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 10:03:10 -0600 Subject: [PATCH 13/27] Update content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md --- .../v1/administration/configure/config-data-nodes.md | 1 - 1 file changed, 1 deletion(-) diff --git a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md index 6eb3b4ec1..c45d343f7 100644 --- a/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md +++ b/content/enterprise_influxdb/v1/administration/configure/config-data-nodes.md @@ -308,7 +308,6 @@ Environment variable: `INFLUXDB_DATA_QUERY_LOG_ENABLED` An absolute path to the query log file. 
The default is `""` (queries aren't logged to a file). - Query logging supports SIGHUP-based log rotation. The following is an example of a `logrotate` configuration: From 01236c679140a93078a4787a5dffc34c52a40411 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 12:44:13 -0600 Subject: [PATCH 14/27] fix(sql): Apply suggestions from code review. --- .../shared/sql-reference/functions/window.md | 151 +++++++++++------- 1 file changed, 90 insertions(+), 61 deletions(-) diff --git a/content/shared/sql-reference/functions/window.md b/content/shared/sql-reference/functions/window.md index bd79f77b7..d31457309 100644 --- a/content/shared/sql-reference/functions/window.md +++ b/content/shared/sql-reference/functions/window.md @@ -1,10 +1,8 @@ -A _window function_ performs an operation across a set of rows related to the -current row. This is similar to the type of operations -[aggregate functions](/influxdb3/cloud-dedicated/reference/sql/functions/aggregate/) -perform. However, window functions do not return a single output row per group -like non-window aggregate functions do. Instead, rows retain their separate -identities. +Window functions let you calculate running totals, moving averages, or other aggregate-like results without collapsing rows into groups. +They perform their calculations over a “window” of rows, which you can partition and order in various ways, and return a calculated value for each row in the set. + +Unlike non-window [aggregate functions](/influxdb3/version/reference/sql/functions/aggregate/) that combine each group into a single row, window functions preserve each row’s identity and calculate an additional value for every row in the partition. 
For example, the following query uses the {{< influxdb3/home-sample-link >}} and returns each temperature reading with the average temperature per room over @@ -65,9 +63,10 @@ ORDER BY As window functions operate on a row, there is a set of rows in the row's partition that the window function uses to perform the operation. This set of -rows is called the _window frame_. Window frame boundaries can be defined using +rows is called the _window frame_. +Window frame boundaries can be defined using `RANGE`, `ROW`, or `GROUPS` frame units, each relative to the current row--for -exmaple: +example: {{< code-tabs-wrapper >}} {{% code-tabs %}} @@ -117,11 +116,9 @@ FROM home _For more information about how window frames work, see the [frame clause](#frame-clause)._ -If window frames are not defined, window functions use all rows in the current +If you don't specify window frames, window functions use all rows in the current partition to perform their operation. -## Window function syntax - ```sql function([expr]) OVER( @@ -133,10 +130,10 @@ function([expr]) ### OVER clause -Window functions use an `OVER` clause directly following the window function's -name and arguments. The `OVER` clause syntactically distinguishes a window -function from a normal function or non-window aggregate function and determines -how rows are split up for the window operation. +Window functions use an `OVER` clause that directly follows the window function's +name and arguments. +The `OVER` clause syntactically distinguishes a window +function from a non-window or aggregate function and defines how to group and order rows for the window operation. ### PARTITION BY clause @@ -154,13 +151,13 @@ may be explicit or implicit, limiting a window frame size in both directions relative to the current row. > [!Note] -> The `ORDER BY` clause in an `OVER` clause is separate from the `ORDER BY` -> clause of the query and only determines the order that rows in each partition -> are processed in. 
+> The `ORDER BY` clause in an `OVER` clause determines the processing order for +> rows in each partition and is separate from the `ORDER BY` +> clause of the query. ### Frame clause -The frame clause defines window frame boundaries and can be one of the following: +The _frame clause_ defines window frame boundaries and can be one of the following: ```sql { RANGE | ROWS | GROUPS } frame_start @@ -196,7 +193,7 @@ the current row value. > When using `RANGE` frame units, you must include an `ORDER BY` clause with > _exactly one column_. -The offset is the difference the between the current row value and surrounding +The offset is the difference between the current row value and surrounding row values. `RANGE` supports the following offset types: - Numeric _(non-negative)_ @@ -207,7 +204,7 @@ row values. `RANGE` supports the following offset types: {{% expand "See how `RANGE` frame units work with numeric offsets" %}} To use a numeric offset with the `RANGE` frame unit, you must sort partitions -by a numeric-typed column. +by a numeric-type column. ```sql ... OVER ( @@ -226,7 +223,7 @@ The window frame includes rows with sort column values between 45 below and {{% expand "See how `RANGE` frame units work with interval offsets" %}} To use an interval offset with the `RANGE` frame unit, you must sort partitions -by `time` or a timestamp-typed column. +by `time` or a timestamp-type column. ```sql ... OVER ( @@ -251,7 +248,7 @@ one hour after the current row's timestamp: ##### ROWS -Defines frame boundaries using row positions relative to the current row. +Defines window frame boundaries using row positions relative to the current row. The offset is the difference in row position from the current row. `ROWS` supports the following offset types: @@ -279,14 +276,14 @@ The window frame includes the two rows before and the one row after the current ##### GROUPS -Defines frame boundaries using row groups. +Defines window frame boundaries using row groups. 
Rows with the same values for the columns in the [`ORDER BY` clause](#order-by-clause) comprise a row group. > [!Important] -> When using `GROUPS` frame units, you must include an `ORDER BY` clause. +> When using `GROUPS` frame units, include an `ORDER BY` clause. -The offset is the difference in row group position relative to the the current row group. +The offset is the difference in row group position relative to the current row group. `GROUPS` supports the following offset types: - Numeric _(non-negative)_ @@ -319,8 +316,7 @@ You can then use group offsets to determine frame boundaries: ) ``` -The window function uses all rows in the two row groups before the current -row group and the current row group to perform the operation: +The window function uses all rows in the current row group and the two preceding row groups to perform the operation: {{< sql/window-frame-units "groups with frame" >}} @@ -330,45 +326,75 @@ row group and the current row group to perform the operation: #### Frame boundaries Frame boundaries (**frame_start** and **frame_end**) define the boundaries of -each frame the window function operates on. Use the following to define -frame boundaries: +each frame that the window function operates on. -```sql -UNBOUNDED PRECEDING -offset PRECEDING -CURRENT ROW -offset FOLLOWING -UNBOUNDED FOLLOWING -``` +- [UNBOUNDED PRECEDING](#unbounded-preceding) +- [offset PRECEDING](#offset-preceding) +- CURRENT_ROW](#current-row) +- [offset> FOLLOWING](#offset-following) +- [UNBOUNDED FOLLOWING](#unbounded-following) ##### UNBOUNDED PRECEDING -Use the beginning of the partition to the current row as the frame boundary. +Starts at the first row of the partition and ends at the current row. + +```sql +UNBOUNDED PRECEDING +``` ##### offset PRECEDING -Use a specified offset of [frame units](#frame-units) _before_ the current row -as a frame boundary. +Starts at `offset` [frame units](#frame-units) before the current row and ends at the current row. 
+For example, `3 PRECEDING` includes 3 rows before the current row.
+
+```sql
+offset PRECEDING
+```

##### CURRENT ROW

-Use the current row as a frame boundary.
+Both starts and ends at the current row when used as a boundary.
+
+```sql
+CURRENT ROW
+```
+
+##### offset FOLLOWING
+
+Starts at the current row and ends at `offset` [frame units](#frame-units) after the current row.
+For example, `3 FOLLOWING` includes 3 rows after the current row.
+
+```sql
+offset FOLLOWING
+```
+
+##### UNBOUNDED FOLLOWING
+
+Starts at the current row and ends at the last row of the partition.
+
+```sql
+UNBOUNDED FOLLOWING
+```

-##### offset FOLLOWING
-
-Use a specified offset of [frame units](#frame-units) _after_ the current row
-as a frame boundary.
-
-##### UNBOUNDED FOLLOWING
-
-Use the current row to the end of the current partition the frame boundary.

### WINDOW clause

-When a query has multiple window functions that use the same window, rather than
-writing each with a separate `OVER` clause (which is duplicative and error-prone),
-use the `WINDOW` clause to define the window and then reference the window alias
-in each `OVER` clause--for example:
+Use the `WINDOW` clause to define a reusable alias for a window specification.
+This is useful when multiple window functions in your query share the same window definition.
+
+Instead of repeating the same `OVER` clause for each function,
+define the window once and reference it by alias--for example:

```sql
SELECT
@@ -400,16 +426,15 @@ can be used as window functions.

Returns the cumulative distribution of a value within a group of values.
The returned value is greater than 0 and less than or equal to 1 and represents
the relative rank of the value in the set of values.
-The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
-ranking order.
+The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause is used
+to correctly calculate the cumulative distribution of the current row value.
```sql cume_dist() ``` > [!Important] -> `cume_dist` needs an [`ORDER BY` clause](#order-by-clause) in the `OVER` clause -> to correctly calculate the cumulative distribution of the current row value. +> When using `cume_dist`, include an [`ORDER BY` clause](#order-by-clause) in the `OVER` clause. {{< expand-wrapper >}} {{% expand "View `cume_dist` query example" %}} @@ -451,8 +476,9 @@ WHERE ### dense_rank -Returns the rank of the current row without gaps. This function ranks rows in a -dense manner, meaning consecutive ranks are assigned even for identical values. +Returns a rank for each row without gaps in the numbering. +Unlike [rank()](#rank), this function assigns consecutive ranks even when values +are identical. The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines ranking order. @@ -500,9 +526,9 @@ WHERE ### ntile -Distributes the rows in an ordered partition into a specified number of groups. -Each group is numbered, starting at one. For each row, `ntile` returns the -group number to which the row belongs. +Distributes the rows in an ordered partition into the specified number of groups. +Each group is numbered, starting at one. +For each row, `ntile` returns the group number to which the row belongs. Group numbers range from 1 to the `expression` value, dividing the partition as equally as possible. The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines @@ -514,8 +540,7 @@ ntile(expression) ##### Arguments -- **expression**: An integer describing the number groups to split the partition - into. +- **expression**: An integer. The number of groups to split the partition into. {{< expand-wrapper >}} {{% expand "View `ntile` query example" %}} @@ -556,10 +581,14 @@ WHERE ### percent_rank Returns the percentage rank of the current row within its partition. -The returned value is between `0` and `1` and is computed as -`(rank - 1) / (total_rows - 1)`. 
+The returned value is between `0` and `1`, computed as: + +``` +(rank - 1) / (total_rows - 1) +``` + The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines -ranking order. +the ranking order. ```sql percent_rank() @@ -760,7 +789,7 @@ ORDER BY room, time ### lag Returns the value from the row that is at the specified offset before the -current row in the partition. If the offset row is outside of the partition, +current row in the partition. If the offset row is outside the partition, the function returns the specified default. ```sql @@ -876,7 +905,7 @@ ORDER BY room, time ### lead Returns the value from the row that is at the specified offset after the -current row in the partition. If the offset row is outside of the partition, +current row in the partition. If the offset row is outside the partition, the function returns the specified default. ```sql From 733bd673d61f67d6f14dbc6dc88bd1f6d4ab4753 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 13:36:28 -0600 Subject: [PATCH 15/27] fix: Remove double slash --- layouts/shortcodes/influxdb3/home-sample-link.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/layouts/shortcodes/influxdb3/home-sample-link.html b/layouts/shortcodes/influxdb3/home-sample-link.html index 32977b266..dec324560 100644 --- a/layouts/shortcodes/influxdb3/home-sample-link.html +++ b/layouts/shortcodes/influxdb3/home-sample-link.html @@ -2,7 +2,7 @@ {{- $product := index $productPathData 2 -}} {{- $isDistributed := in (slice "cloud-dedicated" "cloud-serverless" "clustered") $product -}} {{- if $isDistributed -}} -Get started home sensor sample data +Get started home sensor sample data {{- else -}} -Home sensor sample data +Home sensor sample data {{- end -}} From ba7f10944a192cfe5d7d786c8f42a3e4d8f0445e Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 13:37:15 -0600 Subject: [PATCH 16/27] feat(sql): Add Window aggregate and Ranking functions to SQL reference 
index
---
 content/shared/sql-reference/_index.md | 52 ++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/content/shared/sql-reference/_index.md b/content/shared/sql-reference/_index.md
index 6f82512ab..8b1fef900 100644
--- a/content/shared/sql-reference/_index.md
+++ b/content/shared/sql-reference/_index.md
@@ -582,6 +582,58 @@ FROM "h2o_feet"
GROUP BY "location"
```

+### Window aggregate functions
+
+Window functions let you calculate running totals, moving averages, or other
+aggregate-like results without collapsing rows into groups
+(unlike non-window aggregate functions).
+
+Window aggregate functions include **all [aggregate functions](#aggregate-functions)**
+and the [ranking functions](#ranking-functions).
+The SQL `OVER` clause syntactically distinguishes a window
+function from a normal function or non-window aggregate function and defines how to group and
+order rows for the window operation.
+
+#### Examples
+
+{{% influxdb/custom-timestamps %}}
+
+```sql
+SELECT
+  time,
+  room,
+  temp,
+  avg(temp) OVER (PARTITION BY room) AS avg_room_temp
+FROM
+  home
+WHERE
+  time >= '2022-01-01T08:00:00Z'
+  AND time <= '2022-01-01T09:00:00Z'
+ORDER BY
+  room,
+  time
+```
+
+| time | room | temp | avg_room_temp |
+| :------------------ | :---------- | ---: | ------------: |
+| 2022-01-01T08:00:00 | Kitchen | 21.0 | 22.0 |
+| 2022-01-01T09:00:00 | Kitchen | 23.0 | 22.0 |
+| 2022-01-01T08:00:00 | Living Room | 21.1 | 21.25 |
+| 2022-01-01T09:00:00 | Living Room | 21.4 | 21.25 |
+
+{{% /influxdb/custom-timestamps %}}
+
+#### Ranking functions
+
+| Function | Description |
+| :------- | :--------------------------------------------------------- |
+| CUME_DIST() | Returns the cumulative distribution of a value within a group of values |
+| DENSE_RANK() | Returns a rank for each row without gaps in the numbering |
+| NTILE() | Distributes the rows in an ordered partition into the specified number of groups |
+| PERCENT_RANK() | Returns the percentage rank of the
current row within its partition | +| RANK() | Returns the rank of the current row in its partition, allowing gaps between ranks | +| ROW_NUMBER() | Returns the position of the current row in its partition | + ### Selector functions Selector functions are unique to InfluxDB. They behave like aggregate functions in that they take a row of data and compute it down to a single value. However, selectors are unique in that they return a **time value** in addition to the computed value. In short, selectors return an aggregated value along with a timestamp. From 28018666a40ba830e71fce43cd3620c73c3ef51e Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 14:40:35 -0600 Subject: [PATCH 17/27] Apply suggestions from code review --- content/shared/v3-core-plugins/_index.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 6d4b28930..48071ef8a 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -30,7 +30,9 @@ The Processing engine provides four types of triggers—each type corresponds to - **WAL Flush**: Triggered when the write-ahead log (WAL) is flushed to the object store (default is every second). - **Scheduled Tasks**: Triggered on a schedule you specify using cron syntax. - **On-request**: Triggered on a GET or POST request to the bound HTTP API endpoint at `/api/v3/engine/`. 
+ ### Activate the Processing engine From 72d078ffbac5d7f19c38885ac2113a093e361610 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 14:42:11 -0600 Subject: [PATCH 18/27] Update content/shared/v3-core-plugins/_index.md --- content/shared/v3-core-plugins/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 48071ef8a..89d75c64f 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -394,7 +394,7 @@ def process_request(influxdb3_local, query_parameters, request_headers, request_ return 200, {"Content-Type": "application/json"}, json.dumps({"status": "ok", "line": line_str}) ``` -#### On-request trigger configuration +#### On Request trigger configuration On-request plugins are set with a `trigger-spec` of `request:`. The `args` parameter can be used to pass configuration to the plugin. For example, if we wanted the above plugin to run on the endpoint `/api/v3/engine/my_plugin`, we would use `request:my_plugin` as the `trigger-spec`. From 5976a648446a5c1c20258f4085a754a6a54d1bf8 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 14:42:19 -0600 Subject: [PATCH 19/27] Update content/shared/v3-core-plugins/_index.md --- content/shared/v3-core-plugins/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 89d75c64f..1e7200d6a 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -367,7 +367,7 @@ influxdb3 create trigger \ --database mydb system-metrics ``` -### On Request Trigger +### On Request trigger On Request plugins are triggered by a request to a specific endpoint under `/api/v3/engine`. 
The plugin will receive the local API, query parameters `Dict[str, str]`, request headers `Dict[str, str]`, request body (as bytes), and any arguments passed in the trigger definition. Here's an example of a simple On Request plugin:

From 4f5122da92ef080f9b5ff0002dcf4171b1273dd5 Mon Sep 17 00:00:00 2001
From: Jason Stirnaman
Date: Mon, 24 Feb 2025 14:42:35 -0600
Subject: [PATCH 20/27] Update content/shared/v3-core-plugins/_index.md

---
 content/shared/v3-core-plugins/_index.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md
index 1e7200d6a..8d80ba8ec 100644
--- a/content/shared/v3-core-plugins/_index.md
+++ b/content/shared/v3-core-plugins/_index.md
@@ -396,7 +396,15 @@ def process_request(influxdb3_local, query_parameters, request_headers, request_

 #### On Request trigger configuration

-On-request plugins are set with a `trigger-spec` of `request:`. The `args` parameter can be used to pass configuration to the plugin. For example, if we wanted the above plugin to run on the endpoint `/api/v3/engine/my_plugin`, we would use `request:my_plugin` as the `trigger-spec`.
+**On Request** plugins are defined using the `request:` trigger-spec.
+
+For example, the following command creates an `/api/v3/engine/my_plugin` endpoint that runs a `/examples/my-on-request.py` plugin:
+
+```bash
+influxdb3 create trigger \
+  --trigger-spec "request:my_plugin" \
+  --plugin-filename "examples/my-on-request.py" \
+  --database mydb my-plugin
+```

 Trigger specs must be unique across all configured plugins, regardless of which database they are tied to, given the path is the same.
Here's an example to create a request trigger tied to the "hello-world' path using a plugin in the plugin-dir: From 49c055fd0beb9bbfe351814745c30c4e939906cb Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 14:43:15 -0600 Subject: [PATCH 21/27] Update content/shared/v3-core-plugins/_index.md --- content/shared/v3-core-plugins/_index.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 8d80ba8ec..4550fd3aa 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -369,7 +369,10 @@ influxdb3 create trigger \ ### On Request trigger -On Request plugins are triggered by a request to a specific endpoint under `/api/v3/engine`. The plugin will receive the local API, query parameters `Dict[str, str]`, request headers `Dict[str, str]`, request body (as bytes), and any arguments passed in the trigger definition. Here's an example of a simple On Request plugin: +On Request plugins are triggered by a request to a specific endpoint under `/api/v3/engine`. The plugin receives the shared API, query parameters `Dict[str, str]`, request headers `Dict[str, str]`, the request body (as bytes), and any arguments passed in the trigger definition. 
+ +#### Example: simple On Request plugin + ```python import json From d8d1d092903e41e0deb0713bfc21411a89092886 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 14:43:23 -0600 Subject: [PATCH 22/27] Update content/shared/v3-core-plugins/_index.md --- content/shared/v3-core-plugins/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 4550fd3aa..485c471dd 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -409,7 +409,7 @@ influxdb3 create trigger \ --plugin-filename "examples/my-on-request.py" \ --database mydb my-plugin -Trigger specs must be unique across all configured plugins, regardless of which database they are tied to, given the path is the same. Here's an example to create a request trigger tied to the "hello-world' path using a plugin in the plugin-dir: +Because all On Request plugins share the same root URL, trigger specs must be unique across all plugins configured for a server, regardless of which database they are associated with. ```shell influxdb3 create trigger \ From 90eb51e44734d325c9e08a2005692ef46913ee23 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 16:28:38 -0600 Subject: [PATCH 23/27] fix(v3): Core and Enterprise Get Started: fix bad interval syntax, update processing engine overview - Fix interval syntax mentioned in https://influxdata.slack.com/archives/C084G9LR2HL/p1740171190161249. - Update processing engine overview and link to guide. 
---
 content/influxdb3/core/plugins.md | 8 ++-
 content/influxdb3/enterprise/plugins.md | 10 ++-
 content/shared/v3-core-get-started/_index.md | 68 +++++++------------
 content/shared/v3-core-plugins/_index.md | 48 +++++++------
 .../v3-enterprise-get-started/_index.md | 62 +++++++----------
 5 files changed, 89 insertions(+), 107 deletions(-)

diff --git a/content/influxdb3/core/plugins.md b/content/influxdb3/core/plugins.md
index 86579fe90..84e8cb4d4 100644
--- a/content/influxdb3/core/plugins.md
+++ b/content/influxdb3/core/plugins.md
@@ -1,11 +1,15 @@
 ---
-title: Python Plugins and Processing Engine
+title: Processing engine and Python plugins
 description: Use the Python processing engine to trigger and execute custom code on different events in an {{< product-name >}} instance.
 menu:
   influxdb3_core:
-    name: Processing Engine and Python Plugins
+    name: Processing engine and Python plugins
 weight: 4
 influxdb3/core/tags: []
+related:
+- /influxdb3/core/reference/cli/influxdb3/test/wal_plugin/
+- /influxdb3/core/reference/cli/influxdb3/create/plugin/
+- /influxdb3/core/reference/cli/influxdb3/create/trigger/
 source: /shared/v3-core-plugins/_index.md
 ---

diff --git a/content/influxdb3/enterprise/plugins.md b/content/influxdb3/enterprise/plugins.md
index b9a15185f..f9b876877 100644
--- a/content/influxdb3/enterprise/plugins.md
+++ b/content/influxdb3/enterprise/plugins.md
@@ -1,11 +1,15 @@
 ---
-title: Python Plugins and Processing Engine
+title: Processing engine and Python plugins
 description: Use the Python processing engine to trigger and execute custom code on different events in an {{< product-name >}} instance.
 menu:
   influxdb3_enterprise:
-    name: Processing Engine and Python Plugins
+    name: Processing engine and Python plugins
 weight: 4
-influxdb3/enterprise/tags: []
+influxdb3/enterprise/tags: []
+related:
+- /influxdb3/enterprise/reference/cli/influxdb3/test/wal_plugin/
+- /influxdb3/enterprise/reference/cli/influxdb3/create/plugin/
+- /influxdb3/enterprise/reference/cli/influxdb3/create/trigger/
 source: /shared/v3-core-plugins/_index.md
 ---

diff --git a/content/shared/v3-core-get-started/_index.md b/content/shared/v3-core-get-started/_index.md
index c216759bb..a51cfbc26 100644
--- a/content/shared/v3-core-get-started/_index.md
+++ b/content/shared/v3-core-get-started/_index.md
@@ -135,8 +135,12 @@ source ~/.zshrc
To start your InfluxDB instance, use the `influxdb3 serve` command
and provide the following:

-- `--object-store`: Specifies the type of Object store to use. InfluxDB supports the following: local file system (`file`), `memory`, S3 (and compatible services like Ceph or Minio) (`s3`), Google Cloud Storage (`google`), and Azure Blob Storage (`azure`).
-- `--node-id`: A string identifier that determines the server's storage path within the configured storage location
+- `--object-store`: Specifies the type of Object store to use.
+  InfluxDB supports the following: local file system (`file`), `memory`,
+  S3 (and compatible services like Ceph or Minio) (`s3`),
+  Google Cloud Storage (`google`), and Azure Blob Storage (`azure`).
+- `--node-id`: A string identifier that determines the server's storage path
+  within the configured storage location, and, in a multi-node setup, is used to reference the node.

The following examples show how to start InfluxDB 3 with different object store configurations:

@@ -216,7 +220,7 @@ InfluxDB is a schema-on-write database. You can start writing data and InfluxDB
After a schema is created, InfluxDB validates future write requests against it before accepting the data.
Subsequent requests can add new fields on-the-fly, but can't add new tags. -InfluxDB 3 Core is optimized for recent data, but accepts writes from any time period. It persists that data in Parquet files for access by third-party systems for longer term historical analysis and queries. If you require longer historical queries with a compactor that optimizes data organization, consider using [InfluxDB 3 Enterprise](/influxdb3/enterprise/get-started/). +{{% product-name %}} is optimized for recent data, but accepts writes from any time period. It persists that data in Parquet files for access by third-party systems for longer term historical analysis and queries. If you require longer historical queries with a compactor that optimizes data organization, consider using [InfluxDB 3 Enterprise](/influxdb3/enterprise/get-started/). The database has three write API endpoints that respond to HTTP `POST` requests: @@ -278,7 +282,7 @@ With `accept_partial=true`: ``` Line `1` is written and queryable. -The response is an HTTP error (`400`) status, and the response body contains `partial write of line protocol occurred` and details about the problem line. +The response is an HTTP error (`400`) status, and the response body contains the error message `partial write of line protocol occurred` with details about the problem line. ##### Parsing failed for write_lp endpoint @@ -323,7 +327,7 @@ For more information, see [diskless architecture](#diskless-architecture). > Because InfluxDB sends a write response after the WAL file has been flushed to the configured object store (default is every second), individual write requests might not complete quickly, but you can make many concurrent requests to achieve higher total throughput. > Future enhancements will include an API parameter that lets requests return without waiting for the WAL flush. 
-#### Create a database or Table +#### Create a database or table To create a database without writing data, use the `create` subcommand--for example: @@ -340,9 +344,10 @@ influxdb3 create -h ### Query the database InfluxDB 3 now supports native SQL for querying, in addition to InfluxQL, an -SQL-like language customized for time series queries. {{< product-name >}} limits -query time ranges to 72 hours (both recent and historical) to ensure query performance. +SQL-like language customized for time series queries. +{{< product-name >}} limits +query time ranges to 72 hours (both recent and historical) to ensure query performance. For more information about the 72-hour limitation, see the [update on InfluxDB 3 Core’s 72-hour limitation](https://www.influxdata.com/blog/influxdb3-open-source-public-alpha-jan-27/). @@ -400,7 +405,7 @@ $ influxdb3 query --database=servers "SELECT DISTINCT usage_percent, time FROM c ### Querying using the CLI for InfluxQL -[InfluxQL](/influxdb3/core/reference/influxql/) is an SQL-like language developed by InfluxData with specific features tailored for leveraging and working with InfluxDB. It’s compatible with all versions of InfluxDB, making it a good choice for interoperability across different InfluxDB installations. +[InfluxQL](/influxdb3/version/reference/influxql/) is an SQL-like language developed by InfluxData with specific features tailored for leveraging and working with InfluxDB. It’s compatible with all versions of InfluxDB, making it a good choice for interoperability across different InfluxDB installations. To query using InfluxQL, enter the `influxdb3 query` subcommand and specify `influxql` in the language option--for example: @@ -499,7 +504,7 @@ You can use the `influxdb3` CLI to create a last value cache. 
Usage: $ influxdb3 create last_cache [OPTIONS] -d -t [CACHE_NAME] Options: - -h, --host URL of the running InfluxDB 3 Core server [env: INFLUXDB3_HOST_URL=] + -h, --host URL of the running {{% product-name %}} server [env: INFLUXDB3_HOST_URL=] -d, --database The database to run the query against [env: INFLUXDB3_DATABASE_NAME=] --token The token for authentication [env: INFLUXDB3_AUTH_TOKEN=] -t, --table
The table for which the cache is created @@ -569,34 +574,25 @@ influxdb3 create distinct_cache -h The InfluxDB 3 Processing engine is an embedded Python VM for running code inside the database to process and transform data. -To use the Processing engine, you create [plugins](#plugin) and [triggers](#trigger). +To activate the Processing engine, pass the `--plugin-dir ` option when starting the {{% product-name %}} server. +`PLUGIN_DIR` is your filesystem location for storing [plugin](#plugin) files for the Processing engine to run. #### Plugin -A plugin is a Python function that has a signature compatible with one of the [trigger types](#trigger-types). -The [`influxdb3 create plugin`](/influxdb3/core/reference/cli/influxdb3/create/plugin/) command loads a Python plugin file into the server. +A plugin is a Python function that has a signature compatible with a Processing engine [trigger](#trigger). #### Trigger -After you load a plugin into an InfluxDB 3 server, you can create one or more -triggers associated with the plugin. -When you create a trigger, you specify a plugin, a database, optional runtime arguments, -and a trigger-spec, which specifies `all_tables` or `table:my_table_name` (for filtering data sent to the plugin). -When you _enable_ a trigger, the server executes the plugin code according to the -plugin signature. +When you create a trigger, you specify a [plugin](#plugin), a database, optional arguments, +and a _trigger-spec_, which defines when the plugin is executed and what data it receives. ##### Trigger types -InfluxDB 3 provides the following types of triggers: +InfluxDB 3 provides the following types of triggers, each with specific trigger-specs: -- **On WAL flush**: Sends the batch of write data to a plugin once a second (configurable). 
-
-> [!Note]
-> Currently, only the **WAL flush** trigger is supported, but more are on the way:
-> - **On Snapshot**: Sends metadata to a plugin for further processing against the Parquet data or to send the information elsewhere (for example, to an Iceberg Catalog). _Not yet available._
-> - **On Schedule**: Executes a plugin on a user-configured schedule, useful for data collection and deadman monitoring. _Not yet available._
-> - **On Request**: Binds a plugin to an HTTP endpoint at `/api/v3/plugins/`. _Not yet available._
+- **On WAL flush**: Sends a batch of written data (for a specific table or all tables) to a plugin (by default, every second).
+- **On Schedule**: Executes a plugin on a user-configured schedule (using a crontab or a duration); useful for data collection and deadman monitoring.
+- **On Request**: Binds a plugin to a custom HTTP API endpoint at `/api/v3/engine/`.
+  The plugin receives the HTTP request headers and content, and can then parse, process, and send the data into the database or to third-party services.

### Test, create, and trigger plugin code

@@ -686,7 +682,7 @@ Test your InfluxDB 3 plugin safely without affecting written data. During a plug
To test a plugin, do the following:

1. Create a _plugin directory_--for example, `/path/to/.influxdb/plugins`
-2. [Start the InfluxDB server](#start-influxdb) and include the `--plugin-dir` option with your plugin directory path.
+2. [Start the InfluxDB server](#start-influxdb) and include the `--plugin-dir ` option.
3. Save the [preceding example code](#example-python-plugin) to a plugin file inside of the plugin directory. If you haven't yet written data to the table in the example, comment out the lines where it queries.
4. To run the test, enter the following command with the following options:

@@ -706,7 +702,7 @@ You can quickly see how the plugin behaves, what data it would have written to t
You can then edit your Python code in the plugins directory, and rerun the test.
The server reloads the file for every request to the `test` API. -For more information, see [`influxdb3 test wal_plugin`](/influxdb3/core/reference/cli/influxdb3/test/wal_plugin/) or run `influxdb3 test wal_plugin -h`. +For more information, see [`influxdb3 test wal_plugin`](/influxdb3/version/reference/cli/influxdb3/test/wal_plugin/) or run `influxdb3 test wal_plugin -h`. With the plugin code inside the server plugin directory, and a successful test, you're ready to create a plugin and a trigger to run on the server. @@ -729,14 +725,6 @@ influxdb3 test wal_plugin \ test.py ``` -```bash -# Create a plugin to run -influxdb3 create plugin \ - -d mydb \ - --code-filename="/path/to/.influxdb3/plugins/test.py" \ - test_plugin -``` - ```bash # Create a trigger that runs the plugin influxdb3 create trigger \ @@ -754,11 +742,7 @@ enable the trigger and have it run the plugin as you write data: influxdb3 enable trigger --database mydb trigger1 ``` -For more information, see the following: - -- [`influxdb3 test wal_plugin`](/influxdb3/core/reference/cli/influxdb3/test/wal_plugin/) -- [`influxdb3 create plugin`](/influxdb3/core/reference/cli/influxdb3/create/plugin/) -- [`influxdb3 create trigger`](/influxdb3/core/reference/cli/influxdb3/create/trigger/) +For more information, see [Python plugins and the Processing engine](/influxdb3/version/plugins/). ### Diskless architecture diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 485c471dd..e66668995 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -25,7 +25,8 @@ A _trigger_ is an InfluxDB 3 resource you create to associate a database event (for example, a WAL flush) with the plugin that should run. When an event occurs, the trigger passes configuration details, optional arguments, and event data to the plugin. 
-The Processing engine provides four types of triggers—each type corresponds to an event type with event-specific configuration to let you handle events with targeted logic. +The Processing engine provides four types of triggers--each type corresponds to +an event type with event-specific configuration to let you handle events with targeted logic. - **WAL Flush**: Triggered when the write-ahead log (WAL) is flushed to the object store (default is every second). - **Scheduled Tasks**: Triggered on a schedule you specify using cron syntax. @@ -36,15 +37,15 @@ The Processing engine provides four types of triggers—each type corresponds to ### Activate the Processing engine -To enable the Processing engine, start the {{% product-name %}} server with the `--plugin-dir` option and a path to your plugins directory. If the directory doesn’t exist, the server creates it. +To enable the Processing engine, start the {{% product-name %}} server with the +`--plugin-dir` option and a path to your plugins directory. +If the directory doesn’t exist, the server creates it. ```bash influxdb3 serve --node-id node0 --object-store [OBJECT STORE TYPE] --plugin-dir /path/to/plugins ``` - - -## The Shared API +## Shared API All plugin types provide the InfluxDB 3 _shared API_ for interacting with the database. 
The shared API provides access to the following:
@@ -194,11 +195,11 @@ The shared API `query` function executes an SQL query with optional parameters (
The following examples show how to use the `query` function:

```python
-influxdb3_local.query("SELECT * from foo where bar = 'baz' and time > now() - 'interval 1 hour'")
+influxdb3_local.query("SELECT * from foo where bar = 'baz' and time > now() - INTERVAL '1 hour'")

# Or using parameterized queries
args = {"bar": "baz"}
-influxdb3_local.query("SELECT * from foo where bar = $bar and time > now() - 'interval 1 hour'", args)
+influxdb3_local.query("SELECT * from foo where bar = $bar and time > now() - INTERVAL '1 hour'", args)
```

### Logging

@@ -220,13 +221,20 @@ influxdb3_local.info("This is an info message with an object", obj_to_log)
```

### Trigger arguments

-Every plugin type can receive arguments from the configuration of the trigger that runs it.
+A plugin can receive arguments from the trigger that runs it.
You can use this to provide runtime configuration and drive behavior of a plugin--for example:

- threshold values for monitoring
- connection properties for connecting to third-party services

-The arguments are passed as a `Dict[str, str]` where the key is the argument name and the value is the argument value.
+To pass arguments to a plugin, specify argument key-value pairs in the trigger--for example, using the CLI:
+
+```bash
+influxdb3 create trigger
+--trigger-arguments
+ Comma separated list of key/value pairs to use as trigger arguments. Example: key1=val1,key2=val2
+```
+
+The arguments are passed to the plugin as a `Dict[str, str]` where the key is
+the argument name and the value is the argument value.

The following example shows how to use an argument in a WAL plugin:

@@ -369,11 +377,14 @@ influxdb3 create trigger \
The plugin receives the shared API, query parameters `Dict[str, str]`, request headers `Dict[str, str]`, the request body (as bytes), and any arguments passed in the trigger definition. +On Request plugins are triggered by a request to an endpoint that you define +under `/api/v3/engine`. +When triggered, the plugin receives the shared API, query parameters `Dict[str, str]`, +request headers `Dict[str, str]`, the request body (as bytes), +and any arguments passed in the trigger definition. #### Example: simple On Request plugin - ```python import json @@ -399,9 +410,10 @@ def process_request(influxdb3_local, query_parameters, request_headers, request_ #### On Request trigger configuration -**On Request** plugins are defined using the `request:` trigger-spec. +Define an On Request plugin using the `request:` trigger-spec. -For example, the following command creates an `/api/v3/engine/my_plugin` endpoint that runs a `/examples/my-on-request.py` plugin: +For example, the following command creates an `/api/v3/engine/my_plugin` endpoint +that runs a `/examples/my-on-request.py` plugin: ```bash influxdb3 create trigger \ @@ -409,11 +421,5 @@ influxdb3 create trigger \ --plugin-filename "examples/my-on-request.py" \ --database mydb my-plugin -Because all On Request plugins share the same root URL, trigger specs must be unique across all plugins configured for a server, regardless of which database they are associated with. - -```shell -influxdb3 create trigger \ - --trigger-spec "request:hello-world" \ - --plugin-filename "hello/hello_world.py" \ - --database mydb hello-world -``` +Because all On Request plugins share the same root URL, trigger specs must be +unique across all plugins configured for a server, regardless of which database they are associated with. 
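Because an On Request handler is a plain Python function, you can exercise it outside the server before creating a trigger. The following sketch is illustrative only: the handler body and the `StubAPI` class are stand-ins (not part of InfluxDB); only the `process_request` signature follows the plugin convention described above.

```python
import json

def process_request(influxdb3_local, query_parameters, request_headers, request_body, args=None):
    # Illustrative handler: parse the JSON body and echo a greeting.
    data = json.loads(request_body) if request_body else {}
    name = query_parameters.get("name", "world")
    influxdb3_local.info(f"handling request for {name}")
    return {"status": "ok", "greeting": f"hello, {name}", "received": data}

class StubAPI:
    """Stands in for the shared API so the handler can run outside the server."""
    def info(self, *args):
        pass

# Call the handler directly with plain dicts and a bytes body.
result = process_request(StubAPI(), {"name": "sunroom"}, {}, b'{"temp": 21.5}')
# result["greeting"] == "hello, sunroom"
```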
diff --git a/content/shared/v3-enterprise-get-started/_index.md b/content/shared/v3-enterprise-get-started/_index.md index 4639943e7..b171fbae1 100644 --- a/content/shared/v3-enterprise-get-started/_index.md +++ b/content/shared/v3-enterprise-get-started/_index.md @@ -126,8 +126,12 @@ source ~/.zshrc To start your InfluxDB instance, use the `influxdb3 serve` command and provide the following: -- `--object-store`: Specifies the type of Object store to use. InfluxDB supports the following: local file system (`file`), `memory`, S3 (and compatible services like Ceph or Minio) (`s3`), Google Cloud Storage (`google`), and Azure Blob Storage (`azure`). -- `--node-id`: A string identifier that determines the server's storage path within the configured storage location, and, in a multi-node setup, is used to reference the node +- `--object-store`: Specifies the type of Object store to use. + InfluxDB supports the following: local file system (`file`), `memory`, + S3 (and compatible services like Ceph or Minio) (`s3`), + Google Cloud Storage (`google`), and Azure Blob Storage (`azure`). +- `--node-id`: A string identifier that determines the server's storage path + within the configured storage location, and, in a multi-node setup, is used to reference the node. The following examples show how to start InfluxDB 3 with different object store configurations: @@ -273,7 +277,7 @@ With `accept_partial=true`: ``` Line `1` is written and queryable. -The response is an HTTP error (`400`) status, and the response body contains `partial write of line protocol occurred` and details about the problem line. +The response is an HTTP error (`400`) status, and the response body contains the error message `partial write of line protocol occurred` with details about the problem line. 
##### Parsing failed for write_lp endpoint @@ -390,7 +394,7 @@ $ influxdb3 query --database=servers "SELECT DISTINCT usage_percent, time FROM c ### Querying using the CLI for InfluxQL -[InfluxQL](/influxdb3/enterprise/reference/influxql/) is an SQL-like language developed by InfluxData with specific features tailored for leveraging and working with InfluxDB. It’s compatible with all versions of InfluxDB, making it a good choice for interoperability across different InfluxDB installations. +[InfluxQL](/influxdb3/version/reference/influxql/) is an SQL-like language developed by InfluxData with specific features tailored for leveraging and working with InfluxDB. It’s compatible with all versions of InfluxDB, making it a good choice for interoperability across different InfluxDB installations. To query using InfluxQL, enter the `influxdb3 query` subcommand and specify `influxql` in the language option--for example: @@ -489,7 +493,7 @@ You can use the `influxdb3` CLI to create a last value cache. Usage: $ influxdb3 create last_cache [OPTIONS] -d -t
[CACHE_NAME] Options: - -h, --host URL of the running InfluxDB 3 Enterprise server [env: INFLUXDB3_HOST_URL=] + -h, --host URL of the running {{% product-name %}} server [env: INFLUXDB3_HOST_URL=] -d, --database The database to run the query against [env: INFLUXDB3_DATABASE_NAME=] --token The token for authentication [env: INFLUXDB3_AUTH_TOKEN=] -t, --table
The table for which the cache is created @@ -559,34 +563,25 @@ influxdb3 create distinct_cache -h

The InfluxDB 3 Processing engine is an embedded Python VM for running code inside the database to process and transform data.

-To use the Processing engine, you create [plugins](#plugin) and [triggers](#trigger).
+To activate the Processing engine, pass the `--plugin-dir <PLUGIN_DIR>` option when starting the {{% product-name %}} server.
+`PLUGIN_DIR` is your filesystem location for storing [plugin](#plugin) files for the Processing engine to run.

#### Plugin

-A plugin is a Python function that has a signature compatible with one of the [trigger types](#trigger-types).
-The [`influxdb3 create plugin`](/influxdb3/enterprise/reference/cli/influxdb3/create/plugin/) command loads a Python plugin file into the server.
+A plugin is a Python function that has a signature compatible with a Processing engine [trigger](#trigger).

#### Trigger

-After you load a plugin into an InfluxDB 3 server, you can create one or more
-triggers associated with the plugin.
-When you create a trigger, you specify a plugin, a database, optional runtime arguments,
-and a trigger-spec, which specifies `all_tables` or `table:my_table_name` (for filtering data sent to the plugin).
-When you _enable_ a trigger, the server executes the plugin code according to the
-plugin signature.
+When you create a trigger, you specify a [plugin](#plugin), a database, optional arguments,
+and a _trigger-spec_, which defines when the plugin is executed and what data it receives.

##### Trigger types

-InfluxDB 3 provides the following types of triggers:
+InfluxDB 3 provides the following types of triggers, each with specific trigger-specs:

-- **On WAL flush**: Sends the batch of write data to a plugin once a second (configurable).
-
-> [!Note]
-> Currently, only the **WAL flush** trigger is supported, but more are on the way:
->
-> - **On Snapshot**: Sends metadata to a plugin for further processing against the Parquet data or to send the information elsewhere (for example, to an Iceberg Catalog). _Not yet available._
-> - **On Schedule**: Executes a plugin on a user-configured schedule, useful for data collection and deadman monitoring. _Not yet available._
-> - **On Request**: Binds a plugin to an HTTP endpoint at `/api/v3/plugins/`. _Not yet available._
+- **On WAL flush**: Sends a batch of written data (for a specific table or all tables) to a plugin (by default, every second).
+- **On Schedule**: Executes a plugin on a user-configured schedule (using a crontab or a duration); useful for data collection and deadman monitoring.
+- **On Request**: Binds a plugin to a custom HTTP API endpoint at `/api/v3/engine/`.
+  The plugin receives the HTTP request headers and content, and can then parse, process, and send the data into the database or to third-party services.

### Test, create, and trigger plugin code

@@ -676,7 +671,7 @@ Test your InfluxDB 3 plugin safely without affecting written data. During a plug

To test a plugin, do the following:

1. Create a _plugin directory_--for example, `/path/to/.influxdb/plugins`
-2. [Start the InfluxDB server](#start-influxdb) and include the `--plugin-dir` option with your plugin directory path.
+2. [Start the InfluxDB server](#start-influxdb) and include the `--plugin-dir <PLUGIN_DIR>` option.
3. Save the [preceding example code](#example-python-plugin) to a plugin file inside of the plugin directory. If you haven't yet written data to the table in the example, comment out the lines where it queries.
4. To run the test, enter the following command with the following options:

@@ -696,7 +691,7 @@ You can quickly see how the plugin behaves, what data it would have written to t

You can then edit your Python code in the plugins directory, and rerun the test.
The server reloads the file for every request to the `test` API.

-For more information, see [`influxdb3 test wal_plugin`](/influxdb3/enterprise/reference/cli/influxdb3/test/wal_plugin/) or run `influxdb3 test wal_plugin -h`.
+For more information, see [`influxdb3 test wal_plugin`](/influxdb3/version/reference/cli/influxdb3/test/wal_plugin/) or run `influxdb3 test wal_plugin -h`.

With the plugin code inside the server plugin directory, and a successful test, you're ready to create a plugin and a trigger to run on the server.

@@ -719,14 +714,6 @@ influxdb3 test wal_plugin \
 test.py
```

-```bash
-# Create a plugin to run
-influxdb3 create plugin \
-  -d mydb \
-  --code-filename="/path/to/.influxdb3/plugins/test.py" \
-  test_plugin
-```
-
```bash
# Create a trigger that runs the plugin
influxdb3 create trigger \
@@ -744,15 +731,12 @@ enable the trigger and have it run the plugin as you write data:

influxdb3 enable trigger --database mydb trigger1
```

-For more information, see the following:
-
-- [`influxdb3 test wal_plugin`](/influxdb3/enterprise/reference/cli/influxdb3/test/wal_plugin/)
-- [`influxdb3 create plugin`](/influxdb3/enterprise/reference/cli/influxdb3/create/plugin/)
-- [`influxdb3 create trigger`](/influxdb3/enterprise/reference/cli/influxdb3/create/trigger/)
+For more information, see [Python plugins and the Processing engine](/influxdb3/version/plugins/).

### Diskless architecture

-InfluxDB 3 is able to operate using only object storage with no locally attached disk. While it can use only a disk with no dependencies, the ability to operate without one is a new capability with this release. The figure below illustrates the write path for data landing in the database.
+InfluxDB 3 is able to operate using only object storage with no locally attached disk.
+While it can also run using only a locally attached disk with no external dependencies, the ability to operate without one is a new capability with this release.
+The figure below illustrates the write path for data landing in the database.
{{< img-hd src="/img/influxdb/influxdb-3-write-path.png" alt="Write Path for InfluxDB 3 Core & Enterprise" />}} From 6a5aeb3cfce70b7c2bf8d7aeee8041b96d38dd0b Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 16:49:29 -0600 Subject: [PATCH 24/27] fix: Processing engine description. --- content/shared/v3-core-plugins/_index.md | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 485c471dd..6d2d124c2 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -42,9 +42,7 @@ To enable the Processing engine, start the {{% product-name %}} server with the influxdb3 serve --node-id node0 --object-store [OBJECT STORE TYPE] --plugin-dir /path/to/plugins ``` - - -## The Shared API +## Shared API All plugin types provide the InfluxDB 3 _shared API_ for interacting with the database. The shared API provides access to the following: @@ -399,21 +397,17 @@ def process_request(influxdb3_local, query_parameters, request_headers, request_ #### On Request trigger configuration -**On Request** plugins are defined using the `request:` trigger-spec. - -For example, the following command creates an `/api/v3/engine/my_plugin` endpoint that runs a `/examples/my-on-request.py` plugin: +On Request plugins are defined using the `request:` trigger-spec. +For example, the following `influxdb3` CLI command creates an `/api/v3/engine/my-plugin` HTTP endpoint +to execute the `/examples/my-on-request.py` plugin: ```bash influxdb3 create trigger \ - --trigger-spec "request:my_plugin" \ + --trigger-spec "request:my-plugin" \ --plugin-filename "examples/my-on-request.py" \ --database mydb my-plugin -Because all On Request plugins share the same root URL, trigger specs must be unique across all plugins configured for a server, regardless of which database they are associated with. 
-
-```shell
-influxdb3 create trigger \
-  --trigger-spec "request:hello-world" \
-  --plugin-filename "hello/hello_world.py" \
-  --database mydb hello-world
-```
+Because all On Request plugins for a server share the same `/api/v3/engine/` base URL,
+the trigger-spec you define must be unique across all plugins configured for a server,
+regardless of which database they are associated with.

From 002074bd1028d3ed88140cb9e8353678aca7ac14 Mon Sep 17 00:00:00 2001
From: Jason Stirnaman
Date: Mon, 24 Feb 2025 17:14:53 -0600
Subject: [PATCH 25/27] fix: On Request configuration

---
 content/shared/v3-core-plugins/_index.md | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md
index 6d2d124c2..2f2793839 100644
--- a/content/shared/v3-core-plugins/_index.md
+++ b/content/shared/v3-core-plugins/_index.md
@@ -367,10 +367,10 @@ influxdb3 create trigger \

### On Request trigger

-On Request plugins are triggered by a request to a specific endpoint under `/api/v3/engine`. The plugin receives the shared API, query parameters `Dict[str, str]`, request headers `Dict[str, str]`, the request body (as bytes), and any arguments passed in the trigger definition.
-
-#### Example: simple On Request plugin
+On Request plugins are triggered by a request to an HTTP API endpoint.
+The plugin receives the shared API, query parameters `Dict[str, str]`, request headers `Dict[str, str]`, the request body (as bytes), and any arguments passed in the trigger definition.

+#### Example: On Request plugin

```python
import json

@@ -397,9 +397,9 @@ def process_request(influxdb3_local, query_parameters, request_headers, request_

#### On Request trigger configuration

-On Request plugins are defined using the `request:` trigger-spec.
-For example, the following `influxdb3` CLI command creates an `/api/v3/engine/my-plugin` HTTP endpoint -to execute the `/examples/my-on-request.py` plugin: +To create a trigger for an On Request plugin, specify the `request:` trigger-spec. + +For example, the following command creates an HTTP API `/api/v3/engine/my-plugin` endpoint for the plugin file: ```bash influxdb3 create trigger \ @@ -407,7 +407,8 @@ influxdb3 create trigger \ --plugin-filename "examples/my-on-request.py" \ --database mydb my-plugin -Because all On Request plugins for a server share the same `/api/v3/engine/` base URL , -the trigger-spec -you define must be unique across all plugins configured for a server, +To run the plugin, you send an HTTP request to `/api/v3/engine/my-plugin`. + +Because all On Request plugins for a server share the same `/api/v3/engine/` base URL, +the trigger-spec you define must be unique across all plugins configured for a server, regardless of which database they are associated with. From 971562275f1a6af93de03714b6af51cae9337283 Mon Sep 17 00:00:00 2001 From: Jason Stirnaman Date: Mon, 24 Feb 2025 17:19:49 -0600 Subject: [PATCH 26/27] hotfix: codeblock --- content/shared/v3-core-plugins/_index.md | 1 + 1 file changed, 1 insertion(+) diff --git a/content/shared/v3-core-plugins/_index.md b/content/shared/v3-core-plugins/_index.md index 2f2793839..a2812d5f9 100644 --- a/content/shared/v3-core-plugins/_index.md +++ b/content/shared/v3-core-plugins/_index.md @@ -406,6 +406,7 @@ influxdb3 create trigger \ --trigger-spec "request:my-plugin" \ --plugin-filename "examples/my-on-request.py" \ --database mydb my-plugin +``` To run the plugin, you send an HTTP request to `/api/v3/engine/my-plugin`. 
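To sketch what that request looks like from a client, the following Python snippet prepares (but does not send) a request to the endpoint. The host, port (`8181` is a common default), and query parameter are assumptions; adjust them for your deployment.

```python
import json
from urllib.request import Request

# Illustrative only: host, port, and query parameter are assumptions.
body = json.dumps({"greeting": "hello"}).encode()
req = Request(
    "http://localhost:8181/api/v3/engine/my-plugin?name=sunroom",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# On a running server, urllib.request.urlopen(req) would execute the trigger
# and return whatever the plugin produces.
```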
From d7200f13635696ccfc1294cb99122b3433fba7b9 Mon Sep 17 00:00:00 2001
From: Jason Stirnaman
Date: Tue, 25 Feb 2025 10:06:09 -0600
Subject: [PATCH 27/27] chore(sql): how ranking functions handle duplicate values

- Add query example with output from @appletreeisyellow
- Clarify function descriptions

---
 .../shared/sql-reference/functions/window.md | 95 +++++++++++++++++--
 1 file changed, 89 insertions(+), 6 deletions(-)

diff --git a/content/shared/sql-reference/functions/window.md b/content/shared/sql-reference/functions/window.md
index d31457309..698f46787 100644
--- a/content/shared/sql-reference/functions/window.md
+++ b/content/shared/sql-reference/functions/window.md
@@ -476,9 +476,10 @@

### dense_rank

-Returns a rank for each row without gaps in the numbering.
-Unlike [rank()](#rank), this function assigns consecutive ranks even when values
-are identical.
+Returns the rank of the current row in its partition.
+Ranking is consecutive: rows with duplicate values receive the same rank, and the
+next distinct value receives the next consecutive rank (unlike [`rank()`](#rank),
+which skips ranks after duplicates).

The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
ranking order.

@@ -521,6 +522,33 @@ WHERE

{{% /influxdb/custom-timestamps %}}

+{{% /expand %}}
+{{% expand "Compare `dense_rank`, `rank`, and `row_number` functions" %}}
+
+Consider a table with duplicate ID values.
+The following query shows how each ranking function handles duplicate values:
+
+```sql
+SELECT
+  id,
+  rank() OVER(ORDER BY id),
+  dense_rank() OVER(ORDER BY id),
+  row_number() OVER(ORDER BY id)
+FROM my_table;
+```
+
+| ID | rank | dense_rank | row_number |
+|:----|-----:|-----------:|-----------:|
+| 1 | 1 | 1 | 1 |
+| 1 | 1 | 1 | 2 |
+| 1 | 1 | 1 | 3 |
+| 2 | 4 | 2 | 4 |
+
+Key differences:
+
+- [`rank()`](#rank) assigns the same rank to equal values but skips ranks for subsequent values
+- [`dense_rank()`](#dense_rank) assigns the same rank to equal values and uses consecutive ranks
+- [`row_number()`](#row_number) assigns unique sequential numbers regardless of value (the order among equal values is non-deterministic)
{{% /expand %}}
{{< /expand-wrapper >}}

@@ -632,9 +660,10 @@ WHERE

### rank

-Returns the rank of the current row in its partition, allowing gaps between
-ranks. This function provides a ranking similar to [`row_number`](#row_number),
-but skips ranks for identical values.
+Returns the rank of the current row in its partition.
+For duplicate values, `rank` assigns the same rank number, skips the ranks those
+duplicates would otherwise occupy (unlike [`dense_rank()`](#dense_rank)), and
+resumes ranking at the next distinct value.
+
The [`ORDER BY` clause](#order-by-clause) in the `OVER` clause determines
ranking order.

@@ -675,6 +704,33 @@ WHERE

{{% /influxdb/custom-timestamps %}}

+{{% /expand %}}
+{{% expand "Compare `dense_rank`, `rank`, and `row_number` functions" %}}
+
+Consider a table with duplicate ID values.
+The following query shows how each ranking function handles duplicate values:
+
+```sql
+SELECT
+  id,
+  rank() OVER(ORDER BY id),
+  dense_rank() OVER(ORDER BY id),
+  row_number() OVER(ORDER BY id)
+FROM my_table;
+```
+
+| ID | rank | dense_rank | row_number |
+|:----|-----:|-----------:|-----------:|
+| 1 | 1 | 1 | 1 |
+| 1 | 1 | 1 | 2 |
+| 1 | 1 | 1 | 3 |
+| 2 | 4 | 2 | 4 |
+
+Key differences:
+
+- [`rank()`](#rank) assigns the same rank to equal values but skips ranks for subsequent values
+- [`dense_rank()`](#dense_rank) assigns the same rank to equal values and uses consecutive ranks
+- [`row_number()`](#row_number) assigns unique sequential numbers regardless of value (the order among equal values is non-deterministic)
{{% /expand %}}
{{< /expand-wrapper >}}

@@ -721,6 +777,33 @@ WHERE

{{% /influxdb/custom-timestamps %}}

+{{% /expand %}}
+{{% expand "Compare `dense_rank`, `rank`, and `row_number` functions" %}}
+
+Consider a table with duplicate ID values.
+The following query shows how each ranking function handles duplicate values:
+
+```sql
+SELECT
+  id,
+  rank() OVER(ORDER BY id),
+  dense_rank() OVER(ORDER BY id),
+  row_number() OVER(ORDER BY id)
+FROM my_table;
+```
+
+| ID | rank | dense_rank | row_number |
+|:----|-----:|-----------:|-----------:|
+| 1 | 1 | 1 | 1 |
+| 1 | 1 | 1 | 2 |
+| 1 | 1 | 1 | 3 |
+| 2 | 4 | 2 | 4 |
+
+Key differences:
+
+- [`rank()`](#rank) assigns the same rank to equal values but skips ranks for subsequent values
+- [`dense_rank()`](#dense_rank) assigns the same rank to equal values and uses consecutive ranks
+- [`row_number()`](#row_number) assigns unique sequential numbers regardless of value (the order among equal values is non-deterministic)
{{% /expand %}}
{{< /expand-wrapper >}}
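As a cross-check on the comparison tables above, the three ranking rules can be sketched in plain Python. This is a list-based illustration of the SQL semantics, not InfluxDB code.

```python
def rankings(values):
    """Return (rank, dense_rank, row_number) lists for values ordered ascending."""
    ordered = sorted(values)
    distinct = sorted(set(values))
    # rank: 1 + first position of each value; duplicates share a rank, later ranks are skipped
    rank = [ordered.index(v) + 1 for v in ordered]
    # dense_rank: 1 + position among distinct values; no gaps
    dense = [distinct.index(v) + 1 for v in ordered]
    # row_number: sequential position; ties are broken arbitrarily
    row = list(range(1, len(ordered) + 1))
    return rank, dense, row

rank, dense, row = rankings([1, 1, 1, 2])
# rank  == [1, 1, 1, 4]
# dense == [1, 1, 1, 2]
# row   == [1, 2, 3, 4]
```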