Merge branch 'master' into chore/update-v0-management-api-with-new-max-columns-per-table

chore/update-v0-management-api-with-new-max-columns-per-table
Jason Stirnaman 2025-12-15 10:26:14 -05:00 committed by GitHub
commit 3df076a531
15 changed files with 285 additions and 27 deletions


@@ -92,6 +92,13 @@ exclude = [
# TODO: Remove after fixing canonical URL generation or link-checker domain replacement
"^https://docs\\.influxdata\\.com/",
# Local file URLs with fragments (workaround for link-checker Hugo pretty URL bug)
# link-checker converts /path/to/page#fragment to file:///path/to/page#fragment
# but the actual file is at /path/to/page/index.html, causing false fragment errors
# TODO: Remove after fixing link-checker to handle Hugo pretty URLs with fragments
# See: https://github.com/influxdata/docs-tooling/issues/XXX
"^file://.*#",
# Common documentation placeholders
"YOUR_.*",
"REPLACE_.*",


@@ -52,7 +52,7 @@ InfluxDB 3 Explorer:
- Documentation: https://docs.influxdata.com/influxdb3/explorer/
Telegraf:
- Documentation: https://docs.influxdata.com/telegraf/v1.36/
- Documentation: https://docs.influxdata.com/telegraf/v1.37/
Chronograf:
- Documentation: https://docs.influxdata.com/chronograf/v1.10/


@@ -8,6 +8,65 @@ menu:
weight: 101
---
## v2.8.0 {date="2025-12-12"}
### Features
- [5e204dc](https://github.com/influxdata/influxdb/commit/5e204dc): Add optional token hashing
#### Token hashing
When enabled, token hashing stores all API tokens as hashes on disk rather than in plaintext.
While token hashing is a valuable security upgrade, take care when upgrading and enabling it.
Use the [`use-hashed-tokens`](/influxdb/v2/reference/config-options/#use-hashed-tokens) configuration option to enable token hashing in {{< product-name >}}.
If you are upgrading from an earlier InfluxDB 2.x version, we recommend first upgrading to version 2.8.0 (or later), and then enabling hashed tokens as a separate step.
Token hashing is optional; you can enable it at any time after the upgrade.
#### How token hashing works
When you upgrade to InfluxDB v2.8.0 or later from version 2.7.12 or earlier, the BoltDB schema is upgraded to add a new index bucket.
On every startup with token hashing enabled, {{< product-name >}} migrates all unhashed tokens to hashed tokens and deletes the unhashed tokens.
With token hashing enabled, any new tokens are stored as hashed tokens.
If you then disable token hashing, newly created tokens are stored unhashed, but existing tokens remain hashed on disk.
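Conceptually, token verification with hashed storage can be sketched as follows. This is an illustrative one-way-hash scheme (SHA-256 is an assumption for illustration; the source does not specify the hash algorithm InfluxDB actually uses):

```python
import hashlib
import hmac

def hash_token(token: str) -> str:
    # Store only the one-way hash; the plaintext token is discarded
    return hashlib.sha256(token.encode("utf-8")).hexdigest()

def verify_token(presented: str, stored_hash: str) -> bool:
    # Hash the presented token and compare in constant time
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)

stored = hash_token("my-api-token")          # what ends up on disk
print(verify_token("my-api-token", stored))  # True
print(verify_token("wrong-token", stored))   # False
```

Because the hash is one-way, plaintext tokens cannot be recovered from it later, which is why the schema downgrade described below erases tokens rather than unhashing them.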
##### Hashed tokens erased when downgrading
Note that once token hashing is enabled, downgrading to a version earlier than 2.8.0 erases all API tokens due to the required schema downgrade. After downgrading, you'll need to create new tokens for your clients, even if you disable token hashing before you downgrade. Disabling token hashing **does not** unhash tokens stored in hashed form.
If token hashing is _never_ enabled, you can downgrade from v2.8.0 to v2.7.12 or earlier.
##### Recommended process
1. Upgrade InfluxDB:
   1. Initiate `influxd` shutdown.
   2. Wait for a clean shutdown.
   3. Upgrade `influxd`.
   4. Start `influxd`.
2. Verify the upgrade is successful.
3. Optional: Enable token hashing:
   1. Initiate `influxd` shutdown.
   2. Wait for a clean shutdown.
   3. Do _one_ of the following:
      - Include the `--use-hashed-tokens` command line flag
      - Set the `INFLUXD_USE_HASHED_TOKENS=true` environment variable in your container environment
      - Set `use-hashed-tokens` to `true` in your configuration file
   4. Start `influxd`.
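For reference, the three ways to enable the option might look like the following (a sketch, not a complete deployment; paths and service setup vary by installation):

```bash
# Option 1: command-line flag
influxd --use-hashed-tokens

# Option 2: environment variable (for example, in a container environment)
INFLUXD_USE_HASHED_TOKENS=true influxd

# Option 3: configuration file entry (TOML shown):
#   use-hashed-tokens = true
```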
### Bug Fixes
- [305e61d](https://github.com/influxdata/influxdb/commit/305e61d): Fix compilation on Alpine Linux
- [40a6332](https://github.com/influxdata/influxdb/commit/40a6332): Updates post-install for Linux package builds
- [1b83d2c](https://github.com/influxdata/influxdb/commit/1b83d2c): Chore: update to go 1.23.12 (2.7)
## v2.7.12 {date="2025-05-20"}
### Features


@@ -14,5 +14,4 @@ related:
source: /shared/v3-line-protocol.md
---
<!-- The content of this file is at
// SOURCE content/shared/v3-line-protocol.md-->
<!-- // SOURCE content/shared/v3-line-protocol.md -->


@@ -429,6 +429,10 @@ Deduplicating your data can reduce your write payload size and resource usage.
> sometimes sooner—this ordering is not guaranteed if duplicate points are flushed
> at the same time. As a result, the last written duplicate point may not always
> be retained in storage.
>
> For recommended patterns and anti-patterns to avoid, see
> [Duplicate points](/influxdb3/cloud-dedicated/reference/syntax/line-protocol/#duplicate-points)
> in the line protocol reference.
Use Telegraf and the [Dedup processor plugin](/telegraf/v1/plugins/#processor-dedup)
to filter data whose field values are exact repetitions of previous values.
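A minimal Telegraf configuration for this might look like the following sketch (the `dedup_interval` value is illustrative; metrics whose field values repeat within that window are dropped):

```toml
[[processors.dedup]]
  ## Maximum time to suppress a metric whose field values are unchanged
  dedup_interval = "600s"
```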


@@ -83,6 +83,10 @@ In time series data, the primary key for a row of data is typically a combinatio
In InfluxDB, the primary key for a row is the combination of the point's timestamp and _tag set_ - the collection of [tag keys](/influxdb3/cloud-dedicated/reference/glossary/#tag-key) and [tag values](/influxdb3/cloud-dedicated/reference/glossary/#tag-value) on the point.
A row's primary key tag set does not include tags with null values.
> [!Important]
> Overwriting points with the same primary key (timestamp and tag set) is not reliable for maintaining a last-value view.
> For recommended patterns, see [Duplicate points](/influxdb3/cloud-dedicated/reference/syntax/line-protocol/#duplicate-points) in the line protocol reference.
### Tags versus fields
When designing your schema for InfluxDB, a common question is, "what should be a


@@ -13,5 +13,4 @@ related:
source: /shared/influxdb-v2/reference/syntax/line-protocol.md
---
<!-- The content of this file is at
// SOURCE content/shared/influxdb-v2/reference/syntax/line-protocol.md-->
<!-- // SOURCE content/shared/influxdb-v2/reference/syntax/line-protocol.md -->


@@ -64,6 +64,10 @@ In time series data, the primary key for a row of data is typically a combinatio
In InfluxDB, the primary key for a row is the combination of the point's timestamp and _tag set_ - the collection of [tag keys](/influxdb3/cloud-serverless/reference/glossary/#tag-key) and [tag values](/influxdb3/cloud-serverless/reference/glossary/#tag-value) on the point.
A row's primary key tag set does not include tags with null values.
> [!Important]
> Overwriting points with the same primary key (timestamp and tag set) is not reliable for maintaining a last-value view.
> For recommended patterns, see [Duplicate points](/influxdb3/cloud-serverless/reference/syntax/line-protocol/#duplicate-points) in the line protocol reference.
### Tags versus fields
When designing your schema for InfluxDB, a common question is, "what should be a


@@ -14,5 +14,4 @@ related:
source: /shared/v3-line-protocol.md
---
<!-- The content of this file is at
// SOURCE content/shared/v3-line-protocol.md-->
<!--// SOURCE content/shared/v3-line-protocol.md -->


@@ -436,6 +436,10 @@ Deduplicating your data can reduce your write payload size and resource usage.
> sometimes sooner—this ordering is not guaranteed if duplicate points are flushed
> at the same time. As a result, the last written duplicate point may not always
> be retained in storage.
>
> For recommended patterns and anti-patterns to avoid, see
> [Duplicate points](/influxdb3/clustered/reference/syntax/line-protocol/#duplicate-points)
> in the line protocol reference.
Use Telegraf and the [Dedup processor plugin](/telegraf/v1/plugins/#processor-dedup)
to filter data whose field values are exact repetitions of previous values.


@@ -83,6 +83,10 @@ In time series data, the primary key for a row of data is typically a combinatio
In InfluxDB, the primary key for a row is the combination of the point's timestamp and _tag set_ - the collection of [tag keys](/influxdb3/clustered/reference/glossary/#tag-key) and [tag values](/influxdb3/clustered/reference/glossary/#tag-value) on the point.
A row's primary key tag set does not include tags with null values.
> [!Important]
> Overwriting points with the same primary key (timestamp and tag set) is not reliable for maintaining a last-value view.
> For recommended patterns, see [Duplicate points](/influxdb3/clustered/reference/syntax/line-protocol/#duplicate-points) in the line protocol reference.
### Tags versus fields
When designing your schema for InfluxDB, a common question is, "what should be a


@@ -12,10 +12,8 @@ influxdb3/core/tags: [write, line protocol, syntax]
related:
- /influxdb3/core/write-data/
aliases:
- /influxdb3/core/reference/syntax/line-protocol
- /influxdb3/core/reference/syntax/line-protocol/
source: /shared/v3-line-protocol.md
---
<!--
The content of this file is at content/shared/v3-line-protocol.md
-->
<!--// SOURCE content/shared/v3-line-protocol.md -->


@@ -12,10 +12,8 @@ influxdb3/enterprise/tags: [write, line protocol, syntax]
related:
- /influxdb3/enterprise/write-data/
aliases:
- /influxdb3/enterprise/reference/syntax/line-protocol
- /influxdb3/enterprise/reference/syntax/line-protocol/
source: /shared/v3-line-protocol.md
---
<!--
The content of this file is at content/shared/v3-line-protocol.md
-->
<!--// SOURCE content/shared/v3-line-protocol.md -->


@@ -283,14 +283,193 @@ If you submit line protocol with the same table, tag set, and timestamp,
but with a different field set, the field set becomes the union of the old
field set and the new field set, where any conflicts favor the new field set.
{{% show-in "cloud-dedicated,clustered" %}}
> [!Important]
> #### Write ordering for duplicate points
{{% show-in "cloud-dedicated,clustered,cloud-serverless" %}}
> [!Warning]
> #### Duplicate point overwrites are non-deterministic
>
> {{% product-name %}} attempts to honor write ordering for duplicate points,
> with the most recently written point taking precedence. However, when data is
> flushed from the in-memory buffer to Parquet files—typically every 15 minutes,
> but sometimes sooner—this ordering is not guaranteed if duplicate points are
> flushed at the same time. As a result, the last written duplicate point may
> not always be retained in storage.
> Overwriting duplicate points (same table, tag set, and timestamp) is _not a reliable way to maintain a last-value view_.
> When duplicate points are flushed together, write ordering is not guaranteed—a prior write may "win."
> See [Anti-patterns to avoid](#anti-patterns-to-avoid) and [Recommended patterns](#recommended-patterns-for-last-value-tracking) below.
### Recommended patterns for last-value tracking
To reliably maintain a last-value view of your data, use one of these append-only patterns:
#### Append-only with unique timestamps (recommended)
Write each change as a new point with a unique timestamp using the actual event time.
Query for the most recent point to get the current value.
**Line protocol example**:
```text
device_status,device_id=sensor01 status="active",temperature=72.5 1700000000000000000
device_status,device_id=sensor01 status="active",temperature=73.1 1700000300000000000
device_status,device_id=sensor01 status="inactive",temperature=73.1 1700000600000000000
```
**SQL query to get latest state**:
```sql
SELECT
device_id,
status,
temperature,
time
FROM device_status
WHERE time >= now() - INTERVAL '7 days'
AND device_id = 'sensor01'
ORDER BY time DESC
LIMIT 1
```
**InfluxQL query to get latest state**:
```influxql
SELECT LAST(status), LAST(temperature)
FROM device_status
WHERE device_id = 'sensor01'
AND time >= now() - 7d
GROUP BY device_id
```
#### Append-only with change tracking field
If you need to filter by "changes since a specific time," add a dedicated `last_change_timestamp` field.
**Line protocol example**:
```text
device_status,device_id=sensor01 status="active",temperature=72.5,last_change_timestamp=1700000000000000000i 1700000000000000000
device_status,device_id=sensor01 status="active",temperature=73.1,last_change_timestamp=1700000300000000000i 1700000300000000000
device_status,device_id=sensor01 status="inactive",temperature=73.1,last_change_timestamp=1700000600000000000i 1700000600000000000
```
**SQL query to get changes since a specific time**:
```sql
SELECT
device_id,
status,
temperature,
time
FROM device_status
WHERE last_change_timestamp >= 1700000000000000000
ORDER BY time DESC
```
### Anti-patterns to avoid
The following patterns will produce non-deterministic results when duplicate points are flushed together:
#### Don't overwrite the same (time, tags) point
If points with the same time and tag set are flushed to storage together, any of the values might be retained.
For example, **don't do this**:
```text
# All writes use the same timestamp
device_status,device_id=sensor01 status="active",temperature=72.5 1700000000000000000
device_status,device_id=sensor01 status="active",temperature=73.1 1700000000000000000
device_status,device_id=sensor01 status="inactive",temperature=73.1 1700000000000000000
```
#### Don't add a field while overwriting data (time, tags)
Adding a field doesn't make points unique.
Points with the same time and tag set are still considered duplicates. For example,
**don't do this**:
```text
# All writes use the same timestamp, but add a version field
device_status,device_id=sensor01 status="active",temperature=72.5,version=1i 1700000000000000000
device_status,device_id=sensor01 status="active",temperature=73.1,version=2i 1700000000000000000
device_status,device_id=sensor01 status="inactive",temperature=73.1,version=3i 1700000000000000000
```
#### Don't rely on write delays to force ordering
Delays don't guarantee that duplicate points won't be flushed together.
The flush interval depends on buffer size, ingestion rate, and system load.
For example, **don't do this**:
```text
# Writing with delays between each write
device_status,device_id=sensor01 status="active" 1700000000000000000
# Wait 10 seconds...
device_status,device_id=sensor01 status="inactive" 1700000000000000000
```
{{% /show-in %}}
{{% show-in "cloud-dedicated" %}}
### Retention guidance for last-value tables
{{% product-name %}} applies retention at the database level.
If your last-value view only needs to retain data for days or weeks, but your main database retains data for months or years (for example, ~400 days), consider creating a separate database with shorter retention specifically for last-value tracking.
**Benefits**:
- Reduces storage costs for last-value data
- Improves query performance by limiting data volume
- Allows independent retention policies for different use cases
**Example**:
```bash
# Create a database for last-value tracking with 7-day retention
influxctl database create device_status_current --retention-period 7d
# Create your main database with longer retention
influxctl database create device_status_history --retention-period 400d
```
Then write current status to `device_status_current` and historical data to `device_status_history`.
{{% /show-in %}}
{{% show-in "cloud-dedicated,clustered,cloud-serverless" %}}
### Performance considerations
#### Row count and query performance
Append-only patterns increase row counts compared to overwriting duplicate points.
To maintain query performance:
1. **Limit query time ranges**: Query only the time range you need (for example, last 7 days for current state)
2. **Use time-based filters**: Always include a `WHERE time >=` clause to narrow the query scope
3. **Consider shorter retention**: For last-value views, use a dedicated database with shorter retention
**Example - Good query with time filter**:
```sql
SELECT device_id, status, temperature, time
FROM device_status
WHERE time >= now() - INTERVAL '7 days'
ORDER BY time DESC
```
**Example - Avoid querying entire table**:
```sql
-- Don't do this - queries all historical data
SELECT device_id, status, temperature, time
FROM device_status
ORDER BY time DESC
```
#### Storage and cache bandwidth
Append-only patterns create more data points, which results in larger Parquet files.
This can increase cache bandwidth usage when querying large time ranges.
**Mitigation strategies**:
1. **Narrow time filters**: Query only same-day partitions when possible
2. **Use partition-aligned time ranges**: Queries that align with partition boundaries are more efficient
3. **Consider aggregation**: For historical analysis, use downsampled or aggregated data instead of raw points
**Example - Partition-aligned query**:
```sql
SELECT device_id, status, temperature, time
FROM device_status
WHERE time >= '2025-11-20T00:00:00Z'
AND time < '2025-11-21T00:00:00Z'
ORDER BY time DESC
```
{{% /show-in %}}


@@ -177,12 +177,12 @@ influxdb:
versions:
- v2
- v1
latest: v2.7
latest: v2.8
latest_patches:
v2: 2.7.12
v2: 2.8.0
v1: 1.12.2
latest_cli:
v2: 2.7.5
v2: 2.8.0
detector_config:
query_languages:
InfluxQL: