Merge pull request #5738 from influxdata/jstirnaman/influxdb3-concepts

InfluxDB 3 glossary updates and fixes
pull/5734/head
Jason Stirnaman 2025-01-08 14:39:27 -06:00 committed by GitHub
commit b7b1947cea
GPG Key ID: B5690EEEBB952194
3 changed files with 63 additions and 62 deletions

@ -46,7 +46,8 @@ Related entries:
### aggregate
A function that returns an aggregated value across a set of points.
For a list of available aggregation functions, see [SQL aggregate functions](/influxdb/cloud-dedicated/reference/sql/functions/aggregate/).
For a list of available aggregation functions,
see [SQL aggregate functions](/influxdb/cloud-dedicated/reference/sql/functions/aggregate/).
<!-- TODO: Add a link to InfluxQL aggregate functions -->
@ -356,7 +357,7 @@ Related entries:
Flush jitter prevents every Telegraf output plugin from sending writes
simultaneously, which can overwhelm some data sinks.
Each flush interval, every Telegraf output plugin will sleep for a random time
Each flush interval, every Telegraf output plugin sleeps for a random time
between zero and the flush jitter before emitting metrics.
Flush jitter smooths out write spikes when running a large number of Telegraf instances.
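The flush jitter behavior described in this hunk can be illustrated with a minimal Python sketch. It is not Telegraf's implementation; the function name `flush_delay`, the 5-second jitter, and the fixed seed are assumptions chosen for illustration only.

```python
import random


def flush_delay(flush_jitter_s: float, rng: random.Random) -> float:
    """Return a random delay, in seconds, between zero and the flush jitter."""
    return rng.uniform(0.0, flush_jitter_s)


# Hypothetical setup: four output plugins sharing a 5-second flush jitter.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
delays = [flush_delay(5.0, rng) for _ in range(4)]

for plugin_index, delay in enumerate(delays):
    # Each plugin would sleep for its own delay before emitting metrics,
    # so the writes are spread across the jitter window.
    print(f"plugin {plugin_index}: sleep {delay:.2f}s before flushing")
```

Because each delay is drawn independently, the plugins flush at different moments within the jitter window rather than all at once.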
@ -400,10 +401,10 @@ Identifiers are tokens that refer to specific database objects such as database
names, field keys, measurement names, tag keys, etc.
Related entries:
[database](#database)
[database](#database),
[field key](#field-key),
[measurement](#measurement),
[tag key](#tag-key),
[tag key](#tag-key)
### influx
@ -422,8 +423,7 @@ and other required processes.
### InfluxDB
An open source time series database (TSDB) developed by InfluxData.
Written in Go and optimized for fast, high-availability storage and retrieval of
An open source time series database (TSDB) developed by InfluxData, optimized for fast, high-availability storage and retrieval of
time series data in fields such as operations monitoring, application metrics,
Internet of Things sensor data, and real-time analytics.
@ -435,8 +435,8 @@ The SQL-like query language used to query data in InfluxDB.
Telegraf input plugins actively gather metrics and deliver them to the core agent,
where aggregator, processor, and output plugins can operate on the metrics.
In order to activate an input plugin, it needs to be enabled and configured in
Telegraf's configuration file.
To activate an input plugin, enable and configure it in the
Telegraf configuration file.
Related entries:
[aggregator plugin](#aggregator-plugin),
@ -760,7 +760,7 @@ in the cluster (replication factor), and the time range covered by shard groups
(shard group duration). RPs are unique per database and along with the measurement
and tag set define a series.
In {{< product-name >}} the equivalent is [retention period](#retention-period),
In {{< product-name >}}, the equivalent is [retention period](#retention-period),
however retention periods are not part of the data model.
The retention period describes the data persistence behavior of a database.
@ -837,8 +837,8 @@ Related entries:
### series
A collection of data in the InfluxDB data structure that share a common
_measurement_, _tag set_, and _field key_.
In the InfluxDB 3 data structure, a collection of data that share a common
_measurement_ and _tag set_.
Related entries:
[field set](#field-set),
@ -847,12 +847,13 @@ Related entries:
### series cardinality
The number of unique measurement, tag set, and field key combinations in an InfluxDB database.
The number of unique measurement (table), tag set, and field key combinations in an InfluxDB database.
For example, assume that an InfluxDB bucket has one measurement.
For example, assume that an InfluxDB database has one measurement.
The single measurement has two tag keys: `email` and `status`.
If there are three different `email`s, and each email address is associated with two
different `status`es, the series cardinality for the measurement is 6
If there are three different `email` tag values,
and each email address is associated with two
different `status` tag values, then the series cardinality for the measurement is 6
(3 × 2 = 6):
| email | status |
@ -867,7 +868,7 @@ different `status`es, the series cardinality for the measurement is 6
In some cases, performing this multiplication may overestimate series cardinality
because of the presence of dependent tags.
Dependent tags are scoped by another tag and do not increase series cardinality.
If we add the tag `firstname` to the example above, the series cardinality
If we add the tag `firstname` to the preceding example, the series cardinality
would not be 18 (3 × 2 × 3 = 18).
The series cardinality would remain unchanged at 6, as `firstname` is already scoped by the `email` tag:
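The cardinality arithmetic in this entry can be checked with a short Python sketch that counts distinct measurement, tag set, and field key combinations. The measurement name `signups`, the field key `count`, and all tag values are hypothetical; they are not taken from the glossary's tables.

```python
from itertools import product


def series_cardinality(points):
    """Count distinct (measurement, tag set, field key) combinations."""
    return len({(m, frozenset(tags.items()), field) for m, tags, field in points})


# Hypothetical data: three email tag values, each seen with two status values.
# Each email maps to exactly one firstname, so firstname is a dependent tag.
firstname_by_email = {
    "a@example.com": "ana",
    "b@example.com": "ben",
    "c@example.com": "cy",
}
statuses = ["start", "finish"]

points = [
    ("signups", {"email": e, "status": s}, "count")
    for e, s in product(firstname_by_email, statuses)
]
points_with_firstname = [
    ("signups", {"email": e, "status": s, "firstname": firstname_by_email[e]}, "count")
    for e, s in product(firstname_by_email, statuses)
]

independent = series_cardinality(points)               # 3 emails x 2 statuses
dependent = series_cardinality(points_with_firstname)  # firstname adds no new combinations
print(independent, dependent)
```

Both counts come out to 6: adding `firstname` does not create new combinations because its value is fully determined by `email`.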
@ -892,7 +893,7 @@ A series key identifies a particular series by measurement, tag set, and field k
For example:
```
```text
# measurement, tag set, field key
h2o_level, location=santa_monica, h2o_feet
```
@ -1129,18 +1130,17 @@ A statement that sets or updates the value stored in a variable.
## W
### WAL (Write Ahead Log) - enterprise
### WAL (Write-Ahead Log)
The temporary cache for recently written points.
To reduce the frequency that permanent storage files are accessed, InfluxDB
caches new points in the WAL until their total size or age triggers a flush to
more permanent storage. This allows for efficient batching of the writes into the TSM.
more permanent storage. This allows for efficient batching of the writes into
the storage engine.
Points in the WAL can be queried and persist through a system reboot.
On process start, all points in the WAL must be flushed before the system accepts new writes.
Related entries:
[tsm](#tsm-time-structured-merge-tree)
Points in the WAL are queryable and persist through a system reboot.
On process start, all points in the WAL must be flushed before the system
accepts new writes.
### windowing

@ -340,6 +340,7 @@ Related entries:
[field](#field),
[field key](#field-key),
[field set](#field-set),
[tag set](#tag-set),
[tag value](#tag-value),
[timestamp](#timestamp)
@ -366,7 +367,7 @@ Related entries:
Flush jitter prevents every Telegraf output plugin from sending writes
simultaneously, which can overwhelm some data sinks.
Each flush interval, every Telegraf output plugin will sleep for a random time
Each flush interval, every Telegraf output plugin sleeps for a random time
between zero and the flush jitter before emitting metrics.
Flush jitter smooths out write spikes when running a large number of Telegraf instances.
@ -434,8 +435,7 @@ and other required processes.
### InfluxDB
An open source time series database (TSDB) developed by InfluxData.
Written in Go and optimized for fast, high-availability storage and retrieval of
An open source time series database (TSDB) developed by InfluxData, optimized for fast, high-availability storage and retrieval of
time series data in fields such as operations monitoring, application metrics,
Internet of Things sensor data, and real-time analytics.
@ -447,8 +447,8 @@ The SQL-like query language used to query data in InfluxDB.
Telegraf input plugins actively gather metrics and deliver them to the core agent,
where aggregator, processor, and output plugins can operate on the metrics.
In order to activate an input plugin, it needs to be enabled and configured in
Telegraf's configuration file.
To activate an input plugin, enable and configure it in the
Telegraf configuration file.
Related entries:
[aggregator plugin](#aggregator-plugin),
@ -471,8 +471,9 @@ Related entries:
### IOx
The IOx (InfluxDB v3) storage engine is a real-time, columnar database optimized for time series
data built in Rust on top of [Apache Arrow](https://arrow.apache.org/) and
The IOx storage engine (InfluxDB v3 storage engine) is a real-time, columnar
database optimized for time series data built in Rust on top of
[Apache Arrow](https://arrow.apache.org/) and
[DataFusion](https://arrow.apache.org/datafusion/user-guide/introduction.html).
IOx replaces the [TSM (Time Structured Merge tree)](#tsm-time-structured-merge-tree) storage engine.
@ -848,8 +849,8 @@ Related entries:
### series
A collection of data in the InfluxDB data structure that share a common
_measurement_, _tag set_, and _field key_.
In the InfluxDB 3 data structure, a collection of data that share a common
_measurement_ and _tag set_.
Related entries:
[field set](#field-set),
@ -860,10 +861,11 @@ Related entries:
The number of unique measurement, tag set, and field key combinations in an {{% product-name %}} bucket.
For example, assume that an InfluxDB bucket has one measurement.
For example, assume that an InfluxDB database has one measurement.
The single measurement has two tag keys: `email` and `status`.
If there are three different `email`s, and each email address is associated with two
different `status`es, the series cardinality for the measurement is 6
If there are three different `email` tag values,
and each email address is associated with two
different `status` tag values, then the series cardinality for the measurement is 6
(3 × 2 = 6):
| email | status |
@ -878,7 +880,7 @@ different `status`es, the series cardinality for the measurement is 6
In some cases, performing this multiplication may overestimate series cardinality
because of the presence of dependent tags.
Dependent tags are scoped by another tag and do not increase series cardinality.
If we add the tag `firstname` to the example above, the series cardinality
If we add the tag `firstname` to the preceding example, the series cardinality
would not be 18 (3 × 2 × 3 = 18).
The series cardinality would remain unchanged at 6, as `firstname` is already scoped by the `email` tag:
@ -1136,18 +1138,17 @@ A statement that sets or updates the value stored in a variable.
## W
### WAL (Write Ahead Log) - enterprise
### WAL (Write-Ahead Log)
The temporary cache for recently written points.
To reduce the frequency that permanent storage files are accessed, InfluxDB
caches new points in the WAL until their total size or age triggers a flush to
more permanent storage. This allows for efficient batching of the writes into the TSM.
more permanent storage. This allows for efficient batching of the writes into
the storage engine.
Points in the WAL can be queried and persist through a system reboot.
On process start, all points in the WAL must be flushed before the system accepts new writes.
Related entries:
[tsm](#tsm-time-structured-merge-tree)
Points in the WAL are queryable and persist through a system reboot.
On process start, all points in the WAL must be flushed before the system
accepts new writes.
### windowing

@ -46,7 +46,8 @@ Related entries:
### aggregate
A function that returns an aggregated value across a set of points.
For a list of available aggregation functions, see [SQL aggregate functions](/influxdb/clustered/reference/sql/functions/aggregate/).
For a list of available aggregation functions,
see [SQL aggregate functions](/influxdb/clustered/reference/sql/functions/aggregate/).
<!-- TODO: Add a link to InfluxQL aggregate functions -->
@ -333,6 +334,7 @@ Related entries:
[field](#field),
[field key](#field-key),
[field set](#field-set),
[tag set](#tag-set),
[tag value](#tag-value),
[timestamp](#timestamp)
@ -403,10 +405,10 @@ Identifiers are tokens that refer to specific database objects such as database
names, field keys, measurement names, tag keys, etc.
Related entries:
[database](#database)
[database](#database),
[field key](#field-key),
[measurement](#measurement),
[tag key](#tag-key),
[tag key](#tag-key)
### influx
@ -425,8 +427,8 @@ and other required processes.
### InfluxDB
An open source time series database (TSDB) developed by InfluxData.
Written in Go and optimized for fast, high-availability storage and retrieval of
An open source time series database (TSDB) developed by InfluxData, optimized
for fast, high-availability storage and retrieval of
time series data in fields such as operations monitoring, application metrics,
Internet of Things sensor data, and real-time analytics.
@ -438,8 +440,8 @@ The SQL-like query language used to query data in InfluxDB.
Telegraf input plugins actively gather metrics and deliver them to the core agent,
where aggregator, processor, and output plugins can operate on the metrics.
In order to activate an input plugin, it needs to be enabled and configured in
Telegraf's configuration file.
To activate an input plugin, enable and configure it in the
Telegraf configuration file.
Related entries:
[aggregator plugin](#aggregator-plugin),
@ -752,7 +754,7 @@ relative to [now](#now).
The minimum retention period is **one hour**.
Related entries:
[bucket](#bucket),
[bucket](#bucket)
### retention policy (RP)
@ -839,8 +841,8 @@ Related entries:
### series
A collection of data in the InfluxDB data structure that share a common
_measurement_, _tag set_, and _field key_.
In the InfluxDB 3 data structure, a collection of data that share a common
_measurement_ and _tag set_.
Related entries:
[field set](#field-set),
@ -849,12 +851,13 @@ Related entries:
### series cardinality
The number of unique measurement, tag set, and field key combinations in an InfluxDB database.
The number of unique measurement (table), tag set, and field key combinations in an InfluxDB database.
For example, assume that an InfluxDB database has one measurement.
The single measurement has two tag keys: `email` and `status`.
If there are three different `email`s, and each email address is associated with two
different `status`es, the series cardinality for the measurement is 6
If there are three different `email` tag values,
and each email address is associated with two
different `status` tag values, then the series cardinality for the measurement is 6
(3 × 2 = 6):
| email | status |
@ -869,7 +872,7 @@ different `status`es, the series cardinality for the measurement is 6
In some cases, performing this multiplication may overestimate series cardinality
because of the presence of dependent tags.
Dependent tags are scoped by another tag and do not increase series cardinality.
If we add the tag `firstname` to the example above, the series cardinality
If we add the tag `firstname` to the preceding example, the series cardinality
would not be 18 (3 × 2 × 3 = 18).
The series cardinality would remain unchanged at 6, as `firstname` is already scoped by the `email` tag:
@ -1048,7 +1051,7 @@ Related entries: [aggregate](#aggregate), [function](#function), [selector](#sel
The InfluxDB v1 and v2 data storage format that allows greater compaction and
higher write and read throughput than B+ or LSM tree implementations.
The TSM storage engine has been replaced by [the InfluxDB v3 storage engine (IOx)](#iox).
The TSM storage engine has been replaced by the [InfluxDB v3 storage engine (IOx)](#iox).
Related entries:
[IOx](#iox)
@ -1143,9 +1146,6 @@ Points in the WAL are queryable and persist through a system reboot.
On process start, all points in the WAL must be flushed before the system
accepts new writes.
Related entries:
[tsm](#tsm-time-structured-merge-tree)
### windowing
Grouping data based on specified time intervals.