---
title: Glossary
description: Terms related to InfluxData products and platforms.
weight: 6
---
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
A
agent
A core part of Telegraf that gathers metrics from declared input plugins and sends metrics to declared output plugins, based on the plugins enabled for a configuration.
Related entries: input plugin, output plugin
aggregator plugin
Receives metrics from input plugins, creates aggregate metrics, and then passes aggregate metrics to configured output plugins.
Related entries: input plugin, output plugin, processor plugin
aggregation
A function that returns an aggregated value across a set of points. For a list of available aggregation functions, see Flux built-in aggregate functions.
Related entries: function, selector, transformation
B
batch
A collection of points in line protocol format, separated by newlines (`0x0A`).
Submitting a batch of points to the database using a single HTTP request to the write endpoints drastically increases performance by reducing the HTTP overhead.
InfluxData typically recommends batch sizes of 5,000-10,000 points. In some use cases, performance may improve with significantly smaller or larger batches.
Related entries: line protocol, point
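The batching described above can be sketched in a few lines of Python. This is an illustrative sketch, not an official client; the measurement, tag, and field names below are hypothetical examples.

```python
# A minimal sketch of assembling a batch of line protocol points,
# joined by newline (0x0A), so they can be sent in a single HTTP write
# instead of one request per point. All names below are made up.
points = [
    'cpu,host=server01 usage_idle=98.2 1480000000000000000',
    'cpu,host=server02 usage_idle=97.5 1480000000000000000',
    'mem,host=server01 used_percent=23.1 1480000000000000000',
]

# One newline-separated payload instead of three separate requests.
batch = "\n".join(points)
print(batch.count("\n") + 1)  # → 3 (points in the batch)
```

In practice the batch would be the body of a single POST to the write endpoint, repeated until all points are sent.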
batch size
The Telegraf agent sends metrics to output plugins in batches, not individually. The batch size controls the size of each write batch that Telegraf sends to the output plugins.
Related entries: output plugin
collection interval
The default global interval for collecting data from each input plugin. The collection interval can be overridden by each individual input plugin's configuration.
Related entries: input plugin
collection jitter
Collection jitter is used to prevent every input plugin from collecting metrics simultaneously, which can have a measurable effect on the system. Each collection interval, every input plugin will sleep for a random time between zero and the collection jitter before collecting the metrics.
Related entries: collection interval, input plugin
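The jitter behavior amounts to sleeping for a random duration between zero and the configured jitter before each collection. A minimal Python sketch of that idea (not Telegraf's actual implementation):

```python
import random
import time

def sleep_with_jitter(jitter_seconds):
    """Sleep for a random duration between zero and the jitter, so
    plugins started at the same moment do not all collect at once."""
    delay = random.uniform(0, jitter_seconds)
    time.sleep(delay)
    return delay

# Each collection interval, a plugin would call this before gathering.
delay = sleep_with_jitter(0.01)
assert 0 <= delay <= 0.01
```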
continuous query (CQ)
An InfluxQL query that runs automatically and periodically within a database.
Continuous queries require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
See Continuous Queries.
Related entries: function
D
data node
A node that runs the data service.
For high availability, installations must have at least two data nodes. The number of data nodes in your cluster must be the same as your highest replication factor. Any replication factor greater than two gives you additional fault tolerance and query capacity in the cluster.
Data node sizes will depend on your needs. The Amazon EC2 m4.large or m4.xlarge are good starting points.
Related entries: data service, replication factor
data service
Stores time series data and handles writes and queries.
Related entries: data node
database
A logical container for users, retention policies, continuous queries, and time series data.
Related entries: continuous query, retention policy, user
duration
The attribute of the retention policy that determines how long InfluxDB stores data. Data older than the duration are automatically dropped from the database.
Related entries: retention policy
E
event
Measurements gathered at irregular time intervals.
F
field
The key-value pair in InfluxDB's data structure that records metadata and the actual data value. Fields are required in InfluxDB's data structure and they are not indexed; queries on field values scan all points that match the specified time range and, as a result, are not performant relative to tags.
Query tip: Compare fields to tags; tags are indexed.
Related entries: field key, field set, field value, tag
field key
The key part of the key-value pair that makes up a field. Field keys are strings and they store metadata.
Related entries: field, field set, field value, tag key
field set
The collection of field keys and field values on a point.
Related entries: field, field key, field value, point
field value
The value part of the key-value pair that makes up a field. Field values are the actual data; they can be strings, floats, integers, or booleans. A field value is always associated with a timestamp.
Field values are not indexed; queries on field values scan all points that match the specified time range and, as a result, are not performant.
Query tip: Compare field values to tag values; tag values are indexed.
Related entries: field, field key, field set, tag value, timestamp
flush interval
The global interval for flushing data from each output plugin to its destination. This value should not be set lower than the collection interval.
Related entries: collection interval, flush jitter, output plugin
flush jitter
Flush jitter prevents every output plugin from sending writes simultaneously, which can overwhelm some data sinks. Each flush interval, every output plugin will sleep for a random time between zero and the flush jitter before emitting metrics. Flush jitter smooths out write spikes when running a large number of Telegraf instances.
Related entries: flush interval, output plugin
Flux
A lightweight scripting language for querying databases (like InfluxDB) and working with data. Flux is included with InfluxDB 1.7 and 2.0. Flux can also be run independently from InfluxDB.
function
Flux functions aggregate, select, and transform time series data. For a complete list of Flux functions, see Flux functions. You can also use their predecessor, InfluxQL functions; see InfluxQL functions for a complete list.
Related entries: aggregation, selector, transformation
I
identifier
Identifiers are tokens that refer to task names, bucket names, field keys, measurement names, subscription names, tag keys, and user names. For examples and rules, see Flux language lexical elements.
Related entries: bucket, field key, measurement, retention policy, tag key, user
input plugin
Input plugins actively gather metrics and deliver them to the core agent, where aggregator, processor, and output plugins can operate on the metrics. To activate an input plugin, enable and configure it in Telegraf's configuration file.
Related entries: aggregator plugin, collection interval, output plugin, processor plugin
L
Line Protocol (LP)
The text-based format for writing points to InfluxDB. See Line Protocol.
M
measurement
The part of InfluxDB's structure that describes the data stored in the associated fields. Measurements are strings.
Related entries: field, series
member
meta node
A node that runs the meta service.
For high availability, installations must have three meta nodes. Meta nodes can be very modestly sized instances like an EC2 t2.micro or even a nano. For additional fault tolerance, installations may use five meta nodes. The number of meta nodes must be an odd number.
Related entries: meta service
meta service
The consistent data store that keeps state about the cluster, including which servers, buckets, users, tasks, subscriptions, and blocks of time exist.
Related entries: meta node
metastore
Contains internal information about the status of the system. The metastore contains the user information, buckets, shard metadata, tasks, and subscriptions.
Related entries: bucket, retention policy, user
metric
Measurements gathered at regular time intervals.
metric buffer
The metric buffer caches individual metrics when writes are failing for an output plugin. Telegraf will attempt to flush the buffer upon a successful write to the output. The oldest metrics are dropped first when this buffer fills.
Related entries: output plugin
N
node
An independent `influxd` process.
Related entries: server
now()
The local server's nanosecond timestamp.
O
output plugin
Output plugins deliver metrics to their configured destination. To activate an output plugin, enable and configure it in Telegraf's configuration file.
Related entries: aggregator plugin, flush interval, input plugin, processor plugin
P
point
A point in the InfluxDB data structure that consists of a single collection of fields in a series. Each point is uniquely identified by its series and timestamp. In a series, you cannot store more than one point with the same timestamp. When you write a new point to a series with a timestamp that matches an existing point, the field set becomes a union of the old and new field set, where any ties go to the new field set.
precision
The precision configuration setting determines how much timestamp precision is retained in the points received from input plugins. All incoming timestamps are truncated to the given precision.
Telegraf then pads the truncated timestamps with zeros to create a nanosecond timestamp; output plugins will emit timestamps in nanoseconds.
Valid precisions are `ns`, `us` (or `µs`), `ms`, and `s`.
For example, if the precision is set to `ms`, the nanosecond epoch timestamp `1480000000123456789` is truncated to `1480000000123` in millisecond precision and padded with zeroes to make a new, less precise nanosecond timestamp of `1480000000123000000`.
Output plugins do not alter the timestamp further. The precision setting is ignored for service input plugins.
Related entries: aggregator plugin, input plugin, output plugin, processor plugin, service input plugin
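The truncate-and-pad behavior described above can be sketched in Python. This is an illustrative sketch of the arithmetic, not Telegraf's source code:

```python
def truncate_to_precision(ns_timestamp, precision):
    """Truncate a nanosecond epoch timestamp to the given precision,
    then pad it back to nanoseconds with zeros."""
    divisors = {"ns": 1, "us": 10**3, "ms": 10**6, "s": 10**9}
    d = divisors[precision]
    # Integer division drops the sub-precision digits; multiplying
    # back restores a nanosecond-width (but less precise) timestamp.
    return (ns_timestamp // d) * d

print(truncate_to_precision(1480000000123456789, "ms"))
# → 1480000000123000000
```

This reproduces the worked example from the entry: the `ms` precision keeps `1480000000123` and zero-pads the rest.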
processor plugin
Processor plugins transform, decorate, and filter metrics collected by input plugins, passing the transformed metrics to the output plugins.
Related entries: aggregator plugin, input plugin, output plugin
Q
query
An operation that retrieves data from InfluxDB. See Query data in InfluxDB.
R
REPL
record
regular expressions
replication factor (RF)
The attribute of the retention policy that determines how many copies of the data are stored in the cluster.
InfluxDB replicates data across `N` data nodes, where `N` is the replication factor.
To maintain data availability for queries, the replication factor should be less than or equal to the number of data nodes in the cluster:
- Data is fully available when the replication factor is greater than the number of unavailable data nodes.
- Data may be unavailable when the replication factor is less than the number of unavailable data nodes.
Any replication factor greater than two gives you additional fault tolerance and query capacity within the cluster.
Related entries: duration, node, retention policy
retention policy (RP)
The part of InfluxDB's data structure that describes how long InfluxDB keeps data (duration), how many copies of the data are stored in the cluster (replication factor), and the time range covered by shard groups (shard group duration). RPs are unique per database and, along with the measurement and tag set, define a series.
When you create a database, InfluxDB automatically creates a retention policy called `autogen` with an infinite duration, a replication factor set to one, and a shard group duration set to seven days.
See Database Management for retention policy management.
Related entries: duration, measurement, replication factor, series, shard duration, tag set
S
schema
How the data are organized in InfluxDB. The fundamentals of the InfluxDB schema are databases, retention policies, series, measurements, tag keys, tag values, and field keys. See Schema Design for more information.
Related entries: database, field key, measurement, retention policy, series, tag key, tag value
scrape
selector
An InfluxQL function that returns a single point from the range of specified points. See InfluxQL Functions for a complete list of the available and upcoming selectors.
Related entries: aggregation, function, transformation
series
The collection of data in InfluxDB's data structure that share a measurement, tag set, and retention policy.
Note: The field set is not part of the series identification.
Related entries: field set, measurement, retention policy, tag set
series cardinality
The number of unique database, measurement, tag set, and field key combinations in an InfluxDB instance.
For example, assume that an InfluxDB instance has a single database and one measurement.
The single measurement has two tag keys: `email` and `status`.
If there are three different `email`s, and each email address is associated with two different `status`es, then the series cardinality for the measurement is 6 (3 * 2 = 6):
email | status |
---|---|
lorr@influxdata.com | start |
lorr@influxdata.com | finish |
marv@influxdata.com | start |
marv@influxdata.com | finish |
cliff@influxdata.com | start |
cliff@influxdata.com | finish |
Note that, in some cases, simply performing that multiplication may overestimate series cardinality because of the presence of dependent tags.
Dependent tags are tags that are scoped by another tag and do not increase series cardinality.
If we add the tag `firstname` to the example above, the series cardinality would not be 18 (3 * 2 * 3 = 18).
It would remain unchanged at 6, as `firstname` is already scoped by the `email` tag:
email | status | firstname |
---|---|---|
lorr@influxdata.com | start | lorraine |
lorr@influxdata.com | finish | lorraine |
marv@influxdata.com | start | marvin |
marv@influxdata.com | finish | marvin |
cliff@influxdata.com | start | clifford |
cliff@influxdata.com | finish | clifford |
See SHOW CARDINALITY to learn about the InfluxQL commands for series cardinality.
Related entries: field key, measurement, tag key, tag set
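The cardinality counts in the tables above can be checked by enumerating unique tag-set combinations. A small Python sketch, using the example data from this entry:

```python
# Unique (email, status) combinations from the first table above.
emails = ["lorr@influxdata.com", "marv@influxdata.com", "cliff@influxdata.com"]
statuses = ["start", "finish"]

series = {(email, status) for email in emails for status in statuses}
print(len(series))  # → 6 (3 emails * 2 statuses)

# A dependent tag adds no new series: firstname is fully determined
# by email, so every existing combination maps to exactly one new one.
firstname = {"lorr@influxdata.com": "lorraine",
             "marv@influxdata.com": "marvin",
             "cliff@influxdata.com": "clifford"}
series_with_firstname = {(e, s, firstname[e]) for e, s in series}
print(len(series_with_firstname))  # → 6, not 18
```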
server
A machine, virtual or physical, that is running InfluxDB. There should only be one InfluxDB process per server.
Related entries: node
service
service input plugin
Service input plugins are input plugins that run in a passive collection mode while the Telegraf agent is running. They listen on a socket for known protocol inputs, or apply their own logic to ingested metrics before delivering them to the Telegraf agent.
Related entries: aggregator plugin, input plugin, output plugin, processor plugin
shard
A shard contains the actual encoded and compressed data, and is represented by a TSM file on disk. Every shard belongs to one and only one shard group. Multiple shards may exist in a single shard group. Each shard contains a specific set of series. All points falling on a given series in a given shard group will be stored in the same shard (TSM file) on disk.
Related entries: series, shard duration, shard group, tsm
shard duration
The shard duration determines how much time each shard group spans.
The specific interval is determined by the `SHARD DURATION` of the retention policy.
See Retention Policy management for more information.
For example, given a retention policy with `SHARD DURATION` set to `1w`, each shard group will span a single week and contain all points with timestamps in that week.
Related entries: database, retention policy, series, shard, shard group
shard group
Shard groups are logical containers for shards. Shard groups are organized by time and retention policy. Every retention policy that contains data has at least one associated shard group. A given shard group contains all shards with data for the interval covered by the shard group. The interval spanned by each shard group is the shard duration.
Related entries: database, retention policy, series, shard, shard duration
Single Stat
Snappy compression
source
stacked graph
statement
step-plot
stream
"stream of tables"
string
subscription
Subscriptions allow Kapacitor to receive data from InfluxDB in a push model rather than the pull model based on querying data. When Kapacitor is configured to work with InfluxDB, the subscription will automatically push every write for the subscribed database from InfluxDB to Kapacitor. Subscriptions can use TCP or UDP for transmitting the writes.
T
TCP
TSL
TSM (Time-structured merge tree)
TSM file
table
tag
The key-value pair in InfluxDB's data structure that records metadata. Tags are an optional part of InfluxDB's data structure but they are useful for storing commonly-queried metadata; tags are indexed so queries on tags are performant. Query tip: Compare tags to fields; fields are not indexed.
Related entries: field, tag key, tag set, tag value
tag key
The key part of the key-value pair that makes up a tag. Tag keys are strings and they store metadata. Tag keys are indexed so queries on tag keys are performant.
Query tip: Compare tag keys to field keys; field keys are not indexed.
Related entries: field key, tag, tag set, tag value
tag set
The collection of tag keys and tag values on a point.
Related entries: point, series, tag, tag key, tag value
tag value
The value part of the key-value pair that makes up a tag. Tag values are strings and they store metadata. Tag values are indexed so queries on tag values are performant.
Related entries: tag, tag key, tag set
task
Telegraf
template
template variable
time (data type)
time series data
Sequence of data points, typically consisting of successive measurements made from the same source over a time interval. Time series data shows how data evolves over time. On a time series data graph, one of the axes is always time. Time series data may be regular or irregular: regular time series data changes at constant intervals, while irregular time series data changes at non-constant intervals.
timestamp
The date and time associated with a point. Time in InfluxDB is UTC.
To specify time when writing data, see Elements of line protocol. To specify time when querying data, see Query InfluxDB with Flux.
Related entries: point
V
values per second
The preferred measurement of the rate at which data are persisted to InfluxDB. Write speeds are generally quoted in values per second.
To calculate the values per second rate, multiply the number of points written per second by the number of values stored per point. For example, if the points have four fields each, and a batch of 5,000 points is written 10 times per second, the values per second rate is 4 field values per point * 5,000 points per batch * 10 batches per second = 200,000 values per second.
Related entries: batch, field, point, points per second
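The worked example above is a straight multiplication; in Python:

```python
# Reproduces the example from this entry: 4 fields per point,
# 5,000 points per batch, 10 batches per second.
fields_per_point = 4
points_per_batch = 5000
batches_per_second = 10

values_per_second = fields_per_point * points_per_batch * batches_per_second
print(values_per_second)  # → 200000
```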
variable
variable assignment
W
WAL (Write Ahead Log)
The temporary cache for recently written points. To reduce the frequency that permanent storage files are accessed, InfluxDB caches new points in the WAL until their total size or age triggers a flush to more permanent storage. This allows for efficient batching of the writes into the TSM.
Points in the WAL can be queried and persist through a system reboot. On process start, all points in the WAL must be flushed before the system accepts new writes.
Related entries: tsm