Merge branch 'master' into docs/api-release

pull/4437/head
sunbryely-influxdata 2022-09-09 10:09:54 -07:00 committed by GitHub
commit 08f408d542
178 changed files with 44309 additions and 65 deletions


@ -0,0 +1,41 @@
---
title: InfluxDB Enterprise 1.10 documentation
description: >
  Documentation for InfluxDB Enterprise, which adds clustering, high availability, fine-grained authorization, and more to InfluxDB OSS.
aliases:
  - /enterprise/v1.10/
menu:
  enterprise_influxdb_1_10:
    name: InfluxDB Enterprise v1.10
    weight: 1
---
InfluxDB Enterprise provides a time series database designed to handle high write and query loads and offers highly scalable clusters on your infrastructure with a management UI. Use for DevOps monitoring, IoT sensor data, and real-time analytics. Check out the key features that make InfluxDB Enterprise a great choice for working with time series data.
If you're interested in working with InfluxDB Enterprise, visit
[InfluxPortal](https://portal.influxdata.com/) to sign up, get a license key,
and get started!
## Key features
- High performance datastore written specifically for time series data. High ingest speed and data compression.
- Provides high availability across your cluster and eliminates a single point of failure.
- Written entirely in Go. Compiles into a single binary with no external dependencies.
- Simple, high performing write and query HTTP APIs.
- Plugin support for other data ingestion protocols such as Graphite, collectd, and OpenTSDB.
- Expressive SQL-like query language tailored to easily query aggregated data.
- Continuous queries automatically compute aggregate data to make frequent queries more efficient.
- Tags let you index series for fast and efficient queries.
- Retention policies efficiently auto-expire stale data.
## Next steps
- [Install and deploy](/enterprise_influxdb/v1.10/introduction/installation/)
- Review key [concepts](/enterprise_influxdb/v1.10/concepts/)
- [Get started](/enterprise_influxdb/v1.10/introduction/getting-started/)
<!-- Monitor your cluster
- Manage queries
- Manage users
- Explore and visualize your data
-->


@ -0,0 +1,14 @@
---
title: About the project
description: >
  Release notes, licenses, and third-party software details for InfluxDB Enterprise.
menu:
  enterprise_influxdb_1_10_ref:
    weight: 10
---
{{< children hlevel="h2" >}}
## Commercial license
InfluxDB Enterprise is available with a commercial license. [Contact sales for more information](https://www.influxdata.com/contact-sales/).

File diff suppressed because it is too large


@ -0,0 +1,51 @@
---
title: Third party software
description: >
  InfluxData products contain third-party software that is copyrighted,
  patented, or otherwise legally protected software of third parties
  incorporated in InfluxData products.
menu:
  enterprise_influxdb_1_10_ref:
    name: Third party software
    weight: 20
    parent: About the project
---
InfluxData products contain third party software, which means the copyrighted,
patented, or otherwise legally protected software of third parties that is
incorporated in InfluxData products.
Third party suppliers make no representation nor warranty with respect to
such third party software or any portion thereof.
Third party suppliers assume no liability for any claim that might arise with
respect to such third party software, nor for a
customer's use of or inability to use the third party software.
InfluxDB Enterprise 1.10 includes the following third party software components, which are maintained on a version-by-version basis.
| Component | License | Integration |
| :-------- | :-------- | :-------- |
| [ASN1 BER Encoding / Decoding Library for the GO programming language (go-asn1-ber/asn1-ber)](https://github.com/go-asn1-ber/asn1-ber) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [Cobra is a commander for modern Go CLI interactions (spf13/cobra)](https://github.com/spf13/cobra) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
| [A golang registry for global request variables (gorilla/context)](https://github.com/gorilla/context) | [BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause) | Statically linked |
| [FlatBuffers: Memory Efficient Serialization Library (google/flatbuffers)](https://github.com/google/flatbuffers) | [Apache License 2.0](https://opensource.org/licenses/Apache-2.0) | Statically linked |
| [Flux is a lightweight scripting language for querying databases (like InfluxDB) and working with data (influxdata/flux)](https://github.com/influxdata/flux) | [Apache License 2.0](https://opensource.org/licenses/Apache-2.0) | Statically linked |
| [GoConvey is a yummy Go testing tool for gophers (glycerine/goconvey)](https://github.com/glycerine/goconvey) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [An immutable radix tree implementation in Golang (hashicorp/go-immutable-radix)](https://github.com/hashicorp/go-immutable-radix)| [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
| [Some helpful packages for writing Go apps (markbates/going)](https://github.com/markbates/going)| [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [Golang LRU cache implements a fixed-size thread safe LRU cache (hashicorp/golang-lru)](https://github.com/hashicorp/golang-lru) |[Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
| [Codec - a high performance and feature-rich Idiomatic encode/decode and rpc library for msgpack and Binc (hashicorp/go-msgpack)](https://github.com/hashicorp/go-msgpack)| [BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause) | Statically linked |
| [A Golang library for exporting performance and runtime metrics to external metrics systems, i.e. statsite, statsd (armon/go-metrics)](https://github.com/armon/go-metrics) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [Generates UUID-format strings using purely high quality random bytes (hashicorp/go-uuid)](https://github.com/hashicorp/go-uuid) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
| [Collection of useful handlers for Go net/http package (gorilla/handlers)](https://github.com/gorilla/handlers) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
| [Golang implementation of JavaScript Object Signing and Encryption (dvsekhvalnov/jose2go)](https://github.com/dvsekhvalnov/jose2go) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [Basic LDAP v3 functionality for the Go programming language (go-ldap/ldap)](https://github.com/go-ldap/ldap) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [Basic LDAP v3 functionality for the Go programming language (mark-rushakoff/ldapserver)](https://github.com/mark-rushakoff/ldapserver) | [BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause) | Statically linked |
| [A powerful URL router and dispatcher for golang (gorilla/mux)](https://github.com/gorilla/mux) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
| [pkcs7 implements parsing and creating signed and enveloped messages (fullsailor/pkcs7)](https://github.com/fullsailor/pkcs7) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [Pretty printing for Go values (kr/pretty)](https://github.com/kr/pretty) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|[Go language implementation of the Raft consensus protocol (hashicorp/raft)](https://github.com/hashicorp/raft) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
| [Raft backend implementation using BoltDB (hashicorp/raft-boltdb)](https://github.com/hashicorp/raft-boltdb) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
| [General purpose extensions to golang's database/sql (jmoiron/sqlx)](https://github.com/jmoiron/sqlx) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [Miscellaneous functions for formatting text (kr/text)](https://github.com/kr/text) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
| [Golang connection multiplexing library (hashicorp/yamux)](https://github.com/hashicorp/yamux/) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |


@ -0,0 +1,10 @@
---
title: Administer InfluxDB Enterprise
description: Configuration, security, and logging in InfluxDB Enterprise.
menu:
  enterprise_influxdb_1_10:
    name: Administration
    weight: 70
---
{{< children >}}


@ -0,0 +1,515 @@
---
title: Back up and restore
description: >
  Back up and restore InfluxDB Enterprise clusters to prevent data loss.
aliases:
  - /enterprise/v1.10/guides/backup-and-restore/
menu:
  enterprise_influxdb_1_10:
    name: Back up and restore
    weight: 10
    parent: Administration
---
- [Overview](#overview)
- [Backup and restore utilities](#backup-and-restore-utilities)
- [Exporting and importing data](#exporting-and-importing-data)
## Overview
When deploying InfluxDB Enterprise in production environments, you should have a strategy and procedures for backing up and restoring your InfluxDB Enterprise clusters to be prepared for unexpected data loss.
The tools provided by InfluxDB Enterprise can be used to:
- Provide disaster recovery due to unexpected events
- Migrate data to new environments or servers
- Restore clusters to a consistent state
- Debug cluster issues
Depending on the volume of data to be protected and your application requirements, InfluxDB Enterprise offers two methods, described below, for managing backups and restoring data:
- [Backup and restore utilities](#backup-and-restore-utilities) — For most applications
- [Exporting and importing data](#exporting-and-importing-data) — For large datasets
> **Note:** Use the [`backup` and `restore` utilities (InfluxDB OSS 1.5 and later)](/enterprise_influxdb/v1.10/administration/backup-and-restore/) to:
>
> - Restore InfluxDB Enterprise backup files to InfluxDB OSS instances.
> - Back up InfluxDB OSS data that can be restored in InfluxDB Enterprise clusters.
## Backup and restore utilities
InfluxDB Enterprise supports backing up and restoring data in a cluster,
a single database and retention policy, and single shards.
Most InfluxDB Enterprise applications can use the backup and restore utilities.
Use the `backup` and `restore` utilities to back up and restore between `influxd`
instances with the same versions or with only minor version differences.
For example, you can back up from {{< latest-patch version="1.10" >}} and restore on {{< latest-patch >}}.
### Backup utility
A backup creates a copy of the [metastore](/enterprise_influxdb/v1.10/concepts/glossary/#metastore) and [shard](/enterprise_influxdb/v1.10/concepts/glossary/#shard) data at that point in time and stores the copy in the specified directory.
Or, back up **only the cluster metastore** using the `-strategy only-meta` backup option. For more information, see [perform a metadata only backup](#perform-a-metadata-only-backup).
All backups include a manifest, a JSON file describing what was collected during the backup.
The filenames reflect the UTC timestamp of when the backup was created, for example:
- Metastore backup: `20060102T150405Z.meta` (includes usernames and passwords)
- Shard data backup: `20060102T150405Z.<shard_id>.tar.gz`
- Manifest: `20060102T150405Z.manifest`
Backups can be full, metastore only, or incremental, and they are incremental by default:
- **Full backup**: Creates a copy of the metastore and shard data.
- **Incremental backup**: Creates a copy of the metastore and shard data that have changed since the last incremental backup. If there are no existing incremental backups, the system automatically performs a complete backup.
- **Metastore only backup**: Creates a copy of the metastore data only.
Restoring different types of backups requires different syntax.
To prevent issues with [restore](#restore-utility), keep full backups, metastore only backups, and incremental backups in separate directories.
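For example, keeping each strategy in its own directory might look like the following (the directory names here are arbitrary):
```bash
# Incremental backup (default strategy)
influxd-ctl backup ./backups/incremental

# Full backup
influxd-ctl backup -strategy full ./backups/full

# Metastore only backup
influxd-ctl backup -strategy only-meta ./backups/meta
```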
>**Note:** The backup utility copies all data through the meta node that is used to
execute the backup. As a result, performance of a backup and restore is typically limited by the network IO of the meta node. Increasing the resources available to this meta node (such as resizing the EC2 instance) can significantly improve backup and restore performance.
#### Syntax
```bash
influxd-ctl [global-options] backup [backup-options] <path-to-backup-directory>
```
> **Note:** The `influxd-ctl backup` command exits with `0` for success and `1` for failure. If the backup fails, output can be directed to a log file to troubleshoot.
##### Global options
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#global-options)
for a complete list of the global `influxd-ctl` options.
##### Backup options
- `-db <string>`: name of the single database to back up
- `-from <TCP-address>`: the data node TCP address to prefer when backing up
- `-strategy`: select the backup strategy to apply during backup
- `incremental`: _**(Default)**_ backup only data added since the previous backup.
- `full`: perform a full backup. Same as `-full`.
- `only-meta`: perform a backup of metadata only: users, roles,
databases, continuous queries, retention policies. Shards are not exported.
- `-full`: perform a full backup. Deprecated in favour of `-strategy=full`
- `-rp <string>`: the name of the single retention policy to back up (must specify `-db` with `-rp`)
- `-shard <unit>`: the ID of the single shard to back up (cannot be used with `-db`)
- `-start <timestamp>`: Include all points starting with specified timestamp (RFC3339 format). Not compatible with `-since` or `-strategy full`.
- `-end <timestamp>`: Exclude all points after timestamp (RFC3339 format). Not compatible with `-since` or `-strategy full`.
### Examples
#### Back up a database and all retention policies
Store the following incremental backups in different directories.
The first backup specifies `-db myfirstdb` and the second backup specifies
different options: `-db myfirstdb` and `-rp autogen`.
```bash
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
influxd-ctl backup -db myfirstdb -rp autogen ./myfirstdb-autogen-backup
```
#### Back up a database with a specific retention policy
Store the following incremental backups in the same directory.
Both backups specify the same `-db` flag and the same database.
```bash
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
```
#### Back up data from a specific time range
To back up data in a specific time range, use the `-start` and `-end` options:
```bash
influxd-ctl backup -db myfirstdb ./myfirstdb-jandata -start 2022-01-01T12:00:00Z -end 2022-01-31T11:59:00Z
```
#### Perform an incremental backup
Perform an incremental backup into the current directory with the command below.
If there are any existing backups in the current directory, the system performs an incremental backup.
If there aren't any existing backups in the current directory, the system performs a backup of all data in InfluxDB.
```bash
# Syntax
influxd-ctl backup .
# Example
$ influxd-ctl backup .
Backing up meta data... Done. 421 bytes transferred
Backing up node 7ba671c7644b:8088, db telegraf, rp autogen, shard 4... Done. Backed up in 903.539567ms, 307712 bytes transferred
Backing up node bf5a5f73bad8:8088, db _internal, rp monitor, shard 1... Done. Backed up in 138.694402ms, 53760 bytes transferred
Backing up node 9bf0fa0c302a:8088, db _internal, rp monitor, shard 2... Done. Backed up in 101.791148ms, 40448 bytes transferred
Backing up node 7ba671c7644b:8088, db _internal, rp monitor, shard 3... Done. Backed up in 144.477159ms, 39424 bytes transferred
Backed up to . in 1.293710883s, transferred 441765 bytes
$ ls
20160803T222310Z.manifest 20160803T222310Z.s1.tar.gz 20160803T222310Z.s3.tar.gz
20160803T222310Z.meta 20160803T222310Z.s2.tar.gz 20160803T222310Z.s4.tar.gz
```
#### Perform a full backup
Perform a full backup into a specific directory with the command below.
The directory must already exist.
```bash
# Syntax
influxd-ctl backup -full <path-to-backup-directory>
# Example
$ influxd-ctl backup -full backup_dir
Backing up meta data... Done. 481 bytes transferred
Backing up node <hostname>:8088, db _internal, rp monitor, shard 1... Done. Backed up in 33.207375ms, 238080 bytes transferred
Backing up node <hostname>:8088, db telegraf, rp autogen, shard 2... Done. Backed up in 15.184391ms, 95232 bytes transferred
Backed up to backup_dir in 51.388233ms, transferred 333793 bytes
$ ls backup_dir
20170130T184058Z.manifest
20170130T184058Z.meta
20170130T184058Z.s1.tar.gz
20170130T184058Z.s2.tar.gz
```
#### Perform an incremental backup on a single database
Point at a remote meta server and back up only one database into a given directory (the directory must already exist):
```bash
# Syntax
influxd-ctl -bind <metahost>:8091 backup -db <db-name> <path-to-backup-directory>
# Example
$ influxd-ctl -bind 2a1b7a338184:8091 backup -db telegraf ./telegrafbackup
Backing up meta data... Done. 318 bytes transferred
Backing up node 7ba671c7644b:8088, db telegraf, rp autogen, shard 4... Done. Backed up in 997.168449ms, 399872 bytes transferred
Backed up to ./telegrafbackup in 1.002358077s, transferred 400190 bytes
$ ls ./telegrafbackup
20160803T222811Z.manifest 20160803T222811Z.meta 20160803T222811Z.s4.tar.gz
```
#### Perform a metadata only backup
Perform a metadata only backup into a specific directory with the command below.
The directory must already exist.
```bash
# Syntax
influxd-ctl backup -strategy only-meta <path-to-backup-directory>
# Example
$ influxd-ctl backup -strategy only-meta backup_dir
Backing up meta data... Done. 481 bytes transferred
Backed up to backup_dir in 51.388233ms, transferred 481 bytes
$ ls backup_dir
20170130T184058Z.manifest
20170130T184058Z.meta
```
### Restore utility
#### Disable anti-entropy (AE) before restoring a backup
> Before restoring a backup, stop the anti-entropy (AE) service (if enabled) on **each data node in the cluster, one at a time**.
>
> 1. Stop the `influxd` service.
> 2. Set `[anti-entropy].enabled` to `false` in the data node configuration file (by default, `influxdb.conf`).
> 3. Restart the `influxd` service and wait for the data node to receive read and write requests and for the [hinted handoff queue](/enterprise_influxdb/v1.10/concepts/clustering/#hinted-handoff) to drain.
> 4. Once AE is disabled on all data nodes and each node returns to a healthy state, you're ready to restore the backup. For details on how to restore your backup, see examples below.
> 5. After restoring the backup, restart AE services on each data node.
##### Restore a backup
Restore a backup to an existing cluster or a new cluster.
By default, a restore writes to databases using the backed-up data's [replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor).
An alternate replication factor can be specified with the `-newrf` flag when restoring a single database.
Restore supports both `-full` backups and incremental backups; the syntax for
a restore differs depending on the backup type.
##### Restores from an existing cluster to a new cluster
Restores from an existing cluster to a new cluster restore the existing cluster's
[users](/enterprise_influxdb/v1.10/concepts/glossary/#user), roles,
[databases](/enterprise_influxdb/v1.10/concepts/glossary/#database), and
[continuous queries](/enterprise_influxdb/v1.10/concepts/glossary/#continuous-query-cq) to
the new cluster.
They do not restore Kapacitor [subscriptions](/enterprise_influxdb/v1.10/concepts/glossary/#subscription).
In addition, restores to a new cluster drop any data in the new cluster's
`_internal` database and begin writing to that database anew.
The restore does not write the existing cluster's `_internal` database to
the new cluster.
#### Syntax to restore from incremental and metadata backups
Use the syntax below to restore an incremental or metadata backup to a new cluster or an existing cluster.
**The existing cluster must contain no data in the affected databases.**
Performing a restore from an incremental backup requires the path to the incremental backup's directory.
```bash
influxd-ctl [global-options] restore [restore-options] <path-to-backup-directory>
```
{{% note %}}
The existing cluster can have data in the `_internal` database (the database InfluxDB creates if
[internal monitoring](/platform/monitoring/influxdata-platform/tools/measurements-internal) is enabled).
The system automatically drops the `_internal` database when it performs a complete restore.
{{% /note %}}
##### Global options
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#global-options)
for a complete list of the global `influxd-ctl` options.
##### Restore options
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#restore)
for a complete list of `influxd-ctl restore` options.
- `-db <string>`: the name of the single database to restore
- `-list`: shows the contents of the backup
- `-newdb <string>`: the name of the new database to restore to (must specify with `-db`)
- `-newrf <int>`: the new replication factor to restore to (this is capped to the number of data nodes in the cluster)
- `-newrp <string>`: the name of the new retention policy to restore to (must specify with `-rp`)
- `-rp <string>`: the name of the single retention policy to restore
- `-shard <unit>`: the shard ID to restore
#### Syntax to restore from a full or manifest only backup
Use the syntax below to restore a full or manifest only backup to a new cluster or an existing cluster.
Note that the existing cluster must contain no data in the affected databases.*
Performing a restore requires the `-full` flag and the path to the backup's manifest file.
```bash
influxd-ctl [global-options] restore [options] -full <path-to-manifest-file>
```
\* The existing cluster can have data in the `_internal` database, the database
that the system creates by default.
The system automatically drops the `_internal` database when it performs a
complete restore.
##### Global options
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#global-options)
for a complete list of the global `influxd-ctl` options.
##### Restore options
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#restore)
for a complete list of `influxd-ctl restore` options.
- `-db <string>`: the name of the single database to restore
- `-list`: shows the contents of the backup
- `-newdb <string>`: the name of the new database to restore to (must specify with `-db`)
- `-newrf <int>`: the new replication factor to restore to (this is capped to the number of data nodes in the cluster)
- `-newrp <string>`: the name of the new retention policy to restore to (must specify with `-rp`)
- `-rp <string>`: the name of the single retention policy to restore
- `-shard <unit>`: the shard ID to restore
#### Examples
##### Restore from an incremental backup
```bash
# Syntax
influxd-ctl restore <path-to-backup-directory>
# Example
$ influxd-ctl restore my-incremental-backup/
Using backup directory: my-incremental-backup/
Using meta backup: 20170130T231333Z.meta
Restoring meta data... Done. Restored in 21.373019ms, 1 shards mapped
Restoring db telegraf, rp autogen, shard 2 to shard 2...
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 2 in 61.046571ms, 588800 bytes transferred
Restored from my-incremental-backup/ in 83.892591ms, transferred 588800 bytes
```
##### Restore from a metadata backup
In this example, the `restore` command restores a [metadata backup](#perform-a-metadata-only-backup)
stored in the `metadata-backup/` directory.
```bash
# Syntax
influxd-ctl restore <path-to-backup-directory>
# Example
$ influxd-ctl restore metadata-backup/
Using backup directory: metadata-backup/
Using meta backup: 20200101T000000Z.meta
Restoring meta data... Done. Restored in 21.373019ms, 1 shards mapped
Restored from metadata-backup/ in 19.2311ms, transferred 588 bytes
```
##### Restore from a `-full` backup
```bash
# Syntax
influxd-ctl restore -full <path-to-manifest-file>
# Example
$ influxd-ctl restore -full my-full-backup/20170131T020341Z.manifest
Using manifest: my-full-backup/20170131T020341Z.manifest
Restoring meta data... Done. Restored in 9.585639ms, 1 shards mapped
Restoring db telegraf, rp autogen, shard 2 to shard 2...
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 2 in 48.095082ms, 569344 bytes transferred
Restored from my-full-backup in 58.58301ms, transferred 569344 bytes
```
##### Restore from an incremental backup for a single database and give the database a new name
```bash
# Syntax
influxd-ctl restore -db <src> -newdb <dest> <path-to-backup-directory>
# Example
$ influxd-ctl restore -db telegraf -newdb restored_telegraf my-incremental-backup/
Using backup directory: my-incremental-backup/
Using meta backup: 20170130T231333Z.meta
Restoring meta data... Done. Restored in 8.119655ms, 1 shards mapped
Restoring db telegraf, rp autogen, shard 2 to shard 4...
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 4 in 57.89687ms, 588800 bytes transferred
Restored from my-incremental-backup/ in 66.715524ms, transferred 588800 bytes
```
##### Restore from an incremental backup for a database and merge that database into an existing database
Your `telegraf` database was mistakenly dropped, but you have a recent backup so you've only lost a small amount of data.
If Telegraf is still running, it will recreate the `telegraf` database shortly after the database is dropped.
You might try to directly restore your `telegraf` backup just to find that you can't restore:
```bash
$ influxd-ctl restore -db telegraf my-incremental-backup/
Using backup directory: my-incremental-backup/
Using meta backup: 20170130T231333Z.meta
Restoring meta data... Error.
restore: operation exited with error: problem setting snapshot: database already exists
```
To work around this, you can restore your telegraf backup into a new database by specifying the `-db` flag for the source and the `-newdb` flag for the new destination:
```bash
$ influxd-ctl restore -db telegraf -newdb restored_telegraf my-incremental-backup/
Using backup directory: my-incremental-backup/
Using meta backup: 20170130T231333Z.meta
Restoring meta data... Done. Restored in 19.915242ms, 1 shards mapped
Restoring db telegraf, rp autogen, shard 2 to shard 7...
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 7 in 36.417682ms, 588800 bytes transferred
Restored from my-incremental-backup/ in 56.623615ms, transferred 588800 bytes
```
Then, in the [`influx` client](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/), use an [`INTO` query](/enterprise_influxdb/v1.10/query_language/explore-data/#the-into-clause) to copy the data from the new database into the existing `telegraf` database:
```bash
$ influx
> USE restored_telegraf
Using database restored_telegraf
> SELECT * INTO telegraf..:MEASUREMENT FROM /.*/ GROUP BY *
name: result
------------
time written
1970-01-01T00:00:00Z 471
```
##### Restore (overwrite) metadata from a full or incremental backup to fix damaged metadata
1. Identify a backup with uncorrupted metadata from which to restore.
2. Restore from backup with `-meta-only-overwrite-force`.
{{% warn %}}
Only use the `-meta-only-overwrite-force` flag to restore from backups of the target cluster.
If you use this flag with metadata from a different cluster, you will lose data (since metadata includes shard assignments to data nodes).
{{% /warn %}}
```bash
# Syntax
influxd-ctl restore -meta-only-overwrite-force <path-to-backup-directory>
# Example
$ influxd-ctl restore -meta-only-overwrite-force my-incremental-backup/
Using backup directory: my-incremental-backup/
Using meta backup: 20200101T000000Z.meta
Restoring meta data... Done. Restored in 21.373019ms, 1 shards mapped
Restored from my-incremental-backup/ in 19.2311ms, transferred 588 bytes
```
#### Common issues with restore
##### Restore writes information not part of the original backup
If a [restore from an incremental backup](#syntax-to-restore-from-incremental-and-metadata-backups)
does not limit the restore to the same database, retention policy, and shard specified by the backup command,
the restore may appear to restore information that was not part of the original backup.
Backups consist of a shard data backup and a metastore backup.
The **shard data backup** contains the actual time series data: the measurements, tags, fields, and so on.
The **metastore backup** contains user information, database names, retention policy names, shard metadata, continuous queries, and subscriptions.
When the system creates a backup, the backup includes:
* the relevant shard data determined by the specified backup options
* all of the metastore information in the cluster regardless of the specified backup options
Because a backup always includes the complete metastore information, a restore that doesn't include the same options specified by the backup command may appear to restore data that were not targeted by the original backup.
The unintended data, however, include only the metastore information, not the shard data associated with that metastore information.
##### Restore a backup created prior to version 1.2.0
InfluxDB Enterprise introduced incremental backups in version 1.2.0.
To restore a backup created prior to version 1.2.0, be sure to follow the syntax
for [restoring from a full backup](#restore-from-a-full-backup).
## Exporting and importing data
For most InfluxDB Enterprise applications, the [backup and restore utilities](#backup-and-restore-utilities) provide the tools you need for your backup and restore strategy. However, in some cases, the standard backup and restore utilities may not adequately handle the volumes of data in your application.
As an alternative to the standard backup and restore utilities, use the InfluxDB `influx_inspect export` and `influx -import` commands to create backup and restore procedures for your disaster recovery and backup strategy. These commands can be executed manually or included in shell scripts that run the export and import operations at scheduled intervals (example below).
### Exporting data
Use the [`influx_inspect export` command](/enterprise_influxdb/v1.10/tools/influx_inspect#export) to export data in line protocol format from your InfluxDB Enterprise cluster. Options include:
- Exporting all, or specific, databases
- Filtering with starting and ending timestamps
- Using gzip compression for smaller files and faster exports
For details on optional settings and usage, see [`influx_inspect export` command](/enterprise_influxdb/v1.10/tools/influx_inspect#export).
In the following example, the database export is filtered to include only one day of data and compressed for optimal speed and file size.
```bash
influx_inspect export \
-database myDB \
-compress \
-start 2019-05-19T00:00:00.000Z \
-end 2019-05-19T23:59:59.999Z
```
### Importing data
After exporting the data in line protocol format, you can import the data using the [`influx -import` CLI command](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/#import-data-from-a-file-with--import).
In the following example, the compressed data file is imported into the specified database.
```bash
influx -import -database myDB -compressed
```
For details on using the `influx -import` command, see [Import data from a file with -import](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/#import-data-from-a-file-with--import).
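For example, a scheduled export job might look like the following sketch, run daily from cron or a similar scheduler; the database name, output paths, and `date` usage are hypothetical placeholders:
```bash
#!/usr/bin/env bash
# Hypothetical daily export job: dump yesterday's data from one database
# as compressed line protocol into a dated directory.
set -euo pipefail

DB="myDB"
DAY="$(date -u -d 'yesterday' +%Y-%m-%d)"   # GNU date; adjust for other platforms
OUT_DIR="/var/backups/influxdb/${DAY}"

mkdir -p "${OUT_DIR}"

influx_inspect export \
  -database "${DB}" \
  -compress \
  -out "${OUT_DIR}/${DB}.lp.gz" \
  -start "${DAY}T00:00:00.000Z" \
  -end "${DAY}T23:59:59.999Z"

# On the target cluster, the file could then be imported, for example:
# influx -import -path "${OUT_DIR}/${DB}.lp.gz" -compressed -database "${DB}"
```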
### Example
For an example of using the exporting and importing data approach for disaster recovery, see the Capital One presentation from InfluxDays 2019 on ["Architecting for Disaster Recovery"](https://www.youtube.com/watch?v=LyQDhSdnm4A). In this presentation, Capital One discusses the following:
- Exporting data every 15 minutes from an active cluster to an AWS S3 bucket.
- Replicating the export file in the S3 bucket using the AWS S3 copy command.
- Importing data every 15 minutes from the AWS S3 bucket to a cluster available for disaster recovery.
- Advantages of the export-import approach over the standard backup and restore utilities for large volumes of data.
- Managing users and scheduled exports and imports with a custom administration tool.


@ -0,0 +1,11 @@
---
title: Configure
description: Configure cluster and node settings in InfluxDB Enterprise.
menu:
  enterprise_influxdb_1_10:
    name: Configure
    weight: 11
    parent: Administration
---
{{< children >}}


@ -0,0 +1,349 @@
---
title: Use Anti-Entropy service in InfluxDB Enterprise
description: The Anti-Entropy service monitors and repairs shards in InfluxDB.
aliases:
  - /enterprise_influxdb/v1.10/guides/Anti-Entropy/
  - /enterprise_influxdb/v1.10/administration/anti-entropy/
menu:
  enterprise_influxdb_1_10:
    name: Use Anti-entropy service
    parent: Configure
    weight: 50
---
{{% warn %}}
Prior to InfluxDB Enterprise 1.7.2, the Anti-Entropy (AE) service was enabled by default. When shards create digests with a large number of time ranges (tens of thousands), some customers have experienced significant performance issues, including CPU usage spikes. If your shards include a small number of time ranges (most have 1 to 10, some have up to several hundred) and you can benefit from the AE service, enable AE and monitor it closely to see if your performance is adversely impacted.
{{% /warn %}}
## Introduction
Shard entropy refers to inconsistency among shards in a shard group.
This can be due to the "eventually consistent" nature of data stored in InfluxDB
Enterprise clusters or due to missing or unreachable shards.
The Anti-Entropy (AE) service ensures that each data node has all the shards it
owns according to the metastore and that all shards in a shard group are consistent.
Missing shards are automatically repaired without operator intervention while
out-of-sync shards can be manually queued for repair.
This topic covers how the Anti-Entropy service works and some of the basic situations where it takes effect.
## Concepts
The Anti-Entropy service is a component of the `influxd` service available on each of your data nodes. Use this service to ensure that each data node has all of the shards that the metastore says it owns and ensure all shards in a shard group are in sync.
If any shards are missing, the Anti-Entropy service will copy existing shards from other shard owners.
If data inconsistencies are detected among shards in a shard group, [invoke the Anti-Entropy service](#command-line-tools-for-managing-entropy) and queue the out-of-sync shards for repair.
In the repair process, the Anti-Entropy service will sync the necessary updates from other shards
within a shard group.
By default, the service performs consistency checks every 5 minutes. This interval can be modified in the [`anti-entropy.check-interval`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#check-interval) configuration setting.
The Anti-Entropy service can only address missing or inconsistent shards when
there is at least one copy of the shard available.
In other words, as long as new and healthy nodes are introduced, a replication
factor of 2 can recover from one missing or inconsistent node;
a replication factor of 3 can recover from two missing or inconsistent nodes, and so on.
A replication factor of 1, which is not recommended, cannot be recovered by the Anti-Entropy service.
## Symptoms of entropy
The Anti-Entropy service automatically detects and fixes missing shards, but shard inconsistencies
must be [manually detected and queued for repair](#detecting-and-repairing-entropy).
There are symptoms of entropy that, if seen, would indicate an entropy repair is necessary.
### Different results for the same query
When running queries against an InfluxDB Enterprise cluster, each query may be routed to a different data node.
If entropy affects data within the queried range, the same query will return different
results depending on which node the query runs against.
_**Query attempt 1**_
```sql
SELECT mean("usage_idle") WHERE time > '2018-06-06T18:00:00Z' AND time < '2018-06-06T18:15:00Z' GROUP BY time(3m) FILL(0)
name: cpu
time mean
---- ----
1528308000000000000 99.11867392974537
1528308180000000000 99.15410822137049
1528308360000000000 99.14927494363032
1528308540000000000 99.1980535465783
1528308720000000000 99.18584290492262
```
_**Query attempt 2**_
```sql
SELECT mean("usage_idle") WHERE time > '2018-06-06T18:00:00Z' AND time < '2018-06-06T18:15:00Z' GROUP BY time(3m) FILL(0)
name: cpu
time mean
---- ----
1528308000000000000 99.11867392974537
1528308180000000000 0
1528308360000000000 0
1528308540000000000 0
1528308720000000000 99.18584290492262
```
The results indicate that data is missing in the queried time range and entropy is present.
### Flapping dashboards
A "flapping" dashboard means data visualizations change when data is refreshed
and pulled from a node with entropy (inconsistent data).
It is the visual manifestation of getting [different results from the same query](#different-results-for-the-same-query).
<img src="/img/enterprise/1-6-flapping-dashboard.gif" alt="Flapping dashboard" style="width:100%; max-width:800px">
## Technical details
### Detecting entropy
The Anti-Entropy service runs on each data node and periodically checks its shards' statuses
relative to the next data node in the ownership list.
The service creates a "digest" or summary of data in the shards on the node.
For example, assume there are two data nodes in your cluster: `node1` and `node2`.
Both `node1` and `node2` own `shard1` so `shard1` is replicated across each.
When a status check runs, `node1` will ask `node2` when `shard1` was last modified.
If the reported modification time differs from the previous check, then
`node1` asks `node2` for a new digest of `shard1` and checks for differences (performs a "diff") between `node2`'s `shard1` digest and the local `shard1` digest.
If a difference exists, `shard1` is flagged as having entropy.
### Repairing entropy
If during a status check a node determines the next node is completely missing a shard,
it immediately adds the missing shard to the repair queue.
A background routine monitors the queue and begins the repair process as new shards are added to it.
Repair requests are pulled from the queue by the background process and repaired using a `copy shard` operation.
> Currently, shards that are present on both nodes but contain different data are not automatically queued for repair.
> A user must make the request via `influxd-ctl entropy repair <shard ID>`.
> For more information, see [Detecting and repairing entropy](#detecting-and-repairing-entropy) below.
Using `node1` and `node2` from the [earlier example](#detecting-entropy), `node1` asks `node2` for a digest of `shard1`.
`node1` diffs its own local `shard1` digest and `node2`'s `shard1` digest,
then creates a new digest containing only the differences (the diff digest).
The diff digest is used to create a patch containing only the data `node2` is missing.
`node1` sends the patch to `node2` and instructs it to apply it.
Once `node2` finishes applying the patch, it queues a repair for `shard1` locally.
The "node-to-node" shard repair continues until it runs on every data node that owns the shard in need of repair.
### Repair order
Repairs between shard owners happen in a deterministic order.
This doesn't mean repairs always start on node 1 and then follow a specific node order.
Repairs are viewed at the shard level.
Each shard has a list of owners and the repairs for a particular shard will happen
in a deterministic order among its owners.
When the Anti-Entropy service on any data node receives a repair request for a shard, it determines which
owner node is the first in the deterministic order and forwards the request to that node.
The request is now queued on the first owner.
The first owner's repair processor pulls it from the queue, detects the differences
between the local copy of the shard with the copy of the same shard on the next
owner in the deterministic order, then generates a patch from that difference.
The first owner then makes an RPC call to the next owner instructing it to apply
the patch to its copy of the shard.
Once the next owner has successfully applied the patch, it adds that shard to the Anti-Entropy repair queue.
A list of "visited" nodes follows the repair through the list of owners.
Each owner will check the list to detect when the repair has cycled through all owners,
at which point the repair is finished.
### Hot shards
The Anti-Entropy service does its best to avoid hot shards (shards that are currently receiving writes)
because they change quickly.
While write replication between shard owner nodes (with a
[replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor)
greater than 1) typically happens in milliseconds, this slight difference is
still enough to cause the appearance of entropy where there is none.
Because the Anti-Entropy service repairs only cold shards, unexpected effects can occur.
Consider the following scenario:
1. A shard goes cold.
2. Anti-Entropy detects entropy.
3. Entropy is reported by the [Anti-Entropy `/status` API](/enterprise_influxdb/v1.10/administration/anti-entropy-api/#get-status) or with the `influxd-ctl entropy show` command.
4. Shard takes a write, gets compacted, or something else causes it to go hot.
_These actions are out of Anti-Entropy's control._
5. A repair is requested, but is ignored because the shard is now hot.
In this example, you would have to periodically request a repair of the shard
until it either shows as being in the queue, being repaired, or no longer in the list of shards with entropy.
## Configuration
The configuration settings for the Anti-Entropy service are described in the [Anti-Entropy settings](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#anti-entropy-ae-settings) section of the data node configuration.
To enable the Anti-Entropy service, change the default value of the `[anti-entropy].enabled = false` setting to `true` in the `influxdb.conf` file of each of your data nodes.
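A minimal sketch of the relevant section in each data node's `influxdb.conf`, using the default check interval mentioned above:
```toml
[anti-entropy]
  # Enable the Anti-Entropy service (disabled by default).
  enabled = true
  # How often the service runs consistency checks (default shown).
  check-interval = "5m"
```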
## Command line tools for managing entropy
>**Note:** The Anti-Entropy service is disabled by default and must be enabled before using these commands.
The `influxd-ctl entropy` command enables you to manage entropy among shards in a cluster.
It includes the following subcommands:
#### `show`
Lists shards that are in an inconsistent state and in need of repair as well as
shards currently in the repair queue.
```bash
influxd-ctl entropy show
```
#### `repair`
Queues a shard for repair.
It requires a Shard ID which is provided in the [`show`](#show) output.
```bash
influxd-ctl entropy repair <shardID>
```
Repairing entropy in a shard is an asynchronous operation.
This command will return quickly as it only adds a shard to the repair queue.
Queuing shards for repair is idempotent.
There is no harm in making multiple requests to repair the same shard even if
it is already queued, currently being repaired, or not in need of repair.
#### `kill-repair`
Removes a shard from the repair queue.
It requires a Shard ID which is provided in the [`show`](#show) output.
```bash
influxd-ctl entropy kill-repair <shardID>
```
This only applies to shards in the repair queue.
It does not cancel repairs on nodes that are in the process of being repaired.
Once a repair has started, requests to cancel it are ignored.
> Stopping an entropy repair operation for a **missing** shard is not currently supported.
> It may be possible to stop repairs for missing shards with the
> [`influxd-ctl kill-copy-shard`](/enterprise_influxdb/v1.10/tools/influxd-ctl/#kill-copy-shard) command.
## InfluxDB Anti-Entropy API
The Anti-Entropy service uses an API for managing and monitoring entropy.
Details on the available API endpoints can be found in [The InfluxDB Anti-Entropy API](/enterprise_influxdb/v1.10/administration/anti-entropy-api).
## Use cases
Common use cases for the Anti-Entropy service include detecting and repairing entropy, replacing unresponsive data nodes, replacing data nodes for upgrades and maintenance, and eliminating entropy in active shards.
### Detecting and repairing entropy
Periodically, you may want to see if shards in your cluster have entropy or are
inconsistent with other shards in the shard group.
Use the `influxd-ctl entropy show` command to list all shards with detected entropy:
```bash
influxd-ctl entropy show
Entropy
==========
ID Database Retention Policy Start End Expires Status
21179 statsdb 1hour 2017-10-09 00:00:00 +0000 UTC 2017-10-16 00:00:00 +0000 UTC 2018-10-22 00:00:00 +0000 UTC diff
25165 statsdb 1hour 2017-11-20 00:00:00 +0000 UTC 2017-11-27 00:00:00 +0000 UTC 2018-12-03 00:00:00 +0000 UTC diff
```
Then use the `influxd-ctl entropy repair` command to add the shards with entropy
to the repair queue:
```bash
influxd-ctl entropy repair 21179
Repair Shard 21179 queued
influxd-ctl entropy repair 25165
Repair Shard 25165 queued
```
Check on the status of the repair queue with the `influxd-ctl entropy show` command:
```bash
influxd-ctl entropy show
Entropy
==========
ID Database Retention Policy Start End Expires Status
21179 statsdb 1hour 2017-10-09 00:00:00 +0000 UTC 2017-10-16 00:00:00 +0000 UTC 2018-10-22 00:00:00 +0000 UTC diff
25165 statsdb 1hour 2017-11-20 00:00:00 +0000 UTC 2017-11-27 00:00:00 +0000 UTC 2018-12-03 00:00:00 +0000 UTC diff
Queued Shards: [21179 25165]
```
### Replacing an unresponsive data node
If a data node suddenly disappears due to a catastrophic hardware failure or for any other reason, as soon as a new data node is online, the Anti-Entropy service will copy the correct shards to the new replacement node. The time it takes for the copying to complete is determined by the number of shards to be copied and how much data is stored in each.
_View the [Replacing Data Nodes](/enterprise_influxdb/v1.10/guides/replacing-nodes/#replace-data-nodes-in-an-influxdb-enterprise-cluster) documentation for instructions on replacing data nodes in your InfluxDB Enterprise cluster._
### Replacing a machine that is running a data node
Perhaps you are replacing a machine that is being decommissioned, upgrading hardware, or something else entirely.
The Anti-Entropy service will automatically copy shards to the new machines.
Once you have successfully run the `influxd-ctl update-data` command, you are free
to shut down the retired node without causing any interruption to the cluster.
The Anti-Entropy process will continue copying the appropriate shards from the
remaining replicas in the cluster.
### Fixing entropy in active shards
In rare cases, the currently active shard, or the shard to which new data is
currently being written, may find itself with inconsistent data.
Because the Anti-Entropy process can't write to hot shards, you must stop writes to the new
shard using the [`influxd-ctl truncate-shards` command](/enterprise_influxdb/v1.10/tools/influxd-ctl/#truncate-shards),
then add the inconsistent shard to the entropy repair queue:
```bash
# Truncate hot shards
influxd-ctl truncate-shards
# Show shards with entropy
influxd-ctl entropy show
Entropy
==========
ID Database Retention Policy Start End Expires Status
21179 statsdb 1hour 2018-06-06 12:00:00 +0000 UTC 2018-06-06 23:44:12 +0000 UTC 2018-12-06 00:00:00 +0000 UTC diff
# Add the inconsistent shard to the repair queue
influxd-ctl entropy repair 21179
```
## Troubleshooting
### Queued repairs are not being processed
The primary reason a repair in the repair queue isn't being processed is because
it went "hot" after the repair was queued.
The Anti-Entropy service only repairs cold shards or shards that are not currently being written to.
If the shard is hot, the Anti-Entropy service will wait until it goes cold again before performing the repair.
If the shard is "old" and writes to it are part of a backfill process, you simply
have to wait until the backfill process is finished. If the shard is the active
shard, run `truncate-shards` to stop writes to active shards. This process is
outlined [above](#fixing-entropy-in-active-shards).
### Anti-Entropy log messages
Below are common messages output by Anti-Entropy along with what they mean.
#### `Checking status`
Indicates that the Anti-Entropy process has begun the [status check process](#detecting-entropy).
#### `Skipped shards`
Indicates that the Anti-Entropy process has skipped a status check on shards because they are currently [hot](#hot-shards).


@ -0,0 +1,238 @@
---
title: InfluxDB Anti-Entropy API
description: >
  Monitor and repair shards on InfluxDB Enterprise data nodes using the InfluxDB Anti-Entropy API.
menu:
  enterprise_influxdb_1_10:
    name: Anti-entropy API
    weight: 70
    parent: Use Anti-entropy service
aliases:
  - /enterprise_influxdb/v1.10/administration/anti-entropy-api/
---
>**Note:** The Anti-Entropy API is available from the meta nodes and is only available when the Anti-Entropy service is enabled in the data node configuration settings. For information on the configuration settings, see
> [Anti-Entropy settings](/enterprise_influxdb/v1.10/administration/config-data-nodes/#anti-entropy-ae-settings).
Use the [Anti-Entropy service](/enterprise_influxdb/v1.10/administration/anti-entropy) in InfluxDB Enterprise to monitor and repair entropy in data nodes and their shards. To access the Anti-Entropy API and work with this service, use [`influxd-ctl entropy`](/enterprise_influxdb/v1.10/tools/influxd-ctl/#entropy) (also available on meta nodes).
The base URL is:
```text
http://localhost:8086/shard-repair
```
## GET `/status`
### Description
Lists shards that are in an inconsistent state and in need of repair.
### Parameters
| Name | Located in | Description | Required | Type |
| ---- | ---------- | ----------- | -------- | ---- |
| `local` | query | Limits status check to local shards on the data node handling this request | No | boolean |
### Responses
#### Headers
| Header name | Value |
|-------------|--------------------|
| `Accept` | `application/json` |
#### Status codes
| Code | Description | Type |
| ---- | ----------- | ------ |
| `200` | `Successful operation` | object |
### Examples
#### cURL request
```bash
curl -X GET "http://localhost:8086/shard-repair/status?local=true" -H "accept: application/json"
```
#### Request URL
```text
http://localhost:8086/shard-repair/status?local=true
```
### Responses
Example of server response value:
```json
{
"shards": [
{
"id": "1",
"database": "ae",
"retention_policy": "autogen",
"start_time": "-259200000000000",
"end_time": "345600000000000",
"expires": "0",
"status": "diff"
},
{
"id": "3",
"database": "ae",
"retention_policy": "autogen",
"start_time": "62640000000000000",
"end_time": "63244800000000000",
"expires": "0",
"status": "diff"
}
],
"queued_shards": [
"3",
"5",
"9"
],
"processing_shards": [
"3",
"9"
]
}
```
## POST `/repair`
### Description
Queues the specified shard for repair of the inconsistent state.
### Parameters
| Name | Located in | Description | Required | Type |
| ---- | ---------- | ----------- | -------- | ---- |
| `id` | query | ID of shard to queue for repair | Yes | integer |
### Responses
#### Headers
| Header name | Value |
| ----------- | ----- |
| `Accept` | `application/json` |
#### Status codes
| Code | Description |
| ---- | ----------- |
| `204` | `Successful operation` |
| `400` | `Bad request` |
| `500` | `Internal server error` |
### Examples
#### cURL request
```bash
curl -X POST "http://localhost:8086/shard-repair/repair?id=1" -H "accept: application/json"
```
#### Request URL
```text
http://localhost:8086/shard-repair/repair?id=1
```
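As a small illustration of combining the `/status` and `/repair` endpoints, the following sketch queues every shard reported with entropy for repair; it assumes `jq` is installed and that the API is reachable on `localhost:8086`:
```bash
# Queue each shard listed by /status for repair (hypothetical helper script).
for id in $(curl -s "http://localhost:8086/shard-repair/status" \
  -H "accept: application/json" | jq -r '.shards[]? | .id'); do
  curl -s -X POST "http://localhost:8086/shard-repair/repair?id=${id}" \
    -H "accept: application/json"
done
```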
## POST `/cancel-repair`
### Description
Removes the specified shard from the repair queue on nodes.
### Parameters
| Name | Located in | Description | Required | Type |
| ---- | ---------- | ----------- | -------- | ---- |
| `id` | query | ID of shard to remove from repair queue | Yes | integer |
| `local` | query | Only remove shard from repair queue on node receiving the request | No | boolean |
### Responses
#### Headers
| Header name | Value |
|-------------|--------------------|
| `Accept` | `application/json` |
#### Status codes
| Code | Description |
| ---- | ----------- |
| `204` | `Successful operation` |
| `400` | `Bad request` |
| `500` | `Internal server error` |
### Examples
#### cURL request
```bash
curl -X POST "http://localhost:8086/shard-repair/cancel-repair?id=1&local=false" -H "accept: application/json"
```
#### Request URL
```text
http://localhost:8086/shard-repair/cancel-repair?id=1&local=false
```
## Models
### ShardStatus
| Name | Type | Required |
| ---- | ---- | -------- |
| `id` | string | No |
| `database` | string | No |
| `retention_policy` | string | No |
| `start_time` | string | No |
| `end_time` | string | No |
| `expires` | string | No |
| `status` | string | No |
### Examples
```json
{
"shards": [
{
"id": "1",
"database": "ae",
"retention_policy": "autogen",
"start_time": "-259200000000000",
"end_time": "345600000000000",
"expires": "0",
"status": "diff"
},
{
"id": "3",
"database": "ae",
"retention_policy": "autogen",
"start_time": "62640000000000000",
"end_time": "63244800000000000",
"expires": "0",
"status": "diff"
}
],
"queued_shards": [
"3",
"5",
"9"
],
"processing_shards": [
"3",
"9"
]
}
```

File diff suppressed because it is too large


@ -0,0 +1,459 @@
---
title: Configure InfluxDB Enterprise meta nodes
description: >
  Configure InfluxDB Enterprise meta node settings and environmental variables.
menu:
  enterprise_influxdb_1_10:
    name: Configure meta nodes
    parent: Configure
    weight: 30
aliases:
  - /enterprise_influxdb/v1.10/administration/config-meta-nodes/
---
* [Meta node configuration settings](#meta-node-configuration-settings)
  * [Global options](#global-options)
  * [Enterprise license `[enterprise]`](#enterprise)
  * [Meta node `[meta]`](#meta)
  * [TLS `[tls]`](#tls-settings)
## Meta node configuration settings
### Global options
#### `reporting-disabled`
Default is `false`.
InfluxData, the company, relies on reported data from running nodes primarily to
track the adoption rates of different InfluxDB versions.
These data help InfluxData support the continuing development of InfluxDB.
The `reporting-disabled` option toggles the reporting of data every 24 hours to
`usage.influxdata.com`.
Each report includes a randomly-generated identifier, OS, architecture,
InfluxDB version, and the number of databases, measurements, and unique series.
To disable reporting, set this option to `true`.
> **Note:** No data from user databases are ever transmitted.
#### `bind-address`
Default is `""`.
This setting is not intended for use.
It will be removed in future versions.
#### `hostname`
Default is `""`.
The hostname of the [meta node](/enterprise_influxdb/v1.10/concepts/glossary/#meta-node).
This must be resolvable and reachable by all other members of the cluster.
Environment variable: `INFLUXDB_HOSTNAME`
-----
### Enterprise license settings
#### `[enterprise]`
The `[enterprise]` section contains the parameters for the meta node's
registration with the [InfluxData portal](https://portal.influxdata.com/).
#### `license-key`
Default is `""`.
The license key created for you on [InfluxData portal](https://portal.influxdata.com).
The meta node transmits the license key to
[portal.influxdata.com](https://portal.influxdata.com) over port 80 or port 443
and receives a temporary JSON license file in return.
The server caches the license file locally.
If your server cannot communicate with [https://portal.influxdata.com](https://portal.influxdata.com), you must use the [`license-path` setting](#license-path).
Use the same key for all nodes in the same cluster.
{{% warn %}}The `license-key` and `license-path` settings are mutually exclusive and one must remain set to the empty string.
{{% /warn %}}
> **Note:** You must restart meta nodes to update your configuration. For more information, see how to [renew or update your license key](/enterprise_influxdb/v1.10/administration/renew-license/).
Environment variable: `INFLUXDB_ENTERPRISE_LICENSE_KEY`
#### `license-path`
Default is `""`.
The local path to the permanent JSON license file that you received from InfluxData
for instances that do not have access to the internet.
To obtain a license file, contact [sales@influxdb.com](mailto:sales@influxdb.com).
The license file must be saved on every server in the cluster, including meta nodes
and data nodes.
The file contains the JSON-formatted license, and must be readable by the `influxdb` user.
Each server in the cluster independently verifies its license.
{{% warn %}}
The `license-key` and `license-path` settings are mutually exclusive and one must remain set to the empty string.
{{% /warn %}}
> **Note:** You must restart meta nodes to update your configuration. For more information, see how to [renew or update your license key](/enterprise_influxdb/v1.10/administration/renew-license/).
Environment variable: `INFLUXDB_ENTERPRISE_LICENSE_PATH`
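A sketch of the `[enterprise]` section; the key and file path are placeholders, and only one of the two settings may be non-empty:
```toml
[enterprise]
  # Online registration: the meta node exchanges this key for a temporary license file.
  license-key = "<license-key>"
  license-path = ""

  # Air-gapped alternative: point to a local license file and leave license-key = "".
  # license-key = ""
  # license-path = "/etc/influxdb/license.json"
```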
-----
### Meta node settings
#### `[meta]`
#### `dir`
Default is `"/var/lib/influxdb/meta"`.
The directory where cluster meta data is stored.
Environment variable: `INFLUXDB_META_DIR`
#### `bind-address`
Default is `":8089"`.
The bind address (port) for meta node communication.
For simplicity, InfluxData recommends using the same port on all meta nodes,
but this is not necessary.
Environment variable: `INFLUXDB_META_BIND_ADDRESS`
#### `http-bind-address`
Default is `":8091"`.
The default address to bind the API to.
Environment variable: `INFLUXDB_META_HTTP_BIND_ADDRESS`
#### `https-enabled`
Default is `false`.
Determines whether meta nodes use HTTPS to communicate with each other. By default, HTTPS is disabled. We strongly recommend enabling HTTPS.
To enable HTTPS, set `https-enabled` to `true`, specify the path to the SSL certificate with `https-certificate = ""`, and specify the path to the SSL private key with `https-private-key = ""`.
Environment variable: `INFLUXDB_META_HTTPS_ENABLED`
#### `https-certificate`
Default is `""`.
If HTTPS is enabled, specify the path to the SSL certificate.
Use either:
* PEM-encoded bundle with both the certificate and key (`[bundled-crt-and-key].pem`)
* Certificate only (`[certificate].crt`)
Environment variable: `INFLUXDB_META_HTTPS_CERTIFICATE`
#### `https-private-key`
Default is `""`.
If HTTPS is enabled, specify the path to the SSL private key.
Use either:
* PEM-encoded bundle with both the certificate and key (`[bundled-crt-and-key].pem`)
* Private key only (`[private-key].key`)
Environment variable: `INFLUXDB_META_HTTPS_PRIVATE_KEY`
#### `https-insecure-tls`
Default is `false`.
Whether meta nodes will skip certificate validation communicating with each other over HTTPS.
This is useful when testing with self-signed certificates.
Environment variable: `INFLUXDB_META_HTTPS_INSECURE_TLS`
#### `data-use-tls`
Default is `false`.
Whether to use TLS to communicate with data nodes.
#### `data-insecure-tls`
Default is `false`.
Whether meta nodes will skip certificate validation communicating with data nodes over TLS.
This is useful when testing with self-signed certificates.
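Putting the TLS-related settings together, the `[meta]` section might look like the following sketch; the certificate paths are examples, and the two `*-insecure-tls` options should only be `true` when testing with self-signed certificates:
```toml
[meta]
  https-enabled = true
  https-certificate = "/etc/ssl/influxdb-meta.crt"  # example path
  https-private-key = "/etc/ssl/influxdb-meta.key"  # example path
  https-insecure-tls = false
  data-use-tls = true
  data-insecure-tls = false
```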
#### `gossip-frequency`
Default is `"5s"`.
The default frequency with which the node will gossip its known announcements.
#### `announcement-expiration`
Default is `"30s"`.
The default length of time an announcement is kept before it is considered too old.
#### `retention-autocreate`
Default is `true`.
Automatically create a default retention policy when creating a database.
#### `election-timeout`
Default is `"1s"`.
The amount of time in candidate state without a leader before we attempt an election.
#### `heartbeat-timeout`
Default is `"1s"`.
The amount of time in follower state without a leader before we attempt an election.
#### `leader-lease-timeout`
Default is `"500ms"`.
The leader lease timeout is the amount of time a Raft leader will remain leader
if it does not hear from a majority of nodes.
After the timeout the leader steps down to the follower state.
Clusters with high latency between nodes may want to increase this parameter to
avoid unnecessary Raft elections.
Environment variable: `INFLUXDB_META_LEADER_LEASE_TIMEOUT`
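For example, a cluster with high inter-node latency might relax the Raft timings; the values below are illustrative only, not recommendations:
```toml
[meta]
  election-timeout = "2s"      # default is "1s"
  heartbeat-timeout = "2s"     # default is "1s"
  leader-lease-timeout = "1s"  # default is "500ms"
```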
#### `commit-timeout`
Default is `"50ms"`.
The commit timeout is the amount of time a Raft node will tolerate between
commands before issuing a heartbeat to tell the leader it is alive.
The default setting should work for most systems.
Environment variable: `INFLUXDB_META_COMMIT_TIMEOUT`
#### `consensus-timeout`
Default is `"30s"`.
Timeout waiting for consensus before getting the latest Raft snapshot.
Environment variable: `INFLUXDB_META_CONSENSUS_TIMEOUT`
#### `cluster-tracing`
Default is `false`.
Log all HTTP requests made to meta nodes.
Prints sanitized POST request information to show actual commands.
**Sample log output:**
```
ts=2021-12-08T02:00:54.864731Z lvl=info msg=weblog log_id=0YHxBFZG001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"fipple\",\"password\":[REDACTED]}}': ''}" status=307 size=0 referrer= user-agent=curl/7.68.0 request-id=ad87ce47-57ca-11ec-8026-0242ac120004 execution-time=63.571ms execution-time-readable=63.570738ms
ts=2021-12-08T02:01:00.070137Z lvl=info msg=weblog log_id=0YHxBEhl001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"fipple\",\"password\":[REDACTED]}}': ''}" status=200 size=0 referrer= user-agent=curl/7.68.0 request-id=b09eb13a-57ca-11ec-800d-0242ac120003 execution-time=85.823ms execution-time-readable=85.823406ms
ts=2021-12-08T02:01:29.062313Z lvl=info msg=weblog log_id=0YHxBEhl001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"gremch\",\"hash\":[REDACTED]}}': ''}" status=200 size=0 referrer= user-agent=curl/7.68.0 request-id=c1f3614a-57ca-11ec-8015-0242ac120003 execution-time=1.722ms execution-time-readable=1.722089ms
ts=2021-12-08T02:01:47.457607Z lvl=info msg=weblog log_id=0YHxBEhl001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"gremchy\",\"hash\":[REDACTED]}}': ''}" status=400 size=37 referrer= user-agent=curl/7.68.0 request-id=ccea84b7-57ca-11ec-8019-0242ac120003 execution-time=0.154ms execution-time-readable=154.417µs
ts=2021-12-08T02:02:05.522571Z lvl=info msg=weblog log_id=0YHxBEhl001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"thimble\",\"password\":[REDACTED]}}': ''}" status=400 size=37 referrer= user-agent=curl/7.68.0 request-id=d7af0082-57ca-11ec-801f-0242ac120003 execution-time=0.227ms execution-time-readable=227.853µs
```
Environment variable: `INFLUXDB_META_CLUSTER_TRACING`
#### `logging-enabled`
Default is `true`.
Meta logging toggles the logging of messages from the meta service.
Environment variable: `INFLUXDB_META_LOGGING_ENABLED`
#### `pprof-enabled`
Default is `true`.
Enables the `/debug/pprof` endpoint for troubleshooting.
To disable, set the value to `false`.
Environment variable: `INFLUXDB_META_PPROF_ENABLED`
#### `lease-duration`
Default is `"1m0s"`.
The default duration of the leases that data nodes acquire from the meta nodes.
Leases automatically expire after the `lease-duration` is met.
Leases ensure that only one data node is running something at a given time.
For example, [continuous queries](/enterprise_influxdb/v1.10/concepts/glossary/#continuous-query-cq)
(CQs) use a lease so that all data nodes aren't running the same CQs at once.
For more details about `lease-duration` and its impact on continuous queries, see
[Configuration and operational considerations on a cluster](/enterprise_influxdb/v1.10/features/clustering-features/#configuration-and-operational-considerations-on-a-cluster).
Environment variable: `INFLUXDB_META_LEASE_DURATION`
#### `auth-enabled`
Default is `false`.
If true, HTTP endpoints require authentication.
This setting must have the same value as the data nodes' `meta.meta-auth-enabled` configuration.
#### `ldap-allowed`
Default is `false`.
Whether LDAP is allowed to be set.
If `true`, you must use `influxd-ctl ldap set-config` and set `enabled = true` in the LDAP configuration to use LDAP authentication.
#### `shared-secret`
Default is `""`.
The shared secret to be used by the public API for creating custom JWT authentication.
If you use this setting, set [`auth-enabled`](#auth-enabled) to `true`.
Environment variable: `INFLUXDB_META_SHARED_SECRET`
#### `internal-shared-secret`
Default is `""`.
The shared secret used by the internal API for JWT authentication for
inter-node communication within the cluster.
Set this to a long pass phrase.
This value must be the same value as the
[`[meta] meta-internal-shared-secret`](/enterprise_influxdb/v1.10/administration/config-data-nodes#meta-internal-shared-secret) in the data node configuration file.
To use this option, set [`auth-enabled`](#auth-enabled) to `true`.
Environment variable: `INFLUXDB_META_INTERNAL_SHARED_SECRET`
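A sketch that combines the authentication-related settings; both secrets are placeholders and should be long, unique pass phrases:
```toml
[meta]
  auth-enabled = true
  shared-secret = "<long pass phrase for client JWT authentication>"
  internal-shared-secret = "<different long pass phrase shared with the data nodes>"
```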
#### `password-hash`
Default is `"bcrypt"`.
Specifies the password hashing scheme and its configuration.
FIPS-readiness is achieved by specifying an appropriate password hashing scheme, such as `pbkdf2-sha256` or `pbkdf2-sha512`.
The configured password hashing scheme and its FIPS readiness are logged on startup of `influxd` and `influxd-meta` for verification and auditing purposes.
The configuration is a semicolon delimited list.
The first section specifies the password hashing scheme.
Optional sections after this are `key=value` password hash configuration options.
Each scheme has its own set of options.
Any options not specified default to reasonable values as specified below.
This setting must have the same value as the data node option [`meta.password-hash`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#password-hash).
Environment variable: `INFLUXDB_META_PASSWORD_HASH`
**Example hashing configurations:**
| String | Description | FIPS ready |
|:-----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|------------|
| `bcrypt` | Specifies the [`bcrypt`](#bcrypt) hashing scheme with default options. | No |
| `pbkdf2-sha256;salt_len=32;rounds=64000` | Specifies the [`pbkdf2-sha256`](#pbkdf2-sha256) hashing scheme with options `salt_len` set to `32` and `rounds` set to `64000`. | Yes |
Supported password hashing schemes and options:
##### `bcrypt`
`bcrypt` is the default hashing scheme.
It is not a FIPS-ready password hashing scheme.
**Options:**
* `cost`
* Specifies the cost of hashing.
Number of rounds performed is `2^cost`.
Higher cost gives greater security at the expense of execution time.
* Default value: `10`
* Valid range: [`4`, `31`]
##### `pbkdf2-sha256`
`pbkdf2-sha256` uses the PBKDF2 scheme with SHA-256 as the HMAC function.
It is FIPS-ready according to [NIST Special Publication 800-132] §5.3
when used with appropriate `rounds` and `salt_len` options.
**Options:**
* `rounds`
* Specifies the number of rounds to perform.
Higher cost gives greater security at the expense of execution time.
* Default value: `29000`
* Valid range: [`1`, `4294967295`]
* Must be greater than or equal to `1000`
for FIPS-readiness according to [NIST Special Publication 800-132] §5.2.
* `salt_len`
* Specifies the salt length in bytes.
The longer the salt, the more difficult it is for an attacker to generate a table of password hashes.
* Default value: `16`
* Valid range: [`1`, `1024`]
* Must be greater than or equal to `16`
for FIPS-readiness according to [NIST Special Publication 800-132] §5.1.
##### `pbkdf2-sha512`
`pbkdf2-sha512` uses the PBKDF2 scheme with SHA-512 as the HMAC function.
It is FIPS-ready according to [NIST Special Publication 800-132] §5.3
when used with appropriate `rounds` and `salt_len` options.
**Options:**
* `rounds`
* Specifies the number of rounds to perform.
Higher cost gives greater security at the expense of execution time.
* Default value: `29000`
* Valid range: [`1`, `4294967295`]
* Must be greater than or equal to `1000`
for FIPS-readiness according to [NIST Special Publication 800-132] § 5.2.
* `salt_len`
* Specifies the salt length in bytes.
The longer the salt, the more difficult it is for an attacker to generate a table of password hashes.
* Default value: `16`
* Valid range: [`1`, `1024`]
* Must be greater than or equal to `16`
for FIPS-readiness according to [NIST Special Publication 800-132] § 5.1.
#### `ensure-fips`
Default is `false`.
If `ensure-fips` is set to `true`, then `influxd` and `influxd-meta`
will refuse to start if they are not configured in a FIPS-ready manner.
For example, `password-hash = "bcrypt"` would not be allowed if `ensure-fips = true`.
`ensure-fips` gives the administrator extra confidence that their instances are configured in a FIPS-ready manner.
Environment variable: `INFLUXDB_META_ENSURE_FIPS`
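For example, a FIPS-ready meta node configuration might combine the two settings as in the following sketch (the hashing options shown are one valid combination from the table above, not a recommendation):
```toml
[meta]
  password-hash = "pbkdf2-sha256;salt_len=32;rounds=64000"
  ensure-fips = true
```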
[NIST Special Publication 800-132]: https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-132.pdf
### TLS settings
For more information, see [TLS settings for data nodes](/enterprise_influxdb/v1.10/administration/config-data-nodes#tls-settings).
#### Recommended "modern compatibility" cipher settings
```toml
ciphers = [ "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
"TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
"TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
"TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
]
min-version = "tls1.3"
max-version = "tls1.3"
```

@ -0,0 +1,183 @@
---
title: Configure InfluxDB Enterprise clusters
description: >
  Learn about global options, meta node options, data node options, and other InfluxDB Enterprise configuration settings.
aliases:
- /enterprise/v1.10/administration/configuration/
- /enterprise_influxdb/v1.10/administration/configuration/
menu:
enterprise_influxdb_1_10:
name: Configure clusters
parent: Configure
weight: 10
---
This page contains general information about configuring InfluxDB Enterprise clusters.
For complete listings and descriptions of the configuration settings, see:
* [Configure data nodes](/enterprise_influxdb/v1.10/administration/config-data-nodes)
* [Configure meta nodes](/enterprise_influxdb/v1.10/administration/config-meta-nodes)
## Use configuration files
### Display the default configurations
The following commands print out a TOML-formatted configuration with all
available options set to their default values.
#### Meta node configuration
```bash
influxd-meta config
```
#### Data node configuration
```bash
influxd config
```
#### Create a configuration file
On POSIX systems, generate a new configuration file by redirecting the output
of the command to a file.
New meta node configuration file:
```
influxd-meta config > /etc/influxdb/influxdb-meta-generated.conf
```
New data node configuration file:
```
influxd config > /etc/influxdb/influxdb-generated.conf
```
Preserve custom settings from older configuration files when generating a new
configuration file with the `-config` option.
For example, this overwrites any default configuration settings in the output
file (`/etc/influxdb/influxdb.conf.new`) with the configuration settings from
the file (`/etc/influxdb/influxdb.conf.old`) passed to `-config`:
```
influxd config -config /etc/influxdb/influxdb.conf.old > /etc/influxdb/influxdb.conf.new
```
#### Launch the process with a configuration file
There are two ways to launch the meta or data processes using your customized
configuration file.
* Point the process to the desired configuration file with the `-config` option.
To start the meta node process with `/etc/influxdb/influxdb-meta-generated.conf`:
influxd-meta -config /etc/influxdb/influxdb-meta-generated.conf
To start the data node process with `/etc/influxdb/influxdb-generated.conf`:
influxd -config /etc/influxdb/influxdb-generated.conf
* Set the environment variable `INFLUXDB_CONFIG_PATH` to the path of your
configuration file and start the process.
To set the `INFLUXDB_CONFIG_PATH` environment variable and launch the data
process using `INFLUXDB_CONFIG_PATH` for the configuration file path:
export INFLUXDB_CONFIG_PATH=/root/influxdb.generated.conf
echo $INFLUXDB_CONFIG_PATH
/root/influxdb.generated.conf
influxd
If set, the command line `-config` path overrides any environment variable path.
If you do not supply a configuration file, InfluxDB uses an internal default
configuration (equivalent to the output of `influxd config` and `influxd-meta
config`).
{{% warn %}} Note: As of version 1.3, if no configuration file is specified, the `influxd-meta` binary checks the `INFLUXDB_META_CONFIG_PATH` environment variable.
If that environment variable is set, its value is used as the configuration file path.
If unset, the binary checks the `~/.influxdb` and `/etc/influxdb` directories for an `influxdb-meta.conf` file.
If the file exists at either location, the first one found is loaded as the configuration file automatically.
<br>
This matches a similar behavior that the open source and data node versions of InfluxDB already follow.
{{% /warn %}}
Configure InfluxDB using the configuration file (`influxdb.conf`) and environment variables.
The default value for each configuration setting is shown in the documentation.
Commented configuration options use the default value.
Configuration settings with a duration value support the following duration units:
- `ns` _(nanoseconds)_
- `us` or `µs` _(microseconds)_
- `ms` _(milliseconds)_
- `s` _(seconds)_
- `m` _(minutes)_
- `h` _(hours)_
- `d` _(days)_
- `w` _(weeks)_
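As an illustration, the following sketch shows duration values in several of these units, using meta node settings documented in [Configure meta nodes](/enterprise_influxdb/v1.10/administration/config-meta-nodes):
```toml
commit-timeout = "50ms"    # milliseconds
heartbeat-timeout = "1s"   # seconds
lease-duration = "1m0s"    # minutes and seconds combined
```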
### Environment variables
All configuration options can be specified in the configuration file or in
environment variables.
Environment variables override the equivalent options in the configuration
file.
If a configuration option is not specified in either the configuration file
or in an environment variable, InfluxDB uses its internal default
configuration.
In the sections below we name the relevant environment variable in the
description for the configuration setting.
Environment variables can be set in `/etc/default/influxdb-meta` and
`/etc/default/influxdb`.
> **Note:**
To set or override settings in a config section that allows multiple
configurations (any section with double brackets (`[[...]]`) in the header supports
multiple configurations), the desired configuration must be specified by ordinal
number.
For example, for the first set of `[[graphite]]` environment variables,
prefix the configuration setting name in the environment variable with the
relevant position number (in this case: `0`):
>
INFLUXDB_GRAPHITE_0_BATCH_PENDING
INFLUXDB_GRAPHITE_0_BATCH_SIZE
INFLUXDB_GRAPHITE_0_BATCH_TIMEOUT
INFLUXDB_GRAPHITE_0_BIND_ADDRESS
INFLUXDB_GRAPHITE_0_CONSISTENCY_LEVEL
INFLUXDB_GRAPHITE_0_DATABASE
INFLUXDB_GRAPHITE_0_ENABLED
INFLUXDB_GRAPHITE_0_PROTOCOL
INFLUXDB_GRAPHITE_0_RETENTION_POLICY
INFLUXDB_GRAPHITE_0_SEPARATOR
INFLUXDB_GRAPHITE_0_TAGS
INFLUXDB_GRAPHITE_0_TEMPLATES
INFLUXDB_GRAPHITE_0_UDP_READ_BUFFER
>
For the Nth Graphite configuration in the configuration file, the relevant
environment variables would be of the form `INFLUXDB_GRAPHITE_(N-1)_BATCH_PENDING`.
For each section of the configuration file the numbering restarts at zero.
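For instance, given two `[[graphite]]` sections like the sketch below (database names are placeholders), `INFLUXDB_GRAPHITE_0_DATABASE` overrides the first section and `INFLUXDB_GRAPHITE_1_DATABASE` overrides the second:
```toml
# First [[graphite]] section -> INFLUXDB_GRAPHITE_0_* variables
[[graphite]]
  enabled = true
  database = "graphite_primary"

# Second [[graphite]] section -> INFLUXDB_GRAPHITE_1_* variables
[[graphite]]
  enabled = true
  database = "graphite_secondary"
```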
### `GOMAXPROCS` environment variable
{{% note %}}
_**Note:**_ `GOMAXPROCS` cannot be set using the InfluxDB configuration file.
It can only be set as an environment variable.
{{% /note %}}
The `GOMAXPROCS` [Go language environment variable](https://golang.org/pkg/runtime/#hdr-Environment_Variables)
can be used to set the maximum number of CPUs that can execute simultaneously.
The default value of `GOMAXPROCS` is the number of CPUs
that are visible to the program *on startup*
(based on what the operating system considers to be a CPU).
For a 32-core machine, the `GOMAXPROCS` value would be `32`.
You can override this value to be less than the maximum value,
which can be useful in cases where you are running the InfluxDB
along with other processes on the same machine
and want to ensure that the database doesn't negatively affect those processes.
{{% note %}}
_**Note:**_ Setting `GOMAXPROCS=1` eliminates all parallelization.
{{% /note %}}

@ -0,0 +1,94 @@
---
title: Configure TCP and UDP ports used in InfluxDB Enterprise
description: Configure TCP and UDP ports in InfluxDB Enterprise.
menu:
enterprise_influxdb_1_10:
name: Configure TCP and UDP Ports
parent: Configure
weight: 60
aliases:
- /enterprise/v1.10/administration/ports/
- /enterprise_influxdb/v1.10/administration/ports/
---
![InfluxDB Enterprise network diagram](/img/enterprise/1-8-network-diagram.png)
## Enabled ports
### 8086
The default port that runs the InfluxDB HTTP service.
It is used for the primary public write and query API.
Clients include the CLI, Chronograf, InfluxDB client libraries, Grafana, curl, or anything that wants to write and read time series data to and from InfluxDB.
[Configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#bind-address)
in the data node configuration file.
_See also: [API Reference](/enterprise_influxdb/v1.10/tools/api/)._
### 8088
Data nodes listen on this port.
Primarily used by other data nodes to handle distributed reads and writes at runtime.
Used to control a data node (e.g., tell it to write to a specific shard or execute a query).
It's also used by meta nodes for cluster-type operations (e.g., tell a data node to join or leave the cluster).
This is the default port used for RPC calls for inter-node communication and by the CLI for backup and restore operations
(`influxd backup` and `influxd restore`).
[Configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#bind-address)
in the configuration file.
This port should not be exposed outside the cluster.
_See also: [Back up and restore](/enterprise_influxdb/v1.10/administration/backup-and-restore/)._
### 8089
Used for communication between meta nodes.
It is used by the Raft consensus protocol.
The only clients using `8089` should be the other meta nodes in the cluster.
This port should not be exposed outside the cluster.
### 8091
Meta nodes listen on this port.
It is used for the meta service API.
Primarily used by data nodes to stay in sync about databases, retention policies, shards, users, privileges, etc.
Used by meta nodes to receive incoming connections from data nodes and Chronograf.
Clients also include the `influxd-ctl` command line tool.
This port should not be exposed outside the cluster.
## Disabled ports
### 2003
The default port that runs the Graphite service.
[Enable and configure this port](/enterprise_influxdb/v1.10/administration/config-data-nodes/#bind-address-2003)
in the configuration file.
**Resources** [Graphite README](https://github.com/influxdata/influxdb/tree/1.8/services/graphite/README.md)
### 4242
The default port that runs the OpenTSDB service.
[Enable and configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#opentsdb-settings)
in the configuration file.
**Resources** [OpenTSDB README](https://github.com/influxdata/influxdb/tree/1.8/services/opentsdb/README.md)
### 8089
The default port that runs the UDP service.
[Enable and configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#udp-settings)
in the configuration file.
**Resources** [UDP README](https://github.com/influxdata/influxdb/tree/1.8/services/udp/README.md)
### 25826
The default port that runs the Collectd service.
[Enable and configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#collectd-settings)
in the configuration file.
**Resources** [Collectd README](https://github.com/influxdata/influxdb/tree/1.8/services/collectd/README.md)

@ -0,0 +1,16 @@
---
title: Configure security
description: Configure security features in InfluxDB Enterprise.
menu:
enterprise_influxdb_1_10:
name: Configure security
parent: Configure
weight: 40
aliases:
- /enterprise_influxdb/v1.10/administration/security/
---
_For user and permission management (authorization),
see [Manage users and permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/)._
{{< children >}}

@ -0,0 +1,114 @@
---
title: Configure authentication
description: >
Enable authentication to require credentials for a cluster.
menu:
enterprise_influxdb_1_10:
parent: Configure security
name: Configure authentication
weight: 10
---
To configure authentication, do one of the following:
- [Enable authentication](#enable-authentication)
- [Configure authentication using JWT tokens](#configure-authentication-using-jwt-tokens) ([InfluxDB HTTP API](/enterprise_influxdb/v1.10/tools/api/) only)
## Enable authentication
Authentication is disabled by default in InfluxDB and InfluxDB Enterprise.
After [installing the data nodes](/enterprise_influxdb/v1.10/introduction/install-and-deploy/installation/data_node_installation/),
enable authentication to control access to your cluster.
To enable authentication in a cluster, do the following:
1. Set `auth-enabled` to `true` in the `[http]` section of the configuration files
for all meta **and** data nodes:
```toml
[http]
# ...
auth-enabled = true
```
1. Next, create an admin user (if you haven't already).
Using the [`influx` CLI](/enterprise_influxdb/v1.10/tools/influx-cli/),
run the following command:
```
CREATE USER admin WITH PASSWORD 'mypassword' WITH ALL PRIVILEGES
```
1. Restart InfluxDB Enterprise.
Once restarted, InfluxDB Enterprise checks user credentials on every request
and only processes requests with valid credentials.
## Configure authentication using JWT tokens
For a more secure alternative to using passwords, include JWT tokens in requests to the InfluxDB API.
1. **Add a shared secret in your InfluxDB Enterprise configuration file**.
InfluxDB Enterprise uses the shared secret to encode the JWT signature.
By default, `shared-secret` is set to an empty string (no JWT authentication).
Add a custom shared secret in your [InfluxDB configuration file](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#shared-secret)
for each meta and data node.
Longer strings are more secure:
```toml
[http]
shared-secret = "my super secret pass phrase"
```
Alternatively, to avoid keeping your secret phrase as plain text in your InfluxDB configuration file,
set the value with the `INFLUXDB_HTTP_SHARED_SECRET` environment variable (for example, in Linux: `export INFLUXDB_HTTP_SHARED_SECRET=MYSUPERSECRETPASSPHRASE`).
2. **Generate your JWT token**.
Use an authentication service (such as, [https://jwt.io/](https://jwt.io/))
to generate a secure token using your InfluxDB username, an expiration time, and your shared secret.
The payload (or claims) of the token must be in the following format:
```json
{
"username": "myUserName",
"exp": 1516239022
}
```
- **username** - InfluxDB username.
- **exp** - Token expiration in UNIX [epoch time](/enterprise_influxdb/v1.10/query_language/explore-data/#epoch_time).
For increased security, keep token expiration periods short.
For testing, you can manually generate UNIX timestamps using [https://www.unixtimestamp.com/index.php](https://www.unixtimestamp.com/index.php).
To encode the payload using your shared secret, use a JWT library in your own authentication server or encode by hand at [https://jwt.io/](https://jwt.io/).
3. **Include the token in HTTP requests**.
Include your generated token as part of the `Authorization` header in HTTP requests:
```
Authorization: Bearer <myToken>
```
{{% note %}}
Only unexpired tokens will successfully authenticate.
Verify your token has not expired.
{{% /note %}}
#### Example query request with JWT authentication
```bash
curl -G "http://localhost:8086/query?db=demodb" \
--data-urlencode "q=SHOW DATABASES" \
--header "Authorization: Bearer <header>.<payload>.<signature>"
```
## Authentication and authorization HTTP errors
Requests with no authentication credentials or incorrect credentials yield the `HTTP 401 Unauthorized` response.
Requests by unauthorized users yield the `HTTP 403 Forbidden` response.
## Next steps
After configuring authentication,
you can [manage users and permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/)
as necessary.
{{% enterprise-warning-authn-b4-authz %}}

@ -0,0 +1,87 @@
---
title: Configure password hashing
description: >
Configure the cryptographic algorithm used for password hashing.
menu:
enterprise_influxdb_1_10:
name: Configure password hashing
parent: Configure security
weight: 40
related:
- /enterprise_influxdb/v1.10/administration/configuration/
aliases:
- /enterprise_influxdb/v1.10/administration/configure-password-hashing/
- /enterprise_influxdb/v1.10/administration/manage/configure-password-hashing/
---
By default, InfluxDB Enterprise uses `bcrypt` for password hashing.
[FIPS] compliance requires particular hashing algorithms.
Use `pbkdf2-sha256` or `pbkdf2-sha512` for FIPS compliance.
## Change password hashing algorithm
Complete the following steps
to change the password hashing algorithm used by an existing InfluxDB Enterprise cluster:
1. Ensure all meta and data nodes are running InfluxDB Enterprise 1.10.3 or later.
2. In your meta node and data node configuration files, set [`password-hash`] to one of the following:
`pbkdf2-sha256`, or `pbkdf2-sha512`.
Also set [`ensure-fips`] to `true`.
{{% note %}}
The `meta.password-hash` setting must be the same in both the data and meta node configuration files.
{{% /note %}}
3. Restart each meta and data node to load the configuration change.
4. To apply the new hashing algorithm, you must [reset](/enterprise_influxdb/v1.10/administration/authentication_and_authorization/#reset-a-users-password)
all existing passwords in the cluster.
Otherwise, the previous algorithm will continue to be used.
## Example configuration
**Example data node configuration:**
```toml
[meta]
# Configures password hashing scheme. Use "pbkdf2-sha256" or "pbkdf2-sha512"
# for a FIPS-ready password hash. This setting must have the same value as
# the meta nodes' meta.password-hash configuration.
password-hash = "pbkdf2-sha256"
# Configures strict FIPS-readiness check on startup.
ensure-fips = true
```
**Example meta node configuration:**
```toml
[meta]
# Configures password hashing scheme. Use "pbkdf2-sha256" or "pbkdf2-sha512"
# for a FIPS-ready password hash. This setting must have the same value as
# the data nodes' meta.password-hash configuration.
password-hash = "pbkdf2-sha256"
# Configures strict FIPS-readiness check on startup.
ensure-fips = true
```
## Using FIPS readiness checks
InfluxDB Enterprise outputs information about the current password hashing configuration at startup.
For example:
```
2021-07-21T17:20:44.024846Z info Password hashing configuration: pbkdf2-sha256;rounds=29000;salt_len=16 {"log_id": "0VUXBWE0001"}
2021-07-21T17:20:44.024857Z info Password hashing is FIPS-ready: true {"log_id": "0VUXBWE0001"}
```
When `ensure-fips` is enabled, attempting to use `password-hash = bcrypt`
will cause the FIPS check to fail.
The node then exits with an error in the logs:
```
run: create server: passwordhash: not FIPS-ready: config: 'bcrypt'
```
[FIPS]: https://csrc.nist.gov/publications/detail/fips/140/3/final
[`password-hash`]: /enterprise_influxdb/v1.10/administration/config-meta-nodes/#password-hash
[`ensure-fips`]: /enterprise_influxdb/v1.10/administration/config-meta-nodes/#ensure-fips

@ -0,0 +1,297 @@
---
title: Configure HTTPS over TLS
description: >
Enabling HTTPS over TLS encrypts the communication between clients and the InfluxDB Enterprise server, and between nodes in the cluster.
menu:
enterprise_influxdb_1_10:
name: Configure TLS for cluster
parent: Configure security
weight: 20
aliases:
- /enterprise_influxdb/v1.10/guides/https_setup/
- /enterprise_influxdb/v1.10/guides/enable_tls/
- /enterprise_influxdb/v1.10/guides/enable-tls/
---
Enabling HTTPS over TLS encrypts the communication between clients and the InfluxDB Enterprise server, and between nodes in the cluster.
When configured with a signed certificate, HTTPS over TLS can also verify the authenticity of the InfluxDB Enterprise server to connecting clients.
This page outlines how to set up HTTPS with InfluxDB Enterprise using either a signed or self-signed certificate.
{{% warn %}}
InfluxData **strongly recommends** enabling HTTPS, especially if you plan on sending requests to InfluxDB Enterprise over a network.
{{% /warn %}}
{{% note %}}
These steps have been tested on Debian-based Linux distributions.
Specific steps may vary on other operating systems.
{{% /note %}}
## Requirements
To enable HTTPS with InfluxDB Enterprise, you need a Transport Layer Security (TLS) certificate, also known as a Secured Sockets Layer (SSL) certificate.
InfluxDB supports three types of TLS certificates:
* **Single domain certificates signed by a [Certificate Authority](https://en.wikipedia.org/wiki/Certificate_authority)**
Single domain certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
These certificates are signed and issued by a trusted, third-party Certificate Authority (CA).
With this certificate option, every InfluxDB instance requires a unique single domain certificate.
* **Wildcard certificates signed by a Certificate Authority**
These certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
Wildcard certificates can be used across multiple InfluxDB Enterprise instances on different servers.
* **Self-signed certificates**
Self-signed certificates are _not_ signed by a trusted, third-party CA.
Unlike CA-signed certificates, self-signed certificates only provide cryptographic security to HTTPS requests.
They do not allow clients to verify the identity of the InfluxDB server.
With this certificate option, every InfluxDB Enterprise instance requires a unique self-signed certificate.
You can generate a self-signed certificate on your own machine.
Regardless of your certificate's type, InfluxDB Enterprise supports certificates composed of
a private key file (`.key`) and a signed certificate file (`.crt`) file pair, as well as certificates
that combine the private key file and the signed certificate file into a single bundled file (`.pem`).
In general, each node should have its own certificate, whether signed or unsigned.
## Set up HTTPS in an InfluxDB Enterprise cluster
1. **Download or generate certificate files.**
If using a certificate provided by a CA, follow their instructions to download the certificate files.
{{% note %}}
If using one or more self-signed certificates, use the `openssl` utility to create a certificate.
The following command generates a private key file (`.key`) and a self-signed
certificate file (`.crt`) which remain valid for the specified `NUMBER_OF_DAYS`.
```sh
sudo openssl req -x509 -nodes -newkey rsa:2048 \
-keyout influxdb-selfsigned.key \
-out influxdb-selfsigned.crt \
-days <NUMBER_OF_DAYS>
```
The command will prompt you for more information.
You can choose to fill out these fields or leave them blank; both actions generate valid certificate files.
In subsequent steps, you will need to copy the certificate and key (or `.pem` file) to each node in the cluster.
{{% /note %}}
2. **Install the SSL/TLS certificate in each node.**
Place the private key file (`.key`) and the signed certificate file (`.crt`)
or the single bundled file (`.pem`)
in the `/etc/ssl/` directory of each meta node and data node.
{{% note %}}
Some Certificate Authorities provide certificate files with other extensions.
Consult your CA if you are unsure about how to use these files.
{{% /note %}}
3. **Ensure file permissions for each node.**
Certificate files require read and write access by the `influxdb` user.
Ensure that you have the correct file permissions in each meta node and data node by running the following commands:
```sh
sudo chown influxdb:influxdb /etc/ssl/
sudo chmod 644 /etc/ssl/<CA-certificate-file>
sudo chmod 600 /etc/ssl/<private-key-file>
```
4. **Enable HTTPS within the configuration file for each meta node.**
Enable HTTPS for each meta node within the `[meta]` section of the meta node configuration file (`influxdb-meta.conf`) by setting:
```toml
[meta]
[...]
# Determines whether HTTPS is enabled.
https-enabled = true
# The SSL certificate to use when HTTPS is enabled.
https-certificate = "influxdb-meta.crt"
# Use a separate private key location.
https-private-key = "influxdb-meta.key"
# If using a self-signed certificate:
https-insecure-tls = true
# Use TLS when communicating with data nodes
data-use-tls = true
data-insecure-tls = true
```
5. **Enable HTTPS within the configuration file for each data node.**
Make the following sets of changes in the configuration file (`influxdb.conf`) on each data node:
1. Enable HTTPS for each data node within the `[http]` section of the configuration file by setting:
```toml
[http]
[...]
# Determines whether HTTPS is enabled.
https-enabled = true
[...]
# The SSL certificate to use when HTTPS is enabled.
https-certificate = "influxdb-data.crt"
# Use a separate private key location.
https-private-key = "influxdb-data.key"
```
2. Configure the data nodes to use HTTPS when communicating with other data nodes.
In the `[cluster]` section of the configuration file, set the following:
```toml
[cluster]
[...]
# Determines whether data nodes use HTTPS to communicate with each other.
https-enabled = true
# The SSL certificate to use when HTTPS is enabled.
https-certificate = "influxdb-data.crt"
# Use a separate private key location.
https-private-key = "influxdb-data.key"
# If using a self-signed certificate:
https-insecure-tls = true
```
3. Configure the data nodes to use HTTPS when communicating with the meta nodes.
In the `[meta]` section of the configuration file, set the following:
```toml
[meta]
[...]
meta-tls-enabled = true
# If using a self-signed certificate:
meta-insecure-tls = true
```
6. **Restart InfluxDB Enterprise.**
Restart the InfluxDB Enterprise processes for the configuration changes to take effect:
```sh
sudo systemctl restart influxdb-meta
```
Restart the InfluxDB Enterprise data node processes for the configuration changes to take effect:
```sh
sudo systemctl restart influxdb
```
7. **Verify the HTTPS setup.**
Verify that HTTPS is working on the meta nodes by using `influxd-ctl`.
```sh
influxd-ctl -bind-tls show
```
If using a self-signed certificate, use:
```sh
influxd-ctl -bind-tls -k show
```
{{% warn %}}
Once you have enabled HTTPS, you must use `-bind-tls` in order for `influxd-ctl` to connect to the meta node.
With a self-signed certificate, you must also use the `-k` option to skip certificate verification.
{{% /warn %}}
A successful connection returns output which should resemble the following:
```
Data Nodes
==========
ID TCP Address Version
4 enterprise-data-01:8088 1.x.y-c1.x.y
5 enterprise-data-02:8088 1.x.y-c1.x.y
Meta Nodes
==========
TCP Address Version
enterprise-meta-01:8091 1.x.y-c1.x.z
enterprise-meta-02:8091 1.x.y-c1.x.z
enterprise-meta-03:8091 1.x.y-c1.x.z
```
Next, verify that HTTPS is working by connecting to InfluxDB Enterprise with the [`influx` command line interface](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/):
```sh
influx -ssl -host <domain_name>.com
```
If using a self-signed certificate, use
```sh
influx -ssl -unsafeSsl -host <domain_name>.com
```
A successful connection returns the following:
```sh
Connected to https://<domain_name>.com:8086 version 1.x.y
InfluxDB shell version: 1.x.y
>
```
That's it! You've successfully set up HTTPS with InfluxDB Enterprise.
## Connect Telegraf to a secured InfluxDB Enterprise instance
Connecting [Telegraf](/{{< latest "telegraf" >}}/)
to an HTTPS-enabled InfluxDB Enterprise instance requires some additional steps.
In Telegraf's configuration file (`/etc/telegraf/telegraf.conf`), under the OUTPUT PLUGINS section,
edit the `urls` setting to indicate `https` instead of `http`.
Also change `localhost` to the relevant domain name.
The best practice in terms of security is to transfer the certificate to the client and make it trusted
(either by putting in the operating system's trusted certificate system or using the `ssl_ca` option).
The alternative is to sign the certificate using an internal CA and then trust the CA certificate.
Provide the file paths of your key and certificate to the InfluxDB output plugin as shown below.
If you're using a self-signed certificate,
uncomment the `insecure_skip_verify` setting and set it to `true`.
```toml
###############################################################################
# OUTPUT PLUGINS #
###############################################################################
# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
## The full HTTP or UDP endpoint URL for your InfluxDB Enterprise instance.
## Multiple urls can be specified as part of the same cluster,
## this means that only ONE of the urls will be written to each interval.
# urls = ["udp://localhost:8089"] # UDP endpoint example
urls = ["https://<domain_name>.com:8086"]
[...]
## Optional SSL Config
tls_cert = "/etc/telegraf/cert.pem"
tls_key = "/etc/telegraf/key.pem"
insecure_skip_verify = true # <-- Update only if you're using a self-signed certificate
```
Next, restart Telegraf and you're all set!
```sh
sudo systemctl restart telegraf
```

@ -0,0 +1,299 @@
---
title: Configure LDAP authentication
description: >
Configure LDAP authentication in InfluxDB Enterprise and test LDAP connectivity.
menu:
enterprise_influxdb_1_10:
name: Configure LDAP authentication
parent: Configure security
weight: 30
aliases:
- /enterprise_influxdb/v1.10/administration/ldap/
- /enterprise_influxdb/v1.10/administration/manage/security/ldap/
---
Configure InfluxDB Enterprise to use LDAP (Lightweight Directory Access Protocol) to:
- Validate user permissions
- Synchronize InfluxDB and LDAP so each LDAP request doesn't need to be queried
{{% note %}}
LDAP **requires** JWT authentication. For more information, see [Configure authentication using JWT tokens](/enterprise_influxdb/v1.10/administration/configure/security/authentication/#configure-authentication-using-jwt-tokens).
To configure InfluxDB Enterprise to support LDAP, all users must be managed in the remote LDAP service. If LDAP is configured and enabled, users **must** authenticate through LDAP, including users who may have existed before enabling LDAP.
{{% /note %}}
- [Configure LDAP for an InfluxDB Enterprise cluster](#configure-ldap-for-an-influxdb-enterprise-cluster)
- [Sample LDAP configuration](#sample-ldap-configuration)
- [Troubleshoot LDAP in InfluxDB Enterprise](#troubleshoot-ldap-in-influxdb-enterprise)
## Configure LDAP for an InfluxDB Enterprise cluster
To use LDAP with an InfluxDB Enterprise cluster, do the following:
1. [Configure data nodes](#configure-data-nodes)
2. [Configure meta nodes](#configure-meta-nodes)
3. [Create, verify, and upload the LDAP configuration file](#create-verify-and-upload-the-ldap-configuration-file)
4. [Restart meta and data nodes](#restart-meta-and-data-nodes)
### Configure data nodes
Update the following settings in each data node configuration file (`/etc/influxdb/influxdb.conf`):
1. Under `[http]`, enable HTTP authentication by setting `auth-enabled` to `true`.
(Or set the corresponding environment variable `INFLUXDB_HTTP_AUTH_ENABLED` to `true`.)
2. Configure the HTTP shared secret to validate requests using JSON web tokens (JWT) and sign each HTTP payload with the secret and username.
Set the `[http]` configuration setting for `shared-secret`, or the corresponding environment variable `INFLUXDB_HTTP_SHARED_SECRET`.
3. If you're enabling authentication on meta nodes, you must also include the following configurations:
- `INFLUXDB_META_META_AUTH_ENABLED` environment variable, or `[meta]` configuration setting `meta-auth-enabled`, is set to `true`.
This value must be the same value as the meta node's `meta.auth-enabled` configuration.
- `INFLUXDB_META_META_INTERNAL_SHARED_SECRET`,
or the corresponding `[meta]` configuration setting `meta-internal-shared-secret`,
is set to a secret value.
This value must be the same value as the meta node's `meta.internal-shared-secret`.
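Taken together, the data node settings above might look like the following sketch; the secret values are placeholders:
```toml
[http]
  auth-enabled = true
  shared-secret = "<http-shared-secret>"

[meta]
  # Only required if authentication is enabled on the meta nodes (step 3):
  meta-auth-enabled = true
  meta-internal-shared-secret = "<internal-shared-secret>"
```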
### Configure meta nodes
Update the following settings in each meta node configuration file (`/etc/influxdb/influxdb-meta.conf`):
1. Configure the meta node META shared secret to validate requests using JSON web tokens (JWT) and sign each HTTP payload with the username and shared secret.
2. Set the `[meta]` configuration setting `internal-shared-secret` to `"<internal-shared-secret>"`.
(Or set the `INFLUXDB_META_INTERNAL_SHARED_SECRET` environment variable.)
3. Set the `[meta]` configuration setting `meta.ldap-allowed` to `true` on all meta nodes in your cluster.
(Or set the `INFLUXDB_META_LDAP_ALLOWED` environment variable.)
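A corresponding sketch for each meta node configuration file; the secret is a placeholder and must match the data nodes' `meta-internal-shared-secret`:
```toml
[meta]
  internal-shared-secret = "<internal-shared-secret>"
  ldap-allowed = true
```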
### Authenticate your connection to InfluxDB
To authenticate your InfluxDB connection, run the following command, replacing `username:password` with your credentials:
{{< keep-url >}}
```bash
curl -u username:password -XPOST "http://localhost:8086/..."
```
For more detail on authentication, see [Authentication and authorization in InfluxDB](/enterprise_influxdb/v1.10/administration/authentication_and_authorization/).
### Create, verify, and upload the LDAP configuration file
1. To create a sample LDAP configuration file, run the following command:
```bash
influxd-ctl ldap sample-config
```
2. Save the sample file and edit as needed for your LDAP server.
For detail, see the [sample LDAP configuration file](#sample-ldap-configuration) below.
> To use fine-grained authorization (FGA) with LDAP, you must map InfluxDB Enterprise roles to key-value pairs in the LDAP database.
For more information, see [Fine-grained authorization in InfluxDB Enterprise](/enterprise_influxdb/v1.10/guides/fine-grained-authorization/).
The InfluxDB admin user doesn't include permissions for InfluxDB Enterprise roles.
3. Restart all meta and data nodes in your InfluxDB Enterprise cluster to load your updated configuration.
On each **meta** node, run:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```sh
service influxdb-meta restart
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```sh
sudo systemctl restart influxdb-meta
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
On each **data** node, run:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```sh
service influxdb restart
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```sh
sudo systemctl restart influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
4. To verify your LDAP configuration, run:
```bash
influxd-ctl ldap verify -ldap-config /path/to/ldap.toml
```
5. To load your LDAP configuration file, run the following command:
```bash
influxd-ctl ldap set-config /path/to/ldap.toml
```
## Sample LDAP configuration
The following is a sample configuration file that connects to a publicly available LDAP server.
A `DN` ("distinguished name") uniquely identifies an entry and describes its position in the directory information tree (DIT) hierarchy.
The DN of an LDAP entry is similar to a file path on a file system.
`DNs` refers to multiple DN entries.
{{% truncate %}}
```toml
enabled = true
[[servers]]
host = "<LDAPserver>"
port = 389
# Security mode for LDAP connection to this server.
# The recommended security mode, "starttls", is set by default. This uses an initial unencrypted connection
# and upgrades to TLS as the first action against the server,
# per the LDAPv3 standard.
# Other options are "starttls+insecure" to behave the same as starttls
# but skip server certificate verification, or "none" to use an unencrypted connection.
security = "starttls"
# Credentials to use when searching for a user or group.
bind-dn = "cn=read-only-admin,dc=example,dc=com"
bind-password = "password"
# Base DNs to use when applying the search-filter to discover an LDAP user.
search-base-dns = [
"dc=example,dc=com",
]
# LDAP filter to discover a user's DN.
# %s will be replaced with the provided username.
search-filter = "(uid=%s)"
# On Active Directory you might use "(sAMAccountName=%s)".
# Base DNs to use when searching for groups.
group-search-base-dns = ["dc=example,dc=com"]
# LDAP filter to identify groups that a user belongs to.
# %s will be replaced with the user's DN.
group-membership-search-filter = "(&(objectClass=groupOfUniqueNames)(uniqueMember=%s))"
# On Active Directory you might use "(&(objectClass=group)(member=%s))".
# Attribute to use to determine the "group" in the group-mappings section.
group-attribute = "ou"
# On Active Directory you might use "cn".
# LDAP filter to search for a group with a particular name.
# This is used when warming the cache to load group membership.
group-search-filter = "(&(objectClass=groupOfUniqueNames)(cn=%s))"
# On Active Directory you might use "(&(objectClass=group)(cn=%s))".
# Attribute of a group that contains the DNs of the group's members.
group-member-attribute = "uniqueMember"
# On Active Directory you might use "member".
# Create an administrator role in InfluxDB and then log in as a member of the admin LDAP group. Only members of a group with the administrator role can complete admin tasks.
# For example, if tesla is the only member of the `italians` group, you must log in as tesla/password.
admin-groups = ["italians"]
# These two roles would have to be created by hand if you want these LDAP group memberships to do anything.
[[servers.group-mappings]]
group = "mathematicians"
role = "arithmetic"
[[servers.group-mappings]]
group = "scientists"
role = "laboratory"
```
{{% /truncate %}}
## Troubleshoot LDAP in InfluxDB Enterprise
### InfluxDB Enterprise does not recognize a new LDAP server
If you ever replace an LDAP server with a new one, you need to update your
InfluxDB Enterprise LDAP configuration file to point to the new server.
However, InfluxDB Enterprise may not recognize or honor the updated configuration.
For InfluxDB Enterprise to recognize an LDAP configuration pointing to a new
LDAP server, do the following:
{{% warn %}}
#### Not recommended in production InfluxDB Enterprise clusters
Performing the following process on a production cluster may have unintended consequences.
Moving to a new LDAP server constitutes an infrastructure change and may be better
handled through a cluster migration.
For assistance, reach out to [InfluxData support](https://support.influxdata.com/s/contactsupport).
{{% /warn %}}
1. On each meta node, update the `auth-enabled` setting to `false` in your
`influxdb-meta.conf` configuration file to temporarily disable authentication.
```toml
[meta]
auth-enabled = false
```
2. Restart all meta nodes to load the updated configuration.
On each meta node, run:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```sh
service influxdb-meta restart
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```sh
sudo systemctl restart influxdb-meta
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
3. On each meta node, [create, verify, and upload the _new_ LDAP configuration file](#create-verify-and-upload-the-ldap-configuration-file).
4. On each meta node, update the `auth-enabled` setting to `true` in your `influxdb-meta.conf`
configuration file to reenable authentication.
```toml
[meta]
auth-enabled = true
```
5. Restart all meta nodes to load the updated configuration.
On each meta node, run:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```sh
service influxdb-meta restart
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```sh
sudo systemctl restart influxdb-meta
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}

@ -0,0 +1,11 @@
---
title: Manage
description: Manage security, clusters, and subscriptions in InfluxDB enterprise.
menu:
enterprise_influxdb_1_10:
name: Manage
weight: 12
parent: Administration
---
{{< children >}}

@ -0,0 +1,21 @@
---
title: Manage InfluxDB Enterprise clusters
description: >
Use the `influxd-ctl` and `influx` command line tools to manage InfluxDB Enterprise clusters and data.
aliases:
- /enterprise/v1.8/features/cluster-commands/
- /enterprise_influxdb/v1.10/features/cluster-commands/
- /enterprise_influxdb/v1.10/administration/cluster-commands/
menu:
enterprise_influxdb_1_10:
name: Manage clusters
parent: Manage
weight: 10
---
Use the following tools to manage and interact with your InfluxDB Enterprise clusters:
- [`influxd-ctl`](/enterprise_influxdb/v1.10/tools/influxd-ctl/) cluster management utility
- [`influx`](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/) command line interface (CLI)
{{< children >}}

@ -0,0 +1,27 @@
---
title: Add node to existing cluster
description: Add nodes to an existing InfluxDB Enterprise cluster.
aliases:
menu:
enterprise_influxdb_1_10:
name: Add nodes
parent: Manage clusters
weight: 19
---
To add a data node to an existing cluster, follow the steps below.
1. Install and start a new data node.
Complete steps 1–3 of the [data node installation instructions](/enterprise_influxdb/v1.10/introduction/install-and-deploy/installation/data_node_installation/#step-1-add-appropriate-dns-entries-for-each-of-your-servers).
2. To join the new node to the cluster, do one of the following:
- From a meta node, run:
```sh
influxd-ctl add-data <new data node address>:<port>
```
- From a remote server, run:
```sh
influxd-ctl -bind <existing_meta_node:8091> add-data <new data node address>:<port>
```
3. (Optional) [Rebalance the cluster](/enterprise_influxdb/v1.10/administration/manage/clusters/rebalance/).

@ -0,0 +1,420 @@
---
title: Rebalance InfluxDB Enterprise clusters
description: Manually rebalance an InfluxDB Enterprise cluster.
aliases:
- /enterprise_influxdb/v1.10/guides/rebalance/
menu:
enterprise_influxdb_1_10:
name: Rebalance clusters
parent: Manage clusters
weight: 21
---
## Introduction
This guide describes how to manually rebalance an InfluxDB Enterprise cluster.
Rebalancing a cluster involves two primary goals:
* Evenly distribute
[shards](/enterprise_influxdb/v1.10/concepts/glossary/#shard) across all data nodes in the
cluster
* Ensure that every
shard is on *n* number of nodes, where *n* is determined by the retention policy's
[replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor)
Rebalancing a cluster is essential for cluster health.
Perform a rebalance if you add a new data node to your cluster.
The proper rebalance path depends on the purpose of the new data node.
If you added a data node to expand the disk size of the cluster or increase
write throughput, follow the steps in
[Rebalance Procedure 1](#rebalance-procedure-1-rebalance-a-cluster-to-create-space).
If you added a data node to increase data availability for queries and query
throughput, follow the steps in
[Rebalance Procedure 2](#rebalance-procedure-2-rebalance-a-cluster-to-increase-availability).
### Requirements
The following sections assume that you already added a new data node to the
cluster, and they use the
[`influxd-ctl` tool](/enterprise_influxdb/v1.10/tools/influxd-ctl/) available on
all meta nodes.
{{% warn %}}
Before you begin, stop writing historical data to InfluxDB.
Historical data have timestamps that occur at any time in the past.
Performing a rebalance while writing historical data can lead to data loss.
{{% /warn %}}
## Rebalance Procedure 1: Rebalance a cluster to create space
For demonstration purposes, the next steps assume that you added a third
data node to a previously two-data-node cluster that has a
[replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor) of
two.
This rebalance procedure is applicable for different cluster sizes and
replication factors, but some of the specific, user-provided values will depend
on that cluster size.
Rebalance Procedure 1 focuses on how to rebalance a cluster after adding a
data node to expand the total disk capacity of the cluster.
In the next steps, you will safely move shards from one of the two original data
nodes to the new data node.
### Step 1: Truncate Hot Shards
Hot shards are shards that are currently receiving writes.
Performing any action on a hot shard can lead to data inconsistency within the
cluster which requires manual intervention from the user.
To prevent data inconsistency, truncate hot shards before moving any shards
across data nodes.
The command below creates a new hot shard which is automatically distributed
across all data nodes in the cluster, and the system writes all new points to
that shard.
All previous writes are now stored in cold shards.
```
influxd-ctl truncate-shards
```
The expected output of this command is:
```
Truncated shards.
```
Once you truncate the shards, you can work on redistributing the cold shards
without the threat of data inconsistency in the cluster.
Any hot or new shards are now evenly distributed across the cluster and require
no further intervention.
### Step 2: Identify Cold Shards
In this step, you identify the cold shards that you will copy to the new data node
and remove from one of the original two data nodes.
The following command lists every shard in your cluster:
```
influxd-ctl show-shards
```
The expected output is similar to the items in the codeblock below:
```
Shards
==========
ID Database Retention Policy Desired Replicas [...] End Owners
21 telegraf autogen 2 [...] 2017-01-26T18:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
24 telegraf autogen 2 [...] 2017-01-26T19:00:00Z [{5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
```
The sample output includes three shards.
The first two shards are cold shards.
The timestamp in the `End` column occurs in the past (assume that the current
time is just after `2017-01-26T18:05:36.418734949Z`), and the shards' owners
are the two original data nodes: `enterprise-data-01:8088` and
`enterprise-data-02:8088`.
The second shard is the truncated shard; truncated shards have an asterisk (`*`)
on the timestamp in the `End` column.
The third shard is the newly-created hot shard; the timestamp in the `End`
column is in the future (again, assume that the current time is just after
`2017-01-26T18:05:36.418734949Z`), and the shard's owners include one of the
original data nodes (`enterprise-data-02:8088`) and the new data node
(`enterprise-data-03:8088`).
That hot shard and any subsequent shards require no attention during
the rebalance process.
Identify the cold shards that you'd like to move from one of the original two
data nodes to the new data node.
Take note of the cold shard's `ID` (for example: `22`) and the TCP address of
one of its owners in the `Owners` column (for example:
`enterprise-data-01:8088`).
> **Note:**
> Use the following command to determine the size of the shards in your cluster:
> `find /var/lib/influxdb/data/ -mindepth 3 -type d -exec du -h {} \;`
>
> In general, we recommend moving larger shards to the new data node to increase the
> available disk space on the original data nodes.
> Note that moving shards will impact network traffic.
### Step 3: Copy Cold Shards
Next, copy the relevant cold shards to the new data node with the syntax below.
Repeat this command for every cold shard that you'd like to move to the
new data node.
```
influxd-ctl copy-shard <source_TCP_address> <destination_TCP_address> <shard_ID>
```
Where `source_TCP_address` is the address that you noted in step 2,
`destination_TCP_address` is the TCP address of the new data node, and `shard_ID`
is the ID of the shard that you noted in step 2.
The expected output of the command is:
```
Copied shard <shard_ID> from <source_TCP_address> to <destination_TCP_address>
```
### Step 4: Confirm the Copied Shards
Confirm that the TCP address of the new data node appears in the `Owners` column
for every copied shard:
```
influxd-ctl show-shards
```
The expected output shows that the copied shard now has three owners:
```
Shards
==========
ID Database Retention Policy Desired Replicas [...] End Owners
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
```
In addition, verify that the copied shards appear in the new data node's shard
directory and match the shards in the source data node's shard directory.
Shards are located in
`/var/lib/influxdb/data/<database>/<retention_policy>/<shard_ID>`.
Here's an example of the correct output for shard `22`:
```
# On the source data node (enterprise-data-01)
~# ls /var/lib/influxdb/data/telegraf/autogen/22
000000001-000000001.tsm # 👍
# On the new data node (enterprise-data-03)
~# ls /var/lib/influxdb/data/telegraf/autogen/22
000000001-000000001.tsm # 👍
```
It is essential that every copied shard appears on the new data node both
in the `influxd-ctl show-shards` output and in the shard directory.
If a shard does not pass both of the tests above, please repeat step 3.
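
If you have SSH access to both data nodes, one way to compare the shard directories is to diff their listings. This is a sketch only; the hostnames, database, retention policy, and shard ID below are the example values used above, so substitute your own.

```bash
# Compare the listing of shard 22 on the source and destination nodes.
# No output means the two listings match.
diff \
  <(ssh enterprise-data-01 'ls /var/lib/influxdb/data/telegraf/autogen/22') \
  <(ssh enterprise-data-03 'ls /var/lib/influxdb/data/telegraf/autogen/22')
```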
### Step 5: Remove Unnecessary Cold Shards
Next, remove the copied shard from the original data node with the command below.
Repeat this command for every cold shard that you'd like to remove from one of
the original data nodes.
**Removing a shard is an irrecoverable, destructive action; please be
cautious with this command.**
```
influxd-ctl remove-shard <source_TCP_address> <shard_ID>
```
Where `source_TCP_address` is the TCP address of the original data node and
`shard_ID` is the ID of the shard that you noted in step 2.
The expected output of the command is:
```
Removed shard <shard_ID> from <source_TCP_address>
```
### Step 6: Confirm the Rebalance
For every relevant shard, confirm that the TCP address of the new data node and
only one of the original data nodes appears in the `Owners` column:
```
influxd-ctl show-shards
```
The expected output shows that the copied shard now has only two owners:
```
Shards
==========
ID Database Retention Policy Desired Replicas [...] End Owners
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
```
That's it.
You've successfully rebalanced your cluster; you expanded the available disk
size on the original data nodes and increased the cluster's write throughput.
## Rebalance Procedure 2: Rebalance a cluster to increase availability
For demonstration purposes, the next steps assume that you added a third
data node to a previously two-data-node cluster that has a
[replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor) of
two.
This rebalance procedure is applicable for different cluster sizes and
replication factors, but some of the specific, user-provided values will depend
on that cluster size.
Rebalance Procedure 2 focuses on how to rebalance a cluster to improve availability
and query throughput.
In the next steps, you will increase the retention policy's replication factor and
safely copy shards from one of the two original data nodes to the new data node.
### Step 1: Update the Retention Policy
[Update](/enterprise_influxdb/v1.10/query_language/manage-database/#modify-retention-policies-with-alter-retention-policy)
every retention policy to have a replication factor of three.
This step ensures that the system automatically distributes all newly-created
shards across the three data nodes in the cluster.
The following query increases the replication factor to three.
Run the query on any data node for each retention policy and database.
Here, we use InfluxDB's [CLI](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/) to execute the query:
```
> ALTER RETENTION POLICY "<retention_policy_name>" ON "<database_name>" REPLICATION 3
>
```
A successful `ALTER RETENTION POLICY` query returns no results.
Use the
[`SHOW RETENTION POLICIES` query](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-retention-policies)
to verify the new replication factor.
Example:
```
> SHOW RETENTION POLICIES ON "telegraf"
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 0s 1h0m0s 3 #👍 true
```
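
If the cluster has many databases, you could script the `ALTER RETENTION POLICY` statements with the `influx` CLI instead of running them one by one. The loop below is only a sketch: it assumes authentication is disabled (add `-username`/`-password` otherwise), that each database uses the `autogen` retention policy, and that you supply your own database names.

```bash
# Set the replication factor of the "autogen" retention policy to 3
# on each listed database.
for db in telegraf mydb; do
  influx -execute "ALTER RETENTION POLICY \"autogen\" ON \"$db\" REPLICATION 3"
done
```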
### Step 2: Truncate Hot Shards
Hot shards are shards that are currently receiving writes.
Performing any action on a hot shard can lead to data inconsistency within the
cluster which requires manual intervention from the user.
To prevent data inconsistency, truncate hot shards before copying any shards
to the new data node.
The command below creates a new hot shard which is automatically distributed
across the three data nodes in the cluster, and the system writes all new points
to that shard.
All previous writes are now stored in cold shards.
```
influxd-ctl truncate-shards
```
The expected output of this command is:
```
Truncated shards.
```
Once you truncate the shards, you can work on distributing the cold shards
without the threat of data inconsistency in the cluster.
Any hot or new shards are now automatically distributed across the cluster and
require no further intervention.
### Step 3: Identify Cold Shards
In this step, you identify the cold shards that you will copy to the new data node.
The following command lists every shard in your cluster:
```
influxd-ctl show-shards
```
The expected output is similar to the items in the codeblock below:
```
Shards
==========
ID Database Retention Policy Desired Replicas [...] End Owners
21 telegraf autogen 3 [...] 2017-01-26T18:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
22 telegraf autogen 3 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
24 telegraf autogen 3 [...] 2017-01-26T19:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
```
The sample output includes three shards.
The first two shards are cold shards.
The timestamp in the `End` column occurs in the past (assume that the current
time is just after `2017-01-26T18:05:36.418734949Z`), and the shards' owners
are the two original data nodes: `enterprise-data-01:8088` and
`enterprise-data-02:8088`.
The second shard is the truncated shard; truncated shards have an asterisk (`*`)
on the timestamp in the `End` column.
The third shard is the newly-created hot shard; the timestamp in the `End`
column is in the future (again, assume that the current time is just after
`2017-01-26T18:05:36.418734949Z`), and the shard's owners include all three
data nodes: `enterprise-data-01:8088`, `enterprise-data-02:8088`, and
`enterprise-data-03:8088`.
That hot shard and any subsequent shards require no attention during
the rebalance process.
Identify the cold shards that you'd like to copy from one of the original two
data nodes to the new data node.
Take note of the cold shard's `ID` (for example: `22`) and the TCP address of
one of its owners in the `Owners` column (for example:
`enterprise-data-01:8088`).
### Step 4: Copy Cold Shards
Next, copy the relevant cold shards to the new data node with the syntax below.
Repeat this command for every cold shard that you'd like to move to the
new data node.
```
influxd-ctl copy-shard <source_TCP_address> <destination_TCP_address> <shard_ID>
```
Where `source_TCP_address` is the address that you noted in step 3,
`destination_TCP_address` is the TCP address of the new data node, and `shard_ID`
is the ID of the shard that you noted in step 3.
The expected output of the command is:
```
Copied shard <shard_ID> from <source_TCP_address> to <destination_TCP_address>
```
### Step 5: Confirm the Rebalance
Confirm that the TCP address of the new data node appears in the `Owners` column
for every copied shard:
```
influxd-ctl show-shards
```
The expected output shows that the copied shard now has three owners:
```
Shards
==========
ID Database Retention Policy Desired Replicas [...] End Owners
22 telegraf autogen 3 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
```
In addition, verify that the copied shards appear in the new data node's shard
directory and match the shards in the source data node's shard directory.
Shards are located in
`/var/lib/influxdb/data/<database>/<retention_policy>/<shard_ID>`.
Here's an example of the correct output for shard `22`:
```
# On the source data node (enterprise-data-01)
~# ls /var/lib/influxdb/data/telegraf/autogen/22
000000001-000000001.tsm # 👍
# On the new data node (enterprise-data-03)
~# ls /var/lib/influxdb/data/telegraf/autogen/22
000000001-000000001.tsm # 👍
```
That's it.
You've successfully rebalanced your cluster and increased data availability for
queries and query throughput.

@ -0,0 +1,428 @@
---
title: Replace InfluxDB Enterprise cluster meta nodes and data nodes
description: Replace meta and data nodes in an InfluxDB Enterprise cluster.
aliases:
- /enterprise_influxdb/v1.10/guides/replacing-nodes/
menu:
enterprise_influxdb_1_10:
name: Replace nodes
parent: Manage clusters
weight: 20
---
## Introduction
Nodes in an InfluxDB Enterprise cluster may need to be replaced at some point due to hardware needs, hardware issues, or something else entirely.
This guide outlines processes for replacing both meta nodes and data nodes in an InfluxDB Enterprise cluster.
## Concepts
Meta nodes manage and monitor both the uptime of nodes in the cluster as well as distribution of [shards](/enterprise_influxdb/v1.10/concepts/glossary/#shard) among nodes in the cluster.
They hold information about which data nodes own which shards; information on which the
[anti-entropy](/enterprise_influxdb/v1.10/administration/anti-entropy/) (AE) process depends.
Data nodes hold raw time-series data and metadata. Data shards are both distributed and replicated across data nodes in the cluster. The AE process runs on data nodes and references the shard information stored in the meta nodes to ensure each data node has the shards they need.
`influxd-ctl` is a CLI included in each meta node and is used to manage your InfluxDB Enterprise cluster.
## Scenarios
### Replace nodes in clusters with security enabled
Many InfluxDB Enterprise clusters are configured with security enabled, forcing secure TLS encryption between all nodes in the cluster.
Both `influxd-ctl` and `curl`, the command line tools used when replacing nodes, have options that facilitate the use of TLS.
#### `influxd-ctl -bind-tls`
To manage your cluster over TLS, pass the `-bind-tls` flag with any `influxd-ctl` command.
> If using a self-signed certificate, pass the `-k` flag to skip certificate verification.
```bash
# Pattern
influxd-ctl -bind-tls [-k] <command>
# Example
influxd-ctl -bind-tls remove-meta enterprise-meta-02:8091
```
#### `curl -k`
`curl` natively supports TLS/SSL connections, but if using a self-signed certificate, pass the `-k`/`--insecure` flag to allow for "insecure" SSL connections.
> Self-signed certificates are considered "insecure" due to their lack of a valid chain of authority. However, data is still encrypted when using self-signed certificates.
```bash
# Pattern
curl [-k, --insecure] <url>
# Example
curl -k https://localhost:8091/status
```
### Replace meta nodes in a functional cluster
If all meta nodes in the cluster are fully functional, simply follow the steps for [replacing meta nodes](#replace-meta-nodes-in-an-influxdb-enterprise-cluster).
### Replace an unresponsive meta node
If the meta node you are replacing is unreachable or unrecoverable, you need to forcefully remove it from the meta cluster. Instructions for forcefully removing meta nodes are provided in [step 2.2](#2-2-remove-the-non-leader-meta-node) of the [replacing meta nodes](#replace-meta-nodes-in-an-influxdb-enterprise-cluster) process.
### Replace responsive and unresponsive data nodes in a cluster
The process of replacing both responsive and unresponsive data nodes is the same. Simply follow the instructions for [replacing data nodes](#replace-data-nodes-in-an-influxdb-enterprise-cluster).
### Reconnect a data node with a failed disk
A disk drive failing is never a good thing, but it does happen, and when it does,
all shards on that node are lost.
Often in this scenario, rather than replacing the entire host, you just need to replace the disk.
Host information remains the same, but once started again, the `influxd` process doesn't know
it needs to communicate with the meta nodes, so the AE process can't start the shard-sync process.
To resolve this, log in to a meta node and use the [`influxd-ctl update-data`](/enterprise_influxdb/v1.10/tools/influxd-ctl/#update-data) command
to [update the failed data node to itself](#2-replace-the-old-data-node-with-the-new-data-node).
```bash
# Pattern
influxd-ctl update-data <data-node-tcp-bind-address> <data-node-tcp-bind-address>
# Example
influxd-ctl update-data enterprise-data-01:8088 enterprise-data-01:8088
```
This will connect the `influxd` process running on the newly replaced disk to the cluster.
The AE process will detect the missing shards and begin to sync data from other
shards in the same shard group.
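
To watch the repair progress, you could list the shards that the AE service still considers inconsistent. This is a sketch and assumes your InfluxDB Enterprise release includes the `influxd-ctl entropy` subcommand and that the anti-entropy service is enabled.

```bash
# Run on a meta node. Inconsistent or missing shards are listed here
# until the AE service finishes re-syncing them.
influxd-ctl entropy show
```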
## Replace meta nodes in an InfluxDB Enterprise cluster
[Meta nodes](/enterprise_influxdb/v1.10/concepts/clustering/#meta-nodes) together form a [Raft](https://raft.github.io/) cluster in which nodes elect a leader through consensus vote.
The leader oversees the management of the meta cluster, so it is important to replace non-leader nodes before the leader node.
The process for replacing meta nodes is as follows:
1. [Identify the leader node](#1-identify-the-leader-node)
2. [Replace all non-leader nodes](#2-replace-all-non-leader-nodes)
2.1. [Provision a new meta node](#2-1-provision-a-new-meta-node)
2.2. [Remove the non-leader meta node](#2-2-remove-the-non-leader-meta-node)
2.3. [Add the new meta node](#2-3-add-the-new-meta-node)
2.4. [Confirm the meta node was added](#2-4-confirm-the-meta-node-was-added)
2.5. [Remove and replace all other non-leader meta nodes](#2-5-remove-and-replace-all-other-non-leader-meta-nodes)
3. [Replace the leader node](#3-replace-the-leader-node)
3.1. [Kill the meta process on the leader node](#3-1-kill-the-meta-process-on-the-leader-node)
3.2. [Remove and replace the old leader node](#3-2-remove-and-replace-the-old-leader-node)
### 1. Identify the leader node
Log into any of your meta nodes and run the following:
```bash
curl -s localhost:8091/status | jq
```
> Piping the command into `jq` is optional, but does make the JSON output easier to read.
The output will include information about the current meta node, the leader of the meta cluster, and a list of "peers" in the meta cluster.
```json
{
"nodeType": "meta",
"leader": "enterprise-meta-01:8089",
"httpAddr": "enterprise-meta-01:8091",
"raftAddr": "enterprise-meta-01:8089",
"peers": [
"enterprise-meta-01:8089",
"enterprise-meta-02:8089",
"enterprise-meta-03:8089"
]
}
```
Identify the `leader` of the cluster. When replacing nodes in a cluster, non-leader nodes should be replaced _before_ the leader node.
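
If you only need the leader's address, you could extract it directly from the status output; this assumes `jq` is installed.

```bash
# Print only the leader meta node address.
curl -s localhost:8091/status | jq -r .leader
```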
### 2. Replace all non-leader nodes
#### 2.1. Provision a new meta node
[Provision and start a new meta node](/enterprise_influxdb/v1.10/installation/meta_node_installation/), but **do not** add it to the cluster yet.
For this guide, the new meta node's hostname will be `enterprise-meta-04`.
#### 2.2. Remove the non-leader meta node
Now remove the non-leader node you are replacing by using the `influxd-ctl remove-meta` command and the TCP address of the meta node (for example, `enterprise-meta-02:8091`):
```bash
# Pattern
influxd-ctl remove-meta <meta-node-tcp-bind-address>
# Example
influxd-ctl remove-meta enterprise-meta-02:8091
```
> Only use `remove-meta` if you want to permanently remove a meta node from a cluster.
<!-- -->
> **For unresponsive or unrecoverable meta nodes:**
> If the meta process is not running on the node you are trying to remove, or if the node is neither reachable nor recoverable, use the `-force` flag.
> When forcefully removing a meta node, you must also pass the `-tcpAddr` flag with the TCP bind address of the node you are removing, followed by its HTTP bind address.
```bash
# Pattern
influxd-ctl remove-meta -force -tcpAddr <meta-node-tcp-bind-address> <meta-node-http-bind-address>
# Example
influxd-ctl remove-meta -force -tcpAddr enterprise-meta-02:8089 enterprise-meta-02:8091
```
#### 2.3. Add the new meta node
Once the non-leader meta node has been removed, use `influxd-ctl add-meta` to replace it with the new meta node:
```bash
# Pattern
influxd-ctl add-meta <meta-node-tcp-bind-address>
# Example
influxd-ctl add-meta enterprise-meta-04:8091
```
You can also add a meta node remotely through another meta node:
```bash
# Pattern
influxd-ctl -bind <remote-meta-node-bind-address> add-meta <meta-node-tcp-bind-address>
# Example
influxd-ctl -bind enterprise-meta-node-01:8091 add-meta enterprise-meta-node-04:8091
```
> This command contacts the meta node running at `enterprise-meta-node-01:8091` and adds a meta node to that meta node's cluster.
> The added meta node has the hostname `enterprise-meta-node-04` and runs on port `8091`.
#### 2.4. Confirm the meta node was added
Confirm the new meta-node has been added by running:
```bash
influxd-ctl show
```
The new meta node should appear in the output:
```bash
Data Nodes
==========
ID TCP Address Version
4 enterprise-data-01:8088 {{< latest-patch >}}-c{{< latest-patch >}}
5 enterprise-data-02:8088 {{< latest-patch >}}-c{{< latest-patch >}}
Meta Nodes
==========
TCP Address Version
enterprise-meta-01:8091 {{< latest-patch >}}-c{{< latest-patch >}}
enterprise-meta-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
enterprise-meta-04:8091 {{< latest-patch >}}-c{{< latest-patch >}} # <-- The newly added meta node
```
#### 2.5. Remove and replace all other non-leader meta nodes
**If replacing only one meta node, no further action is required.**
If replacing others, repeat steps [2.1-2.4](#2-1-provision-a-new-meta-node) for all non-leader meta nodes one at a time.
### 3. Replace the leader node
As non-leader meta nodes are removed and replaced, the leader node oversees the replication of data to each of the new meta nodes.
Leave the leader up and running until at least two of the new meta nodes are up, running and healthy.
#### 3.1 - Kill the meta process on the leader node
Log into the leader meta node and kill the meta process.
```bash
# List the running processes and get the
# PID of the 'influx-meta' process
ps aux
# Kill the 'influx-meta' process
kill <PID>
```
Once killed, the meta cluster will elect a new leader using the [raft consensus algorithm](https://raft.github.io/).
Confirm the new leader by running:
```bash
curl localhost:8091/status | jq
```
#### 3.2 - Remove and replace the old leader node
Remove the old leader node and replace it by following steps [2.1-2.4](#2-1-provision-a-new-meta-node).
The minimum number of meta nodes you should have in your cluster is 3.
## Replace data nodes in an InfluxDB Enterprise cluster
[Data nodes](/enterprise_influxdb/v1.10/concepts/clustering/#data-nodes) house all raw time series data and metadata.
The process of replacing data nodes is as follows:
1. [Provision a new data node](#1-provision-a-new-data-node)
2. [Replace the old data node with the new data node](#2-replace-the-old-data-node-with-the-new-data-node)
3. [Confirm the data node was added](#3-confirm-the-data-node-was-added)
4. [Check the copy-shard-status](#4-check-the-copy-shard-status)
### 1. Provision a new data node
[Provision and start a new data node](/enterprise_influxdb/v1.10/installation/data_node_installation/), but **do not** add it to your cluster yet.
### 2. Replace the old data node with the new data node
Log into any of your cluster's meta nodes and use `influxd-ctl update-data` to replace the old data node with the new data node:
```bash
# Pattern
influxd-ctl update-data <old-node-tcp-bind-address> <new-node-tcp-bind-address>
# Example
influxd-ctl update-data enterprise-data-01:8088 enterprise-data-03:8088
```
### 3. Confirm the data node was added
Confirm the new data node has been added by running:
```bash
influxd-ctl show
```
The new data node should appear in the output:
```bash
Data Nodes
==========
ID TCP Address Version
4 enterprise-data-03:8088 {{< latest-patch >}}-c{{< latest-patch >}} # <-- The newly added data node
5 enterprise-data-02:8088 {{< latest-patch >}}-c{{< latest-patch >}}
Meta Nodes
==========
TCP Address Version
enterprise-meta-01:8091 {{< latest-patch >}}-c{{< latest-patch >}}
enterprise-meta-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
enterprise-meta-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
```
Inspect your cluster's shard distribution with `influxd-ctl show-shards`.
Shards will immediately reflect the new address of the node.
```bash
influxd-ctl show-shards
Shards
==========
ID Database Retention Policy Desired Replicas Shard Group Start End Expires Owners
3 telegraf autogen 2 2 2018-03-19T00:00:00Z 2018-03-26T00:00:00Z [{5 enterprise-data-02:8088} {4 enterprise-data-03:8088}]
1 _internal monitor 2 1 2018-03-22T00:00:00Z 2018-03-23T00:00:00Z 2018-03-30T00:00:00Z [{5 enterprise-data-02:8088}]
2 _internal monitor 2 1 2018-03-22T00:00:00Z 2018-03-23T00:00:00Z 2018-03-30T00:00:00Z [{4 enterprise-data-03:8088}]
4 _internal monitor 2 3 2018-03-23T00:00:00Z 2018-03-24T00:00:00Z 2018-03-01T00:00:00Z [{5 enterprise-data-02:8088}]
5 _internal monitor 2 3 2018-03-23T00:00:00Z 2018-03-24T00:00:00Z 2018-03-01T00:00:00Z [{4 enterprise-data-03:8088}]
6 foo autogen 2 4 2018-03-19T00:00:00Z 2018-03-26T00:00:00Z [{5 enterprise-data-02:8088} {4 enterprise-data-03:8088}]
```
Within the duration defined by [`anti-entropy.check-interval`](/enterprise_influxdb/v1.10/administration/config-data-nodes#check-interval-10m),
the AE service will begin copying shards from other shard owners to the new node.
The time it takes for copying to complete is determined by the number of shards copied and how much data is stored in each.
### 4. Check the `copy-shard-status`
Check on the status of the copy-shard process with:
```bash
influxd-ctl copy-shard-status
```
The output will show all currently running copy-shard processes.
```bash
Source Dest Database Policy ShardID TotalSize CurrentSize StartedAt
enterprise-data-02:8088 enterprise-data-03:8088 telegraf autogen 3 119624324 119624324 2018-04-17 23:45:09.470696179 +0000 UTC
```
> **Important:** If replacing other data nodes in the cluster, make sure shards are completely copied from nodes in the same shard group before replacing the other nodes.
View the [Anti-entropy](/enterprise_influxdb/v1.10/administration/anti-entropy/#concepts) documentation for important information regarding anti-entropy and your database's replication factor.
## Troubleshoot
### Cluster commands result in timeout without error
In some cases, commands used to add or remove nodes from your cluster
time out but don't return an error.
```bash
add-data: operation timed out with error:
```
#### Check your InfluxDB user permissions
To add or remove nodes in a cluster, your user must have the `AddRemoveNode` permission.
Attempting to manage cluster nodes without the appropriate permissions results
in a timeout with no accompanying error.
To check user permissions, log in to one of your meta nodes and `curl` the `/user` API endpoint:
```bash
curl localhost:8091/user
```
You can also check the permissions of a specific user by passing the username with the `name` parameter:
```bash
# Pattern
curl localhost:8091/user?name=<username>
# Example
curl localhost:8091/user?name=bob
```
The JSON output will include user information and permissions:
```json
"users": [
{
"name": "bob",
"hash": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"permissions": {
"": [
"ViewAdmin",
"ViewChronograf",
"CreateDatabase",
"CreateUserAndRole",
"DropDatabase",
"DropData",
"ReadData",
"WriteData",
"ManageShard",
"ManageContinuousQuery",
"ManageQuery",
"ManageSubscription",
"Monitor"
]
}
}
]
```
_In the output above, `bob` does not have the required `AddRemoveNode` permissions
and would not be able to add or remove nodes from the cluster._
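
If a user is missing the permission, you could grant it cluster-wide through the meta API (see [Manage authorization with the InfluxDB Enterprise Meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api/)). The request below is a sketch; the username, credentials, and host are placeholders, and the request must reach the lead meta node.

```bash
# Grant the cluster-wide AddRemoveNode permission to the user "bob".
curl --location-trusted -u "admin:changeit" -s -v \
  -d '{"action":"add-permissions","user":{"name":"bob","permissions":{"":["AddRemoveNode"]}}}' \
  localhost:8091/user
```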
#### Check the network connection between nodes
Something may be interrupting the network connection between nodes.
To check, `ping` the server or node you're trying to add or remove.
If the ping is unsuccessful, something in the network is preventing communication.
```bash
ping enterprise-data-03
```
_If pings are unsuccessful, be sure to ping from other meta nodes as well to determine
if the communication issues are unique to specific nodes._
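
Because `ping` only checks basic reachability, you might also verify that the node's TCP bind port accepts connections. This sketch assumes `nc` (netcat) is available.

```bash
# Test the data node's TCP bind port (8088).
nc -vz enterprise-data-03 8088
```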

@ -0,0 +1,56 @@
---
title: Rename hosts in InfluxDB Enterprise
description: Rename a host within your InfluxDB Enterprise instance.
aliases:
- /enterprise_influxdb/v1.10/administration/renaming/
menu:
enterprise_influxdb_1_10:
name: Rename hosts
parent: Manage
weight: 40
---
## Host renaming
The following instructions allow you to rename a host within your InfluxDB Enterprise instance.
First, suspend write and query activity to the cluster.
### Rename meta nodes
- Find the meta node leader with `curl localhost:8091/status`. The `leader` field in the JSON output reports the leader meta node. Start with the two meta nodes that are not the leader (a command sketch for a single meta node follows this list).
- On a non-leader meta node, run `influxd-ctl remove-meta`. Once removed, confirm by running `influxd-ctl show` on the meta leader.
- Stop the meta service on the removed node and edit its configuration file (`/etc/influxdb/influxdb-meta.conf`) to set the new `hostname`.
- Update the OS hostname if needed and apply DNS changes.
- Start the meta service.
- On the meta leader, add the meta node with the new hostname using `influxd-ctl add-meta newmetanode:8091`. Confirm with `influxd-ctl show`.
- Repeat for the second meta node.
- Once the two non-leaders are updated, stop the leader and wait for another meta node to become the leader - check with `curl localhost:8091/status`.
- Repeat the process for the last meta node (former leader).
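
The per-node sequence above could look roughly like the following. This is a sketch only: it assumes a systemd-based host, that the meta service is named `influxdb-meta`, and that `enterprise-meta-02` is being renamed to `enterprise-meta-02-new`.

```bash
# On the meta leader: remove the node that will be renamed.
influxd-ctl remove-meta enterprise-meta-02:8091

# On the node being renamed (service name assumed):
sudo systemctl stop influxdb-meta
sudo vi /etc/influxdb/influxdb-meta.conf   # set hostname = "enterprise-meta-02-new"
# Update the OS hostname and DNS as needed, then restart the service.
sudo systemctl start influxdb-meta

# Back on the meta leader: add the node under its new name and confirm.
influxd-ctl add-meta enterprise-meta-02-new:8091
influxd-ctl show
```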
### Intermediate verification
- Verify the state of the cluster with `influxd-ctl show`. The version must be reported on all nodes for them to be healthy.
- Verify there is a meta leader with `curl localhost:8091/status` and that all meta nodes list the rest in the output.
- Restart all data nodes one by one. Verify that `/var/lib/influxdb/meta/client.json` on all data nodes references the new meta names.
- Verify the `show shards` output lists all shards and node ownership as expected.
- Verify that the cluster is in good shape functional-wise, responds to writes and queries.
### Rename data nodes
- Find the meta node leader with `curl localhost:8091/status`. The `leader` field in the JSON output reports the leader meta node. (A command sketch for renaming a single data node follows this list.)
- Stop the service on the data node you want to rename. Edit its configuration file to set the new `hostname` under `/etc/influxdb/influxdb.conf`.
- Update the OS hostname if needed and apply DNS changes.
- Start the data service. Errors will be logged until it is added to the cluster again.
- On the meta node leader, run `influxd-ctl update-data oldname:8088 newname:8088`. Upon success, you get a message confirming that the data node was updated to `newname:8088`.
- Verify with `influxd-ctl show` on the meta node leader. Verify there are no errors in the logs of the updated data node and other data nodes. Restart the service on the updated data node. Verify writes, replication and queries work as expected.
- Repeat on the remaining data nodes. Remember to only execute the `update-data` command from the meta leader.
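
Similarly, renaming a single data node could look like the sketch below. It assumes a systemd-based host with the data service named `influxdb`, and that `olddata` is being renamed to `newdata`.

```bash
# On the data node being renamed (service name assumed):
sudo systemctl stop influxdb
sudo vi /etc/influxdb/influxdb.conf   # set hostname = "newdata"
# Update the OS hostname and DNS as needed, then restart the service.
sudo systemctl start influxdb

# On the meta leader: point the cluster at the new name and verify.
influxd-ctl update-data olddata:8088 newdata:8088
influxd-ctl show
```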
### Final verification
- Verify the state of the cluster with `influxd-ctl show`. The version must be reported on all nodes for them to be healthy.
- Verify the `show shards` output lists all shards and node ownership as expected.
- Verify meta queries work (show measurements under a database).
- Verify data are being queried successfully.
Once you've performed the verification steps, resume write and query activity.

@ -0,0 +1,210 @@
---
title: Manage subscriptions in InfluxDB
description: >
Manage subscriptions, which copy all written data to a local or remote endpoint, in InfluxDB OSS.
menu:
enterprise_influxdb_1_10:
name: Manage subscriptions
parent: Manage
weight: 30
aliases:
- /enterprise_influxdb/v1.10/administration/subscription-management/
---
InfluxDB subscriptions are local or remote endpoints to which all data written to InfluxDB is copied.
Subscriptions are primarily used with [Kapacitor](/kapacitor/), but any endpoint
able to accept UDP, HTTP, or HTTPS connections can subscribe to InfluxDB and receive
a copy of all data as it is written.
## How subscriptions work
As data is written to InfluxDB, writes are duplicated to subscriber endpoints via
HTTP, HTTPS, or UDP in [line protocol](/enterprise_influxdb/v1.10/write_protocols/line_protocol_tutorial/).
The InfluxDB subscriber service creates multiple "writers" ([goroutines](https://golangbot.com/goroutines/))
which send writes to the subscription endpoints.
_The number of writer goroutines is defined by the [`write-concurrency`](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#write-concurrency-40) configuration._
As writes occur in InfluxDB, each subscription writer sends the written data to the
specified subscription endpoints.
However, with a high `write-concurrency` (multiple writers) and a high ingest rate,
nanosecond differences in writer processes and the transport layer can result
in writes being received out of order.
> #### Important information about high write loads
> While setting the subscriber `write-concurrency` to greater than 1 does increase your
> subscriber write throughput, it can result in out-of-order writes under high ingest rates.
> Setting `write-concurrency` to 1 ensures writes are passed to subscriber endpoints sequentially,
> but can create a bottleneck under high ingest rates.
>
> What `write-concurrency` should be set to depends on your specific workload
> and need for in-order writes to your subscription endpoint.
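
To see what a data node is currently using, you could inspect the `[subscriber]` section of its configuration file; the full set of options is shown in the configuration example later on this page. A sketch, assuming the default configuration path:

```bash
# Print the subscriber settings, including write-concurrency,
# from the data node configuration file.
grep -A 8 '^\[subscriber\]' /etc/influxdb/influxdb.conf
```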
## InfluxQL subscription statements
Use the following InfluxQL statements to manage subscriptions:
- [`CREATE SUBSCRIPTION`](#create-subscriptions)
- [`SHOW SUBSCRIPTIONS`](#show-subscriptions)
- [`DROP SUBSCRIPTION`](#remove-subscriptions)
## Create subscriptions
Create subscriptions using the `CREATE SUBSCRIPTION` InfluxQL statement.
Specify the subscription name, the database name and retention policy to subscribe to,
and the URL of the host to which data written to InfluxDB should be copied.
```sql
-- Pattern:
CREATE SUBSCRIPTION "<subscription_name>" ON "<db_name>"."<retention_policy>" DESTINATIONS <ALL|ANY> "<subscription_endpoint_host>"
-- Examples:
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that sends data to 'example.com:9090' via HTTP.
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ALL 'http://example.com:9090'
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that round-robins the data to 'h1.example.com:9090' and 'h2.example.com:9090' via UDP.
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ANY 'udp://h1.example.com:9090', 'udp://h2.example.com:9090'
```
If authentication is enabled on the subscriber host, include the credentials in the URL.
```
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that sends data to another InfluxDB on 'example.com:8086' via HTTP. Authentication is enabled on the subscription host (user: subscriber, pass: secret).
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ALL 'http://subscriber:secret@example.com:8086'
```
{{% warn %}}
`SHOW SUBSCRIPTIONS` outputs all subscriber URLs in plain text, including those with authentication credentials.
Any user with the privileges to run `SHOW SUBSCRIPTIONS` is able to see these credentials.
{{% /warn %}}
### Sending subscription data to multiple hosts
The `CREATE SUBSCRIPTION` statement allows you to specify multiple hosts as endpoints for the subscription.
In your `DESTINATIONS` clause, you can pass multiple host strings separated by commas.
Using `ALL` or `ANY` in the `DESTINATIONS` clause determines how InfluxDB writes data to each endpoint:
- `ALL`: Writes data to all specified hosts.
- `ANY`: Round-robins writes between specified hosts.
_**Subscriptions with multiple hosts**_
```sql
-- Write all data to multiple hosts
CREATE SUBSCRIPTION "mysub" ON "mydb"."autogen" DESTINATIONS ALL 'http://host1.example.com:9090', 'http://host2.example.com:9090'
-- Round-robin writes between multiple hosts
CREATE SUBSCRIPTION "mysub" ON "mydb"."autogen" DESTINATIONS ANY 'http://host1.example.com:9090', 'http://host2.example.com:9090'
```
### Subscription protocols
Subscriptions can use HTTP, HTTPS, or UDP transport protocols.
Which to use is determined by the protocol expected by the subscription endpoint.
If creating a Kapacitor subscription, this is defined by the `subscription-protocol`
option in the `[[influxdb]]` section of your [`kapacitor.conf`](/{{< latest "kapacitor" >}}/administration/subscription-management/#subscription-protocol).
_**kapacitor.conf**_
```toml
[[influxdb]]
# ...
subscription-protocol = "http"
# ...
```
_For information regarding HTTPS connections and secure communication between InfluxDB and Kapacitor,
view the [Kapacitor security](/kapacitor/v1.5/administration/security/#secure-influxdb-and-kapacitor) documentation._
## Show subscriptions
The `SHOW SUBSCRIPTIONS` InfluxQL statement returns a list of all subscriptions registered in InfluxDB.
```sql
SHOW SUBSCRIPTIONS
```
_**Example output:**_
```bash
name: _internal
retention_policy name mode destinations
---------------- ---- ---- ------------
monitor kapacitor-39545771-7b64-4692-ab8f-1796c07f3314 ANY [http://localhost:9092]
```
## Remove subscriptions
Remove or drop subscriptions using the `DROP SUBSCRIPTION` InfluxQL statement.
```sql
-- Pattern:
DROP SUBSCRIPTION "<subscription_name>" ON "<db_name>"."<retention_policy>"
-- Example:
DROP SUBSCRIPTION "sub0" ON "mydb"."autogen"
```
### Drop all subscriptions
In some cases, it may be necessary to remove all subscriptions.
Run the following bash script that utilizes the `influx` CLI, loops through all subscriptions, and removes them.
This script depends on the `$INFLUXUSER` and `$INFLUXPASS` environment variables.
If these are not set, export them as part of the script.
```bash
# Environment variable exports:
# Uncomment these if INFLUXUSER and INFLUXPASS are not already globally set.
# export INFLUXUSER=influxdb-username
# export INFLUXPASS=influxdb-password
IFS=$'\n'
for i in $(influx -format csv -username $INFLUXUSER -password $INFLUXPASS -database _internal -execute 'show subscriptions' | tail -n +2 | grep -v name); do
  influx -format csv -username $INFLUXUSER -password $INFLUXPASS -database _internal \
    -execute "drop subscription \"$(echo "$i" | cut -f 3 -d ',')\" ON \"$(echo "$i" | cut -f 1 -d ',')\".\"$(echo "$i" | cut -f 2 -d ',')\""
done
```
## Configure InfluxDB subscriptions
InfluxDB subscription configuration options are available in the `[subscriber]`
section of the `influxdb.conf`.
To use subscriptions, the `enabled` option in the `[subscriber]` section must be set to `true`.
Below is an example `influxdb.conf` subscriber configuration:
```toml
[subscriber]
enabled = true
http-timeout = "30s"
insecure-skip-verify = false
ca-certs = ""
write-concurrency = 40
write-buffer-size = 1000
```
_**Descriptions of `[subscriber]` configuration options are available in the [data node configuration](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#subscription-settings) documentation.**_
## Troubleshooting
### Inaccessible or decommissioned subscription endpoints
Unless a subscription is [dropped](#remove-subscriptions), InfluxDB assumes the endpoint
should always receive data and will continue to attempt to send data.
If an endpoint host is inaccessible or has been decommissioned, you will see errors
similar to the following:
```bash
# Some message content omitted (...) for the sake of brevity
"Post http://x.y.z.a:9092/write?consistency=...: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" ... service=subscriber
"Post http://x.y.z.a:9092/write?consistency=...: dial tcp x.y.z.a:9092: getsockopt: connection refused" ... service=subscriber
"Post http://x.y.z.a:9092/write?consistency=...: dial tcp 172.31.36.5:9092: getsockopt: no route to host" ... service=subscriber
```
In some cases, this may be caused by a networking error or something similar
preventing a successful connection to the subscription endpoint.
In other cases, it's because the subscription endpoint no longer exists and
the subscription hasn't been dropped from InfluxDB.
> Because InfluxDB does not know if a subscription endpoint will or will not become accessible again,
> subscriptions are not automatically dropped when an endpoint becomes inaccessible.
> If a subscription endpoint is removed, you must manually [drop the subscription](#remove-subscriptions) from InfluxDB.

@ -0,0 +1,18 @@
---
title: Manage users and permissions
description: Manage authorization in InfluxDB Enterprise clusters with users, roles, and permissions.
menu:
enterprise_influxdb_1_10:
name: Manage users and permissions
parent: Manage
weight: 20
aliases:
- /enterprise_influxdb/v1.10/administration/authentication_and_authorization/
---
{{% enterprise-warning-authn-b4-authz %}}
_For information about how to configure HTTPs over TLS, LDAP authentication, and password hashing,
see [Configure security](/enterprise_influxdb/v1.10/administration/configure/security/)._
{{< children >}}

@ -0,0 +1,410 @@
---
title: Manage authorization with the InfluxDB Enterprise Meta API
description: >
Manage users and permissions with the InfluxDB Enterprise Meta API.
menu:
enterprise_influxdb_1_10:
name: Manage authorization with the API
parent: Manage users and permissions
weight: 41
aliases:
- /enterprise_influxdb/v1.10/administration/manage/security/authentication_and_authorization-api/
- /enterprise_influxdb/v1.10/administration/security/authentication_and_authorization-api/
---
{{% enterprise-warning-authn-b4-authz %}}
Use the InfluxDB Enterprise Meta API to manage authorization for a cluster.
The API can be used to manage both cluster-wide and database-specific [permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#permissions).
Chronograf can only manage cluster-wide permissions.
To manage permissions at the database level, use the API.
<!--
## permission "tokens"
Predefined key tokens take the form of verb-object pairs.
When the token lacks the verb part, full management privileges are implied.
These predefined tokens are:
-->
For more information, see [Enterprise users and permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/).
### Example API requests
{{% note %}}
Many of the examples below use the `jq` utility to format JSON output for readability.
[Install `jq`](https://stedolan.github.io/jq/download/) to process JSON output.
If you don't have access to `jq`, remove the `| jq` shown in the example.
{{% /note %}}
**Users**:
- [List users](#list-users)
- [Create a user against a follower node](#create-a-user-against-a-follower-node)
- [Create a user against the lead node](#create-a-user-against-the-lead-node)
- [Retrieve a user details document](#retrieve-a-user-details-document)
- [Grant permissions to a user for all databases](#grant-permissions-to-a-user-for-all-databases)
- [Grant permissions to a user for a specific database](#grant-permissions-to-a-user-for-a-specific-database)
- [Verify user permissions](#verify-user-permissions)
- [Remove permissions from a user](#remove-permissions-from-a-user)
- [Remove a user](#remove-a-user)
- [Verify user removal](#verify-user-removal)
- [Change a user's password](#change-a-users-password)
**Roles**:
- [List roles](#list-roles)
- [Create a role](#create-a-role)
- [Verify roles](#verify-roles)
- [Retrieve a role document](#retrieve-a-role-document)
- [Add permissions to a role for all databases](#add-permissions-to-a-role-for-all-databases)
- [Add permissions to a role for a specific database](#add-permissions-to-a-role-for-a-specific-database)
- [Verify role permissions](#verify-role-permissions)
- [Add a user to a role](#add-a-user-to-a-role)
- [Verify user in role](#verify-user-in-role)
- [Remove a user from a role](#remove-a-user-from-a-role)
- [Remove a permission from a role](#remove-a-permission-from-a-role)
- [Delete a role](#delete-a-role)
- [Verify role deletion](#verify-role-deletion)
#### Users
Use the `/user` endpoint of the InfluxDB Enterprise Meta API to manage users.
##### List users
View a list of existing users.
```sh
curl --location-trusted -u "admin:changeit" -s https://cluster_node_1:8091/user | jq
```
```json
{
"users": [
{
"hash": "$2a$10$NelNfrWdxubN0/TnP7DwquKB9/UmJnyZ7gy0i69MPldK73m.2WfCu",
"name": "admin",
"permissions": {
"": [
"ViewAdmin",
"ViewChronograf",
"CreateDatabase",
"CreateUserAndRole",
"AddRemoveNode",
"DropDatabase",
"DropData",
"ReadData",
"WriteData",
"Rebalance",
"ManageShard",
"ManageContinuousQuery",
"ManageQuery",
"ManageSubscription",
"Monitor",
"CopyShard",
"KapacitorAPI",
"KapacitorConfigAPI"
]
}
}
]
}
```
##### Create a user against a follower node
Transactions that modify the user store must be sent to the lead meta node using `POST`.
If the node returns a 307 redirect message,
try resending the request to the lead node as indicated by the `Location` field in the HTTP response header.
```sh
curl --location-trusted -u "admin:changeit" -s -v \
-d '{"action":"create","user":{"name":"phantom2","password":"changeit"}}' \
https://cluster_node_2:8091/user
```
##### Create a user against the lead node
```sh
curl --location-trusted -u "admin:changeit" -s -v \
-d '{"action":"create","user":{"name":"phantom","password":"changeit"}}' \
https://cluster_node_1:8091/user
```
##### Retrieve a user details document
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/user?name=phantom | jq
```
```json
{
"users": [
{
"hash": "$2a$10$hR.Ih6DpIHUaynA.uqFhpOiNUgrADlwg3rquueHDuw58AEd7zk5hC",
"name": "phantom"
}
]
}
```
##### Grant permissions to a user for all databases
To grant a list of permissions for all databases in a cluster,
use the `""` key in the permissions object, as shown in the example below.
```
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"add-permissions","user":{"name":"phantom","permissions":{"":["ReadData", "WriteData"]}}}' \
https://cluster_node_1:8091/user
```
##### Grant permissions to a user for a specific database
Grant `ReadData` and `WriteData` permissions to the user named `phantom` for `MyDatabase`.
```
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"add-permissions","user":{"name":"phantom","permissions":{"MyDatabase":["ReadData","WriteData"]}}}' \
https://cluster_node_1:8091/user
```
##### Verify user permissions
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/user?name=phantom | jq
```
```json
{
"users": [
{
"hash": "$2a$10$hR.Ih6DpIHUaynA.uqFhpOiNUgrADlwg3rquueHDuw58AEd7zk5hC",
"name": "phantom",
"permissions": {
"MyDatabase": [
"ReadData",
"WriteData"
]
}
}
]
}
```
##### Remove permissions from a user
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"remove-permissions","user":{"name":"phantom","permissions":{"":["KapacitorConfigAPI"]}}}' \
https://cluster_node_1:8091/user
```
##### Remove a user
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"delete","user":{"name":"phantom2"}}' \
https://cluster_node_1:8091/user
```
##### Verify user removal
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/user?name=phantom
```
```json
{
"error": "user not found"
}
```
##### Change a user's password
```sh
curl --location-trusted -u "admin:changeit" -H "Content-Type: application/json" \
-d '{"action": "change-password", "user": {"name": "<username>", "password": "newpassword"}}' \
localhost:8091/user
```
<!-- TODO -->
#### Roles
The InfluxDB Enterprise Meta API provides a `/role` endpoint for managing roles.
##### List roles
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role | jq
```
```
{}
```
In a fresh installation, no roles have been created yet.
As when creating a user, requests that create or modify roles must be sent to the lead meta node.
##### Create a role
```sh
curl --location-trusted --negotiate -u "admin:changeit" -v \
-d '{"action":"create","role":{"name":"spectre"}}' \
https://cluster_node_1:8091/role
```
##### Verify roles
Verify the role has been created.
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role | jq
```
```json
{
"roles": [
{
"name": "djinn",
},
{
"name": "spectre"
},
]
}
```
##### Retrieve a role document
Retrieve a record for a single role.
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role?name=spectre | jq
```
```json
{
"roles": [
{
"name": "spectre"
}
]
}
```
##### Add permissions to a role for all databases
To grant a list of permissions to a role for all databases in a cluster,
use the `""` key in the permissions object, as shown in the example below.
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"add-permissions","role":{"name":"spectre","permissions":{"":["ReadData","WriteData"]}}}' \
https://cluster_node_1:8091/role
```
##### Add permissions to a role for a specific database
Grant `ReadData` and `WriteData` permissions to the role named `spectre` for `MyDatabase`.
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"add-permissions","role":{"name":"spectre","permissions":{"MyDatabase":["ReadData","WriteData"]}}}' \
https://cluster_node_1:8091/role
```
##### Verify role permissions
Verify permissions have been added.
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role?name=spectre | jq
```
```json
{
"roles": [
{
"name": "spectre",
"permissions": {
"MyDatabase": [
"ReadData",
"WriteData"
]
}
}
]
}
```
##### Add a user to a role
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"add-users","role":{"name":"spectre","users":["phantom"]}}' \
https://cluster_node_1:8091/role
```
##### Verify user in role
Verify user has been added to role.
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role?name=spectre | jq
```
```json
{
"roles": [
{
"name": "spectre",
"permissions": {
"": [
"KapacitorAPI",
"KapacitorConfigAPI"
]
},
"users": [
"phantom"
]
}
]
}
```
##### Remove a user from a role
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"remove-users","role":{"name":"spectre","users":["phantom"]}}' \
https://cluster_node_1:8091/role
```
##### Remove a permission from a role
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"remove-permissions","role":{"name":"spectre","permissions":{"":["KapacitorConfigAPI"]}}}' \
https://cluster_node_1:8091/role
```
##### Delete a role
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
-d '{"action":"delete","role":{"name":"spectre"}}' \
https://cluster_node_1:8091/role
```
##### Verify role deletion
```sh
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role?name=spectre | jq
```
```json
{
"error": "role not found"
}
```

@ -0,0 +1,255 @@
---
title: Manage authorization with InfluxQL
description: >
Manage users and permissions with InfluxQL.
menu:
enterprise_influxdb_1_10:
parent: Manage users and permissions
weight: 40
related:
- /enterprise_influxdb/v1.10/administration/manage/security/authorization-api.md
- /{{< latest "chronograf" >}}/administration/managing-influxdb-users/
- /enterprise_influxdb/v1.10/administration/manage/security/fine-grained-authorization/
aliases:
- /enterprise_influxdb/v1.10/administration/manage/security/authentication_and_authorization-api/
---
{{% enterprise-warning-authn-b4-authz %}}
{{% note %}}
We recommend using [Chronograf](/{{< latest "chronograf" >}}/administration/managing-influxdb-users/)
and/or the [Enterprise meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api/)
to manage InfluxDB Enterprise users and roles.
{{% /note %}}
{{% warn %}}
Outside of [creating users](/enterprise_influxdb/v1.10/query_language/spec/#create-user),
we recommend operators *do not* mix and match InfluxQL
with other authorization management methods (Chronograf and the API).
Doing so may lead to inconsistencies in user permissions.
{{% /warn %}}
This page shows examples of basic user and permission management using InfluxQL statements.
However, *only a subset of Enterprise permissions can be managed with InfluxQL.*
Using InfluxQL, you can perform the following actions:
- Create new users and assign them either the admin role or no role.
- Grant `READ`, `WRITE`, or `ALL` database permissions to users.
- Revoke permissions from users.
- Grant or revoke access to specific databases for individual users.
However, InfluxDB Enterprise offers an [*expanded set of permissions*](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#permissions).
You can use the Meta API and Chronograf to access and assign these more granular permissions to individual users.
The [InfluxDB Enterprise meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api/)
provides the most comprehensive way to manage users, roles, permissions,
and other [fine-grained authorization](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/fine-grained-authorization/) (FGA) capabilities.
#### Non-admin users
When authentication is enabled,
a new non-admin user has no access to any database
until they are specifically [granted privileges to a database](#grant-read-write-or-all-database-privileges-to-an-existing-user)
by an admin user.
Non-admin users can [`SHOW`](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-databases)
the databases for which they have `ReadData` or `WriteData` permissions.
### User management commands
User management commands apply to either
[admin users](#manage-admin-users),
[non-admin users](#manage-non-admin-users),
or [both](#manage-admin-and-non-admin-users).
For more information about these commands,
see [Database management](/enterprise_influxdb/v1.10/query_language/manage-database/) and
[Continuous queries](/enterprise_influxdb/v1.10/query_language/continuous_queries/).
#### Manage admin users
Create an admin user with:
```sql
CREATE USER admin WITH PASSWORD '<password>' WITH ALL PRIVILEGES
```
{{% note %}}
Repeating the exact `CREATE USER` statement is idempotent.
If any values change, the database returns a duplicate user error.
```sql
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
> CREATE USER todd WITH PASSWORD '123' WITH ALL PRIVILEGES
ERR: user already exists
> CREATE USER todd WITH PASSWORD '123456'
ERR: user already exists
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
>
```
{{% /note %}}
##### `GRANT` administrative privileges to an existing user
```sql
GRANT ALL PRIVILEGES TO <username>
```
##### `REVOKE` administrative privileges from an admin user
```sql
REVOKE ALL PRIVILEGES FROM <username>
```
##### `SHOW` all existing users and their admin status
```sql
SHOW USERS
```
###### CLI Example
```sql
> SHOW USERS
user admin
todd false
paul true
hermione false
dobby false
```
#### Manage non-admin users
##### `CREATE` a new non-admin user
```sql
CREATE USER <username> WITH PASSWORD '<password>'
```
###### CLI example
```sql
> CREATE USER todd WITH PASSWORD 'influxdb41yf3'
> CREATE USER alice WITH PASSWORD 'wonder\'land'
> CREATE USER "rachel_smith" WITH PASSWORD 'asdf1234!'
> CREATE USER "monitoring-robot" WITH PASSWORD 'XXXXX'
> CREATE USER "$savyadmin" WITH PASSWORD 'm3tr1cL0v3r'
```
{{% note %}}
##### Important notes about providing user credentials
- The user value must be wrapped in double quotes if
it starts with a digit, is an InfluxQL keyword, contains a hyphen,
or includes any special characters (for example: `!@#$%^&*()-`).
- The password [string](/influxdb/v1.8/query_language/spec/#strings) must be wrapped in single quotes.
Do not include the single quotes when authenticating requests.
We recommend avoiding the single quote (`'`) and backslash (`\`) characters in passwords.
For passwords that include these characters, escape the special character with a backslash
(for example, `\'`) when creating the password and when submitting authentication requests.
- Repeating the exact `CREATE USER` statement is idempotent.
If any values change, the database returns a duplicate user error.
###### CLI example
```sql
> CREATE USER "todd" WITH PASSWORD '123456'
> CREATE USER "todd" WITH PASSWORD '123456'
> CREATE USER "todd" WITH PASSWORD '123'
ERR: user already exists
> CREATE USER "todd" WITH PASSWORD '123456'
> CREATE USER "todd" WITH PASSWORD '123456' WITH ALL PRIVILEGES
ERR: user already exists
> CREATE USER "todd" WITH PASSWORD '123456'
>
```
{{% /note %}}
##### `GRANT` `READ`, `WRITE` or `ALL` database privileges to an existing user
```sql
GRANT [READ,WRITE,ALL] ON <database_name> TO <username>
```
CLI examples:
`GRANT` `READ` access to `todd` on the `NOAA_water_database` database:
```sql
> GRANT READ ON "NOAA_water_database" TO "todd"
```
`GRANT` `ALL` access to `todd` on the `NOAA_water_database` database:
```sql
> GRANT ALL ON "NOAA_water_database" TO "todd"
```
##### `REVOKE` `READ`, `WRITE`, or `ALL` database privileges from an existing user
```sql
REVOKE [READ,WRITE,ALL] ON <database_name> FROM <username>
```
CLI examples:
`REVOKE` `ALL` privileges from `todd` on the `NOAA_water_database` database:
```sql
> REVOKE ALL ON "NOAA_water_database" FROM "todd"
```
`REVOKE` `WRITE` privileges from `todd` on the `NOAA_water_database` database:
```sql
> REVOKE WRITE ON "NOAA_water_database" FROM "todd"
```
{{% note %}}
If a user with `ALL` privileges has `WRITE` privileges revoked, they are left with `READ` privileges, and vice versa.
{{% /note %}}
##### `SHOW` a user's database privileges
```sql
SHOW GRANTS FOR <user_name>
```
CLI example:
```sql
> SHOW GRANTS FOR "todd"
database privilege
NOAA_water_database WRITE
another_database_name READ
yet_another_database_name ALL PRIVILEGES
one_more_database_name NO PRIVILEGES
```
#### Manage admin and non-admin users
##### Reset a user's password
```sql
SET PASSWORD FOR <username> = '<password>'
```
CLI example:
```sql
> SET PASSWORD FOR "todd" = 'password4todd'
```
{{% note %}}
The password [string](/influxdb/v1.8/query_language/spec/#strings) must be wrapped in single quotes.
Do not include the single quotes when authenticating requests.
We recommend avoiding the single quote (`'`) and backslash (`\`) characters in passwords.
For passwords that include these characters, escape the special character with a backslash (for example, `\'`) when creating the password and when submitting authentication requests.
{{% /note %}}
##### `DROP` a user
```sql
DROP USER <username>
```
CLI example:
```sql
> DROP USER "todd"
```

@ -0,0 +1,659 @@
---
title: Manage fine-grained authorization
description: >
Fine-grained authorization (FGA) in InfluxDB Enterprise controls user access at the database, measurement, and series levels.
menu:
enterprise_influxdb_1_10:
parent: Manage users and permissions
weight: 44
aliases:
- /docs/v1.5/administration/fga
- /enterprise_influxdb/v1.10/guides/fine-grained-authorization/
related:
- /enterprise_influxdb/v1.10/administration/authentication_and_authorization/
- /{{< latest "chronograf" >}}/administration/managing-influxdb-users/
---
{{% enterprise-warning-authn-b4-authz %}}
Use fine-grained authorization (FGA) to control user access at the database, measurement, and series levels.
You must have [admin permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#admin) to set up FGA.
{{% warn %}}
#### FGA does not apply to Flux
FGA does not restrict actions performed by Flux queries (both read and write).
If using FGA, we recommend [disabling Flux](/enterprise_influxdb/v{{< current-version >}}/flux/installation/).
{{% /warn %}}
{{% note %}}
FGA is only available in InfluxDB Enterprise.
InfluxDB OSS 1.x controls access at the database level only.
{{% /note %}}
## Set up fine-grained authorization
1. [Enable authentication](/enterprise_influxdb/v1.10/administration/configure/security/authentication/) in your InfluxDB configuration file.
2. Create users through the InfluxDB query API.
```sql
CREATE USER username WITH PASSWORD 'password'
```
For more information, see [User management commands](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-influxql/#user-management-commands).
3. Ensure that you can access the **meta node** API (port 8091 by default).
{{% note %}}
In a typical cluster configuration, the HTTP ports for data nodes
(8086 by default) are exposed to clients but the meta node HTTP ports are not.
You may need to work with your network administrator to gain access to the meta node HTTP ports.
{{% /note %}}
4. Create users. Do the following:
1. As an admin user, create new users and grant them all permissions on a database. The following example creates users `east` and `west` and grants them all permissions on the `datacenters` database.
```sql
CREATE DATABASE datacenters
CREATE USER east WITH PASSWORD 'east'
GRANT ALL ON datacenters TO east
CREATE USER west WITH PASSWORD 'west'
GRANT ALL ON datacenters TO west
```
2. Add fine-grained permissions to users as needed.
5. [Create roles](#manage-roles) to grant permissions to users assigned to a role.
{{% note %}}
For an overview of how users and roles work in InfluxDB Enterprise, see [InfluxDB Enterprise users](/enterprise_influxdb/v1.10/features/users/).
{{% /note %}}
6. [Set up restrictions](#manage-restrictions).
Restrictions apply to all non-admin users.
{{% note %}}
Permissions (currently "read" and "write") may be restricted independently depending on the scenario.
{{% /note %}}
7. [Set up grants](#manage-grants) to remove restrictions for specified users and roles.
---
{{% note %}}
#### Notes about examples
The examples below use `curl`, a command line tool for transferring data, to send
HTTP requests to the Meta API, and [`jq`](https://stedolan.github.io/jq/), a command line JSON processor,
to make the JSON output easier to read.
Alternatives for each are available, but are not covered in this documentation.
All examples assume authentication is enabled in InfluxDB.
Admin credentials must be sent with each request.
Use the `curl -u` flag to pass authentication credentials:
```sh
curl -u username:password # ...
```
{{% /note %}}
---
## Matching methods
The following matching methods are available when managing restrictions and grants to databases, measurements, or series:
- `exact` (matches only exact string matches)
- `prefix` (matches strings that begin with a specified prefix)
```sh
# Match a database name exactly
"database": {"match": "exact", "value": "my_database"}
# Match any databases that begin with "my_"
"database": {"match": "prefix", "value": "my_"}
```
{{% note %}}
#### Wildcard matching
Neither `exact` nor `prefix` matching methods allow for wildcard matching.
{{% /note %}}
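For example, a prefix match can cover a whole family of databases in a single rule. The following sketch uses the `acl/restrictions` endpoint described in [Manage restrictions](#manage-restrictions) below to restrict writes to any database whose name begins with `sandbox_` (the prefix and credentials are placeholders):

```sh
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
  -u "admin-username:admin-password" \
  -H "Content-Type: application/json" \
  --data-binary '{
    "database": {"match": "prefix", "value": "sandbox_"},
    "permissions": ["write"]
  }'
```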
## Manage roles
Roles allow you to assign permissions to groups of users.
The following examples assume the `user1`, `user2` and `ops` users already exist in InfluxDB.
### Create a role
To create a new role, use the InfluxDB Meta API `/role` endpoint with the `action`
field set to `create` in the request body.
The following examples create two new roles:
- east
- west
```sh
# Create east role
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"action": "create",
"role": {
"name": "east"
}
}'
# Create west role
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"action": "create",
"role": {
"name": "west"
}
}'
```
### Specify role permissions
To specify permissions for a role,
use the InfluxDB Meta API `/role` endpoint with the `action` field set to `add-permissions`.
Specify the [permissions](/{{< latest "chronograf" >}}/administration/managing-influxdb-users/#permissions) to add for each database.
The following example sets read and write permissions on `db1` for both `east` and `west` roles.
```sh
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"action": "add-permissions",
"role": {
"name": "east",
"permissions": {
"db1": ["ReadData", "WriteData"]
}
}
}'
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"action": "add-permissions",
"role": {
"name": "west",
"permissions": {
"db1": ["ReadData", "WriteData"]
}
}
}'
```
### Remove role permissions
To remove permissions from a role, use the InfluxDB Meta API `/role` endpoint with the `action` field
set to `remove-permissions`.
Specify the [permissions](/{{< latest "chronograf" >}}/administration/managing-influxdb-users/#permissions) to remove from each database.
The following example removes read and write permissions from `db1` for the `east` role.
```sh
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"action": "remove-permissions",
"role": {
"name": "east",
"permissions": {
"db1": ["ReadData", "WriteData"]
}
}
}'
```
### Assign users to a role
To assign users to a role, set the `action` field to `add-users` and include a list
of users in the `role` field.
The following examples add `user1` and `ops` to the `east` role, and `user2` and `ops` to the `west` role.
```sh
# Add user1 and ops to the east role
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"action": "add-users",
"role": {
"name": "east",
"users": ["user1", "ops"]
}
}'
# Add user2 and ops to the west role
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"action": "add-users",
"role": {
"name": "west",
"users": ["user2", "ops"]
}
}'
```
### View existing roles
To view existing roles with their assigned permissions and users, use the `GET`
request method with the InfluxDB Meta API `/role` endpoint.
```sh
curl --location-trusted -XGET http://localhost:8091/role | jq
```
### Delete a role
To delete a role, use the InfluxDB Meta API `/role` endpoint with the `action`
field set to `delete` and include the name of the role to delete.
```sh
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"action": "delete",
"role": {
"name": "west"
}
}'
```
{{% note %}}
Deleting a role does not delete users assigned to the role.
{{% /note %}}
## Manage restrictions
Restrictions block read access, write access, or both on InfluxDB assets.
Restrictions apply to all non-admin users.
[Grants](#manage-grants) override restrictions.
> In order to run meta queries (such as `SHOW MEASUREMENTS` or `SHOW TAGS` ),
> users must have read permissions for the database and retention policy they are querying.
Manage restrictions using the InfluxDB Meta API `acl/restrictions` endpoint.
```sh
curl --location-trusted -u "admin-username:admin-password" -XGET "http://localhost:8091/influxdb/v2/acl/restrictions"
```
- [Restrict by database](#restrict-by-database)
- [Restrict by measurement in a database](#restrict-by-measurement-in-a-database)
- [Restrict by series in a database](#restrict-by-series-in-a-database)
- [View existing restrictions](#view-existing-restrictions)
- [Update a restriction](#update-a-restriction)
- [Remove a restriction](#remove-a-restriction)
> **Note:** For the best performance, set up minimal restrictions.
### Restrict by database
In most cases, restricting the database is the simplest option, and has minimal impact on performance.
The following example restricts reads and writes on the `my_database` database.
```sh
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"permissions": ["read", "write"]
}'
```
### Restrict by measurement in a database
The following example restricts read and write permissions on the `network`
measurement in the `my_database` database.
_This restriction does not apply to other measurements in the `my_database` database._
```sh
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"permissions": ["read", "write"]
}'
```
### Restrict by series in a database
The most fine-grained restriction option is to restrict specific tags in a measurement and database.
The following example restricts read and write permissions on the `datacenter=east` tag in the
`network` measurement in the `my_database` database.
_This restriction does not apply to other tags or tag values in the `network` measurement._
```sh
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
"permissions": ["read", "write"]
}'
```
_Consider this option carefully, as it allows writes to `network` without tags or
writes to `network` with a tag key of `datacenter` and a tag value of anything but `east`._
##### Apply restrictions to a series defined by multiple tags
```sh
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"tags": [
{"match": "exact", "key": "tag1", "value": "value1"},
{"match": "exact", "key": "tag2", "value": "value2"}
],
"permissions": ["read", "write"]
}'
```
{{% note %}}
#### Create multiple restrictions at a time
There may be times when you need to create several restrictions, each with a unique value.
To create multiple restrictions for a list of values, use a bash `for` loop:
```sh
for value in val1 val2 val3 val4; do
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"tags": [{"match": "exact", "key": "datacenter", "value": "'$value'"}],
"permissions": ["read", "write"]
}'
done
```
{{% /note %}}
### View existing restrictions
To view existing restrictions, use the `GET` request method with the `acl/restrictions` endpoint.
```sh
curl --location-trusted -u "admin-username:admin-password" -XGET "http://localhost:8091/influxdb/v2/acl/restrictions" | jq
```
### Update a restriction
_You cannot directly modify a restriction.
Delete the existing restriction and create a new one with updated parameters._
### Remove a restriction
To remove a restriction, obtain the restriction ID using the `GET` request method
with the `acl/restrictions` endpoint.
Use the `DELETE` request method to delete a restriction by ID.
```sh
# Obtain the restriction ID from the list of restrictions
curl --location-trusted -u "admin-username:admin-password" \
-XGET "http://localhost:8091/influxdb/v2/acl/restrictions" | jq
# Delete the restriction using the restriction ID
curl --location-trusted -u "admin-username:admin-password" \
-XDELETE "http://localhost:8091/influxdb/v2/acl/restrictions/<restriction_id>"
```
## Manage grants
Grants remove restrictions and grant users or roles either or both read and write
permissions on InfluxDB assets.
Manage grants using the InfluxDB Meta API `acl/grants` endpoint.
```sh
curl --location-trusted -u "admin-username:admin-password" \
-XGET "http://localhost:8091/influxdb/v2/acl/grants"
```
- [Grant permissions by database](#grant-permissions-by-database)
- [Grant permissions by measurement in a database](#grant-permissions-by-measurement-in-a-database)
- [Grant permissions by series in a database](#grant-permissions-by-series-in-a-database)
- [View existing grants](#view-existing-grants)
- [Update a grant](#update-a-grant)
- [Remove a grant](#remove-a-grant)
### Grant permissions by database
The following examples grant read and write permissions on the `my_database` database.
> **Note:** This offers no guarantee that the users will write to the correct measurement or use the correct tags.
##### Grant database-level permissions to users
```sh
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"permissions": ["read", "write"],
"users": [
{"name": "user1"},
{"name": "user2"}
]
}'
```
##### Grant database-level permissions to roles
```sh
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"permissions": ["read", "write"],
"roles": [
{"name": "role1"},
{"name": "role2"}
]
}'
```
### Grant permissions by measurement in a database
The following examples grant permissions to the `network` measurement in the `my_database` database.
These grants do not apply to other measurements in the `my_database` database nor
guarantee that users will use the correct tags.
##### Grant measurement-level permissions to users
```sh
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"permissions": ["read", "write"],
"users": [
{"name": "user1"},
{"name": "user2"}
]
}'
```
To grant access for roles, run:
```sh
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"permissions": ["read", "write"],
"roles": [
{"name": "role1"},
{"name": "role2"}
]
}'
```
### Grant permissions by series in a database
The following examples grant access only to data with the corresponding `datacenter` tag.
_Neither guarantees the users will use the `network` measurement._
##### Grant series-level permissions to users
```sh
# Grant user1 read/write permissions on data with the 'datacenter=east' tag set.
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
"permissions": ["read", "write"],
"users": [{"name": "user1"}]
}'
# Grant user2 read/write permissions on data with the 'datacenter=west' tag set.
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
"permissions": ["read", "write"],
"users": [{"name": "user2"}]
}'
```
##### Grant series-level permissions to roles
```sh
# Grant role1 read/write permissions on data with the 'datacenter=east' tag set.
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
"permissions": ["read", "write"],
"roles": [{"name": "role1"}]
}'
# Grant role2 read/write permissions on data with the 'datacenter=west' tag set.
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
"permissions": ["read", "write"],
"roles": [{"name": "role2"}]
}'
```
### Grant access to specific series in a measurement
The following examples grant read and write permissions to corresponding `datacenter`
tags in the `network` measurement.
_They each specify the measurement in the request body._
##### Grant series-level permissions in a measurement to users
```sh
# Grant user1 read/write permissions on data with the 'datacenter=east' tag set
# inside the 'network' measurement.
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
"permissions": ["read", "write"],
"users": [{"name": "user1"}]
}'
# Grant user2 read/write permissions on data with the 'datacenter=west' tag set
# inside the 'network' measurement.
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
"permissions": ["read", "write"],
"users": [{"name": "user2"}]
}'
```
##### Grant series-level permissions in a measurement to roles
```sh
# Grant role1 read/write permissions on data with the 'datacenter=east' tag set
# inside the 'network' measurement.
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
"permissions": ["read", "write"],
"roles": [{"name": "role1"}]
}'
# Grant role2 read/write permissions on data with the 'datacenter=west' tag set
# inside the 'network' measurement.
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
-u "admin-username:admin-password" \
-H "Content-Type: application/json" \
--data-binary '{
"database": {"match": "exact", "value": "my_database"},
"measurement": {"match": "exact", "value": "network"},
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
"permissions": ["read", "write"],
"roles": [{"name": "role2"}]
}'
```
Grants for specific series also apply to [meta queries](/enterprise_influxdb/v1.10/query_language/schema_exploration).
Results from meta queries are restricted based on series-level permissions.
For example, `SHOW TAG VALUES` only returns tag values that the user is authorized to see.
With these grants in place, a user or role can only read or write data from or to
the `network` measurement if the data includes the appropriate `datacenter` tag set.
{{% note %}}
Note that the grant only requires that tag to be present;
a point tagged `datacenter=east,foo=bar` is also accepted.
{{% /note %}}
### View existing grants
To view existing grants, use the `GET` request method with the `acl/grants` endpoint.
```sh
curl --location-trusted -u "admin-username:admin-password" \
-XGET "http://localhost:8091/influxdb/v2/acl/grants" | jq
```
### Update a grant
_You cannot directly modify a grant.
Delete the existing grant and create a new one with updated parameters._
### Remove a grant
To delete a grant, obtain the grant ID using the `GET` request method with the
`acl/grants` endpoint.
Use the `DELETE` request method to delete a grant by ID.
```sh
# Obtain the grant ID from the list of grants
curl --location-trusted -u "admin-username:admin-password" \
-XGET "http://localhost:8091/influxdb/v2/acl/grants" | jq
# Delete the grant using the grant ID
curl --location-trusted -u "admin-username:admin-password" \
-XDELETE "http://localhost:8091/influxdb/v2/acl/grants/<grant_id>"
```

@ -0,0 +1,84 @@
---
title: Introduction to authorization in InfluxDB Enterprise
description: >
Learn the basics of managing users and permissions in InfluxDB Enterprise.
menu:
enterprise_influxdb_1_10:
name: Introduction to authorization
parent: Manage users and permissions
weight: 30
related:
- /enterprise_influxdb/v1.10/guides/fine-grained-authorization/
- /{{< latest "chronograf" >}}/administration/managing-influxdb-users/
---
Authorization in InfluxDB Enterprise refers to managing user permissions.
To secure and manage access to an InfluxDB Enterprise cluster,
first [configure authentication](/enterprise_influxdb/v1.10/administration/configure/security/authentication/).
You can then manage users and permissions as necessary.
This page is meant to help new users choose the best method
for managing permissions in InfluxDB Enterprise.
## Permissions in InfluxDB Enterprise
InfluxDB Enterprise has an [expanded set of 16 permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#permissions).
These permissions allow for
controlling read and write access to data for all databases and for individual databases,
as well as permitting certain cluster-management actions like creating or deleting resources.
InfluxDB 1.x OSS only supports database-level privileges: `READ` and `WRITE`.
A third permission, `ALL`, grants admin privileges.
These three permissions exist in InfluxDB Enterprise as well.
They can _only be granted by using InfluxQL_.
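For example, granting the basic database-level privileges with InfluxQL looks like this (the database and user names are placeholders):

```sql
GRANT READ ON "example_db" TO "example_user"
GRANT WRITE ON "example_db" TO "example_user"
GRANT ALL PRIVILEGES TO "example_user"
```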
## Manage user authorization
Choose one of the following methods to manage authorization in InfluxDB Enterprise:
- using [InfluxQL](#manage-read-and-write-privileges-with-influxql)
{{% note %}}
InfluxQL can only grant `READ`, `WRITE`, and `ALL PRIVILEGES` privileges.
To use the full set of InfluxDB Enterprise [permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/),
use [Chronograf](#manage-enterprise-permissions-with-chronograf)
or the [Meta API (recommended)](#manage-enterprise-permissions-with-the-meta-api).
{{% /note %}}
- using [Chronograf](#manage-enterprise-permissions-with-chronograf)
- using the [InfluxDB Enterprise meta API](#manage-enterprise-permissions-with-the-meta-api) (**Recommended**)
### Manage read and write privileges with InfluxQL
If you only need to manage basic `READ`, `WRITE`, and `ALL` privileges,
use InfluxQL to manage authorizations.
(For instance, if you upgraded from InfluxDB OSS 1.x
and do not need the more detailed authorization in InfluxDB Enterprise, continue to use InfluxQL.)
{{% warn %}}
We recommend operators *do not* mix and match InfluxQL
with other authorization management methods (Chronograf and the API).
Doing so may lead to inconsistencies in user permissions.
{{% /warn %}}
### Manage Enterprise permissions with Chronograf
The Chronograf user interface can manage the
[full set of InfluxDB Enterprise permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#permissions).
The permissions listed in Chronograf are global for the cluster, and available through the API.
Outside of [FGA](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/fine-grained-authorization),
the only database-level permissions available are the basic `READ` and `WRITE`.
These can only be managed using [InfluxQL](#manage-read-and-write-privileges-with-influxql).
Chronograf can only set permissions globally, for all databases, within a cluster.
If you need to set permissions at the database level, use the [Meta API](#manage-enterprise-permissions-with-the-meta-api).
See ["Manage InfluxDB users in Chronograf"](/chronograf/v1.10/administration/managing-influxdb-users/)
for instructions.
### Manage Enterprise permissions with the Meta API
The InfluxDB Enterprise meta API is the recommended method for managing permissions.
Use the API to set both cluster-wide and database-specific permissions.
For more information,
see [Manage authorization with the InfluxDB Enterprise meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api/).
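As a quick illustration, the following sketch grants the `ReadData` permission on a single database to an existing user through the meta node `/user` endpoint (the host, credentials, user, and database names are placeholders; see the linked documentation for the authoritative request format):

```sh
curl -s --location-trusted -XPOST "http://localhost:8091/user" \
  -u "admin-username:admin-password" \
  -H "Content-Type: application/json" \
  --data-binary '{
    "action": "add-permissions",
    "user": {
      "name": "example-user",
      "permissions": {
        "example_db": ["ReadData"]
      }
    }
  }'
```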

@ -0,0 +1,154 @@
---
title: Enterprise users and permissions reference
description: >
Detailed reference for users, roles, permissions, and permission-to-statement mappings.
menu:
enterprise_influxdb_1_10:
parent: Manage users and permissions
weight: 100
aliases:
- /enterprise_influxdb/v1.10/features/users/
---
{{% enterprise-warning-authn-b4-authz %}}
- [Users](#users)
- [Permissions](#permissions)
## Users
Users have permissions and roles.
### Roles
Roles are groups of permissions.
A single role can belong to several users.
InfluxDB Enterprise clusters have two built-in roles:
#### Global Admin
The Global Admin role has all 16 [cluster permissions](#permissions).
#### Admin
The Admin role has all [cluster permissions](#permissions) except for the
permissions to:
* Add/Remove Nodes
* Copy Shard
* Manage Shards
* Rebalance
## Permissions
A **permission** (also *privilege*) is the ability to access a resource in some way, including:
- viewing the resource
- copying the resource
- dropping the resource
- writing to the resource
- full management capabilities
InfluxDB Enterprise clusters have 16 permissions:
| Permission | Description | Token |
|:--------------------------|---------------------------------------------------------|------------------------|
| View Admin | Permission to view or edit admin screens | `ViewAdmin` |
| View Chronograf | Permission to use Chronograf tools | `ViewChronograf` |
| Create Databases | Permission to create databases | `CreateDatabase` |
| Create Users & Roles | Permission to create users and roles | `CreateUserAndRole` |
| Add/Remove Nodes | Permission to add/remove nodes from a cluster | `AddRemoveNode` |
| Drop Databases | Permission to drop databases | `DropDatabase` |
| Drop Data | Permission to drop measurements and series | `DropData` |
| Read | Permission to read data | `ReadData` |
| Write | Permission to write data | `WriteData` |
| Rebalance | Permission to rebalance a cluster | `Rebalance` |
| Manage Shards | Permission to copy and delete shards | `ManageShard` |
| Manage Continuous Queries | Permission to create, show, and drop continuous queries  | `ManageContinuousQuery` |
| Manage Queries | Permission to show and kill queries | `ManageQuery` |
| Manage Subscriptions | Permission to show, add, and drop subscriptions | `ManageSubscription` |
| Monitor | Permission to show stats and diagnostics | `Monitor` |
| Copy Shard | Permission to copy shards | `CopyShard` |
In addition, two tokens govern Kapacitor permissions:
* `KapacitorAPI`:
Grants the user permission to create, read, update and delete
tasks, topics, handlers and similar Kapacitor artifacts.
* `KapacitorConfigAPI`:
Grants the user permission to override the Kapacitor configuration
dynamically using the configuration endpoint.
### Permissions scope
Using the InfluxDB Enterprise Meta API,
these permissions can be set at the cluster-wide level (for all databases at once)
and for specific databases.
For examples, see [Manage authorization with the InfluxDB Enterprise Meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api/).
### Permission to Statement
The following table describes permissions required to execute the associated database statement.
<!-- It also describes whether these permissions apply just to InfluxDB (Database) or InfluxDB Enterprise (Cluster). -->
| Permission | Statement |
|----------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CreateDatabasePermission | AlterRetentionPolicyStatement, CreateDatabaseStatement, CreateRetentionPolicyStatement, ShowRetentionPoliciesStatement |
| ManageContinuousQueryPermission | CreateContinuousQueryStatement, DropContinuousQueryStatement, ShowContinuousQueriesStatement |
| ManageSubscriptionPermission | CreateSubscriptionStatement, DropSubscriptionStatement, ShowSubscriptionsStatement |
| CreateUserAndRolePermission | CreateUserStatement, DropUserStatement, GrantAdminStatement, GrantStatement, RevokeAdminStatement, RevokeStatement, SetPasswordUserStatement, ShowGrantsForUserStatement, ShowUsersStatement |
| DropDataPermission | DeleteSeriesStatement, DeleteStatement, DropMeasurementStatement, DropSeriesStatement |
| DropDatabasePermission | DropDatabaseStatement, DropRetentionPolicyStatement |
| ManageShardPermission | DropShardStatement,ShowShardGroupsStatement, ShowShardsStatement |
| ManageQueryPermission | KillQueryStatement, ShowQueriesStatement |
| MonitorPermission | ShowDiagnosticsStatement, ShowStatsStatement |
| ReadDataPermission | ShowFieldKeysStatement, ShowMeasurementsStatement, ShowSeriesStatement, ShowTagKeysStatement, ShowTagValuesStatement, ShowRetentionPoliciesStatement |
| NoPermissions | ShowDatabasesStatement |
| Determined by type of select statement | SelectStatement |
### Statement to Permission
The following table describes database statements and the permissions required to execute them.
It also describes whether these permissions apply at the database or cluster level.
| Statement | Permissions | Scope | |
|--------------------------------|----------------------------------------|----------|--------------------------------------------------------------------------|
| AlterRetentionPolicyStatement | CreateDatabasePermission | Database | |
| CreateContinuousQueryStatement | ManageContinuousQueryPermission | Database | |
| CreateDatabaseStatement | CreateDatabasePermission | Cluster | |
| CreateRetentionPolicyStatement | CreateDatabasePermission | Database | |
| CreateSubscriptionStatement | ManageSubscriptionPermission | Database | |
| CreateUserStatement | CreateUserAndRolePermission | Database | |
| DeleteSeriesStatement | DropDataPermission | Database | |
| DeleteStatement | DropDataPermission | Database | |
| DropContinuousQueryStatement | ManageContinuousQueryPermission | Database | |
| DropDatabaseStatement | DropDatabasePermission | Cluster | |
| DropMeasurementStatement | DropDataPermission | Database | |
| DropRetentionPolicyStatement | DropDatabasePermission | Database | |
| DropSeriesStatement | DropDataPermission | Database | |
| DropShardStatement | ManageShardPermission | Cluster | |
| DropSubscriptionStatement | ManageSubscriptionPermission | Database | |
| DropUserStatement | CreateUserAndRolePermission | Database | |
| GrantAdminStatement | CreateUserAndRolePermission | Database | |
| GrantStatement | CreateUserAndRolePermission | Database | |
| KillQueryStatement | ManageQueryPermission | Database | |
| RevokeAdminStatement | CreateUserAndRolePermission | Database | |
| RevokeStatement | CreateUserAndRolePermission | Database | |
| SelectStatement | Determined by type of select statement | n/a | |
| SetPasswordUserStatement | CreateUserAndRolePermission | Database | |
| ShowContinuousQueriesStatement | ManageContinuousQueryPermission | Database | |
| ShowDatabasesStatement | NoPermissions | Cluster | The user's grants determine which databases are returned in the results. |
| ShowDiagnosticsStatement | MonitorPermission | Database | |
| ShowFieldKeysStatement | ReadDataPermission | Database | |
| ShowGrantsForUserStatement | CreateUserAndRolePermission | Database | |
| ShowMeasurementsStatement | ReadDataPermission | Database | |
| ShowQueriesStatement | ManageQueryPermission | Database | |
| ShowRetentionPoliciesStatement | CreateDatabasePermission | Database | |
| ShowSeriesStatement | ReadDataPermission | Database | |
| ShowShardGroupsStatement | ManageShardPermission | Cluster | |
| ShowShardsStatement | ManageShardPermission | Cluster | |
| ShowStatsStatement | MonitorPermission | Database | |
| ShowSubscriptionsStatement | ManageSubscriptionPermission | Database | |
| ShowTagKeysStatement | ReadDataPermission | Database | |
| ShowTagValuesStatement | ReadDataPermission | Database | |
| ShowUsersStatement | CreateUserAndRolePermission | Database | |

@ -0,0 +1,30 @@
---
title: Monitor InfluxDB Enterprise
description: Monitor InfluxDB Enterprise with InfluxDB Cloud or OSS.
menu:
enterprise_influxdb_1_10:
name: Monitor
parent: Administration
weight: 50
---
Monitoring is the act of observing changes in data over time.
There are multiple ways to monitor your InfluxDB Enterprise cluster.
See the guides below to monitor a cluster using another InfluxDB instance.
Alternatively, to view your output data occasionally (_e.g._, for auditing or diagnostics),
do one of the following:
- [Log and trace InfluxDB Enterprise operations](/enterprise_influxdb/v1.10/administration/monitor/logs/)
- [Use InfluxQL for diagnostics](/enterprise_influxdb/v1.10/administration/monitor/diagnostics/)
{{% note %}}
### Monitor with InfluxDB Aware and Influx Insights
InfluxDB Aware and Influx Insights are free Enterprise services that send your data to a free InfluxDB Cloud account.
Aware helps you monitor your data yourself.
Insights helps you monitor your data with the help of the support team.
To apply for this service, please contact the [support team](https://support.influxdata.com/s/login/).
{{% /note %}}
{{< children >}}

@ -0,0 +1,27 @@
---
title: Use InfluxQL for diagnostics
description: Use InfluxQL commands for diagnostics and statistics.
menu:
enterprise_influxdb_1_10:
name: Diagnostics
parent: Monitor
weight: 104
---
The commands below are useful when diagnosing issues with InfluxDB Enterprise clusters.
Use the [`influx` CLI](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/) to run these commands.
### SHOW STATS
To see node statistics, run `SHOW STATS`.
The statistics returned by `SHOW STATS` are stored in memory only,
and are reset to zero when the node is restarted.
For details on this command, see [`SHOW STATS`](/enterprise_influxdb/v1.10/query_language/spec#show-stats).
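For example, run `SHOW STATS` in the `influx` shell to see all statistics for the connected node, or add the optional `FOR '<component>'` clause to limit output to a single component, such as `'indexes'` for an estimate of index memory use:

```sql
SHOW STATS
SHOW STATS FOR 'indexes'
```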
### SHOW DIAGNOSTICS
To see node diagnostic information, run `SHOW DIAGNOSTICS`.
This returns information such as build information, uptime, hostname, server configuration, memory usage, and Go runtime diagnostics.
For details on this command, see [`SHOW DIAGNOSTICS`](/enterprise_influxdb/v1.10/query_language/spec#show-diagnostics).

@ -0,0 +1,308 @@
---
title: Log and trace InfluxDB Enterprise operations
description: >
Learn about logging locations, redirecting HTTP request logging, structured logging, and tracing.
menu:
enterprise_influxdb_1_10:
name: Log and trace
parent: Monitor
weight: 103
aliases:
- /enterprise_influxdb/v1.10/administration/logs/
---
* [Logging locations](#logging-locations)
* [Redirect HTTP request logging](#redirect-http-access-logging)
* [Structured logging](#structured-logging)
* [Tracing](#tracing)
InfluxDB writes log output, by default, to `stderr`.
Depending on your use case, this log information can be written to another location.
Some service managers may override this default.
## Logging locations
### Run InfluxDB directly
If you run InfluxDB directly, using `influxd`, all logs will be written to `stderr`.
You may redirect this log output as you would any output to `stderr` like so:
```bash
influxdb-meta 2>$HOME/my_log_file # Meta nodes
influxd 2>$HOME/my_log_file # Data nodes
influx-enterprise 2>$HOME/my_log_file # Enterprise Web
```
### Launched as a service
#### sysvinit
If InfluxDB was installed using a pre-built package, and then launched
as a service, `stderr` is redirected to
`/var/log/influxdb/<node-type>.log`, and all log data will be written to
that file. You can override this location by setting the variable
`STDERR` in the file `/etc/default/<node-type>`.
For example, if on a data node `/etc/default/influxdb` contains:
```bash
STDERR=/dev/null
```
all log data will be discarded. You can similarly direct output to
`stdout` by setting `STDOUT` in the same file. Output to `stdout` is
sent to `/dev/null` by default when InfluxDB is launched as a service.
InfluxDB must be restarted to pick up any changes to `/etc/default/<node-type>`.
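For example, a data node's `/etc/default/influxdb` might redirect both streams to files (the paths below are illustrative only):

```bash
# /etc/default/influxdb (illustrative values only)
STDERR=/var/log/influxdb/influxdb.log
STDOUT=/var/log/influxdb/influxdb-stdout.log
```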
##### Meta nodes
For meta nodes, the `<node-type>` is `influxdb-meta`.
The default log file is `/var/log/influxdb/influxdb-meta.log`.
The service configuration file is `/etc/default/influxdb-meta`.
##### Data nodes
For data nodes, the `<node-type>` is `influxdb`.
The default log file is `/var/log/influxdb/influxdb.log`.
The service configuration file is `/etc/default/influxdb`.
##### Enterprise Web
For Enterprise Web nodes, the `<node-type>` is `influx-enterprise`.
The default log file is `/var/log/influxdb/influx-enterprise.log`.
The service configuration file is `/etc/default/influx-enterprise`.
#### systemd
Starting with version 1.0, InfluxDB on systemd systems no longer
writes files to `/var/log/<node-type>.log` by default, and now uses the
system configured default for logging (usually `journald`). On most
systems, the logs will be directed to the systemd journal and can be
accessed with the command:
```
sudo journalctl -u <node-type>.service
```
Please consult the systemd journald documentation for configuring
journald.
##### Meta nodes
For meta nodes, the `<node-type>` is `influxdb-meta`.
The default log command is `sudo journalctl -u influxdb-meta.service`.
The service configuration file is `/etc/default/influxdb-meta`.
##### Data nodes
For data nodes, the `<node-type>` is `influxdb`.
The default log command is `sudo journalctl -u influxdb.service`.
The service configuration file is `/etc/default/influxdb`.
##### Enterprise Web
For Enterprise Web nodes, the `<node-type>` is `influx-enterprise`.
The default log command is `sudo journalctl -u influx-enterprise.service`.
The service configuration file is `/etc/default/influx-enterprise`.
### Use logrotate
You can use [logrotate](https://manpages.ubuntu.com/manpages/jammy/en/man8/logrotate.8.html)
to rotate the log files generated by InfluxDB on systems where logs are written to flat files.
If using the package install on a sysvinit system, the config file for logrotate is installed in `/etc/logrotate.d`.
You can view the packaged configuration file [on GitHub](https://github.com/influxdb/influxdb/blob/master/scripts/logrotate).
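The packaged configuration is the authoritative reference; as a rough sketch, a logrotate stanza for a data node log has this shape (the rotation frequency and retention shown are illustrative, not the packaged defaults):

```
/var/log/influxdb/influxdb.log {
    daily
    rotate 7
    missingok
    notifempty
    compress
    copytruncate
}
```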
## Redirect HTTP access logging
InfluxDB 1.5 introduces the option to log HTTP request traffic separately from the other InfluxDB log output. When HTTP request logging is enabled, the HTTP logs are intermingled by default with internal InfluxDB logging. By redirecting the HTTP request log entries to a separate file, both log files are easier to read, monitor, and debug.
See [Redirecting HTTP request logging](/enterprise_influxdb/v1.10/administration/logs/#redirecting-http-access-logging) in the InfluxDB OSS documentation.
## Structured logging
InfluxDB 1.5 introduces structured logging, which enables machine-readable and more developer-friendly log output formats. The two structured log formats, `logfmt` and `json`, provide easier filtering and searching with external tools and simplify integration of InfluxDB logs with Splunk, Papertrail, Elasticsearch, and other third-party tools.
See [Structured logging](/enterprise_influxdb/v1.10/administration/logs/#structured-logging) in the InfluxDB OSS documentation.
## Tracing
Logging has been enhanced to provide tracing of important InfluxDB operations.
Tracing is useful for error reporting and discovering performance bottlenecks.
### Logging keys used in tracing
#### Tracing identifier key
The `trace_id` key specifies a unique identifier for a specific instance of a trace.
You can use this key to filter and correlate all related log entries for an operation.
All operation traces include consistent starting and ending log entries, with the same message (`msg`) describing the operation (e.g., "TSM compaction"), but adding the appropriate `op_event` context (either `start` or `end`).
For an example, see [Finding all trace log entries for an InfluxDB operation](#finding-all-trace-log-entries-for-an-influxdb-operation).
**Example:** `trace_id=06R0P94G000`
#### Operation keys
The following operation keys identify an operation's name, the start and end timestamps, and the elapsed execution time.
##### `op_name`
Unique identifier for an operation.
You can filter on all operations of a specific name.
**Example:** `op_name=tsm1_compact_group`
##### `op_event`
Specifies the start and end of an event.
The two possible values, `start` and `end`, indicate when an operation started or ended.
For example, you can grep by values in `op_name` AND `op_event` to find all starting operation log entries.
For an example of this, see [Finding all starting log entries](#finding-all-starting-operation-log-entries).
**Example:** `op_event=start`
##### `op_elapsed`
Duration of the operation execution.
Logged with the ending trace log entry.
Valid duration units are `ns`, `µs`, `ms`, and `s`.
**Example:** `op_elapsed=352ms`
#### Log identifier context key
The log identifier key (`log_id`) lets you easily identify _every_ log entry for a single execution of an `influxd` process.
There are other ways a log file could be split by a single execution, but the consistent `log_id` eases the searching of log aggregation services.
**Example:** `log_id=06QknqtW000`
#### Database context keys
- **db\_instance**: Database name
- **db\_rp**: Retention policy name
- **db\_shard\_id**: Shard identifier
- **db\_shard\_group**: Shard group identifier
### Tooling
Here are a couple of popular tools available for processing and filtering log files output in `logfmt` or `json` formats.
#### hutils
The [hutils](https://blog.heroku.com/hutils-explore-your-structured-data-logs) utility collection, provided by Heroku, provides tools for working with `logfmt`-encoded logs, including:
- **lcut**: Extracts values from a `logfmt` trace based on a specified field name.
- **lfmt**: Prettifies `logfmt` lines as they emerge from a stream, and highlights their key sections.
- **ltap**: Accesses messages from log providers in a consistent way to allow easy parsing by other utilities that operate on `logfmt` traces.
- **lviz**: Visualizes `logfmt` output by building a tree out of a dataset combining common sets of key-value pairs into shared parent nodes.
#### lnav (Log File Navigator)
[lnav (Log File Navigator)](http://lnav.org) is an advanced log file viewer useful for watching and analyzing your log files from a terminal.
The lnav viewer provides a single log view, automatic log format detection, filtering, timeline view, pretty-print view, and querying logs using SQL.
### Operations
The following operations, listed by their operation name (`op_name`) are traced in InfluxDB internal logs and available for use without changes in logging level.
#### Initial opening of data files
The `tsdb_open` operation traces include all events related to the initial opening of the `tsdb_store`.
#### Retention policy shard deletions
The `retention.delete_check` operation includes all shard deletions related to the retention policy.
#### TSM snapshotting in-memory cache to disk
The `tsm1_cache_snapshot` operation represents the snapshotting of the TSM in-memory cache to disk.
#### TSM compaction strategies
The `tsm1_compact_group` operation includes all trace log entries related to TSM compaction strategies and displays the related TSM compaction strategy keys:
- **tsm1\_strategy**: level or full
- **tsm1\_level**: 1, 2, or 3
- **tsm\_optimize**: true or false
#### Series file compactions
The `series_partition_compaction` operation includes all trace log entries related to series file compactions.
#### Continuous query execution (if logging enabled)
The `continuous_querier_execute` operation includes all continuous query executions, if logging is enabled.
#### TSI log file compaction
The `tsi1_compact_log_file` operation includes all trace log entries related to log file compactions.
#### TSI level compaction
The `tsi1_compact_to_level` operation includes all trace log entries for TSI level compactions.
### Tracing examples
#### Finding all trace log entries for an InfluxDB operation
In the example below, you can see the log entries for all trace operations related to a "TSM compaction" process.
Note that the initial entry shows the message "TSM compaction (start)" and the final entry displays the message "TSM compaction (end)".
{{% note %}}
Log entries were grepped using the `trace_id` value and then the specified key values were displayed using `lcut` (an `hutils` tool).
{{% /note %}}
```
$ grep "06QW92x0000" influxd.log | lcut ts lvl msg strategy level
2018-02-21T20:18:56.880065Z info TSM compaction (start) full
2018-02-21T20:18:56.880162Z info Beginning compaction full
2018-02-21T20:18:56.880185Z info Compacting file full
2018-02-21T20:18:56.880211Z info Compacting file full
2018-02-21T20:18:56.880226Z info Compacting file full
2018-02-21T20:18:56.880254Z info Compacting file full
2018-02-21T20:19:03.928640Z info Compacted file full
2018-02-21T20:19:03.928687Z info Finished compacting files full
2018-02-21T20:19:03.928707Z info TSM compaction (end) full
```
#### Finding all starting operation log entries
To find all starting operation log entries, you can grep by values in `op_name` AND `op_event`.
In the following example, the grep returned 101 entries, so the result below only displays the first entry.
In the example result entry, the timestamp, level, strategy, trace_id, op_name, and op_event values are included.
```
$ grep -F 'op_name=tsm1_compact_group' influxd.log | grep -F 'op_event=start'
ts=2018-02-21T20:16:16.709953Z lvl=info msg="TSM compaction" log_id=06QVNNCG000 engine=tsm1 level=1 strategy=level trace_id=06QV~HHG000 op_name=tsm1_compact_group op_event=start
...
```
The following command builds on the previous `grep` command, but pipes the results through `lcut` (an hutils utility) to display only the keys whose values differ across the entries.
The result includes 19 unique log entries, each displaying the selected keys: `ts`, `strategy`, `level`, and `trace_id`.
```
$ grep -F 'op_name=tsm1_compact_group' influxd.log | grep -F 'op_event=start' | lcut ts strategy level trace_id | sort -u
2018-02-21T20:16:16.709953Z level 1 06QV~HHG000
2018-02-21T20:16:40.707452Z level 1 06QW0k0l000
2018-02-21T20:17:04.711519Z level 1 06QW2Cml000
2018-02-21T20:17:05.708227Z level 2 06QW2Gg0000
2018-02-21T20:17:29.707245Z level 1 06QW3jQl000
2018-02-21T20:17:53.711948Z level 1 06QW5CBl000
2018-02-21T20:18:17.711688Z level 1 06QW6ewl000
2018-02-21T20:18:56.880065Z full 06QW92x0000
2018-02-21T20:20:46.202368Z level 3 06QWFizW000
2018-02-21T20:21:25.292557Z level 1 06QWI6g0000
2018-02-21T20:21:49.294272Z level 1 06QWJ_RW000
2018-02-21T20:22:13.292489Z level 1 06QWL2B0000
2018-02-21T20:22:37.292431Z level 1 06QWMVw0000
2018-02-21T20:22:38.293320Z level 2 06QWMZqG000
2018-02-21T20:23:01.293690Z level 1 06QWNygG000
2018-02-21T20:23:25.292956Z level 1 06QWPRR0000
2018-02-21T20:24:33.291664Z full 06QWTa2l000
2018-02-21T21:12:08.017055Z full 06QZBpKG000
2018-02-21T21:12:08.478200Z full 06QZBr7W000
```

@ -0,0 +1,185 @@
---
title: Monitor InfluxDB Enterprise with InfluxDB Cloud
description: >
Monitor your InfluxDB Enterprise instance using InfluxDB Cloud and
a pre-built InfluxDB template.
menu:
enterprise_influxdb_1_10:
name: Monitor with Cloud
parent: Monitor
weight: 100
aliases:
- /enterprise_influxdb/v1.10/administration/monitor-enterprise/monitor-with-cloud/
---
Use [InfluxDB Cloud](/influxdb/cloud/), the [InfluxDB Enterprise 1.x Template](https://github.com/influxdata/community-templates/tree/master/influxdb-enterprise-1x), and Telegraf to monitor one or more InfluxDB Enterprise instances.
Do the following:
1. [Review requirements](#review-requirements)
2. [Install the InfluxDB Enterprise Monitoring template](#install-the-influxdb-enterprise-monitoring-template)
3. [Set up InfluxDB Enterprise for monitoring](#set-up-influxdb-enterprise-for-monitoring)
4. [Set up Telegraf](#set-up-telegraf)
5. [View the Monitoring dashboard](#view-the-monitoring-dashboard)
6. (Optional) [Alert when metrics stop reporting](#alert-when-metrics-stop-reporting)
7. (Optional) [Create a notification endpoint and rule](#create-a-notification-endpoint-and-rule)
8. (Optional) [Monitor with InfluxDB Insights and Aware](#monitor-with-influxdb-insights-and-aware)
## Review requirements
Before you begin, make sure you have access to the following:
- An InfluxDB Cloud account. ([Sign up for free here](https://cloud2.influxdata.com/signup)).
- Command line access to a machine [running InfluxDB Enterprise 1.x](/enterprise_influxdb/v1.10/introduction/install-and-deploy/) and permissions to install Telegraf on this machine.
- Internet connectivity from the machine running InfluxDB Enterprise 1.x and Telegraf to InfluxDB Cloud.
- Sufficient resource availability to install the template. (InfluxDB Cloud Free Plan accounts include a finite number of [available resources](/influxdb/cloud/account-management/limits/#free-plan-limits).)
## Install the InfluxDB Enterprise Monitoring template
The InfluxDB Enterprise Monitoring template includes a Telegraf configuration that sends InfluxDB Enterprise metrics to an InfluxDB endpoint, and a dashboard that visualizes the metrics.
1. [Log into your InfluxDB Cloud account](https://cloud2.influxdata.com/), go to **Settings > Templates**, and enter the following template URL:
```
https://raw.githubusercontent.com/influxdata/community-templates/master/influxdb-enterprise-1x/enterprise.yml
```
2. Click **Lookup Template**, and then click **Install Template**. InfluxDB Cloud imports the template, which includes the following resources:
- Telegraf Configuration `monitoring-enterprise-1x`
- Dashboard `InfluxDB 1.x Enterprise`
- Label `enterprise`
- Variables `influxdb_host` and `bucket`
## Set up InfluxDB Enterprise for monitoring
By default, InfluxDB Enterprise 1.x has a `/metrics` endpoint available, which exports Prometheus-style system metrics.
1. Make sure the `/metrics` endpoint is [enabled](/{{< latest "influxdb" >}}/reference/config-options/#metrics-disabled). If you've changed the default settings to disable the `/metrics` endpoint, [re-enable these settings](/{{< latest "influxdb" >}}/reference/config-options/#metrics-disabled).
2. Navigate to the `/metrics` endpoint of your InfluxDB Enterprise instance to view the InfluxDB Enterprise system metrics in your browser:
```
http://localhost:8086/metrics
```
Or use `curl` to fetch metrics:
```sh
curl http://localhost:8086/metrics
# HELP boltdb_reads_total Total number of boltdb reads
# TYPE boltdb_reads_total counter
boltdb_reads_total 41
# HELP boltdb_writes_total Total number of boltdb writes
# TYPE boltdb_writes_total counter
boltdb_writes_total 28
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
...
```
3. Add your **InfluxDB Cloud** account information (URL and organization) to your Telegraf configuration by doing the following:
1. Go to **Load Data > Telegraf** [in your InfluxDB Cloud account](https://cloud2.influxdata.com/), and click **InfluxDB Output Plugin** at the top-right corner.
2. Copy the `urls`, `token`, `organization`, and `bucket` and close the window.
3. Click **monitoring-enterprise-1.x**.
4. Replace `urls`, `token`, `organization`, and `bucket` under `outputs.influxdb_v2` with your InfluxDB Cloud account information. Alternatively, store this information in your environment variables and include the environment variables in your configuration.
{{% note %}}
To ensure the InfluxDB Enterprise monitoring dashboard can display the recorded metrics, set the destination bucket name to `enterprise_metrics` in your `telegraf.conf`.
{{% /note %}}
5. Add the [Prometheus input plugin](https://github.com/influxdata/telegraf/blob/release-1.19/plugins/inputs/prometheus/README.md) to your `telegraf.conf`. Specify your InfluxDB Enterprise URL(s) in the `urls` parameter. For example:
{{< keep-url >}}
```toml
[[inputs.prometheus]]
urls = ["http://localhost:8086/metrics"]
username = "$INFLUX_USER"
password = "$INFLUX_PASSWORD"
```
If you're using unique URLs or have authentication set up for your `/metrics` endpoint, configure those options here and save the updated configuration.
For more information about customizing Telegraf, see [Configure Telegraf](/{{< latest "telegraf" >}}/administration/configuration/#global-tags).
4. Click **Save Changes**.
## Set up Telegraf
Set up Telegraf to scrape metrics from InfluxDB Enterprise to send to your InfluxDB Cloud account.
On each InfluxDB Enterprise instance you want to monitor, do the following:
1. Go to **Load Data > Telegraf** [in your InfluxDB Cloud account](https://cloud2.influxdata.com/).
2. Click **Setup Instructions** under **monitoring-enterprise-1.x**.
3. Complete the Telegraf Setup instructions. If you are using environment variables, set them up now.
{{% note %}}
For your API token, generate a new token or use an existing All Access token. If you run Telegraf as a service, edit your init script to set the environment variable and ensure that it's available to the service.
{{% /note %}}
Telegraf runs quietly in the background (no immediate output appears) and begins pushing metrics to your InfluxDB Cloud account.
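In step 3, the setup instructions typically come down to exporting your API token and starting Telegraf with the hosted configuration, along the lines of the following sketch (the configuration URL shown is illustrative; use the one from your own setup instructions):
```sh
export INFLUX_TOKEN=<your-influxdb-cloud-api-token>
telegraf --config https://us-west-2-1.aws.cloud2.influxdata.com/api/v2/telegrafs/<telegraf-config-id>
```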
## View the Monitoring dashboard
To see your data in real time, view the Monitoring dashboard.
1. Select **Boards** (**Dashboards**) in your **InfluxDB Cloud** account.
{{< nav-icon "dashboards" >}}
2. Click **InfluxDB Enterprise Metrics**. Metrics appear in your dashboard.
3. Customize your monitoring dashboard as needed. For example, send an alert in the following cases:
- Users create a new task or bucket
- You're testing machine limits
- [Metrics stop reporting](#alert-when-metrics-stop-reporting)
## Alert when metrics stop reporting
The Monitoring template includes a [deadman check](/influxdb/cloud/monitor-alert/checks/create/#deadman-check) to verify metrics are reported at regular intervals.
To alert when data stops flowing from your InfluxDB Enterprise instances to your InfluxDB Cloud account, do the following:
1. [Customize the deadman check](#customize-the-deadman-check) to identify the fields you want to monitor.
2. [Create a notification endpoint and rule](#create-a-notification-endpoint-and-rule) to receive notifications when your deadman check is triggered.
### Customize the deadman check
1. To view the deadman check, click **Alerts** in the navigation bar of your **InfluxDB Cloud** account.
{{< nav-icon "alerts" >}}
2. Choose an InfluxDB field or create a new field for your deadman alert:
1. Click **{{< icon "plus" "v2" >}} Create** and select **Deadman Check** in the dropdown menu.
2. Define your query with at least one field.
3. Click **Submit** and **Configure Check**.
When metrics stop reporting, you'll receive an alert.
3. Under **Schedule Every**, set how often to check for data.
4. Set the amount of time to wait before switching to a critical alert.
5. Save the check, and then click **View History** under the gear icon to verify that it's running.
## Create a notification endpoint and rule
To receive a notification message when your deadman check is triggered, create a [notification endpoint](#create-a-notification-endpoint) and [rule](#create-a-notification-rule).
### Create a notification endpoint
InfluxDB Cloud supports different endpoints: Slack, PagerDuty, and HTTP. Slack is free for all users, while PagerDuty and HTTP are exclusive to the Usage-Based Plan.
#### Send a notification to Slack
1. Create a [Slack webhook](https://api.slack.com/messaging/webhooks).
2. Go to **Alerts > Notification Endpoint**, click **{{< icon "plus" "v2" >}} Create**, and then enter a name and description for your Slack endpoint.
3. Enter your Slack Webhook under **Incoming Webhook URL** and click **Create Notification Endpoint**.
#### Send a notification to PagerDuty or HTTP
Send a notification to PagerDuty or HTTP endpoints (other webhooks) by [upgrading your InfluxDB Cloud account](/influxdb/cloud/account-management/billing/#upgrade-to-usage-based-plan).
### Create a notification rule
[Create a notification rule](/influxdb/cloud/monitor-alert/notification-rules/create/) to set rules for when to send a deadman alert message to your notification endpoint.
1. Go to **Alerts > Notification Rules** and click **{{< icon "plus" "v2" >}} Create**.
2. Fill out the **About** and **Conditions** section then click **Create Notification Rule**.
## Monitor with InfluxDB Insights and Aware
For InfluxDB Enterprise customers, InfluxDB Insights and InfluxDB Aware are free services that can monitor your data. With Insights, your data is sent to a private Cloud account and is monitored with the help of the InfluxData support team. Aware is a similar service, but you monitor your data yourself.
To apply for this service, please contact the [InfluxData Support team](mailto:support@influxdata.com).

View File

@ -0,0 +1,181 @@
---
title: Monitor InfluxDB Enterprise with InfluxDB OSS
description: >
Monitor your InfluxDB Enterprise instance using InfluxDB OSS and
a pre-built InfluxDB template.
menu:
enterprise_influxdb_1_10:
name: Monitor with OSS
parent: Monitor
weight: 101
related:
- /platform/monitoring/influxdata-platform/tools/measurements-internal
aliases:
- /enterprise_influxdb/v1.10/administration/monitor-enterprise/monitor-with-oss/
---
Use [InfluxDB OSS](/influxdb/v2.0/), the [InfluxDB Enterprise 1.x Template](https://github.com/influxdata/community-templates/tree/master/influxdb-enterprise-1x), and Telegraf to monitor one or more InfluxDB Enterprise instances.
Do the following:
1. [Review requirements](#review-requirements)
2. [Install the InfluxDB Enterprise Monitoring template](#install-the-influxdb-enterprise-monitoring-template)
3. [Set up InfluxDB Enterprise for monitoring](#set-up-influxdb-enterprise-for-monitoring)
4. [Set up Telegraf](#set-up-telegraf)
5. [View the Monitoring dashboard](#view-the-monitoring-dashboard)
6. (Optional) [Alert when metrics stop reporting](#alert-when-metrics-stop-reporting)
7. (Optional) [Create a notification endpoint and rule](#create-a-notification-endpoint-and-rule)
8. (Optional) [Monitor with InfluxDB Insights and Aware](#monitor-with-influxdb-insights-and-aware)
## Review requirements
Before you begin, make sure you have access to the following:
- A self-hosted InfluxDB OSS 2.x instance ([get started for free](/influxdb/v2.0/get-started/)).
- Command line access to a machine [running InfluxDB Enterprise 1.x](/enterprise_influxdb/v1.10/introduction/install-and-deploy/) and permissions to install Telegraf on this machine.
- Internet connectivity from the machine running InfluxDB Enterprise 1.x and Telegraf to InfluxDB OSS.
- Sufficient resource availability to install the template.
## Install the InfluxDB Enterprise Monitoring template
The InfluxDB Enterprise Monitoring template includes a Telegraf configuration that sends InfluxDB Enterprise metrics to an InfluxDB endpoint and a dashboard that visualizes the metrics.
1. [Log into your InfluxDB OSS UI](http://localhost:8086/signin), go to **Settings > Templates**, and enter the following template URL:
```
https://raw.githubusercontent.com/influxdata/community-templates/master/influxdb-enterprise-1x/enterprise.yml
```
2. Click **Lookup Template**, and then click **Install Template**. InfluxDB OSS imports the template, which includes the following resources:
- Telegraf Configuration `monitoring-enterprise-1x`
- Dashboard `InfluxDB 1.x Enterprise`
- Label `enterprise`
- Variables `influxdb_host` and `bucket`
## Set up InfluxDB Enterprise for monitoring
By default, InfluxDB Enterprise 1.x has a `/metrics` endpoint available, which exports Prometheus-style system metrics.
1. Make sure the `/metrics` endpoint is [enabled](/{{< latest "influxdb" >}}/reference/config-options/#metrics-disabled). If you've changed the default settings to disable the `/metrics` endpoint, [re-enable these settings](/{{< latest "influxdb" >}}/reference/config-options/#metrics-disabled).
2. Navigate to the `/metrics` endpoint of your InfluxDB Enterprise instance to view the InfluxDB Enterprise system metrics in your browser:
```
http://localhost:8086/metrics
```
Or use `curl` to fetch metrics:
```sh
curl http://localhost:8086/metrics
# HELP boltdb_reads_total Total number of boltdb reads
# TYPE boltdb_reads_total counter
boltdb_reads_total 41
# HELP boltdb_writes_total Total number of boltdb writes
# TYPE boltdb_writes_total counter
boltdb_writes_total 28
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
...
```
3. Add your **InfluxDB OSS** account information (URL and organization) to your Telegraf configuration by doing the following:
1. Go to **Load Data > Telegraf** [in your InfluxDB OSS account](http://localhost:8086/), and click **InfluxDB Output Plugin** at the top-right corner.
2. Copy the `urls`, `token`, `organization`, and `bucket` and close the window.
3. Click **monitoring-enterprise-1.x**.
4. Replace `urls`, `token`, `organization`, and `bucket` under `outputs.influxdb_v2` with your InfluxDB OSS account information. Alternatively, store this information in your environment variables and include the environment variables in your configuration.
{{% note %}}
To ensure the InfluxDB Enterprise monitoring dashboard can display the recorded metrics, set the destination bucket name to `enterprise_metrics` in your `telegraf.conf`.
{{% /note %}}
5. Add the [Prometheus input plugin](https://github.com/influxdata/telegraf/blob/release-1.19/plugins/inputs/prometheus/README.md) to your `telegraf.conf`. Specify your InfluxDB Enterprise URL(s) in the `urls` parameter. For example:
{{< keep-url >}}
```toml
[[inputs.prometheus]]
urls = ["http://localhost:8086/metrics"]
username = "$INFLUX_USER"
password = "$INFLUX_PASSWORD"
```
If you're using unique URLs or have security set up for your `/metrics` endpoint, configure those options here and save the updated configuration.
For more information about customizing Telegraf, see [Configure Telegraf](/{{< latest "telegraf" >}}/administration/configuration/#global-tags).
4. Click **Save Changes**.
## Set up Telegraf
Set up Telegraf to scrape metrics from InfluxDB Enterprise to send to your InfluxDB OSS account.
On each InfluxDB Enterprise instance you want to monitor, do the following:
1. Go to **Load Data > Telegraf** [in your InfluxDB OSS account](http://localhost:8086/signin).
2. Click **Setup Instructions** under **monitoring-enterprise-1.x**.
3. Complete the Telegraf Setup instructions. If you are using environment variables, set them up now.
{{% note %}}
For your API token, generate a new token or use an existing All Access token. If you run Telegraf as a service, edit your init script to set the environment variable and ensure that it's available to the service.
{{% /note %}}
Telegraf runs quietly in the background (no immediate output appears) and begins pushing metrics to your InfluxDB OSS account.
## View the Monitoring dashboard
To see your data in real time, view the Monitoring dashboard.
1. Select **Boards** (**Dashboards**) in your **InfluxDB OSS** account.
{{< nav-icon "dashboards" >}}
2. Click **InfluxDB Enterprise Metrics**. Metrics appear in your dashboard.
3. Customize your monitoring dashboard as needed. For example, send an alert in the following cases:
- Users create a new task or bucket
- You're testing machine limits
- [Metrics stop reporting](#alert-when-metrics-stop-reporting)
## Alert when metrics stop reporting
The Monitoring template includes a [deadman check](/influxdb/v2.0/monitor-alert/checks/create/#deadman-check) to verify metrics are reported at regular intervals.
To alert when data stops flowing from your InfluxDB Enterprise instances to your InfluxDB OSS account, do the following:
1. [Customize the deadman check](#customize-the-deadman-check) to identify the fields you want to monitor.
2. [Create a notification endpoint and rule](#create-a-notification-endpoint-and-rule) to receive notifications when your deadman check is triggered.
### Customize the deadman check
1. To view the deadman check, click **Alerts** in the navigation bar of your **InfluxDB OSS** account.
{{< nav-icon "alerts" >}}
2. Choose an InfluxDB field or create a new field for your deadman alert:
1. Click **{{< icon "plus" "v2" >}} Create** and select **Deadman Check** in the dropdown menu.
2. Define your query with at least one field.
3. Click **Submit** and **Configure Check**.
When metrics stop reporting, you'll receive an alert.
3. Under **Schedule Every**, set how often to check for data.
4. Set the amount of time to wait before switching to a critical alert.
5. Save the check, and then click **View History** under the gear icon to verify that it's running.
## Create a notification endpoint and rule
To receive a notification message when your deadman check is triggered, create a [notification endpoint](#create-a-notification-endpoint) and [rule](#create-a-notification-rule).
### Create a notification endpoint
InfluxData supports different endpoints: Slack, PagerDuty, and HTTP. Slack is free for all users, while PagerDuty and HTTP are exclusive to the Usage-Based Plan.
#### Send a notification to Slack
1. Create a [Slack webhook](https://api.slack.com/messaging/webhooks).
2. Go to **Alerts > Notification Endpoint**, click **{{< icon "plus" "v2" >}} Create**, and then enter a name and description for your Slack endpoint.
3. Enter your Slack Webhook under **Incoming Webhook URL** and click **Create Notification Endpoint**.
#### Send a notification to PagerDuty or HTTP
Send a notification to PagerDuty or HTTP endpoints (other webhooks) by [upgrading your InfluxDB OSS account](/influxdb/v2.0/reference/cli/influxd/upgrade/).
### Create a notification rule
[Create a notification rule](/influxdb/v2.0/monitor-alert/notification-rules/create/) to set rules for when to send a deadman alert message to your notification endpoint.
1. Go to **Alerts > Notification Rules** and click **{{< icon "plus" "v2" >}} Create**.
2. Fill out the **About** and **Conditions** section then click **Create Notification Rule**.

View File

@ -0,0 +1,22 @@
---
title: Renew or update a license key or file
description: >
Renew or update a license key or file for your InfluxDB enterprise cluster.
menu:
enterprise_influxdb_1_10:
name: Renew a license
weight: 50
parent: Administration
---
Use this procedure to renew or update an existing license key or file, switch from a license key to a license file, or switch from a license file to a license key.
> **Note:** To request a new license to renew or expand your InfluxDB Enterprise cluster, contact [sales@influxdb.com](mailto:sales@influxdb.com).
To update a license key or file, do the following:
1. If you are switching from a license key to a license file (or vice versa), delete your existing license key or file.
2. **Add the license key or file** to your [meta nodes](/enterprise_influxdb/v1.10/administration/config-meta-nodes/#enterprise-license-settings) and [data nodes](/enterprise_influxdb/v1.10/administration/config-data-nodes/#enterprise-license-settings) configuration settings. For more information, see [how to configure InfluxDB Enterprise clusters](/enterprise_influxdb/v1.10/administration/configuration/).
3. **On each meta node**, run `service influxdb-meta restart`, and wait for the meta node service to come back up successfully before restarting the next meta node.
The cluster should remain unaffected as long as only one node is restarting at a time.
4. **On each data node**, run `killall -s HUP influxd` to signal the `influxd` process to reload its configuration file.
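Taken together, steps 3 and 4 amount to commands along these lines (a sketch for sysvinit-style systems; restart meta nodes one at a time):
```bash
# On each meta node, one at a time:
service influxdb-meta restart

# On each data node, signal influxd to reload its configuration:
killall -s HUP influxd
```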

View File

@ -0,0 +1,29 @@
---
title: Stability and compatibility
description: >
API and storage engine compatibility and stability in InfluxDB OSS.
menu:
enterprise_influxdb_1_10:
weight: 90
parent: Administration
---
## 1.x API compatibility and stability
One of the more important aspects of the 1.0 release is that this marks the stabilization of our API and storage format. Over the course of the last three years we've iterated aggressively, often breaking the API in the process. With the release of 1.0 and for the entire 1.x line of releases we're committing to the following:
### No breaking InfluxDB API changes
When it comes to the InfluxDB API, if a command works in 1.0 it will work unchanged in all 1.x releases...with one caveat. We will be adding [keywords](/enterprise_influxdb/v1.10/query_language/spec/#keywords) to the query language. New keywords won't break your queries if you wrap all [identifiers](/enterprise_influxdb/v1.10/concepts/glossary/#identifier) in double quotes and all string literals in single quotes. This is generally considered best practice so it should be followed anyway. For users following that guideline, the query and ingestion APIs will have no breaking changes for all 1.x releases. Note that this does not include the Go code in the project. The underlying Go API in InfluxDB can and will change over the course of 1.x development. Users should be accessing InfluxDB through the [InfluxDB API](/enterprise_influxdb/v1.10/tools/api/).
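For example, a query written to this guideline wraps identifiers in double quotes and string literals in single quotes (the measurement, field, and tag names below are illustrative):
```sql
SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica'
```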
### Storage engine stability
The [TSM](/enterprise_influxdb/v1.10/concepts/glossary/#tsm-time-structured-merge-tree) storage engine file format is now at version 1. While we may introduce new versions of the format in the 1.x releases, these new versions will run side-by-side with previous versions. What this means for users is there will be no lengthy migrations when upgrading from one 1.x release to another.
### Additive changes
The query engine will have additive changes over the course of the new releases. We'll introduce new query functions and new functionality into the language without breaking backwards compatibility. We may introduce new protocol endpoints (like a binary format) and versions of the line protocol and query API to improve performance and/or functionality, but they will have to run in parallel with the existing versions. Existing versions will be supported for the entirety of the 1.x release line.
### Ongoing support
We'll continue to fix bugs on the 1.x versions of the [line protocol](/enterprise_influxdb/v1.10/concepts/glossary/#influxdb-line-protocol), query API, and TSM storage format. Users should expect to upgrade to the latest 1.x.x release for bug fixes, but those releases will all be compatible with the 1.0 API and won't require data migrations. For instance, if a user is running 1.2 and there are bug fixes released in 1.3, they should upgrade to the 1.3 release. Until 1.4 is released, patch fixes will go into 1.3.x. Because all future 1.x releases are drop-in replacements for previous 1.x releases, users should upgrade to the latest in the 1.x line to get all bug fixes.

View File

@ -0,0 +1,308 @@
---
title: Upgrade InfluxDB Enterprise clusters
description: Upgrade to the latest version of InfluxDB Enterprise.
aliases:
- /enterprise/v1.10/administration/upgrading/
menu:
enterprise_influxdb_1_10:
name: Upgrade
weight: 50
parent: Administration
---
To successfully perform a rolling upgrade of InfluxDB Enterprise clusters to {{< latest-patch >}}, complete the following steps:
1. [Back up your cluster](#back-up-your-cluster).
2. [Upgrade meta nodes](#upgrade-meta-nodes).
3. [Upgrade data nodes](#upgrade-data-nodes).
> ***Note:*** A rolling upgrade lets you update your cluster with zero downtime. To downgrade to an earlier version, complete the following procedures, replacing the version numbers with the version that you want to downgrade to.
## Back up your cluster
Before performing an upgrade, create a full backup of your InfluxDB Enterprise cluster. Also, if you create incremental backups, trigger a final incremental backup.
> ***Note:*** For information on performing a final incremental backup or a full backup,
> see [Back up and restore InfluxDB Enterprise clusters](/enterprise_influxdb/v1.10/administration/backup-and-restore/).
## Upgrade meta nodes
Complete the following steps to upgrade meta nodes:
1. [Download the meta node package](#download-the-meta-node-package).
2. [Install the meta node package](#install-the-meta-node-package).
3. [Update the meta node configuration file](#update-the-meta-node-configuration-file).
4. [Restart the `influxdb-meta` service](#restart-the-influxdb-meta-service).
5. Repeat steps 1-4 for each meta node in your cluster.
6. [Confirm the meta nodes upgrade](#confirm-the-meta-nodes-upgrade).
### Download the meta node package
##### Ubuntu and Debian (64-bit)
```bash
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
```
##### RedHat and CentOS (64-bit)
```bash
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
```
### Install the meta node package
##### Ubuntu and Debian (64-bit)
```bash
sudo dpkg -i influxdb-meta_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
```
##### RedHat and CentOS (64-bit)
```bash
sudo yum localinstall influxdb-meta-{{< latest-patch >}}-c{{< latest-patch >}}.x86_64.rpm
```
### Update the meta node configuration file
Migrate any custom settings from your previous meta node configuration file.
To enable HTTPS, you must update the meta node configuration file (`influxdb-meta.conf`). For information, see [Enable HTTPS within the configuration file for each Meta Node](/enterprise_influxdb/v1.10/guides/https_setup/#set-up-https-in-an-influxdb-enterprise-cluster).
### Restart the `influxdb-meta` service
##### sysvinit systems
```bash
service influxdb-meta restart
```
##### systemd systems
```bash
sudo systemctl restart influxdb-meta
```
### Confirm the meta nodes upgrade
After upgrading _**all**_ meta nodes, check your node version numbers using the
`influxd-ctl show` command.
The [`influxd-ctl` utility](/enterprise_influxdb/v1.10/tools/influxd-ctl/) is available on all meta nodes.
```bash
~# influxd-ctl show
Data Nodes
==========
ID TCP Address Version
4 rk-upgrading-01:8088 1.8.x_c1.8.y
5 rk-upgrading-02:8088 1.8.x_c1.8.y
6 rk-upgrading-03:8088 1.8.x_c1.8.y
Meta Nodes
==========
TCP Address Version
rk-upgrading-01:8091 {{< latest-patch >}}-c{{< latest-patch >}} # {{< latest-patch >}}-c{{< latest-patch >}} = 👍
rk-upgrading-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
rk-upgrading-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
```
Ensure that the meta cluster is healthy before upgrading the data nodes.
## Upgrade data nodes
Complete the following steps to upgrade data nodes:
1. [Stop traffic to data nodes](#stop-traffic-to-the-data-node).
2. [Download the data node package](#download-the-data-node-package).
3. [Install the data node package](#install-the-data-node-package).
4. [Update the data node configuration file](#update-the-data-node-configuration-file).
5. For Time Series Index (TSI) only, [rebuild TSI indexes](#rebuild-tsi-indexes).
6. [Restart the `influxdb` service](#restart-the-influxdb-service).
7. [Restart traffic to data nodes](#restart-traffic-to-data-nodes).
8. Repeat steps 1-7 for each data node in your cluster.
9. [Confirm the data nodes upgrade](#confirm-the-data-nodes-upgrade).
### Stop traffic to the data node
To stop traffic to data nodes, **do one of the following:**
- **Disable traffic to data nodes in the node balancer**
- If you have access to the load balancer configuration, use your load balancer to stop routing read and write requests to the data node server (port 8086).
- If you cannot access the load balancer configuration, work with your networking team to prevent traffic to the data node server before continuing to upgrade.
{{% note %}}
Disabling traffic to a data node in the load balancer still allows other data
nodes in the cluster to write to the current data node.
{{% /note %}}
- **Stop the `influxdb` service on the data node**
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
service influxdb stop
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
sudo systemctl stop influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
{{% note %}}
Stopping the `influxdb` process on the data node takes longer than disabling
traffic at the load balancer, but it ensures all writes stop, including writes
from other data nodes in the cluster.
{{% /note %}}
### Download the data node package
##### Ubuntu and Debian (64-bit)
```bash
wget https://dl.influxdata.com/enterprise/releases/influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
```
##### RedHat and CentOS (64-bit)
```bash
wget https://dl.influxdata.com/enterprise/releases/influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
```
### Install the data node package
When you run the install command, you're prompted to keep or overwrite your
current configuration file with the file for version {{< latest-patch >}}.
Enter `N` or `O` to keep your current configuration file.
You'll make the configuration changes for version {{< latest-patch >}} in the
next procedure, [Update the data node configuration file](#update-the-data-node-configuration-file).
##### Ubuntu & Debian (64-bit)
```bash
sudo dpkg -i influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
```
##### RedHat & CentOS (64-bit)
```bash
sudo yum localinstall influxdb-data-{{< latest-patch >}}-c{{< latest-patch >}}.x86_64.rpm
```
### Update the data node configuration file
Migrate any custom settings from your previous data node configuration file.
- To enable HTTPS, see [Enable HTTPS within the configuration file for each Data Node](/enterprise_influxdb/v1.10/guides/https_setup/#set-up-https-in-an-influxdb-enterprise-cluster).
- To enable TSI, open `/etc/influxdb/influxdb.conf`, and then adjust and save the settings shown in the following table (a configuration sketch follows the table).
| Section | Setting |
| --------| ----------------------------------------------------------|
| `[data]` | <ul><li>To use Time Series Index (TSI) disk-based indexing, add [`index-version = "tsi1"`](/enterprise_influxdb/v1.10/administration/config-data-nodes#index-version-inmem) <li>To use TSM in-memory index, add [`index-version = "inmem"`](/enterprise_influxdb/v1.10/administration/config-data-nodes#index-version-inmem) <li>Add [`wal-fsync-delay = "0s"`](/enterprise_influxdb/v1.10/administration/config-data-nodes#wal-fsync-delay-0s) <li>Add [`max-concurrent-compactions = 0`](/enterprise_influxdb/v1.10/administration/config-data-nodes#max-concurrent-compactions-0) <li>Set [`cache-max-memory-size`](/enterprise_influxdb/v1.10/administration/config-data-nodes#cache-max-memory-size-1g) to `1073741824` |
| `[cluster]`| <ul><li>Add [`pool-max-idle-streams = 100`](/enterprise_influxdb/v1.10/administration/config-data-nodes#pool-max-idle-streams-100) <li>Add [`pool-max-idle-time = "1m0s"`](/enterprise_influxdb/v1.10/administration/config-data-nodes#pool-max-idle-time-60s) <li>Remove `max-remote-write-connections` |
|[`[anti-entropy]`](/enterprise_influxdb/v1.10/administration/config-data-nodes#anti-entropy)| <ul><li>Add `enabled = true` <li>Add `check-interval = "30s"` <li>Add `max-fetch = 10`|
|`[admin]`| Remove entire section.|
For more information about TSI, see [TSI overview](/enterprise_influxdb/v1.10/concepts/time-series-index/) and [TSI details](/enterprise_influxdb/v1.10/concepts/tsi-details/).
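Taken together, the settings in the table might look like the following sketch in `/etc/influxdb/influxdb.conf` (values are those listed above; keep any other custom settings from your previous configuration):
```toml
[data]
  index-version = "tsi1"            # or "inmem" for the TSM in-memory index
  wal-fsync-delay = "0s"
  max-concurrent-compactions = 0
  cache-max-memory-size = 1073741824

[cluster]
  pool-max-idle-streams = 100
  pool-max-idle-time = "1m0s"
  # Remove max-remote-write-connections if present.

[anti-entropy]
  enabled = true
  check-interval = "30s"
  max-fetch = 10

# Remove the entire [admin] section.
```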
### Rebuild TSI indexes
Complete the following steps for Time Series Index (TSI) only.
1. Delete all `_series` directories in the `/data` directory (by default, stored at `/data/<dbName>/_series`).
2. Delete all TSM-based shard `index` directories (by default, located at `/data/<dbName>/<rpName>/<shardID>/index`).
3. Use the [`influx_inspect buildtsi`](/enterprise_influxdb/v1.10/tools/influx_inspect#buildtsi) utility to rebuild the TSI index. For example, run the following command:
```bash
influx_inspect buildtsi -datadir /yourDataDirectory -waldir /wal
```
Replace `yourDataDirectory` with the path to your data directory. Running this command converts TSM-based shards to TSI shards or rebuilds existing TSI shards.
> **Note:** Run the `buildtsi` command using the same system user that runs the `influxd` service, or a user with the same permissions.
### Restart the `influxdb` service
Restart the `influxdb` service to restart the data nodes.
Do one of the following:
- **If the `influxdb` service is still running**, but isn't receiving traffic from the load balancer:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
service influxdb restart
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
sudo systemctl restart influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
- **If the `influxdb` service is stopped**:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
service influxdb start
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
sudo systemctl start influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
### Restart traffic to data nodes
Restart routing read and write requests to the data node server (port 8086) through your load balancer.
> **Note:** Allow the hinted handoff queue (HHQ) to write all missed data to the updated node before upgrading the next data node. Once all data has been written, the disk space used in the hinted handoff queue should be 0. Check the disk space used by your `hh` directory by running the `du` command, for example, `du /var/lib/influxdb/hh`.
### Confirm the data nodes upgrade
After upgrading _**all**_ data nodes, check your node version numbers using the
`influxd-ctl show` command.
The [`influxd-ctl` utility](/enterprise_influxdb/v1.10/tools/influxd-ctl/) is available on all meta nodes.
```bash
~# influxd-ctl show
Data Nodes
==========
ID TCP Address Version
4 rk-upgrading-01:8088 {{< latest-patch >}}-c{{< latest-patch >}} # {{< latest-patch >}}-c{{< latest-patch >}} = 👍
5 rk-upgrading-02:8088 {{< latest-patch >}}-c{{< latest-patch >}}
6 rk-upgrading-03:8088 {{< latest-patch >}}-c{{< latest-patch >}}
Meta Nodes
==========
TCP Address Version
rk-upgrading-01:8091 {{< latest-patch >}}-c{{< latest-patch >}}
rk-upgrading-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
rk-upgrading-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
```
If you have any issues upgrading your cluster, contact InfluxData support.

View File

@ -0,0 +1,12 @@
---
title: InfluxDB Enterprise concepts
description: Clustering and other key concepts in InfluxDB Enterprise.
aliases:
- /enterprise/v1.10/concepts/
menu:
enterprise_influxdb_1_10:
name: Concepts
weight: 50
---
{{< children hlevel="h2" type="list" >}}

View File

@ -0,0 +1,141 @@
---
title: Clustering in InfluxDB Enterprise
description: >
Learn how meta nodes and data nodes interact in InfluxDB Enterprise.
aliases:
- /enterprise/v1.10/concepts/clustering/
- /enterprise_influxdb/v1.10/high_availability/clusters/
menu:
enterprise_influxdb_1_10:
name: Clustering
weight: 10
parent: Concepts
---
This document describes in detail how clustering works in InfluxDB Enterprise.
It starts with a high level description of the different components of a cluster
and then outlines implementation details.
## Architectural overview
An InfluxDB Enterprise installation consists of two groups of software processes: data nodes and meta nodes.
Communication within a cluster looks like this:
{{< diagram >}}
flowchart TB
subgraph meta[Meta Nodes]
Meta1 <-- TCP :8089 --> Meta2 <-- TCP :8089 --> Meta3
end
meta <-- HTTP :8091 --> data
subgraph data[Data Nodes]
Data1 <-- TCP :8088 --> Data2
end
{{< /diagram >}}
The meta nodes communicate with each other via TCP and the Raft consensus protocol, both on port `8089` by default. This port must be reachable between the meta nodes. The meta nodes also expose an HTTP API bound to port `8091` by default that the `influxd-ctl` command uses.
Data nodes communicate with each other through a TCP protocol that is bound to port `8088`. Data nodes communicate with the meta nodes through their HTTP API bound to `8091`. These ports must be reachable between the meta and data nodes.
Within a cluster, all meta nodes must communicate with all other meta nodes. All data nodes must communicate with all other data nodes and all meta nodes.
The meta nodes keep a consistent view of the metadata that describes the cluster. The meta cluster uses the [HashiCorp implementation of Raft](https://github.com/hashicorp/raft) as the underlying consensus protocol. This is the same implementation that they use in Consul.
The data nodes replicate data and query each other using the Protobuf protocol over TCP. Details on replication and querying are covered later in this document.
## Where data lives
The meta and data nodes are each responsible for different parts of the database.
### Meta nodes
Meta nodes hold all of the following meta data:
* all nodes in the cluster and their role
* all databases and retention policies that exist in the cluster
* all shards and shard groups, and on what nodes they exist
* cluster users and their permissions
* all continuous queries
The meta nodes keep this data in the Raft database on disk, backed by BoltDB. By default the Raft database is `/var/lib/influxdb/meta/raft.db`.
> **Note:** Meta nodes require the `/meta` directory.
### Data nodes
Data nodes hold all of the raw time series data and metadata, including:
* measurements
* tag keys and values
* field keys and values
On disk, the data is always organized by `<database>/<retention_policy>/<shard_id>`. By default the parent directory is `/var/lib/influxdb/data`.
> **Note:** Data nodes require all four subdirectories of `/var/lib/influxdb/`, including `/meta` (specifically, the clients.json file), `/data`, `/wal`, and `/hh`.
## Optimal server counts
When creating a cluster, you need to decide how many meta and data nodes to configure and connect. You can think of InfluxDB Enterprise as two separate clusters that communicate with each other: a cluster of meta nodes and one of data nodes. The number of meta nodes is driven by the number of meta node failures the cluster needs to be able to tolerate, while the number of data nodes scales based on your storage and query needs.
The Raft consensus protocol requires a quorum to perform any operation, so there should always be an odd number of meta nodes. For almost all applications, 3 meta nodes is what you want. It gives you an odd number of meta nodes so that a quorum can be reached. And, if one meta node is lost, the cluster can still operate with the remaining 2 meta nodes until the third one is replaced. Additional meta nodes exponentially increase the communication overhead and are not recommended unless you expect the cluster to frequently lose meta nodes.
Data nodes hold the actual time series data. The minimum number of data nodes to run is 1 and can scale up from there. **Generally, you'll want to run a number of data nodes that is evenly divisible by your replication factor.** For instance, if you have a replication factor of 2, you'll want to run 2, 4, 6, 8, 10, etc. data nodes.
## Chronograf
[Chronograf](/{{< latest "chronograf" >}}/introduction/getting-started/) is the user interface component of InfluxData's TICK stack.
It makes monitoring and alerting for your infrastructure easy to set up and maintain.
It talks directly to the data and meta nodes over their HTTP protocols, which are bound by default to port `8086` for data nodes and port `8091` for meta nodes.
## Writes in a cluster
This section describes how writes in a cluster work. We'll work through some examples using a cluster of four data nodes: `A`, `B`, `C`, and `D`. Assume that we have a retention policy with a replication factor of 2 and shard durations of 1 day.
### Shard groups
The cluster creates shards within a shard group to maximize the number of data nodes utilized. If there are N data nodes in the cluster and the replication factor is X, then N/X shards are created in each shard group, discarding any fractions.
This means that a new shard group gets created for each day of data that gets written in. Within each shard group 2 shards are created. Because of the replication factor of 2, each of those two shards is copied to 2 servers. For example, we have a shard group for `2016-09-19` that has two shards `1` and `2`. Shard `1` is replicated to servers `A` and `B` while shard `2` is copied to servers `C` and `D`.
When a write comes in with values that have a timestamp in `2016-09-19` the cluster must first determine which shard within the shard group should receive the write. This is done by taking a hash of the `measurement` + sorted `tagset` (the metaseries) and bucketing into the correct shard. In Go this looks like:
```go
// key is the measurement + sorted tagset (the metaseries key)
// shardGroup is the shard group for the values based on timestamp
// hash the key with FNV-1a (hash/fnv), then bucket into a shard by modulo
h := fnv.New64a()
h.Write(key)
shard := shardGroup.Shards[h.Sum64()%uint64(len(shardGroup.Shards))]
```
There are multiple implications to this scheme for determining where data lives in a cluster. First, for any given metaseries all data on any given day exists in a single shard, and thus only on those servers hosting a copy of that shard. Second, once a shard group is created, adding new servers to the cluster won't scale out write capacity for that shard group. The replication is fixed when the shard group is created.
However, there is a method for expanding writes in the current shard group (i.e. today) when growing a cluster. The current shard group can be truncated to stop at the current time using `influxd-ctl truncate-shards`. This immediately closes the current shard group, forcing a new shard group to be created. That new shard group inherits the latest retention policy and data node changes and then copies itself appropriately to the newly available data nodes. Run `influxd-ctl truncate-shards help` for more information on the command.
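For example (a sketch; the optional `-delay` value shown is illustrative):
```bash
influxd-ctl truncate-shards -delay 1m
```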
### Write consistency
Each request to the HTTP API can specify the consistency level via the `consistency` query parameter (see the example request after this list). For this example let's assume that an HTTP write is being sent to server `D` and the data belongs in shard `1`. The write needs to be replicated to the owners of shard `1`: data nodes `A` and `B`. When the write comes into `D`, that node determines from its local cache of the metastore that the write needs to be replicated to `A` and `B`, and it immediately tries to write to both. The subsequent behavior depends on the consistency level chosen:
* `any` - return success to the client as soon as any node has responded with a write success, or the receiving node has written the data to its hinted handoff queue. In our example, if `A` or `B` return a successful write response to `D`, or if `D` has cached the write in its local hinted handoff, `D` returns a write success to the client.
* `one` - return success to the client as soon as any node has responded with a write success, but not if the write is only in hinted handoff. In our example, if `A` or `B` return a successful write response to `D`, `D` returns a write success to the client. If `D` could not send the data to either `A` or `B` but instead put the data in hinted handoff, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
* `quorum` - return success when a majority of nodes return success. This option is only useful if the replication factor is greater than 2, otherwise it is equivalent to `all`. In our example, if both `A` and `B` return a successful write response to `D`, `D` returns a write success to the client. If either `A` or `B` does not return success, then a majority of nodes have not successfully persisted the write and `D` returns a write failure to the client. If we assume for a moment the data were bound for three nodes, `A`, `B`, and `C`, then if any two of those nodes respond with a write success, `D` returns a write success to the client. If one or fewer nodes respond with a success, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
* `all` - return success only when all nodes return success. In our example, if both `A` and `B` return a successful write response to `D`, `D` returns a write success to the client. If either `A` or `B` does not return success, then `D` returns a write failure to the client. If we again assume three destination nodes `A`, `B`, and `C`, then if all three nodes respond with a write success, `D` returns a write success to the client. Otherwise, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
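For example, a write request that asks for `quorum` consistency might look like the following sketch (the database name and point are illustrative):
```sh
curl -XPOST "http://localhost:8086/write?db=mydb&consistency=quorum" \
  --data-binary 'cpu,host=server01 value=0.64 1429185600000000000'
```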
The important thing to note is how failures are handled. In the case of failures, the database uses the hinted handoff system.
### Hinted handoff
Hinted handoff is how InfluxDB Enterprise deals with data node outages while writes are happening. Hinted handoff is essentially a durable disk based queue. When writing at `any`, `one` or `quorum` consistency, hinted handoff is used when one or more replicas return an error after a success has already been returned to the client. When writing at `all` consistency, writes cannot return success unless all nodes return success. Temporarily stalled or failed writes may still go to the hinted handoff queues but the cluster would have already returned a failure response to the write. The receiving node creates a separate queue on disk for each data node (and shard) it cannot reach.
Let's again use the example of a write coming to `D` that should go to shard `1` on `A` and `B`. If we specified a consistency level of `one` and node `A` returns success, `D` immediately returns success to the client even though the write to `B` is still in progress.
Now let's assume that `B` returns an error. Node `D` then puts the write into its hinted handoff queue for shard `1` on node `B`. In the background, node `D` continues to attempt to empty the hinted handoff queue by writing the data to node `B`. The configuration file has settings for the maximum size and age of data in hinted handoff queues.
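A sketch of the corresponding data node settings (the values shown are illustrative):
```toml
[hinted-handoff]
  enabled = true
  dir = "/var/lib/influxdb/hh"
  max-size = 10737418240   # maximum bytes queued per node
  max-age = "168h0m0s"     # drop queued writes older than this
```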
If a data node is restarted, it checks for pending writes in the hinted handoff queues and resumes attempts to replicate the writes. The important thing to note is that the hinted handoff queue is durable and does survive a process restart.
When restarting nodes within an active cluster, during upgrades or maintenance for example, other nodes in the cluster store hinted handoff writes destined for the offline node and replicate them when the node is available again. Thus, a healthy cluster should have enough resource headroom on each data node to handle the burst of hinted handoff writes following a node outage. The returning node needs to handle both the steady state traffic and the queued hinted handoff writes from other nodes, meaning its write traffic will have a significant spike following any outage of more than a few seconds, until the hinted handoff queue drains.
If a node with pending hinted handoff writes for another data node receives a write destined for that node, it adds the write to the end of the hinted handoff queue rather than attempting a direct write. This ensures that data nodes receive data in mostly chronological order, and prevents unnecessary connection attempts while the other node is offline.
## Queries in a cluster
Queries in a cluster are distributed based on the time range being queried and the replication factor of the data. For example if the retention policy has a replication factor of 4, the coordinating data node receiving the query randomly picks any of the 4 data nodes that store a replica of the shard(s) to receive the query. If we assume that the system has shard durations of one day, then for each day of time covered by a query the coordinating node selects one data node to receive the query for that day.
The coordinating node executes and fulfills the query locally whenever possible. If a query must scan multiple shard groups (multiple days in the example above), the coordinating node forwards queries to other nodes for shards it does not have locally. The queries are forwarded in parallel while the coordinating node scans its own local data. The queries are distributed to as many nodes as required to query each shard group once. As the results come back from each data node, the coordinating data node combines them into the final result that gets returned to the user.

View File

@ -0,0 +1,219 @@
---
title: Compare InfluxDB to SQL databases
description: Differences between InfluxDB and SQL databases.
menu:
enterprise_influxdb_1_10:
name: Compare InfluxDB to SQL databases
weight: 30
parent: Concepts
---
InfluxDB is similar to a SQL database, but different in many ways.
InfluxDB is purpose-built for time series data.
Relational databases _can_ handle time series data, but are not optimized for common time series workloads.
InfluxDB is designed to store large volumes of time series data and quickly perform real-time analysis on that data.
### Timing is everything
In InfluxDB, a timestamp identifies a single point in any given data series.
This is like an SQL database table where the primary key is pre-set by the system and is always time.
InfluxDB also recognizes that your [schema](/enterprise_influxdb/v1.10/concepts/glossary/#schema) preferences may change over time.
In InfluxDB you don't have to define schemas up front.
Data points can have one of the fields on a measurement, all of the fields on a measurement, or any number in-between.
You can add new fields to a measurement simply by writing a point for that new field.
If you need an explanation of the terms measurements, tags, and fields, check out the next section for an SQL database to InfluxDB terminology crosswalk.
## Terminology
The table below is a (very) simple example of a table called `foodships` in an SQL database
with the unindexed column `#_foodships` and the indexed columns `park_id`, `planet`, and `time`.
``` sql
+---------+---------+---------------------+--------------+
| park_id | planet | time | #_foodships |
+---------+---------+---------------------+--------------+
| 1 | Earth | 1429185600000000000 | 0 |
| 1 | Earth | 1429185601000000000 | 3 |
| 1 | Earth | 1429185602000000000 | 15 |
| 1 | Earth | 1429185603000000000 | 15 |
| 2 | Saturn | 1429185600000000000 | 5 |
| 2 | Saturn | 1429185601000000000 | 9 |
| 2 | Saturn | 1429185602000000000 | 10 |
| 2 | Saturn | 1429185603000000000 | 14 |
| 3 | Jupiter | 1429185600000000000 | 20 |
| 3 | Jupiter | 1429185601000000000 | 21 |
| 3 | Jupiter | 1429185602000000000 | 21 |
| 3 | Jupiter | 1429185603000000000 | 20 |
| 4 | Saturn | 1429185600000000000 | 5 |
| 4 | Saturn | 1429185601000000000 | 5 |
| 4 | Saturn | 1429185602000000000 | 6 |
| 4 | Saturn | 1429185603000000000 | 5 |
+---------+---------+---------------------+--------------+
```
Those same data look like this in InfluxDB:
```sql
name: foodships
tags: park_id=1, planet=Earth
time #_foodships
---- ------------
2015-04-16T12:00:00Z 0
2015-04-16T12:00:01Z 3
2015-04-16T12:00:02Z 15
2015-04-16T12:00:03Z 15
name: foodships
tags: park_id=2, planet=Saturn
time #_foodships
---- ------------
2015-04-16T12:00:00Z 5
2015-04-16T12:00:01Z 9
2015-04-16T12:00:02Z 10
2015-04-16T12:00:03Z 14
name: foodships
tags: park_id=3, planet=Jupiter
time #_foodships
---- ------------
2015-04-16T12:00:00Z 20
2015-04-16T12:00:01Z 21
2015-04-16T12:00:02Z 21
2015-04-16T12:00:03Z 20
name: foodships
tags: park_id=4, planet=Saturn
time #_foodships
---- ------------
2015-04-16T12:00:00Z 5
2015-04-16T12:00:01Z 5
2015-04-16T12:00:02Z 6
2015-04-16T12:00:03Z 5
```
Referencing the example above, in general:
* An InfluxDB measurement (`foodships`) is similar to an SQL database table.
* InfluxDB tags ( `park_id` and `planet`) are like indexed columns in an SQL database.
* InfluxDB fields (`#_foodships`) are like unindexed columns in an SQL database.
* InfluxDB points (for example, `2015-04-16T12:00:00Z 5`) are similar to SQL rows.
Building on this comparison of database terminology,
InfluxDB [continuous queries](/enterprise_influxdb/v1.10/concepts/glossary/#continuous-query-cq)
and [retention policies](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp) are
similar to stored procedures in an SQL database.
They're specified once and then performed regularly and automatically.
Of course, there are some major disparities between SQL databases and InfluxDB.
SQL `JOIN`s aren't available for InfluxDB measurements; your schema design should reflect that difference.
And, as we mentioned above, a measurement is like an SQL table where the primary index is always pre-set to time.
InfluxDB timestamps must be in UNIX epoch (GMT) or formatted as a date-time string valid under RFC3339.
For more detailed descriptions of the InfluxDB terms mentioned in this section see our [Glossary of Terms](/enterprise_influxdb/v1.10/concepts/glossary/).
## Query languages
InfluxDB supports multiple query languages:
- [Flux](#flux)
- [InfluxQL](#influxql)
### Flux
[Flux](/enterprise_influxdb/v1.10/flux/) is a data scripting language designed for querying, analyzing, and acting on time series data.
Beginning with **InfluxDB 1.8.0**, Flux is available for production use alongside InfluxQL.
For those familiar with [InfluxQL](#influxql), Flux is intended to address
many of the outstanding feature requests that we've received since introducing InfluxDB 1.0.
For a comparison between Flux and InfluxQL, see [Flux vs InfluxQL](/enterprise_influxdb/v1.10/flux/flux-vs-influxql/).
Flux is the primary language for working with data in [InfluxDB OSS 2.0](/influxdb/v2.0/get-started)
and [InfluxDB Cloud](/influxdb/cloud/get-started/),
a generally available Platform as a Service (PaaS) available across multiple Cloud Service Providers.
Using Flux with InfluxDB 1.8+ lets you get familiar with Flux concepts and syntax
and ease the transition to InfluxDB 2.0.
### InfluxQL
InfluxQL is an SQL-like query language for interacting with InfluxDB.
It has been crafted to feel familiar to those coming from other
SQL or SQL-like environments while also providing features specific
to storing and analyzing time series data.
However, **InfluxQL is not SQL** and lacks support for more advanced operations
like `UNION`, `JOIN` and `HAVING` that SQL power-users are accustomed to.
This functionality is available with [Flux](/flux/latest/introduction).
InfluxQL's `SELECT` statement follows the form of an SQL `SELECT` statement:
```sql
SELECT <stuff> FROM <measurement_name> WHERE <some_conditions>
```
where `WHERE` is optional.
To get the InfluxDB output in the section above, you'd enter:
```sql
SELECT * FROM "foodships"
```
If you only wanted to see data for the planet `Saturn`, you'd enter:
```sql
SELECT * FROM "foodships" WHERE "planet" = 'Saturn'
```
If you wanted to see data for the planet `Saturn` after 12:00:01 UTC on April 16, 2015, you'd enter:
```sql
SELECT * FROM "foodships" WHERE "planet" = 'Saturn' AND time > '2015-04-16 12:00:01'
```
As shown in the example above, InfluxQL allows you to specify the time range of your query in the `WHERE` clause.
You can use date-time strings wrapped in single quotes that have the
format `YYYY-MM-DD HH:MM:SS.mmm`
(`mmm` is milliseconds and is optional, and you can also specify microseconds or nanoseconds).
You can also use relative time with `now()` which refers to the server's current timestamp:
```sql
SELECT * FROM "foodships" WHERE time > now() - 1h
```
That query outputs the data in the `foodships` measurement where the timestamp is newer than the server's current time minus one hour.
The options for specifying time durations with `now()` are:
|Letter|Meaning|
|:---:|:---:|
| ns | nanoseconds |
|u or µ|microseconds|
| ms | milliseconds |
|s | seconds |
| m | minutes |
| h | hours |
| d | days |
| w | weeks |
InfluxQL also supports regular expressions, arithmetic in expressions, `SHOW` statements, and `GROUP BY` statements.
See our [data exploration](/enterprise_influxdb/v1.10/query_language/explore-data/) page for an in-depth discussion of those topics.
InfluxQL functions include `COUNT`, `MIN`, `MAX`, `MEDIAN`, `DERIVATIVE` and more.
For a full list check out the [functions](/enterprise_influxdb/v1.10/query_language/functions/) page.
Now that you have the general idea, check out our [Getting Started Guide](/enterprise_influxdb/v1.10/introduction/getting-started/).
## InfluxDB is not CRUD
InfluxDB is a database that has been optimized for time series data.
This data commonly comes from sources like distributed sensor groups, click data from large websites, or lists of financial transactions.
One thing this data has in common is that it is more useful in the aggregate.
One reading saying that your computer's CPU is at 12% utilization at 12:38:35 UTC on a Tuesday is hard to draw conclusions from.
It becomes more useful when combined with the rest of the series and visualized.
This is where trends over time begin to show, and actionable insight can be drawn from the data.
In addition, time series data is generally written once and rarely updated.
The result is that InfluxDB is not a full CRUD database but more like a CR-ud, prioritizing the performance of creating and reading data over update and destroy, and [preventing some update and destroy behaviors](/enterprise_influxdb/v1.10/concepts/insights_tradeoffs/) to make create and read more performant:
* To update a point, insert one with [the same measurement, tag set, and timestamp](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points).
* You can [drop or delete a series](/enterprise_influxdb/v1.10/query_language/manage-database/#drop-series-from-the-index-with-drop-series), but not individual points based on field values. As a workaround, you can search for the field value, retrieve the time, then [DELETE based on the `time` field](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-series-with-delete), as shown in the example after this list.
* You can't update or rename tags yet - see GitHub issue [#4157](https://github.com/influxdata/influxdb/issues/4157) for more information. To modify the tag of a series of points, find the points with the offending tag value, change the value to the desired one, write the points back, then drop the series with the old tag value.
* You can't delete tags by tag key (as opposed to value) - see GitHub issue [#8604](https://github.com/influxdata/influxdb/issues/8604).
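For example, using the `foodships` data from earlier in this document, deleting a single point by its timestamp might look like the following sketch (the tag value and timestamp are illustrative):
```sql
DELETE FROM "foodships" WHERE "planet" = 'Saturn' AND time = '2015-04-16T12:00:03Z'
```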

View File

@ -0,0 +1,110 @@
---
title: InfluxDB file system layout
description: >
The InfluxDB Enterprise file system layout depends on the operating system, package manager,
or containerization platform used to install InfluxDB.
weight: 102
menu:
enterprise_influxdb_1_10:
name: File system layout
parent: Concepts
---
The InfluxDB Enterprise file system layout depends on the installation method
or containerization platform used to install InfluxDB Enterprise.
- [InfluxDB Enterprise file structure](#influxdb-enterprise-file-structure)
- [File system layout](#file-system-layout)
## InfluxDB Enterprise file structure
The InfluxDB file structure includes the following:
- [Data directory](#data-directory)
- [WAL directory](#wal-directory)
- [Metastore directory](#metastore-directory)
- [Hinted handoff directory](#hinted-handoff-directory)
- [InfluxDB Enterprise configuration files](#influxdb-enterprise-configuration-files)
### Data directory
(**Data nodes only**)
Directory path where InfluxDB Enterprise stores time series data (TSM files).
To customize this path, use the [`[data].dir`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#dir)
configuration option.
### WAL directory
(**Data nodes only**)
Directory path where InfluxDB Enterprise stores Write Ahead Log (WAL) files.
To customize this path, use the [`[data].wal-dir`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#wal-dir)
configuration option.
### Hinted handoff directory
(**Data nodes only**)
Directory path where hinted handoff (HH) queues are stored.
To customize this path, use the [`[hinted-handoff].dir`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#dir)
configuration option.
### Metastore directory
Directory path of the InfluxDB Enterprise metastore, which stores information
about the cluster, users, databases, retention policies, shards, and continuous queries.
**On data nodes**, the metastore contains information about InfluxDB Enterprise meta nodes.
To customize this path, use the [`[meta].dir` configuration option in your data node configuration file](/enterprise_influxdb/v1.10/administration/config-data-nodes/#dir).
**On meta nodes**, the metastore contains information about the InfluxDB Enterprise RAFT cluster.
To customize this path, use the [`[meta].dir` configuration option in your meta node configuration file](/enterprise_influxdb/v1.10/administration/config-meta-nodes/#dir).
### InfluxDB Enterprise configuration files
InfluxDB Enterprise stores default data node and meta node configuration files on disk.
For more information about using InfluxDB Enterprise configuration files, see:
- [Configure data nodes](/enterprise_influxdb/v1.10/administration/config-data-nodes/)
- [Configure meta nodes](/enterprise_influxdb/v1.10/administration/config-meta-nodes/)
## File system layout
InfluxDB Enterprise supports **.deb-** and **.rpm-based** Linux package managers.
The file system layout is the same for both.
- [Data node file system layout](#data-node-file-system-layout)
- [Meta node file system layout](#meta-node-file-system-layout)
### Data node file system layout
| Path | Default |
| :------------------------------------------------------------------- | :---------------------------- |
| [Data directory](#data-directory) | `/var/lib/influxdb/data/` |
| [WAL directory](#wal-directory) | `/var/lib/influxdb/wal/` |
| [Metastore directory](#metastore-directory) | `/var/lib/influxdb/meta/` |
| [Hinted handoff directory](#hinted-handoff-directory) | `/var/lib/influxdb/hh/` |
| [Default config file path](#influxdb-enterprise-configuration-files) | `/etc/influxdb/influxdb.conf` |
##### Data node file system overview
{{% filesystem-diagram %}}
- /etc/influxdb/
- influxdb.conf _<span style="opacity:.4">(Data node configuration file)</span>_
- /var/lib/influxdb/
- data/
- _<span style="opacity:.4">TSM directories and files</span>_
- hh/
- _<span style="opacity:.4">HH queue files</span>_
- meta/
- client.json
- wal/
- _<span style="opacity:.4">WAL directories and files</span>_
{{% /filesystem-diagram %}}
### Meta node file system layout
| Path | Default |
| :------------------------------------------------------------------- | :--------------------------------- |
| [Metastore directory](#metastore-directory) | `/var/lib/influxdb/meta/` |
| [Default config file path](#influxdb-enterprise-configuration-files) | `/etc/influxdb/influxdb-meta.conf` |
##### Meta node file system overview
{{% filesystem-diagram %}}
- /etc/influxdb/
- influxdb-meta.conf _<span style="opacity:.4">(Meta node configuration file)</span>_
- /var/lib/influxdb/
- meta/
- peers.json
- raft.db
- snapshots/
- _<span style="opacity:.4">Snapshot directories and files</span>_
{{% /filesystem-diagram %}}

View File

@ -0,0 +1,456 @@
---
title: Glossary
description: Terms related to InfluxDB Enterprise.
aliases:
- /enterprise/v1.8/concepts/glossary/
menu:
enterprise_influxdb_1_10:
weight: 20
parent: Concepts
---
## aggregation
An InfluxQL function that returns an aggregated value across a set of points.
For a complete list of the available and upcoming aggregations,
see [InfluxQL functions](/enterprise_influxdb/v1.10/query_language/functions/#aggregations).
Related entries: [function](#function), [selector](#selector), [transformation](#transformation)
## batch
A collection of data points in InfluxDB line protocol format, separated by newlines (`0x0A`).
A batch of points may be submitted to the database using a single HTTP request to the write endpoint.
This makes writes using the InfluxDB API much more performant by drastically reducing the HTTP overhead.
InfluxData recommends batch sizes of 5,000-10,000 points, although different use cases may be better served by significantly smaller or larger batches.
Related entries: [InfluxDB line protocol](#influxdb-line-protocol), [point](#point)
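A minimal sketch of a batch, assuming hypothetical `cpu` and `mem` measurements: three points in line protocol, separated by newlines and submitted in a single write request.

```
cpu,host=server01,region=us-west usage_idle=92.6 1472666050000000000
cpu,host=server02,region=us-west usage_idle=88.1 1472666050000000000
mem,host=server01,region=us-west used_percent=23.4 1472666050000000000
```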
## bucket
A bucket is a named location where time series data is stored in **InfluxDB 2.0**. In InfluxDB 1.8+, each combination of a database and a retention policy (database/retention-policy) represents a bucket. Use the [InfluxDB 2.0 API compatibility endpoints](/enterprise_influxdb/v1.10/tools/api#influxdb-2-0-api-compatibility-endpoints) included with InfluxDB 1.8+ to interact with buckets.
## continuous query (CQ)
An InfluxQL query that runs automatically and periodically within a database.
Continuous queries require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
See [Continuous Queries](/enterprise_influxdb/v1.10/query_language/continuous_queries/).
Related entries: [function](#function)
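A minimal sketch of defining a continuous query in InfluxQL; the database, measurement, and CQ names here are illustrative:

```
CREATE CONTINUOUS QUERY "cq_30m" ON "my_database"
BEGIN
  SELECT mean("butterflies") INTO "average_butterflies" FROM "census" GROUP BY time(30m)
END
```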
## data node
A node that runs the data service.
For high availability, installations must have at least two data nodes.
The number of data nodes in your cluster must be the same as your highest
replication factor.
Any replication factor greater than two gives you additional fault tolerance and
query capacity within the cluster.
Data node sizes will depend on your needs.
The Amazon EC2 m4.large or m4.xlarge are good starting points.
Related entries: [data service](#data-service), [replication factor](#replication-factor)
## data service
Stores all time series data and handles all writes and queries.
Related entries: [data node](#data-node)
## database
A logical container for users, retention policies, continuous queries, and time series data.
Related entries: [continuous query](#continuous-query-cq), [retention policy](#retention-policy-rp), [user](#user)
## duration
The attribute of the retention policy that determines how long InfluxDB stores data.
Data older than the duration are automatically dropped from the database.
See [Database Management](/enterprise_influxdb/v1.10/query_language/manage-database/#create-retention-policies-with-create-retention-policy) for how to set duration.
Related entries: [retention policy](#retention-policy-rp)
## field
The key-value pair in an InfluxDB data structure that records metadata and the actual data value.
Fields are required in InfluxDB data structures and they are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant relative to tags.
*Query tip:* Compare fields to tags; tags are indexed.
Related entries: [field key](#field-key), [field set](#field-set), [field value](#field-value), [tag](#tag)
## field key
The key part of the key-value pair that makes up a field.
Field keys are strings and they store metadata.
Related entries: [field](#field), [field set](#field-set), [field value](#field-value), [tag key](#tag-key)
## field set
The collection of field keys and field values on a point.
Related entries: [field](#field), [field key](#field-key), [field value](#field-value), [point](#point)
## field value
The value part of the key-value pair that makes up a field.
Field values are the actual data; they can be strings, floats, integers, or booleans.
A field value is always associated with a timestamp.
Field values are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant.
*Query tip:* Compare field values to tag values; tag values are indexed.
Related entries: [field](#field), [field key](#field-key), [field set](#field-set), [tag value](#tag-value), [timestamp](#timestamp)
## function
InfluxQL aggregations, selectors, and transformations.
See [InfluxQL Functions](/enterprise_influxdb/v1.10/query_language/functions/) for a complete list of InfluxQL functions.
Related entries: [aggregation](#aggregation), [selector](#selector), [transformation](#transformation)
<!--
## grant
-->
## identifier
Tokens that refer to continuous query names, database names, field keys,
measurement names, retention policy names, subscription names, tag keys, and
user names.
See [Query Language Specification](/enterprise_influxdb/v1.10/query_language/spec/#identifiers).
Related entries:
[database](#database),
[field key](#field-key),
[measurement](#measurement),
[retention policy](#retention-policy-rp),
[tag key](#tag-key),
[user](#user)
## InfluxDB line protocol
The text based format for writing points to InfluxDB. See [InfluxDB line protocol](/enterprise_influxdb/v1.10/write_protocols/).
## measurement
The part of the InfluxDB data structure that describes the data stored in the associated fields.
Measurements are strings.
Related entries: [field](#field), [series](#series)
## meta node
A node that runs the meta service.
For high availability, installations must have three meta nodes.
Meta nodes can be very modestly sized instances like an EC2 t2.micro or even a
nano.
For additional fault tolerance installations may use five meta nodes; the
number of meta nodes must be an odd number.
Related entries: [meta service](#meta-service)
## meta service
The consistent data store that keeps state about the cluster, including which
servers, databases, users, continuous queries, retention policies, subscriptions,
and blocks of time exist.
Related entries: [meta node](#meta-node)
## metastore
Contains internal information about the status of the system.
The metastore contains the user information, databases, retention policies, shard metadata, continuous queries, and subscriptions.
Related entries: [database](#database), [retention policy](#retention-policy-rp), [user](#user)
## node
An independent `influxd` process.
Related entries: [server](#server)
## now()
The local server's nanosecond timestamp.
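For example, the following InfluxQL (with an illustrative measurement name) returns only points from the last hour relative to the server's current time:

```
> SELECT * FROM "census" WHERE time > now() - 1h
```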
## passive node (experimental)
Passive nodes act as load balancers--they accept write calls, perform shard lookup and RPC calls (on active data nodes), and distribute writes to active data nodes. They do not own shards or accept writes.
**Note:** This is an experimental feature.
<!--
## permission
-->
## point
In InfluxDB, a point represents a single data record, similar to a row in a SQL database table. Each point:
- has a measurement, a tag set, a field key, a field value, and a timestamp;
- is uniquely identified by its series and timestamp.
You cannot store more than one point with the same timestamp in a series.
If you write a point to a series with a timestamp that matches an existing point, the field set becomes a union of the old and new field set, and any ties go to the new field set.
For more information about duplicate points, see [How does InfluxDB handle duplicate points?](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points)
Related entries: [field set](#field-set), [series](#series), [timestamp](#timestamp)
## points per second
A deprecated measurement of the rate at which data are persisted to InfluxDB.
The schema allows and even encourages the recording of multiple metric values per point, rendering points per second ambiguous.
Write speeds are generally quoted in values per second, a more precise metric.
Related entries: [point](#point), [schema](#schema), [values per second](#values-per-second)
## query
An operation that retrieves data from InfluxDB.
See [Data Exploration](/enterprise_influxdb/v1.10/query_language/explore-data/), [Schema Exploration](/enterprise_influxdb/v1.10/query_language/explore-schema/), [Database Management](/enterprise_influxdb/v1.10/query_language/manage-database/).
## replication factor (RF)
The attribute of the retention policy that determines how many copies of the
data are stored in the cluster. Replicating copies ensures that data is accessible when one or more data nodes are unavailable.
InfluxDB replicates data across `N` data nodes, where `N` is the replication
factor.
To maintain data availability for queries, the replication factor should be less
than or equal to the number of data nodes in the cluster:
* Data is fully available when the replication factor is greater than the
number of unavailable data nodes.
* Data may be unavailable when the replication factor is less than the number of
unavailable data nodes.
Any replication factor greater than two gives you additional fault tolerance and
query capacity within the cluster.
Related entries: [duration](#duration), [node](#node),
[retention policy](#retention-policy-rp)
## retention policy (RP)
Describes how long InfluxDB keeps data (duration), how many copies of the data to store in the cluster (replication factor), and the time range covered by shard groups (shard group duration). RPs are unique per database and, along with the measurement and tag set, define a series.
When you create a database, InfluxDB creates a retention policy called `autogen` with an infinite duration, a replication factor set to one, and a shard group duration set to seven days.
For more information, see [Retention policy management](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
Related entries: [duration](#duration), [measurement](#measurement), [replication factor](#replication-factor), [series](#series), [shard duration](#shard-duration), [tag set](#tag-set)
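A minimal sketch of creating a retention policy that sets all three attributes; the policy and database names are illustrative:

```
> CREATE RETENTION POLICY "one_year" ON "my_database" DURATION 52w REPLICATION 2 SHARD DURATION 1w DEFAULT
```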
<!--
## role
-->
## schema
How the data are organized in InfluxDB.
The fundamentals of the InfluxDB schema are databases, retention policies, series, measurements, tag keys, tag values, and field keys.
See [Schema Design](/enterprise_influxdb/v1.10/concepts/schema_and_data_layout/) for more information.
Related entries: [database](#database), [field key](#field-key), [measurement](#measurement), [retention policy](#retention-policy-rp), [series](#series), [tag key](#tag-key), [tag value](#tag-value)
## selector
An InfluxQL function that returns a single point from the range of specified points.
See [InfluxQL Functions](/enterprise_influxdb/v1.10/query_language/functions/#selectors) for a complete list of the available and upcoming selectors.
Related entries: [aggregation](#aggregation), [function](#function), [transformation](#transformation)
## series
A logical grouping of data defined by shared measurement, tag set, and field key.
Related entries: [field set](#field-set), [measurement](#measurement), [tag set](#tag-set)
## series cardinality
The number of unique database, measurement, tag set, and field key combinations in an InfluxDB instance.
For example, assume that an InfluxDB instance has a single database and one measurement.
The single measurement has two tag keys: `email` and `status`.
If there are three different `email`s, and each email address is associated with two
different `status`es then the series cardinality for the measurement is 6
(3 * 2 = 6):
| email | status |
| :-------------------- | :----- |
| lorr@influxdata.com | start |
| lorr@influxdata.com | finish |
| marv@influxdata.com | start |
| marv@influxdata.com | finish |
| cliff@influxdata.com | start |
| cliff@influxdata.com | finish |
Note that, in some cases, simply performing that multiplication may overestimate series cardinality because of the presence of dependent tags.
Dependent tags are tags that are scoped by another tag and do not increase series
cardinality.
If we add the tag `firstname` to the example above, the series cardinality
would not be 18 (3 * 2 * 3 = 18).
It would remain unchanged at 6, as `firstname` is already scoped by the `email` tag:
| email | status | firstname |
| :-------------------- | :----- | :-------- |
| lorr@influxdata.com | start | lorraine |
| lorr@influxdata.com | finish | lorraine |
| marv@influxdata.com | start | marvin |
| marv@influxdata.com | finish | marvin |
| cliff@influxdata.com | start | clifford |
| cliff@influxdata.com | finish | clifford |
See [SHOW CARDINALITY](/enterprise_influxdb/v1.10/query_language/spec/#show-cardinality) to learn about the InfluxQL commands for series cardinality.
Related entries: [field key](#field-key),[measurement](#measurement), [tag key](#tag-key), [tag set](#tag-set)
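For example, the following InfluxQL commands report cardinality estimates for an illustrative `my_database` database:

```
> SHOW SERIES CARDINALITY ON "my_database"
> SHOW MEASUREMENT CARDINALITY ON "my_database"
> SHOW TAG VALUES CARDINALITY ON "my_database" WITH KEY = "email"
```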
## series key
A series key identifies a particular series by measurement, tag set, and field key.
For example:
```
# measurement, tag set, field key
h2o_level, location=santa_monica, h2o_feet
```
Related entries: [series](#series)
## server
A machine, virtual or physical, that is running InfluxDB.
There should only be one InfluxDB process per server.
Related entries: [node](#node)
## shard
A shard contains the actual encoded and compressed data, and is represented by a TSM file on disk.
Every shard belongs to one and only one shard group.
Multiple shards may exist in a single shard group.
Each shard contains a specific set of series.
All points falling on a given series in a given shard group will be stored in the same shard (TSM file) on disk.
Related entries: [series](#series), [shard duration](#shard-duration), [shard group](#shard-group), [tsm](#tsm-time-structured-merge-tree)
## shard duration
The shard duration determines how much time each shard group spans.
The specific interval is determined by the `SHARD DURATION` of the retention policy.
See [Retention Policy management](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management) for more information.
For example, given a retention policy with `SHARD DURATION` set to `1w`, each shard group will span a single week and contain all points with timestamps in that week.
Related entries: [database](#database), [retention policy](#retention-policy-rp), [series](#series), [shard](#shard), [shard group](#shard-group)
## shard group
Shard groups are logical containers for shards.
Shard groups are organized by time and retention policy.
Every retention policy that contains data has at least one associated shard group.
A given shard group contains all shards with data for the interval covered by the shard group.
The interval spanned by each shard group is the shard duration.
Related entries: [database](#database), [retention policy](#retention-policy-rp), [series](#series), [shard](#shard), [shard duration](#shard-duration)
## subscription
Subscriptions allow [Kapacitor](/{{< latest "kapacitor" >}}/) to receive data from InfluxDB in a push model rather than the pull model based on querying data.
When Kapacitor is configured to work with InfluxDB, the subscription will automatically push every write for the subscribed database from InfluxDB to Kapacitor.
Subscriptions can use TCP or UDP for transmitting the writes.
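A minimal sketch of creating and listing a subscription in InfluxQL; the subscription name, database, and Kapacitor host are illustrative (Kapacitor normally creates and manages its own subscriptions automatically):

```
> CREATE SUBSCRIPTION "kapacitor_sub" ON "my_database"."autogen" DESTINATIONS ALL 'http://kapacitor-host:9092'
> SHOW SUBSCRIPTIONS
```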
## tag
The key-value pair in the InfluxDB data structure that records metadata.
Tags are an optional part of the data structure, but they are useful for storing commonly-queried metadata; tags are indexed so queries on tags are performant.
*Query tip:* Compare tags to fields; fields are not indexed.
Related entries: [field](#field), [tag key](#tag-key), [tag set](#tag-set), [tag value](#tag-value)
## tag key
The key part of the key-value pair that makes up a tag.
Tag keys are strings and they store metadata.
Tag keys are indexed so queries on tag keys are performant.
*Query tip:* Compare tag keys to field keys; field keys are not indexed.
Related entries: [field key](#field-key), [tag](#tag), [tag set](#tag-set), [tag value](#tag-value)
## tag set
The collection of tag keys and tag values on a point.
Related entries: [point](#point), [series](#series), [tag](#tag), [tag key](#tag-key), [tag value](#tag-value)
## tag value
The value part of the key-value pair that makes up a tag.
Tag values are strings and they store metadata.
Tag values are indexed so queries on tag values are performant.
Related entries: [tag](#tag), [tag key](#tag-key), [tag set](#tag-set)
## timestamp
The date and time associated with a point.
All time in InfluxDB is UTC.
For how to specify time when writing data, see [Write Syntax](/enterprise_influxdb/v1.10/write_protocols/write_syntax/).
For how to specify time when querying data, see [Data Exploration](/enterprise_influxdb/v1.10/query_language/explore-data/#time-syntax).
Related entries: [point](#point)
## transformation
An InfluxQL function that returns a value or a set of values calculated from specified points, but does not return an aggregated value across those points.
See [InfluxQL Functions](/enterprise_influxdb/v1.10/query_language/functions/#transformations) for a complete list of the available and upcoming aggregations.
Related entries: [aggregation](#aggregation), [function](#function), [selector](#selector)
## TSM (Time Structured Merge tree)
The purpose-built data storage format for InfluxDB. TSM allows for greater compaction and higher write and read throughput than existing B+ or LSM tree implementations. See [Storage Engine](/enterprise_influxdb/v1.10/concepts/storage_engine/) for more.
## user
There are three kinds of users in InfluxDB Enterprise:
* *Global admin users* have all permissions.
* *Admin users* have `READ` and `WRITE` access to all databases and full access to administrative queries and user management commands.
* *Non-admin users* have `READ`, `WRITE`, or `ALL` (both `READ` and `WRITE`) access per database.
When authentication is enabled, InfluxDB only executes HTTP requests that are sent with a valid username and password.
See [Authentication and Authorization](/enterprise_influxdb/v1.10/administration/authentication_and_authorization/).
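A minimal sketch of creating these user types and granting privileges with InfluxQL; the names, password, and database are illustrative:

```
> CREATE USER "admin" WITH PASSWORD 'changeit' WITH ALL PRIVILEGES
> CREATE USER "todd" WITH PASSWORD 'changeit'
> GRANT READ ON "my_database" TO "todd"
```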
## values per second
The preferred measurement of the rate at which data are persisted to InfluxDB. Write speeds are generally quoted in values per second.
To calculate the values per second rate, multiply the number of points written per second by the number of values stored per point. For example, if the points have four fields each, and a batch of 5000 points is written 10 times per second, then the values per second rate is `4 field values per point * 5000 points per batch * 10 batches per second = 200,000 values per second`.
Related entries: [batch](#batch), [field](#field), [point](#point), [points per second](#points-per-second)
## WAL (Write Ahead Log)
The temporary cache for recently written points. To reduce the frequency with which the permanent storage files are accessed, InfluxDB caches new points in the WAL until their total size or age triggers a flush to more permanent storage. This allows for efficient batching of the writes into the TSM.
Points in the WAL can be queried, and they persist through a system reboot. On process start, all points in the WAL must be flushed before the system accepts new writes.
Related entries: [tsm](#tsm-time-structured-merge-tree)
## web console
Legacy user interface for InfluxDB Enterprise.
This interface has been deprecated. We recommend using [Chronograf](/{{< latest "chronograf" >}}/introduction/).
If you are transitioning from the Enterprise Web Console to Chronograf, see how to [transition from the InfluxDB Web Admin Interface](/chronograf/v1.7/guides/transition-web-admin-interface/).

View File

@ -0,0 +1,85 @@
---
title: InfluxDB Enterprise startup process
description: >
On startup, InfluxDB Enterprise starts all subsystems and services in a deterministic order.
menu:
enterprise_influxdb_1_10:
weight: 10
name: Startup process
parent: Concepts
---
On startup, InfluxDB Enterprise starts all subsystems and services in the following order:
1. [TSDBStore](#tsdbstore)
2. [Monitor](#monitor)
3. [Cluster](#cluster)
4. [Precreator](#precreator)
5. [Snapshotter](#snapshotter)
6. [Continuous Query](#continuous-query)
7. [Announcer](#announcer)
8. [Retention](#retention)
9. [Stats](#stats)
10. [Anti-entropy](#anti-entropy)
11. [HTTP API](#http-api)
A **subsystem** is a collection of related services managed together as part of a greater whole.
A **service** is a process that provides specific functionality.
## Subsystems and services
### TSDBStore
The TSDBStore subsystem starts and manages the TSM storage engine.
This includes services such as the points writer (write), reads (query),
and [hinted handoff (HH)](/enterprise_influxdb/v1.10/concepts/clustering/#hinted-handoff).
TSDBStore first opens all the shards and loads write-ahead log (WAL) data into the in-memory write cache.
If `influxd` was cleanly shut down previously, there will not be any WAL data.
It then loads a portion of each shard's index.
{{% note %}}
#### Index versions and startup times
If using `inmem` indexing, InfluxDB loads all shard indexes into memory, which,
depending on the number of series in the database, can take time.
If using `tsi1` indexing, InfluxDB only loads hot shard indexes
(the most recent shards or shards currently being written to) into memory and
stores cold shard indexes on disk.
Use `tsi1` indexing to see shorter startup times.
{{% /note %}}
### Monitor
The Monitor service provides statistical and diagnostic information to InfluxDB about InfluxDB itself.
This information helps with database troubleshooting and performance analysis.
### Cluster
The Cluster service provides implementations of InfluxDB OSS v1.8 interfaces
that operate on an InfluxDB Enterprise v1.8 cluster.
### Precreator
The Precreator service creates shards before they are needed.
This ensures necessary shards exist before new time series data arrives and that
write throughput is not affected by the creation of a new shard.
### Snapshotter
The Snapshotter service routinely creates snapshots of InfluxDB Enterprise metadata.
### Continuous Query
The Continuous Query (CQ) subsystem manages all InfluxDB CQs.
### Announcer
The Announcer service announces a data node's status to meta nodes.
### Retention
The Retention service enforces [retention policies](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp)
and drops data as it expires.
### Stats
The Stats service monitors cluster-level statistics.
### Anti-entropy
The Anti-entropy (AE) subsystem is responsible for reconciling differences between shards.
For more information, see [Use anti-entropy](/enterprise_influxdb/v1.10/administration/anti-entropy/).
### HTTP API
The InfluxDB HTTP API service provides a public-facing interface for interacting with
InfluxDB Enterprise, as well as internal interfaces used within the InfluxDB Enterprise cluster.

View File

@ -0,0 +1,60 @@
---
title: InfluxDB design insights and tradeoffs
description: >
Optimizing for time series use case entails some tradeoffs, primarily to increase performance at the cost of functionality.
menu:
enterprise_influxdb_1_10:
name: InfluxDB design insights and tradeoffs
weight: 40
parent: Concepts
v2: /influxdb/v2.0/reference/key-concepts/design-principles/
---
InfluxDB is a time series database.
Optimizing for this use case entails some tradeoffs, primarily to increase performance at the cost of functionality.
Below is a list of some of those design insights that lead to tradeoffs:
1. For the time series use case, we assume that if the same data is sent multiple times, it is the exact same data that a client just sent several times.
_**Pro:**_ Simplified [conflict resolution](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points) increases write performance.
_**Con:**_ Cannot store duplicate data; may overwrite data in rare circumstances.
2. Deletes are a rare occurrence.
When they do occur it is almost always against large ranges of old data that are cold for writes.
_**Pro:**_ Restricting access to deletes allows for increased query and write performance.
_**Con:**_ Delete functionality is significantly restricted.
3. Updates to existing data are a rare occurrence and contentious updates never happen.
Time series data is predominantly new data that is never updated.
_**Pro:**_ Restricting access to updates allows for increased query and write performance.
_**Con:**_ Update functionality is significantly restricted.
4. The vast majority of writes are for data with very recent timestamps and the data is added in time ascending order.
_**Pro:**_ Adding data in time ascending order is significantly more performant.
_**Con:**_ Writing points with random times or with time not in ascending order is significantly less performant.
5. Scale is critical.
The database must be able to handle a *high* volume of reads and writes.
_**Pro:**_ The database can handle a *high* volume of reads and writes.
_**Con:**_ The InfluxDB development team was forced to make tradeoffs to increase performance.
6. Being able to write and query the data is more important than having a strongly consistent view.
_**Pro:**_ Writing and querying the database can be done by multiple clients and at high loads.
_**Con:**_ Query returns may not include the most recent points if database is under heavy load.
7. Many time [series](/enterprise_influxdb/v1.10/concepts/glossary/#series) are ephemeral.
There are often time series that appear only for a few hours and then go away, e.g.
a new host that gets started and reports for a while and then gets shut down.
_**Pro:**_ InfluxDB is good at managing discontinuous data.
_**Con:**_ Schema-less design means that some database functions are not supported, e.g., there are no cross-table joins.
8. No one point is too important.
_**Pro:**_ InfluxDB has very powerful tools to deal with aggregate data and large data sets.
_**Con:**_ Points don't have IDs in the traditional sense; they are differentiated by timestamp and series.

View File

@ -0,0 +1,202 @@
---
title: InfluxDB key concepts
description: Covers key concepts to learn about InfluxDB.
menu:
enterprise_influxdb_1_10:
name: Key concepts
weight: 10
parent: Concepts
v2: /influxdb/v2.0/reference/key-concepts/
---
Before diving into InfluxDB, it's good to get acquainted with some key concepts of the database. This document introduces key InfluxDB concepts and elements. To introduce the key concepts, we'll cover how the following elements work together in InfluxDB:
- [database](/enterprise_influxdb/v1.10/concepts/glossary/#database)
- [field key](/enterprise_influxdb/v1.10/concepts/glossary/#field-key)
- [field set](/enterprise_influxdb/v1.10/concepts/glossary/#field-set)
- [field value](/enterprise_influxdb/v1.10/concepts/glossary/#field-value)
- [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement)
- [point](/enterprise_influxdb/v1.10/concepts/glossary/#point)
- [retention policy](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp)
- [series](/enterprise_influxdb/v1.10/concepts/glossary/#series)
- [tag key](/enterprise_influxdb/v1.10/concepts/glossary/#tag-key)
- [tag set](/enterprise_influxdb/v1.10/concepts/glossary/#tag-set)
- [tag value](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value)
- [timestamp](/enterprise_influxdb/v1.10/concepts/glossary/#timestamp)
### Sample data
The next section references the data printed out below.
The data are fictional, but represent a believable setup in InfluxDB.
They show the number of butterflies and honeybees counted by two scientists (`langstroth` and `perpetua`) in two locations (location `1` and location `2`) over the time period from August 18, 2015 at midnight through August 18, 2015 at 6:12 AM.
Assume that the data live in a database called `my_database` and are subject to the `autogen` retention policy (more on databases and retention policies to come).
*Hint:* Hover over the links for tooltips to get acquainted with InfluxDB terminology and the layout.
**name:** <span class="tooltip" data-tooltip-text="Measurement">census</span>
| time | <span class ="tooltip" data-tooltip-text ="Field key">butterflies</span> | <span class ="tooltip" data-tooltip-text ="Field key">honeybees</span> | <span class ="tooltip" data-tooltip-text ="Tag key">location</span> | <span class ="tooltip" data-tooltip-text ="Tag key">scientist</span> |
| ---- | ------------------------------------------------------------------------ | ---------------------------------------------------------------------- | ------------------------------------------------------------------- | -------------------------------------------------------------------- |
| 2015-08-18T00:00:00Z | 12 | 23 | 1 | langstroth |
| 2015-08-18T00:00:00Z | 1 | 30 | 1 | perpetua |
| 2015-08-18T00:06:00Z | 11 | 28 | 1 | langstroth |
| <span class="tooltip" data-tooltip-text="Timestamp">2015-08-18T00:06:00Z</span> | <span class ="tooltip" data-tooltip-text ="Field value">3</span> | <span class ="tooltip" data-tooltip-text ="Field value">28</span> | <span class ="tooltip" data-tooltip-text ="Tag value">1</span> | <span class ="tooltip" data-tooltip-text ="Tag value">perpetua</span> |
| 2015-08-18T05:54:00Z | 2 | 11 | 2 | langstroth |
| 2015-08-18T06:00:00Z | 1 | 10 | 2 | langstroth |
| 2015-08-18T06:06:00Z | 8 | 23 | 2 | perpetua |
| 2015-08-18T06:12:00Z | 7 | 22 | 2 | perpetua |
### Discussion
Now that you've seen some sample data in InfluxDB this section covers what it all means.
InfluxDB is a time series database, so it makes sense to start with what is at the root of everything we do: time.
In the data above there's a column called `time` - all data in InfluxDB have that column.
`time` stores timestamps, and the <a name="timestamp"></a>_**timestamp**_ shows the date and time, in [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) UTC, associated with particular data.
The next two columns, called `butterflies` and `honeybees`, are fields.
Fields are made up of field keys and field values.
<a name="field-key"></a>_**Field keys**_ (`butterflies` and `honeybees`) are strings; the field key `butterflies` tells us that the field values `12`-`7` refer to butterflies and the field key `honeybees` tells us that the field values `23`-`22` refer to, well, honeybees.
<a name="field-value"></a>_**Field values**_ are your data; they can be strings, floats, integers, or Booleans, and, because InfluxDB is a time series database, a field value is always associated with a timestamp.
The field values in the sample data are:
```
12 23
1 30
11 28
3 28
2 11
1 10
8 23
7 22
```
In the data above, the collection of field-key and field-value pairs make up a <a name="field-set"></a>_**field set**_.
Here are all eight field sets in the sample data:
* `butterflies = 12 honeybees = 23`
* `butterflies = 1 honeybees = 30`
* `butterflies = 11 honeybees = 28`
* `butterflies = 3 honeybees = 28`
* `butterflies = 2 honeybees = 11`
* `butterflies = 1 honeybees = 10`
* `butterflies = 8 honeybees = 23`
* `butterflies = 7 honeybees = 22`
Fields are a required piece of the InfluxDB data structure - you cannot have data in InfluxDB without fields.
It's also important to note that fields are not indexed.
[Queries](/enterprise_influxdb/v1.10/concepts/glossary/#query) that use field values as filters must scan all values that match the other conditions in the query.
As a result, those queries are not performant relative to queries on tags (more on tags below).
In general, fields should not contain commonly-queried metadata.
The last two columns in the sample data, called `location` and `scientist`, are tags.
Tags are made up of tag keys and tag values.
Both <a name="tag-key"></a>_**tag keys**_ and <a name="tag-value"></a>_**tag values**_ are stored as strings and record metadata.
The tag keys in the sample data are `location` and `scientist`.
The tag key `location` has two tag values: `1` and `2`.
The tag key `scientist` also has two tag values: `langstroth` and `perpetua`.
In the data above, the <a name="tag-set"></a>_**tag set**_ is the different combinations of all the tag key-value pairs.
The four tag sets in the sample data are:
* `location = 1`, `scientist = langstroth`
* `location = 2`, `scientist = langstroth`
* `location = 1`, `scientist = perpetua`
* `location = 2`, `scientist = perpetua`
Tags are optional.
You don't need to have tags in your data structure, but it's generally a good idea to make use of them because, unlike fields, tags are indexed.
This means that queries on tags are faster and that tags are ideal for storing commonly-queried metadata.
Avoid using the following reserved keys:
* `_field`
* `_measurement`
* `time`
If reserved keys are included as a tag or field key, the associated point is discarded.
> **Why indexing matters: The schema case study**
> Say you notice that most of your queries focus on the values of the field keys `honeybees` and `butterflies`:
> `SELECT * FROM "census" WHERE "butterflies" = 1`
> `SELECT * FROM "census" WHERE "honeybees" = 23`
> Because fields aren't indexed, InfluxDB scans every value of `butterflies` in the first query and every value of `honeybees` in the second query before it provides a response.
That behavior can hurt query response times - especially on a much larger scale.
To optimize your queries, it may be beneficial to rearrange your [schema](/enterprise_influxdb/v1.10/concepts/glossary/#schema) such that the fields (`butterflies` and `honeybees`) become the tags and the tags (`location` and `scientist`) become the fields:
> **name:** <span class="tooltip" data-tooltip-text="Measurement">census</span>
>
| time | <span class ="tooltip" data-tooltip-text ="Field key">location</span> | <span class ="tooltip" data-tooltip-text ="Field key">scientist</span> | <span class ="tooltip" data-tooltip-text ="Tag key">butterflies</span> | <span class ="tooltip" data-tooltip-text ="Tag key">honeybees</span> |
| ---- | --------------------------------------------------------------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------- | -------------------------------------------------------------------- |
| 2015-08-18T00:00:00Z | 1 | langstroth | 12 | 23 |
| 2015-08-18T00:00:00Z | 1 | perpetua | 1 | 30 |
| 2015-08-18T00:06:00Z | 1 | langstroth | 11 | 28 |
| <span class="tooltip" data-tooltip-text="Timestamp">2015-08-18T00:06:00Z</span> | <span class ="tooltip" data-tooltip-text ="Field value">1</span> | <span class ="tooltip" data-tooltip-text ="Field value">perpetua</span> | <span class ="tooltip" data-tooltip-text ="Tag value">3</span> | <span class ="tooltip" data-tooltip-text ="Tag value">28</span> |
| 2015-08-18T05:54:00Z | 2 | langstroth | 2 | 11 |
| 2015-08-18T06:00:00Z | 2 | langstroth | 1 | 10 |
| 2015-08-18T06:06:00Z | 2 | perpetua | 8 | 23 |
| 2015-08-18T06:12:00Z | 2 | perpetua | 7 | 22 |
> Now that `butterflies` and `honeybees` are tags, InfluxDB won't have to scan every one of their values when it performs the queries above - this means that your queries are even faster.
The <a name=measurement></a>_**measurement**_ acts as a container for tags, fields, and the `time` column, and the measurement name is the description of the data that are stored in the associated fields.
Measurement names are strings, and, for any SQL users out there, a measurement is conceptually similar to a table.
The only measurement in the sample data is `census`.
The name `census` tells us that the field values record the number of `butterflies` and `honeybees` - not their size, direction, or some sort of happiness index.
A single measurement can belong to different retention policies.
A <a name="retention-policy"></a>_**retention policy**_ describes how long InfluxDB keeps data (`DURATION`) and how many copies of this data are stored in the cluster (`REPLICATION`).
If you're interested in reading more about retention policies, check out [Database Management](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
{{% warn %}} Replication factors do not serve a purpose with single node instances.
{{% /warn %}}
In the sample data, everything in the `census` measurement belongs to the `autogen` retention policy.
InfluxDB automatically creates that retention policy; it has an infinite duration and a replication factor set to one.
Now that you're familiar with measurements, tag sets, and retention policies, let's discuss series.
In InfluxDB, a <a name=series></a>_**series**_ is a collection of points that share a measurement, tag set, and field key.
The data above consist of eight series:
| Series number | Measurement | Tag set | Field key |
|:------------------------ | ----------- | ------- | --------- |
| series 1 | `census` | `location = 1`,`scientist = langstroth` | `butterflies` |
| series 2 | `census` | `location = 2`,`scientist = langstroth` | `butterflies` |
| series 3 | `census` | `location = 1`,`scientist = perpetua` | `butterflies` |
| series 4 | `census` | `location = 2`,`scientist = perpetua` | `butterflies` |
| series 5 | `census` | `location = 1`,`scientist = langstroth` | `honeybees` |
| series 6 | `census` | `location = 2`,`scientist = langstroth` | `honeybees` |
| series 7 | `census` | `location = 1`,`scientist = perpetua` | `honeybees` |
| series 8 | `census` | `location = 2`,`scientist = perpetua` | `honeybees` |
Understanding the concept of a series is essential when designing your [schema](/enterprise_influxdb/v1.10/concepts/glossary/#schema) and when working with your data in InfluxDB.
A <a name="point"></a>_**point**_ represents a single data record that has four components: a measurement, tag set, field set, and a timestamp. A point is uniquely identified by its series and timestamp.
For example, here's a single point:
```
name: census
-----------------
time butterflies honeybees location scientist
2015-08-18T00:00:00Z 1 30 1 perpetua
```
The point in this example is part of series 3 and 7 and defined by the measurement (`census`), the tag set (`location = 1`, `scientist = perpetua`), the field set (`butterflies = 1`, `honeybees = 30`), and the timestamp `2015-08-18T00:00:00Z`.
All of the stuff we've just covered is stored in a database - the sample data are in the database `my_database`.
An InfluxDB <a name=database></a>_**database**_ is similar to traditional relational databases and serves as a logical container for users, retention policies, continuous queries, and, of course, your time series data.
See [Authentication and Authorization](/enterprise_influxdb/v1.10/administration/authentication_and_authorization/) and [Continuous Queries](/enterprise_influxdb/v1.10/query_language/continuous_queries/) for more on those topics.
Databases can have several users, continuous queries, retention policies, and measurements.
InfluxDB is a schemaless database which means it's easy to add new measurements, tags, and fields at any time.
It's designed to make working with time series data awesome.
You made it!
You've covered the fundamental concepts and terminology in InfluxDB.
If you're just starting out, we recommend taking a look at [Getting Started](/enterprise_influxdb/v1.10/introduction/getting_started/) and the [Writing Data](/enterprise_influxdb/v1.10/guides/writing_data/) and [Querying Data](/enterprise_influxdb/v1.10/guides/querying_data/) guides.
May our time series database serve you well 🕔.

View File

@ -0,0 +1,279 @@
---
title: InfluxDB schema design and data layout
description: >
General guidelines for InfluxDB schema design and data layout.
menu:
enterprise_influxdb_1_10:
name: Schema design and data layout
weight: 50
parent: Concepts
---
Each InfluxDB use case is unique and your [schema](/enterprise_influxdb/v1.10/concepts/glossary/#schema) reflects that uniqueness.
In general, a schema designed for querying leads to simpler and more performant queries.
We recommend the following design guidelines for most use cases:
- [Where to store data (tag or field)](#where-to-store-data-tag-or-field)
- [Avoid too many series](#avoid-too-many-series)
- [Use recommended naming conventions](#use-recommended-naming-conventions)
- [Shard Group Duration Management](#shard-group-duration-management)
## Where to store data (tag or field)
Your queries should guide what data you store in [tags](/enterprise_influxdb/v1.10/concepts/glossary/#tag) and what you store in [fields](/enterprise_influxdb/v1.10/concepts/glossary/#field):
- Store commonly-queried and grouping ([`group()`](/flux/v0.x/stdlib/universe/group) or [`GROUP BY`](/enterprise_influxdb/v1.10/query_language/explore-data/#group-by-tags)) metadata in tags.
- Store data in fields if each data point contains a different value.
- Store numeric values as fields ([tag values](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value) only support string values).
## Avoid too many series
InfluxDB indexes the following data elements to speed up reads:
- [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement)
- [tags](/enterprise_influxdb/v1.10/concepts/glossary/#tag)
[Tag values](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value) are indexed and [field values](/enterprise_influxdb/v1.10/concepts/glossary/#field-value) are not.
This means that querying by tags is more performant than querying by fields.
However, when too many indexes are created, both writes and reads may start to slow down.
Each unique set of indexed data elements forms a [series key](/enterprise_influxdb/v1.10/concepts/glossary/#series-key).
[Tags](/enterprise_influxdb/v1.10/concepts/glossary/#tag) containing highly variable information like unique IDs, hashes, and random strings lead to a large number of [series](/enterprise_influxdb/v1.10/concepts/glossary/#series), also known as high [series cardinality](/enterprise_influxdb/v1.10/concepts/glossary/#series-cardinality).
High series cardinality is a primary driver of high memory usage for many database workloads.
Therefore, to reduce memory consumption, consider storing high-cardinality values in field values rather than in tags or field keys.
{{% note %}}
If reads and writes to InfluxDB start to slow down, you may have high series cardinality (too many series).
See [how to find and reduce high series cardinality](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter).
{{% /note %}}
## Use recommended naming conventions
Use the following conventions when naming your tag and field keys:
- [Avoid reserved keywords in tag and field keys](#avoid-reserved-keywords-in-tag-and-field-keys)
- [Avoid the same tag and field name](#avoid-the-same-name-for-a-tag-and-a-field)
- [Avoid encoding data in measurements and keys](#avoid-encoding-data-in-measurements-and-keys)
- [Avoid more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
### Avoid reserved keywords in tag and field keys
Not required, but avoiding the use of reserved keywords in your tag keys and field keys simplifies writing queries because you won't have to wrap your keys in double quotes.
See [InfluxQL](https://github.com/influxdata/influxql/blob/master/README.md#keywords) and [Flux keywords](/{{< latest "flux" >}}/spec/lexical-elements/#keywords) to avoid.
Also, if a tag key or field key contains characters other than `[A-z,_]`, you must wrap it in double quotes in InfluxQL or use [bracket notation](/{{< latest "flux" >}}/data-types/composite/record/#bracket-notation) in Flux.
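For example, in InfluxQL a keyword or a key with special characters must be double quoted, while a plain key does not need quotes; the key names in this sketch are illustrative:

```
# "duration" is an InfluxQL keyword and "sensor-id" contains a hyphen, so both must be double quoted
> SELECT "duration" FROM "sessions" WHERE "sensor-id" = 'abc123'

# Plain keys need no quoting
> SELECT temp FROM weather_sensor WHERE region = 'north'
```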
### Avoid the same name for a tag and a field
Avoid using the same name for a tag and field key.
This often results in unexpected behavior when querying data.
If you inadvertently add the same name for a tag and a field, see
[Frequently asked questions](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#tag-and-field-key-with-the-same-name)
for information about how to query the data predictably and how to fix the issue.
### Avoid encoding data in measurements and keys
Store data in [tag values](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value) or [field values](/enterprise_influxdb/v1.10/concepts/glossary/#field-value), not in [tag keys](/enterprise_influxdb/v1.10/concepts/glossary/#tag-key), [field keys](/enterprise_influxdb/v1.10/concepts/glossary/#field-key), or [measurements](/enterprise_influxdb/v1.10/concepts/glossary/#measurement). If you design your schema to store data in tag and field values,
your queries will be easier to write and more efficient.
In addition, you'll keep cardinality low by not creating measurements and keys as you write data.
To learn more about the performance impact of high series cardinality, see [how to find and reduce high series cardinality](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter).
#### Compare schemas
Compare the following valid schemas represented by line protocol.
**Recommended**: the following schema stores metadata in separate `crop`, `plot`, and `region` tags. The `temp` field contains variable numeric data.
##### {id="good-measurements-schema"}
```
Good Measurements schema - Data encoded in tags (recommended)
-------------
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
```
**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the measurement, similar to Graphite metrics.
##### {id="bad-measurements-schema"}
```
Bad Measurements schema - Data encoded in the measurement (not recommended)
-------------
blueberries.plot-1.north temp=50.1 1472515200000000000
blueberries.plot-2.midwest temp=49.8 1472515200000000000
```
**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the field key.
##### {id="bad-keys-schema"}
```
Bad Keys schema - Data encoded in field keys (not recommended)
-------------
weather_sensor blueberries.plot-1.north.temp=50.1 1472515200000000000
weather_sensor blueberries.plot-2.midwest.temp=49.8 1472515200000000000
```
#### Compare queries
Compare the following queries of the [_Good Measurements_](#good-measurements-schema) and [_Bad Measurements_](#bad-measurements-schema) schemas.
The [Flux](/{{< latest "flux" >}}/) queries calculate the average `temp` for blueberries in the `north` region.
**Easy to query**: [_Good Measurements_](#good-measurements-schema) data is easily filtered by `region` tag values, as in the following example.
```js
// Query *Good Measurements*, data stored in separate tags (recommended)
from(bucket: "<database>/<retention_policy>")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
**Difficult to query**: [_Bad Measurements_](#bad-measurements-schema) requires regular expressions to extract `plot` and `region` from the measurement, as in the following example.
```js
// Query *Bad Measurements*, data encoded in the measurement (not recommended)
from(bucket: "<database>/<retention_policy>")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
|> mean()
```
Complex measurements make some queries impossible. For example, calculating the average temperature of both plots is not possible with the [_Bad Measurements_](#bad-measurements-schema) schema.
##### InfluxQL example to query schemas
```
# Query *Bad Measurements*, data encoded in the measurement (not recommended)
> SELECT mean("temp") FROM /\.north$/
# Query *Good Measurements*, data stored in separate tag values (recommended)
> SELECT mean("temp") FROM "weather_sensor" WHERE "region" = 'north'
```
### Avoid putting more than one piece of information in one tag
Splitting a single tag with multiple pieces into separate tags simplifies your queries and improves performance by
reducing the need for regular expressions.
Consider the following schema represented by line protocol.
#### Example line protocol schemas
```
Schema 1 - Multiple data encoded in a single tag
-------------
weather_sensor,crop=blueberries,location=plot-1.north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,location=plot-2.midwest temp=49.8 1472515200000000000
```
The Schema 1 data encodes multiple separate parameters, the `plot` and `region` into a long tag value (`plot-1.north`).
Compare this to the following schema represented in line protocol.
```
Schema 2 - Data encoded in multiple tags
-------------
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
```
Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region.
Schema 2 is preferable because, with multiple tags, you don't need a regular expression.
#### Flux example to query schemas
```js
// Schema 1 - Query for multiple data encoded in a single tag
from(bucket:"<database>/<retention_policy>")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.location =~ /\.north$/ and r._field == "temp")
|> mean()
// Schema 2 - Query for data encoded in multiple tags
from(bucket:"<database>/<retention_policy>")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
#### InfluxQL example to query schemas
```
# Schema 1 - Query for multiple data encoded in a single tag
> SELECT mean("temp") FROM "weather_sensor" WHERE location =~ /\.north$/
# Schema 2 - Query for data encoded in multiple tags
> SELECT mean("temp") FROM "weather_sensor" WHERE region = 'north'
```
## Shard group duration management
### Shard group duration overview
InfluxDB stores data in shard groups.
Shard groups are organized by [retention policy](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp) (RP) and store data with timestamps that fall within a specific time interval called the [shard duration](/enterprise_influxdb/v1.10/concepts/glossary/#shard-duration).
If no shard group duration is provided, the shard group duration is determined by the RP [duration](/enterprise_influxdb/v1.10/concepts/glossary/#duration) at the time the RP is created. The default values are:
| RP Duration | Shard Group Duration |
|---|---|
| < 2 days | 1 hour |
| >= 2 days and <= 6 months | 1 day |
| > 6 months | 7 days |
The shard group duration is also configurable per RP.
To configure the shard group duration, see [Retention Policy Management](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
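For example, a sketch of setting the shard group duration when creating an RP, or changing it afterward; the names are illustrative, and `ALTER` only affects shard groups created after the change:

```
> CREATE RETENTION POLICY "two_weeks" ON "my_database" DURATION 2w REPLICATION 2 SHARD DURATION 1d
> ALTER RETENTION POLICY "two_weeks" ON "my_database" SHARD DURATION 2d
```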
### Shard group duration tradeoffs
Determining the optimal shard group duration requires finding the balance between:
- Better overall performance with longer shards
- Flexibility provided by shorter shards
#### Long shard group duration
Longer shard group durations let InfluxDB store more data in the same logical location.
This reduces data duplication, improves compression efficiency, and improves query speed in some cases.
#### Short shard group duration
Shorter shard group durations allow the system to more efficiently drop data and record incremental backups.
When InfluxDB enforces an RP it drops entire shard groups, not individual data points, even if the points are older than the RP duration.
A shard group will only be removed once a shard group's duration *end time* is older than the RP duration.
For example, if your RP has a duration of one day, InfluxDB will drop an hour's worth of data every hour and will always have 25 shard groups: one for each hour in the day, plus an extra shard group that is partially expiring but isn't removed until the whole shard group is older than 24 hours.
>**Note:** A special use case to consider: filtering queries on schema data (such as tags, series, measurements) by time. For example, if you want to filter schema data within a one hour interval, you must set the shard group duration to 1h. For more information, see [filter schema data by time](/enterprise_influxdb/v1.10/query_language/explore-schema/#filter-meta-queries-by-time).
### Shard group duration recommendations
The default shard group durations work well for most cases. However, high-throughput or long-running instances will benefit from using longer shard group durations.
Here are some recommendations for longer shard group durations:
| RP Duration | Shard Group Duration |
|---|---|
| <= 1 day | 6 hours |
| > 1 day and <= 7 days | 1 day |
| > 7 days and <= 3 months | 7 days |
| > 3 months | 30 days |
| infinite | 52 weeks or longer |
> **Note:** `INF` (infinite) is not a [valid shard group duration](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
In extreme cases where data covers decades and will never be deleted, a long shard group duration like `1040w` (20 years) is perfectly valid.
Other factors to consider before setting shard group duration:
* Shard groups should be twice as long as the longest time range of the most frequent queries
* Shard groups should each contain more than 100,000 [points](/enterprise_influxdb/v1.10/concepts/glossary/#point) per shard group
* Shard groups should each contain more than 1,000 points per [series](/enterprise_influxdb/v1.10/concepts/glossary/#series)
#### Shard group duration for backfilling
Bulk insertion of historical data covering a large time range in the past will trigger the creation of a large number of shards at once.
The concurrent access and overhead of writing to hundreds or thousands of shards can quickly lead to slow performance and memory exhaustion.
When writing historical data, we highly recommend temporarily setting a longer shard group duration so fewer shards are created. Typically, a shard group duration of 52 weeks works well for backfilling.
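A sketch of this backfill workflow with illustrative names: temporarily lengthen the shard group duration, write the historical data, then restore a duration suited to incoming data. The new duration applies only to shard groups created after the change.

```
# Before backfilling, lengthen the shard group duration
> ALTER RETENTION POLICY "autogen" ON "my_database" SHARD DURATION 52w

# ...write the historical data...

# Afterward, restore a shorter duration for newly arriving data
> ALTER RETENTION POLICY "autogen" ON "my_database" SHARD DURATION 7d
```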

View File

@ -0,0 +1,438 @@
---
title: In-memory indexing and the Time-Structured Merge Tree (TSM)
description: >
InfluxDB storage engine, in-memory indexing, and the Time-Structured Merge Tree (TSM) in InfluxDB OSS.
menu:
enterprise_influxdb_1_10:
name: In-memory indexing with TSM
weight: 60
parent: Concepts
v2: /influxdb/v2.0/reference/internals/storage-engine/
---
## The InfluxDB storage engine and the Time-Structured Merge Tree (TSM)
The InfluxDB storage engine looks very similar to an LSM Tree.
It has a write ahead log and a collection of read-only data files which are similar in concept to SSTables in an LSM Tree.
TSM files contain sorted, compressed series data.
InfluxDB will create a [shard](/enterprise_influxdb/v1.10/concepts/glossary/#shard) for each block of time.
For example, if you have a [retention policy](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp) with an unlimited duration, shards will be created for each 7-day block of time.
Each of these shards maps to an underlying storage engine database.
Each of these databases has its own [WAL](/enterprise_influxdb/v1.10/concepts/glossary/#wal-write-ahead-log) and TSM files.
We'll dig into each of these parts of the storage engine.
## Storage engine
The storage engine ties a number of components together and provides the external interface for storing and querying series data. It is composed of a number of components that each serve a particular role:
* In-Memory Index - The in-memory index is a shared index across shards that provides quick access to [measurements](/enterprise_influxdb/v1.10/concepts/glossary/#measurement), [tags](/enterprise_influxdb/v1.10/concepts/glossary/#tag), and [series](/enterprise_influxdb/v1.10/concepts/glossary/#series). The index is used by the engine, but is not specific to the storage engine itself.
* WAL - The WAL is a write-optimized storage format that allows for writes to be durable, but not easily queryable. Writes to the WAL are appended to segments of a fixed size.
* Cache - The Cache is an in-memory representation of the data stored in the WAL. It is queried at runtime and merged with the data stored in TSM files.
* TSM Files - TSM files store compressed series data in a columnar format.
* FileStore - The FileStore mediates access to all TSM files on disk. It ensures that TSM files are installed atomically when existing ones are replaced as well as removing TSM files that are no longer used.
* Compactor - The Compactor is responsible for converting less optimized Cache and TSM data into more read-optimized formats. It does this by compressing series, removing deleted data, optimizing indices and combining smaller files into larger ones.
* Compaction Planner - The Compaction Planner determines which TSM files are ready for a compaction and ensures that multiple concurrent compactions do not interfere with each other.
* Compression - Compression is handled by various Encoders and Decoders for specific data types. Some encoders are fairly static and always encode the same type the same way; others switch their compression strategy based on the shape of the data.
* Writers/Readers - Each file type (WAL segment, TSM files, tombstones, etc.) has Writers and Readers for working with the formats.
### Write Ahead Log (WAL)
The WAL is organized as a bunch of files that look like `_000001.wal`.
The file numbers are monotonically increasing and referred to as WAL segments.
When a segment reaches 10MB in size, it is closed and a new one is opened. Each WAL segment stores multiple compressed blocks of writes and deletes.
When a write comes in, the new points are serialized, compressed using Snappy, and written to a WAL file.
The file is `fsync`'d and the data is added to an in-memory index before a success is returned.
This means that batching points together is required to achieve high throughput performance.
(Optimal batch size seems to be 5,000-10,000 points per batch for many use cases.)
Each entry in the WAL follows a [TLV standard](https://en.wikipedia.org/wiki/Type-length-value) with a single byte representing the type of entry (write or delete), a 4 byte `uint32` for the length of the compressed block, and then the compressed block.
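To illustrate the batching point above, here is a minimal sketch of a batched write over the HTTP `/write` endpoint (the database name and values are illustrative; real batches would typically contain thousands of points):
```bash
# Write three points in one request; precision=s means timestamps are in seconds
curl -i -XPOST 'http://localhost:8086/write?db=mydb&precision=s' --data-binary \
'cpu,host=server01 usage_idle=98.1 1667265600
cpu,host=server01 usage_idle=97.9 1667265610
cpu,host=server02 usage_idle=95.4 1667265600'
```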
### Cache
The Cache is an in-memory copy of all data points currently stored in the WAL.
The points are organized by the key, which is the measurement, [tag set](/enterprise_influxdb/v1.10/concepts/glossary/#tag-set), and unique [field](/enterprise_influxdb/v1.10/concepts/glossary/#field).
Each field is kept as its own time-ordered range.
The Cache data is not compressed while in memory.
Queries to the storage engine will merge data from the Cache with data from the TSM files.
Queries execute on a copy of the data that is made from the cache at query processing time.
This way writes that come in while a query is running won't affect the result.
Deletes sent to the Cache will clear out the given key or the specific time range for the given key.
The Cache exposes a few controls for snapshotting behavior.
The two most important controls are the memory limits.
There is a lower bound, [`cache-snapshot-memory-size`](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#cache-snapshot-memory-size), which when exceeded will trigger a snapshot to TSM files and remove the corresponding WAL segments.
There is also an upper bound, [`cache-max-memory-size`](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes#cache-max-memory-size-1g), which when exceeded will cause the Cache to reject new writes.
These configurations are useful to prevent out of memory situations and to apply back pressure to clients writing data faster than the instance can persist it.
The checks for memory thresholds occur on every write.
The other snapshot controls are time based.
The idle threshold, [`cache-snapshot-write-cold-duration`](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes#cache-snapshot-write-cold-duration), forces the Cache to snapshot to TSM files if it hasn't received a write within the specified interval.
The in-memory Cache is recreated on restart by re-reading the WAL files on disk.
### TSM files
TSM files are a collection of read-only files that are memory mapped.
The structure of these files looks very similar to an SSTable in LevelDB or other LSM Tree variants.
A TSM file is composed of four sections: header, blocks, index, and footer.
```
+--------+------------------------------------+-------------+--------------+
| Header | Blocks | Index | Footer |
|5 bytes | N bytes | N bytes | 4 bytes |
+--------+------------------------------------+-------------+--------------+
```
The Header is a magic number to identify the file type and a version number.
```
+-------------------+
| Header |
+-------------------+
| Magic │ Version |
| 4 bytes │ 1 byte |
+-------------------+
```
Blocks are sequences of pairs of CRC32 checksums and data.
The block data is opaque to the file.
The CRC32 is used for block level error detection.
The length of the blocks is stored in the index.
```
+--------------------------------------------------------------------+
│ Blocks │
+---------------------+-----------------------+----------------------+
| Block 1 | Block 2 | Block N |
+---------------------+-----------------------+----------------------+
| CRC | Data | CRC | Data | CRC | Data |
| 4 bytes | N bytes | 4 bytes | N bytes | 4 bytes | N bytes |
+---------------------+-----------------------+----------------------+
```
Following the blocks is the index for the blocks in the file.
The index is composed of a sequence of index entries ordered lexicographically by key and then by time.
The key includes the measurement name, tag set, and one field.
Multiple fields per point create multiple index entries in the TSM file.
Each index entry starts with a key length and the key, followed by the block type (float, int, bool, string) and a count of the number of index block entries that follow for that key.
Each index block entry is composed of the min and max time for the block, the offset into the file where the block is located and the size of the block. There is one index block entry for each block in the TSM file that contains the key.
The index structure can provide efficient access to all blocks as well as the ability to determine the cost associated with accessing a given key.
Given a key and timestamp, we can determine whether a file contains the block for that timestamp.
We can also determine where that block resides and how much data must be read to retrieve the block.
Knowing the size of the block, we can efficiently plan the IO needed to retrieve it.
```
+-----------------------------------------------------------------------------+
│ Index │
+-----------------------------------------------------------------------------+
│ Key Len │ Key │ Type │ Count │Min Time │Max Time │ Offset │ Size │...│
│ 2 bytes │ N bytes │1 byte│2 bytes│ 8 bytes │ 8 bytes │8 bytes │4 bytes │ │
+-----------------------------------------------------------------------------+
```
The last section is the footer that stores the offset of the start of the index.
```
+---------+
│ Footer │
+---------+
│Index Ofs│
│ 8 bytes │
+---------+
```
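To see these sections for a real file, the `influx_inspect dumptsm` utility can print the index and block summaries of a single TSM file; the path and flags below are illustrative and may vary by version:
```bash
# Print index entries and block summaries for one TSM file (path is illustrative)
influx_inspect dumptsm -index -blocks \
  /var/lib/influxdb/data/mydb/autogen/1/000000001-000000001.tsm
```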
### Compression
Each block is compressed to reduce storage space and disk IO when querying.
A block contains the timestamps and values for a given series and field.
Each block has a one-byte header, followed by the compressed timestamps and then the compressed values.
```
+--------------------------------------------------+
| Type | Len | Timestamps | Values |
|1 Byte | VByte | N Bytes | N Bytes │
+--------------------------------------------------+
```
The timestamps and values are compressed and stored separately using encodings dependent on the data type and its shape.
Storing them independently allows timestamp encoding to be used for all timestamps, while allowing different encodings for different field types.
For example, some points may be able to use run-length encoding whereas others may not.
Each value type also contains a 1-byte header indicating the type of compression for the remaining bytes.
The four high bits store the compression type and the four low bits are used by the encoder if needed.
#### Timestamps
Timestamp encoding is adaptive and based on the structure of the timestamps that are encoded.
It uses a combination of delta encoding, scaling, and compression using simple8b run-length encoding, as well as falling back to no compression if needed.
Timestamp resolution is variable but can be as granular as a nanosecond, requiring up to 8 bytes to store uncompressed.
During encoding, the values are first delta-encoded.
The first value is the starting timestamp and subsequent values are the differences from the prior value.
This usually converts the values into much smaller integers that are easier to compress.
Many timestamps are also monotonically increasing and fall on even boundaries of time such as every 10s.
When timestamps have this structure, they are scaled by the largest common divisor that is also a factor of 10.
This has the effect of converting very large integer deltas into smaller ones that compress even better.
Using these adjusted values, if all the deltas are the same, the time range is stored using run-length encoding.
If run-length encoding is not possible and all values are less than (1 << 60) - 1 ([~36.5 years](https://www.wolframalpha.com/input/?i=\(1+%3C%3C+60\)+-+1+nanoseconds+to+years) at nanosecond resolution), then the timestamps are encoded using [simple8b encoding](https://github.com/jwilder/encoding/tree/master/simple8b).
Simple8b encoding is a 64bit word-aligned integer encoding that packs multiple integers into a single 64bit word.
If any value exceeds the maximum, the deltas are stored uncompressed using 8 bytes each for the block.
Future encodings may use a patched scheme such as Patched Frame-Of-Reference (PFOR) to handle outliers more effectively.
#### Floats
Floats are encoded using an implementation of the [Facebook Gorilla paper](http://www.vldb.org/pvldb/vol8/p1816-teller.pdf).
The encoding XORs consecutive values together to produce a small result when the values are close together.
The delta is then stored using control bits to indicate how many leading and trailing zeroes are in the XOR value.
Our implementation removes the timestamp encoding described in the paper and only encodes the float values.
#### Integers
Integer encoding uses two different strategies depending on the range of values in the uncompressed data.
Values are first encoded using [ZigZag encoding](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers).
This interleaves positive and negative integers across a range of positive integers.
For example, [-2,-1,0,1] becomes [3,1,0,2].
See Google's [Protocol Buffers documentation](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers) for more information.
If all ZigZag encoded values are less than (1 << 60) - 1, they are compressed using simple8b encoding.
If any values are larger than the maximum then all values are stored uncompressed in the block.
If all values are identical, run-length encoding is used.
This works very well for values that are frequently constant.
#### Booleans
Booleans are encoded using a simple bit packing strategy where each Boolean uses 1 bit.
The number of Booleans encoded is stored using variable-byte encoding at the beginning of the block.
#### Strings
Strings are encoded using [Snappy](http://google.github.io/snappy/) compression.
Each string is packed consecutively and they are compressed as one larger block.
### Compactions
Compactions are recurring processes that migrate data stored in a write-optimized format into a more read-optimized format.
There are a number of stages of compaction that take place while a shard is hot for writes:
- **Snapshots** - Values in the Cache and WAL must be converted to TSM files to free memory and disk space used by the WAL segments.
These compactions occur based on the cache memory and time thresholds.
- **Level Compactions** - Level compactions (levels 1-4) occur as the TSM files grow.
TSM files are compacted from snapshots to level 1 files.
Multiple level 1 files are compacted to produce level 2 files.
The process continues until files reach level 4 (full compaction) and the max size for a TSM file.
They will not be compacted further unless deletes, index optimization compactions, or full compactions need to run.
Lower level compactions use strategies that avoid CPU-intensive activities like decompressing and combining blocks.
Higher level (and thus less frequent) compactions will re-combine blocks to fully compact them and increase the compression ratio.
- **Index Optimization** - When many level 4 TSM files accumulate, the internal indexes become larger and more costly to access.
An index optimization compaction splits the series and indices across a new set of TSM files, sorting all points for a given series into one TSM file.
Before an index optimization, each TSM file contains points for most or all series, and thus each contains the same series index.
After an index optimization, each TSM file contains points from a minimum of series and there is little series overlap between files.
Each TSM file thus has a smaller unique series index, instead of a duplicate of the full series list.
In addition, all points from a particular series are contiguous in a TSM file rather than spread across multiple TSM files.
- **Full Compactions** - Full compactions (level 4 compactions) run when a shard has become cold for writes for a long time, or when deletes have occurred on the shard.
Full compactions produce an optimal set of TSM files and include all optimizations from Level and Index Optimization compactions.
Once a shard is fully compacted, no other compactions will run on it unless new writes or deletes are stored.
### Writes
Writes are appended to the current WAL segment and are also added to the Cache.
Each WAL segment has a maximum size.
Writes roll over to a new file once the current file fills up.
The cache is also size bounded; snapshots are taken and WAL compactions are initiated when the cache becomes too full.
If the inbound write rate exceeds the WAL compaction rate for a sustained period, the cache may become too full, in which case new writes will fail until the snapshot process catches up.
When WAL segments fill up and are closed, the Compactor snapshots the Cache and writes the data to a new TSM file.
When the TSM file is successfully written and `fsync`'d, it is loaded and referenced by the FileStore.
### Updates
Updates (writing a newer value for a point that already exists) occur as normal writes.
Since cached values overwrite existing values, newer writes take precedence.
If a write would overwrite a point in a prior TSM file, the points are merged at query runtime and the newer write takes precedence.
### Deletes
Deletes occur by writing a delete entry to the WAL for the measurement or series and then updating the Cache and FileStore.
The Cache evicts all relevant entries.
The FileStore writes a tombstone file for each TSM file that contains relevant data.
These tombstone files are used at startup time to ignore blocks as well as during compactions to remove deleted entries.
Queries against partially deleted series are handled at query time until a compaction removes the data fully from the TSM files.
### Queries
When a query is executed by the storage engine, it is essentially a seek to a given time associated with a specific series key and field.
First, we do a search on the data files to find the files that contain a time range matching the query as well as containing matching series.
Once we have the data files selected, we next need to find the position in the file of the series key index entries.
We run a binary search against each TSM index to find the location of its index blocks.
In common cases the blocks will not overlap across multiple TSM files and we can search the index entries linearly to find the start block from which to read.
If there are overlapping blocks of time, the index entries are sorted to ensure newer writes will take precedence and that blocks can be processed in order during query execution.
When iterating over the index entries the blocks are read sequentially from the blocks section.
The block is decompressed and we seek to the specific point.
# The new InfluxDB storage engine: from LSM Tree to B+Tree and back again to create the Time Structured Merge Tree
Writing a new storage format should be a last resort.
So how did InfluxData end up writing our own engine?
InfluxData has experimented with many storage formats and found each lacking in some fundamental way.
The performance requirements for InfluxDB are significant, and eventually overwhelm other storage systems.
The 0.8 line of InfluxDB allowed multiple storage engines, including LevelDB, RocksDB, HyperLevelDB, and LMDB.
The 0.9 line of InfluxDB used BoltDB as the underlying storage engine.
This writeup is about the Time Structured Merge Tree storage engine that was released in 0.9.5 and is the only storage engine supported in InfluxDB 0.11+, including the entire 1.x family.
The properties of the time series data use case make it challenging for many existing storage engines.
Over the course of InfluxDB development, InfluxData tried a few of the more popular options.
We started with LevelDB, an engine based on LSM Trees, which are optimized for write throughput.
After that we tried BoltDB, an engine based on a memory mapped B+Tree, which is optimized for reads.
Finally, we ended up building our own storage engine that is similar in many ways to LSM Trees.
With our new storage engine we were able to achieve up to a 45x reduction in disk space usage from our B+Tree setup with even greater write throughput and compression than what we saw with LevelDB and its variants.
This post will cover the details of that evolution and end with an in-depth look at our new storage engine and its inner workings.
## Properties of time series data
The workload of time series data is quite different from normal database workloads.
There are a number of factors that conspire to make it very difficult to scale and remain performant:
* Billions of individual data points
* High write throughput
* High read throughput
* Large deletes (data expiration)
* Mostly an insert/append workload, very few updates
The first and most obvious problem is one of scale.
In DevOps, IoT, or APM it is easy to collect hundreds of millions or billions of unique data points every day.
For example, let's say we have 200 VMs or servers running, with each server collecting an average of 100 measurements every 10 seconds.
Given there are 86,400 seconds in a day, a single measurement will generate 8,640 points in a day per server.
That gives us a total of 172,800,000 (`200 * 100 * 8,640`) individual data points per day.
We find similar or larger numbers in sensor data use cases.
The volume of data means that the write throughput can be very high.
We regularly get requests for setups that can handle hundreds of thousands of writes per second.
Some larger companies will only consider systems that can handle millions of writes per second.
At the same time, time series data can be a high read throughput use case.
It's true that if you're tracking 700,000 unique metrics or time series you can't hope to visualize all of them.
That leads many people to think that you don't actually read most of the data that goes into the database.
However, other than dashboards that people have up on their screens, there are automated systems for monitoring or combining the large volume of time series data with other types of data.
Inside InfluxDB, aggregate functions calculated on the fly may combine tens of thousands of distinct time series into a single view.
Each one of those queries must read each aggregated data point, so for InfluxDB the read throughput is often many times higher than the write throughput.
Given that time series is mostly an append-only workload, you might think that it's possible to get great performance on a B+Tree.
Appends in the keyspace are efficient and you can achieve more than 100,000 per second.
However, we have those appends happening in individual time series.
So the inserts end up looking more like random inserts than append-only inserts.
One of the biggest problems we found with time series data is that it's very common to delete all data after it gets past a certain age.
The common pattern here is that users have high precision data that is kept for a short period of time like a few days or months.
Users then downsample and aggregate that data into lower precision rollups that are kept around much longer.
The naive implementation would be to simply delete each record once it passes its expiration time.
However, that means that once the first points written reach their expiration date, the system is processing just as many deletes as writes, which is something most storage engines aren't designed for.
Let's dig into the details of the two types of storage engines we tried and how these properties had a significant impact on our performance.
## LevelDB and log structured merge trees
When the InfluxDB project began, we picked LevelDB as the storage engine because we had used it for time series data storage in the product that was the precursor to InfluxDB.
We knew that it had great properties for write throughput and everything seemed to "just work".
LevelDB is an implementation of a log structured merge tree (LSM tree) that was built as an open source project at Google.
It exposes an API for a key-value store where the key space is sorted.
This last part is important for time series data as it allowed us to quickly scan ranges of time as long as the timestamp was in the key.
LSM Trees are based on a log that takes writes and two structures known as Mem Tables and SSTables.
These tables represent the sorted keyspace.
SSTables are read only files that are continuously replaced by other SSTables that merge inserts and updates into the keyspace.
The two biggest advantages that LevelDB had for us were high write throughput and built in compression.
However, as we learned more about what people needed with time series data, we encountered a few insurmountable challenges.
The first problem we had was that LevelDB doesn't support hot backups.
If you want to do a safe backup of the database, you have to close it and then copy it.
The LevelDB variants RocksDB and HyperLevelDB fix this problem, but there was another more pressing problem that we didn't think they could solve.
Our users needed a way to automatically manage data retention.
That meant we needed deletes on a very large scale.
In LSM Trees, a delete is as expensive, if not more so, than a write.
A delete writes a new record known as a tombstone.
After that queries merge the result set with any tombstones to purge the deleted data from the query return.
Later, a compaction runs that removes the tombstone record and the underlying deleted record in the SSTable file.
To get around doing deletes, we split data across what we call shards, which are contiguous blocks of time.
Shards would typically hold either one day or seven days worth of data.
Each shard mapped to an underlying LevelDB.
This meant that we could drop an entire day of data by just closing out the database and removing the underlying files.
Users of RocksDB may at this point bring up a feature called ColumnFamilies.
When putting time series data into Rocks, it's common to split blocks of time into column families and then drop those when their time is up.
It's the same general idea: create a separate area where you can just drop files instead of updating indexes when you delete a large block of data.
Dropping a column family is a very efficient operation.
However, column families are a fairly new feature and we had another use case for shards.
Organizing data into shards meant that it could be moved within a cluster without having to examine billions of keys.
At the time of this writing, it was not possible to move a column family in one RocksDB to another.
Old shards are typically cold for writes so moving them around would be cheap and easy.
We would have the added benefit of having a spot in the keyspace that is cold for writes so it would be easier to do consistency checks later.
The organization of data into shards worked great for a while, until a large amount of data went into InfluxDB.
LevelDB splits the data out over many small files.
Having dozens or hundreds of these databases open in a single process ended up creating a big problem.
Users that had six months or a year of data would run out of file handles.
It's not something we found with the majority of users, but anyone pushing the database to its limits would hit this problem and we had no fix for it.
There were simply too many file handles open.
## BoltDB and mmap B+Trees
After struggling with LevelDB and its variants for a year we decided to move over to BoltDB, a pure Golang database heavily inspired by LMDB, a mmap B+Tree database written in C.
It has the same API semantics as LevelDB: a key value store where the keyspace is ordered.
Many of our users were surprised.
Our own posted tests of the LevelDB variants vs. LMDB (a mmap B+Tree) showed RocksDB as the best performer.
However, there were other considerations that went into this decision outside of the pure write performance.
At this point our most important goal was to get to something stable that could be run in production and backed up.
BoltDB also had the advantage of being written in pure Go, which simplified our build chain immensely and made it easy to build for other OSes and platforms.
The biggest win for us was that BoltDB used a single file as the database.
At this point our most common source of bug reports were from people running out of file handles.
Bolt solved the hot backup problem and the file limit problems all at the same time.
We were willing to take a hit on write throughput if it meant that we'd have a system that was more reliable and stable that we could build on.
Our reasoning was that for anyone pushing really big write loads, they'd be running a cluster anyway.
We released versions 0.9.0 to 0.9.2 based on BoltDB.
From a development perspective it was delightful.
Clean API, fast and easy to build in our Go project, and reliable.
However, after running for a while we found a big problem with write throughput.
After the database got over a few GB, writes would start spiking IOPS.
Some users were able to get past this by putting InfluxDB on big hardware with near unlimited IOPS.
However, most users are on VMs with limited resources in the cloud.
We had to figure out a way to reduce the impact of writing a bunch of points into hundreds of thousands of series at a time.
With the 0.9.3 and 0.9.4 releases our plan was to put a write ahead log (WAL) in front of Bolt.
That way we could reduce the number of random insertions into the keyspace.
Instead, we'd buffer up multiple writes that were next to each other and then flush them at once.
However, that only served to delay the problem.
High IOPS still became an issue and it showed up very quickly for anyone operating at even moderate workloads.
However, our experience building the first WAL implementation in front of Bolt gave us the confidence we needed that the write problem could be solved.
The performance of the WAL itself was fantastic, but the index simply could not keep up.
At this point we started thinking again about how we could create something similar to an LSM Tree that could keep up with our write load.
Thus was born the Time Structured Merge Tree.

View File

@ -0,0 +1,51 @@
---
title: Time Series Index (TSI) overview
description: >
The Time Series Index (TSI) storage engine supports high cardinality in time series data.
menu:
enterprise_influxdb_1_10:
name: Time Series Index (TSI) overview
weight: 70
parent: Concepts
---
Find overview and background information on Time Series Index (TSI) in this topic. For detail, including how to enable and configure TSI, see [Time Series Index (TSI) details](/enterprise_influxdb/v1.10/concepts/tsi-details/).
## Overview
To support a large number of time series, that is, a very high cardinality in the number of unique time series that the database stores, InfluxData has added the new Time Series Index (TSI).
InfluxData supports customers using InfluxDB with tens of millions of time series.
InfluxData's goal, however, is to expand to hundreds of millions, and eventually billions.
Using InfluxData's TSI storage engine, users should be able to have millions of unique time series.
The goal is that the number of series should be unbounded by the amount of memory on the server hardware.
Importantly, the number of series that exist in the database will have a negligible impact on database startup time.
This work represents the most significant technical advancement in the database since InfluxData released the Time Series Merge Tree (TSM) storage engine in 2016.
## Background information
InfluxDB actually looks like two databases in one: a time series data store and an inverted index for the measurement, tag, and field metadata.
### Time-Structured Merge Tree (TSM)
The Time-Structured Merge Tree (TSM) engine solves the problem of getting maximum throughput, compression, and query speed for raw time series data.
Up until TSI, the inverted index was an in-memory data structure that was built during startup of the database based on the data in TSM.
This meant that for every measurement, tag key-value pair, and field name, there was a lookup table in-memory to map those bits of metadata to an underlying time series.
For users with a high number of ephemeral series, memory utilization continued increasing as new time series were created.
And, startup times increased since all of that data would have to be loaded onto the heap at start time.
> For details, see [TSM-based data storage and in-memory indexing](/enterprise_influxdb/v1.10/concepts/storage_engine/).
### Time Series Index (TSI)
The new time series index (TSI) moves the index to files on disk that we memory map.
This means that we let the operating system handle being the Least Recently Used (LRU) memory.
Much like the TSM engine for raw time series data we have a write-ahead log with an in-memory structure that gets merged at query time with the memory-mapped index.
Background routines run constantly to compact the index into larger and larger files to avoid having to do too many index merges at query time.
Under the covers, we're using techniques like Robin Hood Hashing to do fast index lookups and HyperLogLog++ to keep sketches of cardinality estimates.
The latter will give us the ability to add things to the query language like the [SHOW CARDINALITY](/enterprise_influxdb/v1.10/query_language/spec#show-cardinality) queries.
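In recent 1.x releases these estimates are already exposed through InfluxQL; a minimal sketch (the database name is illustrative):
```bash
# Estimated number of unique series and measurements in a database
influx -database 'mydb' -execute 'SHOW SERIES CARDINALITY'
influx -database 'mydb' -execute 'SHOW MEASUREMENT CARDINALITY'
```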
### Issues solved by TSI and remaining to be solved
The primary issue that Time Series Index (TSI) addresses is ephemeral time series. Most frequently, this occurs in use cases that want to track per-process or per-container metrics by putting identifiers in tags. For example, the [Heapster project for Kubernetes](https://github.com/kubernetes/heapster) does this. Series that are no longer hot for writes or queries won't take up space in memory.
The issue that TSI does not yet address for these use cases is limiting the scope of data returned by the SHOW queries. We'll have updates to the query language in the future to limit those results by time. We also don't solve the problem of having all these series hot for reads and writes. For that problem, scale-out clustering is the solution. We'll have to continue to optimize the query language and engine to work with large sets of series. We'll need to add guard rails and limits into the language and, eventually, add spill-to-disk query processing. That work will be ongoing in every release of InfluxDB.

View File

@ -0,0 +1,168 @@
---
title: Time Series Index (TSI) details
description: Enable and understand the Time Series Index (TSI).
menu:
enterprise_influxdb_1_10:
name: Time Series Index (TSI) details
weight: 80
parent: Concepts
---
When InfluxDB ingests data, we store not only the value but we also index the measurement and tag information so that it can be queried quickly.
In earlier versions, index data could only be stored in-memory, however, that requires a lot of RAM and places an upper bound on the number of series a machine can hold.
This upper bound is usually somewhere between 1 and 4 million series, depending on the machine used.
The Time Series Index (TSI) was developed to allow us to go past that upper bound.
TSI stores index data on disk so that we are no longer restricted by RAM.
TSI uses the operating system's page cache to pull hot data into memory and let cold data rest on disk.
## Enable TSI
To enable TSI, set the following line in the InfluxDB configuration file (`influxdb.conf`):
```
index-version = "tsi1"
```
(Be sure to include the double quotes.)
### InfluxDB Enterprise
- To convert your data nodes to support TSI, see [Upgrade InfluxDB Enterprise clusters](/enterprise_influxdb/v1.10/administration/upgrading/).
- For detail on configuration, see [Configure InfluxDB Enterprise clusters](/enterprise_influxdb/v1.10/administration/configuration/).
## Tooling
### `influx_inspect dumptsi`
If you are troubleshooting an issue with an index, you can use the `influx_inspect dumptsi` command.
This command allows you to print summary statistics on an index, file, or a set of files.
This command only works on one index at a time.
For details on this command, see [influx_inspect dumptsi](/enterprise_influxdb/v1.10/tools/influx_inspect/#dumptsi).
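A minimal sketch (paths are illustrative; recent 1.x versions also require pointing at the shared series file):
```bash
# Summarize a shard's TSI index; adjust paths for your data directory layout
influx_inspect dumptsi -series-file /var/lib/influxdb/data/mydb/_series \
  /var/lib/influxdb/data/mydb/autogen/1/index
```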
### `influx_inspect buildtsi`
If you want to convert an existing shard from an in-memory index to a TSI index, or if you have an existing TSI index which has become corrupt, you can use the `buildtsi` command to create the index from the underlying TSM data.
If you have an existing TSI index that you want to rebuild, first delete the `index` directory within your shard.
This command works at the server-level but you can optionally add database, retention policy and shard filters to only apply to a subset of shards.
For details on this command, see [influx inspect buildtsi](/enterprise_influxdb/v1.10/tools/influx_inspect/#buildtsi).
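A minimal sketch of rebuilding the index for a single database (stop `influxd` first; paths and the database name are illustrative):
```bash
influx_inspect buildtsi \
  -datadir /var/lib/influxdb/data \
  -waldir /var/lib/influxdb/wal \
  -database mydb
```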
## Understanding TSI
### File organization
TSI (Time Series Index) is a log-structured merge tree-based database for InfluxDB series data.
TSI is composed of several parts:
* **Index**: Contains the entire index dataset for a single shard.
* **Partition**: Contains a sharded partition of the data for a shard.
* **LogFile**: Contains newly written series as an in-memory index and is persisted as a WAL.
* **IndexFile**: Contains an immutable, memory-mapped index built from a LogFile or merged from two contiguous index files.
There is also a **SeriesFile** which contains a set of all series keys across the entire database.
Each shard within the database shares the same series file.
### Writes
The following occurs when a write comes into the system:
1. Series is added to the series file or is looked up if it already exists. This returns an auto-incrementing series ID.
2. The series is sent to the Index. The index maintains a roaring bitmap of existing series IDs and ignores series that have already been created.
3. The series is hashed and sent to the appropriate Partition.
4. The Partition writes the series as an entry to the LogFile.
5. The LogFile writes the series to a write-ahead log file on disk and adds the series to a set of in-memory indexes.
### Compaction
Once the LogFile exceeds a threshold (5MB), a new active log file is created and the previous one begins compacting into an IndexFile.
This first index file is at level 1 (L1).
The log file is considered level 0 (L0).
Index files can also be created by merging two smaller index files together.
For example, if two contiguous L1 index files exist, they can be merged into an L2 index file.
### Reads
The index provides several API calls for retrieving sets of data such as:
* `MeasurementIterator()`: Returns a sorted list of measurement names.
* `TagKeyIterator()`: Returns a sorted list of tag keys in a measurement.
* `TagValueIterator()`: Returns a sorted list of tag values for a tag key.
* `MeasurementSeriesIDIterator()`: Returns a sorted list of all series IDs for a measurement.
* `TagKeySeriesIDIterator()`: Returns a sorted list of all series IDs for a tag key.
* `TagValueSeriesIDIterator()`: Returns a sorted list of all series IDs for a tag value.
These iterators are all composable using several merge iterators.
For each type of iterator (measurement, tag key, tag value, series id), there are multiple merge iterator types:
* **Merge**: Deduplicates items from two iterators.
* **Intersect**: Returns only items that exist in two iterators.
* **Difference**: Returns only items from the first iterator that don't exist in the second iterator.
For example, a query with a WHERE clause of `region != 'us-west'` that operates across two shards will construct a set of iterators like this:
```
DifferenceSeriesIDIterators(
MergeSeriesIDIterators(
Shard1.MeasurementSeriesIDIterator("m"),
Shard2.MeasurementSeriesIDIterator("m"),
),
MergeSeriesIDIterators(
Shard1.TagValueSeriesIDIterator("m", "region", "us-west"),
Shard2.TagValueSeriesIDIterator("m", "region", "us-west"),
),
)
```
### Log File Structure
The log file is simply structured as a list of LogEntry objects written to disk in sequential order. Log files are written until they reach 5MB and then they are compacted into index files.
The entry objects in the log can be of any of the following types:
* AddSeries
* DeleteSeries
* DeleteMeasurement
* DeleteTagKey
* DeleteTagValue
The in-memory index on the log file tracks the following:
* Measurements by name
* Tag keys by measurement
* Tag values by tag key
* Series by measurement
* Series by tag value
* Tombstones for series, measurements, tag keys, and tag values.
The log file also maintains bitsets for series ID existence and tombstones.
These bitsets are merged with other log files and index files to regenerate the full index bitset on startup.
### Index File Structure
The index file is an immutable file that tracks similar information to the log file, but all data is indexed and written to disk so that it can be directly accessed from a memory-map.
The index file has the following sections:
* **TagBlocks:** Maintains an index of tag values for a single tag key.
* **MeasurementBlock:** Maintains an index of measurements and their tag keys.
* **Trailer:** Stores offset information for the file as well as HyperLogLog sketches for cardinality estimation.
### Manifest
The MANIFEST file is stored in the index directory and lists all the files that belong to the index and the order in which they should be accessed.
This file is updated every time a compaction occurs.
Any files that are in the directory but not listed in the MANIFEST are index files that are in the process of being compacted.
### FileSet
A file set is an in-memory snapshot of the manifest that is obtained while the InfluxDB process is running.
This is required to provide a consistent view of the index at a point-in-time.
The file set also facilitates reference counting for all of its files so that no file will be deleted via compaction until all readers of the file are done with it.

View File

@ -0,0 +1,79 @@
---
title: InfluxDB Enterprise features
description: Users, clustering, and other InfluxDB Enterprise features.
aliases:
- /enterprise/v1.8/features/
menu:
enterprise_influxdb_1_10:
name: Enterprise features
weight: 60
---
InfluxDB Enterprise has additional capabilities that enhance
[availability](#clustering),
[scalability](#clustering), and
[security](#security),
and provide [eventual consistency](#eventual-consistency).
## Clustering
InfluxDB Enterprise runs on a network of independent servers, a *cluster*,
to provide fault tolerance, availability, and horizontal scalability of the database.
While many InfluxDB Enterprise features are available
when run with a single meta node and a single data node, this configuration does not take advantage of the clustering capability
or ensure high availability.
Nodes can be added to an existing cluster to improve database performance for querying and writing data.
Certain configurations (for example, 3 meta nodes and 2 data nodes) provide high-availability assurances
while making certain tradeoffs in query performance when compared to a single node.
Further increasing the number of nodes can improve performance in both respects.
For example, a cluster with 4 data nodes and a [replication factor](https://docs.influxdata.com/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor)
of 2 can support a higher volume of write traffic than a single node could.
It can also support a higher *query* workload, as the data is replicated
in two locations. Performance of the queries may be on par with a single
node in cases where the query can be answered directly by the node which
receives the query.
For more information on clustering, see [Clustering in InfluxDB Enterprise](/enterprise_influxdb/v1.10/concepts/clustering/).
## Security
Enterprise authorization uses an expanded set of [*16 user permissions and roles*](/enterprise_influxdb/v1.10/features/users/).
(InfluxDB OSS only has `READ` and `WRITE` permissions.)
Administrators can give users permission to read and write to databases,
create and remove databases, rebalance a cluster, and manage particular resources.
Organizations can automate managing permissions with the [InfluxDB Enterprise Meta API](/enterprise_influxdb/v1.10/administration/manage/security/authentication_and_authorization-api/).
[Fine-grained authorization](/enterprise_influxdb/v1.10/guides/fine-grained-authorization/)
for particular data is also available.
InfluxDB Enterprise can also use [LDAP for managing authentication](/enterprise_influxdb/v1.10/administration/manage/security/ldap/).
For FIPS compliance, InfluxDB Enterprise password hashing algorithms are configurable.
{{% note %}}
Kapacitor OSS can also delegate its LDAP and security setup to InfluxDB Enterprise.
For details, see ["Set up InfluxDB Enterprise authorizations"](/{{< latest "kapacitor" >}}/administration/auth/influxdb-enterprise-auth/).
{{% /note %}}
## Eventual consistency
### Hinted handoff
Hinted handoff (HH) is how InfluxDB Enterprise deals with data node outages while writes are happening.
HH is essentially a durable, disk-based queue.
For more information, see ["Hinted handoff"](/enterprise_influxdb/v1.10/concepts/clustering/#hinted-handoff).
### Anti-entropy
Anti-entropy is an optional service to eliminate edge cases related to cluster consistency.
For more information, see ["Use Anti-Entropy service in InfluxDB Enterprise"](/enterprise_influxdb/v1.10/administration/anti-entropy/).
---
{{< children hlevel="h3" >}}

View File

@ -0,0 +1,152 @@
---
title: InfluxDB Enterprise cluster features
description: Overview of features related to InfluxDB Enterprise clustering.
aliases:
- /enterprise/v1.8/features/clustering-features/
menu:
enterprise_influxdb_1_10:
name: Cluster features
weight: 20
parent: Enterprise features
---
{{% note %}}
For an overview of InfluxDB Enterprise security features,
see ["InfluxDB Enterprise features - Security"](/enterprise_influxdb/v1.10/features/#security).
To secure your InfluxDB Enterprise cluster, see
["Configure security"](/enterprise_influxdb/v1.10/administration/configure/security/).
{{% /note %}}
## Entitlements
A valid license key is required in order to start `influxd-meta` or `influxd`.
License keys restrict the number of data nodes that can be added to a cluster as well as the number of CPU cores a data node can use.
Without a valid license, the process will abort startup.
Access your license expiration date with the `/debug/vars` endpoint.
{{< keep-url >}}
```sh
$ curl http://localhost:8086/debug/vars | jq '.entitlements'
{
"name": "entitlements",
"tags": null,
"values": {
"licenseExpiry": "2022-02-15T00:00:00Z",
"licenseType": "license-key"
}
}
```
{{% caption %}}
This example uses `curl` and [`jq`](https://stedolan.github.io/jq/).
{{% /caption %}}
## Query management
Query management works cluster wide. Specifically, `SHOW QUERIES` and `KILL QUERY <ID>` on `"<host>"` can be run on any data node. `SHOW QUERIES` will report all queries running across the cluster and the node which is running the query.
`KILL QUERY` can abort queries running on the local node or any other remote data node. For details on using the `SHOW QUERIES` and `KILL QUERY` on InfluxDB Enterprise clusters,
see [Query Management](/enterprise_influxdb/v1.10/troubleshooting/query_management/).
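For example (the query ID and host below are illustrative):
```bash
# List queries running anywhere in the cluster
influx -execute 'SHOW QUERIES'

# Kill a query running on a specific data node
influx -execute 'KILL QUERY 36 ON "data-node-02:8088"'
```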
## Subscriptions
Subscriptions used by Kapacitor work in a cluster. Writes to any node will be forwarded to subscribers across all supported subscription protocols.
## Continuous queries
### Configuration and operational considerations on a cluster
It is important to understand how to configure InfluxDB Enterprise and how this impacts the continuous query (CQ) engine's behavior:
- **Data node configuration** `[continuous queries]`
[run-interval](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#run-interval)
-- The interval at which InfluxDB checks to see if a CQ needs to run. Set this option to the lowest interval
at which your CQs run. For example, if your most frequent CQ runs every minute, set run-interval to 1m.
- **Meta node configuration** `[meta]`
[lease-duration](/enterprise_influxdb/v1.10/administration/configure/config-meta-nodes/#lease-duration)
-- The default duration of the leases that data nodes acquire from the meta nodes. Leases automatically expire after the
lease-duration is met. Leases ensure that only one data node is running something at a given time. For example, Continuous
Queries use a lease so that all data nodes aren't running the same CQs at once.
- **Execution time of CQs** CQs are sequentially executed. Depending on the amount of work that they need to accomplish
in order to complete, the configuration parameters mentioned above can have an impact on the observed behavior of CQs.
The CQ service is running on every node, but only a single node is granted exclusive access to execute CQs at any one time.
However, every time the `run-interval` elapses (and assuming a node isn't currently executing CQs), a node attempts to
acquire the CQ lease. By default the `run-interval` is one second so the data nodes are aggressively checking to see
if they can acquire the lease. On clusters where all CQs execute in an amount of time less than `lease-duration`
(default is 1m), there's a good chance that the first data node to acquire the lease will still hold the lease when
the `run-interval` elapses. Other nodes will be denied the lease and when the node holding the lease requests it again,
the lease is renewed with the expiration extended to `lease-duration`. So in a typical situation, we observe that a
single data node acquires the CQ lease and holds on to it. It effectively becomes the executor of CQs until it is
recycled (for any reason).
Now consider the following case: CQs take longer to execute than the `lease-duration`, so when the lease expires,
another data node requests and is granted the lease about a second later. The original holder of the lease is still busily working
through the list of CQs it was originally handed, while the data node now holding the lease begins
executing CQs from the top of the list.
Based on this scenario, it may appear that CQs are "executing in parallel" because multiple data nodes are
essentially "rolling" sequentially through the registered CQs and the lease is rolling from node to node.
The "long pole" here is effectively your most complex CQ, and it likely means that at some point all nodes
are attempting to execute that same complex CQ (and likely competing for resources as they overwrite points
generated by that query on each node that is executing it, likely with some phased offset).
To avoid this behavior (which is desirable because it reduces the overall load on your cluster),
set the `lease-duration` to a value greater than the aggregate execution time for ALL the CQs that you are running.
Based on the current way in which CQs are configured to execute, the way to address parallelism is by using
Kapacitor for the more complex CQs that you are attempting to run.
[See Kapacitor as a continuous query engine](/{{< latest "kapacitor" >}}/guides/continuous_queries/).
However, you can keep the simpler, highly performant CQs within the database,
as long as the lease duration is greater than their aggregate execution time, so that
"extra" load is not unnecessarily introduced on your cluster.
## PProf endpoints
Meta nodes expose the `/debug/pprof` endpoints for profiling and troubleshooting.
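For example, assuming the default meta node HTTP bind address on port 8091, the standard Go pprof endpoints can be queried directly (a sketch):
```bash
# Grab a heap profile from a meta node
curl -o meta-heap.pprof http://localhost:8091/debug/pprof/heap

# Or analyze it interactively with the Go toolchain
go tool pprof http://localhost:8091/debug/pprof/heap
```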
## Shard movement
* [Copy shard](/enterprise_influxdb/v1.10/tools/influxd-ctl/#copy-shard) support - copy a shard from one node to another
* [Copy shard status](/enterprise_influxdb/v1.10/tools/influxd-ctl/#copy-shard-status) - query the status of a copy shard request
* [Kill copy shard](/enterprise_influxdb/v1.10/tools/influxd-ctl/#kill-copy-shard) - kill a running shard copy
* [Remove shard](/enterprise_influxdb/v1.10/tools/influxd-ctl/#remove-shard) - remove a shard from a node (this deletes data)
* [Truncate shards](/enterprise_influxdb/v1.10/tools/influxd-ctl/#truncate-shards) - truncate all active shard groups and start new shards immediately (This is useful when adding nodes or changing replication factors.)
This functionality is exposed via an API on the meta service and through [`influxd-ctl` sub-commands](/enterprise_influxdb/v1.10/tools/influxd-ctl/).
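A minimal sketch of these sub-commands (node addresses and the shard ID are illustrative):
```bash
# Copy shard 31 from one data node to another
influxd-ctl copy-shard data-node-01:8088 data-node-02:8088 31

# Check the status of in-progress shard copies
influxd-ctl copy-shard-status

# Truncate active shard groups so new writes land in newly created shards
influxd-ctl truncate-shards
```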
## OSS conversion
Importing an OSS single server as the first data node is supported.
See [OSS to cluster migration](/enterprise_influxdb/v1.10/guides/migration/) for
step-by-step instructions.
## Query routing
The query engine skips failed nodes that hold a shard needed for queries.
If there is a replica on another node, it will retry on that node.
## Backup and restore
InfluxDB Enterprise clusters support backup and restore functionality starting with
version 0.7.1.
See [Backup and restore](/enterprise_influxdb/v1.10/administration/backup-and-restore/) for
more information.
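As a sketch (the database name and paths are illustrative):
```bash
# Back up a single database to a local directory
influxd-ctl backup -db mydb ./mydb-backup

# Restore it into a new database on the cluster
influxd-ctl restore -db mydb -newdb mydb_restored ./mydb-backup
```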
## Passive node setup (experimental)
Passive nodes act as load balancers--they accept write calls, perform shard lookup and RPC calls (on active data nodes), and distribute writes to active data nodes. They do not own shards or accept writes.
Use this feature when you have a replication factor (RF) of 2 or more and your CPU usage is consistently above 80 percent. Using the passive feature lets you scale a cluster when you can no longer vertically scale. Especially useful if you experience a large amount of hinted handoff growth. The passive node writes the hinted handoff queue to its own disk, and then communicates periodically with the appropriate node until it can send the queue contents there.
Best practices when using an active-passive node setup:
- Use when you have a large cluster setup, generally 8 or more nodes.
- Keep the ratio of active to passive nodes between 1:1 and 2:1.
- Passive nodes should receive all writes.
For more information, see how to [add a passive node to a cluster](/enterprise_influxdb/v1.10/tools/influxd-ctl/#add-a-passive-node-to-the-cluster).
{{% note %}}
**Note:** This feature is experimental and available only in InfluxDB Enterprise.
{{% /note %}}

View File

@ -0,0 +1,35 @@
---
title: Flux data scripting language
description: >
Flux is a functional data scripting language designed for querying, analyzing, and acting on time series data.
menu:
enterprise_influxdb_1_10:
name: Flux
weight: 71
v2: /influxdb/v2.0/query-data/get-started/
---
Flux is a functional data scripting language designed for querying, analyzing, and acting on time series data.
It takes the power of [InfluxQL](/enterprise_influxdb/v1.10/query_language/spec/) and the functionality of [TICKscript](/{{< latest "kapacitor" >}}/tick/introduction/) and combines them into a single, unified syntax.
> Flux v0.65 is production-ready and included with [InfluxDB v1.8](/enterprise_influxdb/v1.10).
> The InfluxDB v1.8 implementation of Flux is read-only and does not support
> writing data back to InfluxDB.
## Flux design principles
Flux is designed to be usable, readable, flexible, composable, testable, contributable, and shareable.
Its syntax is largely inspired by [2018's most popular scripting language](https://insights.stackoverflow.com/survey/2018#technology),
JavaScript, and takes a functional approach to data exploration and processing.
The following example illustrates pulling data from a bucket (similar to an InfluxQL database) for the last hour,
filtering that data by the `cpu` measurement and the `cpu=cpu-total` tag, windowing the data in 1-minute intervals,
and calculating the average of each window:
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r.cpu == "cpu-total")
|> aggregateWindow(every: 1m, fn: mean)
```
{{< children >}}

View File

@ -0,0 +1,157 @@
---
title: Execute Flux queries
description: Use the InfluxDB CLI, API, and the Chronograf Data Explorer to execute Flux queries.
menu:
enterprise_influxdb_1_10:
name: Execute Flux queries
parent: Flux
weight: 1
aliases:
- /enterprise_influxdb/v1.10/flux/guides/executing-queries/
- /enterprise_influxdb/v1.10/flux/guides/execute-queries/
v2: /influxdb/v2.0/query-data/execute-queries/
---
There are multiple ways to execute Flux queries with InfluxDB Enterprise and Chronograf v1.8+.
This guide covers the different options:
1. [Chronograf's Data Explorer](#chronograf-s-data-explorer)
2. [Influx CLI](#influx-cli)
3. [InfluxDB API](#influxdb-api)
> Before attempting these methods, make sure Flux is enabled by setting
> `flux-enabled = true` in the `[http]` section of your InfluxDB configuration file.
## Chronograf's Data Explorer
Chronograf v1.8+ supports Flux in its Data Explorer.
Flux queries can be built, executed, and visualized from within the Chronograf user interface.
## Influx CLI
To start an interactive Flux read-eval-print-loop (REPL) with the InfluxDB Enterprise 1.10+
`influx` CLI, run the `influx` command with the following flags:
- `-type=flux`
- `-path-prefix=/api/v2/query`
{{% note %}}
If [authentication is enabled](/enterprise_influxdb/v1.10/administration/authentication_and_authorization)
on your InfluxDB instance, use the `-username` flag to provide your InfluxDB username and
the `-password` flag to provide your password.
{{% /note %}}
##### Enter an interactive Flux REPL
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[No Auth](#)
[Auth Enabled](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
influx -type=flux -path-prefix=/api/v2/query
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
influx -type=flux \
-path-prefix=/api/v2/query \
-username myuser \
-password PasSw0rd
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
Any Flux query can be executed within the REPL.
### Submit a Flux query via parameter
Flux queries can also be passed to the Flux REPL as a parameter using the `influx` CLI's `-type=flux` option and the `-execute` parameter.
The accompanying string is executed as a Flux query and results are output in your terminal.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[No Auth](#)
[Auth Enabled](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
influx -type=flux \
-path-prefix=/api/v2/query \
-execute '<flux query>'
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
influx -type=flux \
-path-prefix=/api/v2/query \
-username myuser \
-password PasSw0rd \
-execute '<flux query>'
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
### Submit a Flux query via STDIN
Flux queries can be piped into the `influx` CLI via STDIN.
Query results are output in your terminal.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[No Auth](#)
[Auth Enabled](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
echo '<flux query>' | influx -type=flux -path-prefix=/api/v2/query
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
echo '<flux query>' | influx -type=flux \
-path-prefix=/api/v2/query \
-username myuser \
-password PasSw0rd
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
## InfluxDB API
Flux can be used to query InfluxDB through InfluxDB's `/api/v2/query` endpoint.
Queried data is returned in annotated CSV format.
In your request, set the following:
- `Accept` header to `application/csv`
- `Content-type` header to `application/vnd.flux`
- If [authentication is enabled](/enterprise_influxdb/v1.10/administration/authentication_and_authorization)
on your InfluxDB instance, `Authorization` header to `Token <username>:<password>`
This allows you to POST the Flux query in plain text and receive the annotated CSV response.
Below is an example `curl` command that queries InfluxDB using Flux:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[No Auth](#)
[Auth Enabled](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
curl -XPOST localhost:8086/api/v2/query -sS \
-H 'Accept:application/csv' \
-H 'Content-type:application/vnd.flux' \
-d 'from(bucket:"telegraf")
|> range(start:-5m)
|> filter(fn:(r) => r._measurement == "cpu")'
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
curl -XPOST localhost:8086/api/v2/query -sS \
-H 'Accept:application/csv' \
-H 'Content-type:application/vnd.flux' \
-H 'Authorization: Token <username>:<password>' \
-d 'from(bucket:"telegraf")
|> range(start:-5m)
|> filter(fn:(r) => r._measurement == "cpu")'
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}

@ -0,0 +1,368 @@
---
title: Flux vs InfluxQL
description:
menu:
enterprise_influxdb_1_10:
name: Flux vs InfluxQL
parent: Flux
weight: 5
---
Flux is an alternative to [InfluxQL](/enterprise_influxdb/v1.10/query_language/) and other SQL-like query languages for querying and analyzing data.
Flux uses functional language patterns making it incredibly powerful, flexible, and able to overcome many of the limitations of InfluxQL.
This article outlines many of the tasks possible with Flux but not InfluxQL and provides information about Flux and InfluxQL parity.
- [Possible with Flux](#possible-with-flux)
- [InfluxQL and Flux parity](#influxql-and-flux-parity)
## Possible with Flux
- [Joins](#joins)
- [Math across measurements](#math-across-measurements)
- [Sort by tags](#sort-by-tags)
- [Group by any column](#group-by-any-column)
- [Window by calendar months and years](#window-by-calendar-months-and-years)
- [Work with multiple data sources](#work-with-multiple-data-sources)
- [DatePart-like queries](#datepart-like-queries)
- [Pivot](#pivot)
- [Histograms](#histograms)
- [Covariance](#covariance)
- [Cast booleans to integers](#cast-booleans-to-integers)
- [String manipulation and data shaping](#string-manipulation-and-data-shaping)
- [Work with geo-temporal data](#work-with-geo-temporal-data)
### Joins
InfluxQL has never supported joins. They can be accomplished using [TICKscript](/{{< latest "kapacitor" >}}/tick/introduction/),
but even TICKscript's join capabilities are limited.
Flux's [`join()` function](/{{< latest "flux" >}}/stdlib/universe/join/) lets you
join data **from any bucket, any measurement, and on any columns** as long as
each data set includes the columns on which they are to be joined.
This opens the door to powerful and useful operations.
```js
dataStream1 = from(bucket: "bucket1")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "network" and r._field == "bytes-transferred")
dataStream2 = from(bucket: "bucket1")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "httpd" and r._field == "requests-per-sec")
join(tables: {d1: dataStream1, d2: dataStream2}, on: ["_time", "_stop", "_start", "host"])
```
---
_For an in-depth walkthrough of using the `join()` function, see [How to join data with Flux](/enterprise_influxdb/v1.10/flux/guides/join)._
---
### Math across measurements
Being able to perform cross-measurement joins also lets you run calculations using
data from separate measurements, a feature frequently requested by the InfluxData community.
The example below takes two data streams from separate measurements, `mem` and `processes`,
joins them, then calculates the average amount of memory used per running process:
```js
// Memory used (in bytes)
memUsed = from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used")
// Total processes running
procTotal = from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "processes" and r._field == "total")
// Join memory used with total processes and calculate
// the average memory (in MB) used for running processes.
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
|> map(fn: (r) => ({_time: r._time, _value: r._value_mem / r._value_proc / 1000000}))
```
### Sort by tags
InfluxQL's sorting capabilities are very limited, allowing you only to control the
sort order of `time` using the `ORDER BY time` clause.
Flux's [`sort()` function](/{{< latest "flux" >}}/stdlib/universe/sort) sorts records based on a list of columns.
Depending on the column type, records are sorted lexicographically, numerically, or chronologically.
```js
from(bucket: "telegraf/autogen")
|> range(start: -12h)
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
|> sort(columns: ["region", "host", "_value"])
```
### Group by any column
InfluxQL lets you group by tags or by time intervals, but nothing else.
Flux lets you group by any column in the dataset, including `_value`.
Use the Flux [`group()` function](/{{< latest "flux" >}}/stdlib/universe/group/)
to define which columns to group data by.
```js
from(bucket:"telegraf/autogen")
|> range(start:-12h)
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime" )
|> group(columns:["host", "_value"])
```
### Window by calendar months and years
InfluxQL does not support windowing data by calendar months and years due to their varied lengths.
Flux supports calendar month and year duration units (`1mo`, `1y`) and lets you
window and aggregate data by calendar month and year.
```js
from(bucket:"telegraf/autogen")
|> range(start:-1y)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent" )
|> aggregateWindow(every: 1mo, fn: mean)
```
### Work with multiple data sources
InfluxQL can only query data stored in InfluxDB.
Flux can query data from other data sources such as CSV, PostgreSQL, MySQL, Google BigTable, and more.
Join that data with data in InfluxDB to enrich query results.
- [Flux CSV package](/{{< latest "flux" >}}/stdlib/csv/)
- [Flux SQL package](/{{< latest "flux" >}}/stdlib/sql/)
- [Flux BigTable package](/{{< latest "flux" >}}/stdlib/experimental/bigtable/)
<!-- -->
```js
import "csv"
import "sql"
csvData = csv.from(csv: rawCSV)
sqlData = sql.from(
driverName: "postgres",
dataSourceName: "postgresql://user:password@localhost",
query: "SELECT * FROM example_table",
)
data = from(bucket: "telegraf/autogen")
|> range(start: -24h)
|> filter(fn: (r) => r._measurement == "sensor")
auxData = join(tables: {csv: csvData, sql: sqlData}, on: ["sensor_id"])
enrichedData = join(tables: {data: data, aux: auxData}, on: ["sensor_id"])
enrichedData
|> yield(name: "enriched_data")
```
---
_For an in-depth walkthrough of querying SQL data, see [Query SQL data sources](/enterprise_influxdb/v1.10/flux/guides/sql)._
---
### DatePart-like queries
InfluxQL doesn't support DatePart-like queries that only return results during specified hours of the day.
The Flux [`hourSelection` function](/{{< latest "flux" >}}/stdlib/universe/hourselection/)
returns only data with time values in a specified hour range.
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r.cpu == "cpu-total")
|> hourSelection(start: 9, stop: 17)
```
### Pivot
Pivoting data tables has never been supported in InfluxQL.
The Flux [`pivot()` function](/{{< latest "flux" >}}/stdlib/universe/pivot) provides the ability
to pivot data tables by specifying `rowKey`, `columnKey`, and `valueColumn` parameters.
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r.cpu == "cpu-total")
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
```
### Histograms
The ability to generate histograms has been a highly requested feature for InfluxQL, but has never been supported.
Flux's [`histogram()` function](/{{< latest "flux" >}}/stdlib/universe/histogram) uses input
data to generate a cumulative histogram with support for other histogram types coming in the future.
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|> histogram(buckets: [10, 20, 30, 40, 50, 60, 70, 80, 90, 100,])
```
---
_For an example of using Flux to create a cumulative histogram, see [Create histograms](/enterprise_influxdb/v1.10/flux/guides/histograms)._
---
### Covariance
Flux provides functions for simple covariance calculation.
The [`covariance()` function](/{{< latest "flux" >}}/stdlib/universe/covariance)
calculates the covariance between two columns and the [`cov()` function](/{{< latest "flux" >}}/stdlib/universe/cov)
calculates the covariance between two data streams.
###### Covariance between two columns
```js
from(bucket: "telegraf/autogen")
|> range(start: -5m)
|> covariance(columns: ["x", "y"])
```
###### Covariance between two streams of data
```js
table1 = from(bucket: "telegraf/autogen")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "measurement_1")
table2 = from(bucket: "telegraf/autogen")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "measurement_2")
cov(x: table1, y: table2, on: ["_time", "_field"])
```
### Cast booleans to integers
InfluxQL supports type casting, but only for numeric data types (floats to integers and vice versa).
[Flux type conversion functions](/{{< latest "flux" >}}/stdlib/universe/type-conversions/)
provide much broader support for type conversions and let you perform some long-requested
operations like casting boolean values to integers.
##### Cast boolean field values to integers
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "m" and r._field == "bool_field")
|> toInt()
```
### String manipulation and data shaping
InfluxQL doesn't support string manipulation when querying data.
The [Flux Strings package](/{{< latest "flux" >}}/stdlib/strings/) is a collection of functions that operate on string data.
When combined with the [`map()` function](/{{< latest "flux" >}}/stdlib/universe/map/),
functions in the string package allow for operations like string sanitization and normalization.
```js
import "strings"
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "weather" and r._field == "temp")
|> map(
fn: (r) => ({
r with
location: strings.toTitle(v: r.location),
sensor: strings.replaceAll(v: r.sensor, t: " ", u: "-"),
status: strings.substring(v: r.status, start: 0, end: 8),
})
)
```
### Work with geo-temporal data
InfluxQL doesn't provide functionality for working with geo-temporal data.
The [Flux Geo package](/{{< latest "flux" >}}/stdlib/experimental/geo/) is a collection of functions that
let you shape, filter, and group geo-temporal data.
```js
import "experimental/geo"
from(bucket: "geo/autogen")
|> range(start: -1w)
|> filter(fn: (r) => r._measurement == "taxi")
|> geo.shapeData(latField: "latitude", lonField: "longitude", level: 20)
|> geo.filterRows(region: {lat: 40.69335938, lon: -73.30078125, radius: 20.0}, strict: true)
|> geo.asTracks(groupBy: ["fare-id"])
```
## InfluxQL and Flux parity
Flux is working towards complete parity with InfluxQL and new functions are being added to that end.
The table below shows InfluxQL statements, clauses, and functions along with their equivalent Flux functions.
_For a complete list of Flux functions, [view all Flux functions](/{{< latest "flux" >}}/stdlib/all-functions)._
### InfluxQL and Flux parity
| InfluxQL | Flux Functions |
| :------------------------------------------------------------------------------------------------------------------------------------------ | :----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [SELECT](/enterprise_influxdb/v1.10/query_language/explore-data/#the-basic-select-statement) | [filter()](/{{< latest "flux" >}}/stdlib/universe/filter/) |
| [WHERE](/enterprise_influxdb/v1.10/query_language/explore-data/#the-where-clause) | [filter()](/{{< latest "flux" >}}/stdlib/universe/filter/), [range()](/{{< latest "flux" >}}/stdlib/universe/range/) |
| [GROUP BY](/enterprise_influxdb/v1.10/query_language/explore-data/#the-group-by-clause) | [group()](/{{< latest "flux" >}}/stdlib/universe/group/) |
| [INTO](/enterprise_influxdb/v1.10/query_language/explore-data/#the-into-clause) | [to()](/{{< latest "flux" >}}/stdlib/universe/to/) <span><a style="color:orange" href="#footnote">*</a></span> |
| [ORDER BY](/enterprise_influxdb/v1.10/query_language/explore-data/#order-by-time-desc) | [sort()](/{{< latest "flux" >}}/stdlib/universe/sort/) |
| [LIMIT](/enterprise_influxdb/v1.10/query_language/explore-data/#the-limit-clause) | [limit()](/{{< latest "flux" >}}/stdlib/universe/limit/) |
| [SLIMIT](/enterprise_influxdb/v1.10/query_language/explore-data/#the-slimit-clause) | -- |
| [OFFSET](/enterprise_influxdb/v1.10/query_language/explore-data/#the-offset-clause) | -- |
| [SOFFSET](/enterprise_influxdb/v1.10/query_language/explore-data/#the-soffset-clause) | -- |
| [SHOW DATABASES](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-databases) | [buckets()](/{{< latest "flux" >}}/stdlib/universe/buckets/) |
| [SHOW MEASUREMENTS](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-measurements) | [v1.measurements](/{{< latest "flux" >}}/stdlib/influxdb-v1/measurements) |
| [SHOW FIELD KEYS](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-field-keys) | [keys()](/{{< latest "flux" >}}/stdlib/universe/keys/) |
| [SHOW RETENTION POLICIES](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-retention-policies) | [buckets()](/{{< latest "flux" >}}/stdlib/universe/buckets/) |
| [SHOW TAG KEYS](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-tag-keys) | [v1.tagKeys()](/{{< latest "flux" >}}/stdlib/influxdb-v1/tagkeys), [v1.measurementTagKeys()](/{{< latest "flux" >}}/stdlib/influxdb-v1/measurementtagkeys) |
| [SHOW TAG VALUES](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-tag-values) | [v1.tagValues()](/{{< latest "flux" >}}/stdlib/influxdb-v1/tagvalues), [v1.measurementTagValues()](/{{< latest "flux" >}}/stdlib/influxdb-v1/measurementtagvalues) |
| [SHOW SERIES](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-series) | -- |
| [CREATE DATABASE](/enterprise_influxdb/v1.10/query_language/manage-database/#create-database) | -- |
| [DROP DATABASE](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-a-database-with-drop-database) | -- |
| [DROP SERIES](/enterprise_influxdb/v1.10/query_language/manage-database/#drop-series-from-the-index-with-drop-series) | -- |
| [DELETE](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-series-with-delete) | -- |
| [DROP MEASUREMENT](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-measurements-with-drop-measurement) | -- |
| [DROP SHARD](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-a-shard-with-drop-shard) | -- |
| [CREATE RETENTION POLICY](/enterprise_influxdb/v1.10/query_language/manage-database/#create-retention-policies-with-create-retention-policy) | -- |
| [ALTER RETENTION POLICY](/enterprise_influxdb/v1.10/query_language/manage-database/#modify-retention-policies-with-alter-retention-policy) | -- |
| [DROP RETENTION POLICY](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-retention-policies-with-drop-retention-policy) | -- |
| [COUNT](/enterprise_influxdb/v1.10/query_language/functions#count) | [count()](/{{< latest "flux" >}}/stdlib/universe/count/) |
| [DISTINCT](/enterprise_influxdb/v1.10/query_language/functions#distinct) | [distinct()](/{{< latest "flux" >}}/stdlib/universe/distinct/) |
| [INTEGRAL](/enterprise_influxdb/v1.10/query_language/functions#integral) | [integral()](/{{< latest "flux" >}}/stdlib/universe/integral/) |
| [MEAN](/enterprise_influxdb/v1.10/query_language/functions#mean) | [mean()](/{{< latest "flux" >}}/stdlib/universe/mean/) |
| [MEDIAN](/enterprise_influxdb/v1.10/query_language/functions#median) | [median()](/{{< latest "flux" >}}/stdlib/universe/median/) |
| [MODE](/enterprise_influxdb/v1.10/query_language/functions#mode) | [mode()](/{{< latest "flux" >}}/stdlib/universe/mode/) |
| [SPREAD](/enterprise_influxdb/v1.10/query_language/functions#spread) | [spread()](/{{< latest "flux" >}}/stdlib/universe/spread/) |
| [STDDEV](/enterprise_influxdb/v1.10/query_language/functions#stddev) | [stddev()](/{{< latest "flux" >}}/stdlib/universe/stddev/) |
| [SUM](/enterprise_influxdb/v1.10/query_language/functions#sum) | [sum()](/{{< latest "flux" >}}/stdlib/universe/sum/) |
| [BOTTOM](/enterprise_influxdb/v1.10/query_language/functions#bottom) | [bottom()](/{{< latest "flux" >}}/stdlib/universe/bottom/) |
| [FIRST](/enterprise_influxdb/v1.10/query_language/functions#first) | [first()](/{{< latest "flux" >}}/stdlib/universe/first/) |
| [LAST](/enterprise_influxdb/v1.10/query_language/functions#last) | [last()](/{{< latest "flux" >}}/stdlib/universe/last/) |
| [MAX](/enterprise_influxdb/v1.10/query_language/functions#max) | [max()](/{{< latest "flux" >}}/stdlib/universe/max/) |
| [MIN](/enterprise_influxdb/v1.10/query_language/functions#min) | [min()](/{{< latest "flux" >}}/stdlib/universe/min/) |
| [PERCENTILE](/enterprise_influxdb/v1.10/query_language/functions#percentile) | [quantile()](/{{< latest "flux" >}}/stdlib/universe/quantile/) |
| [SAMPLE](/enterprise_influxdb/v1.10/query_language/functions#sample) | [sample()](/{{< latest "flux" >}}/stdlib/universe/sample/) |
| [TOP](/enterprise_influxdb/v1.10/query_language/functions#top) | [top()](/{{< latest "flux" >}}/stdlib/universe/top/) |
| [ABS](/enterprise_influxdb/v1.10/query_language/functions#abs) | [math.abs()](/{{< latest "flux" >}}/stdlib/math/abs/) |
| [ACOS](/enterprise_influxdb/v1.10/query_language/functions#acos) | [math.acos()](/{{< latest "flux" >}}/stdlib/math/acos/) |
| [ASIN](/enterprise_influxdb/v1.10/query_language/functions#asin) | [math.asin()](/{{< latest "flux" >}}/stdlib/math/asin/) |
| [ATAN](/enterprise_influxdb/v1.10/query_language/functions#atan) | [math.atan()](/{{< latest "flux" >}}/stdlib/math/atan/) |
| [ATAN2](/enterprise_influxdb/v1.10/query_language/functions#atan2) | [math.atan2()](/{{< latest "flux" >}}/stdlib/math/atan2/) |
| [CEIL](/enterprise_influxdb/v1.10/query_language/functions#ceil) | [math.ceil()](/{{< latest "flux" >}}/stdlib/math/ceil/) |
| [COS](/enterprise_influxdb/v1.10/query_language/functions#cos) | [math.cos()](/{{< latest "flux" >}}/stdlib/math/cos/) |
| [CUMULATIVE_SUM](/enterprise_influxdb/v1.10/query_language/functions#cumulative-sum) | [cumulativeSum()](/{{< latest "flux" >}}/stdlib/universe/cumulativesum/) |
| [DERIVATIVE](/enterprise_influxdb/v1.10/query_language/functions#derivative) | [derivative()](/{{< latest "flux" >}}/stdlib/universe/derivative/) |
| [DIFFERENCE](/enterprise_influxdb/v1.10/query_language/functions#difference) | [difference()](/{{< latest "flux" >}}/stdlib/universe/difference/) |
| [ELAPSED](/enterprise_influxdb/v1.10/query_language/functions#elapsed) | [elapsed()](/{{< latest "flux" >}}/stdlib/universe/elapsed/) |
| [EXP](/enterprise_influxdb/v1.10/query_language/functions#exp) | [math.exp()](/{{< latest "flux" >}}/stdlib/math/exp/) |
| [FLOOR](/enterprise_influxdb/v1.10/query_language/functions#floor) | [math.floor()](/{{< latest "flux" >}}/stdlib/math/floor/) |
| [HISTOGRAM](/enterprise_influxdb/v1.10/query_language/functions#histogram) | [histogram()](/{{< latest "flux" >}}/stdlib/universe/histogram/) |
| [LN](/enterprise_influxdb/v1.10/query_language/functions#ln) | [math.log()](/{{< latest "flux" >}}/stdlib/math/log/) |
| [LOG](/enterprise_influxdb/v1.10/query_language/functions#log) | [math.logb()](/{{< latest "flux" >}}/stdlib/math/logb/) |
| [LOG2](/enterprise_influxdb/v1.10/query_language/functions#log2) | [math.log2()](/{{< latest "flux" >}}/stdlib/math/log2/) |
| [LOG10](/enterprise_influxdb/v1.10/query_language/functions/#log10) | [math.log10()](/{{< latest "flux" >}}/stdlib/math/log10/) |
| [MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#moving-average) | [movingAverage()](/{{< latest "flux" >}}/stdlib/universe/movingaverage/) |
| [NON_NEGATIVE_DERIVATIVE](/enterprise_influxdb/v1.10/query_language/functions#non-negative-derivative) | [derivative(nonNegative:true)](/{{< latest "flux" >}}/stdlib/universe/derivative/) |
| [NON_NEGATIVE_DIFFERENCE](/enterprise_influxdb/v1.10/query_language/functions#non-negative-difference) | [difference(nonNegative:true)](/{{< latest "flux" >}}/stdlib/universe/difference/) |
| [POW](/enterprise_influxdb/v1.10/query_language/functions#pow) | [math.pow()](/{{< latest "flux" >}}/stdlib/math/pow/) |
| [ROUND](/enterprise_influxdb/v1.10/query_language/functions#round) | [math.round()](/{{< latest "flux" >}}/stdlib/math/round/) |
| [SIN](/enterprise_influxdb/v1.10/query_language/functions#sin) | [math.sin()](/{{< latest "flux" >}}/stdlib/math/sin/) |
| [SQRT](/enterprise_influxdb/v1.10/query_language/functions#sqrt) | [math.sqrt()](/{{< latest "flux" >}}/stdlib/math/sqrt/) |
| [TAN](/enterprise_influxdb/v1.10/query_language/functions#tan) | [math.tan()](/{{< latest "flux" >}}/stdlib/math/tan/) |
| [HOLT_WINTERS](/enterprise_influxdb/v1.10/query_language/functions#holt-winters) | [holtWinters()](/{{< latest "flux" >}}/stdlib/universe/holtwinters/) |
| [CHANDE_MOMENTUM_OSCILLATOR](/enterprise_influxdb/v1.10/query_language/functions#chande-momentum-oscillator) | [chandeMomentumOscillator()](/{{< latest "flux" >}}/stdlib/universe/chandemomentumoscillator/) |
| [EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#exponential-moving-average) | [exponentialMovingAverage()](/{{< latest "flux" >}}/stdlib/universe/exponentialmovingaverage/) |
| [DOUBLE_EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#double-exponential-moving-average) | [doubleEMA()](/{{< latest "flux" >}}/stdlib/universe/doubleema/) |
| [KAUFMANS_EFFICIENCY_RATIO](/enterprise_influxdb/v1.10/query_language/functions#kaufmans-efficiency-ratio) | [kaufmansER()](/{{< latest "flux" >}}/stdlib/universe/kaufmanser/) |
| [KAUFMANS_ADAPTIVE_MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#kaufmans-adaptive-moving-average) | [kaufmansAMA()](/{{< latest "flux" >}}/stdlib/universe/kaufmansama/) |
| [TRIPLE_EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#triple-exponential-moving-average) | [tripleEMA()](/{{< latest "flux" >}}/stdlib/universe/tripleema/) |
| [TRIPLE_EXPONENTIAL_DERIVATIVE](/enterprise_influxdb/v1.10/query_language/functions#triple-exponential-derivative) | [tripleExponentialDerivative()](/{{< latest "flux" >}}/stdlib/universe/tripleexponentialderivative/) |
| [RELATIVE_STRENGTH_INDEX](/enterprise_influxdb/v1.10/query_language/functions#relative-strength-index) | [relativeStrengthIndex()](/{{< latest "flux" >}}/stdlib/universe/relativestrengthindex/) |
_<span style="font-size:.9rem" id="footnote"><span style="color:orange">*</span> The <code>to()</code> function only writes to InfluxDB 2.0.</span>_
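To illustrate how this mapping works in practice, the following sketch pairs a typical InfluxQL aggregation (shown as a comment) with an approximate Flux equivalent. The database, retention policy, measurement, and field names are placeholders rather than values taken from this documentation:

```js
// InfluxQL (for comparison):
//   SELECT mean("usage_system") FROM "telegraf"."autogen"."cpu"
//   WHERE time > now() - 1h GROUP BY time(5m)

from(bucket: "telegraf/autogen")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    |> aggregateWindow(every: 5m, fn: mean)
```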

@ -0,0 +1,115 @@
---
title: Get started with Flux
description: >
Get started with Flux, InfluxData's new functional data scripting language.
This step-by-step guide will walk you through the basics and get you on your way.
menu:
enterprise_influxdb_1_10:
name: Get started with Flux
identifier: get-started
parent: Flux
weight: 2
aliases:
- /enterprise_influxdb/v1.10/flux/getting-started/
- /enterprise_influxdb/v1.10/flux/introduction/getting-started/
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/
v2: /influxdb/v2.0/query-data/get-started/
---
Flux is InfluxData's new functional data scripting language designed for querying,
analyzing, and acting on data.
This multi-part getting started guide walks through important concepts related to Flux.
It covers querying time series data from InfluxDB using Flux, and introduces Flux syntax and functions.
## What you will need
##### InfluxDB v1.8+
Flux v0.65 is built into InfluxDB v1.8 and can be used to query data stored in InfluxDB.
---
_For information about downloading and installing InfluxDB, see [InfluxDB installation](/enterprise_influxdb/v1.10/introduction/installation)._
---
##### Chronograf v1.8+
**Not required but strongly recommended**.
Chronograf v1.8's Data Explorer provides a user interface (UI) for writing Flux scripts and visualizing results.
Dashboards in Chronograf v1.8+ also support Flux queries.
---
_For information about downloading and installing Chronograf, see [Chronograf installation](/{{< latest "chronograf" >}}/introduction/installation)._
---
## Key concepts
Flux introduces important new concepts you should understand as you get started.
### Buckets
Flux introduces "buckets," a new data storage concept for InfluxDB.
A **bucket** is a named location where data is stored that has a retention policy.
It's similar to an InfluxDB v1.x "database," but is a combination of both a database and a retention policy.
When using multiple retention policies, each retention policy is treated as its own bucket.
Flux's [`from()` function](/{{< latest "flux" >}}/stdlib/universe/from), which defines an InfluxDB data source, requires a `bucket` parameter.
When using Flux with InfluxDB v1.x, use the following bucket naming convention which combines
the database name and retention policy into a single bucket name:
###### InfluxDB v1.x bucket naming convention
```js
// Pattern
from(bucket:"<database>/<retention-policy>")
// Example
from(bucket:"telegraf/autogen")
```
### Pipe-forward operator
Flux uses pipe-forward operators (`|>`) extensively to chain operations together.
After each function or operation, Flux returns a table or collection of tables containing data.
The pipe-forward operator pipes those tables into the next function or operation where
they are further processed or manipulated.
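For example, the following minimal sketch (assuming a `telegraf/autogen` bucket) pipes the tables returned by `from()` into `range()`, and the tables returned by `range()` into `filter()`:

```js
from(bucket: "telegraf/autogen")
    |> range(start: -5m)
    |> filter(fn: (r) => r._measurement == "cpu")
```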
### Tables
Flux structures all data in tables.
When data is streamed from data sources, Flux formats it as annotated comma-separated values (CSV), representing tables.
Functions then manipulate or process them and output new tables.
This makes it easy to chain together functions to build sophisticated queries.
#### Group keys
Every table has a **group key** which describes the contents of the table.
It's a list of columns for which every row in the table will have the same value.
Columns with unique values in each row are **not** part of the group key.
As functions process and transform data, each modifies the group keys of output tables.
Understanding how tables and group keys are modified by functions is key to properly
shaping your data for the desired output.
###### Example group key
```js
[_start, _stop, _field, _measurement, host]
```
Note that `_time` and `_value` are excluded from the example group key because they
are unique to each row.
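As a minimal sketch of how a transformation changes the group key, regrouping the data by `host` only (assuming the example columns above) produces output tables whose group key is just `[host]`:

```js
from(bucket: "telegraf/autogen")
    |> range(start: -5m)
    |> filter(fn: (r) => r._measurement == "cpu")
    // group() replaces the group key of each output table with [host]
    |> group(columns: ["host"])
```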
## Tools for working with Flux
You have multiple [options for writing and running Flux queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/),
but as you're getting started, we recommend using the following:
### Chronograf's Data Explorer
Chronograf's Data Explorer makes it easy to write your first Flux script and visualize the results.
To use Chronograf's Flux UI, open the **Data Explorer** and, to the right of the source
dropdown above the graph placeholder, select **Flux** as the source type.
This will provide **Schema**, **Script**, and **Functions** panes.
The Schema pane allows you to explore your data.
The Script pane is where you write your Flux script.
The Functions pane provides a list of functions available in your Flux queries.
<div class="page-nav-btns">
<a class="btn next" href="/enterprise_influxdb/v1.10/flux/get-started/query-influxdb/">Query InfluxDB with Flux</a>
</div>

@ -0,0 +1,130 @@
---
title: Query InfluxDB with Flux
description: Learn the basics of using Flux to query data from InfluxDB.
menu:
enterprise_influxdb_1_10:
name: Query InfluxDB
parent: get-started
weight: 1
aliases:
- /enterprise_influxdb/v1.10/flux/getting-started/query-influxdb/
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/query-influxdb/
v2: /influxdb/v2.0/query-data/get-started/query-influxdb/
---
This guide walks through the basics of using Flux to query data from InfluxDB.
_**If you haven't already, make sure to install InfluxDB v1.8+, [enable Flux](/enterprise_influxdb/v1.10/flux/installation),
and choose a [tool for writing Flux queries](/enterprise_influxdb/v1.10/flux/get-started#tools-for-working-with-flux).**_
The following queries can be executed using any of the methods described in
[Execute Flux queries](/enterprise_influxdb/v1.10/flux/execute-queries/).
Be sure to provide your InfluxDB Enterprise authorization credentials with each method.
Every Flux query needs the following:
1. [A data source](#1-define-your-data-source)
2. [A time range](#2-specify-a-time-range)
3. [Data filters](#3-filter-your-data)
## 1. Define your data source
Flux's [`from()`](/{{< latest "flux" >}}/stdlib/universe/from) function defines an InfluxDB data source.
It requires a [`bucket`](/enterprise_influxdb/v1.10/flux/get-started/#buckets) parameter.
For this example, use `telegraf/autogen`, a combination of the default database and retention policy provided by the TICK stack.
```js
from(bucket:"telegraf/autogen")
```
## 2. Specify a time range
Flux requires a time range when querying time series data.
"Unbounded" queries are very resource-intensive and as a protective measure,
Flux will not query the database without a specified range.
Use the pipe-forward operator (`|>`) to pipe data from your data source into the [`range()`](/{{< latest "flux" >}}/stdlib/universe/range)
function, which specifies a time range for your query.
It accepts two properties: `start` and `stop`.
Ranges can be **relative** using negative [durations](/{{< latest "flux" >}}/spec/lexical-elements#duration-literals)
or **absolute** using [timestamps](/{{< latest "flux" >}}/spec/lexical-elements#date-and-time-literals).
###### Example relative time ranges
```js
// Relative time range with start only. Stop defaults to now.
from(bucket:"telegraf/autogen")
|> range(start: -1h)
// Relative time range with start and stop
from(bucket:"telegraf/autogen")
|> range(start: -1h, stop: -10m)
```
> Relative ranges are relative to "now."
###### Example absolute time range
```js
from(bucket:"telegraf/autogen")
|> range(start: 2018-11-05T23:30:00Z, stop: 2018-11-06T00:00:00Z)
```
#### Use the following:
For this guide, use the relative time range, `-15m`, to limit query results to data from the last 15 minutes:
```js
from(bucket:"telegraf/autogen")
|> range(start: -15m)
```
## 3. Filter your data
Pass your ranged data into the `filter()` function to narrow results based on data attributes or columns.
The `filter()` function has one parameter, `fn`, which expects an anonymous function
with logic that filters data based on columns or attributes.
Flux's anonymous function syntax is very similar to Javascript's.
Records or rows are passed into the `filter()` function as a record (`r`).
The anonymous function takes the record and evaluates it to see if it matches the defined filters.
Use the `and` logical operator to chain multiple filters.
```js
// Pattern
(r) => (r.recordProperty comparisonOperator comparisonExpression)
// Example with single filter
(r) => (r._measurement == "cpu")
// Example with multiple filters
(r) => (r._measurement == "cpu") and (r._field != "usage_system" )
```
#### Use the following:
For this example, filter by the `cpu` measurement, the `usage_system` field, and the `cpu-total` tag value:
```js
from(bucket: "telegraf/autogen")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
```
## 4. Yield your queried data
Use Flux's `yield()` function to output the filtered tables as the result of the query.
```js
from(bucket: "telegraf/autogen")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|> yield()
```
> Chronograf and the `influx` CLI automatically assume a `yield()` function at
> the end of each script in order to output and visualize the data.
> Best practice is to include a `yield()` function, but it is not always necessary.
## Congratulations!
You have now queried data from InfluxDB using Flux.
The query shown here is a barebones example.
Flux queries can be extended in many ways to form powerful scripts.
<div class="page-nav-btns">
<a class="btn prev" href="/enterprise_influxdb/v1.10/flux/get-started/">Get started with Flux</a>
<a class="btn next" href="/enterprise_influxdb/v1.10/flux/get-started/transform-data/">Transform your data</a>
</div>

@ -0,0 +1,211 @@
---
title: Flux syntax basics
description: An introduction to the basic elements of the Flux syntax with real-world application examples.
menu:
enterprise_influxdb_1_10:
name: Syntax basics
parent: get-started
weight: 3
aliases:
- /enterprise_influxdb/v1.10/flux/getting-started/syntax-basics/
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/syntax-basics/
v2: /influxdb/v2.0/query-data/get-started/syntax-basics/
---
Flux, at its core, is a scripting language designed specifically for working with data.
This guide walks through a handful of simple expressions and how they are handled in Flux.
### Simple expressions
Flux is a scripting language that supports basic expressions.
For example, simple addition:
```js
> 1 + 1
2
```
### Variables
Assign an expression to a variable using the assignment operator, `=`.
```js
> s = "this is a string"
> i = 1 // an integer
> f = 2.0 // a floating point number
```
Type the name of a variable to print its value:
```js
> s
this is a string
> i
1
> f
2
```
### Records
Flux also supports records. Each value in a record can be a different data type.
```js
> o = {name:"Jim", age: 42, "favorite color": "red"}
```
Use **dot notation** to access the properties of a record:
```js
> o.name
Jim
> o.age
42
```
Or **bracket notation**:
```js
> o["name"]
Jim
> o["age"]
42
> o["favorite color"]
red
```
{{% note %}}
Use bracket notation to reference record properties with special or
white space characters in the property key.
{{% /note %}}
### Lists
Flux supports lists. List values must be the same type.
```js
> n = 4
> l = [1,2,3,n]
> l
[1, 2, 3, 4]
```
### Functions
Flux uses functions for most of its heavy lifting.
Below is a simple function that squares a number, `n`.
```js
> square = (n) => n * n
> square(n:3)
9
```
> Flux does not support positional arguments or parameters.
> Parameters must always be named when calling a function.
### Pipe-forward operator
Flux uses the pipe-forward operator (`|>`) extensively to chain operations together.
After each function or operation, Flux returns a table or collection of tables containing data.
The pipe-forward operator pipes those tables into the next function where they are further processed or manipulated.
```js
data |> someFunction() |> anotherFunction()
```
## Real-world application of basic syntax
This likely seems familiar if you've already been through the other [getting started guides](/enterprise_influxdb/v1.10/flux/get-started).
Flux's syntax is inspired by Javascript and other functional scripting languages.
As you begin to apply these basic principles in real-world use cases such as creating data stream variables,
custom functions, etc., the power of Flux and its ability to query and process data will become apparent.
The examples below provide both multi-line and single-line versions of each input command.
Line breaks in Flux aren't required, but they do help with readability.
Both single- and multi-line commands can be copied and pasted into the `influx` CLI running in Flux mode.
{{< tabs-wrapper >}}
{{% tabs %}}
[Multi-line inputs](#)
[Single-line inputs](#)
{{% /tabs %}}
{{% tab-content %}}
### Define data stream variables
A common use case for variable assignments in Flux is creating variables for one
or more input data streams.
```js
timeRange = -1h
cpuUsageUser = from(bucket: "telegraf/autogen")
|> range(start: timeRange)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user" and r.cpu == "cpu-total")
memUsagePercent = from(bucket: "telegraf/autogen")
|> range(start: timeRange)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
```
These variables can be used in other functions, such as `join()`, while keeping the syntax minimal and flexible.
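For example, the two streams defined above could be combined with `join()`. The following is only a sketch and assumes both measurements include a `host` tag:

```js
join(tables: {cpu: cpuUsageUser, mem: memUsagePercent}, on: ["_time", "_stop", "_start", "host"])
```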
### Define custom functions
Create a function that returns the top `N` rows in the input stream, ranked by `_value`.
To do this, pass the input stream (`tables`) and the number of results to return (`n`) into a custom function.
Then use Flux's `sort()` and `limit()` functions to find the top `n` results in the data set.
```js
topN = (tables=<-, n) => tables
|> sort(desc: true)
|> limit(n: n)
```
_More information about creating custom functions is available in the [Custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) documentation._
Using this new custom function `topN` and the `cpuUsageUser` data stream variable defined above,
find the top five data points and yield the results.
```js
cpuUsageUser
|> topN(n: 5)
|> yield()
```
{{% /tab-content %}}
{{% tab-content %}}
### Define data stream variables
A common use case for variable assignments in Flux is creating variables for multiple filtered input data streams.
```js
timeRange = -1h
cpuUsageUser = from(bucket: "telegraf/autogen")
|> range(start: timeRange)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user" and r.cpu == "cpu-total")
memUsagePercent = from(bucket: "telegraf/autogen")
|> range(start: timeRange)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
```
These variables can be used in other functions, such as `join()`, while keeping the syntax minimal and flexible.
### Define custom functions
Let's create a function that returns the top `N` rows in the input data stream, ranked by `_value`.
To do this, pass the input stream (`tables`) and the number of results to return (`n`) into a custom function.
Then use Flux's `sort()` and `limit()` functions to find the top `n` results in the data set.
```js
topN = (tables=<-, n) => tables |> sort(desc: true) |> limit(n: n)
```
_More information about creating custom functions is available in the [Custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) documentation._
Using the `cpuUsageUser` data stream variable defined [above](#define-data-stream-variables),
find the top five data points with the custom `topN` function and yield the results.
```js
cpuUsageUser |> topN(n:5) |> yield()
```
{{% /tab-content %}}
{{< /tabs-wrapper >}}
This query will return the five data points with the highest user CPU usage over the last hour.
<div class="page-nav-btns">
<a class="btn prev" href="/enterprise_influxdb/v1.10/flux/get-started/transform-data/">Transform your data</a>
</div>

@ -0,0 +1,160 @@
---
title: Transform data with Flux
description: Learn the basics of using Flux to transform data queried from InfluxDB.
menu:
enterprise_influxdb_1_10:
name: Transform your data
parent: get-started
weight: 2
aliases:
- /enterprise_influxdb/v1.10/flux/getting-started/transform-data/
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/transform-data/
v2: /influxdb/v2.0/query-data/get-started/transform-data/
---
When [querying data from InfluxDB](/enterprise_influxdb/v1.10/flux/get-started/query-influxdb),
you often need to transform that data in some way.
Common examples are aggregating data into averages, downsampling data, etc.
This guide demonstrates using [Flux functions](/{{< latest "flux" >}}/stdlib/) to transform your data.
It walks through creating a Flux script that partitions data into windows of time,
averages the `_value`s in each window, and outputs the averages as a new table.
It's important to understand how the "shape" of your data changes through each of these operations.
## Query data
Use the query built in the previous [Query data from InfluxDB](/enterprise_influxdb/v1.10/flux/get-started/query-influxdb)
guide, but update the range to pull data from the last hour:
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
```
## Flux functions
Flux provides a number of functions that perform specific operations, transformations, and tasks.
You can also [create custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) in your Flux queries.
_Functions are covered in detail in the [Flux standard library](/{{< latest "flux" >}}/stdlib/) documentation._
A common type of function used when transforming data queried from InfluxDB is an aggregate function.
Aggregate functions take a set of `_value`s in a table, aggregate them, and transform
them into a new value.
This example uses the [`mean()` function](/{{< latest "flux" >}}/stdlib/universe/mean)
to average values within time windows.
> The following example walks through the steps required to window and aggregate data,
> but there is an [`aggregateWindow()` helper function](#helper-functions) that does it for you.
> It's just good to understand the steps in the process.
## Window your data
Flux's [`window()` function](/{{< latest "flux" >}}/stdlib/universe/window) partitions records based on a time value.
Use the `every` parameter to define a duration of time for each window.
{{% note %}}
#### Calendar months and years
`every` supports all [valid duration units](/{{< latest "flux" >}}/spec/types/#duration-types),
including **calendar months (`1mo`)** and **years (`1y`)**.
{{% /note %}}
For this example, window data in five minute intervals (`5m`).
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|> window(every: 5m)
```
As data is gathered into windows of time, each window is output as its own table.
When visualized, each table is assigned a unique color.
![Windowed data tables](/img/flux/windowed-data.png)
## Aggregate windowed data
Flux aggregate functions take the `_value`s in each table and aggregate them in some way.
Use the [`mean()` function](/{{< latest "flux" >}}/stdlib/universe/mean) to average the `_value`s of each table.
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|> window(every: 5m)
|> mean()
```
As rows in each window are aggregated, their output table contains only a single row with the aggregate value.
Windowed tables are all still separate and, when visualized, will appear as single, unconnected points.
![Windowed aggregate data](/img/flux/windowed-aggregates.png)
## Add times to your aggregates
As values are aggregated, the resulting tables do not have a `_time` column because
the records used for the aggregation all have different timestamps.
Aggregate functions don't infer what time should be used for the aggregate value.
Therefore the `_time` column is dropped.
A `_time` column is required in the [next operation](#unwindow-aggregate-tables).
To add one, use the [`duplicate()` function](/{{< latest "flux" >}}/stdlib/universe/duplicate)
to duplicate the `_stop` column as the `_time` column for each windowed table.
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|> window(every: 5m)
|> mean()
|> duplicate(column: "_stop", as: "_time")
```
## Unwindow aggregate tables
Use the `window()` function with the `every: inf` parameter to gather all points
into a single, infinite window.
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|> window(every: 5m)
|> mean()
|> duplicate(column: "_stop", as: "_time")
|> window(every: inf)
```
Once ungrouped and combined into a single table, the aggregate data points will appear connected in your visualization.
![Unwindowed aggregate data](/img/flux/windowed-aggregates-ungrouped.png)
## Helper functions
This may seem like a lot of coding just to build a query that aggregates data; however, going through the
process helps you understand how data changes "shape" as it is passed through each function.
Flux provides (and allows you to create) "helper" functions that abstract many of these steps.
The same operation performed in this guide can be accomplished using the
[`aggregateWindow()` function](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow).
```js
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|> aggregateWindow(every: 5m, fn: mean)
```
## Congratulations!
You have now constructed a Flux query that uses Flux functions to transform your data.
There are many more ways to manipulate your data using both Flux's primitive functions
and your own custom functions, but this is a good introduction to the basic syntax and query structure.
---
_For a deeper dive into windowing and aggregating data with example data output for each transformation,
view the [Windowing and aggregating data](/enterprise_influxdb/v1.10/flux/guides/window-aggregate) guide._
---
<div class="page-nav-btns">
<a class="btn prev" href="/enterprise_influxdb/v1.10/flux/get-started/query-influxdb/">Query InfluxDB</a>
<a class="btn next" href="/enterprise_influxdb/v1.10/flux/get-started/syntax-basics/">Syntax basics</a>
</div>

@ -0,0 +1,37 @@
---
title: Query data with Flux
description: Guides that walk through both common and complex queries and use cases for Flux.
weight: 3
aliases:
- /flux/latest/
- /flux/latest/introduction
menu:
enterprise_influxdb_1_10:
name: Query with Flux
parent: Flux
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/
v2: /influxdb/v2.0/query-data/flux/
---
The following guides walk through both common and complex queries and use cases for Flux.
{{% note %}}
#### Example data variable
Many of the examples provided in the following guides use a `data` variable,
which represents a basic query that filters data by measurement and field.
`data` is defined as:
```js
data = from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement" and r._field == "example-field")
```
{{% /note %}}
## Flux query guides
{{< children type="anchored-list" pages="all" >}}
---
{{< children pages="all" readmore="true" hr="true" >}}

@ -0,0 +1,202 @@
---
title: Calculate percentages with Flux
list_title: Calculate percentages
description: >
Use `pivot()` or `join()` and the `map()` function to align operand values into rows and calculate a percentage.
menu:
enterprise_influxdb_1_10:
name: Calculate percentages
identifier: flux-calc-perc
parent: Query with Flux
weight: 6
list_query_example: percentages
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/calculate-percentages/
v2: /influxdb/v2.0/query-data/flux/calculate-percentages/
---
Calculating percentages from queried data is a common use case for time series data.
To calculate a percentage in Flux, operands must be in each row.
Use `map()` to re-map values in the row and calculate a percentage.
**To calculate percentages**
1. Use [`from()`](/{{< latest "flux" >}}/stdlib/built-in/inputs/from/),
[`range()`](/{{< latest "flux" >}}/stdlib/universe/range/) and
[`filter()`](/{{< latest "flux" >}}/stdlib/universe/filter/) to query operands.
2. Use [`pivot()` or `join()`](/enterprise_influxdb/v1.10/flux/guides/mathematic-operations/#pivot-vs-join)
to align operand values into rows.
3. Use [`map()`](/{{< latest "flux" >}}/stdlib/universe/map/)
to divide the numerator operand value by the denominator operand value and multiply by 100.
{{% note %}}
The following examples use `pivot()` to align operands into rows because
`pivot()` works in most cases and is more performant than `join()`.
_See [Pivot vs join](/enterprise_influxdb/v1.10/flux/guides/mathematic-operations/#pivot-vs-join)._
{{% /note %}}
```js
from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "m1" and r._field =~ /field[1-2]/)
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({r with _value: r.field1 / r.field2 * 100.0}))
```
## GPU monitoring example
The following example queries data from the gpu-monitor bucket and calculates the
percentage of GPU memory used over time.
Data includes the following:
- **`gpu` measurement**
- **`mem_used` field**: used GPU memory in bytes
- **`mem_total` field**: total GPU memory in bytes
### Query mem_used and mem_total fields
```js
from(bucket: "gpu-monitor")
|> range(start: 2020-01-01T00:00:00Z)
|> filter(fn: (r) => r._measurement == "gpu" and r._field =~ /mem_/)
```
###### Returns the following stream of tables:
| _time | _measurement | _field | _value |
|:----- |:------------:|:------: | ------: |
| 2020-01-01T00:00:00Z | gpu | mem_used | 2517924577 |
| 2020-01-01T00:00:10Z | gpu | mem_used | 2695091978 |
| 2020-01-01T00:00:20Z | gpu | mem_used | 2576980377 |
| 2020-01-01T00:00:30Z | gpu | mem_used | 3006477107 |
| 2020-01-01T00:00:40Z | gpu | mem_used | 3543348019 |
| 2020-01-01T00:00:50Z | gpu | mem_used | 4402341478 |
<p style="margin:-2.5rem 0;"></p>
| _time | _measurement | _field | _value |
|:----- |:------------:|:------: | ------: |
| 2020-01-01T00:00:00Z | gpu | mem_total | 8589934592 |
| 2020-01-01T00:00:10Z | gpu | mem_total | 8589934592 |
| 2020-01-01T00:00:20Z | gpu | mem_total | 8589934592 |
| 2020-01-01T00:00:30Z | gpu | mem_total | 8589934592 |
| 2020-01-01T00:00:40Z | gpu | mem_total | 8589934592 |
| 2020-01-01T00:00:50Z | gpu | mem_total | 8589934592 |
### Pivot fields into columns
Use `pivot()` to pivot the `mem_used` and `mem_total` fields into columns.
Output includes `mem_used` and `mem_total` columns with values for each corresponding `_time`.
```js
// ...
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
```
###### Returns the following:
| _time | _measurement | mem_used | mem_total |
|:----- |:------------:| --------: | ---------: |
| 2020-01-01T00:00:00Z | gpu | 2517924577 | 8589934592 |
| 2020-01-01T00:00:10Z | gpu | 2695091978 | 8589934592 |
| 2020-01-01T00:00:20Z | gpu | 2576980377 | 8589934592 |
| 2020-01-01T00:00:30Z | gpu | 3006477107 | 8589934592 |
| 2020-01-01T00:00:40Z | gpu | 3543348019 | 8589934592 |
| 2020-01-01T00:00:50Z | gpu | 4402341478 | 8589934592 |
### Map new values
Each row now contains the values necessary to calculate a percentage.
Use `map()` to re-map values in each row.
Divide `mem_used` by `mem_total` and multiply by 100 to return the percentage.
{{% note %}}
To return a precise float percentage value that includes decimal points, the example
below casts integer field values to floats and multiplies by a float value (`100.0`).
{{% /note %}}
```js
// ...
|> map(
fn: (r) => ({
_time: r._time,
_measurement: r._measurement,
_field: "mem_used_percent",
_value: float(v: r.mem_used) / float(v: r.mem_total) * 100.0
})
)
```
##### Query results:
| _time | _measurement | _field | _value |
|:----- |:------------:|:------: | ------: |
| 2020-01-01T00:00:00Z | gpu | mem_used_percent | 29.31 |
| 2020-01-01T00:00:10Z | gpu | mem_used_percent | 31.37 |
| 2020-01-01T00:00:20Z | gpu | mem_used_percent | 30.00 |
| 2020-01-01T00:00:30Z | gpu | mem_used_percent | 35.00 |
| 2020-01-01T00:00:40Z | gpu | mem_used_percent | 41.25 |
| 2020-01-01T00:00:50Z | gpu | mem_used_percent | 51.25 |
### Full query
```js
from(bucket: "gpu-monitor")
|> range(start: 2020-01-01T00:00:00Z)
|> filter(fn: (r) => r._measurement == "gpu" and r._field =~ /mem_/ )
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(
fn: (r) => ({
_time: r._time,
_measurement: r._measurement,
_field: "mem_used_percent",
_value: float(v: r.mem_used) / float(v: r.mem_total) * 100.0
})
)
```
## Examples
#### Calculate percentages using multiple fields
```js
from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement")
|> filter(fn: (r) => r._field == "used_system" or r._field == "used_user" or r._field == "total")
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({r with _value: float(v: r.used_system + r.used_user) / float(v: r.total) * 100.0}))
```
#### Calculate percentages using multiple measurements
1. Ensure measurements are in the same [bucket](/enterprise_influxdb/v1.10/flux/get-started/#buckets).
2. Use `filter()` to include data from both measurements.
3. Use `group()` to ungroup data and return a single table.
4. Use `pivot()` to pivot fields into columns.
5. Use `map()` to re-map rows and perform the percentage calculation.
<!-- -->
```js
from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => (r._measurement == "m1" or r._measurement == "m2") and (r._field == "field1" or r._field == "field2"))
|> group()
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({r with _value: r.field1 / r.field2 * 100.0}))
```
#### Calculate percentages using multiple data sources
```js
import "sql"
import "influxdata/influxdb/secrets"
pgUser = secrets.get(key: "POSTGRES_USER")
pgPass = secrets.get(key: "POSTGRES_PASSWORD")
pgHost = secrets.get(key: "POSTGRES_HOST")
t1 = sql.from(
driverName: "postgres",
dataSourceName: "postgresql://${pgUser}:${pgPass}@${pgHost}",
query: "SELECT id, name, available FROM exampleTable",
)
t2 = from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement" and r._field == "example-field")
join(tables: {t1: t1, t2: t2}, on: ["id"])
|> map(fn: (r) => ({r with _value: r._value_t2 / r.available_t1 * 100.0}))
```

@ -0,0 +1,213 @@
---
title: Query using conditional logic
seotitle: Query using conditional logic in Flux
list_title: Conditional logic
description: >
This guide describes how to use Flux conditional expressions, such as `if`,
`else`, and `then`, to query and transform data. **Flux evaluates statements from left to right and stops evaluating once a condition matches.**
menu:
enterprise_influxdb_1_10:
name: Conditional logic
parent: Query with Flux
weight: 20
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/conditional-logic/
v2: /influxdb/v2.0/query-data/flux/conditional-logic/
list_code_example: |
```js
if color == "green" then "008000" else "ffffff"
```
---
Flux provides `if`, `then`, and `else` conditional expressions that allow for powerful and flexible Flux queries.
##### Conditional expression syntax
```js
// Pattern
if <condition> then <action> else <alternative-action>
// Example
if color == "green" then "008000" else "ffffff"
```
Conditional expressions are most useful in the following contexts:
- When defining variables.
- When using functions that operate on a single row at a time (
[`filter()`](/{{< latest "flux" >}}/stdlib/universe/filter/),
[`map()`](/{{< latest "flux" >}}/stdlib/universe/map/),
[`reduce()`](/{{< latest "flux" >}}/stdlib/universe/reduce) ).
## Evaluating conditional expressions
Flux evaluates statements in order and stops evaluating once a condition matches.
For example, given the following statement:
```js
if r._value > 95.0000001 and r._value <= 100.0 then
"critical"
else if r._value > 85.0000001 and r._value <= 95.0 then
"warning"
else if r._value > 70.0000001 and r._value <= 85.0 then
"high"
else
"normal"
```
When `r._value` is 96, the output is "critical" and the remaining conditions are not evaluated.
## Examples
- [Conditionally set the value of a variable](#conditionally-set-the-value-of-a-variable)
- [Create conditional filters](#create-conditional-filters)
- [Conditionally transform column values with map()](#conditionally-transform-column-values-with-map)
- [Conditionally increment a count with reduce()](#conditionally-increment-a-count-with-reduce)
### Conditionally set the value of a variable
The following example sets the `overdue` variable based on the
`dueDate` variable's relation to `now()`.
```js
dueDate = 2019-05-01T00:00:00Z
overdue = if dueDate < now() then true else false
```
### Create conditional filters
The following example uses an example `metric` variable to change how the query filters data.
`metric` has three possible values:
- Memory
- CPU
- Disk
```js
metric = "Memory"
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(
fn: (r) => if metric == "Memory" then
r._measurement == "mem" and r._field == "used_percent"
else if v.metric == "CPU" then
r._measurement == "cpu" and r._field == "usage_user"
else if v.metric == "Disk" then
r._measurement == "disk" and r._field == "used_percent"
else
r._measurement != "",
)
```
### Conditionally transform column values with map()
The following example uses the [`map()` function](/{{< latest "flux" >}}/stdlib/universe/map/)
to conditionally transform column values.
It sets the `level` column to a specific string based on `_value` column.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[No Comments](#)
[Comments](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```js
from(bucket: "telegraf/autogen")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|> map(
fn: (r) => ({r with
level: if r._value >= 95.0000001 and r._value <= 100.0 then
"critical"
else if r._value >= 85.0000001 and r._value <= 95.0 then
"warning"
else if r._value >= 70.0000001 and r._value <= 85.0 then
"high"
else
"normal",
}),
)
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```js
from(bucket: "telegraf/autogen")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|> map(
fn: (r) => ({
// Retain all existing columns in the mapped row
r with
// Set the level column value based on the _value column
level: if r._value >= 95.0000001 and r._value <= 100.0 then
"critical"
else if r._value >= 85.0000001 and r._value <= 95.0 then
"warning"
else if r._value >= 70.0000001 and r._value <= 85.0 then
"high"
else
"normal",
}),
)
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
### Conditionally increment a count with reduce()
The following example uses the [`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
and [`reduce()`](/{{< latest "flux" >}}/stdlib/universe/reduce/)
functions to count the number of records in every five minute window that exceed a defined threshold.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[No Comments](#)
[Comments](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```js
threshold = 65.0
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|> aggregateWindow(
every: 5m,
fn: (column, tables=<-) => tables
|> reduce(
identity: {above_threshold_count: 0.0},
fn: (r, accumulator) => ({
above_threshold_count: if r._value >= threshold then
accumulator.above_threshold_count + 1.0
else
accumulator.above_threshold_count + 0.0,
}),
),
)
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```js
threshold = 65.0
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
// Aggregate data into 5 minute windows using a custom reduce() function
|> aggregateWindow(
every: 5m,
// Use a custom function in the fn parameter.
// The aggregateWindow fn parameter requires 'column' and 'tables' parameters.
fn: (column, tables=<-) => tables
|> reduce(
identity: {above_threshold_count: 0.0},
fn: (r, accumulator) => ({
// Conditionally increment above_threshold_count if
// r.value exceeds the threshold
above_threshold_count: if r._value >= threshold then
accumulator.above_threshold_count + 1.0
else
accumulator.above_threshold_count + 0.0,
}),
),
)
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}

View File

@ -0,0 +1,69 @@
---
title: Query cumulative sum
seotitle: Query cumulative sum in Flux
list_title: Cumulative sum
description: >
Use the `cumulativeSum()` function to calculate a running total of values.
weight: 10
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
name: Cumulative sum
list_query_example: cumulative_sum
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/cumulativesum/
v2: /influxdb/v2.0/query-data/flux/cumulativesum/
---
Use the [`cumulativeSum()` function](/{{< latest "flux" >}}/stdlib/universe/cumulativesum/)
to calculate a running total of values.
`cumulativeSum` sums the values of subsequent records and returns each row updated with the summed total.
{{< flex >}}
{{% flex-content "half" %}}
**Given the following input table:**
| _time | _value |
| ----- |:------:|
| 0001 | 1 |
| 0002 | 2 |
| 0003 | 1 |
| 0004 | 3 |
{{% /flex-content %}}
{{% flex-content "half" %}}
**`cumulativeSum()` returns:**
| _time | _value |
| ----- |:------:|
| 0001 | 1 |
| 0002 | 3 |
| 0003 | 4 |
| 0004 | 7 |
{{% /flex-content %}}
{{< /flex >}}
{{% note %}}
The examples below use the [example data variable](/enterprise_influxdb/v1.10/flux/guides/#example-data-variable).
{{% /note %}}
##### Calculate the running total of values
```js
data
|> cumulativeSum()
```
## Use cumulativeSum() with aggregateWindow()
[`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
segments data into windows of time, aggregates data in each window into a single
point, then removes the time-based segmentation.
It is primarily used to downsample data.
`aggregateWindow()` expects an aggregate function that returns a single row for each time window.
To use `cumulativeSum()` with `aggregateWindow`, use `sum` in `aggregateWindow()`,
then calculate the running total of the aggregate values with `cumulativeSum()`.
<!-- -->
```js
data
|> aggregateWindow(every: 5m, fn: sum)
|> cumulativeSum()
```

View File

@ -0,0 +1,84 @@
---
title: Check if a value exists
seotitle: Use Flux to check if a value exists
list_title: Exists
description: >
Use the Flux `exists` operator to check if a record contains a key or if that
key's value is `null`.
menu:
enterprise_influxdb_1_10:
name: Exists
parent: Query with Flux
weight: 20
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/exists/
v2: /influxdb/v2.0/query-data/flux/exists/
list_code_example: |
##### Filter null values
```js
data
|> filter(fn: (r) => exists r._value)
```
---
Use the Flux `exists` operator to check if a record contains a key or if that
key's value is `null`.
```js
p = {firstName: "John", lastName: "Doe", age: 42}
exists p.firstName
// Returns true
exists p.height
// Returns false
```
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
Use `exists` with row functions (
[`filter()`](/{{< latest "flux" >}}/stdlib/universe/filter/),
[`map()`](/{{< latest "flux" >}}/stdlib/universe/map/),
[`reduce()`](/{{< latest "flux" >}}/stdlib/universe/reduce/))
to check if a row includes a column or if the value for that column is `null`.
#### Filter null values
```js
from(bucket: "db/rp")
|> range(start: -5m)
|> filter(fn: (r) => exists r._value)
```
#### Map values based on existence
```js
from(bucket: "default")
|> range(start: -30s)
|> map(
fn: (r) => ({r with
human_readable: if exists r._value then
"${r._field} is ${string(v: r._value)}."
else
"${r._field} has no value.",
}),
)
```
#### Ignore null values in a custom aggregate function
```js
customSumProduct = (tables=<-) => tables
|> reduce(
identity: {sum: 0.0, product: 1.0},
fn: (r, accumulator) => ({r with
sum: if exists r._value then
r._value + accumulator.sum
else
accumulator.sum,
product: if exists r._value then
r._value * accumulator.product
else
accumulator.product,
}),
)
```
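The custom function can then be used like any other transformation.
A minimal usage sketch, assuming a `db/rp` bucket and an `example-measurement`/`example-field`
series that may contain null values:

```js
from(bucket: "db/rp")
    |> range(start: -5m)
    |> filter(fn: (r) => r._measurement == "example-measurement" and r._field == "example-field")
    // Null values are skipped by the exists checks inside customSumProduct()
    |> customSumProduct()
```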

View File

@ -0,0 +1,112 @@
---
title: Fill null values in data
seotitle: Fill null values in data
list_title: Fill
description: >
Use the `fill()` function to replace _null_ values.
weight: 10
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
name: Fill
list_query_example: fill_null
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/fill/
v2: /influxdb/v2.0/query-data/flux/fill/
---
Use the [`fill()` function](/{{< latest "flux" >}}/stdlib/universe/fill/)
to replace _null_ values with:
- [the previous non-null value](#fill-with-the-previous-value)
- [a specified value](#fill-with-a-specified-value)
<!-- -->
```js
data
|> fill(usePrevious: true)
// OR
data
|> fill(value: 0.0)
```
{{% note %}}
#### Fill empty windows of time
The `fill()` function **does not** fill empty windows of time.
It only replaces _null_ values in existing data.
Filling empty windows of time requires time interpolation
_(see [influxdata/flux#2428](https://github.com/influxdata/flux/issues/2428))_.
{{% /note %}}
## Fill with the previous value
To fill _null_ values with the previous **non-null** value, set the `usePrevious` parameter to `true`.
{{% note %}}
Values remain _null_ if there is no previous non-null value in the table.
{{% /note %}}
```js
data
|> fill(usePrevious: true)
```
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | null |
| 2020-01-01T00:02:00Z | 0.8 |
| 2020-01-01T00:03:00Z | null |
| 2020-01-01T00:04:00Z | null |
| 2020-01-01T00:05:00Z | 1.4 |
{{% /flex-content %}}
{{% flex-content %}}
**`fill(usePrevious: true)` returns:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | null |
| 2020-01-01T00:02:00Z | 0.8 |
| 2020-01-01T00:03:00Z | 0.8 |
| 2020-01-01T00:04:00Z | 0.8 |
| 2020-01-01T00:05:00Z | 1.4 |
{{% /flex-content %}}
{{< /flex >}}
## Fill with a specified value
To fill _null_ values with a specified value, use the `value` parameter to specify the fill value.
_The fill value must match the [data type](/{{< latest "flux" >}}/language/types/#basic-types)
of the [column](/{{< latest "flux" >}}/stdlib/universe/fill/#column)._
```js
data
|> fill(value: 0.0)
```
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | null |
| 2020-01-01T00:02:00Z | 0.8 |
| 2020-01-01T00:03:00Z | null |
| 2020-01-01T00:04:00Z | null |
| 2020-01-01T00:05:00Z | 1.4 |
{{% /flex-content %}}
{{% flex-content %}}
**`fill(value: 0.0)` returns:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 0.0 |
| 2020-01-01T00:02:00Z | 0.8 |
| 2020-01-01T00:03:00Z | 0.0 |
| 2020-01-01T00:04:00Z | 0.0 |
| 2020-01-01T00:05:00Z | 1.4 |
{{% /flex-content %}}
{{< /flex >}}

View File

@ -0,0 +1,149 @@
---
title: Query first and last values
seotitle: Query first and last values in Flux
list_title: First and last
description: >
Use the `first()` or `last()` functions to return the first or last point in an input table.
weight: 10
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
name: First & last
list_query_example: first_last
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/first-last/
v2: /influxdb/v2.0/query-data/flux/first-last/
---
Use the [`first()`](/{{< latest "flux" >}}/stdlib/universe/first/) or
[`last()`](/{{< latest "flux" >}}/stdlib/universe/last/) functions
to return the first or last record in an input table.
```js
data
|> first()
// OR
data
|> last()
```
{{% note %}}
By default, InfluxDB returns results sorted by time. However, you can use the
[`sort()` function](/{{< latest "flux" >}}/stdlib/universe/sort/)
to change how results are sorted.
`first()` and `last()` respect the sort order of input data and return records
based on the order they are received in.
{{% /note %}}
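For example, to return the record with the largest value instead of the most recent one,
sort by `_value` before calling `last()`.
A short sketch, assuming the `data` variable used on this page:

```js
data
    // Sort records by value in ascending order
    |> sort(columns: ["_value"])
    // last() now returns the record with the highest value
    |> last()
```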
### first
`first()` returns the first non-null record in an input table.
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.0 |
| 2020-01-01T00:03:00Z | 2.0 |
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{% flex-content %}}
**The following function returns:**
```js
|> first()
```
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
{{% /flex-content %}}
{{< /flex >}}
### last
`last()` returns the last non-null record in an input table.
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.0 |
| 2020-01-01T00:03:00Z | 2.0 |
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{% flex-content %}}
**The following function returns:**
```js
|> last()
```
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{< /flex >}}
## Use first() or last() with aggregateWindow()
Use `first()` and `last()` with [`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
to select the first or last records in time-based groups.
`aggregateWindow()` segments data into windows of time, aggregates data in each window into a single
point using aggregate or selector functions, and then removes the time-based segmentation.
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:00:00Z | 10 |
| 2020-01-01T00:00:15Z | 12 |
| 2020-01-01T00:00:45Z | 9 |
| 2020-01-01T00:01:05Z | 9 |
| 2020-01-01T00:01:10Z | 15 |
| 2020-01-01T00:02:30Z | 11 |
{{% /flex-content %}}
{{% flex-content %}}
**The following function returns:**
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[first](#)
[last](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```js
|> aggregateWindow(
every: 1m,
fn: first,
)
```
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:00:59Z | 10 |
| 2020-01-01T00:01:59Z | 9 |
| 2020-01-01T00:02:59Z | 11 |
{{% /code-tab-content %}}
{{% code-tab-content %}}
```js
|> aggregateWindow(
every: 1m,
fn: last,
)
```
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:00:59Z | 9 |
| 2020-01-01T00:01:59Z | 15 |
| 2020-01-01T00:02:59Z | 11 |
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
{{%/flex-content %}}
{{< /flex >}}

View File

@ -0,0 +1,137 @@
---
title: Use Flux in Chronograf dashboards
description: >
This guide walks through using Flux queries in Chronograf dashboard cells,
what template variables are available, and how to use them.
menu:
enterprise_influxdb_1_10:
name: Use Flux in dashboards
parent: Query with Flux
weight: 30
canonical: /{{< latest "influxdb" "v1" >}}/flux/guides/flux-in-dashboards/
---
[Chronograf](/{{< latest "chronograf" >}}/) is the web user interface for the InfluxData platform.
It lets you create and customize dashboards that visualize your data.
Visualized data is retrieved using either an InfluxQL or Flux query.
This guide walks through using Flux queries in Chronograf dashboard cells.
## Using Flux in dashboard cells
---
_**Chronograf v1.8+** and **InfluxDB v1.8 with [Flux enabled](/enterprise_influxdb/v1.10/flux/installation)**
are required to use Flux in dashboards._
---
To use Flux in a dashboard cell, either create a new cell or edit an existing cell
by clicking the **pencil** icon in the top right corner of the cell.
To the right of the **Source dropdown** above the graph preview, select **Flux** as the source type.
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-cell.png" alt="Flux in Chronograf dashboard cells" />}}
> The Flux source type is only available if your data source has
> [Flux enabled](/enterprise_influxdb/v1.10/flux/installation).
Selecting **Flux** as the source type displays the **Schema**, **Script**, and **Functions** panes.
### Schema pane
The Schema pane allows you to explore your data and add filters for specific
measurements, fields, and tags to your Flux script.
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-add-filter.png" title="Add a filter from the Schema panel" />}}
### Script pane
The Script pane is where you write your Flux script.
In its default state, the **Script** pane includes an optional [Script Wizard](/chronograf/v1.8/guides/querying-data/#explore-data-with-flux)
that uses selected options to build a Flux query for you.
The generated query includes all the relevant functions and [template variables](#template-variables-in-flux)
required to return your desired data.
### Functions pane
The Functions pane provides a list of functions available in your Flux queries.
Clicking on a function will add it to the end of the script in the Script pane.
Hovering over a function displays a brief description of the function and links
to more detailed documentation.
### Dynamic sources
Chronograf can be configured with multiple data sources.
The **Sources dropdown** allows you to select a specific data source to connect to,
but a **Dynamic Source** option is also available.
With a dynamic source, the cell queries data from whatever data source Chronograf
is currently connected to.
Connections are managed under Chronograf's **Configuration** tab.
### View raw data
As you're building your Flux scripts, each function processes or transforms your
data in ways specific to that function.
It can be helpful to view the actual data in order to see how it is being shaped.
The **View Raw Data** toggle above the data visualization switches between graphed
data and raw data shown in table form.
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-view-raw.png" alt="View raw data" />}}
_The **View Raw Data** toggle is only available when using Flux._
## Template variables in Flux
Chronograf [template variables](/{{< latest "chronograf" >}}/guides/dashboard-template-variables/)
allow you to alter specific components of cell queries using elements provided in the
Chronograf user interface.
In your Flux query, reference template variables just as you would reference defined Flux variables.
The following example uses Chronograf's [predefined template variables](#predefined-template-variables),
`dashboardTime`, `upperDashboardTime`, and `autoInterval`:
```js
from(bucket: "telegraf/autogen")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: dashboardTime, stop: upperDashboardTime)
|> window(every: autoInterval)
```
### Predefined template variables
#### dashboardTime
The `dashboardTime` template variable represents the lower time bound of ranged data.
Its value is controlled by the time dropdown in your dashboard.
It should be used to define the `start` parameter of the `range()` function.
```js
dataSet
|> range(start: dashboardTime)
```
#### upperDashboardTime
The `upperDashboardTime` template variable represents the upper time bound of ranged data.
Its value is modified by the time dropdown in your dashboard when using an absolute time range.
It should be used to define the `stop` parameter of the `range()` function.
```js
dataSet
|> range(start: dashboardTime, stop: upperDashboardTime)
```
> As a best practice, always set the `stop` parameter of the `range()` function to `upperDashboardTime` in cell queries.
> Without it, `stop` defaults to "now" and the absolute upper range bound selected in the time dropdown is not honored,
> potentially causing unnecessary load on InfluxDB.
#### autoInterval
The `autoInterval` template variable represents the refresh interval of the dashboard
and is controlled by the refresh interval dropdown.
It's typically used to align window intervals created in
[windowing and aggregation](/enterprise_influxdb/v1.10/flux/guides/window-aggregate) operations with dashboard refreshes.
```js
dataSet
|> range(start: dashboardTime, stop: upperDashboardTime)
|> aggregateWindow(every: autoInterval, fn: mean)
```
### Custom template variables
{{% warn %}}
Chronograf does not support the use of custom template variables in Flux queries.
{{% /warn %}}
## Using Flux and InfluxQL
Within individual dashboard cells, the use of Flux and InfluxQL is mutually exclusive.
However, a dashboard may consist of different cells, each using Flux or InfluxQL.

View File

@ -0,0 +1,91 @@
---
title: Work with geo-temporal data
list_title: Geo-temporal data
description: >
Use the Flux Geo package to filter geo-temporal data and group by geographic location or track.
menu:
enterprise_influxdb_1_10:
name: Geo-temporal data
parent: Query with Flux
weight: 20
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/
v2: /influxdb/v2.0/query-data/flux/geo/
list_code_example: |
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|> geo.groupByArea(newColumn: "geoArea", level: 5)
```
---
Use the [Flux Geo package](/{{< latest "flux" >}}/stdlib/experimental/geo) to
filter geo-temporal data and group by geographic location or track.
{{% warn %}}
The Geo package is experimental and subject to change at any time.
By using it, you agree to the [risks of experimental functions](/{{< latest "flux" >}}/stdlib/experimental/#experimental-functions-are-subject-to-change).
{{% /warn %}}
**To work with geo-temporal data:**
1. Import the `experimental/geo` package.
```js
import "experimental/geo"
```
2. Load geo-temporal data. _See below for [sample geo-temporal data](#sample-data)._
3. Do one or more of the following:
- [Shape data to work with the Geo package](#shape-data-to-work-with-the-geo-package)
- [Filter data by region](#filter-geo-temporal-data-by-region) (using strict or non-strict filters)
- [Group data by area or by track](#group-geo-temporal-data)
{{< children >}}
---
## Sample data
Many of the examples in this section use a `sampleGeoData` variable that represents
a sample set of geo-temporal data.
The [Bird Migration Sample Data](https://github.com/influxdata/influxdb2-sample-data/tree/master/bird-migration-data)
available on GitHub provides sample geo-temporal data that meets the
[requirements of the Flux Geo package](/{{< latest "flux" >}}/stdlib/experimental/geo/#geo-schema-requirements).
### Load annotated CSV sample data
Use the [experimental `csv.from()` function](/{{< latest "flux" >}}/stdlib/experimental/csv/from/)
to load the sample bird migration annotated CSV data from GitHub:
```js
import "experimental/csv"
sampleGeoData = csv.from(
url: "https://github.com/influxdata/influxdb2-sample-data/blob/master/bird-migration-data/bird-migration.csv"
)
```
{{% note %}}
`csv.from(url: ...)` downloads sample data each time you execute the query **(~1.3 MB)**.
If bandwidth is a concern, use the [`to()` function](/{{< latest "flux" >}}/stdlib/built-in/outputs/to/)
to write the data to a bucket, and then query the bucket with [`from()`](/{{< latest "flux" >}}/stdlib/built-in/inputs/from/).
{{% /note %}}
### Write sample data to InfluxDB with line protocol
Use `curl` and the `influx write` command to write bird migration line protocol to InfluxDB.
Replace `db/rp` with your destination bucket:
```sh
curl https://raw.githubusercontent.com/influxdata/influxdb2-sample-data/master/bird-migration-data/bird-migration.line --output ./tmp-data
influx write -b db/rp @./tmp-data
rm -f ./tmp-data
```
Use Flux to query the bird migration data and assign it to the `sampleGeoData` variable:
```js
sampleGeoData = from(bucket: "db/rp")
|> range(start: 2019-01-01T00:00:00Z, stop: 2019-12-31T23:59:59Z)
|> filter(fn: (r) => r._measurement == "migration")
```

View File

@ -0,0 +1,129 @@
---
title: Filter geo-temporal data by region
description: >
Use the `geo.filterRows` function to filter geo-temporal data by box-shaped, circular, or polygonal geographic regions.
menu:
enterprise_influxdb_1_10:
name: Filter by region
parent: Geo-temporal data
weight: 302
related:
- /{{< latest "flux" >}}/stdlib/experimental/geo/
- /{{< latest "flux" >}}/stdlib/experimental/geo/filterrows/
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/filter-by-region/
v2: /influxdb/v2.0/query-data/flux/geo/filter-by-region/
list_code_example: |
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(
region: {lat: 30.04, lon: 31.23, radius: 200.0},
strict: true
)
```
---
Use the [`geo.filterRows` function](/{{< latest "flux" >}}/stdlib/experimental/geo/filterrows/)
to filter geo-temporal data by geographic region:
1. [Define a geographic region](#define-a-geographic-region)
2. [Use strict or non-strict filtering](#strict-and-non-strict-filtering)
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.10/flux/guides/geo/#sample-data)
and queries data points **within 200km of Cairo, Egypt**:
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0}, strict: true)
```
## Define a geographic region
Many functions in the Geo package filter data based on geographic region.
Define a geographic region using one of the following shapes:
- [box](#box)
- [circle](#circle)
- [polygon](#polygon)
### box
Define a box-shaped region by specifying a record containing the following properties:
- **minLat:** minimum latitude in decimal degrees (WGS 84) _(Float)_
- **maxLat:** maximum latitude in decimal degrees (WGS 84) _(Float)_
- **minLon:** minimum longitude in decimal degrees (WGS 84) _(Float)_
- **maxLon:** maximum longitude in decimal degrees (WGS 84) _(Float)_
##### Example box-shaped region
```js
{
minLat: 40.51757813,
maxLat: 40.86914063,
minLon: -73.65234375,
maxLon: -72.94921875,
}
```
### circle
Define a circular region by specifying a record containing the following properties:
- **lat**: latitude of the circle center in decimal degrees (WGS 84) _(Float)_
- **lon**: longitude of the circle center in decimal degrees (WGS 84) _(Float)_
- **radius**: radius of the circle in kilometers (km) _(Float)_
##### Example circular region
```js
{
lat: 40.69335938,
lon: -73.30078125,
radius: 20.0,
}
```
### polygon
Define a polygonal region with a record containing the latitude and longitude for
each point in the polygon:
- **points**: points that define the custom polygon _(Array of records)_
Define each point with a record containing the following properties:
- **lat**: latitude in decimal degrees (WGS 84) _(Float)_
- **lon**: longitude in decimal degrees (WGS 84) _(Float)_
##### Example polygonal region
```js
{
points: [
{lat: 40.671659, lon: -73.936631},
{lat: 40.706543, lon: -73.749177},
{lat: 40.791333, lon: -73.880327},
]
}
```
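Any of these region records can be passed to Geo package functions that accept a `region` parameter.
As a sketch, the following query (assuming the `sampleGeoData` variable from the
[sample data](/enterprise_influxdb/v1.10/flux/guides/geo/#sample-data)) filters points to the
box-shaped region defined above:

```js
import "experimental/geo"

sampleGeoData
    |> geo.filterRows(
        region: {
            minLat: 40.51757813,
            maxLat: 40.86914063,
            minLon: -73.65234375,
            maxLon: -72.94921875,
        },
    )
```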
## Strict and non-strict filtering
In most cases, the specified geographic region does not perfectly align with S2 grid cells.
- **Non-strict filtering** returns points that may be outside of the specified region but
inside S2 grid cells partially covered by the region.
- **Strict filtering** returns only points inside the specified region.
_Strict filtering is less performant, but more accurate than non-strict filtering._
<span class="key-geo-cell"></span> S2 grid cell
<span class="key-geo-region"></span> Filter region
<span class="key-geo-point"></span> Returned point
{{< flex >}}
{{% flex-content %}}
**Strict filtering**
{{< svg "/static/svgs/geo-strict.svg" >}}
{{% /flex-content %}}
{{% flex-content %}}
**Non-strict filtering**
{{< svg "/static/svgs/geo-non-strict.svg" >}}
{{% /flex-content %}}
{{< /flex >}}
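The example at the top of this page uses `strict: true`.
To use non-strict filtering instead, set `strict: false`.
A sketch with the same circular region and the `sampleGeoData` variable from the sample data:

```js
import "experimental/geo"

sampleGeoData
    |> geo.filterRows(
        region: {lat: 30.04, lon: 31.23, radius: 200.0},
        // Include points in S2 grid cells partially covered by the region
        strict: false,
    )
```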

View File

@ -0,0 +1,70 @@
---
title: Group geo-temporal data
description: >
Use the `geo.groupByArea()` to group geo-temporal data by area and `geo.asTracks()`
to group data into tracks or routes.
menu:
enterprise_influxdb_1_10:
parent: Geo-temporal data
weight: 302
related:
- /{{< latest "flux" >}}/stdlib/experimental/geo/
- /{{< latest "flux" >}}/stdlib/experimental/geo/groupbyarea/
- /{{< latest "flux" >}}/stdlib/experimental/geo/astracks/
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/group-geo-data/
v2: /influxdb/v2.0/query-data/flux/geo/group-geo-data/
list_code_example: |
```js
import "experimental/geo"
sampleGeoData
|> geo.groupByArea(newColumn: "geoArea", level: 5)
|> geo.asTracks(groupBy: ["id"],sortBy: ["_time"])
```
---
Use the `geo.groupByArea()` function to group geo-temporal data by area and the
`geo.asTracks()` function to group data into tracks or routes.
- [Group data by area](#group-data-by-area)
- [Group data into tracks or routes](#group-data-by-track-or-route)
### Group data by area
Use the [`geo.groupByArea()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/groupbyarea/)
to group geo-temporal data points by geographic area.
Areas are determined by [S2 grid cells](https://s2geometry.io/devguide/s2cell_hierarchy.html#s2cellid-numbering)
- Specify a new column to store the unique area identifier for each point with the `newColumn` parameter.
- Specify the [S2 cell level](https://s2geometry.io/resources/s2cell_statistics)
to use when calculating geographic areas with the `level` parameter.
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.10/flux/guides/geo/#sample-data)
to query data points within 200km of Cairo, Egypt and group them by geographic area:
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|> geo.groupByArea(newColumn: "geoArea", level: 5)
```
### Group data by track or route
Use [`geo.asTracks()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/astracks/)
to group data points into tracks or routes and order them by time or other columns.
Data must contain a unique identifier for each track. For example: `id` or `tid`.
- Specify columns that uniquely identify each track or route with the `groupBy` parameter.
- Specify which columns to sort by with the `sortBy` parameter. Default is `["_time"]`.
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.10/flux/guides/geo/#sample-data)
to query data points within 200km of Cairo, Egypt and group them into routes unique
to each bird:
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|> geo.asTracks(groupBy: ["id"], sortBy: ["_time"])
```

View File

@ -0,0 +1,116 @@
---
title: Shape data to work with the Geo package
description: >
Functions in the Flux Geo package require **lat** and **lon** fields and an **s2_cell_id** tag.
Rename latitude and longitude fields and generate S2 cell ID tokens.
menu:
enterprise_influxdb_1_10:
name: Shape geo-temporal data
parent: Geo-temporal data
weight: 301
related:
- /{{< latest "flux" >}}/stdlib/experimental/geo/
- /{{< latest "flux" >}}/stdlib/experimental/geo/shapedata/
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/shape-geo-data/
v2: /influxdb/v2.0/query-data/flux/geo/shape-geo-data/
list_code_example: |
```js
import "experimental/geo"
sampleGeoData
|> geo.shapeData(latField: "latitude", lonField: "longitude", level: 10)
```
---
Functions in the Geo package require the following data schema:
- an **s2_cell_id** tag containing the [S2 Cell ID](https://s2geometry.io/devguide/s2cell_hierarchy.html#s2cellid-numbering)
**as a token**
- a **`lat` field** containing the **latitude in decimal degrees** (WGS 84)
- a **`lon` field** containing the **longitude in decimal degrees** (WGS 84)
## Shape geo-temporal data
If your data already contains latitude and longitude fields, use the
[`geo.shapeData()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/shapedata/)
to rename the fields to match the requirements of the Geo package, pivot the data
into row-wise sets, and generate S2 cell ID tokens for each point.
```js
import "experimental/geo"
from(bucket: "example-bucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement")
|> geo.shapeData(latField: "latitude", lonField: "longitude", level: 10)
```
## Generate S2 cell ID tokens
The Geo package uses the [S2 Geometry Library](https://s2geometry.io/) to represent
geographic coordinates on a three-dimensional sphere.
The sphere is divided into [cells](https://s2geometry.io/devguide/s2cell_hierarchy),
each with a unique 64-bit identifier (S2 cell ID).
Grid and S2 cell ID accuracy are defined by a [level](https://s2geometry.io/resources/s2cell_statistics).
{{% note %}}
To filter more quickly, use higher S2 Cell ID levels,
but know that higher levels increase [series cardinality](/enterprise_influxdb/v1.10/concepts/glossary/#series-cardinality).
{{% /note %}}
The Geo package requires S2 cell IDs as tokens.
To generate and add S2 cell ID tokens to your data, use one of the following options:
- [Generate S2 cell ID tokens with Telegraf](#generate-s2-cell-id-tokens-with-telegraf)
- [Generate S2 cell ID tokens with language-specific libraries](#generate-s2-cell-id-tokens-with-language-specific-libraries)
- [Generate S2 cell ID tokens with Flux](#generate-s2-cell-id-tokens-with-flux)
### Generate S2 cell ID tokens with Telegraf
Enable the [Telegraf S2 Geo (`s2geo`) processor](https://github.com/influxdata/telegraf/tree/master/plugins/processors/s2geo)
to generate S2 cell ID tokens at a specified `cell_level` using `lat` and `lon` field values.
Add the `processors.s2geo` configuration to your Telegraf configuration file (`telegraf.conf`):
```toml
[[processors.s2geo]]
## The name of the lat and lon fields containing WGS-84 latitude and
## longitude in decimal degrees.
lat_field = "lat"
lon_field = "lon"
## New tag to create
tag_key = "s2_cell_id"
## Cell level (see https://s2geometry.io/resources/s2cell_statistics.html)
cell_level = 9
```
Telegraf stores the S2 cell ID token in the `s2_cell_id` tag.
### Generate S2 cell ID tokens with language-specific libraries
Many programming languages offer S2 Libraries with methods for generating S2 cell ID tokens.
Use latitude and longitude with the `s2.CellID.ToToken` endpoint of the S2 Geometry
Library to generate `s2_cell_id` tags. For example:
- **Go:** [s2.CellID.ToToken()](https://godoc.org/github.com/golang/geo/s2#CellID.ToToken)
- **Python:** [s2sphere.CellId.to_token()](https://s2sphere.readthedocs.io/en/latest/api.html#s2sphere.CellId)
- **JavaScript:** [s2.cellid.toToken()](https://github.com/mapbox/node-s2/blob/master/API.md#cellidtotoken---string)
### Generate S2 cell ID tokens with Flux
Use the [`geo.s2CellIDToken()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/s2cellidtoken/)
with existing longitude (`lon`) and latitude (`lat`) field values to generate and add the S2 cell ID token.
First, use the [`geo.toRows()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/torows/)
to pivot **lat** and **lon** fields into row-wise sets:
```js
import "experimental/geo"
from(bucket: "example-bucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement")
|> geo.toRows()
|> map(fn: (r) => ({r with s2_cell_id: geo.s2CellIDToken(point: {lon: r.lon, lat: r.lat}, level: 10)}))
```
{{% note %}}
The [`geo.shapeData()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/shapedata/)
generates S2 cell ID tokens as well.
{{% /note %}}

View File

@ -0,0 +1,673 @@
---
title: Group data in InfluxDB with Flux
list_title: Group
description: >
Use the `group()` function to group data with common values in specific columns.
menu:
enterprise_influxdb_1_10:
name: Group
parent: Query with Flux
weight: 2
aliases:
- /enterprise_influxdb/v1.10/flux/guides/grouping-data/
list_query_example: group
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/group-data/
v2: /influxdb/v2.0/query-data/flux/group-data/
---
With Flux, you can group data by any column in your queried data set.
"Grouping" partitions data into tables in which each row shares a common value for specified columns.
This guide walks through grouping data in Flux and provides examples of how data is shaped in the process.
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
## Group keys
Every table has a **group key**: a list of columns for which every row in the table has the same value.
###### Example group key
```js
[_start, _stop, _field, _measurement, host]
```
Grouping data in Flux is essentially defining the group key of output tables.
Understanding how modifying group keys shapes output data is key to successfully
grouping and transforming data into your desired output.
## group() Function
Flux's [`group()` function](/{{< latest "flux" >}}/stdlib/universe/group) defines the
group key for output tables, i.e. grouping records based on values for specific columns.
###### group() example
```js
dataStream
|> group(columns: ["cpu", "host"])
```
###### Resulting group key
```js
[cpu, host]
```
The `group()` function has the following parameters:
### columns
The list of columns to include or exclude (depending on the [mode](#mode)) in the grouping operation.
### mode
The method used to define the group and resulting group key.
Possible values include `by` and `except`.
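For example, a short sketch using the `dataStream` variable from the example above:
with `mode: "except"`, output tables are grouped by every column _except_ those listed
(the default mode is `by`).

```js
dataStream
    |> group(columns: ["_time", "host"], mode: "except")
```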
## Example grouping operations
To illustrate how grouping works, define a `dataSet` variable that queries System
CPU usage from the `db/rp` bucket.
Filter the `cpu` tag so it only returns results for each numbered CPU core.
### Data set
CPU used by system operations for all numbered CPU cores.
It uses a regular expression to filter only numbered cores.
```js
dataSet = from(bucket: "db/rp")
|> range(start: -2m)
|> filter(fn: (r) => r._field == "usage_system" and r.cpu =~ /cpu[0-9*]/)
|> drop(columns: ["host"])
```
{{% note %}}
This example drops the `host` column from the returned data since the CPU data
is only tracked for a single host and it simplifies the output tables.
Don't drop the `host` column if monitoring multiple hosts.
{{% /note %}}
{{% truncate %}}
```
Table: keys: [_start, _stop, _field, _measurement, cpu]
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:00.000000000Z 7.892107892107892
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:10.000000000Z 7.2
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:20.000000000Z 7.4
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:30.000000000Z 5.5
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:40.000000000Z 7.4
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:50.000000000Z 7.5
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:00.000000000Z 10.3
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:10.000000000Z 9.2
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:20.000000000Z 8.4
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:30.000000000Z 8.5
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:40.000000000Z 8.6
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:50.000000000Z 10.2
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:36:00.000000000Z 10.6
Table: keys: [_start, _stop, _field, _measurement, cpu]
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:00.000000000Z 0.7992007992007992
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:10.000000000Z 0.7
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:20.000000000Z 0.7
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:30.000000000Z 0.4
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:40.000000000Z 0.7
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:50.000000000Z 0.7
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:00.000000000Z 1.4
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:10.000000000Z 1.2
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:20.000000000Z 0.8
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:30.000000000Z 0.8991008991008991
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:40.000000000Z 0.8008008008008008
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:50.000000000Z 0.999000999000999
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:36:00.000000000Z 1.1022044088176353
Table: keys: [_start, _stop, _field, _measurement, cpu]
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:00.000000000Z 4.1
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:10.000000000Z 3.6
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:20.000000000Z 3.5
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:30.000000000Z 2.6
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:40.000000000Z 4.5
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:50.000000000Z 4.895104895104895
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:00.000000000Z 6.906906906906907
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:10.000000000Z 5.7
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:20.000000000Z 5.1
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:30.000000000Z 4.7
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:40.000000000Z 5.1
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:50.000000000Z 5.9
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:36:00.000000000Z 6.4935064935064934
Table: keys: [_start, _stop, _field, _measurement, cpu]
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:00.000000000Z 0.5005005005005005
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:10.000000000Z 0.5
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:20.000000000Z 0.5
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:30.000000000Z 0.3
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:40.000000000Z 0.6
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:50.000000000Z 0.6
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:00.000000000Z 1.3986013986013985
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:10.000000000Z 0.9
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:20.000000000Z 0.5005005005005005
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:30.000000000Z 0.7
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:40.000000000Z 0.6
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:50.000000000Z 0.8
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:36:00.000000000Z 0.9
```
{{% /truncate %}}
**Note that the group key is output with each table: `Table: keys: <group-key>`.**
![Group example data set](/img/flux/grouping-data-set.png)
### Group by CPU
Group the `dataSet` stream by the `cpu` column.
```js
dataSet
|> group(columns: ["cpu"])
```
This won't actually change the structure of the data since it already has `cpu`
in the group key and is therefore grouped by `cpu`.
However, notice that it does change the group key:
{{% truncate %}}
###### Group by CPU output tables
```
Table: keys: [cpu]
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 7.892107892107892 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 7.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 5.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 7.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 10.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 9.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 8.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 8.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 8.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 10.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [cpu]
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 0.7992007992007992 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 0.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 1.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 1.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 0.8991008991008991 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 0.8008008008008008 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 0.999000999000999 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.1022044088176353 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [cpu]
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 4.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 3.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 3.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 2.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 4.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 4.895104895104895 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 6.906906906906907 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 5.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 4.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 5.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.4935064935064934 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [cpu]
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 0.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 1.3986013986013985 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
```
{{% /truncate %}}
The visualization remains the same.
![Group by CPU](/img/flux/grouping-data-set.png)
### Group by time
Grouping data by the `_time` column is a good illustration of how grouping changes the structure of your data.
```js
dataSet
|> group(columns: ["_time"])
```
When grouping by `_time`, all records that share a common `_time` value are grouped into individual tables.
So each output table represents a single point in time.
{{% truncate %}}
###### Group by time output tables
```
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.892107892107892 usage_system cpu cpu0
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7992007992007992 usage_system cpu cpu1
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.1 usage_system cpu cpu2
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.2 usage_system cpu cpu0
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 3.6 usage_system cpu cpu2
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu cpu0
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 3.5 usage_system cpu cpu2
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.5 usage_system cpu cpu0
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.4 usage_system cpu cpu1
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 2.6 usage_system cpu cpu2
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.3 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu cpu0
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.5 usage_system cpu cpu2
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.5 usage_system cpu cpu0
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.895104895104895 usage_system cpu cpu2
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.3 usage_system cpu cpu0
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.4 usage_system cpu cpu1
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.906906906906907 usage_system cpu cpu2
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.3986013986013985 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 9.2 usage_system cpu cpu0
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.2 usage_system cpu cpu1
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.7 usage_system cpu cpu2
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.4 usage_system cpu cpu0
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu cpu1
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu cpu2
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.5 usage_system cpu cpu0
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8991008991008991 usage_system cpu cpu1
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.7 usage_system cpu cpu2
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.6 usage_system cpu cpu0
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8008008008008008 usage_system cpu cpu1
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu cpu2
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.2 usage_system cpu cpu0
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.999000999000999 usage_system cpu cpu1
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.9 usage_system cpu cpu2
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu cpu3
Table: keys: [_time]
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.6 usage_system cpu cpu0
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.1022044088176353 usage_system cpu cpu1
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.4935064935064934 usage_system cpu cpu2
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu cpu3
```
{{% /truncate %}}
Because all records that share a timestamp are grouped into the same table, when visualized, points with the same timestamp appear connected.
![Group by time](/img/flux/grouping-by-time.png)
{{% note %}}
With some further processing, you could calculate the average CPU usage across all CPUs per point in time and group the results into a single table, but we won't cover that in this example.
If you're interested in running and visualizing this yourself, here's what the query would look like:
```js
dataSet
|> group(columns: ["_time"])
|> mean()
|> group(columns: ["_value", "_time"], mode: "except")
```
{{% /note %}}
## Group by CPU and time
Group by the `cpu` and `_time` columns.
```js
dataSet
|> group(columns: ["cpu", "_time"])
```
This outputs a table for every unique `cpu` and `_time` combination:
{{% truncate %}}
###### Group by CPU and time output tables
```
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:00.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.892107892107892 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:00.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7992007992007992 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:00.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:00.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:10.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:10.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:10.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 3.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:10.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:20.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:20.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:20.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 3.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:20.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:30.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 5.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:30.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:30.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 2.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:30.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:40.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:40.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:40.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:40.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:50.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:50.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:50.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.895104895104895 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:34:50.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:00.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 10.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:00.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 1.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:00.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 6.906906906906907 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:00.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 1.3986013986013985 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:10.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 9.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:10.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 1.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:10.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 5.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:10.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:20.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 8.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:20.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:20.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:20.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:30.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 8.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:30.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.8991008991008991 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:30.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:30.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:40.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 8.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:40.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.8008008008008008 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:40.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:40.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:50.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 10.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:50.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.999000999000999 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:50.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 5.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:35:50.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:36:00.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 10.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:36:00.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 1.1022044088176353 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:36:00.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 6.4935064935064934 usage_system cpu 2018-11-05T21:34:00.000000000Z
Table: keys: [_time, cpu]
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
2018-11-05T21:36:00.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
```
{{% /truncate %}}
When visualized, tables appear as individual, unconnected points.
![Group by CPU and time](/img/flux/grouping-by-cpu-time.png)
Grouping by both `cpu` and `_time` illustrates how adding columns to the group key further partitions your data into smaller, more specific tables.
## In conclusion
Grouping is a powerful way to shape your data into your desired output format.
It modifies the group keys of output tables, grouping records into tables that
all share common values within specified columns.
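To collapse grouped results back into a single table, call `group()` with no parameters, which empties the group key. A minimal sketch, reusing the same `dataSet` variable from the examples above:
```js
dataSet
    |> group(columns: ["cpu", "_time"])
    // group() with no parameters removes all columns from the group key,
    // so every record is regrouped into one table.
    |> group()
```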
View File
@ -0,0 +1,119 @@
---
title: Create histograms with Flux
list_title: Histograms
description: >
Use the `histogram()` function to create cumulative histograms with Flux.
menu:
enterprise_influxdb_1_10:
name: Histograms
parent: Query with Flux
weight: 10
list_query_example: histogram
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/histograms/
v2: /influxdb/v2.0/query-data/flux/histograms/
---
Histograms provide valuable insight into the distribution of your data.
This guide walks through using Flux's `histogram()` function to transform your data into a **cumulative histogram**.
## histogram() function
The [`histogram()` function](/{{< latest "flux" >}}/stdlib/universe/histogram) approximates the
cumulative distribution of a dataset by counting data frequencies for a list of "bins."
A **bin** is simply a range in which a data point falls.
All data points less than or equal to a bin's upper bound are counted in that bin.
In the histogram output, a column (`le`) is added that represents the upper bound of each bin.
Bin counts are cumulative.
```js
from(bucket: "telegraf/autogen")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|> histogram(bins: [0.0, 10.0, 20.0, 30.0])
```
> Values output by the `histogram` function represent points of data aggregated over time.
> Since values do not represent single points in time, there is no `_time` column in the output table.
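Because bin counts are cumulative, each count includes everything counted in the smaller bins. If you want per-bin counts instead, one approach (a sketch, not part of the example above) is to apply `difference()` to the histogram output; note that `difference()` drops the first record of each table by default:
```js
from(bucket: "telegraf/autogen")
    |> range(start: -5m)
    |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
    |> histogram(bins: [0.0, 10.0, 20.0, 30.0])
    // Subtract each cumulative count from the next to get the number of
    // points that fall within each individual bin.
    |> difference()
```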
## Bin helper functions
Flux provides two helper functions for generating histogram bins.
Each generates and outputs an array of floats designed to be used in the `histogram()` function's `bins` parameter.
### linearBins()
The [`linearBins()` function](/{{< latest "flux" >}}/stdlib/built-in/misc/linearbins) generates a list of linearly separated floats.
```js
linearBins(start: 0.0, width: 10.0, count: 10)
// Generated list: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, +Inf]
```
### logarithmicBins()
The [`logarithmicBins()` function](/{{< latest "flux" >}}/stdlib/built-in/misc/logarithmicbins) generates a list of exponentially separated floats.
```js
logarithmicBins(start: 1.0, factor: 2.0, count: 10, infinity: true)
// Generated list: [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, +Inf]
```
## Examples
### Generating a histogram with linear bins
```js
from(bucket: "telegraf/autogen")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|> histogram(bins: linearBins(start: 65.5, width: 0.5, count: 20, infinity: false))
```
###### Output table
```
Table: keys: [_start, _stop, _field, _measurement, host]
_start:time _stop:time _field:string _measurement:string host:string le:float _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------ ---------------------------- ----------------------------
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 65.5 5
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 66 6
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 66.5 8
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 67 9
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 67.5 9
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 68 10
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 68.5 12
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 69 12
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 69.5 15
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 70 23
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 70.5 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 71 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 71.5 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 72 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 72.5 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 73 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 73.5 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 74 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 74.5 30
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 75 30
```
### Generating a histogram with logarithmic bins
```js
from(bucket: "telegraf/autogen")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|> histogram(bins: logarithmicBins(start: 0.5, factor: 2.0, count: 10, infinity: false))
```
###### Output table
```
Table: keys: [_start, _stop, _field, _measurement, host]
_start:time _stop:time _field:string _measurement:string host:string le:float _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------ ---------------------------- ----------------------------
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 0.5 0
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 1 0
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 2 0
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 4 0
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 8 0
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 16 0
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 32 0
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 64 2
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 128 30
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 256 30
```
View File
@ -0,0 +1,56 @@
---
title: Calculate the increase
seotitle: Calculate the increase in Flux
list_title: Increase
description: >
Use the `increase()` function to track increases across multiple columns in a table.
This function is especially useful when tracking changes in counter values that
wrap over time or periodically reset.
weight: 10
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
name: Increase
list_query_example: increase
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/increase/
v2: /influxdb/v2.0/query-data/flux/increase/
---
Use the [`increase()` function](/{{< latest "flux" >}}/stdlib/universe/increase/)
to track increases across multiple columns in a table.
This function is especially useful when tracking changes in counter values that
wrap over time or periodically reset.
```js
data
|> increase()
```
`increase()` returns a cumulative sum of **non-negative** differences between rows in a table.
For example:
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1 |
| 2020-01-01T00:02:00Z | 2 |
| 2020-01-01T00:03:00Z | 8 |
| 2020-01-01T00:04:00Z | 10 |
| 2020-01-01T00:05:00Z | 0 |
| 2020-01-01T00:06:00Z | 4 |
{{% /flex-content %}}
{{% flex-content %}}
**`increase()` returns:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:02:00Z | 1 |
| 2020-01-01T00:03:00Z | 7 |
| 2020-01-01T00:04:00Z | 9 |
| 2020-01-01T00:05:00Z | 9 |
| 2020-01-01T00:06:00Z | 13 |
{{% /flex-content %}}
{{< /flex >}}
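As a runnable sketch (the bucket, measurement, and field names below are assumptions, not part of the example above), the following query applies `increase()` to a Telegraf network counter that may reset when the host restarts:
```js
from(bucket: "telegraf/autogen")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "net" and r._field == "bytes_recv")
    // increase() sums only non-negative differences between rows, so a
    // counter reset doesn't subtract from the running total.
    |> increase()
```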
View File
@ -0,0 +1,285 @@
---
title: Join data with Flux
seotitle: Join data in InfluxDB with Flux
list_title: Join
description: This guide walks through joining data with Flux and outlines how it shapes your data in the process.
menu:
enterprise_influxdb_1_10:
name: Join
parent: Query with Flux
weight: 10
list_query_example: join
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/join/
v2: /influxdb/v2.0/query-data/flux/join/
---
The [`join()` function](/{{< latest "flux" >}}/stdlib/universe/join) merges two or more
input streams, whose values are equal on a set of common columns, into a single output stream.
Flux allows you to join on any columns common between two data streams and opens the door
for operations such as cross-measurement joins and math across measurements.
To illustrate a join operation, use data captured by Telegraf and stored in
InfluxDB: memory usage and processes.
In this guide, we'll join two data streams, one representing memory usage and the other representing the
total number of running processes, then calculate the average memory usage per running process.
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
## Define stream variables
In order to perform a join, you must have two streams of data.
Assign a variable to each data stream.
### Memory used variable
Define a `memUsed` variable that filters on the `mem` measurement and the `used` field.
This returns the amount of memory (in bytes) used.
###### memUsed stream definition
```js
memUsed = from(bucket: "db/rp")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used")
```
{{% truncate %}}
###### memUsed data output
```
Table: keys: [_start, _stop, _field, _measurement, host]
_start:time _stop:time _field:string _measurement:string host:string _time:time _value:int
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------ ------------------------------ --------------------------
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:00.000000000Z 10956333056
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:10.000000000Z 11014008832
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:20.000000000Z 11373428736
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:30.000000000Z 11001421824
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:40.000000000Z 10985852928
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:50.000000000Z 10992279552
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:00.000000000Z 11053568000
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:10.000000000Z 11092242432
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:20.000000000Z 11612774400
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:30.000000000Z 11131961344
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:40.000000000Z 11124805632
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:50.000000000Z 11332464640
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:00.000000000Z 11176923136
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:10.000000000Z 11181068288
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:20.000000000Z 11182579712
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:30.000000000Z 11238862848
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:40.000000000Z 11275296768
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:50.000000000Z 11225411584
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:00.000000000Z 11252690944
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:10.000000000Z 11227029504
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:20.000000000Z 11201646592
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:30.000000000Z 11227897856
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:40.000000000Z 11330428928
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:50.000000000Z 11347976192
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:00.000000000Z 11368271872
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:10.000000000Z 11269623808
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:20.000000000Z 11295637504
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:30.000000000Z 11354423296
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:40.000000000Z 11379687424
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:50.000000000Z 11248926720
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:55:00.000000000Z 11292524544
```
{{% /truncate %}}
### Total processes variable
Define a `procTotal` variable that filters on the `processes` measurement and the `total` field.
This returns the number of running processes.
###### procTotal stream definition
```js
procTotal = from(bucket: "db/rp")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "processes" and r._field == "total")
```
{{% truncate %}}
###### procTotal data output
```
Table: keys: [_start, _stop, _field, _measurement, host]
_start:time _stop:time _field:string _measurement:string host:string _time:time _value:int
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------ ------------------------------ --------------------------
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:00.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:10.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:20.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:30.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:40.000000000Z 469
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:50.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:00.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:10.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:20.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:30.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:40.000000000Z 469
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:50.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:00.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:10.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:20.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:30.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:40.000000000Z 472
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:50.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:00.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:10.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:20.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:30.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:40.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:50.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:00.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:10.000000000Z 470
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:20.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:30.000000000Z 473
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:40.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:50.000000000Z 471
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:55:00.000000000Z 471
```
{{% /truncate %}}
## Join the two data streams
With the two data streams defined, use the `join()` function to join them together.
`join()` requires two parameters:
##### `tables`
A map of tables to join, with keys by which they will be aliased.
In the example below, `mem` is the alias for `memUsed` and `proc` is the alias for `procTotal`.
##### `on`
An array of strings defining the columns on which the tables will be joined.
_**Both tables must have all columns specified in this list.**_
```js
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
```
{{% truncate %}}
###### Joined output table
```
Table: keys: [_field_mem, _field_proc, _measurement_mem, _measurement_proc, _start, _stop, host]
_field_mem:string _field_proc:string _measurement_mem:string _measurement_proc:string _start:time _stop:time host:string _time:time _value_mem:int _value_proc:int
---------------------- ---------------------- ----------------------- ------------------------ ------------------------------ ------------------------------ ------------------------ ------------------------------ -------------------------- --------------------------
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:00.000000000Z 10956333056 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:10.000000000Z 11014008832 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:20.000000000Z 11373428736 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:30.000000000Z 11001421824 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:40.000000000Z 10985852928 469
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:50.000000000Z 10992279552 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:00.000000000Z 11053568000 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:10.000000000Z 11092242432 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:20.000000000Z 11612774400 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:30.000000000Z 11131961344 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:40.000000000Z 11124805632 469
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:50.000000000Z 11332464640 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:00.000000000Z 11176923136 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:10.000000000Z 11181068288 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:20.000000000Z 11182579712 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:30.000000000Z 11238862848 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:40.000000000Z 11275296768 472
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:50.000000000Z 11225411584 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:00.000000000Z 11252690944 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:10.000000000Z 11227029504 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:20.000000000Z 11201646592 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:30.000000000Z 11227897856 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:40.000000000Z 11330428928 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:50.000000000Z 11347976192 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:00.000000000Z 11368271872 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:10.000000000Z 11269623808 470
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:20.000000000Z 11295637504 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:30.000000000Z 11354423296 473
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:40.000000000Z 11379687424 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:50.000000000Z 11248926720 471
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:55:00.000000000Z 11292524544 471
```
{{% /truncate %}}
Notice the output table includes the following columns:
- `_field_mem`
- `_field_proc`
- `_measurement_mem`
- `_measurement_proc`
- `_value_mem`
- `_value_proc`
These represent the columns with values unique to the two input tables.
## Calculate and create a new table
With the two streams of data joined into a single table, use the
[`map()` function](/{{< latest "flux" >}}/stdlib/universe/map)
to build a new table that keeps the existing `_time` column and maps the result of
dividing `_value_mem` by `_value_proc` to a new `_value` column.
```js
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
|> map(fn: (r) => ({_time: r._time, _value: r._value_mem / r._value_proc}))
```
{{% truncate %}}
###### Mapped table
```
Table: keys: [_field_mem, _field_proc, _measurement_mem, _measurement_proc, _start, _stop, host]
_field_mem:string _field_proc:string _measurement_mem:string _measurement_proc:string _start:time _stop:time host:string _time:time _value:int
---------------------- ---------------------- ----------------------- ------------------------ ------------------------------ ------------------------------ ------------------------ ------------------------------ --------------------------
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:00.000000000Z 23311346
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:10.000000000Z 23434061
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:20.000000000Z 24147407
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:30.000000000Z 23407280
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:40.000000000Z 23423993
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:50.000000000Z 23338173
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:00.000000000Z 23518229
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:10.000000000Z 23600515
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:20.000000000Z 24708030
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:30.000000000Z 23685024
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:40.000000000Z 23720267
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:50.000000000Z 24060434
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:00.000000000Z 23730197
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:10.000000000Z 23789506
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:20.000000000Z 23792722
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:30.000000000Z 23861704
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:40.000000000Z 23888340
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:50.000000000Z 23833145
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:00.000000000Z 23941895
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:10.000000000Z 23887296
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:20.000000000Z 23833290
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:30.000000000Z 23838424
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:40.000000000Z 24056112
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:50.000000000Z 24093367
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:00.000000000Z 24136458
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:10.000000000Z 23977922
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:20.000000000Z 23982245
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:30.000000000Z 24005123
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:40.000000000Z 24160695
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:50.000000000Z 23883071
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:55:00.000000000Z 23975635
```
{{% /truncate %}}
This table represents the average amount of memory in bytes per running process.
## Real world example
The following function calculates the batch sizes written to an InfluxDB cluster by joining
fields from the `influxdb_httpd` and `influxdb_write` measurements to compare `pointReq` and `writeReq`.
The results are grouped by cluster ID so you can compare batch sizes across clusters.
```js
batchSize = (cluster_id, start=-1m, interval=10s) => {
httpd = from(bucket: "telegraf")
|> range(start: start)
|> filter(fn: (r) => r._measurement == "influxdb_httpd" and r._field == "writeReq" and r.cluster_id == cluster_id)
|> aggregateWindow(every: interval, fn: mean)
|> derivative(nonNegative: true, unit: 60s)
write = from(bucket: "telegraf")
|> range(start: start)
|> filter(fn: (r) => r._measurement == "influxdb_write" and r._field == "pointReq" and r.cluster_id == cluster_id)
|> aggregateWindow(every: interval, fn: max)
|> derivative(nonNegative: true, unit: 60s)
return join(tables: {httpd: httpd, write: write}, on: ["_time", "_stop", "_start", "host"])
|> map(fn: (r) => ({_time: r._time, _value: r._value_httpd / r._value_write}))
|> group(columns: ["cluster_id"])
}
batchSize(cluster_id: "enter cluster id here")
```

View File

@ -0,0 +1,183 @@
---
title: Manipulate timestamps with Flux
list_title: Manipulate timestamps
description: >
Use Flux to process and manipulate timestamps.
menu:
enterprise_influxdb_1_10:
name: Manipulate timestamps
parent: Query with Flux
weight: 20
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/manipulate-timestamps/
v2: /influxdb/v2.0/query-data/flux/manipulate-timestamps/
---
Every point stored in InfluxDB has an associated timestamp.
Use Flux to process and manipulate timestamps to suit your needs.
- [Convert timestamp format](#convert-timestamp-format)
- [Calculate the duration between two timestamps](#calculate-the-duration-between-two-timestamps)
- [Retrieve the current time](#retrieve-the-current-time)
- [Normalize irregular timestamps](#normalize-irregular-timestamps)
- [Use timestamps and durations together](#use-timestamps-and-durations-together)
{{% note %}}
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
{{% /note %}}
## Convert timestamp format
- [Unix nanosecond to RFC3339](#unix-nanosecond-to-rfc3339)
- [RFC3339 to Unix nanosecond](#rfc3339-to-unix-nanosecond)
### Unix nanosecond to RFC3339
Use the [`time()` function](/{{< latest "flux" >}}/stdlib/universe/type-conversions/time/)
to convert a [Unix **nanosecond** timestamp](/{{< latest "influxdb" "v2" >}}/reference/glossary/#unix-timestamp)
to an [RFC3339 timestamp](/{{< latest "influxdb" "v2" >}}/reference/glossary/#rfc3339-timestamp).
```js
time(v: 1568808000000000000)
// Returns 2019-09-18T12:00:00.000000000Z
```
### RFC3339 to Unix nanosecond
Use the [`uint()` function](/{{< latest "flux" >}}/stdlib/universe/type-conversions/uint/)
to convert an RFC3339 timestamp to a Unix nanosecond timestamp.
```js
uint(v: 2019-09-18T12:00:00.000000000Z)
// Returns 1568808000000000000
```
## Calculate the duration between two timestamps
Flux doesn't support mathematical operations using [time type](/{{< latest "flux" >}}/language/types/#time-types) values.
To calculate the duration between two timestamps:
1. Use the `uint()` function to convert each timestamp to a Unix nanosecond timestamp.
2. Subtract one Unix nanosecond timestamp from the other.
3. Use the `duration()` function to convert the result into a duration.
```js
time1 = uint(v: 2019-09-17T21:12:05Z)
time2 = uint(v: 2019-09-18T22:16:35Z)
duration(v: time2 - time1)
// Returns 25h4m30s
```
{{% note %}}
Flux doesn't support duration column types.
To store a duration in a column, use the [`string()` function](/{{< latest "flux" >}}/stdlib/universe/type-conversions/string/)
to convert the duration to a string.
{{% /note %}}
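For example, a minimal sketch (reusing the timestamps above) that produces a string you could store in a column:

```js
time1 = uint(v: 2019-09-17T21:12:05Z)
time2 = uint(v: 2019-09-18T22:16:35Z)

// Convert the calculated duration to a string
string(v: duration(v: time2 - time1))

// Returns "25h4m30s"
```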
## Retrieve the current time
- [Current UTC time](#current-utc-time)
- [Current system time](#current-system-time)
### Current UTC time
Use the [`now()` function](/{{< latest "flux" >}}/stdlib/built-in/misc/now/) to
return the current UTC time in RFC3339 format.
```js
now()
```
{{% note %}}
`now()` is cached at runtime, so all instances of `now()` in a Flux script
return the same value.
{{% /note %}}
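For example, a quick sketch of the cached behavior:

```js
// Both calls evaluate to the same timestamp within a single execution
t1 = now()
t2 = now()

t1 == t2
// Returns true
```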
### Current system time
Import the `system` package and use the [`system.time()` function](/{{< latest "flux" >}}/stdlib/system/time/)
to return the current system time of the host machine in RFC3339 format.
```js
import "system"
system.time()
```
{{% note %}}
`system.time()` returns the time it is executed, so each instance of `system.time()`
in a Flux script returns a unique value.
{{% /note %}}
## Normalize irregular timestamps
To normalize irregular timestamps, truncate all `_time` values to a specified unit
with the [`truncateTimeColumn()` function](/{{< latest "flux" >}}/stdlib/universe/truncatetimecolumn/).
This is useful in [`join()`](/{{< latest "flux" >}}/stdlib/universe/join/)
and [`pivot()`](/{{< latest "flux" >}}/stdlib/universe/pivot/)
operations where points should align by time, but timestamps vary slightly.
```js
data
|> truncateTimeColumn(unit: 1m)
```
{{< flex >}}
{{% flex-content %}}
**Input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:00:49Z | 2.0 |
| 2020-01-01T00:01:01Z | 1.9 |
| 2020-01-01T00:03:22Z | 1.8 |
| 2020-01-01T00:04:04Z | 1.9 |
| 2020-01-01T00:05:38Z | 2.1 |
{{% /flex-content %}}
{{% flex-content %}}
**Output:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:00:00Z | 2.0 |
| 2020-01-01T00:01:00Z | 1.9 |
| 2020-01-01T00:03:00Z | 1.8 |
| 2020-01-01T00:04:00Z | 1.9 |
| 2020-01-01T00:05:00Z | 2.1 |
{{% /flex-content %}}
{{< /flex >}}
## Use timestamps and durations together
- [Add a duration to a timestamp](#add-a-duration-to-a-timestamp)
- [Subtract a duration from a timestamp](#subtract-a-duration-from-a-timestamp)
### Add a duration to a timestamp
The [`experimental.addDuration()` function](/{{< latest "flux" >}}/stdlib/experimental/addduration/)
adds a duration to a specified time and returns the resulting time.
{{% warn %}}
By using `experimental.addDuration()`, you accept the
[risks of experimental functions](/{{< latest "flux" >}}/stdlib/experimental/#experimental-functions-are-subject-to-change).
{{% /warn %}}
```js
import "experimental"
experimental.addDuration(d: 6h, to: 2019-09-16T12:00:00Z)
// Returns 2019-09-16T18:00:00.000000000Z
```
### Subtract a duration from a timestamp
The [`experimental.subDuration()` function](/{{< latest "flux" >}}/stdlib/experimental/subduration/)
subtracts a duration from a specified time and returns the resulting time.
{{% warn %}}
By using `experimental.subDuration()`, you accept the
[risks of experimental functions](/{{< latest "flux" >}}/stdlib/experimental/#experimental-functions-are-subject-to-change).
{{% /warn %}}
```js
import "experimental"
experimental.subDuration(d: 6h, from: 2019-09-16T12:00:00Z)
// Returns 2019-09-16T06:00:00.000000000Z
```

View File

@ -0,0 +1,200 @@
---
title: Transform data with mathematic operations
seotitle: Transform data with mathematic operations in Flux
list_title: Transform data with math
description: >
Use the `map()` function to remap column values and apply mathematic operations.
menu:
enterprise_influxdb_1_10:
name: Transform data with math
parent: Query with Flux
weight: 5
list_query_example: map_math
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/mathematic-operations/
v2: /influxdb/v2.0/query-data/flux/mathematic-operations/
---
Flux supports mathematic expressions in data transformations.
This article describes how to use [Flux arithmetic operators](/{{< latest "flux" >}}/language/operators/#arithmetic-operators)
to "map" over data and transform values using mathematic operations.
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
##### Basic mathematic operations
```js
// Examples executed using the Flux REPL
> 9 + 9
18
> 22 - 14
8
> 6 * 5
30
> 21 / 7
3
```
<p style="font-size:.85rem;font-style:italic;margin-top:-2rem;">See <a href="/influxdb/v2.0/tools/repl/">Flux read-eval-print-loop (REPL)</a>.</p>
{{% note %}}
#### Operands must be the same type
Operands in Flux mathematic operations must be the same data type.
For example, integers cannot be used in operations with floats.
Otherwise, you will get an error similar to:
```
Error: type error: float != int
```
To convert operands to the same type, use [type-conversion functions](/{{< latest "flux" >}}/stdlib/universe/type-conversions/)
or manually format operands.
The operand data type determines the output data type.
For example:
```js
100 // Parsed as an integer
100.0 // Parsed as a float
// Example evaluations
> 20 / 8
2
> 20.0 / 8.0
2.5
```
{{% /note %}}
## Custom mathematic functions
Flux lets you [create custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) that use mathematic operations.
View the examples below.
###### Custom multiplication function
```js
multiply = (x, y) => x * y
multiply(x: 10, y: 12)
// Returns 120
```
###### Custom percentage function
```js
percent = (sample, total) => (sample / total) * 100.0
percent(sample: 20.0, total: 80.0)
// Returns 25.0
```
### Transform values in a data stream
To transform multiple values in an input stream, your function needs to:
- [Handle piped-forward data](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions/#functions-that-manipulate-piped-forward-data).
- Ensure that each operand necessary for the calculation exists in each row _(see [Pivot vs join](#pivot-vs-join) below)_.
- Use the [`map()` function](/{{< latest "flux" >}}/stdlib/universe/map) to iterate over each row.
The example `multiplyByX()` function below includes:
- A `tables` parameter that represents the input data stream (`<-`).
- An `x` parameter which is the number by which values in the `_value` column are multiplied.
- A `map()` function that iterates over each row in the input stream.
It uses the `with` operator to preserve existing columns in each row.
It also multiplies the `_value` column by `x`.
```js
multiplyByX = (x, tables=<-) => tables
|> map(fn: (r) => ({r with _value: r._value * x}))
data
|> multiplyByX(x: 10)
```
## Examples
### Convert bytes to gigabytes
To convert active memory from bytes to gigabytes (GB), divide the `active` field
in the `mem` measurement by 1,073,741,824.
The `map()` function iterates over each row in the piped-forward data and defines
a new `_value` by dividing the original `_value` by 1073741824.
```js
from(bucket: "db/rp")
|> range(start: -10m)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "active")
|> map(fn: (r) => ({r with _value: r._value / 1073741824}))
```
You could turn that same calculation into a function:
```js
bytesToGB = (tables=<-) => tables
|> map(fn: (r) => ({r with _value: r._value / 1073741824}))
data
|> bytesToGB()
```
#### Include partial gigabytes
Because the original metric (bytes) is an integer, the output of the operation is an integer and does not include partial GBs.
To calculate partial GBs, convert the `_value` column and its values to floats using the
[`float()` function](/{{< latest "flux" >}}/stdlib/universe/type-conversions/float)
and format the denominator in the division operation as a float.
```js
bytesToGB = (tables=<-) => tables
|> map(fn: (r) => ({r with _value: float(v: r._value) / 1073741824.0}))
```
### Calculate a percentage
To calculate a percentage, use simple division, then multiply the result by 100.
```js
> 1.0 / 4.0 * 100.0
25.0
```
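Applied to a stream of data, a minimal sketch might look like the following (the `used` and `total` field names are hypothetical stand-ins for whatever fields hold your sample and total values):

```js
from(bucket: "db/rp")
    |> range(start: -10m)
    |> filter(fn: (r) => r._measurement == "mem" and (r._field == "used" or r._field == "total"))
    // Align both fields in the same row
    |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
    // Convert to floats to keep the fractional part of the percentage
    |> map(fn: (r) => ({r with _value: float(v: r.used) / float(v: r.total) * 100.0}))
```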
_For an in-depth look at calculating percentages, see [Calculate percentages](/enterprise_influxdb/v1.10/flux/guides/calculate-percentages)._
## Pivot vs join
To query and use values in mathematical operations in Flux, operand values must
exist in a single row.
Both `pivot()` and `join()` will do this, but there are important differences between the two:
#### Pivot is more performant
`pivot()` reads and operates on a single stream of data.
`join()` requires two streams of data and the overhead of reading and combining
both streams can be significant, especially for larger data sets.
#### Use join for multiple data sources
Use `join()` when querying data from different buckets or data sources.
##### Pivot fields into columns for mathematic calculations
```js
data
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({r with _value: (r.field1 + r.field2) / r.field3 * 100.0}))
```
##### Join multiple data sources for mathematic calculations
```js
import "sql"
import "influxdata/influxdb/secrets"
pgUser = secrets.get(key: "POSTGRES_USER")
pgPass = secrets.get(key: "POSTGRES_PASSWORD")
pgHost = secrets.get(key: "POSTGRES_HOST")
t1 = sql.from(
driverName: "postgres",
dataSourceName: "postgresql://${pgUser}:${pgPass}@${pgHost}",
query: "SELECT id, name, available FROM exampleTable",
)
t2 = from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement" and r._field == "example-field")
join(tables: {t1: t1, t2: t2}, on: ["id"])
|> map(fn: (r) => ({r with _value: r._value_t2 / r.available_t1 * 100.0}))
```

View File

@ -0,0 +1,143 @@
---
title: Find median values
seotitle: Find median values in Flux
list_title: Median
description: >
Use the `median()` function to return a value representing the `0.5` quantile (50th percentile) or median of input data.
weight: 10
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
name: Median
list_query_example: median
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/median/
v2: /influxdb/v2.0/query-data/flux/median/
---
Use the [`median()` function](/{{< latest "flux" >}}/stdlib/universe/median/)
to return a value representing the `0.5` quantile (50th percentile) or median of input data.
## Select a method for calculating the median
Select one of the following methods to calculate the median:
- [estimate_tdigest](#estimate-tdigest)
- [exact_mean](#exact-mean)
- [exact_selector](#exact-selector)
### estimate_tdigest
**(Default)** An aggregate method that uses a [t-digest data structure](https://github.com/tdunning/t-digest)
to compute an accurate `0.5` quantile estimate on large data sources.
Output tables consist of a single row containing the calculated median.
{{< flex >}}
{{% flex-content %}}
**Given the following input table:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.0 |
| 2020-01-01T00:03:00Z | 2.0 |
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{% flex-content %}}
**`estimate_tdigest` returns:**
| _value |
|:------:|
| 1.5 |
{{% /flex-content %}}
{{< /flex >}}
### exact_mean
An aggregate method that takes the average of the two points closest to the `0.5` quantile value.
Output tables consist of a single row containing the calculated median.
{{< flex >}}
{{% flex-content %}}
**Given the following input table:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.0 |
| 2020-01-01T00:03:00Z | 2.0 |
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{% flex-content %}}
**`exact_mean` returns:**
| _value |
|:------:|
| 1.5 |
{{% /flex-content %}}
{{< /flex >}}
### exact_selector
A selector method that returns the data point for which at least 50% of points are less than.
Output tables consist of a single row containing the calculated median.
{{< flex >}}
{{% flex-content %}}
**Given the following input table:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.0 |
| 2020-01-01T00:03:00Z | 2.0 |
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{% flex-content %}}
**`exact_selector` returns:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:02:00Z | 1.0 |
{{% /flex-content %}}
{{< /flex >}}
{{% note %}}
The examples below use the [example data variable](/enterprise_influxdb/v1.10/flux/guides/#example-data-variable).
{{% /note %}}
## Find the value that represents the median
Use the default method, `"estimate_tdigest"`, to return a single row per input table
containing an estimated median of the data in the table.
```js
data
|> median()
```
## Find the average of values closest to the median
Use the `exact_mean` method to return a single row per input table containing the
average of the two values closest to the mathematical median of data in the table.
```js
data
|> median(method: "exact_mean")
```
## Find the point with the median value
Use the `exact_selector` method to return a single row per input table containing the
value that 50% of values in the table are less than.
```js
data
|> median(method: "exact_selector")
```
## Use median() with aggregateWindow()
[`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
segments data into windows of time, aggregates data in each window into a single
point, and then removes the time-based segmentation.
It is primarily used to downsample data.
To specify the [median calculation method](#select-a-method-for-calculating-the-median) in `aggregateWindow()`, use the
[full function syntax](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/#specify-parameters-of-the-aggregate-function):
```js
data
|> aggregateWindow(every: 5m, fn: (tables=<-, column) => tables |> median(method: "exact_selector"))
```

View File

@ -0,0 +1,143 @@
---
title: Monitor states
seotitle: Monitor states and state changes in your events and metrics with Flux.
description: Flux provides several functions to help monitor states and state changes in your data.
menu:
enterprise_influxdb_1_10:
name: Monitor states
parent: Query with Flux
weight: 20
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/monitor-states/
v2: /influxdb/v2.0/query-data/flux/monitor-states/
---
Flux helps you monitor states in your metrics and events:
- [Find how long a state persists](#find-how-long-a-state-persists)
- [Count the number of consecutive states](#count-the-number-of-consecutive-states)
- [Detect state changes](#example-query-to-count-machine-state)
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
## Find how long a state persists
1. Use the [`stateDuration()`](/{{< latest "flux" >}}/stdlib/universe/stateduration/) function to calculate how long a column value has remained the same value (or state). Include the following information:
- **Column to search:** any tag key, tag value, field key, field value, or measurement.
- **Value:** the value (or state) to search for in the specified column.
- **State duration column:** a new column to store the state duration (the length of time that the specified value persists).
- **Unit:** the unit of time (`1s` (by default), `1m`, `1h`) used to increment the state duration.
<!-- -->
```js
data
|> stateDuration(
fn: (r) => r._column_to_search == "value_to_search_for",
column: "state_duration",
unit: 1s,
)
```
2. Use `stateDuration()` to search each point for the specified value:
- For the first point that evaluates `true`, the state duration is set to `0`. For each consecutive point that evaluates `true`, the state duration increases by the time interval between each consecutive point (in specified units).
- If the state is `false`, the state duration is reset to `-1`.
### Example query with stateDuration()
The following query searches the `doors` bucket over the past 5 minutes to find how many seconds a door has been `closed`.
```js
from(bucket: "doors")
|> range(start: -5m)
|> stateDuration(
fn: (r) => r._value == "closed",
column: "door_closed",
unit: 1s,
)
```
In this example, `door_closed` is the **State duration** column. If you write data to the `doors` bucket every minute, the state duration increases by `60s` for each consecutive point where `_value` is `closed`. If `_value` is not `closed`, the state duration is reset to `-1`.
#### Query results
Results for the example query above may look like this (for simplicity, we've omitted the measurement, tag, and field columns):
```bash
_time _value door_closed
2019-10-26T17:39:16Z closed 0
2019-10-26T17:40:16Z closed 60
2019-10-26T17:41:16Z closed 120
2019-10-26T17:42:16Z open -1
2019-10-26T17:43:16Z closed 0
2019-10-26T17:44:27Z closed 60
```
## Count the number of consecutive states
1. Use the `stateCount()` function and include the following information:
- **Column to search:** any tag key, tag value, field key, field value, or measurement.
- **Value:** the value (or state) to search for in the specified column.
- **State count column:** a new column to store the state count (the number of consecutive records in which the specified value exists).
<!-- -->
```js
data
|> stateCount(
fn: (r) => r._column_to_search == "value_to_search_for",
column: "state_count"
)
```
2. Use `stateCount()` to search each point for the specified value:
- For the first point that evaluates `true`, the state count is set to `1`. For each consecutive point that evaluates `true`, the state count increases by 1.
- If the state is `false`, the state count is reset to `-1`.
### Example query with stateCount()
The following query searches the `doors` bucket over the past 5 minutes and
calculates how many points have `closed` as their `_value`.
```js
from(bucket: "doors")
|> range(start: -5m)
|> stateCount(fn: (r) => r._value == "closed", column: "door_closed")
```
This example stores the **state count** in the `door_closed` column.
If you write data to the `doors` bucket every minute, the state count increases
by `1` for each consecutive point where `_value` is `closed`.
If `_value` is not `closed`, the state count is reset to `-1`.
#### Query results
Results for the example query above may look like this (for simplicity, we've omitted the measurement, tag, and field columns):
```bash
_time _value door_closed
2019-10-26T17:39:16Z closed 1
2019-10-26T17:40:16Z closed 2
2019-10-26T17:41:16Z closed 3
2019-10-26T17:42:16Z open -1
2019-10-26T17:43:16Z closed 1
2019-10-26T17:44:27Z closed 2
```
#### Example query to count machine state
The following query checks the machine state every minute (idle, assigned, or busy).
InfluxDB searches the `servers` bucket over the past hour and counts records with a machine state of `idle`, `assigned` or `busy`.
```js
from(bucket: "servers")
|> range(start: -1h)
|> filter(fn: (r) => r.machine_state == "idle" or r.machine_state == "assigned" or r.machine_state == "busy")
|> stateCount(fn: (r) => r.machine_state == "busy", column: "_count")
|> stateCount(fn: (r) => r.machine_state == "assigned", column: "_count")
|> stateCount(fn: (r) => r.machine_state == "idle", column: "_count")
```

View File

@ -0,0 +1,112 @@
---
title: Calculate the moving average
seotitle: Calculate the moving average in Flux
list_title: Moving Average
description: >
Use the `movingAverage()` or `timedMovingAverage()` functions to return the moving average of data.
weight: 10
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
name: Moving Average
list_query_example: moving_average
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/moving-average/
v2: /influxdb/v2.0/query-data/flux/moving-average/
---
Use the [`movingAverage()`](/{{< latest "flux" >}}/stdlib/universe/movingaverage/)
or [`timedMovingAverage()`](/{{< latest "flux" >}}/stdlib/universe/timedmovingaverage/)
functions to return the moving average of data.
```js
data
|> movingAverage(n: 5)
// OR
data
|> timedMovingAverage(every: 5m, period: 10m)
```
### movingAverage()
For each row in a table, `movingAverage()` returns the average of the current value and
**previous** values where `n` is the total number of values used to calculate the average.
If `n = 3`:
| Row # | Calculation |
|:-----:|:----------- |
| 1 | _Insufficient number of rows_ |
| 2 | _Insufficient number of rows_ |
| 3 | (Row1 + Row2 + Row3) / 3 |
| 4 | (Row2 + Row3 + Row4) / 3 |
| 5 | (Row3 + Row4 + Row5) / 3 |
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.2 |
| 2020-01-01T00:03:00Z | 1.8 |
| 2020-01-01T00:04:00Z | 0.9 |
| 2020-01-01T00:05:00Z | 1.4 |
| 2020-01-01T00:06:00Z | 2.0 |
{{% /flex-content %}}
{{% flex-content %}}
**The following would return:**
```js
|> movingAverage(n: 3)
```
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:03:00Z | 1.33 |
| 2020-01-01T00:04:00Z | 1.30 |
| 2020-01-01T00:05:00Z | 1.36 |
| 2020-01-01T00:06:00Z | 1.43 |
{{% /flex-content %}}
{{< /flex >}}
### timedMovingAverage()
For each row in a table, `timedMovingAverage()` returns the average of the
current value and all row values in the **previous** `period` (duration).
It returns moving averages at a frequency defined by the `every` parameter.
Each color in the diagram below represents a period of time used to calculate an
average and the time a point representing the average is returned.
If `every = 30m` and `period = 1h`:
{{< svg "/static/svgs/timed-moving-avg.svg" >}}
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.2 |
| 2020-01-01T00:03:00Z | 1.8 |
| 2020-01-01T00:04:00Z | 0.9 |
| 2020-01-01T00:05:00Z | 1.4 |
| 2020-01-01T00:06:00Z | 2.0 |
{{% /flex-content %}}
{{% flex-content %}}
**The following would return:**
```js
|> timedMovingAverage(every: 2m, period: 4m)
```
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:02:00Z | 1.000 |
| 2020-01-01T00:04:00Z | 1.333 |
| 2020-01-01T00:06:00Z | 1.325 |
| 2020-01-01T00:06:00Z | 1.150 |
{{% /flex-content %}}
{{< /flex >}}

View File

@ -0,0 +1,162 @@
---
title: Find percentile and quantile values
seotitle: Query percentile and quantile values in Flux
list_title: Percentile & quantile
description: >
Use the `quantile()` function to return all values within the `q` quantile or
percentile of input data.
weight: 10
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
name: Percentile & quantile
list_query_example: quantile
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/percentile-quantile/
v2: /influxdb/v2.0/query-data/flux/percentile-quantile/
---
Use the [`quantile()` function](/{{< latest "flux" >}}/stdlib/universe/quantile/)
to return a value representing the `q` quantile or percentile of input data.
## Percentile versus quantile
Percentiles and quantiles are very similar, differing only in the number used to calculate return values.
A percentile is calculated using numbers between `0` and `100`.
A quantile is calculated using numbers between `0.0` and `1.0`.
For example, the **`0.5` quantile** is the same as the **50th percentile**.
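As a small sketch of this equivalence (assuming a `data` stream), query the 95th percentile by passing `q: 0.95`:

```js
// Return the 0.95 quantile (95th percentile) of values in each input table
data
    |> quantile(q: 0.95)
```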
## Select a method for calculating the quantile
Select one of the following methods to calculate the quantile:
- [estimate_tdigest](#estimate-tdigest)
- [exact_mean](#exact-mean)
- [exact_selector](#exact-selector)
### estimate_tdigest
**(Default)** An aggregate method that uses a [t-digest data structure](https://github.com/tdunning/t-digest)
to compute a quantile estimate on large data sources.
Output tables consist of a single row containing the calculated quantile.
If calculating the `0.5` quantile or 50th percentile:
{{< flex >}}
{{% flex-content %}}
**Given the following input table:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.0 |
| 2020-01-01T00:03:00Z | 2.0 |
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{% flex-content %}}
**`estimate_tdigest` returns:**
| _value |
|:------:|
| 1.5 |
{{% /flex-content %}}
{{< /flex >}}
### exact_mean
An aggregate method that takes the average of the two points closest to the quantile value.
Output tables consist of a single row containing the calculated quantile.
If calculating the `0.5` quantile or 50th percentile:
{{< flex >}}
{{% flex-content %}}
**Given the following input table:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.0 |
| 2020-01-01T00:03:00Z | 2.0 |
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{% flex-content %}}
**`exact_mean` returns:**
| _value |
|:------:|
| 1.5 |
{{% /flex-content %}}
{{< /flex >}}
### exact_selector
A selector method that returns the data point for which at least `q` points are less than.
Output tables consist of a single row containing the calculated quantile.
If calculating the `0.5` quantile or 50th percentile:
{{< flex >}}
{{% flex-content %}}
**Given the following input table:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:01:00Z | 1.0 |
| 2020-01-01T00:02:00Z | 1.0 |
| 2020-01-01T00:03:00Z | 2.0 |
| 2020-01-01T00:04:00Z | 3.0 |
{{% /flex-content %}}
{{% flex-content %}}
**`exact_selector` returns:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:02:00Z | 1.0 |
{{% /flex-content %}}
{{< /flex >}}
{{% note %}}
The examples below use the [example data variable](/enterprise_influxdb/v1.10/flux/guides/#example-data-variable).
{{% /note %}}
## Find the value representing the 99th percentile
Use the default method, `"estimate_tdigest"`, to return a single row per input table
containing an estimate of the `0.99` quantile (99th percentile) of the data in the table.
```js
data
|> quantile(q: 0.99)
```
## Find the average of values closest to the quantile
Use the `exact_mean` method to return a single row per input table containing the
average of the two values closest to the mathematical quantile of data in the table.
For example, to calculate the `0.99` quantile:
```js
data
|> quantile(q: 0.99, method: "exact_mean")
```
## Find the point with the quantile value
Use the `exact_selector` method to return a single row per input table containing the
value that `q * 100`% of values in the table are less than.
For example, to calculate the `0.99` quantile:
```js
data
|> quantile(q: 0.99, method: "exact_selector")
```
## Use quantile() with aggregateWindow()
[`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
segments data into windows of time, aggregates data in each window into a single
point, and then removes the time-based segmentation.
It is primarily used to downsample data.
To specify the [quantile calculation method](#select-a-method-for-calculating-the-quantile) in
`aggregateWindow()`, use the [full function syntax](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/#specify-parameters-of-the-aggregate-function):
```js
data
|> aggregateWindow(
every: 5m,
fn: (tables=<-, column) => tables
|> quantile(q: 0.99, method: "exact_selector"),
)
```

View File

@ -0,0 +1,75 @@
---
title: Query fields and tags
seotitle: Query fields and tags in InfluxDB using Flux
description: >
Use the `filter()` function to query data based on fields, tags, or any other column value.
`filter()` performs operations similar to the `SELECT` statement and the `WHERE`
clause in InfluxQL and other SQL-like query languages.
weight: 1
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/query-fields/
v2: /influxdb/v2.0/query-data/flux/query-fields/
list_code_example: |
```js
from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) =>
r._measurement == "example-measurement" and
r._field == "example-field" and
r.tag == "example-tag"
)
```
---
Use the [`filter()` function](/{{< latest "flux" >}}/stdlib/universe/filter/)
to query data based on fields, tags, or any other column value.
`filter()` performs operations similar to the `SELECT` statement and the `WHERE`
clause in InfluxQL and other SQL-like query languages.
## The filter() function
`filter()` has an `fn` parameter that expects a **predicate function**,
an anonymous function comprised of one or more **predicate expressions**.
The predicate function evaluates each input row.
Rows that evaluate to `true` are **included** in the output data.
Rows that evaluate to `false` are **excluded** from the output data.
```js
// ...
|> filter(fn: (r) => r._measurement == "example-measurement" )
```
The `fn` predicate function requires an `r` argument, which represents each row
as `filter()` iterates over input data.
Key-value pairs in the row record represent columns and their values.
Use **dot notation** or **bracket notation** to reference specific column values in the predicate function.
Use [logical operators](/{{< latest "flux" >}}/language/operators/#logical-operators)
to chain multiple predicate expressions together.
```js
// Row record
r = {foo: "bar", baz: "quz"}
// Example predicate function
(r) => r.foo == "bar" and r["baz"] == "quz"
// Evaluation results
(r) => true and true
```
## Filter by fields and tags
The combination of [`from()`](/{{< latest "flux" >}}/stdlib/built-in/inputs/from),
[`range()`](/{{< latest "flux" >}}/stdlib/universe/range),
and `filter()` represent the most basic Flux query:
1. Use `from()` to define your [bucket](/enterprise_influxdb/v1.10/flux/get-started/#buckets).
2. Use `range()` to limit query results by time.
3. Use `filter()` to identify what rows of data to output.
```js
from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement" and r.tag == "example-tag")
|> filter(fn: (r) => r._field == "example-field")
```

View File

@ -0,0 +1,165 @@
---
title: Calculate the rate of change
seotitle: Calculate the rate of change in Flux
list_title: Rate
description: >
Use the `derivative()` function to calculate the rate of change between subsequent values or the
`aggregate.rate()` function to calculate the average rate of change per window of time.
If time between points varies, these functions normalize points to a common time interval
making values easily comparable.
weight: 10
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
name: Rate
list_query_example: rate_of_change
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/rate/
v2: /influxdb/v2.0/query-data/flux/rate/
---
Use the [`derivative()` function](/{{< latest "flux" >}}/stdlib/universe/derivative/)
to calculate the rate of change between subsequent values or the
[`aggregate.rate()` function](/{{< latest "flux" >}}/stdlib/experimental/aggregate/rate/)
to calculate the average rate of change per window of time.
If time between points varies, these functions normalize points to a common time interval
making values easily comparable.
- [Rate of change between subsequent values](#rate-of-change-between-subsequent-values)
- [Average rate of change per window of time](#average-rate-of-change-per-window-of-time)
## Rate of change between subsequent values
Use the [`derivative()` function](/{{< latest "flux" >}}/stdlib/universe/derivative/)
to calculate the rate of change per unit of time between subsequent _non-null_ values.
```js
data
|> derivative(unit: 1s)
```
By default, `derivative()` returns only positive derivative values and replaces negative values with _null_.
Calculated values are returned as [floats](/{{< latest "flux" >}}/language/types/#numeric-types).
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:00:00Z | 250 |
| 2020-01-01T00:04:00Z | 160 |
| 2020-01-01T00:12:00Z | 150 |
| 2020-01-01T00:19:00Z | 220 |
| 2020-01-01T00:32:00Z | 200 |
| 2020-01-01T00:51:00Z | 290 |
| 2020-01-01T01:00:00Z | 340 |
{{% /flex-content %}}
{{% flex-content %}}
**`derivative(unit: 1m)` returns:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:04:00Z | |
| 2020-01-01T00:12:00Z | |
| 2020-01-01T00:19:00Z | 10.0 |
| 2020-01-01T00:32:00Z | |
| 2020-01-01T00:51:00Z | 4.74 |
| 2020-01-01T01:00:00Z | 5.56 |
{{% /flex-content %}}
{{< /flex >}}
Results represent the rate of change **per minute** between subsequent values with
negative values set to _null_.
### Return negative derivative values
To return negative derivative values, set the `nonNegative` parameter to `false`.
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:00:00Z | 250 |
| 2020-01-01T00:04:00Z | 160 |
| 2020-01-01T00:12:00Z | 150 |
| 2020-01-01T00:19:00Z | 220 |
| 2020-01-01T00:32:00Z | 200 |
| 2020-01-01T00:51:00Z | 290 |
| 2020-01-01T01:00:00Z | 340 |
{{% /flex-content %}}
{{% flex-content %}}
**The following returns:**
```js
|> derivative(unit: 1m, nonNegative: false)
```
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:04:00Z | -22.5 |
| 2020-01-01T00:12:00Z | -1.25 |
| 2020-01-01T00:19:00Z | 10.0 |
| 2020-01-01T00:32:00Z | -1.54 |
| 2020-01-01T00:51:00Z | 4.74 |
| 2020-01-01T01:00:00Z | 5.56 |
{{% /flex-content %}}
{{< /flex >}}
Results represent the rate of change **per minute** between subsequent values and
include negative values.
## Average rate of change per window of time
Use the [`aggregate.rate()` function](/{{< latest "flux" >}}/stdlib/experimental/aggregate/rate/)
to calculate the average rate of change per window of time.
```js
import "experimental/aggregate"
data
|> aggregate.rate(every: 1m, unit: 1s, groupColumns: ["tag1", "tag2"])
```
`aggregate.rate()` returns the average rate of change (as a [float](/{{< latest "flux" >}}/language/types/#numeric-types))
per `unit` for time intervals defined by `every`.
Negative values are replaced with _null_.
{{% note %}}
`aggregate.rate()` does not support `nonNegative: false`.
{{% /note %}}
{{< flex >}}
{{% flex-content %}}
**Given the following input:**
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:00:00Z | 250 |
| 2020-01-01T00:04:00Z | 160 |
| 2020-01-01T00:12:00Z | 150 |
| 2020-01-01T00:19:00Z | 220 |
| 2020-01-01T00:32:00Z | 200 |
| 2020-01-01T00:51:00Z | 290 |
| 2020-01-01T01:00:00Z | 340 |
{{% /flex-content %}}
{{% flex-content %}}
**The following returns:**
```js
|> aggregate.rate(every: 20m, unit: 1m)
```
| _time | _value |
|:----- | ------:|
| 2020-01-01T00:20:00Z | |
| 2020-01-01T00:40:00Z | 10.0 |
| 2020-01-01T01:00:00Z | 4.74 |
| 2020-01-01T01:20:00Z | 5.56 |
{{% /flex-content %}}
{{< /flex >}}
Results represent the **average change rate per minute** of every **20 minute interval**
with negative values set to _null_.
Timestamps represent the right bound of the time window used to average values.

View File

@ -0,0 +1,86 @@
---
title: Use regular expressions in Flux
list_title: Regular expressions
description: This guide walks through using regular expressions in evaluation logic in Flux functions.
menu:
enterprise_influxdb_1_10:
name: Regular expressions
parent: Query with Flux
weight: 20
list_query_example: regular_expressions
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/regular-expressions/
v2: /influxdb/v2.0/query-data/flux/regular-expressions/
---
Regular expressions (regexes) are incredibly powerful when matching patterns in large collections of data.
With Flux, regular expressions are primarily used for evaluation logic in predicate functions for things
such as filtering rows, dropping and keeping columns, state detection, etc.
This guide shows how to use regular expressions in your Flux scripts.
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
## Go regular expression syntax
Flux uses Go's [regexp package](https://golang.org/pkg/regexp/) for regular expression search.
The links [below](#helpful-links) provide information about Go's regular expression syntax.
## Regular expression operators
Flux provides two comparison operators for use with regular expressions.
#### `=~`
When the expression on the left **MATCHES** the regular expression on the right, this evaluates to `true`.
#### `!~`
When the expression on the left **DOES NOT MATCH** the regular expression on the right, this evaluates to `true`.
## Regular expressions in Flux
When using regex matching in your Flux scripts, enclose your regular expressions with `/`.
The following is the basic regex comparison syntax:
###### Basic regex comparison syntax
```js
expression =~ /regex/
expression !~ /regex/
```
## Examples
### Use a regex to filter by tag value
The following example filters records by the `cpu` tag.
It only keeps records for which the `cpu` is either `cpu0`, `cpu1`, or `cpu2`.
```js
from(bucket: "db/rp")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user" and r.cpu =~ /cpu[0-2]/)
```
### Use a regex to filter by field key
The following example excludes records that do not have `_percent` in a field key.
```js
from(bucket: "db/rp")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "mem" and r._field =~ /_percent/)
```
### Drop columns matching a regex
The following example drops columns whose names do not begin with `_`.
```js
from(bucket: "db/rp")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "mem")
|> drop(fn: (column) => column !~ /_.*/)
```
## Helpful links
##### Syntax documentation
[regexp Syntax GoDoc](https://godoc.org/regexp/syntax)
[RE2 Syntax Overview](https://github.com/google/re2/wiki/Syntax)
##### Go regex testers
[Regex Tester - Golang](https://regex-golang.appspot.com/assets/html/index.html)
[Regex101](https://regex101.com/)

View File

@ -0,0 +1,249 @@
---
title: Extract scalar values in Flux
list_title: Extract scalar values
description: >
Use Flux stream and table functions to extract scalar values from Flux query output.
This lets you, for example, dynamically set variables using query results.
menu:
enterprise_influxdb_1_10:
name: Extract scalar values
parent: Query with Flux
weight: 20
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/scalar-values/
v2: /influxdb/v2.0/query-data/flux/scalar-values/
list_code_example: |
```js
scalarValue = {
_record =
data
|> tableFind(fn: (key) => true)
|> getRecord(idx: 0)
return _record._value
}
```
---
Use Flux [stream and table functions](/{{< latest "flux" >}}/stdlib/universe/stream-table/)
to extract scalar values from Flux query output.
This lets you, for example, dynamically set variables using query results.
**To extract scalar values from output:**
1. [Extract a table](#extract-a-table).
2. [Extract a column from the table](#extract-a-column-from-the-table)
_**or**_ [extract a row from the table](#extract-a-row-from-the-table).
_The samples on this page use the [sample data provided below](#sample-data)._
{{% warn %}}
#### Current limitations
- The InfluxDB user interface (UI) does not currently support raw scalar output.
Use [`map()`](/{{< latest "flux" >}}/stdlib/universe/map/) to add
scalar values to output data.
- The [Flux REPL](/enterprise_influxdb/v1.10/flux/guides/execute-queries/#influx-cli) does not currently support
Flux stream and table functions (also known as "dynamic queries").
See [#15321](https://github.com/influxdata/influxdb/issues/15231).
{{% /warn %}}
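For example, a minimal sketch (using the `sampleData` variable defined in [Sample data](#sample-data) below and the stream and table functions described in the sections that follow) that uses `map()` to attach an extracted scalar to every output row:

```js
// Extract the SFO temperature values as an array of scalars
sfoTemps = sampleData
    |> tableFind(fn: (key) => key.location == "sfo")
    |> getColumn(column: "_value")

// Add the most recent SFO temperature to every output row
sampleData
    |> map(fn: (r) => ({r with sfo_latest: sfoTemps[3]}))
```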
## Extract a table
Flux formats query results as a stream of tables.
To extract a scalar value from a stream of tables, you must first extract a single table from the stream.
{{% note %}}
If query results include only one table, it is still formatted as a stream of tables.
You still must extract that table from the stream.
{{% /note %}}
Use [`tableFind()`](/{{< latest "flux" >}}/stdlib/universe/stream-table/tablefind/)
to extract the **first** table whose [group key](/enterprise_influxdb/v1.10/flux/get-started/#group-keys)
values match the `fn` **predicate function**.
The predicate function requires a `key` record, which represents the group key of
each table.
```js
sampleData
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
```
The example above returns a single table:
| _time | location | _field | _value |
|:----- |:--------:|:------:| ------:|
| 2019-11-01T12:00:00Z | sfo | temp | 65.1 |
| 2019-11-01T13:00:00Z | sfo | temp | 66.2 |
| 2019-11-01T14:00:00Z | sfo | temp | 66.3 |
| 2019-11-01T15:00:00Z | sfo | temp | 66.8 |
{{% note %}}
#### Extract the correct table
Flux functions do not guarantee table order and `tableFind()` returns only the
**first** table that matches the `fn` predicate.
To extract the table that includes the data you actually want, be very specific in
your predicate function or filter and transform your data to minimize the number
of tables piped-forward into `tableFind()`.
{{% /note %}}
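For example, a minimal sketch that narrows the stream before extracting a table (the field and location values come from the [sample data](#sample-data) below):

```js
// Filter the stream first so fewer tables are piped-forward into tableFind()
sampleData
    |> filter(fn: (r) => r.location == "kord")
    |> tableFind(fn: (key) => key._field == "temp")
```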
## Extract a column from the table
Use the [`getColumn()` function](/{{< latest "flux" >}}/stdlib/universe/stream-table/getcolumn/)
to output an array of values from a specific column in the extracted table.
```js
sampleData
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|> getColumn(column: "_value")
// Returns [65.1, 66.2, 66.3, 66.8]
```
### Use extracted column values
Use a variable to store the array of values.
In the example below, `SFOTemps` represents the array of values.
Reference a specific index (integer starting from `0`) in the array to return the
value at that index.
```js
SFOTemps = sampleData
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|> getColumn(column: "_value")
SFOTemps
// Returns [65.1, 66.2, 66.3, 66.8]
SFOTemps[0]
// Returns 65.1
SFOTemps[2]
// Returns 66.3
```
## Extract a row from the table
Use the [`getRecord()` function](/{{< latest "flux" >}}/stdlib/universe/stream-table/getrecord/)
to output data from a single row in the extracted table.
Specify the index of the row to output using the `idx` parameter.
The function outputs a record with key-value pairs for each column.
```js
sampleData
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|> getRecord(idx: 0)
// Returns {
// _time:2019-11-01T12:00:00Z,
// _field:"temp",
// location:"sfo",
// _value: 65.1
// }
```
### Use an extracted row record
Use a variable to store the extracted row record.
In the example below, `tempInfo` represents the extracted row.
Use [dot notation](/enterprise_influxdb/v1.10/flux/get-started/syntax-basics/#records) to reference
keys in the record.
```js
tempInfo = sampleData
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|> getRecord(idx: 0)
tempInfo
// Returns {
// _time:2019-11-01T12:00:00Z,
// _field:"temp",
// location:"sfo",
// _value: 65.1
// }
tempInfo._time
// Returns 2019-11-01T12:00:00Z
tempInfo.location
// Returns sfo
```
## Example helper functions
Create custom helper functions to extract scalar values from query output.
##### Extract a scalar field value
```js
// Define a helper function to extract field values
getFieldValue = (tables=<-, field) => {
extract = tables
|> tableFind(fn: (key) => key._field == field)
|> getColumn(column: "_value")
return extract[0]
}
// Use the helper function to define a variable
lastJFKTemp = sampleData
|> filter(fn: (r) => r.location == "kjfk")
|> last()
|> getFieldValue(field: "temp")
lastJFKTemp
// Returns 71.2
```
##### Extract scalar row data
```js
// Define a helper function to extract a row as a record
getRow = (tables=<-, field, idx=0) => {
extract = tables
|> tableFind(fn: (key) => true)
|> getRecord(idx: idx)
return extract
}
// Use the helper function to define a variable
lastReported = sampleData
|> last()
|> getRow(field: "temp")
"The last location to report was ${lastReported.location}.
The temperature was ${string(v: lastReported._value)}°F."
// Returns:
// The last location to report was kord.
// The temperature was 38.9°F.
```
---
## Sample data
The following sample data set represents fictional temperature metrics collected
from three locations.
It's formatted in [annotated CSV](https://v2.docs.influxdata.com/v2.0/reference/syntax/annotated-csv/) and imported
into the Flux query using the [`csv.from()` function](/{{< latest "flux" >}}/stdlib/csv/from/).
Place the following at the beginning of your query to use the sample data:
{{% truncate %}}
```js
import "csv"
sampleData = csv.from(csv: "
#datatype,string,long,dateTime:RFC3339,string,string,double
#group,false,true,false,true,true,false
#default,,,,,,
,result,table,_time,location,_field,_value
,,0,2019-11-01T12:00:00Z,sfo,temp,65.1
,,0,2019-11-01T13:00:00Z,sfo,temp,66.2
,,0,2019-11-01T14:00:00Z,sfo,temp,66.3
,,0,2019-11-01T15:00:00Z,sfo,temp,66.8
,,1,2019-11-01T12:00:00Z,kjfk,temp,69.4
,,1,2019-11-01T13:00:00Z,kjfk,temp,69.9
,,1,2019-11-01T14:00:00Z,kjfk,temp,71.0
,,1,2019-11-01T15:00:00Z,kjfk,temp,71.2
,,2,2019-11-01T12:00:00Z,kord,temp,46.4
,,2,2019-11-01T13:00:00Z,kord,temp,46.3
,,2,2019-11-01T14:00:00Z,kord,temp,42.7
,,2,2019-11-01T15:00:00Z,kord,temp,38.9
")
```
{{% /truncate %}}

View File

@ -0,0 +1,64 @@
---
title: Sort and limit data with Flux
seotitle: Sort and limit data in InfluxDB with Flux
list_title: Sort and limit
description: >
Use the `sort()` function to order records within each table by specific columns and the
`limit()` function to limit the number of records in output tables to a fixed number, `n`.
menu:
enterprise_influxdb_1_10:
name: Sort and limit
parent: Query with Flux
weight: 3
list_query_example: sort_limit
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/sort-limit/
v2: /influxdb/v2.0/query-data/flux/sort-limit/
---
Use the [`sort()` function](/{{< latest "flux" >}}/stdlib/universe/sort)
to order records within each table by specific columns and the
[`limit()` function](/{{< latest "flux" >}}/stdlib/universe/limit)
to limit the number of records in output tables to a fixed number, `n`.
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
##### Example sorting system uptime
The following example orders system uptime first by region, then host, then value.
```js
from(bucket: "db/rp")
|> range(start: -12h)
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
|> sort(columns: ["region", "host", "_value"])
```
The [`limit()` function](/{{< latest "flux" >}}/stdlib/universe/limit)
limits the number of records in output tables to a fixed number, `n`.
The following example shows up to 10 records from the past hour.
```js
from(bucket: "db/rp")
    |> range(start: -1h)
    |> limit(n: 10)
```
You can use `sort()` and `limit()` together to show the top N records.
The example below returns the top 10 system uptime values sorted first by
region, then host, then value.
```js
from(bucket: "db/rp")
|> range(start: -12h)
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
|> sort(columns: ["region", "host", "_value"])
|> limit(n: 10)
```
You have now created a Flux query that sorts and limits data.
Flux also provides the [`top()`](/{{< latest "flux" >}}/stdlib/universe/top)
and [`bottom()`](/{{< latest "flux" >}}/stdlib/universe/bottom)
functions to perform both of these operations at the same time.
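For example, a minimal sketch using `top()` on the same data (assuming the same `db/rp` bucket and `system` measurement used above) returns the 10 records with the highest `_value` in each output table:
```js
from(bucket: "db/rp")
    |> range(start: -12h)
    |> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
    |> top(n: 10, columns: ["_value"])
```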

View File

@ -0,0 +1,215 @@
---
title: Query SQL data sources
seotitle: Query SQL data sources with InfluxDB
list_title: Query SQL data
description: >
The Flux `sql` package provides functions for working with SQL data sources.
  Use `sql.from()` to query SQL databases like PostgreSQL and MySQL.
menu:
enterprise_influxdb_1_10:
parent: Query with Flux
list_title: SQL data
weight: 20
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/sql/
v2: /influxdb/v2.0/query-data/flux/sql/
list_code_example: |
```js
import "sql"
sql.from(
driverName: "postgres",
dataSourceName: "postgresql://user:password@localhost",
query: "SELECT * FROM example_table",
)
```
---
The Flux `sql` package provides functions for working with SQL data sources.
[`sql.from()`](/{{< latest "flux" >}}/stdlib/sql/from/) lets you query SQL data sources
like [PostgreSQL](https://www.postgresql.org/), [MySQL](https://www.mysql.com/),
and [SQLite](https://www.sqlite.org/index.html), and use the results with InfluxDB
dashboards, tasks, and other operations.
- [Query a SQL data source](#query-a-sql-data-source)
- [Join SQL data with data in InfluxDB](#join-sql-data-with-data-in-influxdb)
- [Sample sensor data](#sample-sensor-data)
## Query a SQL data source
To query a SQL data source:
1. Import the `sql` package in your Flux query.
2. Use the `sql.from()` function to specify the driver, data source name (DSN),
   and the query used to retrieve data from your SQL data source:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[PostgreSQL](#)
[MySQL](#)
[SQLite](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```js
import "sql"
sql.from(
driverName: "postgres",
dataSourceName: "postgresql://user:password@localhost",
query: "SELECT * FROM example_table",
)
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```js
import "sql"
sql.from(
driverName: "mysql",
dataSourceName: "user:password@tcp(localhost:3306)/db",
query: "SELECT * FROM example_table",
)
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```js
// NOTE: InfluxDB OSS and InfluxDB Cloud do not have access to
// the local filesystem and cannot query SQLite data sources.
// Use the Flux REPL to query an SQLite data source.
import "sql"
sql.from(
driverName: "sqlite3",
dataSourceName: "file:/path/to/test.db?cache=shared&mode=ro",
query: "SELECT * FROM example_table",
)
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
_See the [`sql.from()` documentation](/{{< latest "flux" >}}/stdlib/sql/from/) for
information about required function parameters._
## Join SQL data with data in InfluxDB
One of the primary benefits of querying SQL data sources from InfluxDB
is the ability to enrich query results with data stored outside of InfluxDB.
Using the [air sensor sample data](#sample-sensor-data) below, the following query
joins air sensor metrics stored in InfluxDB with sensor information stored in PostgreSQL.
The joined data lets you query and filter results based on sensor information
that isn't stored in InfluxDB.
```js
// Import the "sql" package
import "sql"
// Query data from PostgreSQL
sensorInfo = sql.from(
driverName: "postgres",
dataSourceName: "postgresql://localhost?sslmode=disable",
query: "SELECT * FROM sensors",
)
// Query data from InfluxDB
sensorMetrics = from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "airSensors")
// Join InfluxDB query results with PostgreSQL query results
join(tables: {metric: sensorMetrics, info: sensorInfo}, on: ["sensor_id"])
```
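As a follow-up sketch, you could filter the joined output by a column that only exists in PostgreSQL, such as `location` (the value used here is hypothetical and depends on your sensor records):
```js
join(tables: {metric: sensorMetrics, info: sensorInfo}, on: ["sensor_id"])
    |> filter(fn: (r) => r.location == "main_lobby")
```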
---
## Sample sensor data
The [sample data generator](#download-and-run-the-sample-data-generator) and
[sample sensor information](#import-the-sample-sensor-information) simulate a
group of sensors that measure temperature, humidity, and carbon monoxide
in rooms throughout a building.
Each collected data point is stored in InfluxDB with a `sensor_id` tag that identifies
the specific sensor it came from.
Sample sensor information is stored in PostgreSQL.
**Sample data includes:**
- Simulated data collected from each sensor and stored in the `airSensors` measurement in **InfluxDB**:
- temperature
- humidity
- co
- Information about each sensor stored in the `sensors` table in **PostgreSQL**:
- sensor_id
- location
- model_number
- last_inspected
### Import and generate sample sensor data
#### Download and run the sample data generator
`air-sensor-data.rb` is a script that generates air sensor data and stores the data in InfluxDB.
To use `air-sensor-data.rb`:
1. [Create a database](/enterprise_influxdb/v1.10/introduction/get-started/#creating-a-database) to store the data.
2. Download the sample data generator. _This tool requires [Ruby](https://www.ruby-lang.org/en/)._
<a class="btn download" style="color:#fff" href="/downloads/air-sensor-data.rb" download>Download Air Sensor Generator</a>
3. Give `air-sensor-data.rb` executable permissions:
```
chmod +x air-sensor-data.rb
```
4. Start the generator. Specify your database.
```
./air-sensor-data.rb -d database-name
```
The generator begins to write data to InfluxDB and will continue until stopped.
Use `ctrl-c` to stop the generator.
_**Note:** Use the `--help` flag to view other configuration options._
5. Query your target database to ensure the generated data is being written successfully.
   The generator doesn't catch errors from write requests, so it continues running
   even if data isn't being written to InfluxDB successfully.
```
from(bucket: "database-name/autogen")
|> range(start: -1m)
|> filter(fn: (r) => r._measurement == "airSensors")
```
#### Import the sample sensor information
1. [Download and install PostgreSQL](https://www.postgresql.org/download/).
2. Download the sample sensor information CSV.
<a class="btn download" style="color:#fff" href="/downloads/sample-sensor-info.csv" download>Download Sample Data</a>
3. Use a PostgreSQL client (`psql` or a GUI) to create the `sensors` table:
```
CREATE TABLE sensors (
sensor_id character varying(50),
location character varying(50),
model_number character varying(50),
last_inspected date
);
```
4. Import the downloaded CSV sample data.
_Update the `FROM` file path to the path of the downloaded CSV sample data._
```
COPY sensors(sensor_id,location,model_number,last_inspected)
FROM '/path/to/sample-sensor-info.csv' DELIMITER ',' CSV HEADER;
```
5. Query the table to ensure the data was imported correctly:
```
SELECT * FROM sensors;
```

View File

@ -0,0 +1,351 @@
---
title: Window and aggregate data with Flux
seotitle: Window and aggregate data in InfluxDB with Flux
list_title: Window & aggregate
description: >
This guide walks through windowing and aggregating data with Flux and outlines
how it shapes your data in the process.
menu:
enterprise_influxdb_1_10:
name: Window & aggregate
parent: Query with Flux
weight: 4
list_query_example: aggregate_window
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/window-aggregate/
v2: /influxdb/v2.0/query-data/flux/window-aggregate/
---
A common operation performed with time series data is grouping data into windows of time,
or "windowing" data, then aggregating windowed values into a new value.
This guide walks through windowing and aggregating data with Flux and demonstrates
how data is shaped in the process.
If you're just getting started with Flux queries, check out the following:
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
{{% note %}}
The following example is an in-depth walk-through of the steps required to window and aggregate data.
The [`aggregateWindow()` function](#summing-up) performs these operations for you, but understanding
how data is shaped in the process helps to successfully create your desired output.
{{% /note %}}
## Data set
For the purposes of this guide, define a variable that represents your base data set.
The following example queries the memory usage of the host machine.
```js
dataSet = from(bucket: "db/rp")
|> range(start: -5m)
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|> drop(columns: ["host"])
```
{{% note %}}
This example drops the `host` column from the returned data because memory data
is tracked for only a single host, and dropping it simplifies the output tables.
Dropping the `host` column is optional and not recommended if you're monitoring memory
on multiple hosts.
{{% /note %}}
`dataSet` can now be used to represent your base data, which will look similar to the following:
{{% truncate %}}
```
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:00.000000000Z 71.11611366271973
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:10.000000000Z 67.39630699157715
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:20.000000000Z 64.16666507720947
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:30.000000000Z 64.19951915740967
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:40.000000000Z 64.2122745513916
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:50.000000000Z 64.22209739685059
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 64.6336555480957
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:10.000000000Z 64.16516304016113
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:20.000000000Z 64.18349742889404
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:30.000000000Z 64.20474052429199
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:40.000000000Z 68.65062713623047
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:50.000000000Z 67.20139980316162
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 70.9143877029419
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:10.000000000Z 64.14549350738525
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:20.000000000Z 64.15379047393799
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:30.000000000Z 64.1592264175415
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:40.000000000Z 64.18190002441406
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:50.000000000Z 64.28837776184082
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 64.29731845855713
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:10.000000000Z 64.36963081359863
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:20.000000000Z 64.37397003173828
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:30.000000000Z 64.44413661956787
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:40.000000000Z 64.42906856536865
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:50.000000000Z 64.44573402404785
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.48912620544434
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:10.000000000Z 64.49522972106934
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:20.000000000Z 64.48652744293213
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:30.000000000Z 64.49949741363525
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:40.000000000Z 64.4949197769165
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:50.000000000Z 64.49787616729736
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
```
{{% /truncate %}}
## Windowing data
Use the [`window()` function](/{{< latest "flux" >}}/stdlib/universe/window)
to group your data based on time bounds.
The most common parameter passed with `window()` is `every`, which
defines the duration of time between windows.
Other parameters are available, but for this example, window the base data
set into one-minute windows.
```js
dataSet
|> window(every: 1m)
```
{{% note %}}
The `every` parameter supports all [valid duration units](/{{< latest "flux" >}}/language/types/#duration-types),
including **calendar months (`1mo`)** and **years (`1y`)**.
{{% /note %}}
Each window of time is output in its own table containing all records that fall within the window.
{{% truncate %}}
###### window() output tables
```
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:00.000000000Z 71.11611366271973
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:10.000000000Z 67.39630699157715
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:20.000000000Z 64.16666507720947
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:30.000000000Z 64.19951915740967
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:40.000000000Z 64.2122745513916
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:50.000000000Z 64.22209739685059
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 64.6336555480957
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:10.000000000Z 64.16516304016113
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:20.000000000Z 64.18349742889404
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:30.000000000Z 64.20474052429199
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:40.000000000Z 68.65062713623047
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:50.000000000Z 67.20139980316162
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 70.9143877029419
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:10.000000000Z 64.14549350738525
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:20.000000000Z 64.15379047393799
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:30.000000000Z 64.1592264175415
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:40.000000000Z 64.18190002441406
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:50.000000000Z 64.28837776184082
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 64.29731845855713
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:10.000000000Z 64.36963081359863
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:20.000000000Z 64.37397003173828
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:30.000000000Z 64.44413661956787
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:40.000000000Z 64.42906856536865
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:50.000000000Z 64.44573402404785
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.48912620544434
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:10.000000000Z 64.49522972106934
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:20.000000000Z 64.48652744293213
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:30.000000000Z 64.49949741363525
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:40.000000000Z 64.4949197769165
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:50.000000000Z 64.49787616729736
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
```
{{% /truncate %}}
When visualized in the InfluxDB UI, each window table is displayed in a different color.
![Windowed data](/img/flux/simple-windowed-data.png)
## Aggregate data
[Aggregate functions](/{{< latest "flux" >}}/stdlib/universe) take the values
of all rows in a table and use them to perform an aggregate operation.
The result is output as a new value in a single-row table.
Since windowed data is split into separate tables, aggregate operations run against
each table separately and output new tables containing only the aggregated value.
For this example, use the [`mean()` function](/{{< latest "flux" >}}/stdlib/universe/mean)
to output the average of each window:
```js
dataSet
|> window(every: 1m)
|> mean()
```
{{% truncate %}}
###### mean() output tables
```
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 65.88549613952637
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 65.50651391347249
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 65.30719598134358
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 64.39330975214641
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 64.49386278788249
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 64.49816226959229
```
{{% /truncate %}}
Because each data point is contained in its own table, the points appear as
single, unconnected points when visualized.
![Aggregated windowed data](/img/flux/simple-windowed-aggregate-data.png)
### Recreate the time column
**Notice the `_time` column is not in the [aggregated output tables](#mean-output-tables).**
Because records in each table are aggregated together, their timestamps no longer
apply and the column is removed from the group key and table.
Also notice the `_start` and `_stop` columns still exist.
These represent the lower and upper bounds of the time window.
Many Flux functions rely on the `_time` column.
To further process your data after an aggregate function, you need to re-add `_time`.
Use the [`duplicate()` function](/{{< latest "flux" >}}/stdlib/universe/duplicate) to
duplicate either the `_start` or `_stop` column as a new `_time` column.
```js
dataSet
|> window(every: 1m)
|> mean()
|> duplicate(column: "_stop", as: "_time")
```
{{% truncate %}}
###### duplicate() output tables
```
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 65.88549613952637
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 65.50651391347249
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 65.30719598134358
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.39330975214641
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49386278788249
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
```
{{% /truncate %}}
## "Unwindow" aggregate tables
Aggregate values split across separate tables generally aren't in the format you want for your data.
Use the `window()` function to "unwindow" your data into a single infinite (`inf`) window.
```js
dataSet
|> window(every: 1m)
|> mean()
|> duplicate(column: "_stop", as: "_time")
|> window(every: inf)
```
{{% note %}}
Windowing requires a `_time` column, which is why it's necessary to
[recreate the `_time` column](#recreate-the-time-column) after an aggregation.
{{% /note %}}
###### Unwindowed output table
```
Table: keys: [_start, _stop, _field, _measurement]
_start:time _stop:time _field:string _measurement:string _time:time _value:float
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 65.88549613952637
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 65.50651391347249
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 65.30719598134358
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.39330975214641
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49386278788249
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
```
With the aggregate values in a single table, data points in the visualization are connected.
![Unwindowed aggregate data](/img/flux/simple-unwindowed-data.png)
## Summing up
You have now created a Flux query that windows and aggregates data.
The data transformation process outlined in this guide should be used for all aggregation operations.
Flux also provides the [`aggregateWindow()` function](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow)
which performs all these separate functions for you.
The following Flux query will return the same results:
###### aggregateWindow function
```js
dataSet
|> aggregateWindow(every: 1m, fn: mean)
```
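For reference, the following sketch spells out the `aggregateWindow()` parameters that correspond to the manual steps above; the defaults shown here reflect the function's documented behavior and may differ slightly between Flux versions.
```js
dataSet
    |> aggregateWindow(
        every: 1m,
        fn: mean,
        column: "_value",  // column to aggregate
        timeSrc: "_stop",  // column duplicated into _time (the manual duplicate() step)
        timeDst: "_time",
        createEmpty: true, // include empty windows in the output
    )
```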

View File

@ -0,0 +1,33 @@
---
title: Enable Flux
description: Instructions for enabling Flux in your InfluxDB configuration.
menu:
enterprise_influxdb_1_10:
name: Enable Flux
parent: Flux
weight: 1
---
Flux is packaged with **InfluxDB v1.8+** and does not require any additional installation;
however, it is **disabled by default and needs to be enabled**.
## Enable Flux
Enable Flux by setting the `flux-enabled` option to `true` under the `[http]` section of your `influxdb.conf`:
###### influxdb.conf
```toml
# ...
[http]
# ...
flux-enabled = true
# ...
```
> The default location of your `influxdb.conf` depends on your operating system.
> More information is available in the [Configuring InfluxDB](/enterprise_influxdb/v1.10/administration/config/#using-the-configuration-file) guide.
When InfluxDB starts, the Flux daemon starts as well, and data can be queried using Flux.
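To confirm Flux is responding, one option is to send a simple Flux query to the `/api/v2/query` compatibility endpoint (this sketch assumes the default `localhost:8086` address and no authentication):
```bash
curl -XPOST http://localhost:8086/api/v2/query \
  -H 'Accept: application/csv' \
  -H 'Content-Type: application/vnd.flux' \
  -d 'buckets()'
```
A CSV list of `db/rp` buckets in the response indicates Flux is enabled and serving queries.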

View File

@ -0,0 +1,180 @@
---
title: Optimize Flux queries
description: >
Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
weight: 4
menu:
enterprise_influxdb_1_10:
parent: Flux
canonical: /influxdb/cloud/query-data/optimize-queries/
aliases:
- /enterprise_influxdb/v1.10/flux/guides/optimize-queries
---
Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
- [Start queries with pushdowns](#start-queries-with-pushdowns)
- [Avoid processing filters inline](#avoid-processing-filters-inline)
- [Avoid short window durations](#avoid-short-window-durations)
- [Use "heavy" functions sparingly](#use-heavy-functions-sparingly)
- [Use set() instead of map() when possible](#use-set-instead-of-map-when-possible)
- [Balance time range and data precision](#balance-time-range-and-data-precision)
- [Measure query performance with Flux profilers](#measure-query-performance-with-flux-profilers)
## Start queries with pushdowns
**Pushdowns** are functions or function combinations that push data operations to the underlying data source rather than operating on data in memory. Start queries with pushdowns to improve query performance. Once a non-pushdown function runs, Flux pulls data into memory and runs all subsequent operations there.
#### Pushdown functions and function combinations
The following pushdowns are supported in InfluxDB Enterprise 1.10+.
| Functions | Supported |
| :----------------------------- | :------------------: |
| **count()** | {{< icon "check" "v2" >}} |
| **drop()** | {{< icon "check" "v2" >}} |
| **duplicate()** | {{< icon "check" "v2" >}} |
| **filter()** {{% req " \*" %}} | {{< icon "check" "v2" >}} |
| **fill()** | {{< icon "check" "v2" >}} |
| **first()** | {{< icon "check" "v2" >}} |
| **group()** | {{< icon "check" "v2" >}} |
| **keep()** | {{< icon "check" "v2" >}} |
| **last()** | {{< icon "check" "v2" >}} |
| **max()** | {{< icon "check" "v2" >}} |
| **mean()** | {{< icon "check" "v2" >}} |
| **min()** | {{< icon "check" "v2" >}} |
| **range()** | {{< icon "check" "v2" >}} |
| **rename()** | {{< icon "check" "v2" >}} |
| **sum()** | {{< icon "check" "v2" >}} |
| **window()** | {{< icon "check" "v2" >}} |
| _Function combinations_ | |
| **window()** \|> **count()** | {{< icon "check" "v2" >}} |
| **window()** \|> **first()** | {{< icon "check" "v2" >}} |
| **window()** \|> **last()** | {{< icon "check" "v2" >}} |
| **window()** \|> **max()** | {{< icon "check" "v2" >}} |
| **window()** \|> **min()** | {{< icon "check" "v2" >}} |
| **window()** \|> **sum()** | {{< icon "check" "v2" >}} |
{{% caption %}}
{{< req "\*" >}} **filter()** only pushes down when all parameter values are static.
See [Avoid processing filters inline](#avoid-processing-filters-inline).
{{% /caption %}}
Use pushdown functions and function combinations at the beginning of your query.
After the first non-pushdown function runs, the rest of the query is processed in memory.
##### Pushdown functions in use
```js
from(bucket: "db/rp")
|> range(start: -1h) //
|> filter(fn: (r) => r.sensor == "abc123") //
|> group(columns: ["_field", "host"]) // Pushed to the data source
|> aggregateWindow(every: 5m, fn: max) //
|> filter(fn: (r) => r._value >= 90.0) //
|> top(n: 10) // Run in memory
```
### Avoid processing filters inline
Avoid using mathematical operations or string manipulation inline to define data filters.
Processing filter values inline prevents `filter()` from pushing its operation down
to the underlying data source, so data returned by the
previous function loads into memory.
This often results in a significant performance hit.
For example, the following query uses [Chronograf dashboard template variables](/{{< latest "chronograf" >}}/guides/dashboard-template-variables/)
and string concatenation to define a region to filter by.
Because `filter()` uses string concatenation inline, it can't push its operation
to the underlying data source and loads all data returned from `range()` into memory.
```js
from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => r.region == v.provider + v.region)
```
To dynamically set filters and maintain the pushdown ability of the `filter()` function,
use variables to define filter values outside of `filter()`:
```js
region = v.provider + v.region
from(bucket: "db/rp")
|> range(start: -1h)
|> filter(fn: (r) => r.region == region)
```
## Avoid short window durations
Windowing (grouping data based on time intervals) is commonly used to aggregate and downsample data.
Increase performance by avoiding short window durations.
More windows require more compute power to evaluate which window each row should be assigned to.
Reasonable window durations depend on the total time range queried.
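As a rough illustration (the bucket and measurement names below are placeholders), aggregating 30 days of data into one-minute windows produces about 43,200 windows per series, while one-hour windows produce only 720:
```js
// ~43,200 one-minute windows per series over 30 days
from(bucket: "db/rp")
    |> range(start: -30d)
    |> filter(fn: (r) => r._measurement == "example")
    |> aggregateWindow(every: 1m, fn: mean)

// 720 one-hour windows per series over the same range
from(bucket: "db/rp")
    |> range(start: -30d)
    |> filter(fn: (r) => r._measurement == "example")
    |> aggregateWindow(every: 1h, fn: mean)
```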
## Use "heavy" functions sparingly
The following functions use more memory or CPU than others.
Consider their necessity in your data processing before using them:
- [map()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/map/)
- [reduce()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/reduce/)
- [join()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/join/)
- [union()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/union/)
- [pivot()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/pivot/)
## Use set() instead of map() when possible
[`set()`](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/set/),
[`experimental.set()`](/influxdb/v2.0/reference/flux/stdlib/experimental/set/),
and [`map()`](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/map/)
can each set column values in data; however, the **set** functions have performance
advantages over `map()`.
Use the following guidelines to determine which to use:
- If setting a column value to a predefined, static value, use `set()` or `experimental.set()`.
- If dynamically setting a column value using **existing row data**, use `map()`.
#### Set a column value to a static value
The following queries are functionally the same, but using `set()` is more performant than using `map()`.
```js
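// Not recommended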
data
|> map(fn: (r) => ({ r with foo: "bar" }))
// Recommended
data
|> set(key: "foo", value: "bar")
```
#### Dynamically set a column value using existing row data
```js
data
|> map(fn: (r) => ({ r with foo: r.bar }))
```
## Balance time range and data precision
To ensure queries are performant, balance the time range and the precision of your data.
For example, if you query data stored every second and request six months' worth of data,
results would include ≈15.5 million points per series.
Depending on the number of series returned after `filter()`([cardinality](/enterprise_influxdb/v1.10/concepts/glossary/#series-cardinality)),
this can quickly become many billions of points.
Flux must store these points in memory to generate a response.
Use [pushdowns](#pushdown-functions-and-function-combinations) to optimize how
many points are stored in memory.
## Measure query performance with Flux profilers
Use the [Flux Profiler package](/influxdb/v2.0/reference/flux/stdlib/profiler/)
to measure query performance and append performance metrics to your query output.
The following Flux profilers are available:
- **query**: provides statistics about the execution of an entire Flux script.
- **operator**: provides statistics about each operation in a query.
Import the `profiler` package and enable profilers with the `profiler.enabledProfilers` option.
```js
import "profiler"
option profiler.enabledProfilers = ["query", "operator"]
// Query to profile
```
For more information about Flux profilers, see the [Flux Profiler package](/influxdb/v2.0/reference/flux/stdlib/profiler/).

View File

@ -0,0 +1,12 @@
---
title: InfluxDB Enterprise guides
description: Step-by-step guides for using InfluxDB Enterprise.
aliases:
- /enterprise/v1.8/guides/
menu:
enterprise_influxdb_1_10:
name: Guides
weight: 60
---
{{< children hlevel="h2" >}}

View File

@ -0,0 +1,191 @@
---
title: Authenticate requests to InfluxDB Enterprise
description: >
  Authenticate requests to InfluxDB Enterprise using basic authentication, query parameters
  in the URL or request body, the `influx` CLI, or JWT tokens.
menu:
enterprise_influxdb_1_10:
weight: 25
parent: Guides
name: Authenticate requests
---
_To require valid credentials for cluster access, see ["Enable authentication"](/enterprise_influxdb/v1.10/administration/configure/security/authentication/)._
## Authenticate requests
### Authenticate with the InfluxDB API
Authenticate with the [InfluxDB API](/enterprise_influxdb/v1.10/tools/api/) using one of the following options:
- [Authenticate with basic authentication](#authenticate-with-basic-authentication)
- [Authenticate with query parameters in the URL or request body](#authenticate-with-query-parameters-in-the-url-or-request-body)
If you authenticate with both basic authentication **and** the URL query parameters,
the user credentials specified in the query parameters take precedence.
The following examples demonstrate queries with [admin user](#admin-users) permissions.
To learn about different user types, permissions, and how to manage users, see [authorization](#authorization).
{{% note %}}
InfluxDB Enterprise redacts passwords in log output when you enable authentication.
{{% /note %}}
#### Authenticate with basic authentication
```bash
curl -G http://localhost:8086/query \
-u todd:password4todd \
--data-urlencode "q=SHOW DATABASES"
```
#### Authenticate with query parameters in the URL or request body
Set `u` as the username and `p` as the password.
##### Credentials as query parameters
```bash
curl -G "http://localhost:8086/query?u=todd&p=password4todd" \
--data-urlencode "q=SHOW DATABASES"
```
##### Credentials in the request body
```bash
curl -G http://localhost:8086/query \
--data-urlencode "u=todd" \
--data-urlencode "p=password4todd" \
--data-urlencode "q=SHOW DATABASES"
```
### Authenticate with the CLI
There are three options for authenticating with the [CLI](/enterprise_influxdb/v1.10/tools/influx-cli/):
- [Authenticate with environment variables](#authenticate-with-environment-variables)
- [Authenticate with CLI flags](#authenticate-with-cli-flags)
- [Authenticate with credentials in the influx shell](#authenticate-with-credentials-in-the-influx-shell)
#### Authenticate with environment variables
Use the `INFLUX_USERNAME` and `INFLUX_PASSWORD` environment variables to provide
authentication credentials to the `influx` CLI.
```bash
export INFLUX_USERNAME=todd
export INFLUX_PASSWORD=password4todd
echo $INFLUX_USERNAME $INFLUX_PASSWORD
todd password4todd
influx
Connected to http://localhost:8086 version {{< latest-patch >}}
InfluxDB shell {{< latest-patch >}}
```
#### Authenticate with CLI flags
Use the `-username` and `-password` flags to provide authentication credentials
to the `influx` CLI.
```bash
influx -username todd -password password4todd
Connected to http://localhost:8086 version {{< latest-patch >}}
InfluxDB shell {{< latest-patch >}}
```
#### Authenticate with credentials in the influx shell
Start the `influx` shell and run the `auth` command.
Enter your username and password when prompted.
```bash
$ influx
Connected to http://localhost:8086 version {{< latest-patch >}}
InfluxDB shell {{< latest-patch >}}
> auth
username: todd
password:
>
```
### Authenticate using JWT tokens
For a more secure alternative to using passwords, include JWT tokens with requests to the InfluxDB API.
This is currently only possible through the [InfluxDB HTTP API](/enterprise_influxdb/v1.10/tools/api/).
1. **Add a shared secret in your InfluxDB Enterprise configuration file**.
InfluxDB Enterprise uses the shared secret to encode the JWT signature.
By default, `shared-secret` is set to an empty string, in which case no JWT authentication takes place.
Add a custom shared secret in your [InfluxDB configuration file](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#shared-secret).
The longer the secret string, the more secure it is:
```toml
[http]
shared-secret = "my super secret pass phrase"
```
Alternatively, to avoid keeping your secret phrase as plain text in your InfluxDB configuration file,
set the value with the `INFLUXDB_HTTP_SHARED_SECRET` environment variable.
2. **Generate your JWT token**.
Use an authentication service to generate a secure token
using your InfluxDB username, an expiration time, and your shared secret.
There are online tools, such as [https://jwt.io/](https://jwt.io/), that will do this for you.
The payload (or claims) of the token must be in the following format:
```json
{
"username": "myUserName",
"exp": 1516239022
}
```
- **username** - The name of your InfluxDB user.
- **exp** - The expiration time of the token in UNIX epoch time.
For increased security, keep token expiration periods short.
For testing, you can manually generate UNIX timestamps using [https://www.unixtimestamp.com/index.php](https://www.unixtimestamp.com/index.php).
Encode the payload using your shared secret.
You can do this with either a JWT library in your own authentication server or by hand at [https://jwt.io/](https://jwt.io/).
The generated token follows this format: `<header>.<payload>.<signature>`
3. **Include the token in HTTP requests**.
Include your generated token as part of the `Authorization` header in HTTP requests:
```
Authorization: Bearer <myToken>
```
{{% note %}}
Only unexpired tokens will successfully authenticate.
Be sure your token has not expired.
{{% /note %}}
#### Example query request with JWT authentication
```bash
curl -G "http://localhost:8086/query?db=demodb" \
--data-urlencode "q=SHOW DATABASES" \
--header "Authorization: Bearer <header>.<payload>.<signature>"
```
## Authenticate Telegraf requests to InfluxDB
Authenticating [Telegraf](/{{< latest "telegraf" >}}/) requests to an InfluxDB instance with
authentication enabled requires some additional steps.
In the Telegraf configuration file (`/etc/telegraf/telegraf.conf`), uncomment
and edit the `username` and `password` settings.
```toml
###############################################################################
# OUTPUT PLUGINS #
###############################################################################
# ...
[[outputs.influxdb]]
# ...
username = "example-username" # Provide your username
password = "example-password" # Provide your password
# ...
```
Restart Telegraf and you're all set!
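For example, on systemd-based Linux systems, you can typically restart the service with:
```bash
sudo systemctl restart telegraf
```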

View File

@ -0,0 +1,274 @@
---
title: Calculate percentages in a query
description: >
Calculate percentages using basic math operators available in InfluxQL or Flux.
This guide walks through use-cases and examples of calculating percentages from two values in a single query.
menu:
enterprise_influxdb_1_10:
weight: 50
parent: Guides
name: Calculate percentages
aliases:
- /enterprise_influxdb/v1.10/guides/calculating_percentages/
v2: /influxdb/v2.0/query-data/flux/calculate-percentages/
---
Use Flux or InfluxQL to calculate percentages in a query.
{{< tabs-wrapper >}}
{{% tabs %}}
[Flux](#)
[InfluxQL](#)
{{% /tabs %}}
{{% tab-content %}}
[Flux](/flux/latest/) lets you perform simple math operations, such as calculating a percentage.
## Calculate a percentage
Learn how to calculate a percentage using the following examples:
- [Basic calculations within a query](#basic-calculations-within-a-query)
- [Calculate a percentage from two fields](#calculate-a-percentage-from-two-fields)
- [Calculate a percentage using aggregate functions](#calculate-a-percentage-using-aggregate-functions)
- [Calculate the percentage of total weight per apple variety](#calculate-the-percentage-of-total-weight-per-apple-variety)
- [Calculate the average percentage of total weight per variety each hour](#calculate-the-average-percentage-of-total-weight-per-variety-each-hour)
## Basic calculations within a query
When performing any math operation in a Flux query, you must complete the following steps:
1. Specify the [bucket](/{{< latest "influxdb" "v2" >}}/query-data/get-started/#buckets) to query from and the time range to query.
2. Filter your data by measurements, fields, and other applicable criteria.
3. Align values in one row (required to perform math in Flux) by using one of the following functions:
- To query **from multiple** data sources, use the [`join()` function](/{{< latest "flux" >}}/stdlib/universe/join/).
- To query **from the same** data source, use the [`pivot()` function](/{{< latest "flux" >}}/stdlib/universe/pivot/).
For examples using the `join()` function to calculate percentages and more examples of calculating percentages, see [Calculate percentages with Flux](/{{< latest "influxdb" "v2" >}}/query-data/flux/calculate-percentages/).
#### Data variable
To shorten examples, we'll store a basic Flux query in a `data` variable for reuse.
Here's how that looks in Flux:
```js
// Query data from the past 15 minutes and pivot fields into columns so each row
// contains values for each field
data = from(bucket: "your_db/your_retention_policy")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "measurement_name" and r._field =~ /field[1-2]/)
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
```
Each row now contains the values necessary to perform a math operation. For example, to add two field keys, start with the `data` variable created above, and then use `map()` to re-map values in each row.
```js
data
|> map(fn: (r) => ({ r with _value: r.field1 + r.field2}))
```
> **Note:** Flux supports basic math operators such as `+`,`-`,`/`, `*`, and `()`. For example, to subtract `field2` from `field1`, change `+` to `-`.
## Calculate a percentage from two fields
Use the `data` variable created above, and then use the [`map()` function](/{{< latest "flux" >}}/stdlib/universe/map/) to divide one field by another, multiply by 100, and add a new `percent` field to store the percentage values in.
```js
data
    |> map(
        fn: (r) => ({
            _time: r._time,
            _measurement: r._measurement,
            _field: "percent",
            _value: r.field1 / r.field2 * 100.0,
        }),
    )
```
> **Note:** In this example, `field1` and `field2` are float values, so they're multiplied by `100.0`. For integer values, multiply by 100 or use the `float()` function to cast integers to floats.
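For example, a minimal sketch of casting hypothetical integer fields (the field names are placeholders) before dividing:
```js
data
    |> map(fn: (r) => ({r with _value: float(v: r.int_field1) / float(v: r.int_field2) * 100.0}))
```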
## Calculate a percentage using aggregate functions
Use [`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow) to window data by time and perform an aggregate function on each window.
```js
from(bucket: "<database>/<retention_policy>")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "measurement_name" and r._field =~ /fieldkey[1-2]/)
|> aggregateWindow(every: 1m, fn: sum)
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({r with _value: r.field1 / r.field2 * 100.0}))
```
## Calculate the percentage of total weight per apple variety
Use simulated apple stand data to track the weight of apples (by type) throughout a day.
1. [Download the sample data](https://gist.githubusercontent.com/sanderson/8f8aec94a60b2c31a61f44a37737bfea/raw/c29b239547fa2b8ee1690f7d456d31f5bd461386/apple_stand.txt)
2. Import the sample data:
```bash
influx -import -path=path/to/apple_stand.txt -precision=ns -database=apple_stand
```
Use the following query to calculate the percentage of the total weight each variety
accounts for at each given point in time.
```js
from(bucket: "apple_stand/autogen")
|> range(start: 2018-06-18T12:00:00Z, stop: 2018-06-19T04:35:00Z)
|> filter(fn: (r) => r._measurement == "variety")
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(
fn: (r) => ({r with
granny_smith: r.granny_smith / r.total_weight * 100.0,
golden_delicious: r.golden_delicious / r.total_weight * 100.0,
fuji: r.fuji / r.total_weight * 100.0,
gala: r.gala / r.total_weight * 100.0,
braeburn: r.braeburn / r.total_weight * 100.0,
}),
)
```
## Calculate the average percentage of total weight per variety each hour
With the apple stand data from the prior example, use the following query to calculate the average percentage of the total weight each variety accounts for per hour.
```js
from(bucket: "apple_stand/autogen")
|> range(start: 2018-06-18T00:00:00Z, stop: 2018-06-19T16:35:00Z)
|> filter(fn: (r) => r._measurement == "variety")
|> aggregateWindow(every: 1h, fn: mean)
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(
fn: (r) => ({r with
granny_smith: r.granny_smith / r.total_weight * 100.0,
golden_delicious: r.golden_delicious / r.total_weight * 100.0,
fuji: r.fuji / r.total_weight * 100.0,
gala: r.gala / r.total_weight * 100.0,
braeburn: r.braeburn / r.total_weight * 100.0,
}),
)
```
{{% /tab-content %}}
{{% tab-content %}}
[InfluxQL](/enterprise_influxdb/v1.10/query_language/) lets you perform simple math equations
which makes calculating percentages using two fields in a measurement pretty simple.
However, there are some caveats of which you need to be aware.
## Basic calculations within a query
`SELECT` statements support the use of basic math operators such as `+`,`-`,`/`, `*`, `()`, etc.
```sql
-- Add two field keys
SELECT field_key1 + field_key2 AS "field_key_sum" FROM "measurement_name" WHERE time < now() - 15m
-- Subtract one field from another
SELECT field_key1 - field_key2 AS "field_key_difference" FROM "measurement_name" WHERE time < now() - 15m
-- Grouping and chaining mathematical calculations
SELECT (field_key1 + field_key2) - (field_key3 + field_key4) AS "some_calculation" FROM "measurement_name" WHERE time < now() - 15m
```
## Calculating a percentage in a query
Using basic math functions, you can calculate a percentage by dividing one field value
by another and multiplying the result by 100:
```sql
SELECT (field_key1 / field_key2) * 100 AS "calculated_percentage" FROM "measurement_name" WHERE time < now() - 15m
```
## Calculating a percentage using aggregate functions
If using aggregate functions in your percentage calculation, all data must be referenced
using aggregate functions.
_**You can't mix aggregate and non-aggregate data.**_
All aggregate functions need a `GROUP BY time()` clause defining the time intervals
in which data points are grouped and aggregated.
```sql
SELECT (sum(field_key1) / sum(field_key2)) * 100 AS "calculated_percentage" FROM "measurement_name" WHERE time < now() - 15m GROUP BY time(1m)
```
## Examples
#### Sample data
The following example uses simulated Apple Stand data that tracks the weight of
baskets containing different varieties of apples throughout a day of business.
1. [Download the sample data](https://gist.githubusercontent.com/sanderson/8f8aec94a60b2c31a61f44a37737bfea/raw/c29b239547fa2b8ee1690f7d456d31f5bd461386/apple_stand.txt)
2. Import the sample data:
```bash
influx -import -path=path/to/apple_stand.txt -precision=ns -database=apple_stand
```
### Calculating percentage of total weight per apple variety
The following query calculates the percentage of the total weight each variety
accounts for at each given point in time.
```sql
SELECT
("braeburn"/total_weight)*100,
("granny_smith"/total_weight)*100,
("golden_delicious"/total_weight)*100,
("fuji"/total_weight)*100,
("gala"/total_weight)*100
FROM "apple_stand"."autogen"."variety"
```
<div class='view-in-chronograf' data-query-override='SELECT
("braeburn"/total_weight)*100,
("granny_smith"/total_weight)*100,
("golden_delicious"/total_weight)*100,
("fuji"/total_weight)*100,
("gala"/total_weight)*100
FROM "apple_stand"."autogen"."variety"'>
\*</div>
If visualized as a [stacked graph](/chronograf/v1.8/guides/visualization-types/#stacked-graph)
in Chronograf, it would look like:
![Percentage of total per apple variety](/img/influxdb/1-5-calc-percentage-apple-variety.png)
### Calculating aggregate percentage per variety
The following query calculates the average percentage of the total weight each variety
accounts for per hour.
```sql
SELECT
(mean("braeburn")/mean(total_weight))*100,
(mean("granny_smith")/mean(total_weight))*100,
(mean("golden_delicious")/mean(total_weight))*100,
(mean("fuji")/mean(total_weight))*100,
(mean("gala")/mean(total_weight))*100
FROM "apple_stand"."autogen"."variety"
WHERE time >= '2018-06-18T12:00:00Z' AND time <= '2018-06-19T04:35:00Z'
GROUP BY time(1h)
```
<div class='view-in-chronograf' data-query-override='SELECT%0A%20%20%20%20%28mean%28"braeburn"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"granny_smith"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"golden_delicious"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"fuji"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"gala"%29%2Fmean%28total_weight%29%29%2A100%0AFROM%20"apple_stand"."autogen"."variety"%0AWHERE%20time%20>%3D%20%272018-06-18T12%3A00%3A00Z%27%20AND%20time%20<%3D%20%272018-06-19T04%3A35%3A00Z%27%0AGROUP%20BY%20time%281h%29'></div>
_**Note the following about this query:**_
- It uses aggregate functions (`mean()`) for pulling all data.
- It includes a `GROUP BY time()` clause, which aggregates data into 1-hour blocks.
- It includes an explicitly limited time window. Without one, aggregate functions
  operate over the full time range of the data, which is very resource-intensive.
If visualized as a [stacked graph](/chronograf/v1.8/guides/visualization-types/#stacked-graph)
in Chronograf, it would look like:
![Hourly average percentage of total per apple variety](/img/influxdb/1-5-calc-percentage-hourly-apple-variety.png)
{{% /tab-content %}}
{{< /tabs-wrapper >}}

View File

@ -0,0 +1,224 @@
---
title: Downsample and retain data
description: Downsample data to keep high precision while preserving storage.
menu:
enterprise_influxdb_1_10:
weight: 30
parent: Guides
aliases:
- /enterprise_influxdb/v1.10/guides/downsampling_and_retention/
v2: /influxdb/v2.0/process-data/common-tasks/downsample-data/
---
InfluxDB can handle hundreds of thousands of data points per second. Working with that much data over a long period of time can create storage concerns.
A natural solution is to downsample the data; keep the high precision raw data for only a limited time, and store the lower precision, summarized data longer.
This guide describes how to automate the process of downsampling data and expiring old data using InfluxQL. To downsample and retain data using Flux and InfluxDB 2.0,
see [Process data with InfluxDB tasks](/influxdb/v2.0/process-data/).
### Definitions
- **Continuous query** (CQ) is an InfluxQL query that runs automatically and periodically within a database.
CQs require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
- **Retention policy** (RP) is the part of the InfluxDB data structure that describes how long InfluxDB keeps data.
InfluxDB compares your local server's timestamp to the timestamps on your data and deletes data older than the RP's `DURATION`.
A single database can have several RPs, and RPs are unique per database.
This guide doesn't go into detail about the syntax for creating and managing CQs and RPs or tasks.
If you're new to these concepts, we recommend reviewing the following:
- [CQ documentation](/enterprise_influxdb/v1.10/query_language/continuous_queries/) and
- [RP documentation](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
### Sample data
This section uses fictional real-time data to track the number of food orders
to a restaurant via phone and via website at ten second intervals.
We store this data in a [database](/enterprise_influxdb/v1.10/concepts/glossary/#database) called `food_data`, in
the [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement) `orders`, and
in the [fields](/enterprise_influxdb/v1.10/concepts/glossary/#field) `phone` and `website`.
Sample:
```bash
name: orders
------------
time phone website
2016-05-10T23:18:00Z 10 30
2016-05-10T23:18:10Z 12 39
2016-05-10T23:18:20Z 11 56
```
### Goal
Assume that, in the long run, we're only interested in the average number of orders by phone
and by website at 30 minute intervals.
In the next steps, we use RPs and CQs to:
* Automatically aggregate the ten-second resolution data to 30-minute resolution data
* Automatically delete the raw, ten-second resolution data that are older than two hours
* Automatically delete the 30-minute resolution data that are older than 52 weeks
### Database preparation
We perform the following steps before writing the data to the database
`food_data`.
We do this **before** inserting any data because CQs only run against recent
data; that is, data with timestamps that are no older than `now()` minus
the `FOR` clause of the CQ, or `now()` minus the `GROUP BY time()` interval if
the CQ has no `FOR` clause.
#### 1. Create the database
```sql
> CREATE DATABASE "food_data"
```
#### 2. Create a two-hour `DEFAULT` retention policy
InfluxDB writes to the `DEFAULT` retention policy if we do not supply an explicit RP when
writing a point to the database.
We make the `DEFAULT` RP keep data for two hours, because we want InfluxDB to
automatically write the incoming ten-second resolution data to that RP.
Use the
[`CREATE RETENTION POLICY`](/enterprise_influxdb/v1.10/query_language/manage-database/#create-retention-policies-with-create-retention-policy)
statement to create a `DEFAULT` RP:
```sql
> CREATE RETENTION POLICY "two_hours" ON "food_data" DURATION 2h REPLICATION 1 DEFAULT
```
That query creates an RP called `two_hours` that exists in the database
`food_data`.
`two_hours` keeps data for a `DURATION` of two hours (`2h`) and it's the `DEFAULT`
RP for the database `food_data`.
{{% warn %}}
The replication factor (`REPLICATION 1`) is a required parameter but must always
be set to 1 for single node instances.
{{% /warn %}}
> **Note:** When we created the `food_data` database in step 1, InfluxDB
automatically generated an RP named `autogen` and set it as the `DEFAULT`
RP for the database.
The `autogen` RP has an infinite retention period.
With the query above, the RP `two_hours` replaces `autogen` as the `DEFAULT` RP
for the `food_data` database.
#### 3. Create a 52-week retention policy
Next we want to create another retention policy that keeps data for 52 weeks and is not the
`DEFAULT` retention policy (RP) for the database.
Ultimately, the 30-minute rollup data will be stored in this RP.
Use the
[`CREATE RETENTION POLICY`](/enterprise_influxdb/v1.10/query_language/manage-database/#create-retention-policies-with-create-retention-policy)
statement to create a non-`DEFAULT` retention policy:
```sql
> CREATE RETENTION POLICY "a_year" ON "food_data" DURATION 52w REPLICATION 1
```
That query creates a retention policy (RP) called `a_year` that exists in the database
`food_data`.
The `a_year` setting keeps data for a `DURATION` of 52 weeks (`52w`).
Leaving out the `DEFAULT` argument ensures that `a_year` is not the `DEFAULT`
RP for the database `food_data`.
That is, write and read operations against `food_data` that do not specify an
RP will still go to the `two_hours` RP (the `DEFAULT` RP).
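Before moving on, you can optionally confirm that both policies exist; this is a quick verification sketch rather than part of the original steps:

```sql
> SHOW RETENTION POLICIES ON "food_data"
```

The output should include `two_hours` (marked as the default) and `a_year`, alongside the auto-generated `autogen` policy.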
#### 4. Create the continuous query
Now that we've set up our RPs, we want to create a continuous query (CQ) that will automatically
and periodically downsample the ten-second resolution data to the 30-minute
resolution, and then store those results in a different measurement with a different
retention policy.
Use the
[`CREATE CONTINUOUS QUERY`](/enterprise_influxdb/v1.10/query_language/continuous_queries/)
statement to generate a CQ:
```sql
> CREATE CONTINUOUS QUERY "cq_30m" ON "food_data" BEGIN
SELECT mean("website") AS "mean_website",mean("phone") AS "mean_phone"
INTO "a_year"."downsampled_orders"
FROM "orders"
GROUP BY time(30m)
END
```
That query creates a CQ called `cq_30m` in the database `food_data`.
`cq_30m` tells InfluxDB to calculate the 30-minute average of the two fields
`website` and `phone` in the measurement `orders` and in the `DEFAULT` RP
`two_hours`.
It also tells InfluxDB to write those results to the measurement
`downsampled_orders` in the retention policy `a_year` with the field keys
`mean_website` and `mean_phone`.
InfluxDB will run this query every 30 minutes for the previous 30 minutes.
> **Note:** Notice that we fully qualify (that is, we use the syntax
`"<retention_policy>"."<measurement>"`) the measurement in the `INTO`
clause.
InfluxDB requires that syntax to write data to an RP other than the `DEFAULT`
RP.
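As an optional sanity check, you can confirm that the CQ was registered; `SHOW CONTINUOUS QUERIES` lists CQs grouped by database, so `cq_30m` should appear under `food_data`:

```sql
> SHOW CONTINUOUS QUERIES
```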
### Results
With the new CQ and two new RPs, `food_data` is ready to start receiving data.
After writing data to our database and letting things run for a bit, we see
two measurements: `orders` and `downsampled_orders`.
```sql
> SELECT * FROM "orders" LIMIT 5
name: orders
---------
time phone website
2016-05-13T23:00:00Z 10 30
2016-05-13T23:00:10Z 12 39
2016-05-13T23:00:20Z 11 56
2016-05-13T23:00:30Z 8 34
2016-05-13T23:00:40Z 17 32
> SELECT * FROM "a_year"."downsampled_orders" LIMIT 5
name: downsampled_orders
---------------------
time mean_phone mean_website
2016-05-13T15:00:00Z 12 23
2016-05-13T15:30:00Z 13 32
2016-05-13T16:00:00Z 19 21
2016-05-13T16:30:00Z 3 26
2016-05-13T17:00:00Z 4 23
```
The data in `orders` are the raw, ten-second resolution data that reside in the
two-hour RP.
The data in `downsampled_orders` are the aggregated, 30-minute resolution data
that are subject to the 52-week RP.
Notice that the first timestamps in `downsampled_orders` are older than the first
timestamps in `orders`.
This is because InfluxDB has already deleted data from `orders` with timestamps
that are older than our local server's timestamp minus two hours (assume we
executed the `SELECT` queries at `2016-05-14T00:59:59Z`).
InfluxDB will only start dropping data from `downsampled_orders` after 52 weeks.
> **Notes:**
>
* Notice that we fully qualify (that is, we use the syntax
`"<retention_policy>"."<measurement>"`) `downsampled_orders` in
the second `SELECT` statement. We must specify the RP in that query to `SELECT`
data that reside in an RP other than the `DEFAULT` RP.
>
* By default, InfluxDB checks to enforce an RP every 30 minutes.
Between checks, `orders` may have data that are older than two hours.
The rate at which InfluxDB checks to enforce an RP is a configurable setting,
see
[Database Configuration](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#check-interval).
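For reference, a minimal sketch of where that setting lives in the data node configuration file (the values shown are the defaults; your path may differ):

```toml
# /etc/influxdb/influxdb.conf -- retention policy enforcement
[retention]
  # Remove data outside of retention policies
  enabled = true
  # How often InfluxDB checks for and removes expired data
  check-interval = "30m"
```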
Using a combination of RPs and CQs, we've successfully set up our database to
automatically keep the high precision raw data for a limited time, create lower
precision data, and store that lower precision data for a longer period of time.
Now that you have a general understanding of how these features can work
together, check out the detailed documentation on [CQs](/enterprise_influxdb/v1.10/query_language/continuous_queries/) and [RPs](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management)
to see all that they can do for you.
@ -0,0 +1,343 @@
---
title: Hardware sizing guidelines
description: >
Review configuration and hardware guidelines for InfluxDB OSS (open source) and InfluxDB Enterprise.
menu:
enterprise_influxdb_1_10:
weight: 40
parent: Guides
---
Review configuration and hardware guidelines for InfluxDB Enterprise:
* [Enterprise overview](#enterprise-overview)
* [Query guidelines](#query-guidelines)
* [InfluxDB OSS guidelines](#influxdb-oss-guidelines)
* [InfluxDB Enterprise cluster guidelines](#influxdb-enterprise-cluster-guidelines)
* [When do I need more RAM?](#when-do-i-need-more-ram)
* [Recommended cluster configurations](#recommended-cluster-configurations)
* [Storage: type, amount, and configuration](#storage-type-amount-and-configuration)
For InfluxDB OSS instances, see [OSS hardware sizing guidelines](https://docs.influxdata.com/influxdb/v1.8/guides/hardware_sizing/).
> **Disclaimer:** Your numbers may vary from recommended guidelines. Guidelines provide estimated benchmarks for implementing the most performant system for your business.
## Enterprise overview
InfluxDB Enterprise supports the following:
- more than 750,000 field writes per second
- more than 100 moderate queries per second (see [Query guidelines](#query-guidelines))
- more than 10,000,000 [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality)
InfluxDB Enterprise distributes multiple copies of your data across a cluster,
providing high availability and redundancy, so an unavailable node doesn't significantly impact the cluster.
Please [contact us](https://www.influxdata.com/contact-sales/) for assistance tuning your system.
If you want a single-node instance of InfluxDB that's fully open source, handles fewer writes, queries, and unique series than listed above, and does **not require** redundancy, we recommend InfluxDB OSS.
> **Note:** Without the redundancy of a cluster, writes and queries fail immediately when a server is unavailable.
## Query guidelines
> Query complexity varies widely in its impact on the system. Recommendations for both single nodes and clusters are based on **moderate** query loads.
For **simple** or **complex** queries, we recommend testing and adjusting the suggested requirements as needed. Query complexity is defined by the following criteria:
| Query complexity | Criteria |
|:------------------|:---------------------------------------------------------------------------------------|
| Simple | Have few or no functions and no regular expressions |
| | Are bounded in time to a few minutes, hours, or 24 hours at most |
| | Typically execute in a few milliseconds to a few dozen milliseconds |
| Moderate | Have multiple functions and one or two regular expressions |
| | May also have `GROUP BY` clauses or sample a time range of multiple weeks |
| | Typically execute in a few hundred or a few thousand milliseconds |
| Complex | Have multiple aggregation or transformation functions or multiple regular expressions |
| | May sample a very large time range of months or years |
| | Typically take multiple seconds to execute |
## InfluxDB Enterprise cluster guidelines
### Meta nodes
> Set up clusters with an odd number of meta nodes; an even number may cause issues in certain configurations.
A cluster must have a **minimum of three** independent meta nodes for data redundancy and availability. A cluster with `2n + 1` meta nodes can tolerate the loss of `n` meta nodes.
Meta nodes do not need very much computing power. Regardless of the cluster load, we recommend the following guidelines for the meta nodes:
* vCPU or CPU: 1-2 cores
* RAM: 512 MB - 1 GB
* IOPS: 50
### Data nodes
A cluster with one data node is valid but has no data redundancy. Redundancy is set by the [replication factor](/influxdb/v1.8/concepts/glossary/#replication-factor) on the retention policy the data is written to. Where `n` is the replication factor, a cluster can lose `n - 1` data nodes and return complete query results.
>**Note:** For optimal data distribution within the cluster, use an even number of data nodes.
Guidelines vary by writes per second per node, moderate queries per second per node, and the number of unique series per node.
#### Guidelines per node
| vCPU or CPU | RAM | IOPS | Writes per second | Queries* per second | Unique series |
| ----------: | -------: | ----: | ----------------: | ------------------: | ------------: |
| 2 cores | 4-8 GB | 1000 | < 5,000 | < 5 | < 100,000 |
| 4-6 cores | 16-32 GB | 1000+ | < 100,000 | < 25 | < 1,000,000 |
| 8+ cores | 32+ GB | 1000+ | > 100,000 | > 25 | > 1,000,000 |
\* Guidelines are provided for moderate queries. Queries vary widely in their impact on the system. For simple or complex queries, we recommend testing and adjusting the suggested requirements as needed. See [query guidelines](#query-guidelines) for detail.
## When do I need more RAM?
In general, more RAM helps queries return faster. Your RAM requirements are primarily determined by [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality). Higher cardinality requires more RAM. Regardless of RAM, a series cardinality of 10 million or more can cause OOM (out of memory) failures. You can usually resolve OOM issues by redesigning your [schema](/influxdb/v1.8/concepts/glossary/#schema).
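If you are unsure where a deployment stands, InfluxQL can report an estimate; the statement below is a quick check sketch (the database name is a placeholder):

```sql
-- Estimated number of unique series in a database
SHOW SERIES CARDINALITY ON "mydb"
```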
## Guidelines per cluster
InfluxDB Enterprise guidelines vary by writes and queries per second, series cardinality, replication factor, and infrastructure (AWS EC2 R4 instances or equivalent):
- R4.xlarge (4 cores)
- R4.2xlarge (8 cores)
- R4.4xlarge (16 cores)
- R4.8xlarge (32 cores)
> Guidelines stem from a DevOps monitoring use case: maintaining a group of computers and monitoring server metrics (such as CPU, kernel, memory, disk space, disk I/O, network, and so on).
### Recommended cluster configurations
Cluster configuration guidelines are organized by:
- Series cardinality in your data set: 10,000, 100,000, 1,000,000, or 10,000,000
- Number of data nodes
- Number of server cores
For each cluster configuration, you'll find guidelines for the following:
- **maximum writes per second only** (no dashboard queries are running)
- **maximum queries per second only** (no data is being written)
- **maximum simultaneous queries and writes per second, combined**
#### Review cluster configuration tables
1. Select the series cardinality tab below, and then click to expand a replication factor.
2. In the **Nodes x Core** column, find the number of data nodes and server cores in your configuration, and then review the recommended **maximum** guidelines.
{{< tabs-wrapper >}}
{{% tabs %}}
[10,000 series](#)
[100,000 series](#)
[1,000,000 series](#)
[10,000,000 series](#)
{{% /tabs %}}
{{% tab-content %}}
Select one of the following replication factors to see the recommended cluster configuration for 10,000 series:
{{% expand "Replication factor, 1" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 1 x 4 | 188,000 | 5 | 4 + 99,000 |
| 1 x 8 | 405,000 | 9 | 8 + 207,000 |
| 1 x 16 | 673,000 | 15 | 14 + 375,000 |
| 1 x 32 | 1,056,000 | 24 | 22 + 650,000 |
| 2 x 4 | 384,000 | 14 | 14 + 184,000 |
| 2 x 8 | 746,000 | 22 | 22 + 334,000 |
| 2 x 16 | 1,511,000 | 56 | 40 + 878,000 |
| 2 x 32 | 2,426,000 | 96 | 68 + 1,746,000 |
{{% /expand %}}
{{% expand "Replication factor, 2" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 296,000 | 16 | 16 + 151,000 |
| 2 x 8 | 560,000 | 30 | 26 + 290,000 |
| 2 x 16 | 972,000 | 54 | 50 + 456,000 |
| 2 x 32 | 1,860,000 | 84 | 74 + 881,000 |
| 4 x 8 | 1,781,000 | 100 | 64 + 682,000 |
| 4 x 16 | 3,430,000 | 192 | 104 + 1,732,000 |
| 4 x 32 | 6,351,000 | 432 | 188 + 3,283,000 |
| 6 x 8 | 2,923,000 | 216 | 138 + 1,049,000 |
| 6 x 16 | 5,650,000 | 498 | 246 + 2,246,000 |
| 6 x 32 | 9,842,000 | 1248 | 528 + 5,229,000 |
| 8 x 8 | 3,987,000 | 632 | 336 + 1,722,000 |
| 8 x 16 | 7,798,000 | 1384 | 544 + 3,911,000 |
| 8 x 32 | 13,189,000 | 3648 | 1,152 + 7,891,000 |
{{% /expand %}}
{{% expand "Replication factor, 3" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 3 x 8 | 815,000 | 63 | 54 + 335,000 |
| 3 x 16 | 1,688,000 | 120 | 87 + 705,000 |
| 3 x 32 | 3,164,000 | 255 | 132 + 1,626,000 |
| 6 x 8 | 2,269,000 | 252 | 168 + 838,000 |
| 6 x 16 | 4,593,000 | 624 | 336 + 2,019,000 |
| 6 x 32 | 7,776,000 | 1340 | 576 + 3,624,000 |
{{% /expand %}}
{{% /tab-content %}}
{{% tab-content %}}
Select one of the following replication factors to see the recommended cluster configuration for 100,000 series:
{{% expand "Replication factor, 1" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 1 x 4 | 143,000 | 5 | 4 + 77,000 |
| 1 x 8 | 322,000 | 9 | 8 + 167,000 |
| 1 x 16 | 624,000 | 17 | 12 + 337,000 |
| 1 x 32 | 1,114,000 | 26 | 18 + 657,000 |
| 2 x 4 | 265,000 | 14 | 12 + 115,000 |
| 2 x 8 | 573,000 | 30 | 22 + 269,000 |
| 2 x 16 | 1,261,000 | 52 | 38 + 679,000 |
| 2 x 32 | 2,335,000 | 90 | 66 + 1,510,000 |
{{% /expand %}}
{{% expand "Replication factor, 2" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 196,000 | 16 | 14 + 77,000 |
| 2 x 8 | 482,000 | 30 | 24 + 203,000 |
| 2 x 16 | 1,060,000 | 60 | 42 + 415,000 |
| 2 x 32 | 1,958,000 | 94 | 64 + 984,000 |
| 4 x 8 | 1,144,000 | 108 | 68 + 406,000 |
| 4 x 16 | 2,512,000 | 228 | 148 + 866,000 |
| 4 x 32 | 4,346,000 | 564 | 320 + 1,886,000 |
| 6 x 8 | 1,802,000 | 252 | 156 + 618,000 |
| 6 x 16 | 3,924,000 | 562 | 384 + 1,068,000 |
| 6 x 32 | 6,533,000 | 1340 | 912 + 2,083,000 |
| 8 x 8 | 2,516,000 | 712 | 360 + 1,020,000 |
| 8 x 16 | 5,478,000 | 1632 | 1,024 + 1,843,000 |
| 8 x 32       | 10,527,000        | 3392               | 1,792 + 4,998,000           |
{{% /expand %}}
{{% expand "Replication factor, 3" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 3 x 8 | 616,000 | 72 | 51 + 218,000 |
| 3 x 16 | 1,268,000 | 117 | 84 + 438,000 |
| 3 x 32 | 2,260,000 | 189 | 114 + 984,000 |
| 6 x 8 | 1,393,000 | 294 | 192 + 421,000 |
| 6 x 16 | 3,056,000 | 726 | 456 + 893,000 |
| 6 x 32 | 5,017,000 | 1584 | 798 + 1,098,000 |
{{% /expand %}}
{{% /tab-content %}}
{{% tab-content %}}
Select one of the following replication factors to see the recommended cluster configuration for 1,000,000 series:
{{% expand "Replication factor, 2" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:-------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 104,000 | 18 | 12 + 54,000 |
| 2 x 8 | 195,000 | 36 | 24 + 99,000 |
| 2 x 16 | 498,000 | 70 | 44 + 145,000 |
| 2 x 32 | 1,195,000 | 102 | 84 + 232,000 |
| 4 x 8 | 488,000 | 120 | 56 + 222,000 |
| 4 x 16 | 1,023,000 | 244 | 112 + 428,000 |
| 4 x 32 | 2,686,000 | 468 | 208 + 729,000 |
| 6 x 8 | 845,000 | 270 | 126 + 356,000 |
| 6 x 16 | 1,780,000 | 606 | 288 + 663,000 |
| 6 x 32 | 430,000 | 1,488 | 624 + 1,209,000 |
| 8 x 8 | 1,831,000 | 808 | 296 + 778,000 |
| 8 x 16 | 4,167,000 | 1,856 | 640 + 2,031,000 |
| 8 x 32 | 7,813,000 | 3,201 | 896 + 4,897,000 |
{{% /expand %}}
{{% expand "Replication factor, 3" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 3 x 8 | 234,000 | 72 | 42 + 87,000 |
| 3 x 16 | 613,000 | 120 | 75 + 166,000 |
| 3 x 32 | 1,365,000 | 141 | 114 + 984,000 |
| 6 x 8 | 593,000 | 318 | 144 + 288,000 |
| 6 x 16 | 1,545,000 | 744 | 384 + 407,000 |
| 6 x 32 | 3,204,000 | 1632 | 912 + 505,000 |
{{% /expand %}}
{{% /tab-content %}}
{{% tab-content %}}
Select one of the following replication factors to see the recommended cluster configuration for 10,000,000 series:
{{% expand "Replication factor, 1" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 122,000 | 16 | 12 + 81,000 |
| 2 x 8 | 259,000 | 36 | 24 + 143,000 |
| 2 x 16 | 501,000 | 66 | 44 + 290,000 |
| 2 x 32 | 646,000 | 142 | 54 + 400,000 |
{{% /expand %}}
{{% expand "Replication factor, 2" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 87,000 | 18 | 14 + 56,000 |
| 2 x 8 | 169,000 | 38 | 24 + 98,000 |
| 2 x 16 | 334,000 | 76 | 46 + 224,000 |
| 2 x 32 | 534,000 | 136 | 58 + 388,000 |
| 4 x 8 | 335,000 | 120 | 60 + 204,000 |
| 4 x 16 | 643,000 | 256 | 112 + 395,000 |
| 4 x 32 | 967,000 | 560 | 158 + 806,000 |
| 6 x 8 | 521,000 | 378 | 144 + 319,000 |
| 6 x 16 | 890,000 | 582 | 186 + 513,000 |
| 8 x 8 | 699,000 | 1,032 | 256 + 477,000 |
| 8 x 16 | 1,345,000 | 2,048 | 544 + 741,000 |
{{% /expand %}}
{{% expand "Replication factor, 3" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 3 x 8 | 170,000 | 60 | 42 + 98,000 |
| 3 x 16 | 333,000 | 129 | 76 + 206,000 |
| 3 x 32 | 609,000 | 178 | 60 + 162,000 |
| 6 x 8 | 395,000 | 402 | 132 + 247,000 |
| 6 x 16 | 679,000 | 894 | 150 + 527,000 |
{{% /expand %}}
{{% /tab-content %}}
{{< /tabs-wrapper >}}
## Storage: type, amount, and configuration
### Storage volume and IOPS
Consider the type of storage you need and the amount. InfluxDB is designed to run on solid state drives (SSDs) and memory-optimized cloud instances, for example, AWS EC2 R5 or R4 instances. InfluxDB isn't tested on hard disk drives (HDDs) and we don't recommend HDDs for production. For best results, InfluxDB servers must have a minimum of 1000 IOPS on storage to ensure recovery and availability. We recommend at least 2000 IOPS for rapid recovery of cluster data nodes after downtime.
See your cloud provider documentation for IOPS detail on your storage volumes.
### Bytes and compression
Database names, [measurements](/influxdb/v1.8/concepts/glossary/#measurement), [tag keys](/influxdb/v1.8/concepts/glossary/#tag-key), [field keys](/influxdb/v1.8/concepts/glossary/#field-key), and [tag values](/influxdb/v1.8/concepts/glossary/#tag-value) are stored only once and always as strings. [Field values](/influxdb/v1.8/concepts/glossary/#field-value) and [timestamps](/influxdb/v1.8/concepts/glossary/#timestamp) are stored for every point.
Non-string values require approximately three bytes. String values require variable space, determined by string compression.
### Separate `wal` and `data` directories
When running InfluxDB in a production environment, store the `wal` directory and the `data` directory on separate storage devices. This optimization significantly reduces disk contention under heavy write load, an important consideration if the write load is highly variable. If the write load does not vary by more than 15%, the optimization is probably not necessary.
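A minimal sketch of the relevant settings in the data node configuration file, assuming `/mnt/influx-data` and `/mnt/influx-wal` are mount points on separate devices (the paths are placeholders):

```toml
# /etc/influxdb/influxdb.conf -- keep TSM data and the WAL on different devices
[data]
  dir = "/mnt/influx-data/influxdb/data"
  wal-dir = "/mnt/influx-wal/influxdb/wal"
```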
@ -0,0 +1,322 @@
---
title: Migrate InfluxDB OSS instances to InfluxDB Enterprise clusters
description: >
Migrate a running instance of InfluxDB open source (OSS) to an InfluxDB Enterprise cluster.
aliases:
- /enterprise/v1.10/guides/migration/
menu:
enterprise_influxdb_1_10:
name: Migrate InfluxDB OSS to Enterprise
weight: 10
parent: Guides
---
Migrate a running instance of InfluxDB open source (OSS) to an InfluxDB Enterprise cluster.
{{% note %}}
Migration transfers all users from the OSS instance to the InfluxDB Enterprise cluster.
{{% /note %}}
## Migrate an OSS instance to InfluxDB Enterprise
Complete the following tasks
to migrate data from OSS to an InfluxDB Enterprise cluster without downtime or missing data.
1. Upgrade InfluxDB OSS and InfluxDB Enterprise to the latest stable versions.
- [Upgrade InfluxDB OSS](/{{< latest "influxdb" "v1" >}}/administration/upgrading/)
- [Upgrade InfluxDB Enterprise](/enterprise_influxdb/v1.10/administration/upgrading/)
2. On each meta node and each data node,
add the IP and hostname of your OSS instance to the `/etc/hosts` file.
This will allow the nodes to communicate with the OSS instance.
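For example, a single line such as the following might be appended on every node (the IP address and hostname are placeholders):
```sh
# /etc/hosts on each meta and data node
203.0.113.10    influxdb-oss
```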
3. On the OSS instance, take a portable backup from OSS using the **influxd backup** command
with the `-portable` flag:
```sh
influxd backup -portable -host <IP address>:8088 /tmp/mysnapshot
```
Note the current date and time when you take the backup.
For more information, see [influxd backup](/influxdb/v1.8/tools/influxd/backup/).
4. Restore the backup on the cluster by running the following:
```sh
influxd-ctl restore [ -host <host:port> ] <path-to-backup-files>
```
> **Note:** InfluxDB Enterprise uses the **influxd-ctl utility** to back up and restore data. For more information,
see [influxd-ctl](/enterprise_influxdb/v1.10/tools/influxd-ctl)
and [`restore`](/enterprise_influxdb/v1.10/administration/backup-and-restore/#restore).
5. To avoid data loss, dual write to both OSS and Enterprise while completing the remaining steps.
This keeps the OSS and cluster active for testing and acceptance work. For more information, see [Write data with the InfluxDB API](/enterprise_influxdb/v1.10/guides/write_data/).
6. [Export data from OSS](/enterprise_influxdb/v1.10/administration/backup-and-restore/#exporting-data)
from the time the backup was taken to the time the dual write started.
For example, if you take the backup on `2020-07-19T00:00:00.000Z`,
and started writing data to Enterprise at `2020-07-19T23:59:59.999Z`,
you would run the following command:
```sh
influx_inspect export -compress -start 2020-07-19T00:00:00.000Z -end 2020-07-19T23:59:59.999Z
```
For more information, see [`-export`](/enterprise_influxdb/v1.10/tools/influx_inspect#export).
7. [Import data into Enterprise](/enterprise_influxdb/v1.10/administration/backup-and-restore/#importing-data).
8. Verify data is successfully migrated to your Enterprise cluster. See:
- [Query data with the InfluxDB API](/enterprise_influxdb/v1.10/guides/query_data/)
- [View data in Chronograf](/{{< latest "chronograf" >}}/)
Next, stop writes to the OSS instance and remove it.
#### Stop writes and remove OSS
1. Stop all writes to the InfluxDB OSS instance.
2. Stop the `influxdb` service on the InfluxDB OSS instance server.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
sudo service influxdb stop
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
sudo systemctl stop influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
3. Double check that the service is stopped.
The following command should return nothing:
```bash
ps ax | grep influxd
```
4. Remove the InfluxDB OSS package.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[Debian & Ubuntu](#)
[RHEL & CentOS](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
sudo apt-get remove influxdb
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
sudo yum remove influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
<!--
### Migrate a data set with downtime
1. [Stop writes and remove OSS](#stop-writes-and-remove-oss)
2. [Back up OSS configuration](#back-up-oss-configuration)
3. [Add the upgraded OSS instance to the InfluxDB Enterprise cluster](#add-the-new-data-node-to-the-cluster)
4. [Add existing data nodes back to the cluster](#add-existing-data-nodes-back-to-the-cluster)
5. [Rebalance the cluster](#rebalance-the-cluster)
#### Stop writes and remove OSS
1. Stop all writes to the InfluxDB OSS instance.
2. Stop the `influxdb` service on the InfluxDB OSS instance.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
sudo service influxdb stop
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
sudo systemctl stop influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
Double check that the service is stopped.
The following command should return nothing:
```bash
ps ax | grep influxd
```
3. Remove the InfluxDB OSS package.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[Debian & Ubuntu](#)
[RHEL & CentOS](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
sudo apt-get remove influxdb
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
sudo yum remove influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
#### Back up and migrate your InfluxDB OSS configuration file
1. **Back up your InfluxDB OSS configuration file**.
If you have custom configuration settings for InfluxDB OSS, back up and save your configuration file.
{{% warn %}}
Without a backup, you'll lose custom configuration settings when updating the InfluxDB binary.
{{% /warn %}}
2. **Update the InfluxDB binary**.
> Updating the InfluxDB binary overwrites the existing configuration file.
> To keep custom settings, back up your configuration file.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[Debian & Ubuntu](#)
[RHEL & CentOS](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
wget https://dl.influxdata.com/enterprise/releases/influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
sudo dpkg -i influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
wget https://dl.influxdata.com/enterprise/releases/influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
sudo yum localinstall influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
3. **Update the configuration file**.
In `/etc/influxdb/influxdb.conf`:
- set `hostname` to the full hostname of the data node
- set `license-key` in the `[enterprise]` section to the license key you received on InfluxPortal
**or** set `license-path` in the `[enterprise]` section to
the local path to the JSON license file you received from InfluxData.
{{% warn %}}
The `license-key` and `license-path` settings are mutually exclusive and one must remain set to an empty string.
{{% /warn %}}
```toml
# Hostname advertised by this host for remote addresses.
# This must be accessible to all nodes in the cluster.
hostname="<data-node-hostname>"
[enterprise]
# license-key and license-path are mutually exclusive,
# use only one and leave the other blank
license-key = "<your_license_key>"
license-path = "/path/to/readable/JSON.license.file"
```
{{% note %}}
Transfer any custom settings from the backup of your OSS configuration file
to the new Enterprise configuration file.
{{% /note %}}
4. **Update the `/etc/hosts` file**.
Add all meta and data nodes to the `/etc/hosts` file to allow the OSS instance
to communicate with other nodes in the InfluxDB Enterprise cluster.
5. **Start the data node**.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[sysvinit](#)
[systemd](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
sudo service influxdb start
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
sudo systemctl start influxdb
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
#### Add the new data node to the cluster
After you upgrade your OSS instance to InfluxDB Enterprise, add the node to your Enterprise cluster.
From a **meta** node in the cluster, run:
```bash
influxd-ctl add-data <new-data-node-hostname>:8088
```
The output should look like:
```bash
Added data node y at new-data-node-hostname:8088
```
#### Add existing data nodes back to the cluster
If you removed any existing data nodes from your InfluxDB Enterprise cluster,
add them back to the cluster.
1. From a **meta** node in the InfluxDB Enterprise cluster, run the following for
**each data node**:
```bash
influxd-ctl add-data <the-hostname>:8088
```
It should output:
```bash
Added data node y at the-hostname:8088
```
2. Verify that all nodes are now members of the cluster as expected:
```bash
influxd-ctl show
```
Once added to the cluster, InfluxDB synchronizes data stored on the upgraded OSS
node with other data nodes in the cluster.
It may take a few minutes before the existing data is available.
-->
## Rebalance the cluster
1. Use the [`ALTER RETENTION POLICY`](/enterprise_influxdb/v1.10/query_language/manage-database/#modify-retention-policies-with-alter-retention-policy)
statement to increase the [replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor)
on all existing retention policies to the number of data nodes in your cluster (see the sketch after this list).
2. [Rebalance your cluster manually](/enterprise_influxdb/v1.10/guides/rebalance/)
to meet the desired replication factor for existing shards.
3. If you were using [Chronograf](/{{< latest "chronograf" >}}/),
add your Enterprise instance as a new data source.
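A minimal sketch of step 1, assuming a three-data-node cluster and a retention policy named `autogen` on a database named `mydb` (both names are placeholders):

```sql
ALTER RETENTION POLICY "autogen" ON "mydb" REPLICATION 3
```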
@ -0,0 +1,111 @@
---
title: Query data with the InfluxDB API
description: Query data with Flux and InfluxQL in the InfluxDB API.
menu:
enterprise_influxdb_1_10:
weight: 20
parent: Guides
aliases:
  - /docs/v1.8/query_language/querying_data/
  - /enterprise_influxdb/v1.10/guides/querying_data/
v2: /influxdb/v2.0/query-data/
---
The InfluxDB API is the primary means for querying data in InfluxDB (see the [command line interface](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/) and [client libraries](/enterprise_influxdb/v1.10/tools/api_client_libraries/) for alternative ways to query the database).
Query data with the InfluxDB API using [Flux](#query-data-with-flux) or [InfluxQL](#query-data-with-influxql).
> **Note**: The following examples use `curl`, a command line tool that transfers data using URLs. Learn the basics of `curl` with the [HTTP Scripting Guide](https://curl.haxx.se/docs/httpscripting.html).
## Query data with Flux
For Flux queries, the `/api/v2/query` endpoint accepts `POST` HTTP requests. Use the following HTTP headers:
- `Accept: application/csv`
- `Content-type: application/vnd.flux`
If you have authentication enabled, provide your InfluxDB username and password with the `Authorization` header and `Token` schema. For example: `Authorization: Token username:password`.
The following example queries Telegraf data using Flux:
```bash
$ curl -XPOST localhost:8086/api/v2/query -sS \
-H 'Accept:application/csv' \
-H 'Content-type:application/vnd.flux' \
-d 'from(bucket:"telegraf")
|> range(start:-5m)
|> filter(fn:(r) => r._measurement == "cpu")'
```
Flux returns [annotated CSV](/influxdb/v2.0/reference/syntax/annotated-csv/):
```
,result,table,_start,_stop,_time,_value,_field,_measurement,cpu,host
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:19Z,4.152553004641827,usage_user,cpu,cpu-total,host1
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:29Z,7.608695652173913,usage_user,cpu,cpu-total,host1
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:39Z,2.9363988504310883,usage_user,cpu,cpu-total,host1
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:49Z,6.915093159934975,usage_user,cpu,cpu-total,host1
```
The header row defines column labels for the table. The `cpu` [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement) has four points, each represented by one of the record rows. For example, the first point has a [timestamp](/enterprise_influxdb/v1.10/concepts/glossary/#timestamp) of `2020-04-07T18:08:19Z`.
### Flux
Check out [Get started with Flux](/influxdb/v2.0/query-data/get-started/) to learn more about building queries with Flux.
For more information about querying data with the InfluxDB API using Flux, see the [API reference documentation](/enterprise_influxdb/v1.10/tools/api/#influxdb-2-0-api-compatibility-endpoints).
## Query data with InfluxQL
To perform an InfluxQL query, send a `GET` request to the `/query` endpoint, set the URL parameter `db` as the target database, and set the URL parameter `q` as your query.
You can also use a `POST` request by sending the same parameters either as URL parameters or as part of the body with `application/x-www-form-urlencoded`.
The example below uses the InfluxDB API to query the same database that you encountered in [Writing Data](/enterprise_influxdb/v1.10/guides/writing_data/).
```bash
curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
```
InfluxDB returns JSON:
```json
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "cpu_load_short",
"columns": [
"time",
"value"
],
"values": [
[
"2015-01-29T21:55:43.702900257Z",
2
],
[
"2015-01-29T21:55:43.702900257Z",
0.55
],
[
"2015-06-11T20:46:02Z",
0.64
]
]
}
]
}
]
}
```
> **Note:** Appending `pretty=true` to the URL enables pretty-printed JSON output.
While this is useful for debugging or when querying directly with tools like `curl`, it is not recommended for production use as it consumes unnecessary network bandwidth.
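As mentioned above, the same query can also be sent as a `POST` request with form-encoded parameters; a minimal sketch:

```bash
curl -XPOST 'http://localhost:8086/query' \
  --data-urlencode "db=mydb" \
  --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
```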
### InfluxQL
Check out the [Data Exploration page](/enterprise_influxdb/v1.10/query_language/explore-data/) to get acquainted with InfluxQL.
For more information about querying data with the InfluxDB API using InfluxQL, see the [API reference documentation](/enterprise_influxdb/v1.10/tools/api/#influxdb-1-x-http-endpoints).
@ -0,0 +1,190 @@
---
title: Write data with the InfluxDB API
description: >
  Use the InfluxDB API and the command line interface (CLI) to write data into InfluxDB.
menu:
enterprise_influxdb_1_10:
weight: 10
parent: Guides
aliases:
- /enterprise_influxdb/v1.10/guides/writing_data/
v2: /influxdb/v2.0/write-data/
---
Write data into InfluxDB using the [command line interface](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/), [client libraries](/enterprise_influxdb/v1.10/clients/api/), and plugins for common data formats such as [Graphite](/enterprise_influxdb/v1.10/write_protocols/graphite/).
> **Note**: The following examples use `curl`, a command line tool that transfers data using URLs. Learn the basics of `curl` with the [HTTP Scripting Guide](https://curl.haxx.se/docs/httpscripting.html).
### Create a database using the InfluxDB API
To create a database, send a `POST` request to the `/query` endpoint and set the URL parameter `q` to `CREATE DATABASE <new_database_name>`.
The example below sends a request to InfluxDB running on `localhost` and creates the `mydb` database:
```bash
curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE mydb"
```
### Write data using the InfluxDB API
The InfluxDB API is the primary means of writing data into InfluxDB.
- To **write to a database using the InfluxDB 1.8 API**, send `POST` requests to the `/write` endpoint. For example, the following request writes a single point to the `mydb` database.
The data consists of the [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement) `cpu_load_short`, the [tag keys](/enterprise_influxdb/v1.10/concepts/glossary/#tag-key) `host` and `region` with the [tag values](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value) `server01` and `us-west`, the [field key](/enterprise_influxdb/v1.10/concepts/glossary/#field-key) `value` with a [field value](/enterprise_influxdb/v1.10/concepts/glossary/#field-value) of `0.64`, and the [timestamp](/enterprise_influxdb/v1.10/concepts/glossary/#timestamp) `1434055562000000000`.
```bash
curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
--data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
```
- To **write to a database using the InfluxDB 2.0 API (compatible with InfluxDB 1.8+)**, send `POST` requests to the [`/api/v2/write` endpoint](/enterprise_influxdb/v1.10/tools/api/#api-v2-write-http-endpoint):
```bash
curl -i -XPOST 'http://localhost:8086/api/v2/write?bucket=db/rp&precision=ns' \
--header 'Authorization: Token username:password' \
--data-raw 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
```
When writing points, you must specify an existing database in the `db` query parameter.
Points will be written to `db`'s default retention policy if you do not supply a retention policy via the `rp` query parameter.
See the [InfluxDB API Reference](/enterprise_influxdb/v1.10/tools/api/#write-http-endpoint) documentation for a complete list of the available query parameters.
The body of the POST, or [InfluxDB line protocol](/enterprise_influxdb/v1.10/concepts/glossary/#influxdb-line-protocol), contains the time series data that you want to store. Data includes:
- **Measurement (required)**
- **Tags**: Strictly speaking, tags are optional but most series include tags to differentiate data sources and to make querying both easy and efficient.
Both tag keys and tag values are strings.
- **Fields (required)**: Field keys are required and are always strings, and, [by default](/enterprise_influxdb/v1.10/write_protocols/line_protocol_reference/#data-types), field values are floats.
- **Timestamp**: Optional. Supplied at the end of the line in Unix time, in nanoseconds since January 1, 1970 UTC. If you do not specify a timestamp, InfluxDB uses the server's local nanosecond timestamp in Unix epoch.
Time in InfluxDB is in UTC format by default.
> **Note:** Avoid using the following reserved keys: `_field`, `_measurement`, and `time`. If reserved keys are included as a tag or field key, the associated point is discarded.
### Configure gzip compression
InfluxDB supports gzip compression. To reduce network traffic, consider the following options:
* To accept compressed data from InfluxDB, add the `Accept-Encoding: gzip` header to InfluxDB API requests.
* To compress data before sending it to InfluxDB, add the `Content-Encoding: gzip` header to InfluxDB API requests.
For details about enabling gzip for client libraries, see your client library documentation.
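For example, a minimal sketch of a compressed write with `curl`, assuming `cpu_data.txt` contains line protocol (the file name is a placeholder):

```bash
# Compress the line protocol, then tell InfluxDB the body is gzip-encoded
gzip -c cpu_data.txt > cpu_data.txt.gz
curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
  -H 'Content-Encoding: gzip' \
  --data-binary @cpu_data.txt.gz
```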
#### Enable gzip compression in the Telegraf InfluxDB output plugin
* In the Telegraf configuration file (`telegraf.conf`), under `[[outputs.influxdb]]`, change
`content_encoding = "identity"` (default) to `content_encoding = "gzip"`, as in the sketch below.
> **Note:** Writes to InfluxDB 2.x (`[[outputs.influxdb_v2]]`) are configured to compress content in gzip format by default.
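A minimal sketch of the relevant output section (the URL and database name are placeholders):

```toml
# telegraf.conf -- InfluxDB v1 output with gzip-compressed writes
[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "telegraf"
  content_encoding = "gzip"
```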
### Writing multiple points
Post multiple points to multiple series at the same time by separating each point with a new line.
Batching points in this manner results in much higher performance.
The following example writes three points to the database `mydb`.
The first point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02` and has the server's local timestamp.
The second point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02,region=us-west` and has the specified timestamp `1422568543702900257`.
The third point has the same specified timestamp as the second point, but it is written to the series with the measurement `cpu_load_short` and tag set `direction=in,host=server01,region=us-west`.
```bash
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257'
```
### Writing points from a file
Write points from a file by passing `@filename` to `curl`.
The data in the file should follow the [InfluxDB line protocol syntax](/enterprise_influxdb/v1.10/write_protocols/write_syntax/).
Example of a properly-formatted file (`cpu_data.txt`):
```txt
cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257
```
Write the data in `cpu_data.txt` to the `mydb` database with:
```bash
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @cpu_data.txt
```
> **Note:** If your data file has more than 5,000 points, it may be necessary to split that file into several files in order to write your data in batches to InfluxDB.
By default, the HTTP request times out after five seconds.
InfluxDB will still attempt to write the points after that timeout, but there will be no confirmation that they were successfully written.
### Schemaless Design
InfluxDB is a schemaless database.
You can add new measurements, tags, and fields at any time.
Note that if you attempt to write data with a different type than previously used (for example, writing a string to a field that previously accepted integers), InfluxDB will reject those data.
### A note on REST
InfluxDB uses HTTP solely as a convenient and widely supported data transfer protocol.
Modern web APIs have settled on REST because it addresses a common need.
As the number of endpoints grows the need for an organizing system becomes pressing.
REST is the industry agreed style for organizing large numbers of endpoints.
This consistency is good for those developing and consuming the API: everyone involved knows what to expect.
REST, however, is a convention.
InfluxDB makes do with three API endpoints.
This simple, easy to understand system uses HTTP as a transfer method for [InfluxQL](/enterprise_influxdb/v1.10/query_language/spec/).
The InfluxDB API makes no attempt to be RESTful.
### HTTP response summary
* 2xx: If your write request received `HTTP 204 No Content`, it was a success!
* 4xx: InfluxDB could not understand the request.
* 5xx: The system is overloaded or significantly impaired.
#### Examples
##### Writing a float to a field that previously accepted booleans
```bash
curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=true'
curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=5'
```
returns:
```bash
HTTP/1.1 400 Bad Request
Content-Type: application/json
Request-Id: [...]
X-Influxdb-Version: {{< latest-patch >}}
Date: Wed, 01 Mar 2017 19:38:01 GMT
Content-Length: 150
{"error":"field type conflict: input field \"booleanonly\" on measurement \"tobeornottobe\" is type float, already exists as type boolean dropped=1"}
```
##### Writing a point to a database that doesn't exist
```bash
curl -i -XPOST 'http://localhost:8086/write?db=atlantis' --data-binary 'liters value=10'
```
returns:
```bash
HTTP/1.1 404 Not Found
Content-Type: application/json
Request-Id: [...]
X-Influxdb-Version: {{< latest-patch >}}
Date: Wed, 01 Mar 2017 19:38:35 GMT
Content-Length: 45
{"error":"database not found: \"atlantis\""}
```
### Next steps
Now that you know how to write data with the InfluxDB API, discover how to query them with the [Querying data](/enterprise_influxdb/v1.10/guides/querying_data/) guide!
For more information about writing data with the InfluxDB API, please see the [InfluxDB API reference](/enterprise_influxdb/v1.10/tools/api/#write-http-endpoint).
@ -0,0 +1,14 @@
---
title: Introducing InfluxDB Enterprise
description: Tasks required to get up and running with InfluxDB Enterprise.
aliases:
- /enterprise/v1.8/introduction/
weight: 2
menu:
enterprise_influxdb_1_10:
name: Introduction
---
{{< children >}}